You are designing a machine learning model that must handle very large datasets, which typically cause out-of-memory errors on standard instances. You choose Amazon SageMaker for its scalability and integration features. What is the MOST effective way to handle large datasets in a SageMaker environment while optimizing for cost and performance?
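For context, one commonly cited approach to this scenario is to stream training data from Amazon S3 instead of downloading the full dataset onto the instance, using Pipe (or FastFile) input mode on a SageMaker training job. The sketch below is a minimal illustration of that idea using the SageMaker Python SDK; the role ARN, image URI, and S3 path are hypothetical placeholders, and this is one possible approach rather than the definitive answer to the question.

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # hypothetical execution role

# Generic estimator for a custom training container (image URI is a placeholder).
estimator = Estimator(
    image_uri="<your-training-image-uri>",
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    input_mode="Pipe",          # stream data to the container instead of copying it
    sagemaker_session=session,
)

# Point the training channel at S3 and stream it with Pipe mode,
# so the instance never has to hold the whole dataset in memory or on disk.
train_input = TrainingInput(
    s3_data="s3://example-bucket/train/",  # hypothetical dataset location
    input_mode="Pipe",
)

estimator.fit({"train": train_input})
```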