You are building a machine learning data pipeline that performs distributed data preprocessing on large volumes of training data using Amazon SageMaker Processing jobs. The preprocessing script is I/O-intensive, needs high-throughput access to a shared dataset (~20 TB) through a POSIX-compliant file system, and runs across multiple instances in parallel. The solution should minimize operational overhead and be cost-effective. Which of the following storage options are most appropriate for this scenario? (Select two.)