A marketing analytics company is building a customer segmentation model to identify groups of users based on their purchasing behavior and engagement. The dataset includes user demographic data, transaction history, and website interaction logs. The demographic data and transaction history are stored in Amazon S3, while website interaction logs are stored in an on-premises PostgreSQL database. The dataset includes a mix of categorical features (e.g., "region," "customer tier") and numerical features (e.g., "purchase amount," "session duration"). To improve the model's performance, the ML engineer must transform and preprocess the data. The solution must minimize operational overhead while ensuring the dataset is ready for model training. Which solution will meet these requirements with the LEAST operational overhead?