When deploying a machine learning model for real-time inference on Amazon EC2, which configuration best optimizes cost without compromising performance during unpredictable spikes in request volume?