This is a dedicated watch page for a single video.
You are an ML engineer at a retail company using an Amazon SageMaker model to generate real-time product recommendations. During peak shopping periods, traffic to the recommendation endpoint can spike dramatically. The company needs to ensure the endpoint handles these spikes without compromising response time or customer experience, while also optimizing costs by scaling down during low demand. Which scaling policy is MOST SUITABLE for this scenario, and why?