An ML engineer has built a deep learning model and now wants to deploy it using SageMaker Hosting Services. For inference, the engineer wants a cost-effective option that delivers low latency at a fraction of the cost of using a GPU instance for the endpoint. As an AWS ML Specialist, which of the following would you recommend for the given use case?
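For context on what "deploying with SageMaker Hosting Services" involves, the sketch below builds the shape of a `CreateEndpointConfig`-style request body locally, without making any AWS call. The helper name, model name, and instance/accelerator values are illustrative assumptions, not part of the question; the `AcceleratorType` field shown is the real request field SageMaker exposes for attaching an inference accelerator to a CPU-backed production variant instead of provisioning a full GPU instance.

```python
def build_endpoint_config(model_name, instance_type, accelerator_type=None):
    """Build a CreateEndpointConfig-style request body (no AWS call is made).

    This is an illustrative sketch: in practice the resulting dict would be
    passed to the SageMaker CreateEndpointConfig API via an AWS SDK.
    """
    variant = {
        "VariantName": "AllTraffic",
        "ModelName": model_name,
        "InitialInstanceCount": 1,
        "InstanceType": instance_type,   # e.g. a CPU instance such as ml.m5.large
        "InitialVariantWeight": 1.0,
    }
    if accelerator_type is not None:
        # Attach an inference accelerator to the CPU-backed variant
        # rather than hosting on a GPU instance type.
        variant["AcceleratorType"] = accelerator_type
    return {
        "EndpointConfigName": f"{model_name}-config",
        "ProductionVariants": [variant],
    }


# Hypothetical usage: CPU instance plus an attached accelerator.
config = build_endpoint_config(
    "my-dl-model", "ml.m5.large", accelerator_type="ml.eia2.medium"
)
```

The design point the question probes is exactly this trade-off: a production variant can pair a cheaper CPU instance with a sized accelerator, instead of paying for a dedicated GPU instance whose capacity may be underutilized at inference time.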