Full AWS Practitioner Certification Question

An AI startup is deciding whether to use batch or real-time inference to process user requests. The startup expects millions of requests per day, each requiring a near-instant response. Which approach best meets these requirements?