A company uses a fleet of Amazon EC2 instances to ingest JSON data from on-premises sources at rates up to 1 MB/s. When an EC2 instance reboots, data in flight is lost. The data science team requires near-real-time querying of the ingested data. Which solution provides scalable, near-real-time data querying with minimal data loss?