A media company uses an ad hoc Kinesis Firehose-based solution to ingest raw JSON data and deliver it to an Amazon S3 bucket. The company's data engineering team analyzes this data with Apache Spark SQL on Amazon EMR, which is configured to use the AWS Glue Data Catalog as its metastore. An AWS Glue crawler runs every four hours to update the schema in the Data Catalog. The team has noticed that it sometimes gets outdated data. You have been hired by the company as an AWS Certified Data Engineer Associate to build a solution that ensures the team always has access to current data. Which of the following is the best solution to meet this requirement?