Question 1
Full Certification Question

A data ingestion task requires a 1 TB JSON dataset to be written out to Parquet with a target part-file size of 512 MB. Because Parquet is being used instead of Delta Lake, built-in file-sizing features such as Auto-Optimize and Auto-Compaction cannot be used. Which strategy will yield the best performance without shuffling data?
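
The sizing arithmetic behind the question can be sketched in plain Python. This is only an illustration, not the exam's stated answer: it assumes binary units (1 TB = 1024^4 bytes) and assumes the read-side split size is set equal to the target file size through Spark's `spark.sql.files.maxPartitionBytes` setting, which is one approach often suggested because it avoids a shuffle. The helper name `plan_partitions` is hypothetical.

```python
import math

MB = 1024 ** 2
TB = 1024 ** 4


def plan_partitions(dataset_bytes: int, target_file_bytes: int) -> dict:
    """Estimate the input partition count when the read split size
    is set equal to the target part-file size (an assumption)."""
    num_partitions = math.ceil(dataset_bytes / target_file_bytes)
    return {
        # Real Spark setting; the value is the split size in bytes.
        "spark.sql.files.maxPartitionBytes": str(target_file_bytes),
        # One output part file per input partition if only narrow
        # transformations are applied before the write.
        "estimated_partitions": num_partitions,
    }


plan = plan_partitions(1 * TB, 512 * MB)
print(plan)
```

In a Spark session the setting would be applied with `spark.conf.set(...)` before reading the JSON, followed by narrow transformations only and a Parquet write. The resulting files will usually not be exactly 512 MB, since Parquet's columnar encoding and compression change the on-disk size relative to the JSON input.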