Databricks Data Engineer Professional video: A data ingestion task requires a one-TB JSON dataset

Duration: 1h 46m 27s  ·  English

This Data Engineer Professional video covers a data ingestion task that requires a one-TB JSON dataset to be written out to Parquet with a target part-file size of 512 MB.

Full Certification Question

A data ingestion task requires a one-TB JSON dataset to be written out to Parquet with a target part-file size of 512 MB. Because Parquet is being used instead of Delta Lake, built-in file-sizing features such as Auto Optimize and Auto Compaction cannot be used. Which strategy will yield the best performance without shuffling data?
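
To make the scenario concrete, below is a minimal PySpark sketch of one commonly discussed approach: capping the bytes packed into each input partition at read time via `spark.sql.files.maxPartitionBytes`, so each task writes roughly one 512 MB Parquet part-file without any repartition or coalesce shuffle. The paths are hypothetical placeholders, and this sketch is illustrative rather than the answer given in the video.

```python
from pyspark.sql import SparkSession

# Hypothetical paths; adjust for your workspace.
source_path = "/mnt/raw/events_json/"
target_path = "/mnt/curated/events_parquet/"

spark = SparkSession.builder.getOrCreate()

# Cap the bytes read into each input partition at ~512 MB.
# With a 1 TB source, this yields roughly 2,048 partitions,
# and each task writes about one 512 MB part-file on output.
spark.conf.set("spark.sql.files.maxPartitionBytes", 512 * 1024 * 1024)

df = spark.read.json(source_path)

# Narrow write: one output file per input partition,
# no repartition() or coalesce() shuffle involved.
df.write.mode("overwrite").parquet(target_path)
```

Note that actual part-file sizes will vary with Parquet compression and the JSON-to-columnar size ratio, so the 512 MB figure is a target for the uncompressed read split, not a guaranteed output size.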