data-engineer-pro video for scenario: You are building a system to analyze seismic data. Your extract, transform, and load (ETL) process runs as a series of
Scenario: You are building a system to analyze seismic data. Your extract, transform, and load (ETL) process runs as a series of MapReduce jobs on an Apache Hadoop cluster. The ETL process takes several days due to computationally expensive steps. You discover that a sensor calibration step was missed. Question: How should you adjust your ETL process to ensure sensor calibration is applied systematically in the future?