A data engineer has created a new job with a single notebook task intended to run daily at 5 PM. The task runs a notebook named show_regular.py, which contains 5 cells, each of which outputs data in the form of a PySpark DataFrame. The sizes of the outputs are as follows: Cell 1: 5.2 MB, Cell 2: 6.1 MB, Cell 3: 4.4 MB, Cell 4: 5.3 MB, Cell 5: 2.5 MB. The job fails on its initial run, and the data engineer is unsure of the cause. Which data constraint forced the job to fail?
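A quick sanity check on the numbers: summing the five cell outputs gives 23.5 MB. Databricks documents a 20 MB cap on the total output of a notebook task run, which is the kind of limit this combined size runs up against. The sketch below (the limit constant is taken from Databricks documentation; the variable names are illustrative) just does the arithmetic:

```python
# Sizes of the five cell outputs in show_regular.py, in MB (from the question).
cell_outputs_mb = [5.2, 6.1, 4.4, 5.3, 2.5]

# Round to avoid floating-point noise when summing decimal fractions.
total_mb = round(sum(cell_outputs_mb), 1)
print(f"Combined notebook output: {total_mb} MB")  # 23.5 MB

# Databricks documents a 20 MB limit on the total output of a
# notebook task run; exceeding it causes the run to fail.
NOTEBOOK_OUTPUT_LIMIT_MB = 20
print(total_mb > NOTEBOOK_OUTPUT_LIMIT_MB)  # True
```

So the combined output of all cells, not any single cell on its own, is what trips the limit.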