A junior data engineer has been asked to... (Databricks video)

Duration: 1 h 46 min 27 s  ·  Language: English

Databricks Data Engineer Professional video for the question: a junior data engineer has been asked to develop a streaming data pipeline with a grouped aggregation using DataFrame df.

Full Certification Question

A junior data engineer has been asked to develop a streaming data pipeline with a grouped aggregation using DataFrame df. The pipeline needs to calculate the average humidity and average temperature for each non-overlapping five-minute interval. Incremental state information should be maintained for 10 minutes for late-arriving data.

Streaming DataFrame df has the following schema:

    device_id INT, event_time TIMESTAMP, temp FLOAT, humidity FLOAT

Code block:

    df
      ._____________
      .groupBy(window("event_time", "5 minutes").alias("time"), "device_id")
      .agg(avg("temp").alias("avg_temp"), avg("humidity").alias("avg_humidity"))
      .writeStream
      .format("delta")
      .outputMode("append")
      .saveAsTable("sensor_avg")

Choose the response that correctly fills in the blank within the code block to complete this task.
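The two requirements in the question (non-overlapping five-minute windows, and state kept for 10 minutes for late data) can be illustrated without Spark. The sketch below is a plain-Python simulation, not Spark's implementation: the names window_start and aggregate, the sample readings, and the per-event watermark update are all assumptions made for illustration (Spark advances its watermark per micro-batch). In Structured Streaming itself, a lateness bound like this is normally declared with withWatermark("event_time", "10 minutes") before the groupBy.

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)     # tumbling (non-overlapping) window length
LATENESS = timedelta(minutes=10)  # how far behind the newest event we still accept data
EPOCH = datetime(1970, 1, 1)


def window_start(ts):
    """Floor a timestamp to the start of its five-minute tumbling window."""
    return ts - ((ts - EPOCH) % WINDOW)


def aggregate(events):
    """Simulate a watermarked, windowed average (hypothetical helper).

    events: (device_id, event_time, temp, humidity) tuples in arrival order.
    Returns (finalized, dropped, open_state).
    """
    # (window_start, device_id) -> [sum_temp, sum_humidity, count]
    state = defaultdict(lambda: [0.0, 0.0, 0])
    finalized, dropped = {}, []
    max_event_time = None
    watermark = None

    for device_id, event_time, temp, humidity in events:
        start = window_start(event_time)
        # A window that ended at or before the watermark is closed: drop the row.
        if watermark is not None and start + WINDOW <= watermark:
            dropped.append((device_id, event_time, temp, humidity))
            continue

        acc = state[(start, device_id)]
        acc[0] += temp
        acc[1] += humidity
        acc[2] += 1

        # Simplification: advance the watermark after every event.
        if max_event_time is None or event_time > max_event_time:
            max_event_time = event_time
        watermark = max_event_time - LATENESS

        # Emit and evict every window the watermark has passed (append-style output).
        for key in [k for k in state if k[0] + WINDOW <= watermark]:
            sum_temp, sum_hum, count = state.pop(key)
            finalized[key] = (sum_temp / count, sum_hum / count)

    return finalized, dropped, dict(state)


# Made-up sample readings for one device, listed in arrival order.
t = lambda h, m: datetime(2024, 1, 1, h, m)
events = [
    (1, t(12, 1), 20.0, 40.0),
    (1, t(12, 3), 22.0, 50.0),
    (1, t(12, 16), 30.0, 60.0),  # moves the watermark to 12:06, closing 12:00-12:05
    (1, t(12, 2), 99.0, 99.0),   # too late: its window is already closed
    (1, t(12, 8), 10.0, 10.0),   # late, but its window 12:05-12:10 is still open
]
finalized, dropped, open_state = aggregate(events)
```

In this run the 12:00 window is emitted once with the averages of its first two readings, the 12:02 reading that arrives afterwards is discarded, and the 12:05 and 12:15 windows stay in state because the watermark has not yet passed their end times.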