

Question 1

Full Certification Question

The data engineering team has a large Delta Lake table named ‘user_posts’, which is partitioned by the ‘year’ column. The table is used as the input source of a streaming job. The streaming query is shown below with a blank:

    spark.readStream
        .table("user_posts")
        ________________
        .groupBy("post_category", "post_date")
        .agg(
            count("post_id").alias("posts_count"),
            sum("likes").alias("total_likes")
        )
        .writeStream
        .option("checkpointLocation", "dbfs:/path/checkpoint")
        .table("posts_stats")

The team wants to remove the previous two years of data from the table without breaking the append-only requirement of streaming sources. Which option correctly fills in the blank so that the stream can keep processing from the table after the partitions are deleted?