

Question 1: Databricks Data Engineer Professional

Full Certification Question

A Kafka stream that acts as the upstream system in an ETL framework tends to produce duplicate values within a batch. A streaming query reads the data from this source and writes it to a downstream Delta table using the default trigger interval. If the upstream system emits data every 20 minutes, which of the following strategies can be used to remove the duplicates before the data is saved to the downstream Delta table while keeping costs low?
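To make the scenario concrete, here is a minimal PySpark sketch of one way the two concerns in the question could be expressed together: removing duplicates inside each micro-batch, and running a micro-batch only about as often as the source emits data. It is an illustration under assumptions, not the exam's answer key. The key column `event_id`, the table name `downstream_events`, and the checkpoint path are hypothetical placeholders that do not appear in the question.

```python
from __future__ import annotations

from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from pyspark.sql import DataFrame
    from pyspark.sql.streaming import StreamingQuery

# Hypothetical names: none of these come from the question itself.
KEY_COLUMNS = ["event_id"]
TARGET_TABLE = "downstream_events"
CHECKPOINT_PATH = "/tmp/checkpoints/downstream_events"


def write_deduplicated(batch_df: DataFrame, batch_id: int) -> None:
    """Drop duplicates inside one micro-batch, then append it to the Delta table."""
    # dropDuplicates here only sees the rows of this micro-batch.
    deduped = batch_df.dropDuplicates(KEY_COLUMNS)
    deduped.write.format("delta").mode("append").saveAsTable(TARGET_TABLE)


def start_query(kafka_df: DataFrame) -> StreamingQuery:
    """Start the stream with a 20-minute trigger instead of the default one."""
    return (
        kafka_df.writeStream
        .foreachBatch(write_deduplicated)
        .option("checkpointLocation", CHECKPOINT_PATH)
        # Assumption: matching the trigger to the 20-minute emit cadence
        # avoids running micro-batches when no new data has arrived.
        .trigger(processingTime="20 minutes")
        .start()
    )
```

In this sketch, `kafka_df` would be the streaming DataFrame returned by `spark.readStream.format("kafka")...load()`. Because the deduplication runs inside `foreachBatch`, it only removes duplicates that arrive in the same micro-batch, which is the case the question describes; duplicates spread across batches would need a different technique.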