A Generative AI Engineer is creating a batch inference workflow that processes legal documents nightly. Each document is passed to a chain that extracts key clauses and summarizes them. The goal is to scale the pipeline, track usage, and support asynchronous processing across thousands of records using Databricks. Which approach should the engineer use to design an efficient batch inference system? A sketch of one possible design follows the question.
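
The sketch below is not the official answer, only an illustration of the kind of design the question is probing: log the chain as an MLflow model, wrap it in a Spark UDF so scoring is distributed across the cluster, and run it as a scheduled nightly Databricks Job. The model URI `models:/legal_clause_chain/1` and the table names `legal.raw_documents` and `legal.clause_summaries` are hypothetical.

```python
# Minimal sketch of distributed batch inference on Databricks.
# Assumes the clause-extraction chain is already registered in MLflow
# under the hypothetical name "legal_clause_chain", and that this script
# runs inside a nightly Databricks Job.
import mlflow
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()

# Wrap the logged chain as a Spark UDF so inference runs in parallel on
# the executors instead of looping over documents on the driver.
chain_udf = mlflow.pyfunc.spark_udf(
    spark,
    model_uri="models:/legal_clause_chain/1",  # hypothetical registered model
    result_type="string",
)

# Read the nightly batch of documents from a Delta table (illustrative name).
docs = spark.read.table("legal.raw_documents").select("doc_id", "doc_text")

# Apply the chain to every document and keep a timestamp per run so that
# usage can be tracked across nightly batches.
results = (
    docs.withColumn("clause_summary", chain_udf(F.col("doc_text")))
        .withColumn("processed_at", F.current_timestamp())
)

results.write.mode("append").saveAsTable("legal.clause_summaries")
```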