This is a dedicated watch page for a single video.
You're tasked with establishing a data pipeline to replicate time-series transaction data for querying within BigQuery by your data science team for analysis. Every hour, thousands of transactions undergo status updates. The initial dataset size is 1.5 PB, increasing by 3 TB daily. The data is highly structured, and machine learning models will be developed based on it. Your aim is to optimize performance and usability for your data science team. Which two strategies should you prioritize? (Select two.)