You are designing a data processing pipeline for an e-commerce company that collects transactional data in JSON format from multiple sources and stores it in Cloud Storage. You need to transform the data to enforce a consistent schema and remove personally identifiable information (PII) before loading it into BigQuery for analytical queries. What is the best data manipulation methodology for this use case?
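
To make the transformation requirement concrete, here is a minimal sketch of the per-record logic the scenario implies: parse one JSON record, drop PII fields, and coerce the remaining fields to a fixed schema. The field names, the `TARGET_SCHEMA` mapping, the `PII_FIELDS` set, and the `clean_record` helper are all hypothetical and not taken from the scenario. The sketch is plain Python and makes no claim about which service or methodology should run it.

```python
import json

# Hypothetical target schema (assumption): field name -> type to coerce to.
TARGET_SCHEMA = {"order_id": str, "amount": float, "currency": str}

# Hypothetical PII field names (assumption) that must not reach the warehouse.
PII_FIELDS = {"email", "phone", "customer_name"}


def clean_record(raw_line):
    """Parse one JSON record, remove PII, and enforce the target schema."""
    record = json.loads(raw_line)

    # Drop PII fields explicitly before any further processing.
    record = {k: v for k, v in record.items() if k not in PII_FIELDS}

    # Keep only schema fields, coercing types; missing fields become None.
    cleaned = {}
    for field, cast in TARGET_SCHEMA.items():
        value = record.get(field)
        cleaned[field] = cast(value) if value is not None else None
    return cleaned


if __name__ == "__main__":
    sample = '{"order_id": 17, "amount": "19.90", "email": "a@example.com"}'
    print(json.dumps(clean_record(sample)))
```

Each source may name or type its fields differently, so a real pipeline would likely need per-source mappings before this step; that part is omitted here.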