A company needs to process and transform large volumes of unstructured text data from various sources into a clean, consistent format suitable for training a generative AI model. Which Google Cloud service is commonly used to build large-scale, parallel data processing and transformation pipelines like this one?