You have developed a Transformer model in TensorFlow for text translation. The training dataset consists of millions of documents stored in a Cloud Storage bucket. You want to speed up training by using distributed training while minimizing code changes and the effort of managing cluster configuration. What should you do?