You are training a TensorFlow-based machine learning model on a Compute Engine VM using the n2-standard-32 machine type. The training process currently takes around two days to complete. The model includes custom TensorFlow operations that require partial execution on the CPU. Your objective is to reduce training time while keeping the solution cost-effective. What should you do?