Certification Practice Exams with Real Test Questions & Answers

You designed a 5-billion-parameter language model (GCP video)

Duration: 1 h 46 min 27 s  ·  Language: EN

Full Certification Question

You designed a 5-billion-parameter language model in TensorFlow Keras that used autotuned tf.data to load the data in memory. You created a distributed training job in Vertex AI with tf.distribute.MirroredStrategy, and set the large_model_v100 machine for the primary instance. The training job fails with the following error: “The replica 0 ran out of memory with a non-zero status of 9.” You want to fix this error without vertically increasing the memory of the replicas. What should you do?
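
To see why memory is the sticking point in this scenario, a rough estimate of the weight footprint can help. The sketch below is only a back-of-the-envelope calculation, not part of the exam question: it assumes float32 weights (4 bytes each), and the helper names weight_memory_gib and per_replica_weight_gib are hypothetical. It relies on the fact that tf.distribute.MirroredStrategy keeps a full copy of every model variable on each replica, so adding replicas does not shrink the per-replica weight footprint.

```python
# Back-of-the-envelope estimate of the memory needed for the weights alone.
# Assumption (not stated in the question): float32 weights, 4 bytes each.
BYTES_PER_FLOAT32 = 4


def weight_memory_gib(num_params: int, bytes_per_param: int = BYTES_PER_FLOAT32) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return num_params * bytes_per_param / 2**30


def per_replica_weight_gib(num_params: int, num_replicas: int) -> float:
    """Per-replica weight memory under mirrored (data-parallel) training.

    Every replica holds a full copy of the variables, so the result
    does not depend on num_replicas.
    """
    return weight_memory_gib(num_params)


params = 5_000_000_000  # the 5-billion-parameter model from the question
print(round(weight_memory_gib(params), 1))          # weights only, in GiB
print(per_replica_weight_gib(params, 1) == per_replica_weight_gib(params, 8))
```

This figure covers the weights only; gradients, optimizer state, activations, and any data held in memory by the input pipeline come on top of it, so the real per-replica requirement is higher.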