You are using Dataproc with a combination of on-demand and preemptible workers for cost optimization. Some jobs are taking longer than expected because tasks are being interrupted when preemptible instances are shut down. What should you do to improve job reliability while minimizing costs?