Scenario: Your company offers a machine learning model as a stateless web API. The API currently runs on a single Google Kubernetes Engine (GKE) cluster in the asia-southeast1 region. Your company has started attracting customers from Europe, and you want to reduce latency for those European users while ensuring high availability.

Question: What should you do?