You hold the role of Operations Lead during an ongoing incident involving one of your services. Typically, the service operates at approximately 70% capacity. However, you've observed that one specific node is consistently returning 5xx errors for all incoming requests, and there has been a noticeable surge in customer support cases. Your objective is to remove the problematic node from the load balancer pool to isolate and investigate the issue, all while adhering to Google-recommended practices to effectively manage the incident and minimize user impact. What should you do?