Outage in Google Cloud

Global: Calico enabled GKE clusters’ pods may get stuck Terminating or Pending after upgrading to 1.22+

Resolved Minor

September 15, 2022 - Started almost 3 years ago - Lasted 7 days

Outage Details

Summary: Global: Calico-enabled GKE clusters’ Pods may get stuck terminating after upgrading to 1.22+

Description: GKE clusters that run version 1.22 or later and use Calico Network Policy might experience issues with terminating Pods under some conditions. Our engineering team continues to investigate the issue and is qualifying a potential mitigation for release to the Rapid channel 1.24. Once qualification is complete, we will expedite the backport of the fix to 1.22. We will provide an update by Friday, 2022-09-16 15:00 US/Pacific with current details. We apologize to all who are affected by the disruption.

Diagnosis: The Calico CNI plugin reports the following error when terminating Pods:

Warning FailedKillPod 36m (x389 over 121m) kubelet error killing pod: failed to "KillPodSandbox" for "af9ab8f9-d6d6-4828-9b8c-a58441dd1f86" with KillPodSandboxError: "rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod "myclient-pod-6474c76996" network: error getting ClusterInformation: connection is unauthorized: Unauthorized"

Workaround: Affected customers may try the following:

1. Restart the kubelet and calico-node, which can help get the Pods unstuck.
2. Disable the Calico network policy.

Workaround #1 is recommended; workaround #2 is only viable if the customer does not have a strong need for Calico.
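To check whether a cluster shows the symptom described under Diagnosis, one option is to scan Pod event messages for the quoted error. The following is a minimal sketch, not part of Google's guidance: the function names are hypothetical, the marker strings are copied from the error text above, and it assumes the event messages have already been collected as plain strings (for example from the output of kubectl describe pod).

```python
import re

# Marker strings copied from the error quoted in the Diagnosis section.
SANDBOX_MARKER = "KillPodSandbox"
AUTH_MARKER = "error getting ClusterInformation: connection is unauthorized"

# Captures the Pod name from 'failed to teardown pod "<name>" network'.
# The optional backslashes tolerate messages whose quotes are escaped (an assumption).
POD_NAME = re.compile(r'failed to teardown pod \\?"([^"\\]+)\\?" network')


def matches_calico_teardown_error(message: str) -> bool:
    """Return True if an event message looks like the error from this incident."""
    return SANDBOX_MARKER in message and AUTH_MARKER in message


def affected_pods(messages):
    """Return the sorted, de-duplicated Pod names found in matching messages."""
    pods = set()
    for message in messages:
        if matches_calico_teardown_error(message):
            found = POD_NAME.search(message)
            if found:
                pods.add(found.group(1))
    return sorted(pods)


if __name__ == "__main__":
    example = (
        'error killing pod: failed to "KillPodSandbox" for '
        '"af9ab8f9-d6d6-4828-9b8c-a58441dd1f86" with KillPodSandboxError: '
        '"rpc error: code = Unknown desc = networkPlugin cni failed to teardown pod '
        '"myclient-pod-6474c76996" network: error getting ClusterInformation: '
        'connection is unauthorized: Unauthorized"'
    )
    print(affected_pods([example, "Successfully pulled image"]))
```

Matching on two substrings rather than the whole message keeps the check independent of the Pod UID, timestamps, and repeat counts, which differ for every event.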

Components affected

Google Kubernetes Engine
