Manage availability
The Availability section on Kubernetes Overview answers one question: is your infrastructure currently able to serve user traffic? It flags things that exist on paper but aren’t actually available.
Availability checks identify workloads and nodes that are down or unable to serve traffic.

Click View detail on any tile to see the affected items listed under Detail view at the bottom of the page.
Zero replica deployments
These are deployments that are configured to run at least one replica but have zero available replicas running. The workload is fully down. This excludes deployments intentionally scaled to zero.
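The rule above can be illustrated with a minimal sketch. It assumes the Deployment is available as a Python dict shaped like the output of kubectl get deployment -o json; the function name is hypothetical, and this is not necessarily how Kubernetes Monitoring evaluates the check internally.

```python
def is_zero_replica_deployment(deployment: dict) -> bool:
    """Return True when at least one replica is desired but none are available."""
    spec = deployment.get("spec", {})
    status = deployment.get("status", {})

    # Kubernetes defaults spec.replicas to 1 when the field is not set.
    desired = spec.get("replicas")
    if desired is None:
        desired = 1

    # status.availableReplicas is omitted from the object when it is zero.
    available = status.get("availableReplicas") or 0

    # Deployments intentionally scaled to zero (desired == 0) are excluded.
    return desired >= 1 and available == 0
```

A Deployment with spec.replicas set to 0 returns False here, which matches the exclusion for workloads intentionally scaled to zero.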
Deployment rollout issues
These are deployments whose rollout has one of these conditions:
Not Progressing means the deployment controller has not made progress within the deadline.
Replica Failure means at least one replica Pod could not be created or deleted.
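As a rough illustration, both conditions can be read from the status.conditions list of a Deployment object. The sketch below assumes that Not Progressing corresponds to the Kubernetes Progressing condition with status False, and that Replica Failure corresponds to the ReplicaFailure condition with status True. The function name and the returned labels are hypothetical.

```python
def rollout_issues(deployment: dict) -> list[str]:
    """Return the rollout problems reported in a Deployment's status conditions."""
    issues = []
    for cond in deployment.get("status", {}).get("conditions", []):
        ctype = cond.get("type")
        cstatus = cond.get("status")
        # Progressing=False is set when the progress deadline is exceeded.
        if ctype == "Progressing" and cstatus == "False":
            issues.append("Not Progressing")
        # ReplicaFailure=True is set when a replica Pod could not be created or deleted.
        elif ctype == "ReplicaFailure" and cstatus == "True":
            issues.append("Replica Failure")
    return issues
```

A healthy rollout returns an empty list, so the result can be used directly as a truth test.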
Nodes not ready
These are Nodes where the Ready condition is False or Unknown. A NotReady node prevents new Pods from being scheduled and may disrupt running workloads. The Status column distinguishes a confirmed NotReady state from a transient Unknown state (meaning the node is unreachable).
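The distinction between False and Unknown can be sketched as follows, assuming the Node is a Python dict shaped like the output of kubectl get node -o json. The function names are hypothetical, and treating a missing Ready condition as Unknown is an assumption of this sketch rather than documented behavior.

```python
def node_ready_status(node: dict) -> str:
    """Return the status of the Node's Ready condition: "True", "False", or "Unknown"."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status", "Unknown")
    # Assumption: a Node that reports no Ready condition is treated as Unknown.
    return "Unknown"


def is_node_not_ready(node: dict) -> bool:
    """Return True when the Ready condition is False or Unknown."""
    return node_ready_status(node) in ("False", "Unknown")
```

Keeping the raw status separate from the boolean check lets a caller tell a confirmed NotReady state apart from an unreachable Node.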
Common causes: kubelet crash or failure to report status; Node running out of memory, disk, or PIDs; network connectivity loss between the Node and the control plane; underlying VM or hardware failure; expired Node certificates; kernel or OS-level crash.
To resolve: check kubelet logs and Node events. Restart the kubelet, free up Node resources, restore network connectivity, renew certificates, or replace the failed Node.
Pods not ready
These are Pods in the Running phase that are failing their readiness probe. They are excluded from Service endpoints and are not receiving traffic.
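A minimal sketch of this check, assuming the Pod is a Python dict shaped like the output of kubectl get pod -o json. The function name is hypothetical. The sketch inspects the Pod-level Ready condition, which can also be False for reasons other than a failing readiness probe (for example, an unmet readiness gate), so it approximates the rule described above.

```python
def is_running_but_not_ready(pod: dict) -> bool:
    """Return True for a Pod in the Running phase whose Ready condition is not True."""
    status = pod.get("status", {})
    if status.get("phase") != "Running":
        return False
    for cond in status.get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status") != "True"
    # Assumption: a Running Pod that reports no Ready condition counts as not ready.
    return True
```

Pods in the Pending, Succeeded, or Failed phase return False because the rule only covers Pods that are Running.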