Tasks

Tasks
Example Task Template
Extend kubectl with plugins
Manage HugePages
Schedule GPUs
Manage Memory, CPU, and API Resources
Access Clusters Using the Kubernetes API
Access Services Running on Clusters
Advertise Extended Resources for a Node
Autoscale the DNS Service in a Cluster
Change the Reclaim Policy of a PersistentVolume
Change the default StorageClass
Cluster Management
Configure Default CPU Requests and Limits for a Namespace
Configure Default Memory Requests and Limits for a Namespace
Configure Memory and CPU Quotas for a Namespace
Configure Minimum and Maximum CPU Constraints for a Namespace
Configure Minimum and Maximum Memory Constraints for a Namespace
Configure Multiple Schedulers
Configure Out Of Resource Handling
Configure Quotas for API Objects
Configure a Pod Quota for a Namespace
Control CPU Management Policies on the Node
Customizing DNS Service
Debugging DNS Resolution
Declare Network Policy
Developing Cloud Controller Manager
Encrypting Secret Data at Rest
Guaranteed Scheduling For Critical Add-On Pods
IP Masquerade Agent User Guide
Kubernetes Cloud Controller Manager
Limit Storage Consumption
Namespaces Walkthrough
Operating etcd clusters for Kubernetes
Persistent Volume Claim Protection
Reconfigure a Node's Kubelet in a Live Cluster
Reserve Compute Resources for System Daemons
Romana for NetworkPolicy
Safely Drain a Node while Respecting Application SLOs
Securing a Cluster
Set Kubelet parameters via a config file
Set up High-Availability Kubernetes Masters
Set up a High-Availablity Etcd Cluster With Kubeadm
Share a Cluster with Namespaces
Static Pods
Storage Object in Use Protection
Use Calico for NetworkPolicy
Use Cilium for NetworkPolicy
Use Kube-router for NetworkPolicy
Using CoreDNS for Service Discovery
Using Sysctls in a Kubernetes Cluster
Using a KMS provider for data encryption
Weave Net for NetworkPolicy

Edit This Page

Troubleshoot Clusters

This doc is about cluster troubleshooting; we assume you have already ruled out your application as the root cause of the problem you are experiencing. See the application troubleshooting guide for tips on application debugging. You may also visit troubleshooting document for more information.

Listing your cluster

The first thing to debug in your cluster is if your nodes are all registered correctly.

Run

kubectl get nodes

And verify that all of the nodes you expect to see are present and that they are all in the Ready state.

Looking at logs

For now, digging deeper into the cluster requires logging into the relevant machines. Here are the locations of the relevant log files. (note that on systemd-based systems, you may need to use journalctl instead)

Master

Worker Nodes

A general overview of cluster failure modes

This is an incomplete list of things that could go wrong, and how to adjust your cluster setup to mitigate the problems.

Root causes:

Specific scenarios:

Mitigations:

Analytics

Create an Issue Edit this Page