How to Debug and Troubleshoot Common Problems in Kubernetes Deployments

Kubernetes is the reigning market leader when it comes to container orchestration! Any organization working with the container ecosystem is either already using Kubernetes or considering it. However, despite the undoubted ease and speed Kubernetes bring to the container ecosystem, they also need specialized expertise to deploy and manage.

Many organizations consider the DIY approach to Kubernetes and if you have an in-house IT team with the requisite experience or if your requirements are large enough to justify the cost of hiring a dedicated Kubernetes team – then an internal Kubernetes strategy could certainly be beneficial.

However, if you don’t fall in the category mentioned above, then managed Kubernetes is the smartest and most cost-effective way ahead. With professionals in the picture, you can be assured of getting long term strategy, seamless implementation, and dedicated on-going service, which will

  • reduce deployment time
  • provide 24×7 support
  • handle all upgrades and fixes
  • troubleshoot as and when needed

Kubernetes solution providers offer a wide range of services – from fully managed to bare bone implementation to preconfigured Kubernetes environments on SaaS models to training for your in-house staff.

Look at your operational needs and your budget and explore the market for Kubernetes services options before you pick the service and the digital partner that ticks all your boxes.   

Meanwhile, do look at our tutorial on troubleshooting Kubernetes deployments.

Kubernetes deployments issues are not always easy to troubleshoot. In some cases, the errors can be resolved easily, and, in some cases, detecting errors requires us to dig deeper and run various commands to identify and resolve the issues.

The first step is to list down all pods after installing your application. The following command lists down all pods in all namespaces.

kubectl get pods -A

If you find any issues on the pod status, you can then use kubectl describe, kubectl logs, kubectl exec commands to get more detailed information.

Debugging Pods
Pod Status Shows ImagePullBackOff or ErrImagePull

This status indicates that your pod could not run because the pod could not pull the image from the container registry. To confirm this, run the kubectl describe command along with the pod identifier to display the details.

kubectl describe pod <pod-identifier>

This command will provide more information about the issue.

  • Image name or tag incorrect.
    • Check the image name and tag and try to pull the image manually on the host using docker pull to verify.
  • Authentication failure related to Container registry.
    • Check the secrets, roles, service principal related to your container registry and try to pull the image manually on the host using docker pull to verify.
docker pull <image-name:tag> 
Pod Status Shows Waiting

This status indicates your pod has been scheduled to a worker node, but it can’t run on that machine. To confirm this, run the kubectl describe command along with the pod identifier to display the details.

kubectl describe pod <pod-identifier> -n <namespace>

The most common causes related to this issue are

  • Image name or tag incorrect.
    • Check the image name and tag and try to pull the image manually on the host using docker pull to verify.
  • Authentication failure related to Container registry.
    • Check the secrets, roles, service principal related to your container registry and try to pull the image manually on the host using docker pull to verify.
Pod Status Shows Pending or CrashLoopBackOff

This status indicates your pod could not be scheduled on a node for various reasons like resource constraints (insufficient CPU or memory resources), volume mounting issues.  To confirm this, run the kubectl describe command along with the pod identifier to display the details.

kubectl describe pod <pod-identifier> -n <namespace>

This command will provide more information about the issue. Most common issues are

  • Insufficient resources
    • If resources are insufficient, clean up your existing resources or scaling your nodes (vertically or horizontally) to increase the resources.
  • Volume mounting
    • Check your volume’s mounting definition and storage classes.
  • Using hostPort
    • When you bind a Pod to a hostPort, there are a limited number of places that a pod can be scheduled. In most cases, hostPort is unnecessary, try using a Service object to expose your pod. If you do require hostPort, then you can only schedule as many Pods as there are nodes in your Kubernetes cluster
Pod is crashing or unhealthy

Sometimes the scheduled pods are crashing or unhealthy.  Run kubectl logs to find the root cause.

kubectl logs <pod_identifier> -n <namespace>

If you have multiple containers, run the following command to find the root cause.

kubectl logs <pod_identifier> -c <container_name> -n <namespace>

If your container has previously crashed, you can access the previous container’s crash log with:

kubectl logs –previous <pod_identifier> -c <container_name> -n <namespace>

If your pod is running but with 0/1 ready state or 0/2 ready state (in case if you have multiple containers in your pod), then you need to verify the readiness. Check the health check (readiness probe) in this case.

Most common issues are

  • Application issues
    • Run the below command to check the logs.
               kubectl logs <pod_identifier> -c <container_name> -n <namespace>
  • Run the below command to verify the events.
               kubectl describe <pod_identifier> -n <namespace>
  • Readiness probe health check failed
    • Check the health check (readiness probe) in this case. Also, check the READY column of the kubectl get pods output to find out if the readiness probe is executing positively.
    • Run the below command to check the logs.
         kubectl logs <pod_identifier> -c <container_name> -n <namespace>
  • Run the below command to verify the events.
         kubectl describe <pod_identifier> -n <namespace>
  • Liveness probe health check failed
    • Check the health check (liveness probe) in this case. Also, check the RESTARTS column of the kubectl get pods output. To find out if the liveness probe is executing positively.
    • Run the below command to check the logs.
         kubectl logs <pod_identifier> -c <container_name> -n <namespace>
  • Run the below command to verify the events.
         kubectl describe <pod_identifier> -n <namespace>
Pod is running but has application issues

In some cases, the pods are running, but the output of the application is incorrect. In this case, you should run the following to find the root cause.

  • Run the below command and identify the issue.
kubectl logs <pod_identifier> -c <container_name> -n <namespace>
  • If you are interested in the last n lines of logs run
kubectl logs <pod_identifier> -c <container_name> --tail <n-lines> -n <namespace>
  • Run the commands inside the container using
kubectl exec <pod_identifier> -c <container_name> /bin/bash -n <namespace>

Run the commands like ‘curl’ or ‘ps’ ‘ls’ to troubleshoot the issue after you get into the container.

Pod is running and working but cannot access through services

In some cases, the pods are working as expected but cannot access through the services. Most common causes of this issue are

  • Service not registered properly
    • Check that the service exists and describe the service and validate the pod selectors to run the following commands.
kubectl get svc
kubectl describe svc <svc-name>
kubectl get endpoints
  • Run the following commands to verify pod selector
kubectl get pods --selector=name={name},{label-name}={label-value}
  • The service may be deployed in a different namespace.
    • Verify that the pod’s containerPort matches up with the Service’s containerPort
  • Service is registered properly but has a DNS issue
    • Get into the container using exec command and run nslookup using the following command
kubectl get endpoints
kubectl exec <pod_identifier> -c <container_name> /bin/bash
nslookup <service-name>
  • If you have any issues to run the command for curl or nslookup. Deploy debugging pod using image yauritux/busybox-curl in the same namespace to verify. Please run the following commands to verify
kubectl run --generator=run-pod/v1 -it --rm <name> --image=yauritux/busybox-curl -n <namespace>
  • Run the following to verify within the container
curl http://<servicename>
telnet <service-ip> <service-port>
nslookup <servicename>

Share this:

CloudIQ is a leading Cloud Consulting and Solutions firm that helps businesses solve today’s problems and plan the enterprise of tomorrow by integrating intelligent cloud solutions. We help you leverage the technologies that make your people more productive, your infrastructure more intelligent, and your business more profitable. 

US

626 120th Ave NE, B102, Bellevue,

WA, 98005.

 sales@cloudiqtech.com

INDIA

No. 3 & 4, Venkateswara Avenue,Bazaar Main Rd, Ramnagar South, Madipakkam, Chennai – 600091


© 2019 CloudIQ Technologies. All rights reserved.