Monitoring, Logging, Troubleshooting

Metrics

Two popular Kubernetes monitoring/metrics solutions are the Kubernetes Metrics Server and Prometheus.

Metrics Server is a cluster-wide aggregator of resource usage data - a relatively new feature in Kubernetes.
Prometheus, a CNCF (Cloud Native Computing Foundation) project, can also be used to scrape resource usage from different Kubernetes components and objects. Using its client libraries, we can also instrument the code of our application. Grafana is often used to build dashboards on top of these metrics.
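
The data aggregated by Metrics Server can be read with kubectl top. A minimal sketch, assuming Metrics Server is already deployed in the cluster and kubectl is configured; the function name show_usage is hypothetical:

```shell
# Hypothetical helper: print current resource usage as reported by Metrics Server.
# Assumes Metrics Server is running; without it, kubectl top returns an error.
show_usage() {
  # CPU and memory usage per node
  kubectl top nodes
  # CPU and memory usage per pod in all namespaces, sorted by memory
  kubectl top pods --all-namespaces --sort-by=memory
}
```

Calling show_usage prints one table for the nodes and one for the pods.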

Logging

At pod level
kubectl logs <pod-id> → first container only!
kubectl logs <pod-name> --container <container-name> → a specific container
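
Two more flags are often useful here: --tail limits the number of lines, and --previous shows the logs of the previous instance of a restarted container. A minimal sketch, assuming kubectl is configured; the function name recent_logs and the limit of 100 lines are illustrative choices:

```shell
# Hypothetical helper: show recent logs of one container of a pod.
# Usage: recent_logs <pod-name> <container-name>
recent_logs() {
  # Last 100 lines of the currently running container
  kubectl logs "$1" --container "$2" --tail=100
  # Last 100 lines of the previous instance; this call fails if the
  # container has never restarted, so the error is ignored
  kubectl logs "$1" --container "$2" --previous --tail=100 || true
}
```

Adding -f to a kubectl logs call streams new log lines as they arrive.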

At cluster level: EFK stack (Elasticsearch, fluentd, Kibana)
Kubernetes does not provide cluster-wide logging by default, so third-party tools are required to centralize and aggregate cluster logs.
A common way to collect the logs is to run fluentd, with a custom configuration, as an agent on the nodes and ship the logs to Elasticsearch. fluentd is an open source data collector and is also a CNCF project.

OpenTracing provides vendor-neutral APIs and instrumentation for distributed tracing. Jaeger is a well-known tracing system that implements OpenTracing.

kubectl exec

kubectl exec -it <pod-id> -- /bin/sh
kubectl exec <pod-id> -- /bin/sh -c 'cat /foo/bar.txt'
→ both commands run in the first container of the given pod

kubectl exec <pod-id> -c <container-name> -- /bin/sh -c 'cat /foo/bar.txt' → a specific container

kubectl cp

kubectl cp foo.txt my-pod:/bar/foo.txt
→ but remember that pods are ephemeral: the copied file is lost when the pod is recreated, unless the target path is on a persistent volume.

Troubleshooting cluster DNS

Command line

high level: nslookup → nslookup google.com
low level (to display DNS records): dig → dig ANY google.com

First Kubernetes diagnostic
Use the dnsutils image from gcr.io/kubernetes-e2e-test-images:
kubectl run -it dnsutils --image gcr.io/kubernetes-e2e-test-images/dnsutils:1.3
then, inside the pod: nslookup kubernetes
If the name can't be resolved, cluster DNS is not working!
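
The same check can also be run from outside the pod with kubectl exec. A minimal sketch, assuming a pod named dnsutils is still running and kubectl is configured; the function name dns_check is hypothetical:

```shell
# Hypothetical helper: run basic DNS checks from inside a diagnostic pod.
# Usage: dns_check [pod-name]   (pod name defaults to dnsutils)
dns_check() {
  pod="${1:-dnsutils}"
  # Resolve the API server Service through cluster DNS
  kubectl exec "$pod" -- nslookup kubernetes.default
  # Show which nameserver and search domains the pod actually uses
  kubectl exec "$pod" -- cat /etc/resolv.conf
}
```

If the lookup fails, the nameserver line in /etc/resolv.conf shows which DNS service address the pod is trying to reach.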

Restart Kubernetes DNS
kubectl delete pod -n kube-system -l k8s-app=kube-dns
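
After the delete, the DNS Deployment recreates its pods. A minimal sketch for checking that they came back, assuming the DNS pods carry the label k8s-app=kube-dns and the DNS Service is named kube-dns (the usual defaults, but still an assumption about the cluster); the function name dns_status is hypothetical:

```shell
# Hypothetical helper: inspect the cluster DNS components in kube-system.
# Assumes the default label k8s-app=kube-dns and a Service named kube-dns.
dns_status() {
  # Are the DNS pods Running and Ready?
  kubectl get pods -n kube-system -l k8s-app=kube-dns
  # Does the DNS Service exist?
  kubectl get service -n kube-system kube-dns
  # Last 20 log lines from each DNS pod
  kubectl logs -n kube-system -l k8s-app=kube-dns --tail=20
}
```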
