# Kubernetes troubleshooting

## Cluster components

```sh
kubectl get componentstatus
```
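
`componentstatus` has been deprecated since Kubernetes v1.19. As a rough sketch of an alternative, the API server health endpoints and the control-plane pods can be queried directly:

```sh
# Aggregated health checks of the API server
kubectl get --raw='/readyz?verbose'

# State of the control-plane pods themselves
kubectl -n kube-system get pods
```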

## Run a new debug pod

```sh
kubectl -n varac run -it --rm debug --image=docker.io/nicolaka/netshoot --restart=Never -- sh
```
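
Inside the netshoot shell the usual network tooling (`dig`, `curl`, `traceroute`, ...) is available. A quick sketch of typical checks, with `my-svc`/`my-ns` as placeholder service and namespace names:

```sh
# DNS resolution via cluster DNS
nslookup kubernetes.default.svc.cluster.local
dig my-svc.my-ns.svc.cluster.local

# Reachability of a service from inside the cluster
curl -v http://my-svc.my-ns.svc.cluster.local
traceroute my-svc.my-ns.svc.cluster.local
```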

## Run an ephemeral container in an existing pod

```sh
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot
```
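
If the cluster does not support ephemeral containers, `kubectl debug` can instead create a debuggable copy of the pod. A sketch, with `headscale-debug` as an example name for the copy:

```sh
# Copy the pod, add a netshoot container and share the process namespace
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it \
  --image=nicolaka/netshoot --copy-to=headscale-debug --share-processes
```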

To additionally attach to the process namespace of a running container, e.g. to access its process list, target that container:

```sh
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot --target=headscale
```
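
With `--target` the debug container shares the target's process namespace, so its processes and, via `/proc`, its filesystem become visible. A small sketch, assuming the target's main process is PID 1 in the shared namespace:

```sh
# Processes of the target container
ps aux

# Peek into the target's root filesystem through /proc
ls /proc/1/root/etc/
```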

Create a dedicated namespace and run netshoot:

```sh
kubectl create namespace tmp
kubectl -n tmp run tmp-shell --rm -i --tty --image nicolaka/netshoot
kubectl -n tmp run netshoot-tmp --attach=true --rm -i --tty --image nicolaka/netshoot
```

Exec into an already running netshoot container:

```sh
kubectl -n tmp exec -it netshoot-tmp -- sh
```
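
When finished, the temporary namespace can be removed again:

```sh
kubectl delete namespace tmp
```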

## Network

### netshoot image

- For general netshoot docs see ../container/troubleshooting.md
- Netshoot with Kubernetes
- K8s docs: Debug Running Pods
- k9s plugin
- kubectl plugin

## Delete evicted pods

After `DiskPressure` occurred due to a full disk, hundreds of pods got evicted but still showed up after the `DiskPressure` condition had recovered.

Delete all evicted pods with:

```sh
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.reason != null)
      | select(.status.reason | contains("Evicted"))
      | .metadata.name + " " + .metadata.namespace' \
  | xargs -n2 -l bash -c 'kubectl delete pods $0 --namespace=$1'
```
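
A shorter alternative: evicted pods end up in phase `Failed`, so a field selector works as well. Note that this removes all failed pods, not only evicted ones:

```sh
kubectl delete pods --all-namespaces --field-selector=status.phase=Failed
```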