# Kubernetes troubleshooting

## Cluster components

```sh
kubectl get componentstatus
```
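
`componentstatus` has been deprecated since Kubernetes v1.19. As a rough sketch of an alternative, the API server health endpoints and the control-plane pods can be queried directly:

```sh
# Aggregated health checks of the API server
kubectl get --raw='/readyz?verbose'

# State of the control-plane pods themselves
kubectl -n kube-system get pods
```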

## Run a new debug pod

```sh
kubectl -n varac run -it --rm debug --image=docker.io/nicolaka/netshoot --restart=Never -- sh
```
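
Inside the netshoot shell the usual network tooling (`dig`, `curl`, `traceroute`, ...) is available. A quick sketch of typical checks, with `my-svc`/`my-ns` as placeholder service and namespace names:

```sh
# DNS resolution via cluster DNS
nslookup kubernetes.default.svc.cluster.local
dig my-svc.my-ns.svc.cluster.local

# Reachability of a service from inside the cluster
curl -v http://my-svc.my-ns.svc.cluster.local
traceroute my-svc.my-ns.svc.cluster.local
```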

## Run an ephemeral container in an existing pod

```sh
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot
```
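
If the cluster does not support ephemeral containers, `kubectl debug` can instead create a debuggable copy of the pod. A sketch, with `headscale-debug` as an example name for the copy:

```sh
# Copy the pod, add a netshoot container and share the process namespace
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it \
  --image=nicolaka/netshoot --copy-to=headscale-debug --share-processes
```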

To additionally attach to the process namespace of a running container, e.g. to access its process list, target that container:

```sh
kubectl -n headscale debug headscale-6646bf6ffd-tpzl7 -it --image=nicolaka/netshoot --target=headscale
```
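
With `--target` the debug container shares the target's process namespace, so its processes and, via `/proc`, its filesystem become visible. A small sketch, assuming the target's main process is PID 1 in the shared namespace:

```sh
# Processes of the target container
ps aux

# Peek into the target's root filesystem through /proc
ls /proc/1/root/etc/
```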

Create a dedicated namespace and run netshoot:

```sh
kubectl create namespace tmp
kubectl -n tmp run tmp-shell --rm -i --tty --image nicolaka/netshoot
kubectl -n tmp run netshoot-tmp --attach=true --rm -i --tty --image nicolaka/netshoot
```

Exec into an already running netshoot container:

```sh
kubectl -n tmp exec -it netshoot-tmp -- sh
```
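
When finished, the temporary namespace can be removed again:

```sh
kubectl delete namespace tmp
```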

## Network

### netshoot image

- For general netshoot docs see ../container/troubleshooting.md
- Netshoot with Kubernetes
- K8s docs: Debug Running Pods
- k9s plugin
- kubectl plugin

## Delete evicted pods

After `DiskPressure` occurred due to a full disk, hundreds of pods got evicted but still showed up after the `DiskPressure` condition had recovered.

Delete all evicted pods with:

```sh
kubectl get pods --all-namespaces -o json \
  | jq -r '.items[]
      | select(.status.reason != null)
      | select(.status.reason | contains("Evicted"))
      | .metadata.name + " " + .metadata.namespace' \
  | xargs -n2 -l bash -c 'kubectl delete pods $0 --namespace=$1'
```
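
A shorter alternative: evicted pods end up in phase `Failed`, so a field selector works as well. Note that this removes all failed pods, not only evicted ones:

```sh
kubectl delete pods --all-namespaces --field-selector=status.phase=Failed
```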