Kubernetes Network Troubleshooting

Commands learned from the 2026-02-23 NodePort troubleshooting session.

Quick Diagnostics

Check Service Endpoints

kubectl get endpoints -A | grep <service-name>

Check IngressRoutes (Traefik)

kubectl get ingressroute -n monitoring -o custom-columns=NAME:.metadata.name,HOST:.spec.routes[0].match

Cilium BPF Debugging

Service List

kubectl -n kube-system exec ds/cilium -- cilium service list | grep <port>

BPF Load Balancer Map

kubectl -n kube-system exec ds/cilium -- cilium bpf lb list | head -30

Filter by Port

kubectl -n kube-system exec ds/cilium -- cilium bpf lb list | awk '/32503|32327/ {print $0}'

Routable vs Non-Routable Analysis

kubectl -n kube-system exec ds/cilium -- cilium bpf lb list | awk '
/NodePort/ {
  if (/non-routable/) print "NON-ROUTABLE:", $1, $2
  else print "ROUTABLE:    ", $1, $2
}'

Monitor Drops in Real-Time

kubectl -n kube-system exec ds/cilium -- cilium monitor --type drop
The output is noisy; press Ctrl+C to stop.

NAT Table

kubectl -n kube-system exec ds/cilium -- cilium bpf nat list | grep <port>

Cilium Status

kubectl -n kube-system exec ds/cilium -- cilium status --brief
kubectl -n kube-system exec ds/cilium -- cilium status | grep -i device

Cilium Config

kubectl -n kube-system get cm cilium-config -o yaml | grep -E "node|port|device|tunnel|bpf-lb"

Traffic Control (tc) Filters

Cilium's BPF programs attach to interfaces via tc. If the filter list below comes back empty, Cilium isn't processing traffic on that interface.

tc filter show dev eth0 ingress
tc filter show dev eth0 egress

Packet Capture

On Node (tcpdump)

sudo tcpdump -i eth0 port 32503 -c 10

On pfSense

ssh pfsense-01 "tcpdump -ni vtnet1 port 32503 -c 5"

Connectivity Testing

From Workstation

nc -zv 10.50.1.120 32503
nc -zv 10.50.1.120 443

From Inside Node (with Host header)

curl -kI -H "Host: grafana.inside.domusdigitalis.dev" https://localhost:32503

Test Multiple Ports

for port in 22 80 443 3000 9090 32503; do
  printf '%s: ' "$port"
  nc -zv 10.50.1.120 "$port" 2>&1 | awk '{print $NF}'
done

Kernel Settings

Reverse Path Filter

sysctl net.ipv4.conf.all.rp_filter
sysctl net.ipv4.conf.eth0.rp_filter

If rp_filter=1 and cross-subnet traffic fails:

sudo sysctl -w net.ipv4.conf.eth0.rp_filter=0
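The sysctl -w change does not survive a reboot. A minimal sketch for persisting it, assuming a distro that reads /etc/sysctl.d (the file name is arbitrary):

```
# /etc/sysctl.d/99-rp-filter.conf  (hypothetical file name)
net.ipv4.conf.all.rp_filter = 0
net.ipv4.conf.eth0.rp_filter = 0
```

Apply without rebooting via sudo sysctl --system.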

Cilium Modes

VXLAN (Tunnel) Mode

The default for k3s with Cilium; creates an overlay network.

  • NodePort: Works via SNAT mode

  • LoadBalancer: Needs MetalLB for external IPs

  • Pod-to-pod: VXLAN encapsulated

Native Routing Mode

Direct routing without encapsulation.

  • Requires tunnel: disabled (newer Cilium releases use routing-mode: native instead)

  • Needs BGP or static routes

  • Better performance

  • NodePort works on all interfaces
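A hedged sketch of the Helm values that switch Cilium into native routing mode. The key names are from recent Cilium Helm charts and the CIDR shown is the k3s default; verify both against your chart version and cluster:

```yaml
# Assumed Helm values for the cilium chart; verify key names for your version.
routingMode: native                    # replaces the older `tunnel: disabled`
ipv4NativeRoutingCIDR: 10.42.0.0/16    # k3s default pod CIDR (assumption)
autoDirectNodeRoutes: true             # install per-node routes when nodes share an L2 segment
```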

Check Current Mode

kubectl -n kube-system get cm cilium-config -o yaml | grep -E "tunnel|routing-mode"

Common Issues

"Node Port hybrid mode cannot be used with vxlan tunneling"

Don’t use bpf-lb-mode: hybrid with VXLAN. Use snat instead:

kubectl -n kube-system patch cm cilium-config --type merge -p '{"data":{"bpf-lb-mode":"snat"}}'
kubectl -n kube-system rollout restart ds/cilium

VLAN Traffic Dropped

Cilium drops tagged VLAN traffic by default. Check with:

kubectl -n kube-system exec ds/cilium -- cilium monitor --type drop | grep VLAN
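One possible fix is Cilium's vlan-bpf-bypass option, which whitelists VLAN tags. This is a sketch: the key name and the "0" = allow-all-tags value should be verified against your Cilium version's documentation before applying.

```
# Assumption: vlan-bpf-bypass takes a comma-separated VLAN ID list; "0" allows all tags
kubectl -n kube-system patch cm cilium-config --type merge -p '{"data":{"vlan-bpf-bypass":"0"}}'
kubectl -n kube-system rollout restart ds/cilium
```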

NodePort Not Reachable

  1. Check firewall: sudo firewall-cmd --list-ports

  2. Add range: sudo firewall-cmd --add-port=30000-32767/tcp --permanent && sudo firewall-cmd --reload

  3. Check BPF: kubectl -n kube-system exec ds/cilium -- cilium bpf lb list | grep <port>

  4. Check tc filters: tc filter show dev eth0 ingress
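The BPF check in step 3 can be wrapped in a small helper. The column layout assumed here (frontend "IP:port" in column 1 of `cilium bpf lb list`) is taken from this session's output and may differ across Cilium versions:

```shell
#!/usr/bin/env sh
# Return 0 if the given NodePort appears as a frontend in `cilium bpf lb list`
# output read from stdin, 1 otherwise.
has_frontend() {  # $1 = port
  awk -v p="$1" '$1 ~ (":" p "$") { found = 1 } END { exit !found }'
}

# Usage against a live cluster (skipped when kubectl is unavailable):
if command -v kubectl >/dev/null 2>&1; then
  kubectl -n kube-system exec ds/cilium -- cilium bpf lb list \
    | has_frontend 32503 && echo "32503 in BPF LB map" || echo "32503 missing"
fi
```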

Key Learnings (2026-02-23)

  • VXLAN mode = tc filters NOT attached to eth0

  • NodePort works via iptables/SNAT, not BPF in VXLAN mode

  • LoadBalancer type on bare metal needs MetalLB

  • bpf-lb-mode: snat is required for VXLAN + NodePort

  • Standard ports (80/443) via LoadBalancer may not work without MetalLB

  • High ports (NodePort range 30000-32767) need explicit firewall rules