DNS Outage: DHCP Misconfiguration Across VLAN Interfaces
Context
Clients across multiple VLANs reported DNS resolution failures with working IP
connectivity. Root cause: pfSense DHCP handing out gateway IPs (e.g., 10.50.10.1)
as DNS servers instead of resolvers. A secondary outage was introduced mid-session
by setting all VLAN pools to 10.50.1.90, 10.50.1.1, both unreachable from
non-MGMT VLANs due to inter-VLAN firewall rules.
Session also covered: NetworkManager profile rename to Domus convention, ISE endpoint registration for a Work MacBook (MAB), and building a reusable 802.1X profile audit toolkit.
Root Cause Summary
| Interface | VLAN | Incorrect DNS (before) | Correct DNS |
|---|---|---|---|
| opt1 | DATA (10.50.10.x) | | |
| opt2 | (10.50.20.x) | | |
| opt3 | (10.50.30.x) | | |
| opt4 | IoT (10.50.40.x) | | |
| opt5 | MGMT (10.50.1.x) | | |
pfSense DNS Resolver listens on each VLAN interface IP and handles forwarding
internally: inside.domusdigitalis.dev -> BIND (10.50.1.90), external -> upstream.
Clients must use their own pfSense interface IP as DNS. BIND and 10.50.1.1 are
not reachable from non-MGMT VLANs through the inter-VLAN firewall.
Secondary outage introduced: opt2/opt3/opt4 were set to
10.50.1.90, 10.50.1.1 during the session. This broke DNS for those VLANs.
Rollback required: see Follow-ups.
Secondary Issue: pfSense DNS Resolver Not Binding on IoT Interface
After restoring 10.50.40.1 as DNS for opt4, nslookup on an IoT Windows host
returned 10.50.40.1 as server but timed out. The DNS Resolver is not listening
on the opt4 interface.
Fix:
- pfSense WebUI -> Services -> DNS Resolver -> General Settings
- Network Interfaces -> set to All (or explicitly add OPT4)
- Save -> Apply
Verify via SSH:
ssh pfsense-01 "cat /var/unbound/unbound.conf | grep interface"
Expected: should include interface: 10.50.40.1 if bound to IoT.
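The same check can be scripted so it returns a pass/fail status instead of relying on reading grep output. This is a minimal sketch, assuming bash and awk on the workstation; `unbound_listens_on` is a hypothetical helper name, and treating a wildcard `interface: 0.0.0.0` line as a match is an assumption about what the config may contain when Network Interfaces is set to All.

```shell
# Reads unbound.conf text on stdin and succeeds if Unbound is configured
# to listen on the given IP. Hypothetical helper, not part of netapi.
unbound_listens_on() {
  local ip="$1"
  awk -v ip="$ip" '
    # Assumption: a wildcard listener (0.0.0.0) also covers this IP
    $1 == "interface:" && ($2 == ip || $2 == "0.0.0.0") { found = 1 }
    END { exit (found ? 0 : 1) }
  '
}
```

Usage: `ssh pfsense-01 "cat /var/unbound/unbound.conf" | unbound_listens_on 10.50.40.1 && echo "bound to IoT"`.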
Objectives
- Diagnose DNS failure: identify DHCP misconfiguration on opt1
- Fix opt1 DHCP DNS
- Renew lease on workstation: confirm correct DNS from DHCP
- Rename NM profiles to Domus naming convention
- Rename backing .nmconnection files
- Register Work MacBook (80:3F:5D:08:37:B8) -> ISE BYOD-Registered
- Extend DHCP fix to opt2 / opt3
- ROLLBACK opt2/opt3/opt4 to correct per-VLAN pfSense IPs
- Clear hardcoded DNS from NM profiles (ipv4.dns, ignore-auto-dns)
- Verify resolution via pfSense (10.50.10.1): INCIDENT RESOLVED
- Fix pfSense DNS Resolver binding on opt4 (IoT interface)
- Verify IoT DNS resolution after Resolver fix
- Audit opt4 firewall rules: port 53 to 10.50.40.1 (self) must be allowed
- Send CoA to Work MacBook if new ISE policy not applied automatically
Commands
Phase 1: Initial DNS triage
# What nameservers did DHCP assign?
cat /etc/resolv.conf
# Generated by NetworkManager
search inside.domusdigitalis.dev
nameserver 10.50.10.1
# What does NetworkManager report for DNS?
nmcli dev show | grep -i dns
# Confirm BIND is alive
ping -c1 10.50.1.90
# Is BIND responding to queries?
dig @10.50.1.90 google.com +short
# Is pfSense DNS responding?
dig @10.50.1.1 google.com +short
# TEST: does the wrong DNS server (10.50.10.1) actually do DNS?
dig @10.50.10.1 google.com +short
PING 10.50.1.90: 1 received (5.30ms) <- reachable
142.251.214.110 <- BIND resolves correctly
142.251.214.110 <- pfSense resolves correctly
(no output) <- 10.50.10.1 does not respond to DNS -- gateway only
IP connectivity works, BIND is reachable, pfSense resolves: all fine.
The only problem is the DHCP-assigned nameserver (10.50.10.1) not being a resolver.
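The individual dig probes above can be folded into one helper that prints a pass/fail line per resolver. This is a sketch under stated assumptions: bash and dig are available, and `probe_resolvers` is a hypothetical name introduced here. It relies on dig's exit status, which is non-zero when no server replies, rather than on empty output alone.

```shell
# Query each resolver for one name and report OK/FAIL per server.
# Hypothetical helper; the server list is passed in by the caller.
probe_resolvers() {
  local name="$1" server answer
  shift
  for server in "$@"; do
    # dig exits non-zero when the server never answers; an empty answer
    # with exit 0 is also treated as a failure here
    if answer=$(dig +short +time=2 +tries=1 "@$server" "$name" 2>/dev/null) \
        && [ -n "$answer" ]; then
      printf '%-15s OK    %s\n' "$server" "${answer%%$'\n'*}"
    else
      printf '%-15s FAIL\n' "$server"
    fi
  done
}
```

Usage: `probe_resolvers google.com 10.50.1.90 10.50.1.1 10.50.10.1`.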
Phase 2: Manual DNS override (workstation temporary fix)
# Bounce WiFi to get new lease attempt (connection was Domus-Secure-802.1X at the time)
sudo nmcli con down "Domus-Secure-802.1X" && sudo nmcli con up "Domus-Secure-802.1X"
Connection 'Domus-Secure-802.1X' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/98)
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/99)
# Override DNS on wired profile -- ignore what DHCP hands out
sudo nmcli con mod "Wired-802.1X-Vault" ipv4.dns "10.50.1.90 10.50.1.1"
sudo nmcli con mod "Wired-802.1X-Vault" ipv4.ignore-auto-dns yes
sudo nmcli con down "Wired-802.1X-Vault" && sudo nmcli con up "Wired-802.1X-Vault"
| At this point the wired connection was not active, so the bounce errored. The permanent fix is correcting pfSense DHCP, not overriding DNS on each client. |
# Quick port connectivity check (separate test)
timeout 3 nc -zv 10.50.1.200 80 2>&1 || echo "http closed"
Connection to 10.50.1.200 80 port [tcp/http] succeeded!
Phase 3: pfSense DHCP audit
# Show DHCP config for DATA VLAN
netapi pfsense dhcp show opt1
DHCP Server - OPT1
Enabled Yes
Range From 10.50.10.100
Range To 10.50.10.200
Domain inside.domusdigitalis.dev
DNS Servers 10.50.10.1
Gateway 10.50.10.1
Lease Time 7200s
# Check pfSense DHCP leases for the DATA subnet
netapi pfsense dhcp-leases | grep -i "10.50.10"
# Or SSH to pfSense directly and check dhcpd config
ssh pfsense-01 "cat /var/dhcpd/etc/dhcpd.conf" | grep -A10 "subnet 10.50.10"
# Full audit across all DHCP interfaces
for iface in lan opt1 opt2 opt3 opt4 opt5; do
echo "=== $iface ==="
netapi pfsense dhcp show $iface 2>/dev/null | awk '/DNS|Enabled/'
done
=== lan ===
Enabled No
DNS Servers (not set)
=== opt1 ===
Enabled Yes
DNS Servers 10.50.1.90, 10.50.1.1
=== opt2 ===
Enabled Yes
DNS Servers 10.50.20.1
=== opt3 ===
Enabled Yes
DNS Servers 10.50.30.1
=== opt4 ===
Enabled Yes
DNS Servers 10.50.1.50, 10.50.1.1
=== opt5 ===
Enabled Yes
DNS Servers 10.50.1.1
# Detailed check for IoT and MGMT ranges/gateways
netapi pfsense dhcp show opt4 | awk '/DNS|Gateway|Range/'
netapi pfsense dhcp show opt5 | awk '/DNS|Gateway|Range/'
Range From 10.50.40.100
Range To 10.50.40.200
DNS Servers 10.50.1.50, 10.50.1.1
Gateway 10.50.40.1
Range From 10.50.1.200
Range To 10.50.1.210
DNS Servers 10.50.1.1
Gateway 10.50.1.1
Phase 4: Fix pfSense DHCP DNS (and the set-dns argument gotcha)
netapi pfsense dhcp update does not exist. Discover available
subcommands first:
netapi pfsense dhcp update --help # will fail
netapi pfsense dhcp --help # correct
Usage: netapi pfsense dhcp [OPTIONS] COMMAND [ARGS]...

DHCP Server operations

Commands:
show Show DHCP server configuration for specific interface.
set-domain Set DHCP domain for an interface.
set-dns Set DHCP DNS servers for an interface.
apply Apply pending DHCP changes.
netapi pfsense dhcp set-dns --help
Arguments:
interface TEXT Interface name (e.g., lan) [required]
dns_servers DNS_SERVERS... DNS server IPs [required]

Options:
--apply -a Apply changes immediately [default: True]
Argument format gotcha: dns_servers is a variadic positional arg, so each IP
must be its own argument. Comma-separated strings, quoted or not, fail with a 400 error:
# FAILS -- comma-separated
netapi pfsense dhcp set-dns opt1 "10.50.1.90,10.50.1.1"
# Error: [400] Field `dnsserver` must be a valid IPv4 address, received `10.50.1.90,10.50.1.1`
# FAILS -- comma with space
netapi pfsense dhcp set-dns opt1 "10.50.1.90, 10.50.1.1"
# Error: [400] Field `dnsserver` must be a valid IPv4 address, received `10.50.1.90, 10.50.1.1`
# CORRECT -- space-separated positional args, no quotes
netapi pfsense dhcp set-dns opt1 10.50.1.90 10.50.1.1
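When the server list arrives comma-separated (for example, pasted from the pfSense WebUI), it can be normalized before it reaches set-dns. This is a minimal sketch, assuming bash; `split_dns_list` is a hypothetical helper, not a netapi feature. It prints one validated IPv4 address per line and prints nothing if any entry is invalid.

```shell
# Turn "10.50.1.90, 10.50.1.1" into one validated IPv4 address per line.
# Hypothetical helper; returns non-zero (and prints nothing) on bad input.
split_dns_list() {
  local raw="$1" ip octet
  local -a parts valid=()
  IFS=', ' read -r -a parts <<< "$raw"
  for ip in "${parts[@]}"; do
    [ -n "$ip" ] || continue
    if [[ ! $ip =~ ^([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})\.([0-9]{1,3})$ ]]; then
      echo "not an IPv4 address: $ip" >&2
      return 1
    fi
    for octet in "${BASH_REMATCH[@]:1}"; do
      # 10# forces base 10 so octets with leading zeros are not read as octal
      if (( 10#$octet > 255 )); then
        echo "octet out of range in: $ip" >&2
        return 1
      fi
    done
    valid+=("$ip")
  done
  [ "${#valid[@]}" -gt 0 ] || return 1
  printf '%s\n' "${valid[@]}"
}
```

Usage: `dns=$(split_dns_list "10.50.1.90, 10.50.1.1") && netapi pfsense dhcp set-dns opt1 $dns`. Leaving `$dns` unquoted is deliberate so the shell splits it into separate positional arguments.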
# Fix opt1 (DATA VLAN)
netapi pfsense dhcp set-dns opt1 10.50.1.90 10.50.1.1
# Verify
netapi pfsense dhcp show opt1
Setting DHCP DNS servers for opt1: 10.50.1.90, 10.50.1.1
OK
Applying DHCP changes...
OK
DHCP Server - OPT1
Enabled Yes
Range From 10.50.10.100
Range To 10.50.10.200
Domain inside.domusdigitalis.dev
DNS Servers 10.50.1.90, 10.50.1.1
Gateway 10.50.10.1
Lease Time 7200s
# Extend to opt2 and opt3
# NOTE: later identified as incorrect -- should use per-VLAN pfSense IPs
# See rollback in Follow-ups
netapi pfsense dhcp set-dns opt2 10.50.1.90 10.50.1.1
netapi pfsense dhcp set-dns opt3 10.50.1.90 10.50.1.1
Setting DHCP DNS servers for opt2: 10.50.1.90, 10.50.1.1
OK
Applying DHCP changes...
OK
Setting DHCP DNS servers for opt3: 10.50.1.90, 10.50.1.1
OK
Applying DHCP changes...
OK
# Verify opt1-3
for iface in opt1 opt2 opt3; do
netapi pfsense dhcp show $iface | awk -v i="$iface" '/DNS/ {print i": "$0}'
done
opt1: DNS Servers 10.50.1.90, 10.50.1.1
opt2: DNS Servers 10.50.1.90, 10.50.1.1
opt3: DNS Servers 10.50.1.90, 10.50.1.1
| This state is incorrect for opt2/opt3. Rollback required. |
Phase 5: Renew workstation lease and verify DNS resolution
# List active connections (non-infrastructure)
nmcli con show | awk '/wired|ethernet|802.1X/ {print $1, $3}'
# Or active only
nmcli con show --active | awk 'NR>1 {print $1, $3}'
Wired-802.1X-Vault ethernet
Domus-Secure-802.1X wifi
Wired-802.1X-Vault ethernet
Domus-Secure-802.1X wifi
br-1a64fdac6aa5 bridge
...
docker0 bridge
virbr0 bridge
# Bounce wired to renew DHCP lease
sudo nmcli con down "Wired-802.1X-Vault" && sudo nmcli con up "Wired-802.1X-Vault"
# Verify nameservers
awk '/nameserver/ {print $2}' /etc/resolv.conf
Connection 'Wired-802.1X-Vault' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/100)
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/101)
10.50.1.90
10.50.1.1
# Resolution test loop
for d in google.com github.com cloudflare.com inside.domusdigitalis.dev; do
printf "%-30s %s\n" "$d" "$(dig +short $d | head -1)"
done
# Extended -- internal hosts
echo "google.com github.com vault-01.inside.domusdigitalis.dev ise-01.inside.domusdigitalis.dev" | \
tr ' ' '\n' | while read d; do
dig +short "$d" | awk -v dom="$d" 'NR==1 {printf "%-40s %s\n", dom, $0}'
done
google.com 142.251.214.110
github.com 140.82.113.4
vault-01.inside.domusdigitalis.dev 10.50.1.60
ise-01.inside.domusdigitalis.dev 10.50.1.20
Phase 6: NM profile rename to Domus convention
# List non-infrastructure connections
nmcli con show | awk 'NR>1 && !/bridge|docker|virbr|lo/ {print NR": "$1, "("$3")"}'
2: Wired-802.1X-Vault (ethernet)
3: Domus-Secure-802.1X (wifi)
# Rename connection IDs (persistent -- NM writes id= field to .nmconnection)
# Syntax: nmcli con mod "OLD-NAME" connection.id "NEW-NAME"
nmcli con mod "Wired-802.1X-Vault" connection.id "Domus-Wired-EAP-TLS"
nmcli con mod "Domus-Secure-802.1X" connection.id "Domus-WiFi-EAP-TLS"
nmcli con show | awk '/Domus/ {print $1, $3}'
Domus-Wired-EAP-TLS ethernet
Domus-WiFi-EAP-TLS wifi
# Confirm internal IDs in backing files
# NOTE: zsh glob fails on /etc/NetworkManager without sudo -- use sudo bash -c
sudo bash -c "awk -F= '/^id=/ {print FILENAME\": \"\$2}' \
/etc/NetworkManager/system-connections/*.nmconnection"
# Alternative -- find avoids glob entirely
sudo find /etc/NetworkManager/system-connections -name "*.nmconnection" \
-exec awk -F= '/^id=/ {print FILENAME": "$2}' {} \;
/etc/NetworkManager/system-connections/Domus-Secure-802.1X.nmconnection: Domus-WiFi-EAP-TLS
/etc/NetworkManager/system-connections/Wired-802.1X-Vault.nmconnection: Domus-Wired-EAP-TLS
# Rename backing files to match (cosmetic, but clean)
sudo mv /etc/NetworkManager/system-connections/Wired-802.1X-Vault.nmconnection \
/etc/NetworkManager/system-connections/Domus-Wired-EAP-TLS.nmconnection
sudo mv /etc/NetworkManager/system-connections/Domus-Secure-802.1X.nmconnection \
/etc/NetworkManager/system-connections/Domus-WiFi-EAP-TLS.nmconnection
sudo nmcli con reload
sudo ls /etc/NetworkManager/system-connections/ | awk '/Domus/'
Domus-WiFi-EAP-TLS.nmconnection
Domus-Wired-EAP-TLS.nmconnection
NM uses the id= field inside the file, not the filename. Both are now aligned.
Survives reboots.
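Because the id= field and the filename can drift apart again after later renames, a small check can flag any profile where the two differ. This is a sketch, assuming bash and awk; `check_nm_names` is a hypothetical helper, and the directory argument exists only so the function can be pointed at a copy of the profiles.

```shell
# Report .nmconnection files whose id= field differs from the filename.
# Hypothetical helper; returns non-zero if any mismatch is found.
check_nm_names() {
  local dir="${1:-/etc/NetworkManager/system-connections}" f id base rc=0
  for f in "$dir"/*.nmconnection; do
    [ -e "$f" ] || continue          # glob matched nothing
    id=$(awk -F= '/^id=/ {print $2; exit}' "$f")
    base=$(basename "$f" .nmconnection)
    if [ "$id" != "$base" ]; then
      printf 'MISMATCH %s: id=%s\n' "$(basename "$f")" "$id"
      rc=1
    fi
  done
  return "$rc"
}
```

On the live directory the function has to run as root, for example by pasting its definition into a `sudo bash` session, since the profile files are not readable by regular users.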
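Zsh glob note: adding alias sudo='noglob sudo' to .zshrc stops zsh from aborting with "no matches found" before sudo runs. The pattern then reaches the command unexpanded, so sudo bash -c (or find) is still needed wherever the glob itself must expand as root.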
Phase 7: 802.1X profile audit toolkit (reusable)
All commands require sudo bash -c '…' because zsh glob expansion fails
on /etc/NetworkManager/system-connections/ without root. Wrapping in sudo bash -c
solves this entirely.
# Quick cert/key path dump
sudo bash -c 'grep -h "cert=\|key=" \
/etc/NetworkManager/system-connections/Domus-*.nmconnection' \
| awk -F= '{print $1": "$2}'
ca-cert: /etc/ssl/certs/DOMUS-ROOT-CA.pem
client-cert: /etc/ssl/certs/modestus-razer-eaptls.pem
private-key: /etc/ssl/private/modestus-razer-eaptls.key
ca-cert: /etc/ssl/certs/DOMUS-ROOT-CA.pem
client-cert: /etc/ssl/certs/modestus-razer-eaptls.pem
private-key: /etc/ssl/private/modestus-razer-eaptls.key
# Full structured extraction -- all 802.1X fields
sudo bash -c 'awk -F= "
/^\[/ {section=\$0}
/^id=|^type=|^interface-name=/ {print \$2}
/eap=|identity=|ca-cert=|client-cert=|private-key=/ {print \" \"\$1\": \"\$2}
" /etc/NetworkManager/system-connections/Domus-*.nmconnection'
# Formatted loop -- human-readable per-profile summary
sudo bash -c '
for f in /etc/NetworkManager/system-connections/Domus-*.nmconnection; do
echo "=== $(basename $f .nmconnection) ==="
awk -F= "
/^id=/ {print \" ID: \"\$2}
/^type=/ {print \" Type: \"\$2}
/^eap=/ {print \" EAP Method: \"\$2}
/^identity=/ {print \" Identity: \"\$2}
/^ca-cert=/ {print \" CA Cert: \"\$2}
/^client-cert=/ {print \" Client Cert: \"\$2}
/^private-key=/ {print \" Private Key: \"\$2}
" "$f"
echo
done
'
=== Domus-WiFi-EAP-TLS ===
ID: Domus-WiFi-EAP-TLS
Type: wifi
CA Cert: /etc/ssl/certs/DOMUS-ROOT-CA.pem
Client Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
EAP Method: tls;
Identity: modestus-razer.inside.domusdigitalis.dev
Private Key: /etc/ssl/private/modestus-razer-eaptls.key

=== Domus-Wired-EAP-TLS ===
ID: Domus-Wired-EAP-TLS
Type: ethernet
CA Cert: /etc/ssl/certs/DOMUS-ROOT-CA.pem
Client Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
EAP Method: tls;
Identity: modestus-razer.inside.domusdigitalis.dev
Private Key: /etc/ssl/private/modestus-razer-eaptls.key
# Side-by-side diff comparison (wired vs wifi)
sudo bash -c 'paste \
<(cat /etc/NetworkManager/system-connections/Domus-Wired-EAP-TLS.nmconnection) \
<(cat /etc/NetworkManager/system-connections/Domus-WiFi-EAP-TLS.nmconnection)' \
| awk -F'\t' '
BEGIN {
printf "\033[1;36m%-25s %-35s %-35s\033[0m\n", "FIELD", "WIRED", "WIFI"
}
/^id=|^type=|^eap=|^identity=|ca-cert=|client-cert=|private-key=/ {
split($1,a,"="); split($2,b,"=")
printf "\033[33m%-25s\033[0m \033[32m%-35s\033[0m \033[34m%-35s\033[0m\n", a[1], a[2], b[2]
}'
# ANSI box display -- single awk pass over both files
sudo bash -c 'awk -F= "
BEGIN {
printf \"\033[1;37m+-------------------------------------------------------------+\n|  DOMUS 802.1X PROFILES                                      |\n+-------------------------------------------------------------+\033[0m\n\"
}
FNR==1 {printf \"\n\033[1;36m> %s\033[0m\n\", FILENAME}
/^id=/ {printf \" \033[33m*\033[0m ID: \033[1;32m%s\033[0m\n\",\$2}
/^type=/ {printf \" \033[33m*\033[0m Type: \033[34m%s\033[0m\n\",\$2}
/^eap=/ {printf \" \033[33m*\033[0m EAP: \033[35m%s\033[0m\n\",\$2}
/^identity=/ {printf \" \033[33m*\033[0m Identity: \033[32m%s\033[0m\n\",\$2}
/client-cert=/{printf \" \033[33m*\033[0m Cert: \033[36m%s\033[0m\n\",\$2}
/private-key=/{printf \" \033[33m*\033[0m Key: \033[31m%s\033[0m\n\",\$2}
" /etc/NetworkManager/system-connections/Domus-*.nmconnection'
+-------------------------------------------------------------+
|  DOMUS 802.1X PROFILES                                      |
+-------------------------------------------------------------+

> /etc/NetworkManager/system-connections/Domus-WiFi-EAP-TLS.nmconnection
 * ID: Domus-WiFi-EAP-TLS
 * Type: wifi
 * Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 * EAP: tls;
 * Identity: modestus-razer.inside.domusdigitalis.dev
 * Key: /etc/ssl/private/modestus-razer-eaptls.key

> /etc/NetworkManager/system-connections/Domus-Wired-EAP-TLS.nmconnection
 * ID: Domus-Wired-EAP-TLS
 * Type: ethernet
 * Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 * EAP: tls;
 * Identity: modestus-razer.inside.domusdigitalis.dev
 * Key: /etc/ssl/private/modestus-razer-eaptls.key
# Certificate expiry -- date string
sudo bash -c 'for f in /etc/NetworkManager/system-connections/Domus-*.nmconnection; do
name=$(basename "$f" .nmconnection)
cert=$(awk -F= "/client-cert=/{print \$2}" "$f")
exp=$(openssl x509 -enddate -noout -in "$cert" 2>/dev/null | cut -d= -f2)
printf "> %s\n Cert: %s\n Expires: %s\n\n" "$name" "$cert" "$exp"
done'
> Domus-WiFi-EAP-TLS
 Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 Expires: Feb 16 05:10:57 2027 GMT

> Domus-Wired-EAP-TLS
 Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 Expires: Feb 16 05:10:57 2027 GMT
# Certificate expiry -- days remaining (add to dotfiles as cert monitor alias)
sudo bash -c 'for f in /etc/NetworkManager/system-connections/Domus-*.nmconnection; do
cert=$(awk -F= "/client-cert=/{print \$2}" "$f")
days=$(( ($(openssl x509 -enddate -noout -in "$cert" | cut -d= -f2 \
| xargs -I{} date -d "{}" +%s) - $(date +%s)) / 86400 ))
awk -F= -v d="$days" "
FNR==1 {printf \"\n\033[1;36m> %s\033[0m\n\", FILENAME}
/^id=/ {printf \" * ID: \033[32m%s\033[0m\n\",\$2}
/^type=/ {printf \" * Type: \033[34m%s\033[0m\n\",\$2}
/client-cert=/ {printf \" * Cert: %s\n * Expires: \033[33m%d days\033[0m\n\",\$2,d}
" "$f"
done'
> /etc/NetworkManager/system-connections/Domus-WiFi-EAP-TLS.nmconnection
 * ID: Domus-WiFi-EAP-TLS
 * Type: wifi
 * Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 * Expires: 348 days

> /etc/NetworkManager/system-connections/Domus-Wired-EAP-TLS.nmconnection
 * ID: Domus-Wired-EAP-TLS
 * Type: ethernet
 * Cert: /etc/ssl/certs/modestus-razer-eaptls.pem
 * Expires: 348 days
| Cert valid until February 16, 2027. Add days-remaining one-liner to dotfiles as a cert health alias. |
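One possible shape for the cert health alias is a pair of small functions rather than a one-liner. This is a sketch, assuming bash, GNU date (for `date -d`), and openssl; `days_until` and `cert_days_left` are hypothetical names. The optional second argument of `days_until` exists only so the arithmetic can be checked against a fixed reference time.

```shell
# Whole days from a reference epoch (default: now) until an openssl-style
# date string such as "Feb 16 05:10:57 2027 GMT". Requires GNU date.
days_until() {
  local end_epoch now_epoch
  end_epoch=$(date -d "$1" +%s) || return 1
  now_epoch=${2:-$(date +%s)}
  echo $(( (end_epoch - now_epoch) / 86400 ))
}

# Days remaining on a PEM certificate file.
cert_days_left() {
  local enddate
  enddate=$(openssl x509 -enddate -noout -in "$1" 2>/dev/null | cut -d= -f2)
  [ -n "$enddate" ] || return 1
  days_until "$enddate"
}
```

Usage: `cert_days_left /etc/ssl/certs/modestus-razer-eaptls.pem`.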
Phase 8: ISE endpoint registration (Work MacBook, MAB)
# List all endpoint groups
netapi ise get-endpoint-groups
Name                      ID
BYOD-Registered           127f7b10-f95b-11f0-b76e-52c54a1d1f56
OS_X_BigSur-Workstation   aeb29380-4fbf-11ed-a871-0050568f5811
Workstation               3b76f840-8c00-11e6-996c-525400b48521
...                       ...
Total: 45 results
# Attempt create -- endpoint already existed from prior MAB auth
netapi ise create-endpoint 80:3F:5D:08:37:B8 \
--group "BYOD-Registered" \
--description "Work MacBook - MAB"
Error: Endpoint already exists: 80:3F:5D:08:37:B8 (ID: 003aa8f0-17e5-11f1-937c-aacf303a6e4d)
# Update existing endpoint
netapi ise update-endpoint 80:3F:5D:08:37:B8 \
--group "BYOD-Registered" \
--description "Work MacBook - MAB"
Updated endpoint: 80:3F:5D:08:37:B8
Description: Work MacBook - MAB
Group: BYOD-Registered
# Send CoA to apply new policy if device is currently connected
netapi ise coa reauth 80:3F:5D:08:37:B8
Phase 9: VLAN DNS rollback investigation and IoT connectivity
Users on home and IoT VLANs reported continued outages after the opt1
fix. Cause: 10.50.1.90 and 10.50.1.1 set on opt2/opt3/opt4 are unreachable
from those VLANs through inter-VLAN firewall rules.
# Check pfSense firewall rules for IoT -- is DNS to 10.50.1.90 allowed?
netapi pfsense rules list opt4 | grep -i "dns\|53\|1.90"
# Windows client on IoT VLAN -- run on the affected host
ipconfig
ping 10.50.40.1 # gateway -- must respond
ping 10.50.1.1 # MGMT pfSense -- likely blocked
ping 8.8.8.8 # internet -- was failing (routing/firewall, not DNS)
nslookup google.com
ipconfig /release && ipconfig /renew
nslookup google.com
Phase 10: wpa_supplicant auth verification
# Live supplicant log
journalctl -u wpa_supplicant --since "10 min ago" | tail -20
# Active connection check
nmcli con show --active | awk '/Wired|EAP/'
# List all physical MACs (exclude virtual interfaces)
ip link show | awk '/ether/ && !/docker|virbr|br-/ {print $2}'
# ISE auth status for a MAC
netapi ise mnt auth-status DC:8C:37:96:20:A6
wlan0: CTRL-EVENT-EAP-METHOD EAP vendor 0 method 13 (TLS) selected
wlan0: CTRL-EVENT-EAP-PEER-CERT depth=2 subject='/C=US/O=Domus Digitalis/OU=Enterprise PKI/CN=DOMUS-ROOT-CA'
wlan0: CTRL-EVENT-EAP-PEER-CERT depth=1 subject='/CN=DOMUS-ISSUING-CA'
wlan0: CTRL-EVENT-EAP-PEER-CERT depth=0 subject='/CN=ise-01.inside.domusdigitalis.dev'
wlan0: CTRL-EVENT-EAP-SUCCESS EAP authentication completed successfully
wlan0: PMKSA-CACHE-ADDED 78:bc:1a:36:82:ce 0
wlan0: WPA: Key negotiation completed with 78:bc:1a:36:82:ce [PTK=CCMP GTK=CCMP]
wlan0: CTRL-EVENT-CONNECTED - Connection to 78:bc:1a:36:82:ce completed
enp130s0: CTRL-EVENT-DISCONNECTED bssid=01:80:c2:00:00:03 reason=3 locally_generated=1
No authentication records for DC:8C:37:96:20:A6 in last 300s
LAB-3560CX-01# show access-session interface GigabitEthernet1/0/3 detail
Interface: GigabitEthernet1/0/3
MAC Address: dc8c.3796.20a6
User-Name: DC-8C-37-96-20-A6
Status: Authorized
Domain: DATA
Oper host mode: multi-auth
Current Policy: PMAP_DefaultWiredDot1xClosedAuth_1X_MAB
Method status list:
Method State
dot1x Stopped
mab Authc Success
DC:8C:37:96:20:A6 is not the Razer. The Razer was confirmed authenticated
on WiFi (wlan0) via EAP-TLS. ISE returned no auth records for that MAC because
it was a different device using MAB fallback.
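The same MAC appears above in three notations: colon-separated for ISE, dotted for the switch, and dash-separated as the MAB user name. Normalizing before comparing avoids misreading a match as a mismatch. This is a minimal sketch, assuming bash, tr, and sed; `norm_mac` is a hypothetical helper.

```shell
# Normalize a MAC address to upper-case, colon-separated form.
# Accepts colon, dash, and Cisco dotted notation. Hypothetical helper.
norm_mac() {
  local hex
  hex=$(printf '%s' "$1" | tr -d ':.-' | tr '[:lower:]' '[:upper:]')
  if [[ ! $hex =~ ^[0-9A-F]{12}$ ]]; then
    echo "not a MAC address: $1" >&2
    return 1
  fi
  # Insert a colon after every two hex digits, then drop the trailing one
  printf '%s\n' "$hex" | sed 's/../&:/g; s/:$//'
}
```

Usage: `[ "$(norm_mac dc8c.3796.20a6)" = "$(norm_mac DC:8C:37:96:20:A6)" ] && echo "same device"`.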
Phase 11: DNS DHCP rollback (restore per-VLAN pfSense gateway IPs)
The earlier fix set all VLANs to use 10.50.1.90, 10.50.1.1 which are
unreachable from non-MGMT VLANs. This phase restores the correct configuration:
each VLAN uses its own pfSense gateway IP as DNS.
Architecture reminder: pfSense DNS Resolver (Unbound) is a conditional forwarder:
- Listens on ALL VLAN interface IPs (10.50.10.1, 10.50.20.1, etc.)
- Internal queries (inside.domusdigitalis.dev) -> forwards to BIND (10.50.1.90)
- External queries (google.com, etc.) -> resolves via upstream
Clients never talk to BIND directly: pfSense handles the split.
# Diagnostic -- confirmed BIND not forwarding external, pfSense works
dig @10.50.1.90 google.com +short # returns nothing
dig @10.50.10.1 google.com +short # returns 142.251.214.110
# Rollback -- restore per-VLAN DNS
netapi pfsense dhcp set-dns opt1 10.50.10.1
netapi pfsense dhcp set-dns opt2 10.50.20.1
netapi pfsense dhcp set-dns opt3 10.50.30.1
netapi pfsense dhcp set-dns opt4 10.50.40.1
Setting DHCP DNS servers for opt1: 10.50.10.1
OK
Applying DHCP changes...
OK
Setting DHCP DNS servers for opt2: 10.50.20.1
OK
Applying DHCP changes...
OK
Setting DHCP DNS servers for opt3: 10.50.30.1
OK
Applying DHCP changes...
OK
Setting DHCP DNS servers for opt4: 10.50.40.1
OK
Applying DHCP changes...
OK
# Verify all VLANs
for iface in opt1 opt2 opt3 opt4; do
netapi pfsense dhcp show $iface | awk -v i="$iface" '/DNS/ {print i": "$0}'
done
opt1: DNS Servers 10.50.10.1
opt2: DNS Servers 10.50.20.1
opt3: DNS Servers 10.50.30.1
opt4: DNS Servers 10.50.40.1
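To keep this state from regressing unnoticed, the verify loop's output can be checked against the expected per-VLAN values. This is a sketch, assuming bash and awk; `check_vlan_dns` is a hypothetical helper, and the expected map is copied from the rollback commands in this phase (opt5 is left out because it was not part of the rollback).

```shell
# Reads lines such as "opt2: DNS Servers 10.50.20.1" on stdin and compares
# each interface against its expected per-VLAN pfSense IP.
# Hypothetical helper; returns non-zero if any interface differs.
check_vlan_dns() {
  awk '
    BEGIN {
      want["opt1"] = "10.50.10.1"; want["opt2"] = "10.50.20.1"
      want["opt3"] = "10.50.30.1"; want["opt4"] = "10.50.40.1"
    }
    {
      iface = $1; sub(/:$/, "", iface)
      if (!(iface in want)) next
      got = $0
      sub(/^.*DNS Servers[ \t]+/, "", got)
      sub(/[ \t]+$/, "", got)
      if (got == want[iface]) {
        printf "%s OK %s\n", iface, got
      } else {
        printf "%s MISMATCH got=[%s] want=[%s]\n", iface, got, want[iface]
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}
```

Usage: pipe the verify loop above into it, for example `for iface in opt1 opt2 opt3 opt4; do netapi pfsense dhcp show $iface | awk -v i="$iface" '/DNS/ {print i": "$0}'; done | check_vlan_dns`.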
# Renew workstation lease
sudo nmcli con down "Domus-Wired-EAP-TLS" && sudo nmcli con up "Domus-Wired-EAP-TLS"
Connection 'Domus-Wired-EAP-TLS' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/122)
Connection successfully activated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/123)
Phase 12: DNS deep diagnostics (intermittent wired/wireless issues)
=== Quick Status
# What interface am I on? What DNS am I using?
nmcli dev status | awk '/connected/'
awk '/nameserver/ {print "DNS: "$2}' /etc/resolv.conf
# Which connection is active?
nmcli con show --active | awk 'NR>1 && !/bridge|docker|virbr/ {print $1, $3}'
=== DNS Resolution Tests
# Basic resolution
dig google.com +short
# With timing
dig google.com | awk '/Query time|SERVER/'
# Specific server test (pfSense gateway)
dig @10.50.10.1 google.com +short
# Internal domain (should go to BIND via pfSense)
dig vault-01.inside.domusdigitalis.dev +short
=== Trace & Debug
# Full trace -- see every hop
dig google.com +trace
# Verbose with all sections
dig google.com +noall +answer +authority +additional
# Show which server answered
dig google.com +noall +answer +comments | awk '/SERVER|;.*IN/'
=== Compare Wired vs Wireless DNS
# Get DNS from each profile
for prof in "Domus-Wired-EAP-TLS" "Domus-WiFi-EAP-TLS"; do
echo "=== $prof ==="
nmcli con show "$prof" | awk '/ipv4.dns:|IP4.DNS/'
done
# What DNS did DHCP actually assign? (active connection)
nmcli dev show | awk '/IP4.DNS/ {print}'
=== Loop Test (catch intermittent failures)
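# 10 queries, 1 second apart -- watch for failures
for i in {1..10}; do
  ts=$(date +%H:%M:%S)
  # dig prints its timeout notice on stdout, so non-empty output alone does not
  # mean success; check the exit status as well (non-zero when no server replies)
  if result=$(dig +short +time=2 +tries=1 google.com 2>/dev/null) && [ -n "$result" ]; then
    echo "$ts OK: $result"
  else
    echo "$ts FAIL"
  fi
  sleep 1
done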
# Continuous monitor (Ctrl+C to stop)
while true; do
ts=$(date +%H:%M:%S)
iface=$(ip route get 8.8.8.8 2>/dev/null | awk '/dev/ {print $5}')
dns=$(awk '/nameserver/ {print $2; exit}' /etc/resolv.conf)
result=$(timeout 2 dig +short google.com 2>&1)
if [ -z "$result" ]; then
printf "%s [%s] DNS=%s \033[31mFAIL\033[0m\n" "$ts" "$iface" "$dns"
else
printf "%s [%s] DNS=%s \033[32mOK\033[0m %s\n" "$ts" "$iface" "$dns" "$result"
fi
sleep 2
done
=== Check for DNS Leakage / Wrong Server
# Is NetworkManager ignoring DHCP DNS?
nmcli con show "Domus-Wired-EAP-TLS" | awk '/ignore-auto-dns/'
nmcli con show "Domus-WiFi-EAP-TLS" | awk '/ignore-auto-dns/'
# Hardcoded DNS in profiles? (should be empty if using DHCP)
nmcli con show "Domus-Wired-EAP-TLS" | awk '/ipv4.dns:/'
nmcli con show "Domus-WiFi-EAP-TLS" | awk '/ipv4.dns:/'
# systemd-resolved status (if active)
resolvectl status 2>/dev/null | awk '/DNS Server|Current Scopes/'
=== Network Path Check
# Can I reach the DNS server?
ping -c1 -W2 $(awk '/nameserver/ {print $2; exit}' /etc/resolv.conf)
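# Is port 53 open on the DNS server? (UDP probes with nc are only a hint:
# a silently dropped packet can still be reported as success)
dns=$(awk '/nameserver/ {print $2; exit}' /etc/resolv.conf)
timeout 2 nc -zvu "$dns" 53 2>&1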
# Default route -- which interface?
ip route get 8.8.8.8 | awk '{print "via", $3, "dev", $5}'
=== Wired/Wireless Flip Detection
# Watch for interface changes (run in background)
ip monitor link | awk '/state UP|state DOWN/ {print strftime("%H:%M:%S"), $0}'
# Current autoconnect priority (higher number = preferred)
nmcli con show "Domus-Wired-EAP-TLS" | awk '/connection.autoconnect-priority/'
nmcli con show "Domus-WiFi-EAP-TLS" | awk '/connection.autoconnect-priority/'
=== Full Diagnostic Dump
# One-shot diagnostic
echo "=== Active Connections ===" && \
nmcli con show --active | awk 'NR>1 && !/bridge|docker|virbr/' && \
echo -e "\n=== DNS Servers ===" && \
awk '/nameserver/ {print $2}' /etc/resolv.conf && \
echo -e "\n=== Default Route ===" && \
ip route get 8.8.8.8 | awk '{print "via", $3, "dev", $5}' && \
echo -e "\n=== Resolution Test ===" && \
dig +short google.com && \
dig +short vault-01.inside.domusdigitalis.dev
Outcomes
- Root cause identified and fixed: opt1 DHCP had 10.50.10.1 as DNS, the DATA VLAN gateway, which is not a DNS resolver. pfSense DNS Resolver listens on each VLAN interface IP and handles forwarding internally.
- Secondary outage introduced and rolled back: opt2/opt3/opt4 were set to 10.50.1.90, 10.50.1.1, which are unreachable from non-MGMT VLANs. Rolled back to per-VLAN pfSense gateway IPs.
- NM profiles renamed: Wired-802.1X-Vault -> Domus-Wired-EAP-TLS, Domus-Secure-802.1X -> Domus-WiFi-EAP-TLS. Backing files renamed. Persistent across reboots.
- Work MacBook registered in ISE: 80:3F:5D:08:37:B8 -> BYOD-Registered.
- 802.1X audit toolkit built: cert paths, expiry date, days remaining, ANSI box display, side-by-side diff. Cert expires Feb 16 2027 (348 days).
- pfSense DNS Resolver not binding on opt4 (IoT): unresolved.
- Key lesson: netapi pfsense dhcp set-dns takes space-separated positional IPs; comma-separated strings fail with a 400 error.
Follow-ups
- ROLLBACK (completed): All VLANs restored to per-VLAN pfSense gateway IPs. See Phase 11.
- Fix pfSense DNS Resolver binding on opt4: Services -> DNS Resolver -> Network Interfaces -> All
- Verify port 53 from opt4 to 10.50.40.1 is allowed: netapi pfsense rules list opt4
- Renew DHCP on affected clients after rollback; verify with nslookup
- CoA Work MacBook if ISE policy not applied: netapi ise coa reauth 80:3F:5D:08:37:B8
- Add cert expiry alias to dotfiles
- Add alias sudo='noglob sudo' to .zshrc
Phase 13: NetworkManager hardcoded DNS cleanup
During the DNS rollback, we discovered both 802.1X profiles had hardcoded DNS settings from an earlier troubleshooting session. This overrode DHCP-assigned DNS and caused intermittent failures.
=== Root Cause Discovery
Using the continuous monitor loop (Phase 12), we caught intermittent failures:
11:35:02 [wlan0] DNS=10.50.1.90 OK 142.251.41.14
11:35:05 [enp130s0] DNS=10.50.1.90 FAIL
11:35:08 [enp130s0] DNS=10.50.1.90 FAIL
WiFi worked and wired failed, even though both used the same DNS server. The difference: profile configuration.
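A longer capture of the monitor output is easier to read as per-interface totals. This is a sketch, assuming bash, sed, and awk, and assuming the line layout printed by the Phase 12 continuous monitor (timestamp, bracketed interface, DNS=..., then OK or FAIL); `summarize_monitor` is a hypothetical helper.

```shell
# Summarize continuous-monitor lines into per-interface totals and failures.
# Hypothetical helper; reads monitor output on stdin.
summarize_monitor() {
  local esc
  esc=$(printf '\033')
  # Strip ANSI colour codes first so the OK/FAIL field compares cleanly
  sed "s/${esc}\[[0-9;]*m//g" | awk '
    NF >= 4 {
      iface = $2
      gsub(/\[|\]/, "", iface)
      total[iface]++
      if ($4 == "FAIL") fail[iface]++
    }
    END {
      for (i in total) printf "%s total=%d fail=%d\n", i, total[i], fail[i] + 0
    }
  ' | sort
}
```

Usage: save the monitor output with `tee monitor.log`, then run `summarize_monitor < monitor.log`.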
=== Diagnosis
# Check wired profile DNS settings
nmcli con show "Domus-Wired-EAP-TLS" | awk '/ipv4.dns:|ignore-auto-dns/'
ipv4.dns: 10.50.1.90,10.50.1.1
ipv4.ignore-auto-dns: yes
# Check WiFi profile DNS settings
nmcli con show "Domus-WiFi-EAP-TLS" | awk '/ipv4.dns:|ignore-auto-dns/'
ipv4.dns: 10.50.1.90,10.50.1.1
ipv4.ignore-auto-dns: yes
Both profiles had ignore-auto-dns: yes which means DHCP DNS was ignored,
and hardcoded values were used instead.
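The check for leftover overrides can be turned into a yes/no test that works on any profile. This is a minimal sketch, assuming bash and awk and the `nmcli con show` field layout shown above; `has_dns_override` is a hypothetical helper.

```shell
# Reads `nmcli con show <profile>` output on stdin and succeeds if the
# profile hardcodes DNS or ignores DHCP-supplied DNS. Hypothetical helper.
has_dns_override() {
  awk '
    $1 == "ipv4.dns:" && $2 != "" && $2 != "--" { override = 1 }
    $1 == "ipv4.ignore-auto-dns:" && $2 == "yes"  { override = 1 }
    END { exit (override ? 0 : 1) }
  '
}
```

Usage: `nmcli con show "Domus-Wired-EAP-TLS" | has_dns_override && echo "hardcoded DNS still present"`.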
=== Fix: Clear Hardcoded DNS
# Remove hardcoded DNS from wired profile
nmcli con mod "Domus-Wired-EAP-TLS" ipv4.dns ""
nmcli con mod "Domus-Wired-EAP-TLS" ipv4.ignore-auto-dns no
# Remove hardcoded DNS from WiFi profile
nmcli con mod "Domus-WiFi-EAP-TLS" ipv4.dns ""
nmcli con mod "Domus-WiFi-EAP-TLS" ipv4.ignore-auto-dns no
# Bounce connections to apply
sudo nmcli con down "Domus-Wired-EAP-TLS" && sudo nmcli con up "Domus-Wired-EAP-TLS"
=== Verification
# Both profiles should show empty DNS and ignore-auto-dns: no
nmcli con show "Domus-Wired-EAP-TLS" | awk '/ipv4.dns:|ignore-auto-dns/'
nmcli con show "Domus-WiFi-EAP-TLS" | awk '/ipv4.dns:|ignore-auto-dns/'
ipv4.dns: --
ipv4.ignore-auto-dns: no
ipv6.ignore-auto-dns: no
ipv4.dns: --
ipv4.ignore-auto-dns: no
ipv6.ignore-auto-dns: no
Phase 14: Current state analysis and VyOS status
After clearing hardcoded DNS, intermittent issues continued. Investigation revealed the DNS architecture conflict.
=== Current Gateway
ip route | awk '/default/{print "Gateway:", $3}'
Gateway: 10.50.10.1
Gateway is pfSense (10.50.10.1); VyOS is NOT active.
=== VyOS Status
# Test connectivity to vyos-02
ping -c2 10.50.1.3
PING 10.50.1.3 (10.50.1.3) 56(84) bytes of data.
--- 10.50.1.3 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss
# Test kvm-02 (hypervisor)
ping -c1 10.50.1.111
64 bytes from 10.50.1.111: icmp_seq=1 ttl=63 time=0.966 ms
=== BIND Status
dig @10.50.1.90 google.com +short +time=2
142.251.41.14
=== Key Finding: DNS Architecture Conflict
VyOS runbook (vyos-services.adoc Phase 5-6) configures DHCP to hand out BIND IPs:
set service dhcp-server ... option name-server {bind-ip} # 10.50.1.90
set service dhcp-server ... option name-server {bind-02-ip} # 10.50.1.91
But pfSense rollback set per-VLAN gateway IPs as DNS:
netapi pfsense dhcp set-dns opt1 10.50.10.1
Conflict: pfSense DHCP appears to still be handing out BIND IPs despite rollback commands. Need to verify pfSense DHCP config and determine if VyOS DHCP is somehow active.
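One way to see which DHCP server actually issued the current lease, and which DNS servers it handed out, is to read the lease options NetworkManager recorded. This is a sketch under assumptions: bash and awk, and that `nmcli -f DHCP4 dev show` lists options under the names `dhcp_server_identifier` and `domain_name_servers`, which may differ between NetworkManager versions or DHCP client backends. `dhcp_lease_summary` is a hypothetical helper.

```shell
# Reads `nmcli -f DHCP4 dev show <iface>` output on stdin and prints the
# DHCP server that issued the lease plus the DNS servers it supplied.
# Option names are an assumption about NetworkManager's output.
dhcp_lease_summary() {
  awk -F' = ' '
    # Leading whitespace in the match excludes the "requested_*" variants
    $1 ~ /[ \t]dhcp_server_identifier$/ { print "server: " $2 }
    $1 ~ /[ \t]domain_name_servers$/    { print "dns: " $2 }
  '
}
```

Usage: `nmcli -f DHCP4 dev show enp130s0 | dhcp_lease_summary`. A server address other than the pfSense interface IP would point at a second DHCP server answering on the VLAN.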
=== Monitor for ARP Conflicts
# Check if vyos-02 is conflicting with pfSense gateway
ip neigh show 10.50.10.1
=== Continuous BIND Monitor
while true; do
ts=$(date +%H:%M:%S)
result=$(timeout 2 dig @10.50.1.90 google.com +short 2>&1)
if [ -n "$result" ] && [[ ! "$result" =~ "timed out" ]]; then
printf "%s BIND \033[32mOK\033[0m %s\n" "$ts" "$result"
else
printf "%s BIND \033[31mFAIL\033[0m\n" "$ts"
fi
sleep 3
done
Phase 15: Resolution confirmed
After clearing hardcoded DNS and bouncing the connection, workstation received correct DNS from pfSense DHCP.
=== DHCP DNS Verification
nmcli dev show enp130s0 | awk '/IP4.DNS/'
IP4.DNS[1]: 10.50.10.1
=== pfSense DNS Monitor
Testing the actual client path (pfSense forwards to BIND internally):
while true; do
ts=$(date +%H:%M:%S)
result=$(timeout 2 dig @10.50.10.1 google.com +short 2>&1)
if [ -n "$result" ] && [[ ! "$result" =~ "timed out" ]]; then
printf "%s pfSense \033[32mOK\033[0m %s\n" "$ts" "$result"
else
printf "%s pfSense \033[31mFAIL\033[0m\n" "$ts"
fi
sleep 3
done
11:55:49 pfSense OK 142.251.41.14
11:55:52 pfSense OK 142.251.41.14
11:55:55 pfSense OK 142.251.41.14
=== Root Cause Summary
| Issue | Resolution |
|---|---|
| Hardcoded DNS in NM profiles | Cleared |
| Stale DHCP lease | Connection bounce obtained fresh lease with correct DNS |
| BIND direct access failures | Expected: clients now use pfSense (10.50.10.1), not BIND directly |
Incident resolved. Architecture: Client -> pfSense (per-VLAN IP) -> BIND for internal.
Appendix A: Switch Trunk Configuration (Reference)
Captured during troubleshooting for VyOS HA architecture reference.
! Te1/0/1 -> kvm-02 (vyos-02)
interface TenGigabitEthernet1/0/1
description TRUNK-TO-SUPERMICRO-KVM-02
switchport trunk allowed vlan 10,20,30,40,100,999
switchport trunk native vlan 100
switchport mode trunk
ip arp inspection trust
spanning-tree portfast edge trunk
ip dhcp snooping trust
! Te1/0/2 -> kvm-01 (vyos-01)
interface TenGigabitEthernet1/0/2
description TRUNK-TO-SUPERMICRO-KVM-01
switchport trunk allowed vlan 10,20,30,40,100,999
switchport trunk native vlan 100
switchport mode trunk
ip arp inspection trust
spanning-tree portfast edge trunk
ip dhcp snooping trust
| Native VLAN 100 (INFRA). VLANs 110, 120 (SECURITY, SERVICES) not yet added, pending VyOS deployment completion. |
References
- pfSense Docs -> Services -> DNS Resolver (unbound)
- netapi pfsense dhcp --help -> show, set-dns, set-domain, apply
- netapi ise --help -> create-endpoint, update-endpoint, get-endpoint-groups, coa
- NM connection files: /etc/NetworkManager/system-connections/Domus-*.nmconnection
- ISE MNT auth lookup: netapi ise mnt auth-status <MAC>
- Switch CLI: show access-session interface <int> detail