WRKLOG-2026-03-11
Summary
CLI tooling day + Vocera triage. Expanded netapi with git forge integrations (GitHub, GitLab, Gitea) and enhanced Monad pipeline management. All new commands support -f json for jq piping.
Work priority: ~10 Vocera phones failing 802.1X due to missing EAP-TLS supplicant configuration.
Completed Today (2026-03-10)
| Task | Details | Status |
|---|---|---|
| Monad CLI Enhancement | Added 7 new commands: input-create, quick-pipeline, graph, health, clone, watch, inputs --type filter | COMPLETE |
| GitHub CLI | 26 commands: repos, prs, issues, workflows, gists, starred, orgs, etc. | COMPLETE |
| GitLab CLI | 23 commands: projects, mrs, pipelines, groups, etc. | COMPLETE |
| Gitea CLI | 16 commands: repos, prs, issues, mirror, create, delete, etc. | COMPLETE |
| Command Composition Patterns | Created command-composition.adoc - bash patterns mapping to Python/Go concepts | COMPLETE |
netapi CLI Summary
| Module | Commands | Env Vars |
|---|---|---|
| | 33 commands (pipelines, inputs, outputs, transforms, health, graph, watch, etc.) | |
| | 26 commands (repos, prs, issues, workflows, gists, starred, etc.) | |
| | 23 commands (projects, mrs, pipelines, groups, etc.) | |
| | 16 commands (repos, prs, issues, mirror, create, delete, etc.) | |
Previous Day (2026-03-09)
| Task | Details | Status |
|---|---|---|
| BIND HA | bind-02 deployed on kvm-02, AXFR zone sync working | COMPLETE |
| Vault HA Cluster | vault-02/03 deployed, TLS certs issued, joined Raft cluster | COMPLETE |
| evanusmodestus sudo | Password reset via virsh console (ansible user) | COMPLETE |
Vault HA Cluster Status
Node Address State Voter
vault-01 vault-01.inside.domusdigitalis.dev:8201 leader true
vault-02 vault-02.inside.domusdigitalis.dev:8201 follower true
vault-03 vault-03.inside.domusdigitalis.dev:8201 follower true
Today’s Priorities (2026-03-11)
| Priority | Task | Status | Notes |
|---|---|---|---|
| P0 | Vocera EAP-TLS Supplicant Fix | [ ] IN PROGRESS | ~10 phones failing 802.1X, missing supplicant config |
| P0 | Monad Pipeline Evaluation | [ ] PENDING | Test pipeline creation, input sources, transforms |
| P1 | k3s NAT verification | [ ] PENDING | NAT rule 170 applied, test pod internet access |
| P1 | Wazuh indexer recovery | [ ] PENDING | Restart pod after NAT confirmed working |
Vocera EAP-TLS Commands (File-Based MAC Lookup)
Create MAC file:
# One MAC per line, uppercase with colons
cat > /tmp/vocera-macs.txt << 'EOF'
00:09:EF:AA:BB:CC
00:09:EF:DD:EE:FF
18:B4:30:11:22:33
EOF
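MAC lists from other teams rarely arrive in one format, so it may help to enforce the uppercase-with-colons convention before the file is used. A minimal sketch; normalize_mac is a hypothetical helper, not part of netapi:

```shell
# Hypothetical helper (not part of netapi): normalize a MAC to AA:BB:CC:DD:EE:FF.
# Accepts colon, dash, or dotted input; returns non-zero for anything else.
normalize_mac() {
    local hex
    # Strip the common separators and uppercase the hex digits
    hex=$(printf '%s' "$1" | tr -d ':.-' | tr 'a-f' 'A-F')
    # Reject non-hex characters and wrong lengths
    case "$hex" in
        *[!0-9A-F]*) return 1 ;;
    esac
    [ "${#hex}" -eq 12 ] || return 1
    # Insert a colon after every two digits, then drop the trailing one
    printf '%s\n' "$hex" | sed 's/../&:/g; s/:$//'
}
```

For example, normalize_mac 0009.efaa.bbcc prints 00:09:EF:AA:BB:CC, and a malformed value returns non-zero so it can be skipped or reported.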
Query ISE for each MAC:
# Get endpoint details for each MAC
while read -r mac; do
    echo "=== $mac ==="
    netapi ise endpoint get "$mac" -f json | jq '{mac: .mac, profile: .profileId, group: .groupId, staticGroupAssignment: .staticGroupAssignment}'
done < /tmp/vocera-macs.txt
Check authentication history (DataConnect):
# Build SQL IN clause from file
MACS=$(awk '{printf "\x27%s\x27,", $1}' /tmp/vocera-macs.txt | sed 's/,$//')
netapi ise dc query "SELECT
acs_timestamp,
calling_station_id AS mac,
passed,
failure_reason,
selected_azn_profiles
FROM mnt.radius_auth_48_live
WHERE calling_station_id IN ($MACS)
ORDER BY acs_timestamp DESC
LIMIT 100" -f json | jq '.'
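The awk one-liner above quotes every line of the file as-is, so a stray or malformed line ends up inside the SQL text. A sketch of a stricter variant that only passes well-formed MACs into the IN clause; build_in_clause is a hypothetical name, and the uppercase-with-colons format is the one assumed by the MAC file:

```shell
# Hypothetical helper: read MACs on stdin, emit 'MAC1','MAC2',... for a SQL IN clause.
# Lines that are not exactly AA:BB:CC:DD:EE:FF (uppercase hex) are dropped.
build_in_clause() {
    grep -E '^([0-9A-F]{2}:){5}[0-9A-F]{2}$' |
        awk '{ printf "%s\047%s\047", (NR > 1 ? "," : ""), $0 }'  # \047 = single quote
}
```

Usage would be MACS=$(build_in_clause < /tmp/vocera-macs.txt), followed by the same query. An empty result means no valid MACs were found and the query should not be run.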
Find failed auths only:
while read -r mac; do
    echo "=== $mac failures ==="
    netapi ise dc query "SELECT acs_timestamp, failure_reason
        FROM mnt.radius_auth_48_live
        WHERE calling_station_id = '$mac' AND passed = 0
        ORDER BY acs_timestamp DESC LIMIT 5" -f json | jq -r '.[] | "\(.acs_timestamp): \(.failure_reason)"'
done < /tmp/vocera-macs.txt
Bulk CoA reauth after fix:
# Reauth all devices after supplicant is configured
while read -r mac; do
    echo "Reauth: $mac"
    netapi ise mnt coa reauth --mac "$mac"
    sleep 1 # Rate limit
done < /tmp/vocera-macs.txt
Carried Over / Previous Priorities
| Priority | Task | Status | Notes |
|---|---|---|---|
| P0 | Vault HA failover test (Phase 6) | [x] DONE | Verified: vault-02 became leader, PKI worked, vault-01 rejoined |
| P0 | vault-ssh-sign HA update | [x] DONE | Script now health-checks all 3 nodes, auto-failover |
| P0 | Fix vault-backup.service | [x] DONE | SELinux policy module installed - rsync_t → ssh_exec_t |
| P2 | PacketFence VM exploration | [ ] PENDING | Deploy packetfence-01 on kvm-02, evaluate open-source NAC |
HA Deployment Queue Status
| Priority | System | Status | Next Action |
|---|---|---|---|
| P1 | BIND | COMPLETE ✓ | bind-01 + bind-02 (AXFR) |
| P2 | Vault | COMPLETE ✓ | vault-01/02/03 (Raft) |
| P3 | Keycloak | NEXT | Rebuild from scratch (corrupted) |
| P4 | FreeIPA | PLANNED | ipa-01 + ipa-02 (IPA Replication) |
| P5 | AD DC | PLANNED | home-dc01 + home-dc02 (AD Replication) |
| P6 | iPSK | PLANNED | ipsk-mgr-01 + ipsk-mgr-02 (MySQL Replication) |
| P7 | ISE | DEFERRED | ise-01 reconfigure after ise-02 stable |
Current Single Points of Failure
| System | Impact if Down |
|---|---|
| ISE (ise-02) | All 802.1X stops - wired + wireless auth fails |
| Keycloak | SAML/OIDC SSO broken (ISE admin, Grafana, etc.) |
| FreeIPA (ipa-01) | Linux authentication, sudo rules, HBAC |
| AD DC (home-dc01) | Windows auth, Kerberos, GPO |
| iPSK Manager | Self-service PSK portal unavailable |
Evaluation VMs
PacketFence Evaluation
Purpose: Educational deployment to understand FreeRADIUS internals. ISE remains the production NAC.
VM Specs:
| Resource | Value |
|---|---|
| Name | packetfence-01 |
| Hypervisor | kvm-02 |
| vCPU | 4 |
| RAM | 8GB |
| Disk | 100GB |
| IP | TBD (10.50.1.x) |
| OS | Rocky Linux 9 or Debian 12 |
PacketFence Components:
- FreeRADIUS (802.1X, MAB)
- MariaDB (backend)
- Captive portal
- Device profiling
- VLAN assignment
- Guest management
Evaluation Goals:
- Deploy standalone instance
- Test 802.1X with a single endpoint
- Compare admin experience vs ISE
- Document findings for future reference
Session Log
Session 1: Vault HA Failover Test
Objective: Verify Raft leader election works correctly.
Pre-flight check:
export VAULT_ADDR="https://vault-01.inside.domusdigitalis.dev:8200"
vault operator raft list-peers
Failover test:
# Stop the leader
ssh vault-01 "sudo systemctl stop vault"
# Wait for election
sleep 10
# Check new leader (from vault-02)
export VAULT_ADDR="https://vault-02.inside.domusdigitalis.dev:8200"
vault operator raft list-peers
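A fixed sleep 10 can be too short or needlessly long for an election. One alternative is to poll the health endpoint until a node reports itself active. A sketch, assuming /v1/sys/health returns 200 only on the active node; probe_health and wait_for_active are hypothetical helper names:

```shell
# Hypothetical probe: print the HTTP status code of a node's health endpoint
probe_health() {
    curl -sk --max-time 3 -o /dev/null -w '%{http_code}' "$1/v1/sys/health"
}

# Poll a node until it answers 200 (active); give up after a number of tries.
# Arguments: node URL, number of tries (default 15), delay in seconds (default 2)
wait_for_active() {
    local node=$1 tries=${2:-15} delay=${3:-2} i=0 code
    while [ "$i" -lt "$tries" ]; do
        code=$(probe_health "$node") || code=000   # treat connection errors as "not ready"
        [ "$code" = "200" ] && return 0
        sleep "$delay"
        i=$((i + 1))
    done
    return 1
}
```

Either standby can win the election, so in practice this would be called for each surviving node until one returns success, for example wait_for_active "https://vault-02.inside.domusdigitalis.dev:8200" before running list-peers.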
Verify operations work:
vault list pki_int/certs | head -3
Restart vault-01:
ssh vault-01 "sudo systemctl start vault"
# Unseal (3 keys required)
ssh -t vault-01 "VAULT_ADDR=https://127.0.0.1:8200 VAULT_SKIP_VERIFY=1 vault operator unseal"
ssh -t vault-01 "VAULT_ADDR=https://127.0.0.1:8200 VAULT_SKIP_VERIFY=1 vault operator unseal"
ssh -t vault-01 "VAULT_ADDR=https://127.0.0.1:8200 VAULT_SKIP_VERIFY=1 vault operator unseal"
# Verify rejoin
vault operator raft list-peers
Result: [x] DONE - vault-02 became leader, PKI cert issuance verified, vault-01 rejoined as follower.
Critical Fix Discovered:
# /var/log/vault MUST exist on all nodes before failover works
ssh vault-02 "sudo mkdir -p /var/log/vault && sudo chown vault:vault /var/log/vault"
ssh vault-03 "sudo mkdir -p /var/log/vault && sudo chown vault:vault /var/log/vault"
The audit log configuration replicates via Raft, but the filesystem does not: each node needs its own /var/log/vault. Without the directory, failover fails with "mkdir /var/log/vault: permission denied".
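A pre-flight check along these lines could catch the missing directory before a failover test. This is a sketch only; run_on and check_audit_dir are hypothetical names, and run_on simply wraps ssh so it can be swapped out:

```shell
# Hypothetical wrapper around ssh: run a command on a node
run_on() {
    ssh "$1" "$2"
}

# Report whether a directory exists on every node; non-zero if any node lacks it.
# Arguments: directory, then one or more node names
check_audit_dir() {
    local dir=$1 node missing=0
    shift
    for node in "$@"; do
        if run_on "$node" "test -d '$dir'"; then
            echo "ok: $node"
        else
            echo "MISSING: $node"
            missing=1
        fi
    done
    return "$missing"
}
```

For this cluster the call would be check_audit_dir /var/log/vault vault-01 vault-02 vault-03. It only checks existence; ownership by vault:vault still has to be verified separately.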
Session 2: vault-backup.service SELinux Fix
Objective: Fix failed systemd unit on vault-01.
Root Cause: SELinux rsync_t domain cannot execute ssh_exec_t.
Error:
rsync: [sender] Failed to exec ssh: Permission denied (13)
rsync error: error in IPC code (code 14)
Key insight: Manual sudo rsync worked because it runs in unconfined_t. The systemd service runs rsync in the confined rsync_t domain.
Fix (permissive domain approach):
# Capture ALL denials at once
sudo semanage permissive -a rsync_t
sudo systemctl start vault-backup.service
# Generate comprehensive policy
sudo ausearch -m avc --start today | grep rsync | audit2allow -M vault-backup
# Install and re-enable enforcing
sudo semodule -i vault-backup.pp
sudo semanage permissive -d rsync_t
# Test
sudo systemctl start vault-backup.service && systemctl status vault-backup.service
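Before installing a generated module it can be worth seeing which type pairs the denials actually cover, since grep rsync may pull in more than intended. A rough sketch that summarizes AVC records, assuming the usual scontext=, tcontext=, and tclass= fields in the audit output; summarize_avc is a hypothetical helper:

```shell
# Hypothetical helper: read audit records on stdin and print one line per
# distinct denial as "source_type -> target_type (class): permissions"
summarize_avc() {
    awk '/avc:/ && /denied/ {
        perms = ""; src = ""; tgt = ""; cls = ""; inperm = 0
        for (i = 1; i <= NF; i++) {
            if ($i == "{") { inperm = 1; continue }   # permissions sit between { and }
            if ($i == "}") { inperm = 0; continue }
            if (inperm) { perms = perms (perms ? "," : "") $i; continue }
            if ($i ~ /^scontext=/) { split($i, a, ":"); src = a[3] }  # third field is the type
            if ($i ~ /^tcontext=/) { split($i, a, ":"); tgt = a[3] }
            if ($i ~ /^tclass=/)   { cls = $i; sub(/^tclass=/, "", cls) }
        }
        print src " -> " tgt " (" cls "): " perms
    }' | sort -u
}
```

It would sit in the same pipeline as the audit2allow step, for example sudo ausearch -m avc --start today | summarize_avc, and is read-only.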
Result: [x] DONE - Service succeeded, timer scheduled for 02:29 UTC.
Documentation created:
- Updated vault-backup.adoc runbook with SELinux section
Session 3: vault-ssh-sign HA Update
Objective: Make vault-ssh-sign script HA-aware so it doesn’t fail when vault-01 is down.
Problem: Non-deterministic leader election means any node could be leader. Clients shouldn’t care who’s leader.
Solution: Script now health-checks all 3 nodes and uses the first healthy one.
Key patterns:
# Health check endpoint
# 200 = active leader, 429 = standby (both can serve)
curl -sk --max-time 3 -o /dev/null -w "%{http_code}" \
"https://vault-01.inside.domusdigitalis.dev:8200/v1/sys/health"
Nodes array (preference order):
VAULT_NODES=(
"https://vault-01.inside.domusdigitalis.dev:8200"
"https://vault-02.inside.domusdigitalis.dev:8200"
"https://vault-03.inside.domusdigitalis.dev:8200"
)
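The selection logic can be sketched as below. This is not the committed vault-ssh-sign code, just an illustration of the pattern under the assumption that 200 and 429 both mean the node can serve; probe_health and pick_vault_node are hypothetical names:

```shell
# Hypothetical probe: print the HTTP status code of a node's health endpoint
probe_health() {
    curl -sk --max-time 3 -o /dev/null -w '%{http_code}' "$1/v1/sys/health"
}

# Print the first node (in argument order) that answers 200 (active) or
# 429 (standby); return non-zero if no node qualifies
pick_vault_node() {
    local node code
    for node in "$@"; do
        code=$(probe_health "$node") || code=000   # unreachable node
        case "$code" in
            200|429) printf '%s\n' "$node"; return 0 ;;
        esac
    done
    return 1
}
```

With the array above, usage would look like VAULT_ADDR=$(pick_vault_node "${VAULT_NODES[@]}") || exit 1, which keeps the preference order and falls through to the next node when one is down or sealed.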
Committed: dotfiles-optimus - feat(vault): HA-aware vault-ssh-sign with automatic failover
Test commands:
# Run the updated script
vault-ssh-sign
# Verify SSH cert works
vault-ssh-test
Result: [x] DONE - Script updated and pushed to GitHub.
Session 4: netapi Git Forge Integration
Objective: Add GitHub, GitLab, Gitea CLI commands to netapi for jq-friendly API access.
Files created:
| File | Purpose |
|---|---|
| | GitHub REST API client (Bearer auth) |
| | GitLab REST API client (PRIVATE-TOKEN auth) |
| | 26 CLI commands |
| | 23 CLI commands |
| | 16 CLI commands (existing vendor, new CLI) |
jq examples:
# List all repo names
netapi github repos -f json | jq -r '.[].full_name'
# Get open MR titles from GitLab
netapi gitlab mrs mygroup/myproject -f json | jq -r '.[].title'
# Search Gitea and get clone URLs
netapi gitea search netapi -f json | jq -r '.[].ssh_url'
# GitHub PR files with stats
netapi github pr-files owner/repo 42 -f json | jq '.[] | "\(.filename): +\(.additions) -\(.deletions)"'
Commits:
- feat(monad): Add pipeline visualization, health check, and quick-create commands
- feat(forge): Add GitHub, GitLab, Gitea CLI commands
Result: [x] DONE - 65 new commands across 3 git forges, all with -f json support.
Session 5: Command Composition Patterns
Objective: Document bash patterns that map to Python/Go concepts for learning.
File created: domus-captures/docs/modules/ROOT/examples/codex/bash/command-composition.adoc
Session 6: netapi ISE TAC Case Prep Expansion
Objective: Massive expansion of ISE TAC diagnostic patterns for work use.
File updated: domus-captures/docs/modules/ROOT/examples/commands/netapi/ise-tac-case-prep.adoc
Changes: 881 → 2073 lines (+1192 lines)
New sections added:
| Section | Contents |
|---|---|
| Certificate Diagnostics | Cert failures by issuer, CN, time, chain validation errors |
| Time-Based Analysis | Peak hour analysis, hourly trends, business hours vs after-hours |
| VLAN Analysis | VLAN distribution, misassignment, change patterns |
| Security Analysis | Brute force detection, MAC spoofing, rogue devices, unauthorized access |
| EAP Method Analysis | EAP-TLS vs PEAP vs TEAP breakdown, method failures, protocol transitions |
| NAD Analysis | Per-switch/WLC auth stats, failure hotspots, port-level breakdown |
| User Analysis | User history, multi-device users, roaming patterns |
| Session Analysis | Session duration, concurrent sessions, session lifecycle |
| Policy Analysis | Policy set effectiveness, hit counts, rule ordering optimization |
| CoA Analysis | CoA success/failure rates, reauth patterns, disconnect analysis |
| Profiler Deep Dive | Profiling accuracy, endpoint DB size, stale endpoint cleanup |
| Extended Bundles | 20-step TAC bundle, 10-step security audit |
Commit: 927c5f3 - pushed to origin
Key discovery: -f json only works for ERS commands (get-*) and api-call, NOT for MnT or DataConnect commands. MnT/DC output is human-readable only.
Session 7: modestus-razer EAP-TLS WiFi Troubleshooting
Objective: Fix intermittent EAP-TLS authentication failures on workstation.
Symptom:
ISE Error: 5411 - "Supplicant stopped responding to ISE"
Failure Reason: 12935 - "Supplicant stopped responding to ISE during EAP-TLS certificate exchange"
Step 12935 latency: 120001 ms (2 minute timeout)
wpa_supplicant logs showed:
EAP-TLS: Certificate chain validated ✓ (ROOT → ISSUING → ise-02)
Selected EAP-TLS
CTRL-EVENT-DISCONNECTED reason=3 locally_generated=1
NetworkManager error:
Secrets were required, but not provided
Diagnosis steps:
- Cert files exist ✓
ls -la /etc/ssl/certs/modestus-razer-eaptls.pem /etc/ssl/private/modestus-razer-eaptls.key
# Both present, root-owned, 0644/0600
- Key NOT encrypted ✓
sudo head -3 /etc/ssl/private/modestus-razer-eaptls.key
# -----BEGIN RSA PRIVATE KEY----- (not ENCRYPTED)
- Cert/key modulus MATCH ✓
openssl x509 -noout -modulus -in /etc/ssl/certs/modestus-razer-eaptls.pem | openssl md5
# 9d83a2e6b0f21ac6faa5529b99285a5c
sudo openssl rsa -noout -modulus -in /etc/ssl/private/modestus-razer-eaptls.key | openssl md5
# 9d83a2e6b0f21ac6faa5529b99285a5c
- Certificate chain analysis:
openssl x509 -noout -subject -issuer -in /etc/ssl/certs/modestus-razer-eaptls.pem
# subject=O=Domus-Infrastructure, OU=Domus-Admins, CN=modestus-razer.inside.domusdigitalis.dev
# issuer=CN=DOMUS-ISSUING-CA
# How many certs in file?
openssl crl2pkcs7 -nocrl -certfile /etc/ssl/certs/modestus-razer-eaptls.pem | openssl pkcs7 -print_certs -noout | grep -c "subject="
# 1 ← ONLY the leaf cert!
ROOT CAUSE IDENTIFIED:
Client certificate file contains only the leaf cert (count=1). ISE expects the client to send the full chain during EAP-TLS handshake:
- client cert → DOMUS-ISSUING-CA → DOMUS-ROOT-CA
If ISE doesn’t have the intermediate cached and the client doesn’t send it, ISE waits for the full chain, times out after 120 seconds, and logs error 12935.
Fix (pending):
# Create chain file
cat /etc/ssl/certs/modestus-razer-eaptls.pem \
/etc/ssl/certs/DOMUS-ISSUING-CA.pem \
> /tmp/modestus-razer-chain.pem
# Verify chain
openssl crl2pkcs7 -nocrl -certfile /tmp/modestus-razer-chain.pem | \
openssl pkcs7 -print_certs -noout | grep -c "subject="
# Should be 2 (leaf + intermediate)
# Update NM connection
sudo cp /tmp/modestus-razer-chain.pem /etc/ssl/certs/modestus-razer-eaptls-chain.pem
sudo nmcli con modify "Domus-WiFi-EAP-TLS" \
802-1x.client-cert /etc/ssl/certs/modestus-razer-eaptls-chain.pem
# Reconnect
nmcli con down "Domus-WiFi-EAP-TLS" && nmcli con up "Domus-WiFi-EAP-TLS"
Result: [ ] PENDING - fix not yet applied
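When the fix is applied, a small guard could refuse to point NetworkManager at a bundle that is still leaf-only. A sketch that counts PEM markers rather than parsing certificates, so it confirms the number of certs but not that they actually chain; pem_cert_count and require_chain_length are hypothetical helpers:

```shell
# Hypothetical helper: count certificates in a PEM bundle by their BEGIN markers
pem_cert_count() {
    grep -c -- '-----BEGIN CERTIFICATE-----' "$1"
}

# Succeed only if the bundle holds at least the expected number of certificates.
# Arguments: bundle path, minimum certificate count
require_chain_length() {
    local file=$1 want=$2 have
    have=$(pem_cert_count "$file") || have=${have:-0}   # grep exits 1 on zero matches
    [ -n "$have" ] || have=0                            # unreadable file: treat as empty
    if [ "$have" -lt "$want" ]; then
        echo "chain too short: $have cert(s) in $file, expected $want" >&2
        return 1
    fi
}
```

For the fix above this would be require_chain_length /tmp/modestus-razer-chain.pem 2 before the sudo cp and nmcli con modify steps.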
Key Learnings from 2026-03-11
EAP-TLS Chain Lesson
| Issue | Solution |
|---|---|
| ISE error 5411 "Supplicant stopped responding" | Client cert file missing intermediate CA - ISE waits 120s for full chain |
| "Secrets were required, but not provided" | Red herring - NM error when wpa_supplicant can’t complete handshake |
| Leaf-only cert + 2-tier PKI | Always include intermediate in client cert file for EAP-TLS |
Key Learnings from 2026-03-10
Vault HA Lessons
| Issue | Solution |
|---|---|
| PKI role doesn’t allow short hostnames | Add hostnames explicitly to |
| "chown: invalid user: vault:vault" | Vault not installed - cloud-init may not run on copied images |
| Glob expansion over SSH fails | Use explicit file paths, not |
| "failed to get raft challenge" | Add |
| "file descriptor 0 is not a terminal" | Use |
| Failover fails: "mkdir /var/log/vault: permission denied" | Create audit directory on ALL nodes BEFORE enabling audit logging |
| Non-deterministic leader election | Don’t care who’s leader - make clients HA-aware with health checks |
| SELinux denial whack-a-mole | Use |
| Manual vs systemd SELinux | Manual |
CLI Development Lessons
| Pattern | Learning |
|---|---|
| Typer + Rich | |
| JSON output flag | Always add |
| Client validation | Validate API keys early in |
| Git forge auth differences | GitHub: |
| xargs with shell functions | Functions aren’t in PATH - use |
| Command composition | |
virsh Console Emergency Access
When SSH fails and sudo is broken:
# From kvm-01/kvm-02
sudo virsh console vault-01
# Login as ansible (has NOPASSWD sudo)
# Reset user password
sudo passwd evanusmodestus
# Exit console: Ctrl+]
Tomorrow (2026-03-12)
P0 - Must Complete
- Fix modestus-razer EAP-TLS - Add intermediate CA to client cert chain
- Vocera EAP-TLS - ~10 phones failing 802.1X (work)
Carried Over (Not Complete Today)
- k3s NAT verification - NAT rule 170 applied, test pod internet access
- Wazuh indexer recovery - Restart pod after NAT confirmed working
- Monad Pipeline Evaluation - Test pipeline creation, input sources, transforms
- PacketFence VM exploration - Deploy packetfence-01 on kvm-02
HA Queue
- Keycloak rebuild (P3 in HA queue)
- FreeIPA ipa-02 replica (P4)
Other
- Documentation catchup
- Test netapi git forge commands with real tokens
Runbook References
| Runbook | URL |
|---|---|
| Vault HA Deployment | Vault HA Deployment (infra-ops runbook) |
| Vault Backup | Vault Backup to NAS (infra-ops runbook) |
| KVM Operations | KVM Operations (infra-ops runbook) |