WRKLOG-2026-02-22
Summary
Full infrastructure backup cycle completed. Vault SSH CA automation scripts added to git (vault-ssh-sign, vault-ssh-test.sh). NAS share management documented with critical fixes (scp workaround, synoshare syntax). Comprehensive k3s-prometheus-grafana runbook created with D2 animated architecture diagram. kvm-02 hardware upgrade in progress (64GB RAM).
Carried Over from 02-21
| Task | Status | Notes |
|---|---|---|
| Install k3s with SELinux | Done (02-21) | k3s + Cilium CNI operational |
| Configure Vault Agent integration | Pending | Deferred to Prometheus deployment |
| Add DNS for k3s-master-01 | Done (02-21) | Added via netapi pfsense dns |
Day Priorities
| Priority | Status | Task |
|---|---|---|
| P0 | DONE | Infrastructure backups (all systems) |
| P0 | DONE | Vault SSH scripts to git |
| P0 | DONE | k3s-prometheus-grafana runbook |
| P1 | IN PROGRESS | Deploy Prometheus + Grafana |
| P1 | IN PROGRESS | kvm-02 hardware upgrade (64GB RAM) |
| P2 | Pending | k3s HA cluster (after kvm-02) |
Infrastructure Backups
Completed Backups
| System | Method | Destination | Status |
|---|---|---|---|
| pfSense | | | Done |
| WLC 9800 | | | Done |
| IOS Switches | | | Done (1/2 - 9300 offline) |
| KVM VMs | | | Done (10/10) |
| ISE | | | Done |
| Vault | systemd timer (auto) | | Done |
| Keycloak | | | Done |
| BIND DNS | SSH tarball + cat pipe | | Done |
| FreeIPA | SSH tarball + cat pipe | | Done |
NAS Share Creation
Created missing backup shares on nas-01:
# CRITICAL: synoshare creates directory - do NOT mkdir first
sudo synoshare --add bind_backups "BIND DNS backups" /volume1/bind_backups "" administrators "" 1 1
sudo synoshare --add keycloak_backups "Keycloak realm exports" /volume1/keycloak_backups "" administrators "" 1 1
sudo synoshare --add ipa_backups "FreeIPA backups" /volume1/ipa_backups "" administrators "" 1 1
Key Learnings
Vault SSH CA: default_principals silently ignored
CRITICAL: Vault SSH roles do NOT support default_principals. The parameter is silently ignored. You MUST specify valid_principals on every sign request.
Required principals: adminerosado,admin,ansible,evanusmodestus,root
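A minimal sketch of a sign request that passes the principals explicitly on every call. The mount path `ssh`, the role name `admin`, and the key path are assumptions, not taken from this log; the principals list and the 8h lifetime are. The function only prints the command so it can be reviewed before it is run against Vault.

```shell
#!/usr/bin/env bash
# Principals from this log; every sign request must carry them explicitly.
PRINCIPALS="adminerosado,admin,ansible,evanusmodestus,root"

# Print the vault CLI invocation that signs a public key.
# Assumptions (hypothetical): SSH engine mounted at "ssh", role named "admin".
vault_sign_cmd() {
    local pubkey="$1" mount="${2:-ssh}" role="${3:-admin}"
    printf 'vault write -field=signed_key %s/sign/%s public_key=@%s valid_principals=%s ttl=8h\n' \
        "$mount" "$role" "$pubkey" "$PRINCIPALS"
}
```

Calling `vault_sign_cmd ~/.ssh/id_ed25519.pub` prints the full command; redirecting the output of that command to a `-cert.pub` file next to the key is the usual way to store the resulting certificate.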
Synology scp blocked
Synology DSM blocks scp writes even when the target has 777 permissions. Stream the file through ssh with a cat pipe instead:
# WRONG - scp blocked
scp file.tar.gz nas-01:/volume1/share/file.tar.gz
# CORRECT - cat pipe works
cat file.tar.gz | ssh nas-01 "cat > /volume1/share/file.tar.gz"
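The cat pipe can be wrapped in a small helper that also confirms the file arrived intact. This is a sketch, not the script used for the backups above; the function name `nas_put` is hypothetical, and it assumes `sha256sum` is available on both the local host and the NAS.

```shell
#!/usr/bin/env bash
# Copy a local file to a remote path by streaming it through ssh,
# then compare SHA-256 checksums on both ends.
nas_put() {
    local src="$1" host="$2" dest="$3"
    local local_sum remote_sum
    # Redirecting stdin is equivalent to `cat file | ssh host "cat > dest"`.
    ssh "$host" "cat > '$dest'" < "$src" || return 1
    local_sum="$(sha256sum < "$src" | cut -d' ' -f1)"
    remote_sum="$(ssh "$host" "sha256sum < '$dest'" | cut -d' ' -f1)"
    # Succeed only if the remote copy matches the local file.
    [ "$local_sum" = "$remote_sum" ]
}
```

For example, `nas_put file.tar.gz nas-01 /volume1/share/file.tar.gz` returns non-zero if the transfer fails or the checksums differ.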
synoshare syntax is positional
# WRONG (key=value)
synoshare --add name=foo desc="bar" path=/volume1/foo
# CORRECT (positional)
synoshare --add foo "bar" /volume1/foo "" administrators "" 1 1
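To avoid retyping the positional form for each share, the commands can be generated from a short list and reviewed before running. This sketch copies the argument order from the working commands in this log and does not assert what each trailing position means; the helper name `synoshare_add_cmd` is hypothetical.

```shell
#!/usr/bin/env bash
# Print the positional synoshare invocation for one share.
# The argument order mirrors the working commands recorded in this log.
synoshare_add_cmd() {
    local name="$1" desc="$2"
    printf 'sudo synoshare --add %s "%s" /volume1/%s "" administrators "" 1 1\n' \
        "$name" "$desc" "$name"
}

# Dry run: prints one command per share; nothing is executed on the NAS.
while IFS='|' read -r name desc; do
    synoshare_add_cmd "$name" "$desc"
done <<'EOF'
bind_backups|BIND DNS backups
keycloak_backups|Keycloak realm exports
ipa_backups|FreeIPA backups
EOF
```

Printing instead of executing keeps the mistake-prone part (the argument order) visible for review before anything changes on nas-01.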
dsec path for keycloak backup
Updated NAS_HQ_01_BACKUP_KEYCLOAK in dsec to use the designated share:
NAS_HQ_01_BACKUP_KEYCLOAK={{NAS_HQ_01_BASE_PATH}}/keycloak_backups
Vault SSH CA Scripts
Added to domus-infra-ops/docs/asciidoc/modules/ROOT/examples/:
| Script | Purpose | Install To |
|---|---|---|
| | Sign SSH key with all principals (8h cert) | |
| | Test connectivity to all infrastructure hosts | |
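A connectivity test of the kind described in the table can be sketched as a loop over hosts using a non-interactive SSH login. This is an illustration under assumptions, not the contents of the committed script: the function name `check_hosts` is hypothetical and the host list is supplied by the caller.

```shell
#!/usr/bin/env bash
# Try a non-interactive SSH login to each host and report the result.
# Returns non-zero if any host fails.
check_hosts() {
    local host failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts, so a rejected
        # certificate fails immediately instead of hanging.
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, `check_hosts k3s-master-01 nas-01` prints one line per host and exits non-zero if either login fails, which makes it usable in a cron job or CI step.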
Updated vault-ssh-ca.adoc with:
- Automation Scripts section
- Collapsible script source blocks
- Installation commands
k3s-prometheus-grafana Runbook
Created comprehensive runbook (k3s-prometheus-grafana.adoc):
- D2 architecture diagram with animated flows
- NFS StorageClass setup for NAS persistence
- Complete Helm values (Prometheus, Grafana, AlertManager)
- Resource limits and retention settings
- Traefik IngressRoute configuration
- Custom dashboard import patterns
- AlertManager Slack integration example
- Troubleshooting section
- Validation health checks
Archived original to .archive/k3s-prometheus-grafana.adoc.bak
kvm-02 Hardware Upgrade
Installing 64GB RAM in Supermicro B (kvm-02).
Planned VMs for kvm-02
| VM | Resources | Purpose |
|---|---|---|
| k3s-master-02 | 4 vCPU, 8GB | k3s HA (Raft) |
| k3s-master-03 | 4 vCPU, 8GB | k3s HA (Raft) |
| vault-02 | 2 vCPU, 4GB | Vault HA (Raft) |
| vault-03 | 2 vCPU, 4GB | Vault HA (Raft) |
| k3s-worker-01 | 4 vCPU, 16GB | Workloads (Wazuh, etc.) |
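A quick capacity check for the plan above, summing the per-VM allocations from the table. It assumes 64GB is the host's total memory after the upgrade and does not account for hypervisor overhead; the function name `plan_totals` is hypothetical.

```shell
#!/usr/bin/env bash
# Sum vCPU and RAM (GB) for the planned kvm-02 guests.
# Columns: name, vCPU, RAM in GB (values taken from the table above).
plan_totals() {
    local cpu=0 ram=0 name c r
    while read -r name c r; do
        cpu=$((cpu + c))
        ram=$((ram + r))
    done <<'EOF'
k3s-master-02 4 8
k3s-master-03 4 8
vault-02 2 4
vault-03 2 4
k3s-worker-01 4 16
EOF
    echo "$cpu $ram"
}

totals="$(plan_totals)"
total_cpu="${totals% *}"
total_ram="${totals#* }"
echo "vCPU: $total_cpu, RAM: ${total_ram}GB of 64GB, headroom: $((64 - total_ram))GB"
```

The planned guests total 16 vCPU and 40GB, which leaves 24GB of headroom under the 64GB assumption.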
Work Status
Active Projects
| Project | Priority | Description | Status |
|---|---|---|---|
| Linux AD Auth (Xianming Ding) | P0 | dACL + 802.1X for Linux research workstations | Runbook ready, awaiting deployment window |
| iPSK Manager HA | P1 | High availability for iPSK-Manager | Blocked - need to resolve DB replication |
| ISE 3.4 Migration | P1 | Upgrade ise-02 (3.2p9) → ise-01 (3.4) | In planning |
| Switch Firmware Upgrades | P2 | C9300 IOS-XE upgrade | Scheduled for maintenance window |
Monday Preparation
- Review Linux AD Auth deployment steps
- Check iPSK-Manager DB replication status
- Update CHLA change request for switch upgrades
Documentation Commits
| Repo | Commits |
|---|---|
| domus-infra-ops | |
| domus-infra-ops | |
| domus-netapi-docs | (synoshare syntax fixes - already pushed earlier) |
Next Actions
- Deploy Prometheus + Grafana (follow runbook)
- Complete kvm-02 hardware + deploy VMs
- k3s HA cluster (3 masters)
- Vault HA cluster (3 nodes)