WRKLOG-2026-02-22

Summary

Full infrastructure backup cycle completed. Vault SSH CA automation scripts added to git (vault-ssh-sign, vault-ssh-test.sh). NAS share management documented with critical fixes (scp workaround, synoshare syntax). Comprehensive k3s-prometheus-grafana runbook created with D2 animated architecture diagram. kvm-02 hardware upgrade in progress (64GB RAM).

Carried Over from 02-21

Task                              | Status       | Notes
Install k3s with SELinux          | Done (02-21) | k3s + Cilium CNI operational
Configure Vault Agent integration | Pending      | Deferred to Prometheus deployment
Add DNS for k3s-master-01         | Done (02-21) | Added via netapi pfsense dns

Day Priorities

Priority | Status      | Task
P0       | DONE        | Infrastructure backups (all systems)
P0       | DONE        | Vault SSH scripts to git
P0       | DONE        | k3s-prometheus-grafana runbook
P1       | IN PROGRESS | Deploy Prometheus + Grafana
P1       | IN PROGRESS | kvm-02 hardware upgrade (64GB RAM)
P2       | Pending     | k3s HA cluster (after kvm-02)

Infrastructure Backups

Completed Backups

System       | Method                              | Destination               | Status
pfSense      | netapi pfsense backup --upload-nas  | /volume1/firewall_backups | Done
WLC 9800     | netapi wlc backup --upload-nas      | /volume1/wlc_backups      | Done
IOS Switches | netapi ios backup --all --upload-nas | /volume1/switch_backups  | Done (1/2 - 9300 offline)
KVM VMs      | netapi kvm backup --all --upload-nas | /volume1/kvm_backups     | Done (10/10)
ISE          | netapi ise backup                   | /volume1/ise_backups      | Done
Vault        | systemd timer (auto)                | /volume1/vault_backups    | Done
Keycloak     | netapi keycloak backup --upload-nas | /volume1/keycloak_backups | Done
BIND DNS     | SSH tarball + cat pipe              | /volume1/bind_backups     | Done
FreeIPA      | SSH tarball + cat pipe              | /volume1/ipa_backups      | Done

NAS Share Creation

Created missing backup shares on nas-01:

# CRITICAL: synoshare creates directory - do NOT mkdir first
sudo synoshare --add bind_backups "BIND DNS backups" /volume1/bind_backups "" administrators "" 1 1
sudo synoshare --add keycloak_backups "Keycloak realm exports" /volume1/keycloak_backups "" administrators "" 1 1
sudo synoshare --add ipa_backups "FreeIPA backups" /volume1/ipa_backups "" administrators "" 1 1

Key Learnings

Vault SSH CA: default_principals silently ignored

CRITICAL: Vault SSH roles do NOT support default_principals. The parameter is silently ignored. You MUST specify valid_principals on every sign request.

Required principals: adminerosado,admin,ansible,evanusmodestus,root
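
A minimal sketch of a sign request that always passes the principal list explicitly. The mount path ssh-client-signer, the role name admin-role, and the key paths are assumptions for illustration; they are not taken from this log and this is not the source of vault-ssh-sign.

```shell
#!/bin/sh
# Sketch only. Assumed names (not from this log): mount "ssh-client-signer",
# role "admin-role", and the key/cert paths supplied by the caller.
PRINCIPALS="adminerosado admin ansible evanusmodestus root"

# Join the given names with commas, the form valid_principals expects.
join_principals() (
  IFS=,
  printf '%s\n' "$*"
)

# Sign a public key for 8h, then print the certificate details.
sign_key() {
  pubkey=$1
  cert=$2
  # $PRINCIPALS is deliberately unquoted so it splits into separate names.
  vault write -field=signed_key ssh-client-signer/sign/admin-role \
    public_key=@"$pubkey" \
    valid_principals="$(join_principals $PRINCIPALS)" \
    ttl=8h > "$cert" || return 1
  # The Principals section of the output should list all five names.
  ssh-keygen -L -f "$cert"
}
```

Example call (paths are placeholders): sign_key ~/.ssh/id_ed25519.pub ~/.ssh/id_ed25519-cert.pub. Inspecting the certificate with ssh-keygen -L right after signing is the quickest way to catch an empty principal list.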

Synology scp blocked

Synology DSM blocks scp writes to shares even when the target directory is mode 777. Stream the file through ssh with a cat pipe instead:

# WRONG - scp blocked
scp file.tar.gz nas-01:/volume1/share/file.tar.gz

# CORRECT - cat pipe works
cat file.tar.gz | ssh nas-01 "cat > /volume1/share/file.tar.gz"
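
The workaround above can be wrapped in a small helper that also checks the transfer. This is a sketch under assumptions: nas_put is a hypothetical name, the size check relies on wc being available on the NAS, and the destination path must not contain single quotes.

```shell
#!/bin/sh
# Sketch: stream a file to the NAS over ssh and compare byte counts afterwards.
# nas_put is a hypothetical helper, not part of netapi or DSM.
nas_put() {
  src=$1
  host=$2
  dest=$3
  # Same effect as `cat src | ssh host "cat > dest"`, without the extra process.
  ssh "$host" "cat > '$dest'" < "$src" || return 1
  # A size mismatch means the stream was cut short.
  local_size=$(wc -c < "$src" | tr -d '[:space:]')
  remote_size=$(ssh "$host" "wc -c < '$dest'" < /dev/null | tr -d '[:space:]')
  [ "$local_size" = "$remote_size" ]
}
```

Example call: nas_put file.tar.gz nas-01 /volume1/share/file.tar.gz. A byte count is a cheap check; it detects truncation but not corruption, so a checksum comparison would be the stricter option where the NAS provides one.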

synoshare syntax is positional

# WRONG (key=value)
synoshare --add name=foo desc="bar" path=/volume1/foo

# CORRECT (positional)
synoshare --add foo "bar" /volume1/foo "" administrators "" 1 1
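
Because the arguments are positional, dropping one of the empty "" placeholders shifts every later field. The sketch below wraps the working form so the placeholders are always present; add_backup_share and DRY_RUN are hypothetical names introduced here, and the argument positions are copied from the command above rather than from Synology documentation.

```shell
#!/bin/sh
# Sketch: build the positional synoshare call in one place.
# Set DRY_RUN=1 to print the argument list instead of running the command.
add_backup_share() {
  name=$1
  desc=$2
  # Same positions as the working command; the "" entries must stay in place.
  set -- synoshare --add "$name" "$desc" "/volume1/$name" "" administrators "" 1 1
  if [ "${DRY_RUN:-0}" = 1 ]; then
    # Bracket each argument so empty placeholders stay visible.
    printf '<%s> ' "$@"
    printf '\n'
  else
    sudo "$@"
  fi
}
```

Running it with DRY_RUN=1 first makes it easy to confirm that ten arguments are passed before anything is created on the NAS.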

dsec path for keycloak backup

Updated NAS_HQ_01_BACKUP_KEYCLOAK in dsec to use the designated share:

NAS_HQ_01_BACKUP_KEYCLOAK={{NAS_HQ_01_BASE_PATH}}/keycloak_backups

Vault SSH CA Scripts

Added to domus-infra-ops/docs/asciidoc/modules/ROOT/examples/:

Script            | Purpose                                       | Install To
vault-ssh-sign    | Sign SSH key with all principals (8h cert)    | ~/.local/bin/
vault-ssh-test.sh | Test connectivity to all infrastructure hosts | ~/.local/bin/vault-ssh-test

Updated vault-ssh-ca.adoc with:

  • Automation Scripts section

  • Collapsible script source blocks

  • Installation commands

k3s-prometheus-grafana Runbook

Created comprehensive runbook (k3s-prometheus-grafana.adoc):

  • D2 architecture diagram with animated flows

  • NFS StorageClass setup for NAS persistence

  • Complete Helm values (Prometheus, Grafana, AlertManager)

  • Resource limits and retention settings

  • Traefik IngressRoute configuration

  • Custom dashboard import patterns

  • AlertManager Slack integration example

  • Troubleshooting section

  • Validation health checks

Archived original to .archive/k3s-prometheus-grafana.adoc.bak
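
As an illustration of the kind of validation health check listed above, the sketch below polls the stock health endpoints of each component. The base URLs are placeholders (assumptions), not the ingress hostnames from the runbook, and this is not the runbook's own check script.

```shell
#!/bin/sh
# Sketch: poll the built-in health endpoints of Prometheus, Alertmanager, Grafana.
# The default URLs below are placeholders; override them via the environment.
PROM_URL=${PROM_URL:-http://prometheus.example.internal}
AM_URL=${AM_URL:-http://alertmanager.example.internal}
GRAFANA_URL=${GRAFANA_URL:-http://grafana.example.internal}

# Print OK or FAIL for one endpoint; return non-zero on failure.
check() {
  label=$1
  url=$2
  if curl -fsS --max-time 5 -o /dev/null "$url"; then
    printf 'OK   %s\n' "$label"
  else
    printf 'FAIL %s (%s)\n' "$label" "$url"
    return 1
  fi
}

# Run every check even if an earlier one fails, then report overall status.
run_checks() {
  rc=0
  check "prometheus healthy"   "$PROM_URL/-/healthy"     || rc=1
  check "prometheus ready"     "$PROM_URL/-/ready"       || rc=1
  check "alertmanager healthy" "$AM_URL/-/healthy"       || rc=1
  check "grafana health"       "$GRAFANA_URL/api/health" || rc=1
  return "$rc"
}
```

Calling run_checks after a deployment gives a single exit status that can gate later steps, while still printing which component failed.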

kvm-02 Hardware Upgrade

Installing 64GB RAM in Supermicro B (kvm-02).

Planned VMs for kvm-02

VM            | Resources    | Purpose
k3s-master-02 | 4 vCPU, 8GB  | k3s HA (Raft)
k3s-master-03 | 4 vCPU, 8GB  | k3s HA (Raft)
vault-02      | 2 vCPU, 4GB  | Vault HA (Raft)
vault-03      | 2 vCPU, 4GB  | Vault HA (Raft)
k3s-worker-01 | 4 vCPU, 16GB | Workloads (Wazuh, etc.)
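
A quick capacity check of the plan above, as a sketch. It assumes 64GB is kvm-02's total RAM after the upgrade and ignores hypervisor overhead; the figures are the planned allocations from the table.

```shell
#!/bin/sh
# Sketch: total the planned vCPU and RAM allocations for kvm-02.
# Assumes 64 GB total host RAM; columns are name, vCPU, RAM in GB.
plan_totals() {
  awk '{ cpu += $2; ram += $3 }
       END { printf "%d vCPU, %d GB RAM, %d GB left of 64\n", cpu, ram, 64 - ram }' <<'EOF'
k3s-master-02 4 8
k3s-master-03 4 8
vault-02 2 4
vault-03 2 4
k3s-worker-01 4 16
EOF
}
plan_totals
# prints: 16 vCPU, 40 GB RAM, 24 GB left of 64
```

Under that assumption the five VMs commit 40GB, leaving roughly 24GB for the host and future workloads.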

Work Status

Active Projects

Project                       | Priority | Description                                   | Status
Linux AD Auth (Xianming Ding) | P0       | dACL + 802.1X for Linux research workstations | Runbook ready, awaiting deployment window
iPSK Manager HA               | P1       | High availability for iPSK-Manager            | Blocked - need to resolve DB replication
ISE 3.4 Migration             | P1       | Upgrade ise-02 (3.2p9) → ise-01 (3.4)         | In planning
Switch Firmware Upgrades      | P2       | C9300 IOS-XE upgrade                          | Scheduled for maintenance window

Monday Preparation

  • Review Linux AD Auth deployment steps

  • Check iPSK-Manager DB replication status

  • Update CHLA change request for switch upgrades

Documentation Commits

Repo              | Commits
domus-infra-ops   | 0696fbe - Vault SSH scripts
                  | 3dea199 - k3s-prometheus-grafana runbook
domus-infra-ops   | 89d1ce3 - NAS scp workaround + share status
domus-netapi-docs | (synoshare syntax fixes - already pushed earlier)

Next Actions

  • Deploy Prometheus + Grafana (follow runbook)

  • Complete kvm-02 hardware + deploy VMs

  • k3s HA cluster (3 masters)

  • Vault HA cluster (3 nodes)