Vault Enterprise Hardening Roadmap
Complete Infrastructure Overview
Vault is the secrets backbone for this entire ecosystem. Every service shown here will integrate with Vault for certificates, secrets, and dynamic credentials.
Current State Assessment
| Area | Status |
|---|---|
| Two-Tier PKI | DOMUS-ROOT-CA (offline) + DOMUS-ISSUING-CA (online) |
| Certificate Issuance | EAP-TLS certs for 802.1X working |
| High Availability | Single node = single point of failure |
| Audit Logging | /var/log/vault/audit.log enabled (2026-02-21) |
| Policies | pki-issuer, kv-reader, admin, ssh-client |
| Auth Methods | AppRole (netapi, ssh-user) - LDAP/OIDC planned |
Target Topology
| Host | Current VMs | Target VMs |
|---|---|---|
| Supermicro A | vault-01, bind-01, k3s-01 | vault-01 (leader), bind-01, k3s-01 |
| Supermicro B | (planned) | vault-02, vault-03, bind-02, k3s-02 |
Raft Quorum: 3 nodes = survives 1 node failure
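The quorum arithmetic behind that line can be sketched as follows. This is a minimal illustration of majority math for N voters; the function names are made up for this example and are not Vault commands.

```shell
# Quorum: the smallest majority of N voting nodes.
raft_quorum() { echo $(( $1 / 2 + 1 )); }

# Failure tolerance: how many voters can be lost while a majority remains.
raft_fault_tolerance() { echo $(( ($1 - 1) / 2 )); }

for n in 1 2 3 5; do
  echo "nodes=$n quorum=$(raft_quorum "$n") tolerates=$(raft_fault_tolerance "$n")"
done
```

Note that two nodes tolerate no failures at all, which is why the target topology goes straight from one node to three.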
Kubernetes Identity Integration
Once Vault HA is operational, it becomes the secrets backbone for container workloads. This diagram shows how your existing infrastructure (AD, Keycloak, Vault, ISE) powers Kubernetes:
Key Integration Points:
- Active Directory - Source of truth for users and groups
- Keycloak - OIDC broker for k8s API and application SSO
- Vault - Secrets injection via Vault Agent sidecar (no hardcoded credentials)
- ISE - 802.1X authentication for k3s nodes at the network layer
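Once an audit device is enabled, Vault will STOP serving requests if it cannot write to at least one enabled audit device. With this file device as the only one, an unwritable or full log path halts Vault. Ensure the path is reliable.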
# Create audit directory
sudo mkdir -p /var/log/vault
sudo chown vault:vault /var/log/vault
# Enable file audit
vault audit enable file file_path=/var/log/vault/audit.log
# Verify
vault audit list
vault secrets list
sudo tail -1 /var/log/vault/audit.log | jq '.request.path'
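Because an unwritable audit path blocks requests, a pre-flight check on the log directory may be worth running from cron or monitoring. The sketch below is an assumption-laden example: `check_audit_path` and the 100 MB threshold are hypothetical, and it only tests writability for the user running it (run it as the vault user for a meaningful result).

```shell
# check_audit_path DIR MIN_FREE_KB
# Prints "ok", "not-writable", "low-space" or "unknown"; returns non-zero unless ok.
check_audit_path() {
  dir=$1
  min_kb=$2
  if [ ! -d "$dir" ] || [ ! -w "$dir" ]; then
    echo "not-writable"
    return 1
  fi
  # POSIX df output: the fourth column of the second line is available KB.
  free_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  if [ -z "$free_kb" ]; then
    echo "unknown"
    return 1
  fi
  if [ "$free_kb" -lt "$min_kb" ]; then
    echo "low-space"
    return 1
  fi
  echo "ok"
}

# Example: require roughly 100 MB free under the audit directory.
check_audit_path /var/log/vault 102400 || echo "audit path needs attention" >&2
```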
2.1 PKI Issuer Policy
For netapi and automation to issue certificates:
cat > /tmp/pki-issuer.hcl << 'EOF'
path "pki_int/issue/*" {
capabilities = ["create", "update"]
}
path "pki_int/certs" {
capabilities = ["list"]
}
path "pki_int/cert/*" {
capabilities = ["read"]
}
path "pki_int/ca/pem" {
capabilities = ["read"]
}
path "pki_int/ca_chain" {
capabilities = ["read"]
}
EOF
vault policy write pki-issuer /tmp/pki-issuer.hcl
2.2 KV Reader Policy
Read-only secrets access:
cat > /tmp/kv-reader.hcl << 'EOF'
path "kv/data/domus/*" {
capabilities = ["read", "list"]
}
path "kv/metadata/domus/*" {
capabilities = ["list"]
}
EOF
vault policy write kv-reader /tmp/kv-reader.hcl
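KV v2 splits each mount into `data/` and `metadata/` API paths, which is why the policy above needs two stanzas. A small generator makes that pattern reusable for other prefixes. This is a sketch only: `render_kv_reader_policy` is a hypothetical helper, and it simply mirrors the capabilities used in 2.2.

```shell
# render_kv_reader_policy MOUNT PREFIX
# Prints a read-only policy for one prefix of a KV v2 mount.
render_kv_reader_policy() {
  mount=$1
  prefix=$2
  cat <<EOF
path "${mount}/data/${prefix}/*" {
  capabilities = ["read", "list"]
}
path "${mount}/metadata/${prefix}/*" {
  capabilities = ["list"]
}
EOF
}

render_kv_reader_policy kv domus
```

The output can be piped straight into Vault, for example `render_kv_reader_policy kv domus | vault policy write kv-reader -`, which avoids the temporary file.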
2.3 Admin Policy
Full access (use sparingly):
cat > /tmp/admin.hcl << 'EOF'
path "*" {
capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
EOF
vault policy write admin /tmp/admin.hcl
2.4 Verify Policies
vault policy list
admin
default
kv-reader
pki-issuer
root
ssh-client
Use Cases:
-
API keys that automation needs
-
Service account passwords
-
Shared team secrets
-
Secrets that need versioning/rotation
# Enable KV v2
vault secrets enable -path=kv kv-v2
# Create namespace structure
vault kv put kv/domus/infrastructure/placeholder initialized=true
vault kv put kv/domus/automation/placeholder initialized=true
vault kv put kv/domus/certificates/placeholder initialized=true
4.1 AppRole for Automation
For netapi, dsec, CI/CD pipelines:
vault auth enable approle
vault write auth/approle/role/netapi \
token_policies="pki-issuer" \
token_ttl=1h \
token_max_ttl=4h \
secret_id_ttl=0
# Get role ID
vault read auth/approle/role/netapi/role-id
# Generate secret ID (store securely!)
vault write -f auth/approle/role/netapi/secret-id
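For automation that talks to the HTTP API rather than the CLI, the AppRole login is a POST to `auth/approle/login` whose response carries the token in `.auth.client_token`. The helpers below are a sketch: the function names are hypothetical, `curl` and `jq` are assumed to be installed, and the payload builder assumes the IDs are plain UUID-style strings that need no JSON escaping.

```shell
# approle_payload ROLE_ID SECRET_ID
# Prints the JSON login body (assumes IDs contain no quotes or backslashes).
approle_payload() {
  printf '{"role_id":"%s","secret_id":"%s"}\n' "$1" "$2"
}

# approle_login VAULT_ADDR ROLE_ID SECRET_ID
# Prints the client token on success; requires curl and jq.
approle_login() {
  approle_payload "$2" "$3" |
    curl -sS --fail --request POST --data @- "$1/v1/auth/approle/login" |
    jq -r '.auth.client_token'
}
```

A caller would typically do something like `VAULT_TOKEN=$(approle_login "$VAULT_ADDR" "$ROLE_ID" "$SECRET_ID")`, where `ROLE_ID` and `SECRET_ID` are placeholders for the values obtained above.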
4.2 LDAP Auth (Future)
vault auth enable ldap
vault write auth/ldap/config \
url="ldaps://home-dc01.inside.domusdigitalis.dev" \
userdn="CN=Users,DC=inside,DC=domusdigitalis,DC=dev" \
groupdn="CN=Users,DC=inside,DC=domusdigitalis,DC=dev" \
binddn="CN=vault-ldap,CN=Users,DC=inside,DC=domusdigitalis,DC=dev" \
bindpass="<ldap-bind-password>" \
certificate=@/etc/ssl/certs/ca-certificates.crt
4.3 OIDC via Keycloak (Future)
vault auth enable oidc
vault write auth/oidc/config \
oidc_discovery_url="https://keycloak-01.inside.domusdigitalis.dev:8443/realms/domusdigitalis" \
oidc_client_id="vault" \
oidc_client_secret="<client-secret>" \
default_role="reader"
WARNING: Requires planning and downtime. Back up Vault data directory first!
5.1 Prerequisites
- Vault External TLS configured on current node
- Supermicro Host B operational (or Synology NFS for initial deployment)
- 3 VMs: vault-01, vault-02, vault-03
- DNS records configured
- Firewall: 8200/tcp (API), 8201/tcp (Raft)
- TLS certs from DOMUS-ISSUING-CA for each node
- File→Raft migration completed (see 5.2 below)
5.2 File to Raft Storage Migration
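NOTE: If vault-01 currently uses file storage, it must be migrated to Raft integrated storage before other nodes can join, because the file backend does not support HA.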
5.2.1 Check Current Storage Backend
ssh vault-01 "grep -A3 'storage' /etc/vault.d/vault.hcl"
If output shows storage "file", proceed with migration. If storage "raft", skip to 5.3.
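The decision above can also be scripted. The sketch below extracts the backend name from the first `storage "..."` stanza of a config file; `storage_backend` is a hypothetical helper, and the demo uses a throwaway file rather than the real `/etc/vault.d/vault.hcl`.

```shell
# storage_backend CONFIG_FILE
# Prints the backend named in the first storage stanza, or nothing if none is found.
storage_backend() {
  sed -n 's/^[[:space:]]*storage[[:space:]]*"\([^"]*\)".*/\1/p' "$1" | head -n 1
}

# Demo against a temporary config that uses file storage.
cfg=$(mktemp)
printf 'ui = true\nstorage "file" {\n  path = "/opt/vault/data"\n}\n' > "$cfg"
case "$(storage_backend "$cfg")" in
  file) echo "file storage: proceed with migration (5.2)" ;;
  raft) echo "raft storage: skip to 5.3" ;;
  *)    echo "unrecognized storage stanza: inspect the config by hand" ;;
esac
rm -f "$cfg"
```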
5.2.2 Backup Current Vault Data
# Stop Vault service
ssh vault-01 "sudo systemctl stop vault"
# Create backup of file storage
ssh vault-01 "sudo tar -czvf /tmp/vault-file-backup-$(date +%Y%m%d).tar.gz /opt/vault/data"
# Copy backup to NAS (belt and suspenders)
ssh vault-01 "sudo cp /tmp/vault-file-backup-*.tar.gz /mnt/nas/backups/vault/"
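Before touching the live data it is worth confirming that the archive is actually readable. The check below is a sketch: `verify_backup` is a hypothetical helper, and it only verifies gzip integrity and that the expected path prefix is present, not that every file was captured.

```shell
# verify_backup ARCHIVE EXPECTED_PREFIX
# Prints "ok", "missing-or-empty", "corrupt" or "unexpected-contents".
verify_backup() {
  archive=$1
  prefix=$2
  if [ ! -s "$archive" ]; then
    echo "missing-or-empty"
    return 1
  fi
  if ! gzip -t "$archive" 2>/dev/null; then
    echo "corrupt"
    return 1
  fi
  # tar stores /opt/vault/data without the leading slash.
  if tar -tzf "$archive" | grep -q "^${prefix}"; then
    echo "ok"
  else
    echo "unexpected-contents"
    return 1
  fi
}

# Usage on vault-01 (the archive name is a placeholder):
#   verify_backup /tmp/vault-file-backup-YYYYMMDD.tar.gz opt/vault/data/
```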
5.2.3 Create Migration Configuration
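# Use sudo tee: with "sudo cat > file" the redirect runs as the unprivileged user and fails
ssh vault-01 "sudo tee /etc/vault.d/migrate.hcl > /dev/null << 'EOF'
storage_source \"file\" {
  path = \"/opt/vault/data\"
}
storage_destination \"raft\" {
  path = \"/opt/vault/raft\"
  node_id = \"vault-01\"
}
# cluster_addr is required when the destination is raft
cluster_addr = \"https://vault-01.inside.domusdigitalis.dev:8201\"
EOF"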
5.2.4 Prepare Raft Directory
ssh vault-01 "sudo mkdir -p /opt/vault/raft && sudo chown vault:vault /opt/vault/raft"
5.2.5 Run Migration
ssh vault-01 "sudo -u vault vault operator migrate -config=/etc/vault.d/migrate.hcl"
2026/02/24 14:30:00 [INFO] copied key: core/...
2026/02/24 14:30:00 [INFO] copied key: logical/...
...
Success! All data migrated.
5.2.6 Update vault.hcl for Raft
Backup existing config:
ssh vault-01 "sudo cp /etc/vault.d/vault.hcl /etc/vault.d/vault.hcl.file-backup"
Create new Raft-enabled config:
# Use sudo tee: with "sudo cat > file" the redirect runs as the unprivileged user and fails
ssh vault-01 "sudo tee /etc/vault.d/vault.hcl > /dev/null << 'EOF'
# Vault Configuration - Raft HA Cluster
# Migrated from file storage on $(date +%Y-%m-%d)
ui = true
disable_mlock = true
# Raft Integrated Storage (HA-ready)
storage \"raft\" {
  path = \"/opt/vault/raft\"
  node_id = \"vault-01\"
  # Retry join for HA (vault-02/03 will join this node)
  retry_join {
    leader_api_addr = \"https://vault-02.inside.domusdigitalis.dev:8200\"
  }
  retry_join {
    leader_api_addr = \"https://vault-03.inside.domusdigitalis.dev:8200\"
  }
}
# HTTPS listener
listener \"tcp\" {
  address = \"0.0.0.0:8200\"
  tls_cert_file = \"/opt/vault/tls/vault.crt\"
  tls_key_file = \"/opt/vault/tls/vault.key\"
}
# Cluster communication
cluster_addr = \"https://vault-01.inside.domusdigitalis.dev:8201\"
api_addr = \"https://vault-01.inside.domusdigitalis.dev:8200\"
# Audit logging
# Enable with: vault audit enable file file_path=/var/log/vault/audit.log
EOF"
5.2.7 Start Vault and Unseal
ssh vault-01 "sudo systemctl start vault"
# Check status (will be sealed)
ssh vault-01 "vault status"
# Unseal (requires 2 of 3 keys from dsec d000 dev/vault)
# -t allocates a TTY so the unseal prompt can hide the key as it is typed
ssh -t vault-01 "vault operator unseal" # Enter key 1
ssh -t vault-01 "vault operator unseal" # Enter key 2
5.2.8 Verify Migration
# Check storage backend
ssh vault-01 "vault status | grep -E 'Storage|HA'"
Storage Type    raft
HA Enabled      true
HA Cluster      https://vault-01.inside.domusdigitalis.dev:8201
HA Mode         active
# Verify data integrity - list PKI certs
ssh vault-01 "vault list pki_int/certs"
# Verify SSH CA
ssh vault-01 "vault read ssh/config/ca"
5.3 Deploy vault-02 and vault-03
Deploy two new VMs for HA. Use Rocky Linux 9 (same as vault-01).
5.3.1 VM Deployment Options
| Option | Storage Location | Failure Domain |
|---|---|---|
| Recommended | vault-01 on kvm-01 SSD, vault-02 on kvm-02 SSD, vault-03 on NAS | Survives single host or NAS failure |
| Initial (today) | All 3 on Synology NAS NFS | NAS = SPOF, but gets HA running |
| Future | Move vault-02 to kvm-02 local SSD when available | Proper failure domains |
5.3.2 Create vault-02 and vault-03 VMs
Use same cloud-init pattern as k3s-master-01 (see k3s Deployment).
# On kvm-01 or kvm-02
for NODE in vault-02 vault-03; do
  IP_SUFFIX=$([[ "$NODE" == "vault-02" ]] && echo "61" || echo "62")
  # Create cloud-init (YAML nesting matters; the heredoc body stays at column 0)
  cat > /tmp/${NODE}-cloud-init.yml << EOF
#cloud-config
hostname: ${NODE}
fqdn: ${NODE}.inside.domusdigitalis.dev
users:
  - name: ansible
    groups: wheel
    sudo: ALL=(ALL) NOPASSWD:ALL
    ssh_authorized_keys:
      - $(cat ~/.ssh/id_ed25519.pub)
runcmd:
  - dnf install -y vault
EOF
  # Create VM (adjust paths for your storage)
  # ... (use virt-install pattern from k3s-deployment)
done
5.3.3 Install Vault on New Nodes
for NODE in vault-02 vault-03; do
ssh $NODE "sudo dnf install -y dnf-plugins-core && \
sudo dnf config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo && \
sudo dnf install -y vault"
done
5.3.4 Configure vault-02.hcl
# Use sudo tee: with "sudo cat > file" the redirect runs as the unprivileged user and fails
ssh vault-02 "sudo tee /etc/vault.d/vault.hcl > /dev/null << 'EOF'
ui = true
disable_mlock = true
storage \"raft\" {
  path = \"/opt/vault/raft\"
  node_id = \"vault-02\"
  retry_join {
    leader_api_addr = \"https://vault-01.inside.domusdigitalis.dev:8200\"
  }
  retry_join {
    leader_api_addr = \"https://vault-03.inside.domusdigitalis.dev:8200\"
  }
}
listener \"tcp\" {
  address = \"0.0.0.0:8200\"
  tls_cert_file = \"/opt/vault/tls/vault.crt\"
  tls_key_file = \"/opt/vault/tls/vault.key\"
}
cluster_addr = \"https://vault-02.inside.domusdigitalis.dev:8201\"
api_addr = \"https://vault-02.inside.domusdigitalis.dev:8200\"
EOF"
5.3.5 Configure vault-03.hcl
# Use sudo tee: with "sudo cat > file" the redirect runs as the unprivileged user and fails
ssh vault-03 "sudo tee /etc/vault.d/vault.hcl > /dev/null << 'EOF'
ui = true
disable_mlock = true
storage \"raft\" {
  path = \"/opt/vault/raft\"
  node_id = \"vault-03\"
  retry_join {
    leader_api_addr = \"https://vault-01.inside.domusdigitalis.dev:8200\"
  }
  retry_join {
    leader_api_addr = \"https://vault-02.inside.domusdigitalis.dev:8200\"
  }
}
listener \"tcp\" {
  address = \"0.0.0.0:8200\"
  tls_cert_file = \"/opt/vault/tls/vault.crt\"
  tls_key_file = \"/opt/vault/tls/vault.key\"
}
cluster_addr = \"https://vault-03.inside.domusdigitalis.dev:8201\"
api_addr = \"https://vault-03.inside.domusdigitalis.dev:8200\"
EOF"
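The vault-02 and vault-03 configs differ only in the node name and the peer list, so they could be rendered from one template instead of being maintained by hand. The following is a sketch under that assumption; `render_vault_hcl` and the `DOMAIN` variable are hypothetical, and the paths are copied from the configs above.

```shell
DOMAIN="inside.domusdigitalis.dev"

# render_vault_hcl NODE PEER...
# Prints a Raft node config with one retry_join stanza per peer.
render_vault_hcl() {
  node=$1
  shift
  cat <<EOF
ui = true
disable_mlock = true
storage "raft" {
  path = "/opt/vault/raft"
  node_id = "${node}"
EOF
  for peer in "$@"; do
    cat <<EOF
  retry_join {
    leader_api_addr = "https://${peer}.${DOMAIN}:8200"
  }
EOF
  done
  cat <<EOF
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/opt/vault/tls/vault.crt"
  tls_key_file = "/opt/vault/tls/vault.key"
}
cluster_addr = "https://${node}.${DOMAIN}:8201"
api_addr = "https://${node}.${DOMAIN}:8200"
EOF
}

render_vault_hcl vault-02 vault-01 vault-03
```

To deploy, the output could be piped to the node, for example `render_vault_hcl vault-02 vault-01 vault-03 | ssh vault-02 "sudo tee /etc/vault.d/vault.hcl > /dev/null"`.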
5.4 TLS Certificates for New Nodes
Issue certs from Vault PKI for vault-02 and vault-03.
for NODE in vault-02 vault-03; do
# Issue cert
vault write -format=json pki_int/issue/domus-server \
common_name="${NODE}.inside.domusdigitalis.dev" \
alt_names="${NODE}" \
ttl="8760h" > /tmp/${NODE}-cert.json
# Extract
jq -r '.data.certificate' /tmp/${NODE}-cert.json > /tmp/${NODE}.crt
jq -r '.data.private_key' /tmp/${NODE}-cert.json > /tmp/${NODE}.key
jq -r '.data.ca_chain[]' /tmp/${NODE}-cert.json > /tmp/${NODE}-chain.crt
# Deploy
ssh $NODE "sudo mkdir -p /opt/vault/tls"
scp /tmp/${NODE}.crt ${NODE}:/tmp/
scp /tmp/${NODE}.key ${NODE}:/tmp/
scp /tmp/${NODE}-chain.crt ${NODE}:/tmp/
ssh $NODE "sudo mv /tmp/${NODE}.crt /opt/vault/tls/vault.crt && \
sudo mv /tmp/${NODE}.key /opt/vault/tls/vault.key && \
sudo tee -a /opt/vault/tls/vault.crt < /tmp/${NODE}-chain.crt > /dev/null && \
sudo chown vault:vault /opt/vault/tls/* && \
sudo chmod 600 /opt/vault/tls/vault.key"
done
5.5 Join Cluster
5.5.1 Start Vault on New Nodes
for NODE in vault-02 vault-03; do
ssh $NODE "sudo mkdir -p /opt/vault/raft && sudo chown vault:vault /opt/vault/raft"
ssh $NODE "sudo systemctl enable --now vault"
done
5.5.2 Join to Leader (vault-01)
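# On vault-02
ssh vault-02 "VAULT_ADDR='https://vault-02.inside.domusdigitalis.dev:8200' vault operator raft join https://vault-01.inside.domusdigitalis.dev:8200"
# On vault-03
ssh vault-03 "VAULT_ADDR='https://vault-03.inside.domusdigitalis.dev:8200' vault operator raft join https://vault-01.inside.domusdigitalis.dev:8200"
# Shamir seal: unseal each joined node with the cluster's existing keys (run twice per node, 2 of 3 keys)
ssh -t vault-02 "VAULT_ADDR='https://vault-02.inside.domusdigitalis.dev:8200' vault operator unseal"
ssh -t vault-03 "VAULT_ADDR='https://vault-03.inside.domusdigitalis.dev:8200' vault operator unseal"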
5.6 Verify HA Cluster
# List Raft peers (run from any node)
vault operator raft list-peers
Node        Address                                    State       Voter
----        -------                                    -----       -----
vault-01    vault-01.inside.domusdigitalis.dev:8201    leader      true
vault-02    vault-02.inside.domusdigitalis.dev:8201    follower    true
vault-03    vault-03.inside.domusdigitalis.dev:8201    follower    true
# Check HA status from each node
for NODE in vault-01 vault-02 vault-03; do
echo "=== $NODE ==="
ssh $NODE "vault status | grep -E 'HA|Storage'"
done
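For scripted checks, the peer table can be reduced to a one-line summary. This sketch assumes the column layout shown above (Node, Address, State, Voter); `summarize_peers` is a hypothetical helper that reads the table on stdin.

```shell
# summarize_peers
# Reads `vault operator raft list-peers` table output on stdin and prints
# the leader name and the number of voters.
summarize_peers() {
  awk '
    $1 != "Node" && $1 !~ /^-+$/ && NF >= 4 {
      if ($3 == "leader") leader = $1
      if ($4 == "true") voters++
    }
    END { printf "leader=%s voters=%d\n", (leader == "" ? "none" : leader), voters }
  '
}

# Demo with the sample output from this section.
summarize_peers <<'EOF'
Node        Address                                    State       Voter
----        -------                                    -----       -----
vault-01    vault-01.inside.domusdigitalis.dev:8201    leader      true
vault-02    vault-02.inside.domusdigitalis.dev:8201    follower    true
vault-03    vault-03.inside.domusdigitalis.dev:8201    follower    true
EOF
```

Against a live cluster this would be `vault operator raft list-peers | summarize_peers`; a healthy three-node cluster should report three voters and a named leader.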
5.7 DNS Load Balancing (Optional)
For client HA, add DNS round-robin or use a load balancer.
# Add vault.inside.domusdigitalis.dev with round-robin A records via BIND
ssh bind-01 "sudo nsupdate -l << 'EOF'
zone inside.domusdigitalis.dev
update add vault.inside.domusdigitalis.dev. 300 A 10.50.1.60
update add vault.inside.domusdigitalis.dev. 300 A 10.50.1.61
update add vault.inside.domusdigitalis.dev. 300 A 10.50.1.62
send
EOF"
# Low TTL (300s) for faster failover
# For true LB, use HAProxy or keepalived VIP
5.8 Verify Failover
Test that cluster survives node failure:
# Stop the leader
ssh vault-01 "sudo systemctl stop vault"
# Check new leader election (wait 10s)
sleep 10
ssh vault-02 "vault operator raft list-peers"
# Verify operations still work (VAULT_ADDR must point at a surviving node, not vault-01)
vault list pki_int/certs
# Restart vault-01
ssh vault-01 "sudo systemctl start vault"
# -t allocates a TTY so the unseal prompt can hide the key as it is typed
ssh -t vault-01 "vault operator unseal" # Key 1
ssh -t vault-01 "vault operator unseal" # Key 2
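The fixed `sleep 10` in the failover test is a guess at election time. A polling loop is one alternative: it returns as soon as a leader appears and fails after a bounded number of attempts. `wait_until` below is a hypothetical generic helper, not part of Vault.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds; returns 1 if every attempt fails.
wait_until() {
  attempts=$1
  delay=$2
  shift 2
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$n" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    n=$((n + 1))
  done
  return 1
}

# Usage sketch (has_leader is an assumed wrapper, defined by the operator):
#   has_leader() { ssh vault-02 "vault operator raft list-peers" | grep -q leader; }
#   wait_until 15 2 has_leader || echo "no leader elected within 30s" >&2
```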
Current State: Shamir Seal (Manual)
Vault uses Shamir’s Secret Sharing (published by Adi Shamir in 1979; he is the "S" in RSA). The master key is split into N shares, and any K of them (the threshold) are enough to reconstruct it.
| Parameter | Value |
|---|---|
| Total shares | 3 |
| Threshold | 2 |
| Storage | dsec (d000/dev/vault) |
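To see why any 2 of the 3 shares are sufficient, here is a toy version of the arithmetic. It is not Vault's implementation: it uses a tiny prime field (`P=257`), a single small integer as the secret, and a fixed coefficient where a real scheme would use a random one. All names and values are made up for illustration. With threshold 2 the secret is the constant term of a line, and each share is a point on that line.

```shell
P=257  # small prime modulus, for illustration only

# mod X: reduce X into the range 0..P-1 (handles negative values).
mod() { echo $(( (($1 % P) + P) % P )); }

# modinv X: brute-force modular inverse of X modulo P.
modinv() {
  a=$(mod "$1")
  i=1
  while [ "$i" -lt "$P" ]; do
    if [ $(( (a * i) % P )) -eq 1 ]; then
      echo "$i"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}

# share SECRET COEFF X: evaluate f(X) = SECRET + COEFF * X (mod P).
share() { mod $(( $1 + $2 * $3 )); }

# recover X1 Y1 X2 Y2: interpolate f(0) from two distinct shares.
recover() {
  num=$(mod $(( $2 * $3 - $4 * $1 )))
  inv=$(modinv $(( $3 - $1 )))
  mod $(( num * inv ))
}

SECRET=123
COEFF=45  # would be random in a real scheme
s1=$(share "$SECRET" "$COEFF" 1)
s2=$(share "$SECRET" "$COEFF" 2)
s3=$(share "$SECRET" "$COEFF" 3)
echo "shares: (1,$s1) (2,$s2) (3,$s3)"
echo "recovered from shares 1 and 3: $(recover 1 "$s1" 3 "$s3")"
```

A single share reveals nothing about the secret on its own, because infinitely many lines pass through one point. That is the property the 2-of-3 split relies on.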
Risk: After VM restart, power outage, or Vault service restart - Vault remains SEALED until 2 operators provide unseal keys. All PKI, secrets, and SSH CA operations fail.
Monitoring Sealed State
Add to monitoring stack (Prometheus/Zabbix):
# Check seal status (false = unsealed, true = sealed); the listener is TLS-only, so use https and the node FQDN
curl -s https://vault-01.inside.domusdigitalis.dev:8200/v1/sys/health | jq '.sealed'
# One-liner for cron alerting
vault status -format=json | jq -e '.sealed == false' > /dev/null || echo "VAULT SEALED" | mail -s "ALERT: Vault Sealed" admin@example.com
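Monitoring systems that only see HTTP status codes can still tell the states apart, because `/v1/sys/health` answers with a different code per state by default. The mapping below is a sketch; `health_state` is a hypothetical helper and the labels are chosen for this example.

```shell
# health_state HTTP_CODE
# Maps the default /v1/sys/health status codes to a short label.
health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown" ;;
  esac
}

# Usage sketch (VAULT_ADDR is assumed to be set to a node's https URL):
#   code=$(curl -s -o /dev/null -w '%{http_code}' "$VAULT_ADDR/v1/sys/health")
#   echo "vault is $(health_state "$code")"
```

In an HA cluster a standby answering 429 is healthy, so alerting should key on `sealed`, `uninitialized` and `unknown` rather than on any non-200 response.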
Auto-Unseal Options
| Method | Use Case | Cost |
|---|---|---|
| AWS KMS | Cloud deployments | ~$1/month |
| Azure Key Vault | Azure deployments | ~$0.03/10k ops |
| GCP Cloud KMS | GCP deployments | ~$0.03/10k ops |
| Transit Engine | Self-hosted (another Vault) | Infrastructure only |
| HSM (PKCS#11) | Hardware Security Module | $$$ |
Recommended: Transit Auto-Unseal
For homelab/enterprise: Deploy a minimal "seal Vault" on separate infrastructure:
┌─────────────────────┐      ┌─────────────────────┐
│  Primary Vault      │      │  Seal Vault         │
│  (vault-01)         │◄────►│  (nas-01 container) │
│                     │      │                     │
│  - PKI              │      │  - Transit only     │
│  - KV secrets       │      │  - Auto-unseal key  │
│  - SSH CA           │      │  - Minimal surface  │
│  - Auto-unseals     │      │  - Different host   │
└─────────────────────┘      └─────────────────────┘
Implementation: Phase 6 of this roadmap (future work).
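As a preview of what Phase 6 would add to each primary node's config, the sketch below renders a `seal "transit"` stanza. Everything specific here is an assumption: the seal Vault address, the key name `autounseal` and the helper `render_transit_seal` are placeholders, not decisions made in this roadmap.

```shell
# render_transit_seal SEAL_VAULT_ADDR KEY_NAME MOUNT_PATH
# Prints a seal stanza for the primary Vault's config file.
render_transit_seal() {
  cat <<EOF
seal "transit" {
  address = "$1"
  key_name = "$2"
  mount_path = "$3"
  # The token is deliberately not written here; the transit seal can read it
  # from the VAULT_TOKEN environment variable of the Vault service instead.
}
EOF
}

render_transit_seal "https://seal-vault.example.internal:8200" "autounseal" "transit/"
```

Moving an existing Shamir-sealed cluster to this seal is a separate migration step (unsealing with the `-migrate` flag), which belongs in the Phase 6 runbook rather than here.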
Quick Reference
Daily Operations
# Login with AppRole
vault write auth/approle/login role_id=<role-id> secret_id=<secret-id>
# Issue certificate
vault write pki_int/issue/domus-client-users common_name="hostname.inside.domusdigitalis.dev"
# Read a secret
vault kv get kv/domus/infrastructure/ise
Related Documentation
- PKI Strategy runbook (infra-ops)
- DOMUS PKI Key Ceremony (infra-ops)
- dsec Integration (secrets-infrastructure)
Revision History
| Version | Date | Changes |
|---|---|---|
| 1.4 | 2026-02-24 | Phase 5 overhaul: Added file→raft migration (5.2), VM deployment (5.3), TLS certs (5.4), cluster join (5.5), verification (5.6), failover testing (5.8). Root cause: Original Phase 5 assumed Raft storage, but vault-01 uses file storage. |
| 1.3 | 2026-02-18 | Added Complete Infrastructure Overview with radial diagram |
| 1.2 | 2026-02-18 | Added Kubernetes Identity Integration section with diagram |
| 1.1 | 2026-02-18 | Visual redesign with phase cards |
| 1.0 | 2026-02-17 | Initial roadmap with 6 phases |