Disaster Recovery Roadmap

This roadmap defines the disaster recovery strategy for all critical infrastructure systems, including backup automation and deployment of a secondary Supermicro server.

1. Status Overview

System               Backup Status      Priority
ISE-02 (PAN)         Configured         P1
pfSense-01           Configured         P1
bind-01 (BIND DNS)   NOT CONFIGURED     P0 - CRITICAL
ipa-01 (FreeIPA)     NOT CONFIGURED     P0 - CRITICAL
Keycloak-01          NOT CONFIGURED     P1
KVM-01 VMs           Partial            P1
NAS-01 (Synology)    RAID + Snapshots   P2
home-dc01 (AD)       NOT CONFIGURED     P1

2. Phase 1: Critical Backups (Immediate)

2.1. 1.1 FreeIPA Backup (ipa-01)

FreeIPA includes built-in backup tools.

# Full backup (offline - stops services briefly)
sudo ipa-backup

# Data-only backup (online - no service interruption)
sudo ipa-backup --data --online

Backups are stored in /var/lib/ipa/backup/, one directory per run.

Automation:

# Add to root's crontab (ipa-backup must run as root)
0 2 * * * /usr/sbin/ipa-backup --data --online --log-file=/var/log/ipa-backup.log

Upload to NAS:

# After backup, sync to NAS
rsync -avz /var/lib/ipa/backup/ nas-01:/volume1/backups/ipa-01/
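The cron entry and rsync step above never remove old backups, so /var/lib/ipa/backup/ grows without bound. A minimal sketch of a retention helper follows. The function name `prune_old_backups` and the 14-day default are illustrative assumptions, not part of the existing setup, and `-mindepth`/`-maxdepth` assume GNU or BSD find.

```shell
# Hypothetical helper: delete top-level backup entries older than N days.
# ipa-backup writes one directory per run, so only depth 1 is examined.
prune_old_backups() {
    local dir="$1" days="${2:-14}"
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -mtime +N matches entries last modified more than N full days ago.
    find "$dir" -mindepth 1 -maxdepth 1 -mtime +"$days" -exec rm -rf -- {} +
}
```

It could be called as `prune_old_backups /var/lib/ipa/backup 14` once the rsync to the NAS has succeeded, so that a failed upload never leaves the NAS as the only place missing a copy.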

2.2. 1.2 BIND DNS Backup (bind-01)

# Zone files
/var/named/*.zone

# Configuration
/etc/named.conf
/etc/named/*.conf

Backup script:

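#!/bin/bash
# Archive the BIND configuration and zone data, then copy the archive to the NAS.
set -euo pipefail

BACKUP_DIR="/var/backups/bind"
DATE=$(date +%Y%m%d)

mkdir -p "$BACKUP_DIR"
# tar strips the leading "/" from member names, so the archive restores with -C /
tar -czf "$BACKUP_DIR/bind-$DATE.tar.gz" \
    /etc/named.conf \
    /etc/named/ \
    /var/named/

# Upload to NAS
rsync -avz "$BACKUP_DIR/" nas-01:/volume1/backups/bind-01/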
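An archive that is silently corrupted in transit or on disk is only discovered at restore time. One way to catch this earlier is to store a checksum next to each archive and verify it on the NAS side. The sketch below assumes GNU coreutils `sha256sum`; the function names `write_checksum` and `verify_checksum` are hypothetical.

```shell
# Hypothetical helpers: record and verify a SHA-256 checksum beside an archive.
write_checksum() {
    local dir base
    dir=$(dirname "$1")
    base=$(basename "$1")
    # Work inside the archive's directory so the .sha256 file stores a bare
    # file name and still verifies after both files are copied elsewhere.
    ( cd "$dir" && sha256sum "$base" > "$base.sha256" )
}

verify_checksum() {
    local dir base
    dir=$(dirname "$1")
    base=$(basename "$1")
    # --quiet suppresses the per-file OK line; the exit status is non-zero on mismatch.
    ( cd "$dir" && sha256sum --check --quiet "$base.sha256" )
}
```

In the backup script, `write_checksum` would run right after `tar` and before `rsync`, so the checksum file is uploaded together with the archive.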

2.3. 1.3 Keycloak Backup (keycloak-01)

Keycloak uses a PostgreSQL backend.

# Database backup
pg_dump -U keycloak keycloak > /var/backups/keycloak/keycloak-$(date +%Y%m%d).sql

# Or using Keycloak export
/opt/keycloak/bin/kc.sh export --dir /var/backups/keycloak/export
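A dump that was cut short (full disk, dropped connection) still produces a file, so the file's existence alone does not prove the backup is usable. As a rough sanity check, a plain-format pg_dump normally ends with a "PostgreSQL database dump complete" comment. The sketch below relies on that marker; it assumes the plain SQL format used in the command above, and the function name `dump_is_complete` is hypothetical.

```shell
# Hypothetical check: succeed only if the dump is non-empty and carries the
# completion comment that plain-format pg_dump writes near the end of the file.
dump_is_complete() {
    local dump="$1"
    [ -s "$dump" ] || return 1    # missing or empty file
    tail -n 20 "$dump" | grep -q 'PostgreSQL database dump complete'
}
```

A backup job could call `dump_is_complete` on the new file and skip the NAS upload, or raise an alert, when it fails.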

2.4. 1.4 AD Domain Controller (home-dc01)

Windows Server Backup via wbadmin, run from an elevated PowerShell prompt:

# System state backup
wbadmin start systemstatebackup -backupTarget:\\nas-01\backups\home-dc01

# Or full backup
wbadmin start backup -backupTarget:\\nas-01\backups\home-dc01 -include:C: -allCritical

3. Phase 2: KVM VM Backups

3.1. 2.1 Current VMs on KVM-01

VM            Purpose         Backup Method                   Status
ipa-01        FreeIPA IdM     ipa-backup + qcow2 snapshot     TODO
bind-01       BIND DNS        Zone files + qcow2 snapshot     TODO
keycloak-01   SAML/OIDC IdP   pg_dump + qcow2 snapshot        TODO
vault-01      Vault PKI       Vault backup + qcow2 snapshot   TODO
ipsk-mgr-01   iPSK Portal     MySQL dump + qcow2 snapshot     TODO

3.2. 2.2 qcow2 Snapshot Strategy

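# Create an external, disk-only snapshot. The VM can stay running; add --quiesce
# (requires qemu-guest-agent in the guest) to flush guest filesystems first.
virsh snapshot-create-as ipa-01 snap-$(date +%Y%m%d) --disk-only --atomic

# List snapshots
virsh snapshot-list ipa-01

# Back up the base image to the NAS (the running VM now writes to the overlay file)
rsync -avz --progress /var/lib/libvirt/images/ipa-01.qcow2 nas-01:/volume1/backups/kvm-01/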

3.3. 2.3 Automated VM Backup Script

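#!/bin/bash
# /usr/local/bin/backup-vms.sh
# Snapshot each VM, copy its base image to the NAS, then merge the overlay back.
set -u

VMS="ipa-01 bind-01 keycloak-01 vault-01"
NAS_PATH="nas-01:/volume1/backups/kvm-01"
DATE=$(date +%Y%m%d)

for VM in $VMS; do
    echo "Backing up $VM..."

    # Create an external snapshot. Skip the VM if this fails, because copying
    # an image that is still being written to yields an inconsistent backup.
    if ! virsh snapshot-create-as "$VM" "snap-$DATE" --disk-only --atomic; then
        echo "Snapshot failed for $VM, skipping" >&2
        continue
    fi

    # Copy the base image (the VM now writes to the overlay)
    rsync -avz --progress "/var/lib/libvirt/images/${VM}.qcow2" "$NAS_PATH/" \
        || echo "Copy failed for $VM" >&2

    # Always merge the overlay back, even if the copy failed (assumes disk target vda)
    virsh blockcommit "$VM" vda --active --pivot
    # Drop the snapshot metadata; the leftover overlay file must be removed separately
    virsh snapshot-delete "$VM" "snap-$DATE" --metadata
done

echo "Backup complete: $(date)"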
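Because the script above pivots live disks, it is worth rehearsing without touching any VM. One possible approach is a dry-run wrapper that prints each command instead of executing it. In this sketch the `run` and `backup_vm` functions and the `DRY_RUN` variable are hypothetical names introduced for illustration; the virsh and rsync commands are the ones already used in the script.

```shell
# Hypothetical dry-run wrapper: with DRY_RUN=1 commands are printed, not run.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf 'DRY-RUN: %s\n' "$*"
    else
        "$@"
    fi
}

# Same four steps as the backup loop, routed through run().
backup_vm() {
    local vm="$1" date_tag="$2" nas_path="$3"
    run virsh snapshot-create-as "$vm" "snap-$date_tag" --disk-only --atomic
    run rsync -avz "/var/lib/libvirt/images/${vm}.qcow2" "$nas_path/"
    run virsh blockcommit "$vm" vda --active --pivot
    run virsh snapshot-delete "$vm" "snap-$date_tag" --metadata
}
```

Setting `DRY_RUN=1` and calling `backup_vm ipa-01 "$(date +%Y%m%d)" nas-01:/volume1/backups/kvm-01` prints the four commands in order, which makes it easy to review the exact invocations before the first real run.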

4. Phase 3: Secondary Supermicro Server (KVM-02)

4.1. 3.1 Hardware Specifications

Component    Value
Model        Supermicro 300-9D (same as KVM-01)
Hostname     kvm-02.inside.domusdigitalis.dev
IP Address   10.50.1.62 (planned)
IPMI IP      10.50.1.63 (planned)
Role         HA/DR replica, standby VMs

4.2. 3.2 Deployment Tasks

  • Rack and cable KVM-02

  • Install Arch Linux (match KVM-01 config)

  • Configure libvirt/KVM

  • Configure NFS mount to NAS-01

  • Set up VM replication from KVM-01

  • Test failover procedure

4.3. 3.3 Replication Strategy

Option A: rsync-based (Simple)

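# Periodic sync of qcow2 images. Copy only images that are not being written to
# (VM shut down, or the base image of an active snapshot); a live image may arrive inconsistent.
rsync -avz --progress /var/lib/libvirt/images/ kvm-02:/var/lib/libvirt/images/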

Option B: DRBD (Real-time)

DRBD (Distributed Replicated Block Device) mirrors a block device between the two hosts in real time.

# /etc/drbd.d/kvm-storage.res
# Each "on" name must match the output of `uname -n` on that host.
resource kvm-storage {
    device /dev/drbd0;
    disk /dev/sdb1;
    meta-disk internal;

    on kvm-01 {
        address 10.50.1.61:7789;
    }
    on kvm-02 {
        address 10.50.1.62:7789;
    }
}

Option C: Ceph (Enterprise)

Ceph is a distributed storage cluster. It is overkill for two nodes but leaves room to grow.

4.4. 3.4 Failover Procedure

# On KVM-02 (standby becoming primary)

# 1. Verify latest backup sync
ls -la /var/lib/libvirt/images/

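# 2. Import VM definitions. This assumes the XML files were copied over from
#    KVM-01 (for example from `virsh dumpxml` output); libvirt keeps its own
#    persistent definitions under /etc/libvirt/qemu/.
for xml in /var/lib/libvirt/qemu/*.xml; do
    virsh define "$xml"
done

# 3. Start critical VMs (first confirm they are no longer running on KVM-01)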
virsh start ipa-01
virsh start bind-01
virsh start keycloak-01

# 4. Update DNS (on bind-01 or pfSense)
# Point service names to KVM-02 IPs

# 5. Verify services
ssh ipa-01 'sudo ipactl status'
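Step 1 of the procedure above relies on reading `ls -la` output by eye, which is easy to get wrong during an outage. A scripted version might look like the following sketch. The function name `check_images` and the age threshold are assumptions. Note that `rsync -a` preserves source modification times, so the age measured here is the age of the replicated data rather than the time of the last sync.

```shell
# Hypothetical pre-failover check: every expected image exists, is non-empty,
# and was modified within max_age_hours. Prints one line per problem found
# and returns non-zero if there was any.
check_images() {
    local image_dir="$1" max_age_hours="$2"
    shift 2
    local vm image problems=0
    for vm in "$@"; do
        image="$image_dir/$vm.qcow2"
        if [ ! -s "$image" ]; then
            echo "MISSING: $image"
            problems=$((problems + 1))
        elif [ -n "$(find "$image" -mmin +"$((max_age_hours * 60))")" ]; then
            echo "STALE: $image"
            problems=$((problems + 1))
        fi
    done
    [ "$problems" -eq 0 ]
}
```

For example, `check_images /var/lib/libvirt/images 24 ipa-01 bind-01 keycloak-01` would have to succeed before any `virsh start` is issued on KVM-02.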

5. Phase 4: Backup Verification

5.1. 4.1 Monthly Restore Tests

System     Test Procedure                             Last Tested
FreeIPA    Restore to test VM, verify ipa user-find   Never
BIND       Restore zones, verify dig queries          Never
Keycloak   Restore DB, verify SAML login              Never
ISE        Restore config backup, verify policies     Never

5.2. 4.2 Restore Commands

FreeIPA:

sudo ipa-restore /var/lib/ipa/backup/ipa-full-YYYYMMDD-HHMMSS

BIND:

tar -xzf bind-YYYYMMDD.tar.gz -C /
systemctl restart named

Keycloak:

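# Stop Keycloak and restore into an empty database (drop and recreate it first);
# a plain dump replayed over existing tables fails with "already exists" errors.
systemctl stop keycloak
psql -U keycloak keycloak < keycloak-YYYYMMDD.sql
systemctl start keycloak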
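The BIND restore command above extracts straight into /, which is not something to try on a production resolver just to see whether the archive is good. For the monthly test, a gentler variant is to unpack into a scratch directory and confirm that the expected files are present. This is a sketch only; the function name `test_restore_bind` is hypothetical, and it assumes the archive layout produced by the backup script in section 1.2 (GNU tar stores members without the leading "/").

```shell
# Hypothetical restore test: unpack a BIND archive into a scratch directory
# instead of / and confirm that the expected files came back.
test_restore_bind() {
    local archive="$1" scratch
    scratch=$(mktemp -d) || return 1
    if tar -xzf "$archive" -C "$scratch" \
        && [ -f "$scratch/etc/named.conf" ] \
        && [ -d "$scratch/var/named" ]; then
        echo "OK: $archive"
        rm -rf "$scratch"
        return 0
    fi
    echo "FAILED: $archive" >&2
    rm -rf "$scratch"
    return 1
}
```

On a host with BIND installed, the scratch copy could additionally be passed through `named-checkconf` and `named-checkzone` before it is deleted.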

6. Phase 5: Monitoring & Alerts

6.1. 5.1 Backup Monitoring

  • Set up backup success/failure notifications

  • Monitor NAS storage capacity

  • Alert on missed backup windows

  • Track backup sizes over time
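Two of the items above, capacity monitoring and size tracking, can be prototyped with standard tools before any dedicated monitoring is in place. The following is a sketch under stated assumptions: the function names, the 85 percent default threshold, and the CSV layout are all hypothetical, and the capacity check presumes the NAS share is mounted locally.

```shell
# Hypothetical monitoring helpers; thresholds and paths are placeholders.

# Print the used-space percentage (digits only) of the filesystem holding $1.
usage_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Return non-zero when usage is at or above the limit (default 85 percent).
check_capacity() {
    local path="$1" limit="${2:-85}" used
    used=$(usage_percent "$path")
    if [ "$used" -ge "$limit" ]; then
        echo "ALERT: $path is ${used}% full (limit ${limit}%)"
        return 1
    fi
}

# Append "date,path,kilobytes" to a CSV so growth can be tracked over time.
record_backup_size() {
    local path="$1" csv="$2" kb
    [ -e "$path" ] || return 1
    kb=$(du -sk "$path" | awk '{ print $1 }')
    printf '%s,%s,%s\n' "$(date +%Y-%m-%d)" "$path" "$kb" >> "$csv"
}
```

Run from cron after each backup, `check_capacity` gives a simple failure signal to hook notifications onto, and the CSV written by `record_backup_size` makes a sudden drop in backup size visible.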

6.2. 5.2 Integration with netapi

# Planned netapi commands
netapi backup status --all
netapi backup run ipa-01
netapi backup verify ipa-01 --restore-test

7. Timeline

Phase     Deliverable                          Target
Phase 1   Critical system backups configured   2026-02-20
Phase 2   KVM VM backup automation             2026-02-25
Phase 3   KVM-02 deployed and syncing          2026-03-15
Phase 4   Monthly restore tests scheduled      2026-03-01
Phase 5   Monitoring integrated                2026-03-30

8. Phase 6: YubiKey Security Integration

8.1. 6.1 Why YubiKey for DR

Use Case                Benefit
Backup Encryption       Age encryption keys stored on YubiKey PIV slot - backups can’t be decrypted without physical key
SSH to Backup Systems   sk-ssh-ed25519 keys require YubiKey touch - no stolen SSH keys
LUKS Recovery           DR systems require YubiKey + passphrase - two-factor disk encryption
GPG Signing             Backup manifests signed with YubiKey-resident GPG key - tamper-evident

8.2. 6.2 Age Encryption with YubiKey PIV

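# Encrypt to the YubiKey recipient (public key exported by age-plugin-yubikey)
age -R ~/.secrets/.metadata/keys/yubikey.pub -o backup.tar.gz.age backup.tar.gz

# Decrypt with the YubiKey plugged in. "yubikey" is the identity file written by
# age-plugin-yubikey; whether a PIN or touch is requested depends on the slot's policy.
age -d -i yubikey backup.tar.gz.age > backup.tar.gz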

8.3. 6.3 SSH Backup Access

Already configured - your sk-ssh-ed25519 keys require YubiKey for:

  • SSH to ipa-01, bind-01, kvm-01

  • SSH to NAS-01 for backup uploads

  • Git push to ~/.secrets repo

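8.4. 6.4 LUKS with YubiKey FIDO2

For KVM-02 and DR systems:

# Enroll the YubiKey as a FIDO2 token in a new LUKS2 keyslot
sudo systemd-cryptenroll /dev/sda2 --fido2-device=auto

# Unlocking through this keyslot requires the YubiKey plus its FIDO2 PIN.
# The existing passphrase keyslot keeps working on its own until it is removed.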

8.5. 6.5 Benefits Over Password-Only

Attack Vector              Password-Only      YubiKey + Password
Keylogger                  Compromised        Protected (need physical key)
Credential Theft           Compromised        Protected
Backup Server Compromise   Backups readable   Encrypted, need YubiKey
SSH Key Theft              Full access        Key useless without YubiKey

Bottom line: YubiKey-backed keys are defense in depth for DR. A stolen password, SSH key, or backup archive is useless to an attacker without the physical token.