Recovery Drills

Overview

Regular recovery drills validate backup integrity and restore procedures. Without testing, backups are assumptions.

Schedule

Frequency   Drill Type            Scope
Monthly     Backup Verification   Confirm all backups exist and are readable
Quarterly   Partial Restore       Restore one component to verify procedure
Annually    Full DR Simulation    Rebuild critical infrastructure from backups

Monthly: Backup Verification

Step 1: Load Secrets

dsource d000 dev/storage

Step 2: Check Backup Status

netapi synology backup-status --detailed

Success criteria:

  • Every category shows ✓ OK, or its newest backup is less than 7 days old

  • No errors reported

  • ISE, WLC, pfSense, Switches, KVM, Keycloak all present
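
The presence criterion can be checked mechanically. This is a minimal sketch, assuming the status output mentions each category by the names listed above. The real output format of `netapi synology backup-status` is not documented here, and `missing_categories` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: print every expected category that does not appear
# in the status text read from stdin. Assumes category names appear verbatim.
missing_categories() {
    status_text=$(cat)
    for category in ISE WLC pfSense Switches KVM Keycloak; do
        # -F: fixed string, -q: quiet; a miss means the category is absent
        printf '%s\n' "$status_text" | grep -Fq "$category" || printf '%s\n' "$category"
    done
}
```

Possible usage: `netapi synology backup-status --detailed | missing_categories`. Empty output would mean all six categories were found.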

Step 3: Run Fresh Backups (if stale)

# Network devices
dsource d000 dev/network
netapi ise backup --repo nas-01 --name "monthly-drill" --wait
netapi wlc backup --upload-nas
netapi pfsense backup --upload-nas
netapi ios backup --all --upload-nas
netapi kvm backup --all --upload-nas

# Keycloak
dsource d000 dev/identity
netapi keycloak backup --upload-nas

Step 4: Verify Borg Repository

# Check repository health
borg check ssh://nas-01/volume1/Backups/borg

# List recent archives
borg list ssh://nas-01/volume1/Backups/borg --last 5
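
For the drill log it can help to turn the result of `borg check` into a single status word. This sketch assumes Borg's default return codes (0 for success, 1 for warning, anything higher for an error). `borg_status` and `check_repo` are hypothetical helper names.

```shell
#!/bin/sh
# Map a borg return code to a drill-log status word.
# Assumes Borg's default return codes: 0 = success, 1 = warning, 2+ = error.
borg_status() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        *) echo "ERROR" ;;
    esac
}

# Hypothetical wrapper: run the health check, then print the status word.
check_repo() {
    borg check "$1"
    borg_status "$?"
}
```

Possible usage: `check_repo ssh://nas-01/volume1/Backups/borg`.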

Step 5: Document Results

Update the Drill Log below with the date, drill type, status, and any issues found.
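
If the log is also kept in a file, a small helper keeps entries consistent. This is only a sketch: the `DRILL_LOG` path and the tab-separated layout are assumptions, chosen to mirror the Date, Type, Status, and Notes columns of the log.

```shell
#!/bin/sh
# Hypothetical helper: append one drill-log row (Date, Type, Status, Notes)
# as a tab-separated line. DRILL_LOG is an assumed path, not part of the runbook.
DRILL_LOG="${DRILL_LOG:-./drill-log.tsv}"

log_drill() {
    # $1 = drill type, $2 = status, $3 = notes
    printf '%s\t%s\t%s\t%s\n' "$(date +%F)" "$1" "$2" "$3" >> "$DRILL_LOG"
}
```

Possible usage: `log_drill "Monthly Verification" "Complete" "All backups refreshed"`.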

Quarterly: Partial Restore Test

Rotate Target Each Quarter

Quarter        Target              Restore To
Q1 (Jan-Mar)   ISE Configuration   ise-02 (lab)
Q2 (Apr-Jun)   KVM VM Definition   Verify XML loads
Q3 (Jul-Sep)   Keycloak Realm      Test import to lab
Q4 (Oct-Dec)   Borg File Restore   Extract specific files
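
The rotation can be looked up from the current month. This is a minimal sketch of the table above; `quarter_target` is a hypothetical helper name.

```shell
#!/bin/sh
# Print the restore target for a month number (1-12), following the rotation table.
quarter_target() {
    # Strip one leading zero so values such as "08" from `date +%m` match
    month=${1#0}
    case "$month" in
        1|2|3)    echo "ISE Configuration" ;;
        4|5|6)    echo "KVM VM Definition" ;;
        7|8|9)    echo "Keycloak Realm" ;;
        10|11|12) echo "Borg File Restore" ;;
        *)        echo "invalid month: $1" >&2; return 1 ;;
    esac
}
```

Possible usage: `quarter_target "$(date +%m)"`.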

ISE Restore Test (Q1)

# Download backup
dsource d000 dev/storage
netapi synology backup-list ise
netapi synology download /ise_backups/<latest>.tar /tmp/

# Stage for restore (do NOT apply to production)
ls -la /tmp/*.tar

KVM Restore Test (Q2)

# Download VM definition
dsource d000 dev/storage
netapi synology download /kvm_backups/<vm>-<date>.xml /tmp/

# Validate the XML against libvirt's domain schema (does not define or start the VM)
virt-xml-validate /tmp/<vm>-<date>.xml domain

Keycloak Restore Test (Q3)

# Download realm export
dsource d000 dev/storage
netapi synology download /Backups/keycloak/<realm>.json /tmp/

# Verify the JSON parses (jq exits non-zero on a syntax error), then preview it
jq empty /tmp/<realm>.json && jq . /tmp/<realm>.json | head -50

Borg File Restore Test (Q4)

# List archive contents
borg list ssh://nas-01/volume1/Backups/borg::<archive> | head -20

# Extract a specific file; count its bytes rather than printing key material to the terminal
borg extract ssh://nas-01/volume1/Backups/borg::<archive> home/evanusmodestus/.secrets/.metadata/keys/master.age.key --stdout | wc -c

Annually: Full DR Simulation

Scenario

Simulate complete infrastructure loss. Rebuild from:

  1. LUKS USB (cold storage)

  2. Borg backups (NAS)

  3. Infrastructure backups (NAS)

Pre-Drill Checklist

  • LUKS USB #1 accessible

  • NAS online and reachable

  • Test VM or spare hardware available

  • 4+ hours blocked for drill
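
The drill procedure relies on several command-line tools, so a preflight check on the drill machine can save time. This is a sketch only; `missing_tools` is a hypothetical helper, and the tool list is taken from the commands used in this document.

```shell
#!/bin/sh
# Print each named command that is not on PATH; print nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Possible usage: `missing_tools cryptsetup age borg netapi`. NAS reachability can be checked separately, for example with `ping -c 1 nas-01`.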

Drill Procedure

# 1. Mount LUKS USB (create the mount point if it does not exist)
sudo cryptsetup luksOpen /dev/sdX1 recovery
sudo mkdir -p /mnt/recovery
sudo mount /dev/mapper/recovery /mnt/recovery

# 2. Verify master key present
ls -la /mnt/recovery/secrets/master.age.key

# 3. Test decryption
age -d -i /mnt/recovery/secrets/master.age.key <test-file.age>

# 4. Verify Borg accessible (export so the borg process can read the passphrase)
export BORG_PASSPHRASE="$(cat /mnt/recovery/secrets/borg-passphrase)"
borg list ssh://nas-01/volume1/Backups/borg

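# 5. Download and stage infrastructure configs
netapi synology backup-list ise
netapi synology backup-list kvm

# 6. Clean up: clear the passphrase and close the LUKS volume
unset BORG_PASSPHRASE
sudo umount /mnt/recovery
sudo cryptsetup luksClose recovery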

Success Criteria

  • LUKS USB decrypts successfully

  • Master age key decrypts test files

  • Borg repository lists archives

  • Infrastructure backups downloadable

  • Any gaps or issues documented

Known Issues

Device        Issue                      Workaround                                                  Status
SWITCH_9300   10.50.1.11 - powered off   High power draw; only powered on when needed for lab work   Expected

Drill Log

Date         Type                   Status       Notes
2026-02-08   Monthly Verification   ✓ Complete   All backups refreshed. SWITCH_9300 unreachable (10.50.1.11). ISE/WLC/KVM/Keycloak OK.