kvm-01 Migration Planning
Overview
This runbook documents kvm-01’s current state and plans the migration of VMs to kvm-02.
Related Documentation
- kvm-02 Deployment - Secondary hypervisor setup
- VyOS Migration - Overall infrastructure migration project
| HA-FIRST STRATEGY: Deploy HA infrastructure on kvm-02 BEFORE migrating VMs from kvm-01. This ensures true high availability across hypervisors, not just moving single points of failure. |
HA Deployment Prerequisites
Complete these phases before VM migration:
| Phase | Task | Status | Runbook |
|---|---|---|---|
| 0 | NAS NFS permissions for kvm-02 (10.50.1.111) | [ ] Pending | |
| 1 | Vault HA: vault-01 file→raft, deploy vault-02/03 on kvm-02 | [ ] Pending | |
| 2 | DNS HA: deploy bind-02 on kvm-02 | [ ] Pending | |
| 3 | Non-critical VM migration (ipsk-manager, keycloak-01) | [ ] Pending | |
| 4 (Future) | Critical infrastructure HA (home-dc02, ise-02, vyos-02 VRRP) | [ ] Planned | TBD |
Current Network Topology (Actual from captured output)
| Interface | IP | Purpose |
|---|---|---|
| eno1 | 192.168.1.225/24 | OOB from AT&T modem (DHCP) - backup management |
| virbr0 | 10.50.1.99/24 | VM bridge (infrastructure VMs live here) |
| virbr1 | 192.168.100.1/24 | Lab bridge (DOWN, unused) |
| eno8np3 | (no IP, bridge member) | 10GbE uplink to switch, member of virbr0 |
The Dual-Path Issue
- 192.168.1.225 (eno1) - Direct from modem, always reachable from modem subnet
- 10.50.1.99 (virbr0) - Infrastructure network, reachable when pfSense routes it
The Routing Problem (Root Cause)
Default route goes to modem (192.168.1.1), NOT pfSense (10.50.1.1):
default via 192.168.1.1 dev eno1 proto static metric 20101
This means:
- kvm-01 host internet traffic → modem → FAILS (modem does IP passthrough to pfSense, not kvm-01)
- VMs on virbr0 → pfSense → internet → WORKS
- kvm-01 to 10.50.1.x → virbr0 → WORKS (local to bridge)
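The asymmetry above can be confirmed per-destination with `ip route get`, which shows the interface the kernel would actually pick. A small diagnostic sketch; the `egress_dev` helper is illustrative, and the expected results in the comments reflect kvm-01's captured routing table:

```shell
# egress_dev: print the interface named after "dev" on the first input line.
egress_dev() { awk '{ for (i = 1; i < NF; i++) if ($i == "dev") print $(i + 1); exit }'; }

# On kvm-01, internet-bound traffic resolves to the modem path (broken for
# the host), while infrastructure traffic resolves to the local bridge:
ip route get 8.8.8.8 2>/dev/null | egress_dev || true     # kvm-01: eno1
ip route get 10.50.1.1 2>/dev/null | egress_dev || true   # kvm-01: virbr0
```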
Phase 0: IPMI Configuration
Out-of-band management via IPMI. Required for emergency recovery when network/SSH fails.
Related: kvm-02 IPMI Configuration
0.1 Verify IPMI Settings
# Check current IPMI network config
sudo ipmitool lan print 1 | grep -E "IP Address|MAC Address|Subnet|Gateway"
IP Address Source : Static Address
IP Address : {ipmi-ip}
Subnet Mask : {netmask-24}
MAC Address : 3c:ec:ef:43:50:42
Default Gateway IP : {pfsense-ip}
0.2 Set Static IP (if needed)
# Set static IP for IPMI
sudo ipmitool lan set 1 ipsrc static
sudo ipmitool lan set 1 ipaddr 10.50.1.200
sudo ipmitool lan set 1 netmask 255.255.255.0
sudo ipmitool lan set 1 defgw ipaddr 10.50.1.1
0.3 Verify IPMI LAN Mode (Dedicated)
Supermicro BMC supports three LAN modes. Dedicated mode is required for the separate IPMI port.
# Check current LAN mode
sudo ipmitool raw 0x30 0x70 0x0c 0
| Value | Mode | Description |
|---|---|---|
| 00 | Dedicated (Required) | Uses dedicated IPMI port only |
| 01 | Shared | Shares with onboard NIC1 |
| 02 | Failover | Tries dedicated, falls back to shared |
If mode is NOT 00, set to dedicated:
# Set LAN mode to Dedicated (0x00)
sudo ipmitool raw 0x30 0x70 0x0c 1 0
# Reset BMC to apply all changes
sudo ipmitool mc reset cold
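A cold reset takes the BMC offline for a minute or two. Rather than guessing when it is back, a small poll loop (a sketch; `wait_for_bmc` is a hypothetical helper, and 10.50.1.200 is the static IPMI address set in 0.2) can gate the follow-up check:

```shell
# wait_for_bmc HOST [TRIES]: poll until HOST answers a ping, ~10s apart.
# Returns non-zero if the BMC never comes back within TRIES attempts.
wait_for_bmc() {
    local host="$1" tries="${2:-30}"
    local i
    for i in $(seq 1 "$tries"); do
        ping -c 1 -W 1 "$host" >/dev/null 2>&1 && return 0
        sleep 10
    done
    return 1
}

# After `mc reset cold` (uncomment to run against the BMC):
# wait_for_bmc 10.50.1.200 && sudo ipmitool lan print 1 | grep 'IP Address'
```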
| If IPMI shows the wrong MAC on the switch (e.g., matches eno2 instead of the dedicated port), the BMC is likely in Failover or Shared mode. Set it to Dedicated mode and reset the BMC. |
Phase 1: Current State Validation
1.1 Network Interfaces
# Show all interfaces with IPs
ip -4 -br addr show | grep -v '^lo'
virbr0   UP     10.50.1.99/24
eno1     UP     192.168.1.225/24
virbr1   DOWN   192.168.100.1/24
# Show routing table
ip route | awk 'NR<=10 {print NR": "$0}'
1: default via 192.168.1.1 dev eno1 proto static metric 20101
2: 10.50.1.0/24 dev virbr0 proto kernel scope link src 10.50.1.99
3: 192.168.1.0/24 dev eno1 proto kernel scope link src 192.168.1.225 metric 101
4: 192.168.100.0/24 dev virbr1 proto kernel scope link src 192.168.100.1 linkdown
# Check default gateways
ip route | grep default
default via 192.168.1.1 dev eno1 proto static metric 20101
1.2 Bridge Configuration
# List bridges and members
bridge link show
3: eno5np0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 hwmode VEPA
5: eno6np1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 hwmode VEPA
10: eno8np3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
10: eno8np3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 hwmode VEPA
15: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
16: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
17: vnet2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
19: vnet4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
20: vnet5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
21: vnet6: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
24: vnet9: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
34: vnet19: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
41: vnet26: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
54: vnet39: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
61: vnet46: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
75: vnet60: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 master virbr0 state forwarding priority 32 cost 2
# Check virbr0 specifically
ip -d link show virbr0
2: virbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
link/ether 3a:1e:7c:ca:b9:ed brd ff:ff:ff:ff:ff:ff promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 netns-immutable
bridge forward_delay 1500 hello_time 200 max_age 2000 ageing_time 30000 stp_state 0 priority 32768 vlan_filtering 0 vlan_protocol 802.1Q bridge_id 8000.3a:1e:7c:ca:b9:ed designated_root 8000.3a:1e:7c:ca:b9:ed root_port 0 root_path_cost 0 topology_change 0 topology_change_detected 0 hello_timer 0.00 tcn_timer 0.00 topology_change_timer 0.00 gc_timer 154.49 fdb_n_learned 18 fdb_max_learned 0 vlan_default_pvid 1 vlan_stats_enabled 0 vlan_stats_per_port 0 group_fwd_mask 0 group_address 01:80:c2:00:00:00 mcast_snooping 1 no_linklocal_learn 0 mcast_vlan_snooping 0 mst_enabled 0 mdb_offload_fail_notification 0 mcast_router 1 mcast_query_use_ifaddr 0 mcast_querier 0 mcast_hash_elasticity 16 mcast_hash_max 4096 mcast_last_member_count 2 mcast_startup_query_count 2 mcast_last_member_interval 100 mcast_membership_interval 26000 mcast_querier_interval 25500 mcast_query_interval 12500 mcast_query_response_interval 1000 mcast_startup_query_interval 3125 mcast_stats_enabled 0 mcast_igmp_version 2 mcast_mld_version 1 nf_call_iptables 0 nf_call_ip6tables 0 nf_call_arptables 0 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 tso_max_size 65536 tso_max_segs 65535 gro_max_size 65536 gso_ipv4_max_size 65536 gro_ipv4_max_size 65536
1.3 VM Inventory
# List all VMs with state
sudo virsh list --all
# Show VM resource allocation
sudo virsh list --all | awk 'NR>2 && NF {print $2}' | while read vm; do
echo "=== $vm ==="
sudo virsh dominfo "$vm" | grep -E "CPU|Memory"
done
 Id   Name            State
--------------------------------
 1    pfSense-FW01    running
 2    vault-01        running
 4    9800-CL-WLC     running
 5    ipsk-manager    running
 8    keycloak-01     running
 18   home-dc01       running
 25   ise-01          running
 39   bind-01         running
 47   ipa-01          running
 64   k3s-master-01   running
1.4 Storage Pools
# List storage pools
sudo virsh pool-list --all
 Name          State    Autostart
-----------------------------------
 images        active   yes
 images-1      active   yes
 iso           active   yes
 isos          active   yes
 nas-isos      active   yes
 nas-vms       active   yes
 nvram         active   yes
 onboard-ssd   active   yes
 tmp           active   yes
 virtio-win    active   yes
 vms           active   yes
# Show pool details
sudo virsh pool-info onboard-ssd
Name:           onboard-ssd
UUID:           373797e2-e00f-4372-8bba-5c15f70c1eaa
State:          running
Persistent:     yes
Autostart:      yes
Capacity:       961.66 GiB
Allocation:     360.06 GiB
Available:      601.61 GiB
1.5 Network Verification
# Test connectivity to gateway
ping -c 2 10.50.1.1
PING 10.50.1.1 (10.50.1.1) 56(84) bytes of data.
64 bytes from 10.50.1.1: icmp_seq=1 ttl=64 time=0.199 ms
64 bytes from 10.50.1.1: icmp_seq=2 ttl=64 time=0.202 ms

--- 10.50.1.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1015ms
rtt min/avg/max/mdev = 0.199/0.200/0.202/0.001 ms
# Test connectivity to DNS
ping -c 2 10.50.1.90
PING 10.50.1.90 (10.50.1.90) 56(84) bytes of data.
64 bytes from 10.50.1.90: icmp_seq=1 ttl=64 time=0.502 ms
64 bytes from 10.50.1.90: icmp_seq=2 ttl=64 time=0.173 ms

--- 10.50.1.90 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1021ms
rtt min/avg/max/mdev = 0.173/0.337/0.502/0.164 ms
# Test internet (fails from the host: default route points at the modem)
ping -c 2 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
From 192.168.1.225 icmp_seq=1 Destination Host Unreachable
From 192.168.1.225 icmp_seq=2 Destination Host Unreachable

--- 8.8.8.8 ping statistics ---
2 packets transmitted, 0 received, +2 errors, 100% packet loss, time 1020ms
pipe 2
Phase 2: Migration Strategy
2.1 VM Migration Order
Priority order for migrating VMs to kvm-02:
1. Non-critical VMs first - test migration process
   - ipsk-manager
   - keycloak-01
2. Secondary services
   - bind-01 (DNS - have bind-02 ready)
   - ipa-01 (FreeIPA)
3. Critical infrastructure last
   - vault-01 (PKI/Secrets)
   - home-dc01 (AD DS)
   - ise-01 (NAC)
   - pfSense-FW01 (Firewall - LAST)
2.2 Pre-Migration Checklist (Programmatic)
2.2.1 Pre-Flight Validation
# Validate kvm-02 health (must be 22/22)
ssh kvm-02 "/usr/local/bin/kvm-health-check" | awk '/Results:/'
# List VMs on both hosts
netapi kvm list -H kvm-01
netapi kvm list -H kvm-02
# Check storage pools on kvm-02
ssh kvm-02 "sudo virsh pool-list --all"
# Check NAS connectivity from kvm-02
ssh kvm-02 "ping -c 2 10.50.1.70"
# Verify IPMI access to both hosts
ipmitool -I lanplus -H 10.50.1.200 -U ADMIN chassis status | head -3
ipmitool -I lanplus -H 10.50.1.201 -U ADMIN chassis status | head -3
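The pre-flight checks above can be gated in one place so the migration never starts against an unhealthy target. A sketch, assuming the health check prints a line of the form `Results: N/M` (as the `22/22` comment above implies); `health_ok` is a hypothetical helper:

```shell
# health_ok: succeed only when stdin contains "Results: N/M" with N == M.
health_ok() {
    awk -F'[ /]+' '/Results:/ { found = 1; ok = ($2 == $3) }
                   END { exit (found && ok) ? 0 : 1 }'
}

# Gate the migration on full health (uncomment to run for real):
# ssh kvm-02 "/usr/local/bin/kvm-health-check" | health_ok \
#     || { echo "kvm-02 health check failed - aborting" >&2; exit 1; }
```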
2.2.2 NAS NFS Permissions for kvm-02
# Check current NFS shares on Synology
netapi synology shares
# TODO: Add kvm-02 (10.50.1.111) to NFS allowed hosts
# This requires Synology DSM UI or API call
# Path: Control Panel → Shared Folder → Edit → NFS Permissions
# Add: 10.50.1.111 with read/write, no_root_squash
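Once the DSM change is made, it can be confirmed from kvm-02 itself by checking the NAS export list. A sketch, assuming `showmount` (nfs-common) is installed on kvm-02 and the NAS answers at 10.50.1.70 (the address pinged in 2.2.1); `exports_to` is a hypothetical helper:

```shell
# exports_to CLIENT: succeed when the export list on stdin names CLIENT.
exports_to() { grep -qwF "$1"; }

# From kvm-02, confirm the NAS now exports to kvm-02's IP
# (uncomment once the NFS permission has been added):
# ssh kvm-02 "showmount -e 10.50.1.70" | exports_to 10.50.1.111 \
#     && echo "NFS permission in place"
```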
2.3 VM Migration Procedure
2.3.1 Migration Workflow (Per VM)
# Variables - set per VM
VM_NAME="ipsk-manager"
SRC_HOST="kvm-01"
DST_HOST="kvm-02"
SRC_POOL="/var/lib/libvirt/images"
DST_POOL="/var/lib/libvirt/images"
# Step 1: Get VM info and disk path
netapi kvm info -H ${SRC_HOST} ${VM_NAME}
ssh ${SRC_HOST} "sudo virsh domblklist ${VM_NAME}"
# Step 2: Graceful shutdown on source
netapi kvm stop -H ${SRC_HOST} ${VM_NAME}
# Or: ssh ${SRC_HOST} "sudo virsh shutdown ${VM_NAME}"
# Step 3: Verify VM is shut off
ssh ${SRC_HOST} "sudo virsh domstate ${VM_NAME}"
# Expected: shut off
# Step 4: Copy disk image to kvm-02
# Option A: Direct SCP (slower but works)
ssh ${SRC_HOST} "sudo cat ${SRC_POOL}/${VM_NAME}.qcow2" | \
ssh ${DST_HOST} "sudo tee ${DST_POOL}/${VM_NAME}.qcow2 > /dev/null"
# Option B: Via NAS (if both can mount)
# ssh ${SRC_HOST} "sudo cp ${SRC_POOL}/${VM_NAME}.qcow2 /mnt/nas-vms/"
# ssh ${DST_HOST} "sudo cp /mnt/nas-vms/${VM_NAME}.qcow2 ${DST_POOL}/"
# Step 5: Export and modify VM XML
ssh ${SRC_HOST} "sudo virsh dumpxml ${VM_NAME}" > /tmp/${VM_NAME}.xml
# Edit XML: change source path and network bridge
sed -i "s|${SRC_POOL}|${DST_POOL}|g" /tmp/${VM_NAME}.xml
sed -i "s|virbr0|br-mgmt|g" /tmp/${VM_NAME}.xml
# Step 6: Import VM on kvm-02
scp /tmp/${VM_NAME}.xml ${DST_HOST}:/tmp/
ssh ${DST_HOST} "sudo virsh define /tmp/${VM_NAME}.xml"
# Step 7: Start VM on kvm-02
netapi kvm start -H ${DST_HOST} ${VM_NAME}
# Or: ssh ${DST_HOST} "sudo virsh start ${VM_NAME}"
# Step 8: Verify VM is running and accessible
ssh ${DST_HOST} "sudo virsh domstate ${VM_NAME}"
ping -c 3 <VM_IP>
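Before Step 6 defines the VM on kvm-02, the Step 4 copy is worth verifying; comparing checksums on both ends catches a truncated or corrupted transfer. A sketch reusing the workflow's variables; `same_sum` is a hypothetical helper for the local case:

```shell
# same_sum FILE_A FILE_B: succeed when both files hash to the same SHA-256.
same_sum() {
    [ "$(sha256sum "$1" | awk '{print $1}')" = "$(sha256sum "$2" | awk '{print $1}')" ]
}

# Across hosts, compare the sums remotely instead (uncomment; variables
# come from the migration workflow above):
# SRC_SUM=$(ssh ${SRC_HOST} "sudo sha256sum ${SRC_POOL}/${VM_NAME}.qcow2" | awk '{print $1}')
# DST_SUM=$(ssh ${DST_HOST} "sudo sha256sum ${DST_POOL}/${VM_NAME}.qcow2" | awk '{print $1}')
# [ "$SRC_SUM" = "$DST_SUM" ] && echo "checksums match" || echo "MISMATCH - re-copy" >&2

# Then sanity-check the qcow2 structure on the destination:
# ssh ${DST_HOST} "sudo qemu-img check ${DST_POOL}/${VM_NAME}.qcow2"
```

Hashing a large image twice is slow but cheap insurance compared to booting a corrupted disk.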
2.4 Migration Status Tracking
| Priority | VM | Source | Target | Status / Notes |
|---|---|---|---|---|
| 1 | ipsk-manager | kvm-01 | kvm-02 | [ ] Pending (after HA Phase 0-2) |
| 2 | keycloak-01 | kvm-01 | kvm-02 | [ ] Pending (after HA Phase 0-2) |
| 3 | bind-01 | kvm-01 | STAY | [ ] Keep on kvm-01 (bind-02 on kvm-02 = HA) |
| 4 | ipa-01 | kvm-01 | kvm-02 | [ ] Pending (consider ipa-02 for HA) |
| 5 | vault-01 | kvm-01 | STAY | [ ] Keep on kvm-01 (vault-02/03 on kvm-02 = HA) |
| 6 | home-dc01 | kvm-01 | STAY | [ ] Keep on kvm-01 (home-dc02 on kvm-02 = HA) |
| 7 | ise-01 | kvm-01 | STAY | [ ] Keep on kvm-01 (ise-02 on kvm-02 = HA) |
| 8 | 9800-CL-WLC | kvm-01 | kvm-02 | [ ] Pending (single instance OK) |
| 9 | k3s-master-01 | kvm-01 | kvm-02 | [ ] Pending (plan k3s HA first) |
| 10 | pfSense-FW01 | kvm-01 | STAY | [ ] Keep on kvm-01 (pfSense-FW02 + CARP = HA) |
| HA Strategy Change - Instead of migrating single points of failure, deploy secondaries on kvm-02 to achieve true HA across hypervisors. Primary services stay on kvm-01, secondaries on kvm-02. |
Phase 3: Network Cleanup (AFTER Migration)
| Do NOT change kvm-01 networking until VMs are migrated. Current state is a "blackhole" for host internet only - VMs work fine through pfSense. |
3.0 Current State Assessment
| Component | Status | Action |
|---|---|---|
| kvm-01 VMs | Working (route through pfSense) | No change needed |
| kvm-01 host internet | Blackhole (routes to modem, fails) | Fix AFTER migration |
| kvm-02 | Properly configured (routes through pfSense) | Ready for VMs |
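3.1 Host Default Route Fix (sketch)
Once the VMs above are migrated or confirmed unaffected, the fix is to repoint kvm-01's default route at pfSense. A non-persistent sketch using the addresses captured in Phase 1; `default_gw` is a hypothetical helper, and a mistake is undone by a reboot since nothing here touches the persistent config:

```shell
# default_gw: print the gateway of the default route on stdin.
default_gw() { awk '$1 == "default" { print $3; exit }'; }

# Current state (broken for host traffic): default points at the modem.
ip route 2>/dev/null | default_gw || true    # kvm-01: 192.168.1.1

# Non-persistent fix - repoint the default route at pfSense via virbr0
# (uncomment on kvm-01):
# sudo ip route replace default via 10.50.1.1 dev virbr0
# ping -c 2 8.8.8.8    # host internet should now work through pfSense

# Roll back if anything breaks:
# sudo ip route replace default via 192.168.1.1 dev eno1 metric 20101
```

Making the change persistent depends on how virbr0 is managed on this host (libvirt vs. the host network config) and is intentionally left out of this sketch.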