Keycloak Rebuild — SAML/OIDC SSO Recovery
Summary
Rebuild keycloak-01.inside.domusdigitalis.dev from scratch as a KVM virtual machine running Keycloak 26.x (Quarkus distribution) on a minimal Linux base. The previous instance was corrupted — no realm export exists. All SAML/OIDC clients must be recreated from documentation and incident records.
Motivation
Keycloak is a single point of failure for SSO across the home lab. With the instance down:
- ISE Admin Portal SAML SSO is unavailable (local admin fallback only)
- Grafana OIDC federation is broken
- Any future SP integrations (Gitea, Vault UI, Portainer) are blocked
The ISE SAML restoration incident (INC-2026-02-14-001) proved the Keycloak REST API workflow is solid — token retrieval, SAML client CRUD, redirect URI management. That knowledge carries forward.
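The REST workflow named above (token retrieval, then SAML client creation) can be sketched roughly as follows. This is a minimal illustration, not the procedure recorded in the incident: it assumes the stock `admin-cli` client in the `master` realm and the standard Keycloak Admin REST paths. The function names, the `BASE_URL` value (FQDN plus port 8443), and any entity ID or redirect URI passed in are placeholders. The functions only build requests, so nothing is sent until the caller chooses to.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the rebuilt instance answers on the documented FQDN and port 8443.
BASE_URL = "https://keycloak-01.inside.domusdigitalis.dev:8443"


def build_token_request(base_url, username, password):
    """Build the admin token request (password grant via the admin-cli client)."""
    form = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": "admin-cli",
        "username": username,
        "password": password,
    }).encode()
    url = f"{base_url}/realms/master/protocol/openid-connect/token"
    return urllib.request.Request(url, data=form, method="POST")


def build_saml_client_request(base_url, realm, token, entity_id, redirect_uris):
    """Build the request that creates a SAML client in the given realm."""
    body = json.dumps({
        "clientId": entity_id,          # for SAML clients this is the SP entity ID
        "protocol": "saml",
        "enabled": True,
        "redirectUris": redirect_uris,  # hypothetical values supplied by the caller
    }).encode()
    url = f"{base_url}/admin/realms/{realm}/clients"
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To execute a request, pass it to `urllib.request.urlopen()`. The token response is JSON, and its `access_token` field supplies the bearer token for the second call.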
Prior State (keycloak-01 — corrupted)
| Component | Detail |
|---|---|
| Realm | |
| SAML Clients | ISE Admin Portal (Entity ID: |
| OIDC Clients | Grafana (likely), others TBD |
| TLS | Vault-issued certificates |
| URL | |
| Auth | Admin account, REST API on port 8443 |
Phase Summary
| Phase | Description | Status | Notes |
|---|---|---|---|
| 0: Pre-Work | Inventory SPs, choose base OS, allocate KVM resources | 🟡 In progress | Research complete; decisions pending |
| 1: VM Provisioning | Create KVM VM, install base OS, harden | ❌ Not started | - |
| 2: Keycloak Installation | JDK 17+, PostgreSQL, Keycloak 26.x Quarkus | ❌ Not started | - |
| 3: TLS & Certificates | Vault-issued certs, HTTPS configuration | ❌ Not started | Vault PKI operational |
| 4: Realm & Client Configuration | Create realm, SAML client for ISE, OIDC for Grafana | ❌ Not started | REST API workflow documented in INC-2026-02-14-001 |
| 5: ISE SAML Integration | Import IdP metadata, configure ISE external identity source | ❌ Not started | - |
| 6: Additional SP Integrations | Grafana OIDC, Gitea, Vault UI, as needed | ❌ Not started | Scope grows by appending |
| 7: Validation & Hardening | SSO login tests, backup strategy, monitoring | ❌ Not started | - |
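For Phases 5 and 7, the IdP metadata that ISE imports and the discovery document that an OIDC client such as Grafana reads both sit at well-known paths under the realm. The helper below is a small sketch that derives those URLs. The function name is illustrative, and both the realm name `domus` and port 8443 are assumptions until the open questions are settled.

```python
def realm_endpoints(base_url, realm):
    """Return the SAML and OIDC metadata URLs Keycloak publishes for a realm."""
    realm_url = f"{base_url.rstrip('/')}/realms/{realm}"
    return {
        # SAML 2.0 IdP metadata XML, the document an SP such as ISE imports
        "saml_idp_metadata": f"{realm_url}/protocol/saml/descriptor",
        # OpenID Connect discovery document used by OIDC clients
        "oidc_discovery": f"{realm_url}/.well-known/openid-configuration",
    }


# Assumed values: documented FQDN, port 8443, realm "domus".
endpoints = realm_endpoints(
    "https://keycloak-01.inside.domusdigitalis.dev:8443", "domus"
)
```

Fetching both URLs after Phase 4 gives a quick check that the realm exists and that TLS is serving correctly, before any SP is reconfigured.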
Assessment
Risk
| Factor | Detail |
|---|---|
| SPOF Duration | keycloak-01 has been down since corruption; SSO unavailable across lab |
| Blast Radius | ISE admin SSO, Grafana federation, future SP integrations all blocked |
| Mitigation During Rebuild | ISE local admin account ( |
| No Realm Export | Must recreate from documentation. INC-2026-02-14-001 has ISE SAML client config. Grafana OIDC config needs discovery. |
| Dependency Chain | Vault PKI (✅ operational) → TLS certs → Keycloak → SAML/OIDC → ISE + Grafana |
Decision Log
| Date | Decision | Rationale |
|---|---|---|
| 2026-06-10 | Keycloak over lighter alternatives (SimpleSAMLphp, Authentik, Kanidm) | Keycloak handles both SAML and OIDC, REST API is documented and proven, ISE integration is validated. Heavy but correct. |
| 2026-06-10 | KVM VM over container (Podman/k3s) | Full VM provides isolation, snapshot capability, and avoids the k3s NAT blocker (93 days, P0). Matches existing infrastructure pattern. |
| 2026-06-10 | PostgreSQL over H2 | H2 is embedded and fragile; the previous corruption may have been H2-related. PostgreSQL provides durability and backup options. |
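The PostgreSQL and Vault-issued TLS decisions both end up in Keycloak's `conf/keycloak.conf`. The sketch below renders a minimal version of that file. The option keys are standard Keycloak (Quarkus) settings, but the function name and every value in the example call (database name, user, certificate paths) are hypothetical placeholders, not decisions recorded in this project. The database password is left out on purpose, on the assumption that it is supplied through the `KC_DB_PASSWORD` environment variable.

```python
def render_keycloak_conf(hostname, db_url, db_user, cert_file, key_file, https_port=8443):
    """Render a minimal keycloak.conf for a PostgreSQL-backed, HTTPS-only server."""
    settings = {
        "db": "postgres",
        "db-url": db_url,
        "db-username": db_user,
        "hostname": hostname,
        "https-port": str(https_port),
        "https-certificate-file": cert_file,
        "https-certificate-key-file": key_file,
    }
    return "\n".join(f"{key}={value}" for key, value in settings.items()) + "\n"


# Hypothetical values for illustration only.
conf_text = render_keycloak_conf(
    hostname="keycloak-01.inside.domusdigitalis.dev",
    db_url="jdbc:postgresql://localhost/keycloak",
    db_user="keycloak",
    cert_file="/opt/keycloak/conf/tls/keycloak-01.crt",
    key_file="/opt/keycloak/conf/tls/keycloak-01.key",
)
```

Generating the file from a function keeps the Phase 2 and Phase 3 settings in one reviewable place, and the output can be diffed against the file on the VM.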
Open Questions
- Which KVM host: kvm-01 (10.50.1.100) or kvm-02 (10.50.1.101)?
- Base OS: Rocky 9, Alma 9, Ubuntu 24.04 Server, or Debian 12?
- Realm name: was domusdigitalis per incident, attribute says domus (currently domus). Which is canonical?
- Full SP inventory: ISE SAML confirmed. Grafana OIDC likely. Others?
- Backup strategy: realm export cron? VM snapshot schedule?
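If the realm export cron option is chosen, one possible shape is a small script that builds a dated `kc.sh export` command. This is only a sketch under assumptions: the install path `/opt/keycloak`, the backup root, and the function name are hypothetical, and the realm name depends on the question above. Keycloak's documentation should be checked for whether the server needs to be stopped during an export.

```python
import datetime
from pathlib import PurePosixPath


def build_export_command(kc_home, realm, backup_root, today=None):
    """Build a kc.sh export command that writes into a date-stamped directory."""
    today = today or datetime.date.today()
    target = PurePosixPath(backup_root) / f"{realm}-{today.isoformat()}"
    return [
        str(PurePosixPath(kc_home) / "bin" / "kc.sh"),
        "export",
        "--dir", str(target),
        "--realm", realm,
        "--users", "realm_file",  # keep users in the same file as the realm
    ]


# Hypothetical paths for illustration only.
cmd = build_export_command(
    "/opt/keycloak", "domus", "/var/backups/keycloak",
    today=datetime.date(2026, 6, 10),
)
```

A cron job could run the resulting list with `subprocess.run(cmd, check=True)`. A VM snapshot schedule would complement this rather than replace it, since the snapshot captures the whole machine while the export yields a portable realm file.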
Metadata
| Field | Value |
|---|---|
| PRJ ID | PRJ-2026-06-keycloak-rebuild |
| Author | Evan Rosado |
| Created | 2026-06-10 |
| Updated | 2026-06-10 |
| Status | Active |
| Category | Infrastructure / Identity |
| Priority | P2 (SPOF; promotes to P0 on SSO outage) |
| KVM Host | TBD (kvm-01 or kvm-02) |
| VM Name | keycloak-01 |
| IP | 10.50.1.80 |
| FQDN | keycloak-01.inside.domusdigitalis.dev |
| Realm | domus |
| Keycloak Version | 26.x (Quarkus) |
| Prior Incident | |
| Related Project | |