TAC Case: 802.1X Authentication Failures (~500 Endpoints)
Case Summary
| Field | Value |
|---|---|
| SR Number | pending |
| Severity | S1 (production network down - medical facility) |
| Product | Cisco ISE 3.2 Patch 6 |
| Contract | add SmartNet contract ID |
| Opened | 2026-03-12 |
| Request | Live engineer with distributed ISE/MNT experience |
Problem Statement
Approximately 500 endpoints are failing 802.1X authentication across the wired and wireless networks. Affected devices include domain-joined Windows machines (EAP-TEAP, MSCHAPv2, EAP-TLS) and Jamf-managed Macs (EAP-TLS with SCEP-issued certificates).
CRITICAL: This is a medical facility. Patient care systems may be impacted.
SECONDARY SYMPTOM: Live Logs show authentication entries, but clicking details returns:
No data available for this record. Either the data is purged or authentication for this session record happened a week ago. Or if this is a 'PassiveID' or 'PassiveID Visibility' session, it will not have authentication details on ISE.
PassiveID/Visibility services are NOT enabled, so the PassiveID explanation in the message does not apply. This points to an MNT database or replication issue.
Environment
ISE Deployment
| Node | Role | Status |
|---|---|---|
| ppan.ise.chla.org | Primary PAN | check |
| span.ise.chla.org | Secondary PAN | check |
| pmnt.ise.chla.org | Primary MNT | check |
| smnt.ise.chla.org | Secondary MNT | check |
| psn-1.ise.chla.org | PSN | check |
| psn-2.ise.chla.org | PSN | check |
| psn-3.ise.chla.org | PSN | check |
| psn-4.ise.chla.org | PSN | check |
ISE Version: 3.2 Patch 6
Deployment Type: Distributed (8 nodes)
Affected Networks
- Wired (LAN) - 802.1X
- Wireless (WLAN) - 802.1X
Affected Endpoints
| Device Type | Auth Method | ~Count |
|---|---|---|
| Windows 10/11 (Domain Joined) | EAP-TEAP, MSCHAPv2, EAP-TLS | estimate |
| macOS (Jamf Managed) | EAP-TLS (SCEP certs) | estimate |
| WOWs (Wyse on Wheels) | confirm auth method | estimate |
| Chromebooks | confirm auth method | estimate |

WOWs and Chromebooks are critical to patient care. These devices are used at bedside for clinical workflows.
RADIUS Architecture (NetScaler Load Balancing)
| VIP | Backend PSNs | Usage |
|---|---|---|
| VIP-1 (NetScaler SNIP) | psn-1.ise.chla.org, psn-2.ise.chla.org | Primary for all NADs (except ASA) |
| VIP-2 (NetScaler SNIP) | psn-3.ise.chla.org, psn-4.ise.chla.org | Secondary for all NADs (except ASA) |
- NADs configured: Primary = VIP-1, Secondary = VIP-2
- ASA uses direct PSN addressing (not behind VIP)
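Because every NAD except the ASA reaches the PSNs through a VIP, a failure isolated to one or two PSNs shows up as a degraded or dead VIP rather than a dead node. A minimal sketch of that mapping, using the VIP labels and PSN hostnames from the table above; the function name `vip_health` and the three status labels are illustrative, not anything NetScaler or ISE reports.

```python
# VIP-to-PSN mapping as listed in the RADIUS architecture table.
VIP_BACKENDS = {
    "VIP-1": ["psn-1.ise.chla.org", "psn-2.ise.chla.org"],  # primary for NADs
    "VIP-2": ["psn-3.ise.chla.org", "psn-4.ise.chla.org"],  # secondary for NADs
}


def vip_health(failing_psns):
    """Classify each VIP as healthy, degraded, or down for a set of failing PSNs.

    The labels are hypothetical shorthand for this case, not a vendor status.
    """
    failing = set(failing_psns)
    status = {}
    for vip, backends in VIP_BACKENDS.items():
        bad = [psn for psn in backends if psn in failing]
        if not bad:
            status[vip] = "healthy"
        elif len(bad) < len(backends):
            status[vip] = "degraded"
        else:
            status[vip] = "down"
    return status


if __name__ == "__main__":
    # Example: only psn-1 is rejecting or dropping requests.
    print(vip_health({"psn-1.ise.chla.org"}))
```

If failures cluster on one VIP's backends, the NetScaler monitors and persistence settings become as relevant as ISE itself; if they are spread evenly across all four PSNs, the cause is more likely shared (AD, certificates, or the MNT/PAN layer).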
Timeline
| Date/Time | Event |
|---|---|
| ~2026-03-05 | Noticed lack of logs / logging anomalies (1 week ago) |
| 2026-03-11 | Authentication failures reported (~500 endpoints) |
| 2026-03-12 | TAC case opened |
| investigate | Any changes in the 2 weeks before 03-05? (patch, cert renewal, DB maintenance, replication changes) |
Key observation: Logging issues preceded auth failures by ~6 days. These are likely related.
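The gap and the change-review window can be checked with simple date arithmetic. A small sketch using the dates from the timeline; the variable names are illustrative, and the 03-05 date is approximate, so the results are approximate too.

```python
from datetime import date, timedelta

# Dates taken from the timeline; the first one is approximate.
logging_anomalies = date(2026, 3, 5)
auth_failures = date(2026, 3, 11)

# Days between the first logging anomaly and the first reported auth failure.
gap_days = (auth_failures - logging_anomalies).days

# Start of the "2 weeks before 03-05" window to review for changes.
review_window_start = logging_anomalies - timedelta(weeks=2)

print(gap_days)             # 6
print(review_window_start)  # 2026-02-19
```

Change records, patch logs, and certificate renewals dated between the window start and 03-05 are the ones worth pulling before the TAC call.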
Symptoms
User Experience
- Endpoints fail to connect to network
- Previously working devices now failing
- add specific error messages users see
ISE Live Logs
Failure Reason(s): check Operations > Live Logs
Common failure reasons to look for (confirm each code's exact description against the ISE message catalog before quoting it to TAC):
- 12514 - EAP-TLS failed SSL/TLS handshake
- 12308 - Client certificate chain not trusted
- 22056 - Subject not found in identity store
- 24408 - User/machine not found in AD
- 24415 - Could not locate AD domain
Sample Failed Authentications:
| Timestamp | Username/MAC | Auth Method | PSN | Failure Reason |
|---|---|---|---|---|
| sample 1 | | | | |
| sample 2 | | | | |
| sample 3 | | | | |
Pattern Analysis
- Failures on ALL PSNs or specific PSN?
- Failures started suddenly or gradual increase?
- Specific SSID/switch affected?
- Time-based pattern?
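The first of these questions can be answered quickly from a CSV export of the failed authentications. A minimal sketch, assuming the export has one row per authentication and columns named `Server` (the PSN) and `Failure Reason`; both column names and the file name in the usage note are assumptions, so match them to the header row of the real export.

```python
import csv
from collections import Counter


def summarize_failures(rows, psn_col="Server", reason_col="Failure Reason"):
    """Count failed authentications per (PSN, failure reason) pair.

    `rows` is any iterable of dicts, such as csv.DictReader output.
    The default column names are assumptions about the export format.
    """
    counts = Counter()
    for row in rows:
        # Missing or empty cells are bucketed as "unknown" rather than dropped.
        psn = (row.get(psn_col) or "unknown").strip()
        reason = (row.get(reason_col) or "unknown").strip()
        counts[(psn, reason)] += 1
    return counts


def summarize_csv(path, **kwargs):
    """Read a CSV export and return the same per-PSN, per-reason counts."""
    with open(path, newline="", encoding="utf-8-sig") as fh:
        return summarize_failures(csv.DictReader(fh), **kwargs)
```

Calling `summarize_csv("livelogs_export.csv").most_common(10)` (hypothetical file name) lists the ten largest PSN/reason buckets. One dominant PSN suggests a node-level problem; one dominant reason across all PSNs suggests a shared dependency.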
Current Workaround
Devices are being added by MAC address, via CSV import, to the General-Device-Onboard identity group.
This identity group is referenced in an authorization policy positioned before the default guest rule as a safety net.
Impact: Manual process, not scalable for 500+ devices.
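While the workaround is in use, normalizing and de-duplicating MAC addresses before each import reduces rejected rows and repeat work. A minimal sketch; `normalize_mac` and `dedupe_macs` are hypothetical helper names, and the colon-separated uppercase output is an assumed target format, so check it against the import template ISE provides.

```python
import re

_SEPARATORS = re.compile(r"[\s:.\-]")
_HEX12 = re.compile(r"^[0-9A-F]{12}$")


def normalize_mac(raw):
    """Return a MAC as AA:BB:CC:DD:EE:FF, or raise ValueError if malformed.

    Accepts colon, hyphen, and dotted notations (e.g. aabb.ccdd.eeff).
    """
    digits = _SEPARATORS.sub("", raw).upper()
    if not _HEX12.match(digits):
        raise ValueError(f"not a valid MAC address: {raw!r}")
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


def dedupe_macs(raw_macs):
    """Normalize each MAC and drop duplicates, keeping first-seen order."""
    seen = set()
    unique = []
    for raw in raw_macs:
        mac = normalize_mac(raw)
        if mac not in seen:
            seen.add(mac)
            unique.append(mac)
    return unique
```

Raising on malformed input, rather than skipping it, is deliberate: a silently dropped bedside device is worse than a failed batch that gets fixed before import.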
Working Theory
The "no data available" error for recent sessions + auth failures suggests:
- MNT database issue - session data not being written or replicated
- Disk space exhaustion - DB partition full, can’t write new records
- Replication failure - PSNs can’t sync session data to MNT
- Database corruption - requires TAC intervention
The logging issue appearing ~6 days before auth failures suggests a DB/storage problem that gradually worsened until it began affecting live authentications.
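The disk-exhaustion theory is the cheapest to rule in or out once the disk output from each MNT node has been saved to a text file. A rough sketch, assuming the saved output contains lines of the form `<filesystem> : <N>% used`; that line format, the 80% threshold, and the name `partitions_over` are all assumptions, so adjust the pattern to what the nodes actually print.

```python
import re

# Assumed line shape: "<filesystem> : <N>% used ...". Adjust to the real output.
USAGE_LINE = re.compile(
    r"^\s*(?P<name>[^:\n]+?)\s*:\s*(?P<pct>\d{1,3})%\s+used",
    re.MULTILINE,
)


def partitions_over(text, threshold_pct=80):
    """Return (filesystem, percent_used) pairs at or above the threshold.

    Results are sorted fullest first. The default threshold is arbitrary.
    """
    hits = []
    for match in USAGE_LINE.finditer(text):
        pct = int(match.group("pct"))
        if pct >= threshold_pct:
            hits.append((match.group("name"), pct))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

An empty result on both MNT nodes does not clear the database itself; it only moves replication failure and corruption ahead of disk exhaustion in the list above.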
What TAC Will Ask
Be ready with:
- [ ] SmartNet contract number
- [ ] ISE version: 3.2 Patch 6
- [ ] When logging issues started: ~2026-03-05
- [ ] When auth failures started: 2026-03-11
- [ ] Any changes in past 2 weeks? (patches, certs, AD changes, VM snapshots)
- [ ] Output of `show disk` from each MNT node
- [ ] Output of `show application status ise` from each node
- [ ] DB replication status (option 24 from `application configure ise`)
- [ ] Support bundle from Primary PAN