802.1X Auth Failures - Investigation

Diagnostic Data

Support Bundle

# Set debug levels first, from the Primary PAN GUI:
# Administration > System > Logging > Debug Log Configuration
# Reproduce the issue, then generate the bundle:
# Operations > Troubleshoot > Download Logs > (select node) > Support Bundle

Support bundle generated: [ ] Yes  [ ] No
Bundle filename: ise-support-bundle-YYYY-MM-DD.tar.gz

Debug Logs to Enable

Before reproducing the issue, enable these debugs on the failing PSN:

Component              Level
------------------------------
runtime-AAA            DEBUG
eap                    DEBUG
eap-tls                DEBUG
ad-connector           DEBUG
identity-store-AD      DEBUG

Show Commands (from ISE CLI)

# Run on each PSN
show application status ise
show logging application ise-psc.log tail count 100

# AD connectivity
test aaa group <AD-join-point> <test-user> <password>

MNT Replication Health (CRITICAL - check this first)

From Primary PAN GUI:

  • Administration > System > Deployment - check node sync status

  • Administration > System > Settings > Logging > Log Collector - verify pmnt/smnt status

From MNT CLI (pmnt.ise.chla.org):

# Database status
show application status ise

# Check replication
application configure ise
# Select option 24: View DB Replication Status
# (menu numbers vary by ISE version and patch; match on the label, not the number)

# Disk space (if DB is full, session data won't write)
show disk

If replication is broken or the DB is full, that would explain both symptoms: missing log data and failed session lookups.
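
The disk check above can be scripted against saved CLI output. This is a minimal sketch, assuming the `show disk` output was saved to a text file and that each filesystem line looks like `/opt : 36% used ( ... of ...)`. The function name `flag_full_partitions` and the default threshold of 80% are illustrative assumptions, not Cisco guidance.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print filesystems at or above a usage threshold.
# Assumes lines shaped like "  /opt : 36% used ( 60393404 of 176010668)".
flag_full_partitions() {
  local file="$1" threshold="${2:-80}"
  awk -v limit="$threshold" '
    # Match only lines that carry a "<mount> : <N>% used" pattern.
    $2 == ":" && $3 ~ /^[0-9]+%$/ && $4 == "used" {
      pct = $3
      sub(/%/, "", pct)
      if (pct + 0 >= limit) printf "%s %d%%\n", $1, pct
    }
  ' "$file"
}
```

For example, `flag_full_partitions show-disk-pmnt.txt 80` would list only the partitions at 80% or more; empty output means nothing crossed the threshold. Run it once per MNT node so PMnT and SMnT can be compared side by side.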

Session Data Check

# From MNT CLI - check if session database is responding
application configure ise
# Select option 14: Purge Runtime Sessions
# (menu numbers vary by ISE version and patch; match on the label, not the number)
# (DO NOT purge - just see if it responds)

API Diagnostic Commands (netapi)

Run these before/during TAC call to have data ready.

Deployment Status

# Node overview
netapi ise api info

# All nodes with roles/services
netapi ise -f json api-call openapi GET "/api/v1/deployment/node" | jq -r '["HOSTNAME","IP","ROLES","SERVICES","STATUS"], (.response[] | [.hostname, .ipAddress, (.roles|join("/")), (.services|join(",")), .nodeStatus]) | @tsv' | column -t

MNT Health Check

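# Check MNT node status specifically
# (any() emits each node once; role names may appear as PrimaryMonitoring/SecondaryMonitoring rather than MNT, so both are matched)
netapi ise -f json api-call openapi GET "/api/v1/deployment/node" | jq '.response[] | select(any(.roles[]; test("MNT|Monitoring"; "i"))) | {hostname, ipAddress, status: .nodeStatus, services}'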

Recent Auth Failures (Live Logs via API)

# Last 24 hours failed authentications
netapi ise -f json mnt failures --hours 24 | jq -r '.[] | [.timestamp, .username, .nas_ip, .failure_reason] | @tsv' | head -100

# Group failures by reason code
netapi ise -f json mnt failures --hours 24 | jq -r '.[].failure_reason' | sort | uniq -c | sort -rn

# Group failures by PSN (which PSN is seeing failures?)
netapi ise -f json mnt failures --hours 24 | jq -r '.[].psn' | sort | uniq -c | sort -rn
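
Each command above queries MNT again for the same 24 hours, which may add load to a node that is already struggling. One alternative is to save the tab-separated output of the first command to a file once and do the grouping locally. The sketch below assumes that file exists and uses the column order from the first command (timestamp, username, nas_ip, failure_reason). `count_by_column` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count rows of a tab-separated file by one column,
# most frequent value first. Column numbers are 1-based.
count_by_column() {
  local file="$1" col="$2"
  awk -F '\t' -v c="$col" '
    { counts[$c]++ }
    END { for (k in counts) printf "%d\t%s\n", counts[k], k }
  ' "$file" | sort -t $'\t' -k1,1nr -k2,2
}
```

With the saved file named, say, `failures.tsv`, `count_by_column failures.tsv 4` groups by failure reason and `count_by_column failures.tsv 3` groups by NAS IP. Grouping by PSN this way would require adding `.psn` as an extra column when the file is first captured.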

Active Sessions

# Current session count per PSN
netapi ise -f json mnt sessions | jq -r 'group_by(.psn) | .[] | {psn: .[0].psn, count: length}'

# Total active sessions
netapi ise -f json mnt sessions | jq 'length'

Policy Sets

# List authentication policy sets
netapi ise policy-sets

# Check the General-Device-Onboard identity group (workaround)
netapi ise -f json identity-groups | jq '.[] | select(.name | contains("General-Device-Onboard"))'

AD Connectivity

# AD join point status
netapi ise -f json api-call openapi GET "/api/v1/active-directory" | jq '.response[] | {name, domain, status: .adJoinPointStatus}'

Export Full Config (for TAC upload)

# Dump deployment info to JSON
netapi ise export > /tmp/ise-config-$(date +%Y%m%d).json

TAC Engagement — 2026-03-12 15:08

1. TAC Initial Observations

1.1 Disabled ISE Messaging Services

TAC observed that ISE Messaging Service for UDP syslog delivery to MNT is disabled.

Location: ppan.ise.chla.org/admin/#administration/administration_system/administration_system_logging/local_log

Setting: ISE Messaging Settings

Use "ISE Messaging Service" for UDP Syslogs delivery to MnT

Impact: If disabled, PSNs may fail to send session/auth records to MNT, contributing to "No data available for this record" errors.

2. TAC Recommendations Tracking

Recommendation: Enable ISE Messaging Services
  Description: Turn on "Use ISE Messaging Service for UDP syslogs delivery to MnT".
  Owner: InfoSec Engineering
  Status: Pending
  Notes / Next Steps: Must be enabled during maintenance window; confirm PSN → MNT log ingestion resumes.

Recommendation: Resolve MNT Replication Failure
  Description: PAN dashboard shows alarms: Replication Failed from PMnT. Deregister/re-register affected nodes.
  Owner: InfoSec Engineering + TAC
  Status: In Progress
  Notes / Next Steps: Perform on both PMnT and SMnT. Validate DB state & cluster hashing before re-registration.

Recommendation: Promote SMnT to Primary MNT
  Description: TAC recommends promoting secondary MNT to primary role temporarily.
  Owner: InfoSec Engineering
  Status: Pending Decision
  Notes / Next Steps: Requires validation of replication health and disk space. Ensure no corruption on SMnT.

Recommendation: Upgrade to ISE 3.2 Patch 9
  Description: TAC recommends installing latest patch to address known replication and logging issues.
  Owner: InfoSec Engineering
  Status: Pending
  Notes / Next Steps: Download link: software.cisco.com/download/home/283801620/type/283802505/release/3.2%20Patch%209

Recommendation: Review Disk Space on PMnT + SMnT
  Description: Verify DB/log partitions; full partitions can break logging and replication.
  Owner: InfoSec Engineering
  Status: In Progress
  Notes / Next Steps: Capture from CLI: show disk

Recommendation: Validate ISE Node Sync Status
  Description: Ensure deployment sync and configuration database replication are functioning.
  Owner: InfoSec Engineering
  Status: Pending
  Notes / Next Steps: GUI: Administration > System > Deployment

4. Scratch Space (Working Notes)

(Keep this for live call notes, timestamps, commands run, replication output, disk output, etc.)

Primary MNT: CPU usage of the RabbitMQ service is over 100%.

Management Summary — Primary MNT CPU / RabbitMQ Issue

We identified a critical performance issue on the Primary Monitoring Node (MNT) in our Cisco ISE deployment. The RabbitMQ messaging service, which processes authentication and session logs, is running at over 100% CPU. The MNT cannot keep up with incoming messages, which causes a backlog and instability in logging and monitoring.

Recommended Action

Cisco TAC has advised us to reboot the Primary MNT to clear the overloaded messaging service. During this reboot:

  • The Secondary MNT will automatically take over all monitoring/logging responsibilities.

  • There is no impact to user authentication or network access. All authentication is handled by the four Policy Service Nodes (PSNs), which remain fully operational.

Why This Matters

The overloaded Primary MNT is contributing to the missing log data and failed session lookups we are seeing. Addressing it is part of stabilizing the overall environment and restoring full visibility into authentication events.

Next Steps

  • After reboot, validate replication, queue processing, and log ingestion.

  • Continue working with Cisco TAC to assess whether additional corrective actions are required.

2026-03-12 16:19 pmnt services stopped and node rebooted
- [ ] application stop ise
- [ ] reboot
- [ ] saved ade-os
- [ ] acknowledged reboot
- [ ] ssh'd into server at 2026-03-12 16:21
- [ ] pmnt/admin#show uptime
 16:21:26 up 3 min,  1 user,  load average: 2.77, 1.35, 0.53
- [ ] 2026-03-12 16:29 running show application status ise

ISE PROCESS NAME                       STATE            PROCESS ID
--------------------------------------------------------------------
Database Listener                      running          8851
Database Server                        running          300 PROCESSES
Application Server                     running          28749
Profiler Database                      running          17555
ISE Indexing Engine                    disabled
AD Connector                           running          29781
M&T Session Database                   running          24955
M&T Log Processor                      running          29017
Certificate Authority Service          disabled
EST Service                            running          157854
SXP Engine Service                     disabled
TC-NAC Service                         disabled
PassiveID WMI Service                  disabled
PassiveID Syslog Service               disabled
PassiveID API Service                  disabled
PassiveID Agent Service                disabled
PassiveID Endpoint Service             disabled
PassiveID SPAN Service                 disabled
DHCP Server (dhcpd)                    disabled
DNS Server (named)                     disabled
ISE Messaging Service                  running          12480
ISE API Gateway Database Service       running          16233
ISE API Gateway Service                running          23247
ISE pxGrid Direct Service              disabled
Segmentation Policy Service            disabled
REST Auth Service                      running          145209
SSE Connector                          disabled
Hermes (pxGrid Cloud Agent)            disabled
McTrust (Meraki Sync Service)          disabled
ISE Node Exporter                      running          48340
ISE Prometheus Service                 disabled
ISE Grafana Service                    disabled
ISE MNT LogAnalytics Elasticsearch     running          57535
ISE Logstash Service                   running          75054
ISE Kibana Service                     running          92228
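
Post-reboot validation of output like the above can be scripted instead of read by eye. This is a minimal sketch, assuming the `show application status ise` output was saved to a text file. `check_services` is a hypothetical helper, and which services to require is an illustrative choice for an MNT node, not a Cisco-defined set.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report required services whose state is not "running".
# Usage: check_services <status-file> <full service name>...
# Returns 0 when every named service is running, 1 otherwise.
# Pass full names exactly as they appear in the first column.
check_services() {
  local file="$1" name state rc=0
  shift
  for name in "$@"; do
    # Use the first line that starts with the service name, strip the name,
    # and read the state that follows ("not running" is kept as two words).
    state=$(awk -v n="$name" '
      index($0, n) == 1 {
        rest = substr($0, length(n) + 1)
        sub(/^[ \t]+/, "", rest)
        if (rest ~ /^not running/) print "not running"
        else { split(rest, w, /[ \t]+/); print w[1] }
        exit
      }' "$file")
    if [ "$state" != "running" ]; then
      echo "$name: ${state:-missing}"
      rc=1
    fi
  done
  return "$rc"
}
```

For this incident a reasonable call would be `check_services status-pmnt.txt "M&T Session Database" "M&T Log Processor" "ISE Messaging Service"`. Against the output captured above it would print nothing and return 0, since all three show as running.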

- [ ]




[scratch area for case log / CLI findings]

TAC Communication Log

Date          Who          Notes
----------------------------------------
2026-03-12    your name    Case opened