PREP: Defense Matrix & Diagnostics

Defense Matrix

"Is ISE causing the authentication failures?"

Answer: No. ISE PSNs are processing RADIUS authentications correctly.

Evidence

  • All 4 PSNs show running status for all RADIUS services

  • Active session count confirms endpoints are authenticating

  • The MNT logging issue affected visibility, not authentication

Clarification

The ~500 endpoint failures predated the MNT issue and require separate investigation (certificate chain, supplicant config, AD connectivity).

"Why did RabbitMQ spike to 100%+ CPU?"

Answer: Message queue saturation caused by high session volume combined with replication delay.

Root Cause

  • MNT receives session data from all 4 PSNs

  • Replication between Primary/Secondary MNT was degraded

  • Queue backed up → CPU spike → logging stopped working

TAC Guidance

Known issue addressed in ISE 3.2 Patch 9. Reboot cleared backlog. Patch upgrade scheduled.

"Is the network safe right now?"

Answer: Yes. Authentication infrastructure is fully operational.

Architecture

  • 4 PSNs behind NetScaler VIPs - load balanced, redundant

  • Secondary MNT provides logging redundancy

  • PANs handle policy distribution (unaffected)

Monitoring

TAC case remains open. Proactive monitoring in place.

"What’s the timeline to full resolution?"

Pending Actions

  1. Enable ISE Messaging Service - maintenance window required

  2. ISE 3.2 Patch 9 upgrade - TAC coordinated, addresses known replication issues

Current State

Stable. No authentication impact. Logging restored.

Next Steps

Schedule maintenance window for remaining changes after business validation.

Quick Diagnostic Commands

Run these before/during any meeting to have current data:

System Health Check

# All nodes status
netapi ise -f json api-call openapi GET "/api/v1/deployment/node" | \
  jq -r '["HOSTNAME","STATUS"], (.response[] | [.hostname, .nodeStatus]) | @tsv' | column -t

# Expected: All nodes "OK" or "Connected"
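
To turn the "expected" check above into a pass/fail result, the table can be piped through a small filter. This is a minimal sketch, assuming the two-column HOSTNAME/STATUS output produced by the command above; `check_nodes` is a hypothetical helper name, not part of netapi.

```shell
# check_nodes: hypothetical helper, not part of netapi.
# Reads "HOSTNAME STATUS" rows on stdin (header line first), prints any node
# whose status is neither "OK" nor "Connected", and exits non-zero if one is found.
check_nodes() {
  awk '
    NR > 1 && NF >= 2 && $2 != "OK" && $2 != "Connected" {
      print "ATTENTION: " $1 " is " $2
      bad = 1
    }
    END {
      if (!bad) print "All nodes healthy"
      exit bad
    }'
}
```

Usage: append `| check_nodes` to the node status pipeline above. The non-zero exit status makes it usable in a pre-meeting script that should stop on the first unhealthy result.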

Authentication Verification

# Active session count (proves auth is working)
netapi ise -f json mnt sessions | jq 'length'

# Sessions by PSN (distribution should be balanced)
netapi ise -f json mnt sessions | jq -r 'group_by(.psn) | .[] | "\(.[0].psn): \(length)"'
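
"Balanced" can be made concrete by comparing each PSN's session count with the mean. The sketch below assumes the `name: count` lines printed by the command above; `psn_balance` is a hypothetical helper, and the 25% default tolerance is an arbitrary placeholder rather than a Cisco or TAC recommendation.

```shell
# psn_balance: hypothetical helper, not part of netapi.
# Reads "psn-name: count" lines on stdin and flags any PSN whose session count
# deviates from the mean by more than the given percentage.
# $1 = tolerance in percent (default 25, an arbitrary placeholder).
psn_balance() {
  awk -F': ' -v tol="${1:-25}" '
    NF == 2 { name[++n] = $1; count[n] = $2; total += $2 }
    END {
      if (n == 0 || total == 0) { print "no session data"; exit 2 }
      mean = total / n
      for (i = 1; i <= n; i++) {
        dev = (count[i] - mean) / mean * 100
        flag = (dev > tol || dev < -tol) ? "  <-- outside tolerance" : ""
        printf "%s: %d sessions (%+.0f%% vs mean)%s\n", name[i], count[i], dev, flag
        if (flag != "") bad = 1
      }
      exit bad
    }'
}
```

Usage: append `| psn_balance 25` to the sessions-by-PSN pipeline above. An uneven result is not necessarily a fault, since load balancer persistence settings can skew distribution, so treat a flag as a prompt to look closer.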

# Recent failures (should be minimal)
netapi ise -f json mnt failures --hours 1 | jq 'length'

MNT Health Check

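# Check MNT nodes. ISE typically reports these roles as "PrimaryMonitoring" /
# "SecondaryMonitoring", so match either spelling; any() emits each node once.
netapi ise -f json api-call openapi GET "/api/v1/deployment/node" | \
  jq '.response[] | select(any(.roles[]?; test("MNT|Monitoring"))) | {hostname, status: .nodeStatus}'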