Incident Report Template

Copy this template when creating a new incident report.

Filename convention: INC-YYYY-MM-DD-brief-description.adoc
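
The filename convention above can be generated rather than typed by hand. The following is a minimal sketch, assuming a POSIX shell; the function name inc_filename and its two arguments (date, free-text description) are hypothetical and not part of the template.

```shell
# Hypothetical helper: build an incident report filename from a date
# (YYYY-MM-DD) and a short free-text description.
inc_filename() {
  date_part="$1"
  # Lowercase the description, collapse every run of non-alphanumeric
  # characters into a single hyphen, then trim leading/trailing hyphens.
  slug=$(printf '%s' "$2" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//')
  printf 'INC-%s-%s.adoc\n' "$date_part" "$slug"
}

inc_filename 2024-03-05 "Database Outage"   # prints: INC-2024-03-05-database-outage.adoc
```

The sketch does not validate the date argument; it only normalizes the description.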


Incident Summary

Field      | Value
Detected   | YYYY-MM-DD HH:MM TZ (how detected)
Mitigated  | YYYY-MM-DD HH:MM TZ (or N/A)
Resolved   | YYYY-MM-DD HH:MM TZ (or ongoing)
Duration   | X hours/days
Severity   | P1 (Critical) / P2 (High) / P3 (Medium) / P4 (Low)
Impact     | Brief description of what was affected
Root Cause | One-line root cause statement
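
The Duration field can be derived from the Detected and Resolved timestamps instead of being estimated. A minimal sketch, assuming GNU date (the -d option is not available in BSD/macOS date) and both timestamps given in UTC; the function name incident_duration is hypothetical.

```shell
# Hypothetical helper: print the elapsed time between two timestamps
# in the form "Xh Ym". Assumes GNU date and UTC inputs.
incident_duration() {
  start_s=$(date -u -d "$1" +%s) || return 1   # Detected, as epoch seconds
  end_s=$(date -u -d "$2" +%s) || return 1     # Resolved, as epoch seconds
  total_min=$(( (end_s - start_s) / 60 ))
  printf '%dh %dm\n' $(( total_min / 60 )) $(( total_min % 60 ))
}

incident_duration "2024-03-05 14:10" "2024-03-05 17:45"   # prints: 3h 35m
```

For incidents spanning several days the hour count simply exceeds 24; convert to days by hand if the report should read "X days".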

Timeline

Time (TZ) | Event
HH:MM     | Initial symptom observed
HH:MM     | Alert triggered / user reported
HH:MM     | Investigation started
HH:MM     | Root cause identified
HH:MM     | Mitigation applied
HH:MM     | Verified resolved

Symptoms

  • What was observed?

  • Error messages?

  • Failed services or processes?

  • User reports?

Investigation

Initial Triage

# First diagnostic commands run
# Example: systemctl status <service>

Log Analysis

# Log queries
# Example: journalctl -u <service> --since "today"

Findings

  1. Finding 1 - what was discovered

  2. Finding 2 - what led to root cause

  3. Finding 3 - contributing factors

Root Cause

Technical explanation: One paragraph explaining why the incident occurred.

Why it happened:

  • Immediate cause: [what failed]

  • Contributing factors: [what made it worse or allowed it to happen]

  • Systemic issues: [underlying problems]

Resolution

Immediate Fix

# Commands used to resolve the incident

Verification

# Commands used to verify the fix

  • Service restored

  • Monitoring shows healthy

  • No new errors in logs

  • Users confirmed resolution

Impact Assessment

Systems Affected

System   | Status   | Impact Duration
System 1 | Restored | X hours
System 2 | N/A      | -

Business Impact

  • Users affected: [count or percentage]

  • Data loss: Yes / No - details

  • Compliance implications: [if any]

  • External visibility: [customer-facing?]

Prevention

Short-term (This Week)

  • Action 1 - Owner

  • Action 2 - Owner

Long-term (This Quarter)

  • Systemic improvement 1 - Owner

  • Process change 1 - Owner

  • Monitoring enhancement - Owner

Lessons Learned

What Went Well

  • Item 1

  • Item 2

What Could Be Improved

  • Item 1

  • Item 2

Key Takeaways

  1. Key insight from this incident

  2. Pattern to watch for in the future

  3. Best practice reinforced or learned

Communication Log

Time  | Audience        | Message
HH:MM | Team/Management | Initial notification
HH:MM | Stakeholders    | Status update
HH:MM | All             | Resolution notification

Related Documents

  • Change Request: CR-YYYY-MM-DD-description.adoc (link to related CR)

  • RCA: RCA-YYYY-MM-DD-NNN.adoc (link to detailed RCA if P1/P2)

  • Runbook: Link to relevant runbook

  • Monitoring: Link to dashboard/alert

Metadata

Field                | Value
Incident ID          | INC-YYYY-MM-DD-NNN
Author               | Name
Created              | YYYY-MM-DD
Last Updated         | YYYY-MM-DD
Status               | Draft / In Review / Final
Post-Incident Review | YYYY-MM-DD (within 5 business days for P1/P2)
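
The Post-Incident Review deadline for P1/P2 incidents can be computed from the resolution date. A minimal sketch, assuming GNU date, that business days means Monday through Friday, and that public holidays are ignored; the function name review_deadline is hypothetical.

```shell
# Hypothetical helper: print the date that falls 5 business days after
# the given date (YYYY-MM-DD). Weekends are skipped; holidays are NOT.
review_deadline() {
  day="$1"
  remaining=5
  while [ "$remaining" -gt 0 ]; do
    day=$(date -u -d "$day +1 day" +%Y-%m-%d) || return 1
    dow=$(date -u -d "$day" +%u)   # 1 = Monday ... 7 = Sunday
    if [ "$dow" -lt 6 ]; then
      remaining=$(( remaining - 1 ))
    fi
  done
  echo "$day"
}

review_deadline 2024-03-05   # Tuesday; prints: 2024-03-12
```

Teams that observe holidays would need to extend the weekday check with their own calendar.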


Severity Definitions

Severity      | Criteria                                           | Response Time
P1 - Critical | Production down, data loss, security breach        | Immediate, all hands
P2 - High     | Major functionality impaired, workaround difficult | Within 1 hour, dedicated team
P3 - Medium   | Functionality degraded, workaround available       | Within 4 hours, normal priority
P4 - Low      | Minor issue, cosmetic, no user impact              | Next business day
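
The severity definitions above can be encoded as a small lookup, for example in a script that posts the expected response time alongside a new incident. A minimal sketch, assuming a POSIX shell; the function name response_target is hypothetical, and the strings are copied from the Response Time column.

```shell
# Hypothetical helper: map a severity level (P1..P4) to the response
# time stated in the Severity Definitions table.
response_target() {
  case "$1" in
    P1) echo "Immediate, all hands" ;;
    P2) echo "Within 1 hour, dedicated team" ;;
    P3) echo "Within 4 hours, normal priority" ;;
    P4) echo "Next business day" ;;
    *)  echo "unknown severity: $1" >&2; return 1 ;;
  esac
}

response_target P2   # prints: Within 1 hour, dedicated team
```

An unrecognized level writes a message to standard error and returns a non-zero status, so callers can detect typos such as "p2" or "P5".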