STD-012: Runbook Structure

Runbooks document repeatable operational procedures with explicit commands, validation at every step, and rollback paths. A runbook is not a description of what to do — it is the exact sequence of commands an operator executes. If a step cannot be copy-pasted into a terminal, it is not a runbook step.

Principles

  1. Explicit commands, not descriptions. Every step contains the exact command to run. "Configure the firewall" is not a step; the command firewall-cmd --add-port=443/tcp --permanent is.

  2. Validation at every step. Every command that changes state MUST be followed by a verification command with documented expected output. This is the verify-change-verify pattern from STD-005.

  3. Expected output documented. Operators MUST know what success looks like before they execute. Every verification includes a .Expected Output block.

  4. Reversible with rollback. Every procedure MUST have a Rollback section that restores the system to its pre-execution state. If a procedure is not reversible, the Rollback section MUST document why and what compensating controls exist.

  5. Variables for reuse. Environment-specific values (hostnames, IPs, paths) are defined once in a Variables section and referenced throughout. Runbooks are portable across environments by changing only the variables.
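
The five principles can be sketched together in one small step. This is an illustration only, not a prescribed implementation: the max_connections setting, the helper function names, and the use of a temporary file in place of a real configuration path are all assumptions made so the snippet is safe to run anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: the config file and the max_connections setting are hypothetical.

# Variables: environment-specific values, defined once.
CONFIG_FILE="$(mktemp)"     # stands in for a real service config path
NEW_LIMIT="200"

# Verification command: prints the current value of the setting.
verify_setting() {
  grep -E '^max_connections=' "$CONFIG_FILE"
}

# Mutating step: take a backup first so the change stays reversible.
apply_limit() {
  cp "$CONFIG_FILE" "$CONFIG_FILE.bak"
  sed "s/^max_connections=.*/max_connections=${NEW_LIMIT}/" "$CONFIG_FILE.bak" > "$CONFIG_FILE"
}

# Rollback: restore the pre-execution state from the backup.
rollback_limit() {
  cp "$CONFIG_FILE.bak" "$CONFIG_FILE"
}

echo "max_connections=100" > "$CONFIG_FILE"   # demo seed representing existing state

verify_setting    # before the change; expected: max_connections=100
apply_limit
verify_setting    # after the change;  expected: max_connections=200
rollback_limit
verify_setting    # after rollback;    expected: max_connections=100
```

Each call to verify_setting plays the role of a verification command whose expected output is written down before the operator runs it.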

Naming Convention

| Prefix    | Purpose                        | Example                                      |
|-----------|--------------------------------|----------------------------------------------|
| RUNBOOK-  | Standard operational procedure | RUNBOOK-2026-02-12-vault-pki-issuance.adoc   |
| INCIDENT- | Incident response procedure    | INCIDENT-2026-02-12-auth-failure-triage.adoc |
| DEPLOY-   | Deployment procedure           | DEPLOY-2026-02-12-ise-upgrade.adoc           |
| RECOVERY- | Disaster recovery procedure    | RECOVERY-2026-02-12-backup-restore.adoc      |

Runbook files are stored in pages/runbooks/ and named with the appropriate prefix, a date in YYYY-MM-DD form, and a kebab-case slug, followed by the .adoc extension.
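
A filename check along these lines could be used to catch naming mistakes before a runbook is committed. This is a minimal sketch: check_runbook_name is a hypothetical helper, and the pattern assumes the date is always YYYY-MM-DD and the slug contains only lowercase letters and digits, as in the examples above.

```shell
#!/usr/bin/env bash
# Sketch only: check_runbook_name is a hypothetical helper, not part of STD-012.

# Returns 0 when the filename matches PREFIX-YYYY-MM-DD-kebab-case-slug.adoc.
check_runbook_name() {
  name="$(basename "$1")"
  pattern='^(RUNBOOK|INCIDENT|DEPLOY|RECOVERY)-[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])-[a-z0-9]+(-[a-z0-9]+)*\.adoc$'
  printf '%s\n' "$name" | grep -Eq "$pattern"
}

# The second path is deliberately wrong: lowercase prefix and no date.
for f in pages/runbooks/RUNBOOK-2026-02-12-vault-pki-issuance.adoc \
         pages/runbooks/deploy-ise-upgrade.adoc; do
  if check_runbook_name "$f"; then
    echo "OK   $f"
  else
    echo "FAIL $f"
  fi
done
```

The pattern validates the shape of the date only; it does not reject impossible dates such as 2026-02-31.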

Required Sections

Every runbook MUST contain the following sections in this order:

| Section | Content |
|---------|---------|
| Overview | Brief description of the procedure, when to use it, and what it accomplishes |
| Prerequisites | Checklist of required access, permissions, tools, and pre-conditions, all verified before execution begins |
| Scope | Explicit in-scope and out-of-scope boundaries so operators know where this runbook ends |
| Variables | All environment-specific values defined as shell variables at the top, with concrete examples |
| Procedure (phased) | Numbered phases, each containing numbered steps. Every step has a command block. Every mutating step has a verification with expected output |
| Rollback | Step-by-step reversal of the procedure, restoring the system to its pre-execution state |
| Troubleshooting | Common failure modes with symptom, cause, and resolution command for each |
| Post-Execution Checklist | Verification checklist confirming the procedure achieved its goals: service running, functionality tested, documentation updated, stakeholders notified |
| Related | Cross-references to related runbooks, standards, incident reports, and external documentation |
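
The presence and order of the required sections can be checked mechanically. The sketch below is one possible approach under two assumptions that the standard does not state outright: each required section is a level-1 AsciiDoc heading of the form "== Name", and the Procedure heading is written as "== Procedure". The helper name check_sections is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: check_sections is a hypothetical helper; it assumes each
# required section appears as a level-1 AsciiDoc heading ("== Name").

REQUIRED_SECTIONS='Overview
Prerequisites
Scope
Variables
Procedure
Rollback
Troubleshooting
Post-Execution Checklist
Related'

# Reports missing or out-of-order sections; returns non-zero on any finding.
check_sections() {
  file="$1"
  last=0
  status=0
  while IFS= read -r section; do
    # Line number of the first matching heading, empty if absent.
    line="$(grep -n -x -F "== $section" "$file" | head -n 1 | cut -d: -f1)"
    if [ -z "$line" ]; then
      echo "MISSING: $section"
      status=1
    elif [ "$line" -le "$last" ]; then
      echo "OUT OF ORDER: $section"
      status=1
    else
      last="$line"
    fi
  done <<EOF
$REQUIRED_SECTIONS
EOF
  return "$status"
}

# Demo against a generated skeleton that contains every section in order.
demo="$(mktemp)"
printf '== %s\n\n' Overview Prerequisites Scope Variables Procedure \
  Rollback Troubleshooting 'Post-Execution Checklist' Related > "$demo"
check_sections "$demo" && echo "All required sections present and in order"
```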

Procedure Phase Structure

Each phase within the Procedure section follows this structure:

    === Phase N: Phase Name

    ==== N.1 Step Name

    [source,bash]
    ----
    # Command to execute
    command --flag value
    ----

    .Expected Output
    ----
    output confirming success
    ----


Steps within a phase are numbered sequentially: 1.1, 1.2, 2.1, 2.2. Phases represent logical groupings: Preparation, Execution, Verification, Cleanup.
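
Sequential numbering is easy to break when steps are inserted or removed, so a small check can help. The following is a sketch, assuming phase headings take the form "=== Phase N: Name" and step headings the form "==== N.M Step Name" as in the structure above; check_step_numbering is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch only: check_step_numbering is a hypothetical helper.

# Reports phases or steps whose numbers do not follow on from the previous
# one; returns non-zero when any problem is found.
check_step_numbering() {
  awk '
    /^=== Phase [0-9]+:/ {
      phase = $3; sub(/:$/, "", phase); phase += 0
      if (phase != last_phase + 1) { printf "Phase %d out of sequence\n", phase; bad = 1 }
      last_phase = phase; last_step = 0
      next
    }
    /^==== [0-9]+\.[0-9]+ / {
      split($2, n, /\./)
      if (n[1] + 0 != last_phase || n[2] + 0 != last_step + 1) {
        printf "Step %s out of sequence\n", $2; bad = 1
      }
      last_step = n[2] + 0
    }
    END { exit bad + 0 }
  ' "$1"
}

# Demo: step 2.2 is missing, so 2.3 is reported.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
=== Phase 1: Preparation
==== 1.1 Confirm access
==== 1.2 Take backup
=== Phase 2: Execution
==== 2.1 Apply change
==== 2.3 Restart service
EOF
check_step_numbering "$demo" || echo "Numbering problems found"
```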

Requirements

  1. Every step that modifies system state MUST have a verification command immediately following it.

  2. Expected output MUST be shown in a .Expected Output block after every verification command.

  3. Rollback MUST be documented for every procedure. If the procedure is irreversible, the Rollback section MUST state this explicitly and document compensating controls using this format:

    == Rollback
    
    WARNING: This procedure is irreversible. The following compensating controls apply:
    
    * *Pre-execution backup:* [backup command and location]
    * *Recovery path:* [how to restore from backup if needed]
    * *Acceptance criteria:* [how to verify the change is acceptable if rollback is impossible]

  4. Variables MUST be used for all environment-specific values: hostnames, IPs, ports, paths, usernames. No hardcoded values in procedure steps.

  5. The Prerequisites section MUST be a checklist (using * [ ]) so operators can verify readiness before starting.

  6. Phases MUST be numbered sequentially and named descriptively.

  7. NOTE, WARNING, and CAUTION admonitions MUST be used for steps with elevated risk or non-obvious behavior.

  8. Runbooks SHOULD be tested in non-production before use against production systems.
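
Requirement 4 can be reinforced with a guard that refuses to proceed when a variable is missing. This is a sketch under stated assumptions: require_vars is a hypothetical helper, and TARGET_HOST, SERVICE_PORT, the hostname, and the /health path are placeholder values chosen for illustration. The final command is printed rather than executed so the snippet has no side effects.

```shell
#!/usr/bin/env bash
# Sketch only: require_vars and the variable names below are hypothetical.

# Returns non-zero and names each variable that is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup by name; callers pass literal variable names only.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Variable not set: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Variables section: the only place environment-specific values appear.
TARGET_HOST="app01.example.com"
SERVICE_PORT="443"

# Guard at the top of the Procedure, then reference variables in every step.
# The step is echoed here instead of run, to keep the demo offline.
if require_vars TARGET_HOST SERVICE_PORT; then
  echo "curl -sS https://${TARGET_HOST}:${SERVICE_PORT}/health"
fi
```

Moving the runbook to another environment then means editing only the two assignments in the Variables section.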

Compliance

| Check | Method | Pass Criterion |
|-------|--------|----------------|
| All required sections present | Runbook contains Overview, Prerequisites, Scope, Variables, Procedure, Rollback, Troubleshooting, Post-Execution Checklist, Related | No missing sections |
| Verification at every step | Every mutating command is followed by a verification command with .Expected Output | No blind changes |
| Rollback complete | Rollback section contains explicit commands that reverse the procedure, or documents why reversal is impossible | System can be restored |
| Variables used | No hardcoded IPs, hostnames, ports, or paths in procedure steps; all reference variables from the Variables section | Environment-portable |
| Naming correct | Filename uses registered prefix (RUNBOOK-, INCIDENT-, DEPLOY-, RECOVERY-) with date and kebab-case slug | Matches naming convention |
| Prerequisites checkable | Prerequisites are a checklist with verifiable items, not prose | Operator can confirm readiness |
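
Part of the "Variables used" check can be approximated with a scan for literal IPv4 addresses inside the Procedure section. The sketch below is a heuristic, not a complete compliance tool: find_hardcoded_ips is a hypothetical helper, it assumes the section heading is written "== Procedure", and any dotted four-part number (including some version strings) will be flagged. Hostnames, ports, and paths would need separate checks.

```shell
#!/usr/bin/env bash
# Sketch only: find_hardcoded_ips is a hypothetical helper and a heuristic.

# Prints "line: text" for every Procedure line containing an IPv4-like
# literal; returns non-zero when at least one is found.
find_hardcoded_ips() {
  awk '
    /^== / { in_proc = ($0 == "== Procedure") }
    in_proc && /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/ { printf "%d: %s\n", NR, $0; found = 1 }
    END { exit found + 0 }
  ' "$1"
}

# Demo: the address in Variables is allowed, the one in Procedure is not.
demo="$(mktemp)"
cat > "$demo" <<'EOF'
== Variables
TARGET_IP="10.0.0.5"
== Procedure
ssh "${TARGET_USER}@${TARGET_IP}"
ping -c 1 192.0.2.10
== Rollback
EOF
find_hardcoded_ips "$demo" || echo "Hardcoded IPs found in Procedure"
```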