Appearance
Runbooks
Runbooks should be direct, current, and written for people under pressure.
Template
Use this structure for each runbook:
- Symptom
- Impact
- First checks
- Recovery steps
- Escalation path
- Related dashboards, alerts, and logs
Initial Runbooks
- Failed data pipeline
- Delayed source extract
- Schema drift detected
- Access request blocked
- Production deployment rollback