
10 Essential Runbook Examples for DevOps

· 5 min read · Stew Team

Every DevOps team reinvents the same runbooks. Deployment procedures, incident response, database maintenance—the patterns are universal. If you need a structured approach, start with our runbook template guide.

Here are 10 runbook examples you can adapt for your team.

Runbook Examples for DevOps

1. Production Deployment Runbook

# Production Deployment

## Prerequisites
- [ ] All tests passing in CI
- [ ] Changelog updated
- [ ] Team notified in #deployments

## Procedure

### Pre-deployment
​```bash
# Verify current state
kubectl get deployments -n production
git log --oneline -5
​```

### Deploy
​```bash
kubectl set image deployment/api api=myapp:$VERSION -n production
kubectl rollout status deployment/api -n production
​```

### Verify
​```bash
# -f makes curl exit non-zero on HTTP errors, so a failing health check is not missed
curl -fsS https://api.example.com/health
kubectl logs deployment/api -n production --tail=50
​```

## Rollback
​```bash
kubectl rollout undo deployment/api -n production
​```
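The deploy, verify, and rollback steps above can be chained so that a failed rollout triggers the undo automatically. This is a minimal sketch rather than a definitive implementation: the deployment, container, image, and namespace names come from the runbook, while the function name `deploy_or_rollback` and the 120-second timeout are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Sketch: deploy a version and roll back automatically if the rollout fails.
# deploy_or_rollback and the 120s timeout are hypothetical choices.

deploy_or_rollback() {
  local version="$1"
  local ns="production"

  kubectl set image deployment/api "api=myapp:${version}" -n "$ns" || return 1

  # `rollout status` exits non-zero if the rollout does not finish in time.
  if kubectl rollout status deployment/api -n "$ns" --timeout=120s; then
    echo "deployed ${version}"
  else
    echo "rollout failed, rolling back" >&2
    kubectl rollout undo deployment/api -n "$ns"
    return 1
  fi
}
```

Calling `deploy_or_rollback "$VERSION"` returns non-zero whenever a rollback happened, so a CI job wrapping it would fail visibly.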

2. Incident Response Runbook

For a comprehensive approach to incident management, see our dedicated guide to runbooks for incident management.

# Incident Response

## Initial Response (First 5 Minutes)

1. Acknowledge alert in PagerDuty
2. Join #incident-response channel
3. Post initial assessment:

​```markdown
**Incident:** [Brief description]
**Impact:** [User-facing impact]
**Status:** Investigating
​```

## Triage

### Check Service Health
​```bash
kubectl get pods -n production
curl -I https://api.example.com/health
​```

### Check Recent Changes
​```bash
git log --oneline --since="2 hours ago"
kubectl rollout history deployment/api -n production
​```

## Escalation
If the incident is not resolved within 15 minutes, page the secondary on-call.
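
The 15-minute escalation rule is easy to lose track of mid-incident. One way to make it checkable is a small helper that compares the incident start time against the clock. This is a sketch under assumptions: `should_escalate` and `ESCALATE_AFTER_SECONDS` are hypothetical names, and the start time is assumed to be recorded as a Unix epoch when the alert is acknowledged.

```bash
# Sketch: decide whether the 15-minute escalation threshold has passed.
ESCALATE_AFTER_SECONDS=$((15 * 60))

# Usage: should_escalate <incident_start_epoch> [now_epoch]
# The second argument defaults to the current time.
should_escalate() {
  local started="$1"
  local now="${2:-$(date +%s)}"
  [ $((now - started)) -ge "$ESCALATE_AFTER_SECONDS" ]
}

# Example wiring (commented out):
# INCIDENT_START=$(date +%s)   # record when the alert is acknowledged
# should_escalate "$INCIDENT_START" && echo "Page secondary on-call"
```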

3. Database Backup Runbook

# Database Backup

## Daily Backup Procedure

### Create Backup
​```bash
pg_dump -h $DB_HOST -U $DB_USER -d production \
  --format=custom \
  --file=/backups/prod_$(date +%Y%m%d).dump
​```

### Verify Backup
​```bash
pg_restore --list /backups/prod_$(date +%Y%m%d).dump | head -20
ls -lh /backups/prod_$(date +%Y%m%d).dump
​```

### Upload to S3
​```bash
aws s3 cp /backups/prod_$(date +%Y%m%d).dump \
  s3://backups-bucket/postgres/
​```

## Verification
​```bash
aws s3 ls s3://backups-bucket/postgres/ | tail -5
​```
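Each step above re-evaluates `$(date +%Y%m%d)`, so a run that crosses midnight would look for a file under a different name than the one it created. A possible way around this is to resolve the path once and reuse it. The helper names `backup_path` and `verify_backup_file` are hypothetical; the directory and file pattern are taken from the runbook.

```bash
# Sketch: resolve the backup path once and reuse it across steps.

# Usage: backup_path [YYYYMMDD]  (defaults to today's date)
backup_path() {
  local stamp="${1:-$(date +%Y%m%d)}"
  echo "/backups/prod_${stamp}.dump"
}

# Succeeds only if the file exists and is non-empty.
verify_backup_file() {
  [ -s "$1" ]
}

# Example wiring (commented out so the helpers can be sourced safely):
# BACKUP_FILE=$(backup_path)
# pg_dump -h "$DB_HOST" -U "$DB_USER" -d production --format=custom --file="$BACKUP_FILE"
# verify_backup_file "$BACKUP_FILE" && aws s3 cp "$BACKUP_FILE" s3://backups-bucket/postgres/
```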

4. SSL Certificate Renewal Runbook

# SSL Certificate Renewal

## Check Expiration
​```bash
echo | openssl s_client -servername example.com -connect example.com:443 2>/dev/null | openssl x509 -noout -dates
​```

## Renewal Steps

### Generate New Certificate
​```bash
# Quote the wildcard so the shell does not expand it as a glob
certbot certonly --dns-cloudflare -d example.com -d '*.example.com'
​```

### Update Kubernetes Secret
​```bash
kubectl create secret tls example-tls \
  --cert=/etc/letsencrypt/live/example.com/fullchain.pem \
  --key=/etc/letsencrypt/live/example.com/privkey.pem \
  --dry-run=client -o yaml | kubectl apply -f -
​```

### Restart Ingress
​```bash
kubectl rollout restart deployment/ingress-nginx -n ingress
​```

## Verify
​```bash
curl -vI https://example.com 2>&1 | grep "expire date"
​```

5. Scaling Runbook

# Horizontal Scaling

## When to Scale
- CPU usage > 80% for 5 minutes
- Response latency > 2 seconds
- Queue depth > 1000 messages

## Scale Up Procedure

### Check Current State
​```bash
kubectl get hpa -n production
kubectl top pods -n production
​```

### Manual Scale
​```bash
kubectl scale deployment/api --replicas=10 -n production
​```

### Verify
​```bash
kubectl get pods -n production -w
​```

## Scale Down
​```bash
kubectl scale deployment/api --replicas=3 -n production
​```
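The "When to Scale" thresholds can be encoded as a single check. This sketch assumes the three metrics have already been collected and averaged over the relevant window by your monitoring system (the 5-minute CPU window is not measured here), and that latency is passed in milliseconds because shell arithmetic is integer-only. `should_scale_up` is a hypothetical name.

```bash
# Sketch: return success if any scale-up threshold from the runbook is exceeded.
# Usage: should_scale_up <cpu_percent> <latency_ms> <queue_depth>
should_scale_up() {
  local cpu="$1" latency_ms="$2" queue="$3"
  [ "$cpu" -gt 80 ] || [ "$latency_ms" -gt 2000 ] || [ "$queue" -gt 1000 ]
}

# Example wiring (commented out):
# should_scale_up 85 400 120 && kubectl scale deployment/api --replicas=10 -n production
```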

6. Log Investigation Runbook

# Log Investigation

## Find Error Logs
​```bash
kubectl logs deployment/api -n production --since=1h | grep -i error
​```

## Search by Request ID
​```bash
kubectl logs deployment/api -n production --all-containers | grep "req-abc123"
​```

## Aggregate Error Counts
​```bash
kubectl logs deployment/api -n production --since=1h | grep -i error | sort | uniq -c | sort -rn | head -10
​```

## Stream Live Logs
​```bash
kubectl logs -f deployment/api -n production
​```
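If log lines begin with a timestamp, `sort | uniq -c` treats every line as unique and the aggregate counts all come out as 1. The sketch below strips a leading timestamp before counting. It assumes ISO 8601 style prefixes such as `2024-01-01T10:00:00Z`; adjust the pattern for your log format. `top_errors` is a hypothetical helper name.

```bash
# Sketch: count error messages after removing a leading timestamp.
# Usage: kubectl logs deployment/api -n production --since=1h | top_errors [N]
top_errors() {
  local limit="${1:-10}"
  grep -i error \
    | sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9:.]+Z?[[:space:]]+//' \
    | sort | uniq -c | sort -rn | head -n "$limit"
}
```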

7. Cache Clear Runbook

# Clear Redis Cache

## Full Cache Clear
​```bash
redis-cli -h $REDIS_HOST FLUSHDB
​```

## Selective Clear
​```bash
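# Clear user session cache (--scan iterates incrementally; KEYS blocks the server)
redis-cli -h "$REDIS_HOST" --scan --pattern "session:*" | xargs redis-cli -h "$REDIS_HOST" DEL

# Clear API response cache (DEL must target the same host as the scan)
redis-cli -h "$REDIS_HOST" --scan --pattern "api:cache:*" | xargs redis-cli -h "$REDIS_HOST" DEL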
​```

## Verify
​```bash
redis-cli -h $REDIS_HOST DBSIZE
​```

8. DNS Update Runbook

# DNS Update

## Pre-Change
​```bash
# Record current state
dig +short api.example.com
​```

## Update DNS
​```bash
aws route53 change-resource-record-sets \
  --hosted-zone-id $ZONE_ID \
  --change-batch file://dns-change.json
​```

## Verify Propagation
​```bash
# Check multiple DNS servers
dig +short api.example.com @8.8.8.8
dig +short api.example.com @1.1.1.1
​```

## Rollback
Keep the address recorded in the Pre-Change step. To roll back, submit a change batch that restores that value.
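
The update step references `dns-change.json` without showing its contents. A minimal sketch of generating one follows, assuming a single A record and an `UPSERT` action; the function name `make_change_batch`, the default TTL of 300, and the example address are illustrative assumptions.

```bash
# Sketch: print a Route 53 change batch for a single A record.
# Usage: make_change_batch <record_name> <ipv4_address> [ttl_seconds]
make_change_batch() {
  local name="$1" ip="$2" ttl="${3:-300}"
  cat <<EOF
{
  "Comment": "Point ${name} at ${ip}",
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "${name}",
        "Type": "A",
        "TTL": ${ttl},
        "ResourceRecords": [{ "Value": "${ip}" }]
      }
    }
  ]
}
EOF
}

# Example wiring (commented out):
# make_change_batch api.example.com 203.0.113.10 > dns-change.json
```

Running the same helper with the previously recorded address produces the rollback batch.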

9. Secret Rotation Runbook

# Secret Rotation

## Generate New Secret
​```bash
NEW_SECRET=$(openssl rand -base64 32)
echo "New secret generated (not displayed for security)"
​```

## Update in Vault
​```bash
vault kv put secret/api/database password="$NEW_SECRET"
​```

## Restart Application
​```bash
kubectl rollout restart deployment/api -n production
​```

## Verify
​```bash
kubectl logs deployment/api -n production | grep -i "database connected"
​```

10. Health Check Runbook

# System Health Check

## Application Health
​```bash
curl -s https://api.example.com/health | jq .
​```

## Database Health
​```bash
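# Name the database explicitly; psql otherwise defaults to one named after the user
psql -h "$DB_HOST" -U "$DB_USER" -d production -c "SELECT 1;"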
​```

## Cache Health
​```bash
redis-cli -h $REDIS_HOST PING
​```

## Queue Health
​```bash
rabbitmqctl list_queues name messages consumers
​```

## All Checks Summary
Document results and note any anomalies.
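
To make the summary step concrete, the individual checks can be wrapped in a helper that records pass or fail for each one and prints a tally at the end. This is a sketch, not a definitive implementation: `run_check`, `summary`, and the counters are hypothetical names, and each check is assumed to signal failure through a non-zero exit code.

```bash
# Sketch: run each health check, record the result, and print a tally.
PASSED=0
FAILED=0

# Usage: run_check <label> <command> [args...]
run_check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS  ${label}"
    PASSED=$((PASSED + 1))
  else
    echo "FAIL  ${label}"
    FAILED=$((FAILED + 1))
  fi
}

# Prints the tally and returns non-zero if any check failed.
summary() {
  echo "${PASSED} passed, ${FAILED} failed"
  [ "$FAILED" -eq 0 ]
}

# Example wiring (commented out; hosts come from the runbook's variables):
# run_check "api"   curl -fsS https://api.example.com/health
# run_check "db"    psql -h "$DB_HOST" -U "$DB_USER" -c "SELECT 1;"
# run_check "cache" redis-cli -h "$REDIS_HOST" PING
# summary
```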

Making Runbook Examples Executable

These runbook examples work as documentation. But documentation you can’t execute is documentation you don’t trust. Learn how to write a runbook that your team will actually use.

Stew transforms these runbook examples into interactive procedures. Run each command with a click. Track progress automatically. Never wonder which step you’re on.

Join the waitlist and make your runbooks executable.