
Incident Management

Incidents communicate service disruptions to your subscribers. A well-managed incident can turn a trust-breaking event into a demonstration of transparency and reliability.

Creating an incident

Manual

  1. Go to your status page → "New Incident"

  2. Fill in:

    • Title: Short, clear description (e.g. API experiencing elevated error rates)
    • Status: Investigating / Identified / Monitoring / Resolved
    • Affected components: Select which services are impacted
    • Severity: Minor / Major / Critical
    • Update: First public update
  3. Click "Create Incident" — subscribers are notified immediately

Automatic (from monitor alert)

Configure monitors to automatically create incidents when they fail:

monitor:
  name: Production API
  alert_policy:
    create_status_page_incident: true
    status_page_id: sp_abc123
    affected_component: comp_api
    incident_severity: major
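
To make the mapping concrete, here is a minimal sketch of how an alert policy like the one above could be turned into an incident when a monitor fails. The function name `incident_from_alert` and the generated title are hypothetical; the payload field names are borrowed from the request body shown under "Via the API", and the actual behaviour of the service may differ.

```python
from typing import Optional


def incident_from_alert(monitor: dict, failed: bool) -> Optional[dict]:
    """Return an incident payload for a failed monitor, or None if no incident is due."""
    policy = monitor.get("alert_policy", {})
    # Only create an incident when the monitor failed AND the policy opts in.
    if not failed or not policy.get("create_status_page_incident", False):
        return None
    return {
        "title": f"{monitor['name']} is failing",  # hypothetical title format
        "status": "investigating",                 # first stage of the lifecycle
        "status_page_id": policy["status_page_id"],
        "component_ids": [policy["affected_component"]],
        "severity": policy["incident_severity"],
    }


# The same settings as the YAML configuration, expressed as a dict.
monitor = {
    "name": "Production API",
    "alert_policy": {
        "create_status_page_incident": True,
        "status_page_id": "sp_abc123",
        "affected_component": "comp_api",
        "incident_severity": "major",
    },
}

payload = incident_from_alert(monitor, failed=True)
print(payload)
```

A healthy monitor (`failed=False`), or a policy without `create_status_page_incident: true`, yields `None` in this sketch, so no incident would be created.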

Incident lifecycle

Investigating  →  Identified  →  Monitoring  →  Resolved
      ↑                                            ↑
   Created                                     All clear

Post an update at each stage. Subscribers are notified for every update.
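
The lifecycle can be modelled as a small state machine. The sketch below is illustrative only: the `Incident` class is hypothetical, and it assumes that an incident never moves back to an earlier stage, that stages may be skipped, and that several updates may be posted within the same stage. The product may enforce different rules.

```python
from dataclasses import dataclass, field

# Stages in the order shown in the lifecycle diagram.
STAGES = ("investigating", "identified", "monitoring", "resolved")


@dataclass
class Incident:
    title: str
    status: str = "investigating"  # an incident starts here when created
    updates: list = field(default_factory=list)

    def post_update(self, status: str, message: str) -> None:
        """Record an update and move the incident to the given stage."""
        if status not in STAGES:
            raise ValueError(f"unknown status: {status!r}")
        # Assumption: an incident never moves back to an earlier stage.
        if STAGES.index(status) < STAGES.index(self.status):
            raise ValueError(f"cannot go from {self.status} back to {status}")
        self.status = status
        self.updates.append((status, message))

    @property
    def is_resolved(self) -> bool:
        return self.status == "resolved"


incident = Incident("API experiencing elevated error rates")
incident.post_update("identified", "Root cause identified. Fix in progress.")
incident.post_update("monitoring", "The fix has been deployed.")
incident.post_update("resolved", "The incident is fully resolved.")
print(incident.status, len(incident.updates))
```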

Writing good updates

| Stage         | Example update |
| ------------- | -------------- |
| Investigating | We are aware of elevated error rates on the API and are investigating. |
| Identified    | The issue has been identified as a misconfigured load balancer. A fix is being deployed. |
| Monitoring    | The fix has been deployed. We are monitoring to confirm stability. |
| Resolved      | The incident is fully resolved. All services are operational. We will publish a post-mortem within 48 hours. |

Scheduling future maintenance

Post a scheduled maintenance notice in advance to set expectations:

  1. Status Page → New Incident → Scheduled Maintenance
  2. Set start and end time
  3. Write the maintenance notice

Subscribers receive a notification when you post it, and again when maintenance begins.
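
As a rough illustration of the two notifications described above, the following sketch validates a maintenance window and returns the moments at which subscribers would be notified. The function `notification_times` and its validation rules (the end must follow the start, and the start must be later than the posting time) are assumptions made for this example, not documented behaviour.

```python
from datetime import datetime, timedelta, timezone


def notification_times(posted_at: datetime, start: datetime, end: datetime) -> list:
    """Return when subscribers are notified: at posting and when maintenance begins."""
    if end <= start:
        raise ValueError("end time must be after start time")
    # Assumption: a maintenance notice is always posted before the window opens.
    if start <= posted_at:
        raise ValueError("maintenance must start after the notice is posted")
    return [posted_at, start]


posted = datetime(2025, 1, 6, 9, 0, tzinfo=timezone.utc)
start = datetime(2025, 1, 8, 2, 0, tzinfo=timezone.utc)
end = start + timedelta(hours=2)

for when in notification_times(posted, start, end):
    print(when.isoformat())
```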


Post-mortems

After resolving a major incident, publish a post-mortem:

  1. Open the resolved incident → "Post-Mortem"

  2. Fill in the template:

    • Timeline — What happened, when
    • Root cause — What caused it
    • Impact — How many users affected, for how long
    • Resolution — What fixed it
    • Action items — What will prevent recurrence
  3. Publish — subscribers see a link in the resolution notification
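
If you draft post-mortems outside the status page, a small helper can check that all five template sections are filled in before you publish. This is a sketch only: `render_post_mortem` is a hypothetical helper, the plain-text layout is an arbitrary choice, and the sample content is placeholder text rather than a real incident.

```python
# Section names taken from the post-mortem template.
SECTIONS = ("Timeline", "Root cause", "Impact", "Resolution", "Action items")


def render_post_mortem(title: str, sections: dict) -> str:
    """Render a plain-text post-mortem, refusing to render if a section is empty."""
    missing = [name for name in SECTIONS if not sections.get(name, "").strip()]
    if missing:
        raise ValueError("missing sections: " + ", ".join(missing))
    lines = [f"Post-mortem: {title}", ""]
    for name in SECTIONS:
        lines += [name, sections[name].strip(), ""]
    return "\n".join(lines).rstrip() + "\n"


draft = {
    "Timeline": "(what happened, when)",
    "Root cause": "(what caused it)",
    "Impact": "(how many users were affected, for how long)",
    "Resolution": "(what fixed it)",
    "Action items": "(what will prevent recurrence)",
}
print(render_post_mortem("API experiencing elevated error rates", draft))
```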


Via the API

# Create an incident
curl -X POST https://api.alertifypro.com/v1/incidents \
  -H "Authorization: Bearer WK_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "title": "API experiencing elevated error rates",
    "status": "investigating",
    "status_page_id": "sp_abc123",
    "component_ids": ["comp_api"],
    "severity": "major"
  }'

# Post an update (inc_abc123 is the incident ID)
curl -X POST https://api.alertifypro.com/v1/incidents/inc_abc123/updates \
  -H "Authorization: Bearer WK_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "status": "identified",
    "message": "Root cause identified. Fix in progress."
  }'

# Resolve
curl -X PATCH https://api.alertifypro.com/v1/incidents/inc_abc123 \
  -H "Authorization: Bearer WK_YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "status": "resolved" }'
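
The same three calls can be scripted. The sketch below uses only the Python standard library and builds the requests without sending them. It assumes that every endpoint accepts the same bearer token and a JSON body with a `Content-Type: application/json` header; `build_request` is a hypothetical helper, and `inc_abc123` is the placeholder incident ID from the curl examples.

```python
import json
import urllib.request

BASE_URL = "https://api.alertifypro.com/v1"


def build_request(method: str, path: str, body: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON request."""
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: the API expects JSON
        },
    )


API_KEY = "WK_YOUR_API_KEY"

create = build_request("POST", "/incidents", {
    "title": "API experiencing elevated error rates",
    "status": "investigating",
    "status_page_id": "sp_abc123",
    "component_ids": ["comp_api"],
    "severity": "major",
}, API_KEY)

update = build_request("POST", "/incidents/inc_abc123/updates", {
    "status": "identified",
    "message": "Root cause identified. Fix in progress.",
}, API_KEY)

resolve = build_request("PATCH", "/incidents/inc_abc123", {"status": "resolved"}, API_KEY)

for req in (create, update, resolve):
    print(req.get_method(), req.full_url)
```

To send a request, pass it to `urllib.request.urlopen(req)` and handle `urllib.error.HTTPError` for non-2xx responses. The response format is not described here, so the sketch does not parse it.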