PagerPal
Receive alerts, open incidents, notify the responder, escalate when nobody answers, and stop paging when the incident is owned or resolved.
PagerPal is a lightweight, self-hosted incident response tool for small teams. It receives alerts from monitoring systems, opens or updates incidents, notifies the current responder, escalates when nobody acknowledges, and stops paging when the incident is acknowledged or resolved.
PagerPal v1 is designed as a single-node appliance: one FastAPI application process, one database, and in-process background workers for notification retry and automatic escalation.
What PagerPal Connects
PagerPal connects four operational surfaces:
- Monitoring systems send generic, Grafana, or CloudWatch/SNS webhook alerts.
- PagerPal validates alert source keys, deduplicates incidents, stores timeline events, and queues notification work.
- Notification providers deliver SMS, WhatsApp, or email through Infobip and/or SMTP.
- Operators and responders use the web UI or management API to acknowledge, resolve, reopen, escalate, and retry notifications.
Core Workflow
- A monitoring system sends an alert webhook with `X-API-Key: <alert-source-key>`.
- PagerPal validates the alert source key.
- PagerPal creates a new incident or updates the matching open incident.
- PagerPal finds the current responder from team schedules and escalation policy configuration.
- PagerPal queues SMS, WhatsApp, and/or email notifications.
- The retry worker dispatches queued notifications when sending is enabled.
- The escalation worker advances unacknowledged incidents according to policy delays.
- Operators acknowledge, resolve, reopen, manually escalate, or retry delivery from the UI/API.
- Recovery webhooks can resolve matching open incidents and stop the paging loop.
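The create-or-update and recovery steps above can be sketched with a small in-memory model. This is only an illustration: it assumes incidents are matched on `(team_id, external_id)` and that any unresolved incident counts as a match, neither of which is confirmed here. The `Incident` and `IncidentStore` names are hypothetical and are not PagerPal's own classes.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    team_id: int
    external_id: str
    title: str
    status: str = "open"  # assumed states: open, acknowledged, resolved
    timeline: list = field(default_factory=list)


class IncidentStore:
    """Toy in-memory store; PagerPal itself persists incidents in a database."""

    def __init__(self):
        self._incidents = []

    def _find_active(self, team_id, external_id):
        # Assumption: the dedup key is (team_id, external_id).
        for incident in self._incidents:
            same_key = (incident.team_id, incident.external_id) == (team_id, external_id)
            if same_key and incident.status != "resolved":
                return incident
        return None

    def ingest_alert(self, team_id, external_id, title):
        """Create a new incident, or update the matching unresolved one."""
        incident = self._find_active(team_id, external_id)
        if incident is None:
            incident = Incident(team_id, external_id, title)
            incident.timeline.append("created")
            self._incidents.append(incident)
            return incident, True
        incident.title = title
        incident.timeline.append("alert_repeated")
        return incident, False

    def ingest_recovery(self, team_id, external_id):
        """Resolve the matching unresolved incident, if there is one."""
        incident = self._find_active(team_id, external_id)
        if incident is not None:
            incident.status = "resolved"
            incident.timeline.append("resolved_by_recovery")
        return incident
```

In this model a repeated alert with the same key adds a timeline entry instead of opening a second incident, and an alert that arrives after a recovery opens a fresh incident.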
Customer Documentation
- Architecture - connection flow, security boundaries, deployment shape, and v1 limits.
- Operator guide - day-to-day operation, safe mode, incident actions, worker checks, and troubleshooting.
- API reference - authentication model, webhook examples, incident actions, and route inventory.
- Deployment notes - single-node Compose/AWS setup, database guidance, backups, and upgrade order.
- Domain glossary - customer-facing terms used across PagerPal.
Quick Start With Docker Compose
Docker Compose uses safe local defaults if no .env file is present. Outbound notification sending is disabled by default.
docker compose up --build
Check the health endpoint:
curl -f http://127.0.0.1:8000/health
Open the UI:
http://127.0.0.1:8000/dashboard
If this is a fresh database, /login prompts for the first admin account. Existing deployments should sign in with a login-enabled User Account.
Safe Local Mode
Use safe mode when testing UI workflows or screenshots. It disables real outbound notifications and background retry/escalation jobs:
NOTIFICATION_SENDING_ENABLED=false \
NOTIFICATION_RETRY_WORKER_ENABLED=false \
ESCALATION_WORKER_ENABLED=false \
uvicorn app.main:app --host 127.0.0.1 --port 8000
Seed demo data only in local/demo environments:
python seed_data.py
If the app is running on a different local port:
PAGERPAL_BASE_URL=http://127.0.0.1:8001 python seed_data.py
Production Setup Checklist
Before using PagerPal for real paging:
- Copy `.env.example` to `.env` and replace every placeholder.
- Set a strong `SECRET_KEY`.
- Set `SESSION_COOKIE_SECURE=true` for HTTPS production.
- Set `PAGERPAL_BASE_URL` to the HTTPS URL operators use.
- Set `ALLOWED_ORIGINS` to explicit production origins.
- Configure at least one notification provider: Infobip `INFOBIP_*` for SMS/WhatsApp, or `SMTP_*` for email.
- Keep `NOTIFICATION_SENDING_ENABLED=false` until a controlled test alert is verified.
- Run `python -m alembic upgrade head`.
- Create the first admin with `python scripts/create_admin.py --email <admin-email>` or the first-run `/login` bootstrap screen.
- Confirm `/health` returns OK.
- Confirm `/settings` reports expected notification provider and worker state.
- Confirm exactly one app process has retry/escalation workers enabled.
- Configure database backups.
- Put HTTPS in front of PagerPal before exposing it to operators.
- Send a controlled test alert from `/alert-sources` and verify incident creation, notification logs, acknowledgement, and resolution.
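Several of the checklist items above can be verified mechanically before enabling real paging. The following is a minimal sketch, not a script shipped with PagerPal: `preflight_problems` is a hypothetical helper, it assumes boolean settings are written as the literal string `true`, and it assumes a provider counts as configured when the variables shown are non-empty. It does not judge whether `SECRET_KEY` is strong.

```python
def preflight_problems(env):
    """Return a list of human-readable problems found in an env mapping."""
    problems = []

    def is_true(name):
        # Assumption: booleans are spelled "true" (case-insensitive).
        return env.get(name, "").strip().lower() == "true"

    if not env.get("SECRET_KEY"):
        problems.append("SECRET_KEY is not set")
    if not is_true("SESSION_COOKIE_SECURE"):
        problems.append("SESSION_COOKIE_SECURE should be true for HTTPS production")
    if not env.get("PAGERPAL_BASE_URL", "").startswith("https://"):
        problems.append("PAGERPAL_BASE_URL should be an HTTPS URL")

    origins = [o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()]
    if not origins or "*" in origins:
        problems.append("ALLOWED_ORIGINS should list explicit production origins")

    # Assumption: these variable pairs are enough to call a provider configured.
    has_infobip = bool(env.get("INFOBIP_BASE_URL") and env.get("INFOBIP_API_KEY"))
    has_smtp = bool(env.get("SMTP_HOST") and env.get("SMTP_FROM"))
    if not (has_infobip or has_smtp):
        problems.append("no notification provider configured (INFOBIP_* or SMTP_*)")

    return problems
```

Calling `preflight_problems(dict(os.environ))` after `import os` checks the current process environment; an empty list means none of these particular checks failed, and the remaining checklist items still need manual confirmation.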
Configuration
For deployment, copy the example environment file and replace all placeholders:
cp .env.example .env
$EDITOR .env
Important variables:
| Variable | Purpose | Notes |
|---|---|---|
| `PAGERPAL_PORT` | Host port used by Docker Compose | Defaults to `8000`. |
| `ENVIRONMENT` | Runtime mode | Use `development` locally; use `production` for hosted deployments. |
| `DATABASE_URL` | SQLAlchemy database URL | SQLite works for local/small v1. PostgreSQL is recommended for production. |
| `PAGERPAL_BASE_URL` | Public base URL used in notification links | Set to the HTTPS URL operators use to open PagerPal. |
| `ALLOWED_ORIGINS` | Comma-separated CORS allowlist | Use `*` locally only; set explicit origins in production. |
| `SESSION_COOKIE_SECURE` | Marks session cookies Secure | Use `true` for production HTTPS deployments. |
| `NOTIFICATION_SENDING_ENABLED` | Enables/disables real outbound SMS, WhatsApp, and email notifications | Keep `false` until at least one provider is configured and tested. |
| `INFOBIP_BASE_URL` | Infobip API base URL | Must include `https://`. |
| `INFOBIP_API_KEY` | Infobip API key | Secret value. Do not commit it. |
| `INFOBIP_SMS_SENDER` | SMS sender name/number | Depends on Infobip account setup. |
| `INFOBIP_WHATSAPP_SENDER` | WhatsApp sender | Depends on Infobip account setup. |
| `INFOBIP_RECEIPT_TOKEN` | Optional token for Infobip delivery receipt webhooks | Required in production if the receipt endpoint is exposed. |
| `SMTP_HOST` | SMTP server for email notifications | Required only for email paging. |
| `SMTP_PORT` | SMTP port | Defaults to `587`; use `465` for SMTP over SSL. |
| `SMTP_USER` | SMTP username | Optional if the relay does not require auth. |
| `SMTP_PASSWORD` | SMTP password | Secret value. Do not commit it. |
| `SMTP_FROM` | Email sender address | Required only for email paging. |
| `SECRET_KEY` | UI/form security secret | Replace before non-local deployment. |
| `NOTIFICATION_RETRY_WORKER_ENABLED` | Starts retry background job | Enable in exactly one app process. |
| `ESCALATION_WORKER_ENABLED` | Starts automatic escalation job | Enable in exactly one app process. |
Do not commit .env or any real credentials.
Webhook Credentials
All alert ingestion endpoints require an active alert source API key. Create or regenerate alert sources in the UI under /alert-sources; PagerPal shows the full key only immediately after creation or regeneration. Default list/get responses show only masked_api_key.
Supported credential locations:
- `X-API-Key: <alert-source-key>`
- `X-Alert-Source-Key: <alert-source-key>`
- `Authorization: Bearer <alert-source-key>`
Prefer headers. Avoid query-string credentials because they can appear in logs, screenshots, browser history, and proxy records.
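To make the three header locations concrete, here is one way a receiver could read them. This is a sketch rather than PagerPal's actual handler: `extract_alert_source_key` is a hypothetical function, and the order in which the headers are checked is an assumption, since no precedence is specified above.

```python
def extract_alert_source_key(headers):
    """Return the alert source key from request headers, or None if absent."""
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # Assumed precedence: X-API-Key, then X-Alert-Source-Key, then Bearer.
    for name in ("x-api-key", "x-alert-source-key"):
        value = lowered.get(name, "").strip()
        if value:
            return value

    authorization = lowered.get("authorization", "").strip()
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()

    return None
```

Only the `Bearer` scheme is accepted from `Authorization`; any other scheme is treated as if no key were supplied.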
Generic Alert Example
curl -X POST http://127.0.0.1:8000/api/v1/alerts \
-H 'Content-Type: application/json' \
-H 'X-API-Key: <alert-source-key>' \
-d '{
"title": "CPU > 95% on prod-api-03",
"description": "CPU has been above 95% for 10 minutes",
"severity": "critical",
"team_id": 1,
"external_id": "prod-api-03-cpu-high"
}'
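The same request can be built from Python using only the standard library. This is a minimal sketch: `build_alert_request` is a hypothetical helper name, while the endpoint path, headers, and payload fields are copied from the curl example above. The request is constructed but not sent, so the snippet runs without a live PagerPal instance.

```python
import json
import urllib.request


def build_alert_request(base_url, alert_source_key, alert):
    """Build (but do not send) a POST request for the generic alert endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/alerts",
        data=json.dumps(alert).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-Key": alert_source_key,
        },
        method="POST",
    )


alert = {
    "title": "CPU > 95% on prod-api-03",
    "description": "CPU has been above 95% for 10 minutes",
    "severity": "critical",
    "team_id": 1,
    "external_id": "prod-api-03-cpu-high",
}
request = build_alert_request("http://127.0.0.1:8000", "<alert-source-key>", alert)

# To send it against a running instance, uncomment:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status)
```

Replace `<alert-source-key>` with a key created under /alert-sources before sending.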
Deployment Shape
The intended low-cost v1 hosted path is one small VM or container host running Docker Engine and Docker Compose:
- Provision one host with durable storage.
- Install Docker and the Docker Compose plugin.
- Clone PagerPal on the host.
- Copy `.env.example` to `.env` and replace every placeholder.
- Start PagerPal with `docker compose up -d --build`.
- Confirm `curl -f http://127.0.0.1:8000/health` succeeds on the host.
- Put a TLS-terminating reverse proxy or load balancer in front of port `8000` before exposing PagerPal publicly.
For more detail, see Deployment notes.
Important v1 Limitation
APScheduler runs inside the FastAPI process. Run exactly one app process with workers enabled:
uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 1
Do not run multiple worker-enabled containers, multiple Uvicorn workers, or autoscaled app instances until scheduler coordination is added. Duplicate schedulers can duplicate notification retries and escalation attempts.
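A toy model shows why uncoordinated schedulers duplicate work. It is not PagerPal code: `run_tick` and `simulate` are hypothetical, and the model assumes each scheduler reads the pending queue before any other scheduler has marked a notification as sent.

```python
def run_tick(pending_snapshot, worker_name, deliveries):
    """One retry tick: dispatch every notification this worker saw as pending."""
    for notification_id in pending_snapshot:
        deliveries.append((worker_name, notification_id))


def simulate(worker_names, pending_ids):
    """Run one tick per scheduler, each with its own view of the queue."""
    deliveries = []
    # Every scheduler takes its snapshot before any of them records a send,
    # which is the race that coordination would have to prevent.
    snapshots = {name: list(pending_ids) for name in worker_names}
    for name in worker_names:
        run_tick(snapshots[name], name, deliveries)
    return deliveries
```

With one scheduler each pending notification is dispatched once; with two schedulers in this model each one is dispatched twice, so a responder would be paged twice for the same event.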
Known Limits
- v1 is a single-node design and is not yet highly available.
- Background workers stop when the app process stops.
- Multi-process and multi-node deployments need scheduler coordination before production use.
- SQLite is acceptable for local/demo use; PostgreSQL is recommended for durable hosted deployments.
- SSO/SAML is not part of v1; authentication is local user accounts with role-based access.