Heartbeat (CRON) Monitors
Monitor scheduled jobs, cron tasks, and background workers by pinging a heartbeat URL.
Overview
Heartbeat monitors work in reverse: your scheduled tasks ping StatusApp when they run. If StatusApp doesn't receive a ping within the grace period, it triggers an alert. This makes them well suited to monitoring backups, data syncs, cron jobs, and other scheduled tasks.
How It Works
Normal Monitoring (StatusApp pings your service):
- StatusApp → Checks your endpoint at intervals
- Your service responds
- StatusApp alerts if no response
Heartbeat Monitoring (Your service pings StatusApp):
- You create a CRON monitor → receive unique heartbeat URL
- Your scheduled job runs and pings the URL
- If StatusApp doesn't receive a ping within the grace period → incident created
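The timeout check in the last step can be modeled in a few lines. This is an illustrative Python sketch, not StatusApp's actual implementation: the function name `is_overdue` is hypothetical, and it assumes the grace period is measured from the most recent ping.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_overdue(last_ping: Optional[datetime], grace_period_s: int, now: datetime) -> bool:
    """Return True when no ping has arrived within the grace period.

    Assumption: the grace period counts from the most recent ping, and a
    monitor that has never been pinged is treated as not overdue.
    """
    if last_ping is None:
        return False
    return now - last_ping > timedelta(seconds=grace_period_s)


# Example: last ping at 02:00 UTC with a one-hour grace period
last = datetime(2024, 1, 1, 2, 0, tzinfo=timezone.utc)
on_time = is_overdue(last, 3600, last + timedelta(minutes=30))
late = is_overdue(last, 3600, last + timedelta(hours=2))
```

Here `on_time` is `False` and `late` is `True`, which corresponds to the point at which an incident would be created.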
Creating a Heartbeat Monitor
Via Dashboard
- Go to Monitors > Create Monitor
- Select CRON as the monitor type
- Enter a descriptive name (e.g., "Nightly Database Backup")
- Set the Grace Period (60 seconds to 24 hours)
- Configure notification channels
- Click Create Monitor
- Copy the generated heartbeat URL
Via API
curl -X POST https://ops.statusapp.io/api/v1/monitors \
  -H "X-API-Key: your_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Daily Backup Job",
    "url": "https://placeholder.local",
    "type": "CRON",
    "gracePeriod": 3600,
    "notifications": [
      {
        "type": "EMAIL",
        "enabled": true,
        "config": { "emails": ["alerts@example.com"] }
      }
    ]
  }'
The response includes a unique heartbeatUrl that you'll use to ping StatusApp.
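As a rough illustration, the same request body can be built in Python before sending it with any HTTP client. The helper name `build_cron_monitor_payload` is hypothetical. The field names come from the curl example above, and the 60 to 86400 second bounds are the grace period range listed in the dashboard steps.

```python
import json
from typing import List

MIN_GRACE_S = 60      # 60 seconds, the documented minimum
MAX_GRACE_S = 86400   # 24 hours, the documented maximum


def build_cron_monitor_payload(name: str, grace_period_s: int, emails: List[str]) -> str:
    """Return the JSON body for creating a CRON monitor (hypothetical helper)."""
    if not MIN_GRACE_S <= grace_period_s <= MAX_GRACE_S:
        raise ValueError(
            f"gracePeriod must be between {MIN_GRACE_S} and {MAX_GRACE_S} seconds"
        )
    payload = {
        "name": name,
        "url": "https://placeholder.local",  # placeholder value from the curl example
        "type": "CRON",
        "gracePeriod": grace_period_s,
        "notifications": [
            {"type": "EMAIL", "enabled": True, "config": {"emails": emails}}
        ],
    }
    return json.dumps(payload)
```

Validating the grace period locally gives a clearer error than waiting for the API to reject the request.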
Configuration
Grace Period
The grace period defines how long StatusApp waits for a ping before triggering an alert.
| Setting | Range |
|---|---|
| Minimum | 60 seconds (1 minute) |
| Maximum | 86400 seconds (24 hours) |
| Default | 300 seconds (5 minutes) |
Recommended Grace Periods by Job Frequency:
| Job Frequency | Grace Period |
|---|---|
| Every minute | 2-5 minutes |
| Every 5 minutes | 10-15 minutes |
| Hourly | 1-2 hours |
| Every 6 hours | 7-8 hours |
| Daily | 2-6 hours |
| Weekly | 24 hours (the maximum) |
Tip: Set the grace period to 150-200% of your job's expected duration to account for occasional slow runs.
Heartbeat API Endpoints
All heartbeat endpoints are available at:
https://ops.statusapp.io/api/v1/heartbeat/{heartbeatId}
The heartbeatId is extracted from your unique heartbeat URL.
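If a script needs the ID on its own, one way to pull it out is shown below. This sketch assumes the ID is the final path segment, which matches the URL pattern above; the function name `heartbeat_id_from_url` is hypothetical.

```python
from urllib.parse import urlsplit


def heartbeat_id_from_url(heartbeat_url: str) -> str:
    """Return the last path segment, assumed here to be the heartbeatId."""
    path = urlsplit(heartbeat_url).path.rstrip("/")
    return path.rsplit("/", 1)[-1]


heartbeat_id = heartbeat_id_from_url(
    "https://ops.statusapp.io/api/v1/heartbeat/abc123"
)
```

With the example URL, `heartbeat_id` is `"abc123"`.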
Success Signal
Report successful job completion. This is the default ping.
GET /api/v1/heartbeat/{heartbeatId}
POST /api/v1/heartbeat/{heartbeatId}
HEAD /api/v1/heartbeat/{heartbeatId}
All three methods record a success ping. Choose whichever is most convenient for your environment; only POST can carry the optional message described below.
Optional Message (POST only, max 10KB):
curl -X POST https://ops.statusapp.io/api/v1/heartbeat/{id} \
  -H "Content-Type: application/json" \
  -d '{"message": "Processed 1,234 records in 45 seconds"}'
Response:
{
"message": "OK"
}
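Because the optional message is capped at 10KB, a job that reports variable-length output may want to trim it first. The following is a minimal sketch: the helper name `truncate_message` is hypothetical, and it assumes "10KB" means 10,240 bytes of UTF-8 applied to the message text itself.

```python
MAX_MESSAGE_BYTES = 10 * 1024  # assumption: "10KB" means 10,240 bytes


def truncate_message(message: str, limit: int = MAX_MESSAGE_BYTES) -> str:
    """Trim a message so its UTF-8 encoding fits within the limit."""
    encoded = message.encode("utf-8")
    if len(encoded) <= limit:
        return message
    # errors="ignore" drops a multi-byte character that was cut in half
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Truncating on bytes rather than characters matters when the message contains non-ASCII text.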
Start Signal
Indicate that a long-running job has started. Useful for tracking job duration and distinguishing between "job didn't start" vs "job is still running".
GET /api/v1/heartbeat/{heartbeatId}/start
POST /api/v1/heartbeat/{heartbeatId}/start
Fail Signal
Explicitly report job failure. This immediately triggers an incident.
GET /api/v1/heartbeat/{heartbeatId}/fail
POST /api/v1/heartbeat/{heartbeatId}/fail
With Error Message:
curl -X POST https://ops.statusapp.io/api/v1/heartbeat/{id}/fail \
-H "Content-Type: application/json" \
-d '{"error": "Database connection timeout after 30s"}'
Log Signal
Record a log message without changing monitor status. Useful for tracking progress in long-running jobs.
POST /api/v1/heartbeat/{heartbeatId}/log
Content-Type: application/json
Request Body:
{
"message": "Processing batch 5 of 10..."
}
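A progress log can be sent from Python with the standard library alone. The sketch below only builds the request so it can be inspected; the helper name `build_log_request` is hypothetical, while the `/log` path and `message` field come from the description above.

```python
import json
import urllib.request


def build_log_request(heartbeat_url: str, message: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the log signal."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        heartbeat_url.rstrip("/") + "/log",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_log_request(
    "https://ops.statusapp.io/api/v1/heartbeat/abc123",
    "Processing batch 5 of 10...",
)
```

Passing `req` to `urllib.request.urlopen(req, timeout=10)` would send it.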
Rate Limiting
Heartbeat endpoints have a rate limit of 5 pings per minute per monitor. This prevents accidental flooding if a script loops incorrectly.
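A job that pings inside a loop can guard itself with a small client-side throttle. This is a sketch under assumptions: the class name `PingThrottle` is hypothetical, and it uses a sliding 60-second window, which may not match how the server counts pings.

```python
import time
from collections import deque
from typing import Callable, Deque


class PingThrottle:
    """Allow at most max_pings calls per sliding window (client-side only)."""

    def __init__(self, max_pings: int = 5, window_s: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_pings = max_pings
        self.window_s = window_s
        self.clock = clock
        self._sent: Deque[float] = deque()

    def allow(self) -> bool:
        """Return True and record the ping if it fits in the window."""
        now = self.clock()
        # Forget pings that have aged out of the window
        while self._sent and now - self._sent[0] >= self.window_s:
            self._sent.popleft()
        if len(self._sent) >= self.max_pings:
            return False
        self._sent.append(now)
        return True
```

Call `throttle.allow()` before each ping and skip the request when it returns `False`. The injectable `clock` exists only to make the class easy to test.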
Integration Examples
Bash/Shell Script
Simple ping at end of job:
#!/bin/bash
# Run the job, then report success or failure based on its exit code
if /usr/local/bin/run-backup.sh; then
  curl -s https://ops.statusapp.io/api/v1/heartbeat/abc123
else
  curl -s https://ops.statusapp.io/api/v1/heartbeat/abc123/fail
fi
With start signal for long jobs:
#!/bin/bash
HEARTBEAT_URL="https://ops.statusapp.io/api/v1/heartbeat/abc123"

# Signal job start
curl -s "${HEARTBEAT_URL}/start"

# Run the job, capturing stderr for the failure report
if /usr/local/bin/backup.sh 2>/tmp/backup_error.log; then
  # Success
  curl -s "${HEARTBEAT_URL}"
else
  # Failure with error details: keep the first 1000 bytes, flatten newlines
  # and tabs, and escape backslashes and double quotes so the JSON stays valid
  ERROR=$(head -c 1000 /tmp/backup_error.log | tr '\n\t' '  ' | sed 's/\\/\\\\/g; s/"/\\"/g')
  curl -s -X POST "${HEARTBEAT_URL}/fail" \
    -H "Content-Type: application/json" \
    -d "{\"error\": \"$ERROR\"}"
  exit 1
fi
Crontab Entry
# Simple: backup job runs daily at 2 AM
0 2 * * * /path/to/backup.sh && curl -s https://ops.statusapp.io/api/v1/heartbeat/abc123
# With error handling (if/else avoids sending /fail when only the success ping fails)
0 2 * * * if /path/to/backup.sh; then curl -s https://ops.statusapp.io/api/v1/heartbeat/abc123; else curl -s https://ops.statusapp.io/api/v1/heartbeat/abc123/fail; fi
Node.js
const HEARTBEAT_URL = 'https://ops.statusapp.io/api/v1/heartbeat/abc123';

async function runScheduledJob() {
  // Signal start
  await fetch(`${HEARTBEAT_URL}/start`);

  try {
    // Your job logic
    const result = await performBackup();

    // Signal success with details
    await fetch(HEARTBEAT_URL, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ message: `Backed up ${result.fileCount} files` })
    });
  } catch (error) {
    // Signal failure with error
    await fetch(`${HEARTBEAT_URL}/fail`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ error: error.message })
    });
    throw error;
  }
}
Python
import requests

HEARTBEAT_URL = "https://ops.statusapp.io/api/v1/heartbeat/abc123"


def run_scheduled_job():
    # Signal start
    requests.get(f"{HEARTBEAT_URL}/start", timeout=10)

    try:
        # Your job logic
        result = perform_backup()

        # Signal success with message
        requests.post(
            HEARTBEAT_URL,
            json={"message": f"Backed up {result['size']} bytes"},
            timeout=10,
        )
    except Exception as e:
        # Signal failure with error
        requests.post(
            f"{HEARTBEAT_URL}/fail",
            json={"error": str(e)},
            timeout=10,
        )
        raise


if __name__ == "__main__":
    run_scheduled_job()
PHP
<?php
$heartbeatUrl = 'https://ops.statusapp.io/api/v1/heartbeat/abc123';

function pingHeartbeat($endpoint = '', $data = null) {
    global $heartbeatUrl;
    $url = $heartbeatUrl . $endpoint;

    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_TIMEOUT, 10);

    // Send a JSON POST when a payload is supplied; otherwise a plain GET
    if ($data) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
        curl_setopt($ch, CURLOPT_HTTPHEADER, ['Content-Type: application/json']);
    }

    $response = curl_exec($ch);
    curl_close($ch);
    return $response;
}

// Signal start
pingHeartbeat('/start');

try {
    // Your job logic
    runBackup();

    // Signal success
    pingHeartbeat('', ['message' => 'Backup completed']);
} catch (Exception $e) {
    // Signal failure
    pingHeartbeat('/fail', ['error' => $e->getMessage()]);
    throw $e;
}
Go
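package main

import (
	"bytes"
	"encoding/json"
	"net/http"
	"os"
	"time"
)

const heartbeatURL = "https://ops.statusapp.io/api/v1/heartbeat/abc123"

var client = &http.Client{Timeout: 10 * time.Second}

// pingHeartbeat sends a GET, or a JSON POST when data is provided.
func pingHeartbeat(endpoint string, data map[string]string) error {
	url := heartbeatURL + endpoint

	var resp *http.Response
	var err error
	if data != nil {
		body, marshalErr := json.Marshal(data)
		if marshalErr != nil {
			return marshalErr
		}
		resp, err = client.Post(url, "application/json", bytes.NewReader(body))
	} else {
		resp, err = client.Get(url)
	}
	if err != nil {
		return err
	}
	// Close the body so the underlying connection can be reused
	return resp.Body.Close()
}

// performBackup is a placeholder for your job logic.
func performBackup() error { return nil }

func runScheduledJob() error {
	// Signal start
	pingHeartbeat("/start", nil)

	if err := performBackup(); err != nil {
		// Signal failure
		pingHeartbeat("/fail", map[string]string{"error": err.Error()})
		return err
	}

	// Signal success
	pingHeartbeat("", map[string]string{"message": "Backup completed"})
	return nil
}

func main() {
	if err := runScheduledJob(); err != nil {
		os.Exit(1)
	}
}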
Docker/Container
#!/bin/sh
HEARTBEAT_URL="https://ops.statusapp.io/api/v1/heartbeat/abc123"

# Signal start
curl -s --max-time 10 "${HEARTBEAT_URL}/start" || true

# Run your job
if your_job_command; then
  curl -s --max-time 10 "${HEARTBEAT_URL}" || true
else
  curl -s --max-time 10 -X POST "${HEARTBEAT_URL}/fail" \
    -H "Content-Type: application/json" \
    -d '{"error": "Job failed"}' || true
  exit 1
fi
Kubernetes CronJob
apiVersion: batch/v1
kind: CronJob
metadata:
  name: backup-job
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              image: your-backup-image
              env:
                - name: HEARTBEAT_URL
                  valueFrom:
                    secretKeyRef:
                      name: monitoring-secrets
                      key: heartbeat-url
              command:
                - /bin/sh
                - -c
                - |
                  curl -s "${HEARTBEAT_URL}/start" || true
                  if /app/backup.sh; then
                    # Do not fail the pod if only the ping fails
                    curl -s "${HEARTBEAT_URL}" || true
                  else
                    curl -s -X POST "${HEARTBEAT_URL}/fail" \
                      -H "Content-Type: application/json" \
                      -d '{"error": "Backup failed"}'
                    exit 1
                  fi
          restartPolicy: OnFailure
AWS Lambda
import urllib.request
import json

HEARTBEAT_URL = 'https://ops.statusapp.io/api/v1/heartbeat/abc123'


def ping_heartbeat(endpoint='', data=None):
    url = f"{HEARTBEAT_URL}{endpoint}"
    if data:
        # A request with a body is sent as POST
        req = urllib.request.Request(
            url,
            data=json.dumps(data).encode(),
            headers={'Content-Type': 'application/json'}
        )
    else:
        req = urllib.request.Request(url)
    urllib.request.urlopen(req, timeout=10)


def lambda_handler(event, context):
    ping_heartbeat('/start')
    try:
        # Your Lambda logic
        result = process_data(event)
        ping_heartbeat('', {'message': f'Processed {len(result)} items'})
        return {'statusCode': 200, 'body': 'Success'}
    except Exception as e:
        ping_heartbeat('/fail', {'error': str(e)})
        raise
Best Practices
1. Use Start Signals for Long Jobs
For jobs that take more than a few seconds:
curl -s "${HEARTBEAT_URL}/start"
# ... long running job ...
curl -s "${HEARTBEAT_URL}"
This helps distinguish:
- Job didn't start (scheduler issue)
- Job started but is still running
- Job started but crashed mid-execution
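In Python, the start, success and fail signals fit naturally into a context manager. The sketch below is one possible shape, not an official client: the name `heartbeat` and the `ping(endpoint, data)` callback signature are hypothetical, chosen so the transport can be swapped or stubbed.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional


@contextmanager
def heartbeat(ping: Callable[[str, Optional[dict]], None]) -> Iterator[None]:
    """Send /start on entry, then success or /fail depending on the outcome."""
    ping("/start", None)
    try:
        yield
    except Exception as exc:
        # The job raised: report the failure and let the error propagate
        ping("/fail", {"error": str(exc)})
        raise
    else:
        ping("", None)


# Usage with a stub transport that just records the calls
calls = []
with heartbeat(lambda endpoint, data: calls.append((endpoint, data))):
    pass  # your long-running job goes here
```

After the block runs, `calls` holds the start signal followed by the success signal. In a real job the callback would issue the HTTP request.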
2. Include Error Details
When signaling failure, include context:
curl -s -X POST "${HEARTBEAT_URL}/fail" \
-H "Content-Type: application/json" \
-d "{\"error\": \"Exit code $?. Log: $(tail -c 500 /var/log/job.log)\"}"
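Log output often contains quotes and newlines, which break hand-built JSON. As a hedged alternative, the payload can be serialized with a JSON library. The helper name `build_fail_payload` is hypothetical, and the 500-character tail mirrors the `tail -c 500` in the shell example.

```python
import json


def build_fail_payload(exit_code: int, log_text: str, tail_chars: int = 500) -> str:
    """Return a JSON body for the fail signal with the end of the log attached."""
    tail = log_text[-tail_chars:] if tail_chars > 0 else ""
    # json.dumps escapes quotes, backslashes and control characters
    return json.dumps({"error": f"Exit code {exit_code}. Log: {tail}"})
```

The returned string can be passed as the request body with a `Content-Type: application/json` header.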
3. Set Appropriate Grace Periods
Calculate: grace_period = (expected_duration * 1.5) + start_delay_buffer
- Too short → false positives from slow runs
- Too long → delayed detection of missed jobs
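The formula is easy to wrap in a helper that also keeps the result inside the allowed range of 60 to 86400 seconds. This is a sketch: the function name `recommended_grace_period` is hypothetical, and clamping to the range is an added convenience rather than documented behavior.

```python
MIN_GRACE_S = 60      # documented minimum
MAX_GRACE_S = 86400   # documented maximum (24 hours)


def recommended_grace_period(expected_duration_s: float,
                             start_delay_buffer_s: float = 0.0) -> int:
    """Apply grace_period = expected_duration * 1.5 + buffer, clamped to range."""
    raw = expected_duration_s * 1.5 + start_delay_buffer_s
    return int(min(MAX_GRACE_S, max(MIN_GRACE_S, round(raw))))


# A 10-minute job with a 2-minute start delay buffer
grace = recommended_grace_period(600, 120)
```

For this example `grace` is 1020 seconds, or 17 minutes.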
4. Handle Network Failures Gracefully
Don't let heartbeat failures crash your job:
# Ping but don't fail if ping fails
curl -s --max-time 10 "${HEARTBEAT_URL}" || true
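The Python equivalent of `|| true` is a wrapper that swallows and logs any ping error. A minimal sketch follows. The name `safe_ping` is hypothetical, and it takes the actual send operation as a callable so it works with any HTTP client.

```python
import logging
from typing import Callable

logger = logging.getLogger("heartbeat")


def safe_ping(send: Callable[[], object]) -> bool:
    """Run a ping callable; log and swallow any error instead of raising."""
    try:
        send()
        return True
    except Exception as exc:
        logger.warning("heartbeat ping failed: %s", exc)
        return False
```

For example, `safe_ping(lambda: urllib.request.urlopen(url, timeout=10))` returns `False` on a network error instead of crashing the job.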
5. Secure Your Heartbeat URL
- Store in environment variables, not code
- Don't log it to user-facing outputs
- Don't commit to version control
- Rotate if exposed
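The first two points can be sketched in Python as follows. The variable name `HEARTBEAT_URL` matches the one used in the Kubernetes example; the helper names `load_heartbeat_url` and `redact` are hypothetical.

```python
import os
from typing import Mapping


def load_heartbeat_url(env: Mapping[str, str] = os.environ,
                       name: str = "HEARTBEAT_URL") -> str:
    """Read the heartbeat URL from the environment instead of hard-coding it."""
    url = env.get(name, "").strip()
    if not url.startswith("https://"):
        raise RuntimeError(f"{name} is not set to an https URL")
    return url


def redact(url: str) -> str:
    """Hide the final path segment (the heartbeat ID) for log output."""
    head, _, _ = url.rstrip("/").rpartition("/")
    return head + "/***"
```

Logging `redact(url)` rather than the raw URL keeps the ID out of shared log output.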
6. Test Before Production
- Create the monitor
- Run your script manually
- Verify StatusApp shows the ping
- Test the failure scenario with the /fail endpoint
Troubleshooting
No Pings Received
Check:
- Heartbeat URL is correct (no typos)
- Job host can reach ops.statusapp.io (firewall rules)
- Script is actually running (check cron/scheduler logs)
- curl/wget is installed
Test connectivity:
curl -v https://ops.statusapp.io/api/v1/heartbeat/your_id
Rate Limited
Response: {"error": "Rate limited"} (HTTP 200)
Your script is pinging more than 5 times per minute. Check for:
- Loops in your script
- Multiple instances running simultaneously
- Overly aggressive retry logic
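Because the rate-limited response arrives with HTTP 200, checking the status code alone will not reveal it. A script can inspect the body instead. The sketch below assumes the body is exactly the JSON shown above; the function name `is_rate_limited` is hypothetical.

```python
import json


def is_rate_limited(body: str) -> bool:
    """Return True when the response body reports a rate-limit error."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON at all, so not the documented rate-limit response
        return False
    return isinstance(data, dict) and data.get("error") == "Rate limited"
```

Logging a warning when this returns `True` makes runaway loops visible without failing the job.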
False Positives
If you get alerts even though jobs are completing successfully:
- Increase the grace period
- Ensure ping happens after job completion
- Check for script exits before the ping line
- Add logging around the ping call
Ping Not Updating Status
Verify:
- Monitor type is CRON (not WEBSITE or API)
- Heartbeat ID in URL matches your monitor
- Monitor is not paused
Common Use Cases
Database Backup
Name: Nightly Database Backup
Frequency: Daily at 2 AM
Grace Period: 2 hours (for large databases)
ETL Job
Name: Data Warehouse ETL
Frequency: Every 6 hours
Grace Period: 1 hour
Report Generation
Name: Daily Sales Report
Frequency: Daily at 6 AM
Grace Period: 30 minutes
Cache Cleanup
Name: Redis Cache Cleanup
Frequency: Hourly
Grace Period: 10 minutes
Queue Worker Health
Name: Queue Worker Heartbeat
Frequency: Every 5 minutes
Grace Period: 10 minutes
Next Steps
- Monitor Types Overview - Compare all monitor types
- Notification Channels - Configure alerts
- API Reference - Complete heartbeat API docs
- Incidents - Understanding incident handling