Understanding Analytics
Explore uptime percentages, response time trends, and performance insights for your monitors.
Understanding Analytics and Metrics
Overview
StatusApp includes built-in analytics to track reliability, performance, and incident quality. Dashboards focus on what matters most: uptime, response time percentiles, incident frequency, MTTR, and root causes. This guide explains each metric, where to find it, and how to act on it.
Key Metrics
Uptime Percentage
Uptime % = (Successful Checks ÷ Total Checks) × 100
- Calculated per monitor, per region, and overall
- Includes DOWN and DEGRADED time
- Shown for 24h, 7d, 30d, and 90d periods
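As a minimal sketch of the formula above, the snippet below computes uptime from a list of check results. The `checks` list and its status strings are illustrative, and treating only `UP` as a successful check is an assumption rather than StatusApp's documented behavior.

```python
def uptime_percent(statuses):
    """Return uptime % = successful checks / total checks * 100."""
    if not statuses:
        raise ValueError("no checks in the selected period")
    # Assumption: only "UP" counts as a successful check.
    successful = sum(1 for status in statuses if status == "UP")
    return 100 * successful / len(statuses)


# Hypothetical day of checks: 2,880 checks, 3 of them not UP.
checks = ["UP"] * 2877 + ["DOWN"] * 2 + ["DEGRADED"]
print(f"{uptime_percent(checks):.3f}%")
```

Running the same function on per-region subsets of the checks would give the per-region figures.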
Industry benchmarks
- 99.9% ("three nines") ≈ 43.2 min downtime/month
- 99.99% ("four nines") ≈ 4.3 min downtime/month
- 99.999% ("five nines") ≈ 26 sec downtime/month
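The benchmark figures follow directly from the length of the period. A short sketch, assuming a 30-day month (the function name is illustrative):

```python
def allowed_downtime_minutes(target_percent, period_days=30):
    """Minutes of downtime permitted by an uptime target over a period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - target_percent) / 100


for target in (99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% -> {minutes:.2f} min ({minutes * 60:.0f} s) per 30 days")
```

Changing `period_days` shows how the same target translates to a week or a quarter.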
Response Time Percentiles
How fast your endpoint responds, shown as:
- P50 (Median): Typical user experience
- P90/P95: Slow end of normal traffic
- P99: Outliers; great for spotting regressions
- Average: Helpful but can hide spikes
Interpreting:
- Rising P95/P99 = emerging performance issue
- Spiky graph = inconsistent performance
- Flat, low percentiles = healthy service
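To see why percentiles reveal what the average hides, here is a small sketch using the nearest-rank definition of a percentile. StatusApp's exact calculation method is not stated in this guide, so treat this as one common definition, with made-up sample data.

```python
import math
from statistics import mean


def percentile(samples, p):
    """Nearest-rank percentile: smallest value with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in ms: mostly fast, with two slow outliers.
times_ms = [120] * 90 + [180] * 8 + [2500] * 2

for p in (50, 95, 99):
    print(f"P{p}: {percentile(times_ms, p)} ms")
print(f"Average: {mean(times_ms):.1f} ms")
```

In this sample the average sits close to the median, while P99 exposes the 2.5-second outliers.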
MTTR (Mean Time To Resolution)
MTTR = (Sum of incident durations) ÷ (Number of resolved incidents)
- Measures how quickly you recover
- Tracked per monitor and overall
- Shown on Incident Analytics
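The MTTR formula can be sketched as follows. The incident tuples are hypothetical (start time, resolution time) pairs, and incidents that are still open are skipped because they have no duration yet.

```python
from datetime import datetime, timedelta


def mttr(incidents):
    """Mean duration of resolved incidents; unresolved ones are skipped."""
    durations = [
        resolved - started
        for started, resolved in incidents
        if resolved is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)


incidents = [
    (datetime(2024, 5, 1, 10, 0), datetime(2024, 5, 1, 10, 30)),  # 30 min
    (datetime(2024, 5, 9, 2, 15), datetime(2024, 5, 9, 3, 45)),   # 90 min
    (datetime(2024, 5, 20, 18, 0), None),                         # still open
]
print(mttr(incidents))
```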
Time to First Update
- Measures responsiveness in communication
- Tracks minutes from incident start to first public update
- Aim for < 15 minutes
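A minimal sketch of this measurement, assuming you have the incident start time and the timestamps of its public updates (the names are illustrative):

```python
from datetime import datetime


def minutes_to_first_update(started_at, update_times):
    """Minutes from incident start to the earliest public update, or None."""
    if not update_times:
        return None
    return (min(update_times) - started_at).total_seconds() / 60


started = datetime(2024, 5, 1, 10, 0)
updates = [datetime(2024, 5, 1, 10, 40), datetime(2024, 5, 1, 10, 12)]
delay = minutes_to_first_update(started, updates)
print(f"first update after {delay:.0f} min, within target: {delay < 15}")
```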
Incident Frequency
- Count of incidents over time (daily/weekly/monthly)
- Highlights noisy services
- Correlate with deployments or traffic peaks
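Weekly frequency is a simple grouping of incident start dates. The sketch below groups by ISO week; the dates are made up for illustration.

```python
from collections import Counter
from datetime import date


def incidents_per_week(start_dates):
    """Count incidents per ISO (year, week) pair."""
    return Counter(d.isocalendar()[:2] for d in start_dates)


starts = [date(2024, 5, 1), date(2024, 5, 3), date(2024, 5, 9), date(2024, 5, 20)]
weekly = incidents_per_week(starts)
for (year, week), count in sorted(weekly.items()):
    print(f"{year}-W{week:02d}: {count}")
```

Placing deployment dates next to these counts is one way to check the correlation mentioned above.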
Top Root Causes
- Breakdown of root cause categories across incidents
- Identify systemic issues (e.g., DATABASE_ISSUE, DEPLOYMENT_ISSUE)
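The breakdown is a count of categories expressed as shares of all incidents. A sketch using the two category names mentioned above, with invented counts:

```python
from collections import Counter


def root_cause_shares(causes):
    """Return (category, count, percent) tuples, most frequent first."""
    counts = Counter(causes)
    total = sum(counts.values())
    return [(cause, n, 100 * n / total) for cause, n in counts.most_common()]


causes = [
    "DATABASE_ISSUE", "DEPLOYMENT_ISSUE", "DATABASE_ISSUE",
    "DATABASE_ISSUE", "DEPLOYMENT_ISSUE",
]
shares = root_cause_shares(causes)
for cause, n, pct in shares:
    print(f"{cause}: {n} ({pct:.0f}%)")
```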
Dashboards
Overview Dashboard (Incidents)
Shows reliability and communication quality at a glance:
- Active vs Resolved incidents
- Average MTTR
- Time to first update
- Incident frequency over time
- Severity breakdown
- Top root causes
- Top monitors by incident count
Monitor Analytics
For each monitor, view:
- Current status (UP/DOWN/DEGRADED)
- Uptime % for 24h/7d/30d/90d
- Response time percentiles (P50/P95/P99)
- Region-level performance (where available)
- Incident history and duration
- Recent checks with status codes/errors
Status Page Reliability
When a monitor is on a status page, its uptime and incidents roll into public metrics:
- 90-day uptime chart (public)
- Daily uptime bars with color coding
- Public incident history
Time Ranges
- Quick ranges: 24h, 7d, 30d, 90d
- Custom range: choose start/end dates
- All charts and tables respect the selected range
Charts & Visuals
Uptime Timeline
- Green = operational
- Yellow = degraded
- Red = down
- Hover for exact timestamps and duration
Response Time Chart
- Line or area chart of response times
- P50/P95/P99 overlays for quick regression spotting
- Spikes highlight potential performance incidents
Incident Frequency & Severity
- Bar/line chart of incidents over time
- Severity breakdown (Critical/High/Medium/Low)
- Click through to the incident details
Root Cause Breakdown
- Pie/bar chart of root cause categories
- Track whether fixes reduce recurring causes
How to Use Analytics
Detect Regressions Early
- Watch P95/P99 for rising trends
- Compare 24h vs 7d to see fresh problems
- Correlate spikes with deployments
Improve Reliability
- Track MTTR over the last 30/90 days
- Find top failing monitors and prioritize fixes
- Reduce time to first update during incidents
Capacity & Performance Planning
- Monitor response time trends before traffic peaks
- Identify slow regions and add redundancy
- Use historical baselines for SLO/SLA targets
Communication Quality
- Keep "time to first update" under 15 minutes
- Ensure regular updates during incidents
- Measure resolution quality with root cause tracking
Exports
- Export incident lists and metrics (CSV) for reporting
- Export check logs for deep analysis (per monitor)
- Copy uptime numbers for status reports
Plan Notes
- Analytics available on all paid plans (Starter+)
- Response time percentiles and incident analytics on Pro+
- Exports and longer retention on Business/Enterprise
Quick Reference
If uptime drops:
- Check incident list for active issues
- Open monitor analytics to see failing regions
- Review recent deployments or config changes
If P95/P99 spike:
- Correlate with traffic/load
- Check slow regions
- Look for database or third-party latency
If MTTR is high:
- Improve alerting + on-call response
- Standardize runbooks
- Post updates faster to keep teams aligned
If incidents repeat:
- Review root cause breakdown
- Fix systemic issues (config, DB, deployments)
- Add automated checks for the specific failure mode
Status Code Distribution
- Pie chart: Percentage of each status class
- Success rate: HTTP 2xx responses
- Redirects: HTTP 3xx responses
- Client errors: HTTP 4xx responses
- Server errors: HTTP 5xx responses
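The status classes above can be derived from the first digit of each HTTP status code. A sketch with hypothetical check results (the label names are illustrative):

```python
from collections import Counter

LABELS = {2: "success", 3: "redirect", 4: "client_error", 5: "server_error"}


def status_class_shares(codes):
    """Percentage of checks in each HTTP status class."""
    if not codes:
        return {}
    counts = Counter(LABELS.get(code // 100, "other") for code in codes)
    return {label: 100 * n / len(codes) for label, n in counts.items()}


codes = [200] * 17 + [301, 404, 503]
print(status_class_shares(codes))
```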
Advanced Analytics
Performance Regression Detection
StatusApp automatically detects when performance degrades significantly:
- Establishes baseline from historical data
- Alerts when response times exceed baseline + threshold
- Helps identify issues early
- Configurable sensitivity levels
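The baseline-plus-threshold idea can be illustrated as below. This guide does not describe how StatusApp builds its baseline or what the sensitivity levels map to, so the median baseline and the percentage threshold here are assumptions for illustration only.

```python
from statistics import median


def is_regression(history_ms, current_ms, threshold_pct=50):
    """True if current_ms exceeds the historical median by more than threshold_pct."""
    baseline = median(history_ms)  # assumed baseline: median of past response times
    return current_ms > baseline * (1 + threshold_pct / 100)


history = [200, 210, 190, 205, 195]   # hypothetical past response times in ms
print(is_regression(history, 320))    # above 200 ms * 1.5
print(is_regression(history, 280))    # within the allowed band
```

A lower `threshold_pct` corresponds to a more sensitive setting: it flags smaller deviations at the cost of more false alarms.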
Anomaly Detection
- Identifies unusual patterns
- Detects unexpected downtime
- Flags performance spikes
- Learns from historical patterns
Prediction Models
- Forecast future performance
- Identify reliability trends
- Predict potential issues
- Plan capacity upgrades
Creating Reports
Manual Reports
- Go to Analytics > Reports
- Click Create Report
- Select monitors to include
- Choose time period
- Select metrics to display
- Generate a PDF or send the report by email
Report Contents
- Executive summary
- Detailed metrics
- Charts and graphs
- Incident summary
- Trend analysis
- Recommendations
Scheduled Reports
- Create report
- Enable Auto-Generate
- Choose frequency (weekly, monthly)
- Set recipients
- Email: sent automatically at the scheduled times
Report Distribution
- Email to team
- Email to stakeholders
- Email to customers
- Post to status page
- Export as PDF
Exporting Data
CSV Export
- Select time period
- Choose metrics
- Click Export
- Open the file in Excel or another spreadsheet application
- Analyze with external tools
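As an example of analysis with external tools, the sketch below totals incident minutes per monitor from a CSV file. The column names (`monitor`, `duration_minutes`, and so on) are hypothetical, so check them against the header row of your actual export.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; real column names may differ.
SAMPLE_CSV = """monitor,started_at,duration_minutes,severity
api,2024-05-01T10:00:00,30,High
api,2024-05-09T02:15:00,90,Critical
web,2024-05-12T08:00:00,15,Low
"""


def downtime_by_monitor(csv_file):
    """Sum incident minutes per monitor from an open CSV file."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        totals[row["monitor"]] += float(row["duration_minutes"])
    return dict(totals)


print(downtime_by_monitor(io.StringIO(SAMPLE_CSV)))
```

For a real export, pass `open("export.csv", newline="")` in place of the `io.StringIO` object.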
API Access
Programmatically retrieve analytics via the StatusApp API:
# Get monitor details, including recent checks
curl "https://ops.statusapp.io/api/v1/monitors/{id}" \
  -H "X-API-Key: your_api_key"

# Get incidents with filtering
curl "https://ops.statusapp.io/api/v1/incidents?status=resolved&limit=50" \
  -H "X-API-Key: your_api_key"
See the API Reference Guide for complete documentation.
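The incidents request can also be made from a script. The sketch below uses only the Python standard library with the URL, query parameters, and header shown in the curl example. The shape of the JSON response is not described in this guide, so the code decodes it without assuming any fields; the function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://ops.statusapp.io/api/v1"


def build_incidents_request(api_key, status="resolved", limit=50):
    """Build (but do not send) the GET request for the incidents endpoint."""
    query = urllib.parse.urlencode({"status": status, "limit": limit})
    return urllib.request.Request(
        f"{BASE_URL}/incidents?{query}",
        headers={"X-API-Key": api_key},
    )


def fetch_json(request, timeout=10):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_incidents_request("your_api_key")
print(request.full_url)
# incidents = fetch_json(request)  # requires a valid API key and network access
```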
SLA Management
Defining SLAs
- Go to Settings > SLAs
- Click Create SLA
- Define uptime target (e.g., 99.9%)
- Set measurement period
- Assign monitors
SLA Tracking
- Real-time status vs. target
- Monthly SLA reports
- SLA breach alerts
- Historical compliance data
- Credit calculation
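Tracking status against a target comes down to comparing measured downtime with the downtime the target allows. A sketch under the assumption of a 30-day measurement period; credit calculation is left out because credit terms depend on your own SLA.

```python
def sla_status(target_percent, period_minutes, downtime_minutes):
    """Compare measured uptime with an SLA target for one measurement period."""
    uptime = 100 * (period_minutes - downtime_minutes) / period_minutes
    budget = period_minutes * (100 - target_percent) / 100
    return {
        "uptime_percent": uptime,
        "budget_minutes": budget,
        "remaining_minutes": budget - downtime_minutes,
        "breached": downtime_minutes > budget,
    }


# Hypothetical month: 99.9% target, 50 minutes of recorded downtime.
status = sla_status(99.9, 30 * 24 * 60, downtime_minutes=50)
print(status)
```

A negative `remaining_minutes` value means the period's downtime allowance has been used up.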
SLA Reporting
- Automatic report generation
- Email to customers
- Post to status page
- API access
- Audit trail
Insights and Recommendations
StatusApp AI Insights
- Automatic analysis of metrics
- Performance recommendations
- Reliability improvements
- Cost optimization suggestions
When to Investigate
Look for these patterns:
Response Time Degradation
- Upward trend in response times
- Sudden spikes
- Increased variability
- Compare with infrastructure changes
Increasing Incident Frequency
- More incidents than usual
- Shorter MTBF (mean time between failures)
- Different types of failures
- Common time patterns
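One way to check whether incidents are arriving closer together is to track the mean gap between them. The sketch below approximates MTBF as the average time between consecutive incident start times, which is a simplification (a stricter definition measures from recovery to the next failure). The dates are invented.

```python
from datetime import datetime, timedelta


def mtbf(start_times):
    """Mean gap between consecutive incident starts, or None if fewer than two."""
    ordered = sorted(start_times)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return sum(gaps, timedelta()) / len(gaps)


starts = [datetime(2024, 5, 1), datetime(2024, 5, 11), datetime(2024, 5, 16)]
print(mtbf(starts))
```

Comparing this value across two periods, such as the last 30 days against the 30 days before, shows whether the gap is shrinking.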
Availability Issues
- Declining uptime trend
- Repeated failures
- Similar failure times
- Failed monitor checks
Best Practices
Regular Review
- Weekly: Check critical service metrics
- Monthly: Generate full reports
- Quarterly: Analyze trends
- Annually: Review SLA compliance
Benchmarking
- Compare with industry standards
- Track improvement over time
- Identify weak areas
- Set realistic targets
- Celebrate improvements
Team Communication
- Share metrics with team
- Discuss performance trends
- Plan improvements
- Celebrate reliability wins
- Hold post-mortems after incidents
Action Items
- Identify Issues: Use analytics to find problems
- Root Cause Analysis: Understand why issues occur
- Implement Solutions: Fix underlying problems
- Monitor Results: Track improvement
- Iterate: Continuously improve
Troubleshooting Analytics
Data Not Showing Up
- Verify monitor is running
- Check time period selection
- Ensure monitor has historical data
- Verify permissions
Unexpected Downtime
- Review incident details
- Check monitor configuration
- Rule out false positives
- Review logs
Inconsistent Metrics
- Check regional differences
- Verify time zone settings
- Review calculation methods
- Compare manual vs. automated
Next Steps
- Learn about incident management
- Set up alerts
- Create status pages
- Explore API