
Monitoring

This page covers monitoring USSO in production: health checks, metrics, logging, alerting, and dashboards.

Health Endpoint

curl http://localhost:8000/health

Response:

{
  "status": "healthy",
  "database": "connected",
  "redis": "connected",
  "version": "0.1.0"
}
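For automated probes, the same check can be scripted. The following is a minimal sketch using only the Python standard library. It assumes the response fields shown above and treats any value other than "healthy" or "connected" as a problem; the helper names `evaluate_health` and `check_health` are illustrative and not part of USSO.

```python
import json
import urllib.request

# Dependency fields expected to report "connected" (taken from the sample response).
DEPENDENCY_FIELDS = ("database", "redis")


def evaluate_health(payload) -> list[str]:
    """Return the problems found in a /health payload; an empty list means healthy."""
    if not isinstance(payload, dict):
        return ["health payload is not a JSON object"]
    problems = []
    if payload.get("status") != "healthy":
        problems.append(f"status is {payload.get('status')!r}")
    for field in DEPENDENCY_FIELDS:
        if payload.get(field) != "connected":
            problems.append(f"{field} is {payload.get(field)!r}")
    return problems


def check_health(url: str = "http://localhost:8000/health", timeout: float = 3.0) -> list[str]:
    """Fetch the health endpoint and evaluate it; request failures count as a problem."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError) as exc:  # network errors, HTTP errors, invalid JSON
        return [f"health check request failed: {exc}"]
    return evaluate_health(payload)


if __name__ == "__main__":
    issues = check_health()
    print("OK" if not issues else "\n".join(issues))
```

Keeping the evaluation separate from the HTTP call makes the rules easy to test without a running server.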

Metrics

USSO exposes Prometheus metrics at /metrics:

# Authentication metrics
usso_login_attempts_total
usso_login_success_total
usso_login_failure_total

# Token metrics
usso_token_issued_total
usso_token_verified_total
usso_token_expired_total

# API metrics
usso_http_requests_total
usso_http_request_duration_seconds
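To show how these counters can be consumed outside Prometheus, here is a rough sketch that parses the text exposition format and derives a login failure ratio. The sample values and the `method`/`status` labels are made up for illustration, and the sketch assumes that `usso_login_attempts_total` counts both successful and failed logins. A real scrape should normally go through Prometheus itself.

```python
def parse_counters(text: str) -> dict[str, float]:
    """Sum sample values per metric name from Prometheus text exposition output."""
    totals: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line:
            # "<name>{labels} <value> [timestamp]"; label values may contain spaces
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1:]
        else:
            # "<name> <value> [timestamp]"
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            value = float(fields[0])
        except ValueError:
            continue  # ignore lines that are not valid samples
        totals[name] = totals.get(name, 0.0) + value
    return totals


def login_failure_ratio(totals: dict[str, float]) -> float:
    """Failed logins divided by login attempts; 0.0 when there were no attempts."""
    attempts = totals.get("usso_login_attempts_total", 0.0)
    if attempts == 0:
        return 0.0
    return totals.get("usso_login_failure_total", 0.0) / attempts


# Invented sample output, for illustration only.
SAMPLE = """\
# TYPE usso_login_attempts_total counter
usso_login_attempts_total 200
usso_login_success_total 180
usso_login_failure_total 20
usso_http_requests_total{method="GET",status="200"} 950
usso_http_requests_total{method="GET",status="500"} 50
"""

if __name__ == "__main__":
    totals = parse_counters(SAMPLE)
    print(f"login failure ratio: {login_failure_ratio(totals):.2%}")
```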

Logging

Log Levels

  • ERROR - Failures that require attention
  • WARNING - Unexpected or noteworthy events that do not interrupt service
  • INFO - Routine operational events
  • DEBUG - Detailed diagnostic output

Log Format

{
  "timestamp": "2025-10-04T10:00:00Z",
  "level": "INFO",
  "message": "User login successful",
  "user_id": "user:abc123",
  "ip": "192.168.1.1",
  "user_agent": "Mozilla/5.0..."
}
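For services that sit alongside USSO and should log in the same shape, a formatter along these lines could be used. This is a sketch built on the standard `logging` module, not USSO's own implementation; the class name `JsonFormatter` and the logger name are hypothetical, and the extra fields are simply those from the sample above.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Optional fields copied from the record when present (names from the sample).
    EXTRA_FIELDS = ("user_id", "ip", "user_agent")

    def format(self, record: logging.LogRecord) -> str:
        timestamp = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": timestamp.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        for field in self.EXTRA_FIELDS:
            if hasattr(record, field):
                entry[field] = getattr(record, field)
        return json.dumps(entry)


if __name__ == "__main__":
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("usso.example")  # hypothetical logger name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Values passed via "extra" become attributes on the log record.
    logger.info(
        "User login successful",
        extra={"user_id": "user:abc123", "ip": "192.168.1.1"},
    )
```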

Centralized Logging

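ELK Stack (Docker Compose; the json-file driver writes JSON logs that a shipper such as Filebeat can forward to Logstash or Elasticsearch):

services:
  app:
    labels:
      service: "usso"
    logging:
      driver: "json-file"
      options:
        # "labels" lists the container label keys to include in each log entry
        labels: "service"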

CloudWatch:

import logging

import watchtower

# Send records from the root logger to CloudWatch Logs
logging.getLogger().addHandler(watchtower.CloudWatchLogHandler())

Alerting

Key Alerts

  1. High Error Rate
     • Threshold: > 5% errors
     • Action: Investigate logs

  2. Slow Responses
     • Threshold: p95 > 1s
     • Action: Check database

  3. Failed Logins
     • Threshold: > 100/min
     • Action: Investigate a possible attack

  4. Database Issues
     • Trigger: Health check fails
     • Action: Check connectivity
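The thresholds above can be read as simple rules over a snapshot of current measurements. The sketch below encodes them in Python to make the comparisons explicit. The `Snapshot` type and its field names are hypothetical; in practice these rules would normally live in Prometheus alerting rules rather than in application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Current measurements (hypothetical structure for illustration)."""
    error_ratio: float            # share of requests that failed, 0.0 to 1.0
    p95_seconds: float            # 95th percentile response time in seconds
    failed_logins_per_min: float  # failed login attempts per minute
    health_ok: bool               # result of the health check


def firing_alerts(s: Snapshot) -> list[tuple[str, str]]:
    """Return (alert name, suggested action) pairs for every breached threshold."""
    alerts = []
    if s.error_ratio > 0.05:
        alerts.append(("High Error Rate", "Investigate logs"))
    if s.p95_seconds > 1.0:
        alerts.append(("Slow Responses", "Check database"))
    if s.failed_logins_per_min > 100:
        alerts.append(("Failed Logins", "Investigate a possible attack"))
    if not s.health_ok:
        alerts.append(("Database Issues", "Check connectivity"))
    return alerts


if __name__ == "__main__":
    snapshot = Snapshot(error_ratio=0.02, p95_seconds=1.4,
                        failed_logins_per_min=250, health_ok=True)
    for name, action in firing_alerts(snapshot):
        print(f"{name}: {action}")
```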

Alert Configuration

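# Prometheus alerting rule (evaluated by Prometheus, routed by Alertmanager)
groups:
- name: usso
  rules:
  - alert: HighErrorRate
    # Share of requests answered with a 5xx status over the last 5 minutes
    expr: |
      sum(rate(usso_http_requests_total{status=~"5.."}[5m]))
        / sum(rate(usso_http_requests_total[5m])) > 0.05
    annotations:
      summary: "High error rate detected"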

Dashboards

Grafana Dashboard

Key panels:

  • Request rate
  • Error rate
  • Response time (p50, p95, p99)
  • Active sessions
  • Database connections

Example Queries

# Request rate (requests per second, per series)
rate(usso_http_requests_total[5m])

# Error rate (5xx responses per second, per series)
rate(usso_http_requests_total{status=~"5.."}[5m])

# Response time p95 across all series
histogram_quantile(0.95, sum by (le) (rate(usso_http_request_duration_seconds_bucket[5m])))
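Queries like these can also be run from a script through the Prometheus HTTP API (`/api/v1/query`). The sketch below uses only the standard library and assumes Prometheus is reachable at `http://localhost:9090`, which is its default port but may differ in a given deployment. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def extract_values(response: dict) -> list[tuple[dict, float]]:
    """Turn an instant-query response into (labels, value) pairs."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    data = response["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected an instant vector, got {data['resultType']}")
    # Each result carries "value": [unix_timestamp, "<value as string>"]
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


def instant_query(expr: str, base_url: str = "http://localhost:9090") -> list[tuple[dict, float]]:
    """Run a PromQL instant query; base_url is an assumed Prometheus address."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": expr})
    with urllib.request.urlopen(url, timeout=5) as resp:
        return extract_values(json.load(resp))


if __name__ == "__main__":
    for labels, value in instant_query("rate(usso_http_requests_total[5m])"):
        print(labels, value)
```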