Monitoring & Observability

RelayCore provides Prometheus-native metrics, structured audit logs, SSE real-time event streams, and health status snapshots — ready for integration with existing monitoring stacks.

Metrics Endpoints

| Endpoint | Format | Description |
| --- | --- | --- |
| GET /api/v1/metrics | JSON | Structured metrics snapshot |
| GET /api/v1/metrics/prometheus | Prometheus text | Standard Prometheus scrape format |
| /_relay/metrics | JSON | In-proxy embedded metrics |
| /_relay/metrics/prometheus | Prometheus text | Embedded Prometheus format |

Core Metrics

Flow Metrics

| Metric | Type | Description |
| --- | --- | --- |
| relaycore_flows_total | Counter | Total flows processed |
| relaycore_flows_in_memory | Gauge | Current flows in memory |
| relaycore_flows_dropped | Counter | Flows dropped due to backpressure |
| relaycore_flow_events_lagged_total | Counter | Event broadcast lag events |

Intercept Metrics

| Metric | Type | Description |
| --- | --- | --- |
| relaycore_intercepts_pending | Gauge | Pending intercepts awaiting UI |
| relaycore_oldest_intercept_age_ms | Gauge | Oldest unhandled intercept age (ms) |
| relaycore_ws_pending_messages | Gauge | Pending WebSocket messages |

Rule & Script Metrics

| Metric | Type | Description |
| --- | --- | --- |
| relaycore_rule_exec_errors | Counter | Rule execution errors |
| script_hook_duration_us{hook} | Histogram | Per-hook execution duration (µs) |
| script_hook_invocations_total{hook} | Counter | Cumulative hook invocations |
| script_hook_errors_total{hook} | Counter | Cumulative hook errors |
| script_fetch_total{target,status} | Counter | relay.fetch call count |

Audit Metrics

| Metric | Type | Description |
| --- | --- | --- |
| relaycore_audit_events_total | Counter | Total audit events |
| relaycore_audit_events_failed | Counter | Failed audit recordings |
| relaycore_audit_events_lagged_total | Counter | Audit broadcast lag events |
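
For quick checks without a full Prometheus setup, the text format can be read directly. The sketch below is a minimal parser under stated assumptions: it handles only plain `name value` and `name{labels} value` sample lines, ignores timestamps, and does not handle label values that contain `}`. The helper names `parsePrometheusText` and `dropRatio`, the sample text, and the hook label `on_request` are illustrative and not part of RelayCore.

```javascript
// Minimal reader for Prometheus text exposition samples (illustrative only).
function parsePrometheusText(text) {
  const samples = new Map();
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) continue; // skip HELP/TYPE comments
    // Capture the series (name plus optional {labels}) and its value.
    const match = line.match(/^([A-Za-z_:][A-Za-z0-9_:]*(?:\{[^}]*\})?)\s+(\S+)/);
    if (!match) continue;
    samples.set(match[1], Number(match[2]));
  }
  return samples;
}

// Ratio of dropped flows to processed flows; 0 when nothing was processed.
function dropRatio(samples) {
  const total = samples.get("relaycore_flows_total") ?? 0;
  const dropped = samples.get("relaycore_flows_dropped") ?? 0;
  return total > 0 ? dropped / total : 0;
}

// Hypothetical scrape output, not captured from a real instance.
const sampleText = [
  "# TYPE relaycore_flows_total counter",
  "relaycore_flows_total 2000",
  "# TYPE relaycore_flows_dropped counter",
  "relaycore_flows_dropped 50",
  'script_hook_errors_total{hook="on_request"} 3',
].join("\n");

const samples = parsePrometheusText(sampleText);
console.log(dropRatio(samples)); // 0.025
```

In practice the input would come from a request to /api/v1/metrics/prometheus. For anything beyond a spot check, a real Prometheus server computing rates over time is the better tool, since raw counter values only show totals since startup.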

Prometheus Configuration

# prometheus.yml
scrape_configs:
  - job_name: "relaycore"
    static_configs:
      - targets: ["127.0.0.1:8082"]
    metrics_path: "/api/v1/metrics/prometheus"
    scrape_interval: 15s

Health Check

# Status snapshot
curl http://127.0.0.1:8082/api/v1/status

# Response
{
  "phase": "Running",
  "is_running": true,
  "port": 8080,
  "uptime_seconds": 3600,
  "last_error": null
}

Lifecycle phases: Created → Starting → Running → Stopping → Stopped (Failed for crashes)
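
A readiness probe can be built on top of the status snapshot. The sketch below assumes the response carries the fields shown in the example above and treats the proxy as healthy only when `phase` is `Running` and `is_running` is true. The function name `evaluateStatus` and this health policy are illustrative choices, not documented RelayCore behavior.

```javascript
// Classifies a parsed /api/v1/status response (illustrative policy).
function evaluateStatus(status) {
  if (status.phase === "Running" && status.is_running === true) {
    return { healthy: true, reason: "running" };
  }
  // Include last_error in the reason when the snapshot reports one.
  const detail = status.last_error ? ` (${status.last_error})` : "";
  return { healthy: false, reason: `phase is ${status.phase}${detail}` };
}

// The snapshot from the example response above.
const snapshot = {
  phase: "Running",
  is_running: true,
  port: 8080,
  uptime_seconds: 3600,
  last_error: null,
};
console.log(evaluateStatus(snapshot).healthy); // true

// A hypothetical crash snapshot; the error text is made up.
const crashed = { phase: "Failed", is_running: false, port: 8080, uptime_seconds: 0, last_error: "boom" };
console.log(evaluateStatus(crashed).reason); // phase is Failed (boom)
```

A probe script would fetch http://127.0.0.1:8082/api/v1/status, parse the JSON body, and exit non-zero when `healthy` is false.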

SSE Event Stream

GET /api/v1/events pushes real-time updates via Server-Sent Events:

| Event Type | Description |
| --- | --- |
| flow | New or updated flow |
| ws-message | WebSocket message update |
| http-body | HTTP request/response body update |
| audit | Audit event (rule change, intercept, etc.) |
| lifecycle | Proxy lifecycle change |
| lagged | Event channel lag warning |

# Subscribe via curl
curl -N http://127.0.0.1:8082/api/v1/events

# JavaScript EventSource
const es = new EventSource("http://127.0.0.1:8082/api/v1/events");
es.addEventListener("flow", (e) => console.log(JSON.parse(e.data)));
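
Outside the browser, where EventSource may not be available, the stream can be consumed as raw text, for example from the output of `curl -N`. The sketch below parses the standard SSE wire format (`event:` and `data:` fields, frames separated by a blank line) and assumes the chunk contains only complete frames. The function name `parseSseFrames` and the sample payloads are hypothetical; the real payload shape of each event type is not specified here.

```javascript
// Parses complete SSE frames from a text chunk into { type, data } objects.
function parseSseFrames(text) {
  const events = [];
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let type = "message"; // default event type in the SSE format
    const dataLines = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line === "" || line.startsWith(":")) continue; // blank line or comment
      const idx = line.indexOf(":");
      const field = idx === -1 ? line : line.slice(0, idx);
      let value = idx === -1 ? "" : line.slice(idx + 1);
      if (value.startsWith(" ")) value = value.slice(1); // drop one leading space
      if (field === "event") type = value;
      else if (field === "data") dataLines.push(value);
    }
    // Frames without data (such as keep-alive comments) produce no event.
    if (dataLines.length > 0) events.push({ type, data: dataLines.join("\n") });
  }
  return events;
}

// Hypothetical stream content, not captured from a real instance.
const chunk = [
  "event: flow",
  'data: {"id":"f1"}',
  "",
  ": keep-alive",
  "",
  "event: lagged",
  "data: 12",
  "",
  "",
].join("\n");

for (const evt of parseSseFrames(chunk)) {
  if (evt.type === "lagged") {
    console.warn("event channel lagged; some updates may have been missed");
  } else {
    console.log(evt.type, evt.data);
  }
}
```

Treating `lagged` as a signal to re-fetch state is one reasonable client policy; a production client would also buffer partial frames across chunks.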

Audit System

Audit logs track all control-plane operations and cannot be disabled:

| Event Type | Trigger |
| --- | --- |
| rule_changed | Rule add/delete/modify |
| intercept_resolved | Intercept resolved (continue/drop/modify) |
| script_reloaded | Script loaded or reloaded |
| policy_updated | Policy changed (redaction etc.) |

Audit actors: runtime / http / tauri / probe / cli

# Query recent 50 audit events
curl "http://127.0.0.1:8082/api/v1/audit?limit=50"

# Filter by type
curl "http://127.0.0.1:8082/api/v1/audit?kind=rule_changed&limit=20"
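
When audit queries are issued from scripts, building the URL programmatically avoids quoting mistakes. This sketch uses only the two query parameters shown above, `kind` and `limit`; the helper name `auditUrl` is hypothetical, and no other parameters are assumed to exist.

```javascript
// Builds an audit query URL from the documented `kind` and `limit` parameters.
function auditUrl(base, { kind, limit } = {}) {
  const url = new URL("/api/v1/audit", base);
  if (kind !== undefined) url.searchParams.set("kind", kind);
  if (limit !== undefined) url.searchParams.set("limit", String(limit));
  return url.toString();
}

console.log(auditUrl("http://127.0.0.1:8082", { kind: "rule_changed", limit: 20 }));
// http://127.0.0.1:8082/api/v1/audit?kind=rule_changed&limit=20
```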

Grafana Integration

The benchmarks/grafana/ directory contains a ready-to-use Grafana dashboard JSON file. Pair it with a Prometheus data source for a complete monitoring dashboard.