Skip to content

openral dashboard — live debugging UI

A single-page, read-only debug pane over the OpenRAL OTel stream. Renders the most recent rskill.execute, skill.chunk_inference, and safety.check spans, rolling metric histograms, and an event log — live, no Jaeger required.

See also ADR-0017 for the "embedded receiver, not in-process exporter" design choice.

Run it

The dashboard is a openral subcommand. The HTTP port serves the UI, the SSE event stream, and an embedded OTLP/HTTP receiver. The default port is 4318 (the OTLP/HTTP standard); it used to be 8000, but that collided with mkdocs serve (just docs) and most FastAPI demos — see issue #132.

openral dashboard            # binds 127.0.0.1:4318 by default
# → stderr prints: OpenRAL dashboard: http://localhost:4318/ …

Then point any OpenRAL workload at it. Sim does not emit OTel by default — you opt in via the env vars below; an unconfigured openral sim run is silent on the dashboard side:

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
OTEL_EXPORTER_OTLP_PROTOCOL=http/protobuf \
  openral sim run --config scenes/benchmark/libero_spatial.yaml --rskill smolvla-libero

Open http://localhost:4318 in a browser. The connection indicator in the top right turns green within a few hundred ms.

One-keystroke demo — two options

Dashboard → sim (--inprocess)

The dashboard spawns the workload as a child with the OTLP env vars already configured. Pass the whole command as one shell-quoted string (shlex-tokenised):

openral dashboard \
  --inprocess "openral sim run --config scenes/benchmark/pusht.yaml --rskill diffusion-pusht"

Workload → dashboard (--dashboard on sim / deploy / benchmark)

Inverse path: the workload spawns the dashboard as a child on 4318 (override with --dashboard-port) and shuts it down on exit. Best when you already have the workload command memorised. All three entry points carry the same flag:

openral sim run --dashboard \
  --config scenes/benchmark/pusht.yaml \
  --rskill diffusion-pusht

openral deploy run --dashboard --config deployments/so100.yaml

openral benchmark run --dashboard \
  --suite libero_spatial \
  --rskill smolvla-libero

The workload prints OpenRAL dashboard attached: http://localhost:4318/ once the child reports healthy, then routes traces+metrics to it for the rest of the run. The child is SIGINT'd at exit after OTel finishes draining (no Connection refused retries on the way down).

What you see

  • Top bar — service name, run mode (sim / hardware / benchmark once PR #108 lands), short run id, connection status.
  • Live signals (3 cards)rSkill.execute, Inference, Safety. Each card shows the latest call's primary attribute (skill id / engine / check name), latency in ms, age, and the most relevant attributes. Card border turns red on Status.ERROR.
  • Counters — running totals of safety_violation, estop_requested, deadline_missed, sensor_stale span events.
  • Metrics — every histogram, counter, and gauge that comes over the wire, with p50/p95 (for histograms) and a sparkline of the last ~600 samples. Empty until a workload starts emitting; PR #108 adds openral.tick.duration, openral.inference.duration, openral.hal.*.duration, etc.
  • Event log — chronological feed of the last 60 events (spans, span events, and real log lines bridged from structlog over OTLP); ESTOP / safety violation rows render in red. Filter chips toggle the info / warn / error buckets (on by default) and a debug bucket (off by default — high-rate DEBUG such as world_state ~30 Hz would otherwise flood the bounded view; toggle it on when you need it). Log lines are bucketed by OTLP severity_number (DEBUG→debug, INFO→info, WARN→warn, ERROR/FATAL→error).

Endpoints

Path What it serves
GET / Single-page UI (vanilla JS + SSE, no npm)
GET /healthz {"status": "ok"} for compose healthchecks
GET /api/state One-shot JSON snapshot of the current store
GET /api/stream Server-Sent Events — every state update
POST /v1/traces OTLP/HTTP receiver — application/x-protobuf
POST /v1/metrics OTLP/HTTP receiver — application/x-protobuf
POST /v1/logs OTLP/HTTP receiver — log lines into the event log

Running alongside Jaeger + Prometheus

The dashboard is fine on its own. For post-hoc trace analysis or Prometheus-style metric history, also bring up the dev compose stack:

docker compose -f docker-compose.dev.yml up -d otelcol jaeger prometheus

Point your workload at http://localhost:4317 (the OTel Collector) to fan out: traces → Jaeger UI at http://localhost:16686, metrics → Prometheus at http://localhost:9090. The dashboard can still be attached on a different port for the live pane:

openral dashboard --port 4318
OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 \
  openral sim run ...   # → otelcol → jaeger + prometheus

(Pointing at both the dashboard and otelcol simultaneously is a v2 feature — the OTel SDK supports it via multiple exporters; v1 of the dashboard expects one endpoint at a time.)

Why this exists

OpenRAL emits all the right OTel spans and metrics by design (see ADR-0010 for the runner-side contract, and PR #108 for the metric surface), and Jaeger renders them beautifully — after the run. For the Day 30 demo and for on-robot debugging, the operator wants a live pane that updates as the robot moves. openral dashboard is that pane; it does not replace Jaeger for post-hoc analysis.