ADR-0019: rosbag2 ↔ LeRobotDataset v3 bridge
- Status: Accepted
- Date: 2026-05-24
- Amended: 2026-05-24 (see Amendments below)
- Related: CLAUDE.md §6.1 (Layer 7 — Observability), §1.11 (no mocks), §1.14 (docs travel with code), §7.2 (smallest viable PR / pre-approval), §12 (new top-level packages need an ADR); pairs with ADR-0017 (dashboard OTLP receiver); closes the rosbag2 placeholder in ADR-0010 §6.
Context
The Week-4 roadmap (docs/roadmap/index.md:131) calls for a "rosbag2 ↔
LeRobotDataset v3 recorder — every successful skill execution becomes a
sync-video + state + action-chunk row; openral dataset push
hf://openral/dataset-<name> with consent prompt." Today the repo
references rosbag2 in two places (this ADR set and runner backend
comments) but contains no rosbag2 writer, no
LeRobotDataset writer, and no camera-topic publisher on the hardware
runner. The OTel semconv module already reserves
openral.dataset.repo_id / episode_idx / frame_idx placeholders
(python/observability/src/openral_observability/semconv.py:143–145) —
they are meaningless until this bridge wires them.
The bridge is the durable counterpart of ADR-0017's transient observability fan-out. ADR-0017 lets a developer see a skill execution live; this bridge lets a developer replay it, fine-tune on it, and publish it later. The two ship together as the Week-4 observability deliverable.
Five orthogonal design questions need answers up front:
- Where does the code live? New top-level package, submodule under openral_observability, or scattered across openral_sim + openral_runner + tools/?
- Which dataset format? LeRobotDataset v2.1 (file-per-episode, stable for ~12 months) or v3.0 (chunked, codebase_version="3.0", released April 2026 in lerobot v0.5.1).
- What happens to failed episodes? Discard (datasets = only demonstrations of success) or persist with a next.success=False tag (datasets = full distribution).
- License posture on produced datasets. The OpenRAL code is Apache-2.0 across this layer, but a dataset carries an independent license string.
- PR sizing. The full bidirectional scope is ~1500–2000 LOC, exceeding CLAUDE.md §7.2's 800-LOC pre-approval threshold.
Decision
1. Package layout — new top-level python/dataset/ (openral_dataset)
The bridge ships as a new workspace package adjacent to
python/observability/, not nested inside it. Justifications:
- Lifecycle separation. Observability is transient telemetry exported on the wire (OTLP → collector → backend). Datasets are durable artifacts written to disk and Hugging Face Hub (parquet, mp4, mcap). Coupling them mixes a "configure SDK and forget" library with a "manage files, codecs, and licenses" library.
- Dependency surface. openral_observability today depends on OTel SDK + structlog. The bridge needs lerobot>=0.5.1, pyarrow, rosbag2_py, mcap, mcap-ros2-support, and ffmpeg-via-lerobot: hundreds of MB of transitive deps. Pulling them into observability would slow uv sync for every consumer of observability.
- License posture. Telemetry is uniformly Apache-2.0. Datasets carry per-dataset license strings (default CC-BY-4.0, but consumers may legitimately ship CC-BY-NC, CC0, or a custom license). Keeping the producing code in its own package keeps the open-core boundary that §1.9 cares about clean.
- Roadmap framing. docs/roadmap/index.md:131 lists "Dataset bridge" as a peer to "OpenTelemetry" in the Week-4 deliverables, not as a sub-item.
- §12 is not a barrier. The "STOP, propose in an ADR first" instruction is satisfied by this ADR; we're writing it either way.
openral_dataset imports openral_observability for the semconv
constants — the trace-correlation handle (span IDs in Tick.msg)
crosses the package boundary without difficulty.
This does not add a layer. It is a sub-responsibility of Layer 7 (Observability) per CLAUDE.md §6.1, factored out for separation of concerns. The 8-layer model stays.
2. Dataset format — LeRobotDataset v3.0
v3.0 is the current lerobot codebase_version ("3.0"), released
April 2026 with lerobot==0.5.1. Adopt it directly; do not ship a v2.1
writer.
Reasons:
- File-count scalability. v2.1 wrote one Parquet + one MP4 per episode. v3.0 batches multiple episodes per file (0–5 MB chunks). A 10 000-episode dataset goes from ~20 000 files to ~100 files, roughly 200× fewer inode lookups during training I/O.
- Random-access reads. v3.0's meta/episodes/chunk-*/file_*.parquet carries per-episode offsets into both Parquet data and MP4 video streams. lerobot.datasets.LeRobotDataset.delta_timestamps no longer has to load whole-episode payloads.
- Codec stability. v3.0 locks codec parameters in info.json after the first episode, removing the v2.1 mid-dataset codec-skew bug.
- Workspace already ships lerobot. python/hal/pyproject.toml and the [dependency-groups] libero / metaworld groups already pin lerobot; bumping the floor to >=0.5.1 in the new openral_dataset package is non-invasive.
The upstream lerobot.scripts.convert_dataset_v21_to_v30.py script covers
upgrading a v2.1 dataset to v3.0 if one shows up; we do not re-implement
it.
3. Failure policy — persist all episodes with next.success flag
Every episode is written. Successful and failed rollouts both produce
rows; next.success is the boolean flag. Top-level
meta/info.json["metadata"]["dataset_success_rate"] reports the
ratio so downstream consumers can filter.
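A minimal sketch of how that ratio could be derived from per-episode flags. The helper name dataset_success_rate_metadata and the choice to report 0.0 for an empty dataset are assumptions made for illustration, not the sink's actual API; only the meta/info.json["metadata"]["dataset_success_rate"] key comes from this ADR.

```python
from typing import Sequence


def dataset_success_rate_metadata(episode_success: Sequence[bool]) -> dict:
    """Build the info.json "metadata" fragment from per-episode success flags.

    Hypothetical helper: every episode is counted, successful or failed.
    """
    n_episodes = len(episode_success)
    n_success = sum(1 for ok in episode_success if ok)
    # Assumption: an empty dataset reports 0.0 instead of raising.
    rate = n_success / n_episodes if n_episodes else 0.0
    return {"metadata": {"dataset_success_rate": rate}}


# Three successes out of four persisted episodes.
info = dataset_success_rate_metadata([True, False, True, True])
print(info["metadata"]["dataset_success_rate"])  # 0.75
```

A consumer that wants success-only training data would then filter rows on next.success, while the top-level ratio tells it up front how much data that filter removes.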
Reasons:
- Imitation literature. Policies trained on success + failure consistently beat policies trained on success-only when the failure distribution is unbiased (ALOHA, RT-2 ablations). Failures are signal, not noise.
- Reasoner training. The replanning ladder (§6.6) needs labelled failures to learn substitute / goal-replan decisions.
- Consent decoupling. The decision to persist is independent of the decision to publish. The consent gate lives at openral dataset push (PR5), not at recorder time. Local --dataset-out writes stay on disk under the user's control.
4. License posture — per-dataset string, default CC-BY-4.0
Each produced dataset carries a license field in
meta/info.json["metadata"]["license"], defaulting to CC-BY-4.0
(the LeRobot convention) and overridable via --dataset-license <spdx>
on openral sim run and openral dataset push. The package code stays
Apache-2.0; the data license is independent.
Datasets containing PII (human faces in camera frames, audio, biometric joint trajectories) MUST set a more restrictive license. The PR5 consent prompt enforces this disclosure.
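The default-plus-override rule can be sketched as below. This is an illustration under stated assumptions, not the shipped CLI logic: resolve_dataset_license and its contains_pii parameter are hypothetical, and "more restrictive" is approximated by simply refusing the CC-BY-4.0 default when PII is declared.

```python
from typing import Optional

DEFAULT_DATASET_LICENSE = "CC-BY-4.0"  # default stated in this ADR


def resolve_dataset_license(cli_license: Optional[str], contains_pii: bool) -> dict:
    """Return the info.json "metadata" fragment carrying the license string.

    Hypothetical helper. cli_license stands in for the value passed via
    --dataset-license; None means the flag was not given.
    """
    license_id = cli_license or DEFAULT_DATASET_LICENSE
    # Assumption: a PII-bearing dataset may not keep the permissive default.
    if contains_pii and license_id == DEFAULT_DATASET_LICENSE:
        raise ValueError(
            "dataset contains PII; pass --dataset-license with a license "
            "more restrictive than the CC-BY-4.0 default"
        )
    return {"metadata": {"license": license_id}}


print(resolve_dataset_license(None, contains_pii=False))
print(resolve_dataset_license("CC-BY-NC-4.0", contains_pii=True))
```

Keeping the license in dataset metadata, separate from the package's own Apache-2.0 license, is what lets the two vary independently.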
5. PR sizing — pre-approved exception to §7.2
Total LOC for the bridge series is estimated at 1500–2000 across production code + tests + docs, over §7.2's 800-LOC informal threshold. The series is split into six discrete PRs so each one is reviewable as the smallest viable change in dependency order:
| PR | Scope | LOC est. |
|---|---|---|
| PR0 | This ADR + repo-state-map block + roadmap flip | ~150 |
| PR1 | openral_dataset package: RolloutRecorder, LeRobotDatasetSink, schema_map; sim wiring | ~400 |
| PR2 | SensorRosPublisher + new openral_sensors_ros lifecycle package | ~300 |
| PR3 | Rosbag2Sink + Tick.msg / Episode.msg IDLs + hardware episode API | ~500 |
| PR4 | Rosbag2ToLeRobotConverter + openral dataset from-bag | ~300 |
| PR5 | openral dataset push + consent prompt + _hf_publish de-dup | ~200 |
Each PR includes its own tests, docs, and docs/METHODS.md updates
per §1.14. Sim-side (PR1) ships first because it has no
ROS / GStreamer / hardware dependencies and exercises the full
RolloutRecorder → DatasetSink → LeRobot v3.0 path end-to-end.
Consequences
- New workspace package openral_dataset at python/dataset/. Added to [tool.uv.sources] in the root pyproject.toml.
- lerobot>=0.5.1 is now a first-class dep of openral_dataset (lazy-imported at sink instantiation so the package stays importable on hosts without lerobot, with a typed ROSConfigError raised on construction without it).
- Two new OTel semconv constants in openral_observability/semconv.py: EVENT_EPISODE_CLOSED, DATASET_EPISODE_SUCCESS. The existing DATASET_REPO_ID / DATASET_EPISODE_IDX / DATASET_FRAME_IDX placeholders are now live.
- SimRunner accepts an optional recorder: RolloutRecorder | None kwarg. When set, the recorder is fed in parallel with the existing _EpisodeBuffer (additive: the buffer is not replaced; the per-episode video pipeline and RSkillEvalResult writer stay unchanged).
- HardwareRunner (PR3) gets explicit episode_start(task_string) / episode_end(*, success) methods. These also land as NotImplementedError defaults on InferenceRunnerBase so future runners must opt in.
- New ROS 2 package openral_sensors_ros (PR2) lifts the camera-topic publisher out of python/runner/.../backends/gstreamer/ros_tee.py and generalises it to non-GStreamer sources. The GStreamer zero-copy path is preserved; the new path is a parallel consumer for OpenCV / RealSense readers.
- New IDLs openral_msgs/Tick and openral_msgs/Episode (PR3) extend the existing packages/msgs/ package.
- openral dataset CLI subgroup (PR4 / PR5) with from-bag and push subcommands.
- tools/rskill_publisher.py is refactored to share _hf_publish helpers with openral dataset push (de-dup per §1.13).
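The opt-in episode API described above can be sketched as follows. Only the method names, their signatures, and the NotImplementedError default come from this ADR; the RecordingRunner subclass and its attributes are hypothetical and stand in for whatever a real runner would do.

```python
from typing import List, Optional, Tuple


class InferenceRunnerBase:
    """Base runner: episode hooks raise by default, so runners must opt in."""

    def episode_start(self, task_string: str) -> None:
        raise NotImplementedError(
            f"{type(self).__name__} does not support episode recording"
        )

    def episode_end(self, *, success: bool) -> None:
        raise NotImplementedError(
            f"{type(self).__name__} does not support episode recording"
        )


class RecordingRunner(InferenceRunnerBase):
    """Hypothetical runner that opts in by overriding both hooks."""

    def __init__(self) -> None:
        self.current_task: Optional[str] = None
        self.closed_episodes: List[Tuple[str, bool]] = []

    def episode_start(self, task_string: str) -> None:
        self.current_task = task_string

    def episode_end(self, *, success: bool) -> None:
        if self.current_task is None:
            raise RuntimeError("episode_end called without episode_start")
        # Failed episodes are kept too, matching the persist-all policy.
        self.closed_episodes.append((self.current_task, success))
        self.current_task = None


runner = RecordingRunner()
runner.episode_start("pick up the cube")
runner.episode_end(success=False)
print(runner.closed_episodes)  # [('pick up the cube', False)]
```

Making success keyword-only keeps call sites explicit about the outcome being recorded.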
Amendments — 2026-05-18 (post-merge revert)
After landing the PR series, a follow-up review concluded that the sink's first-frame state/action/camera shape derivation was the wrong contract: a buggy policy that emits wrong-shape actions would silently produce a malformed dataset. The bridge now requires every shape to be declared up-front:
- Hardware path: RobotDescription.observation_spec/action_spec and SensorSpec.intrinsics are authoritative.
- Sim path: per ADR-0007, the sim-specific contract lives on the rSkill manifest (state_contract.dim and the newly-added action_contract.dim); the camera shape comes from the scene config (SceneSpec.observation_height/width; sim renders all cameras at one resolution, often different from the physical sensor's intrinsics).
Concrete changes:
- New schema: openral_core.schemas.ActionContract mirrors StateContract. RSkillManifest.action_contract is the new optional field. Both contracts are required for any rSkill that wants bridge support.
- Sink reverted: LeRobotDatasetSink._create_dataset no longer takes a first_frame argument. The features dict is resolved at __init__ from the robot's specs + caller-provided overrides. Per-frame write_frame validates every shape strictly and raises ValueError on mismatch.
- CLI wires manifest contracts: openral sim run --dataset-out loads the rSkill manifest, reads state_contract.dim + action_contract.dim, and passes them as state_shape / action_dim overrides to LeRobotDatasetSink. The scene's observation_height/width flow through as camera_shape.
- All 19 rSkill manifests under rskills/ now declare state_contract + action_contract (act-aloha, ACT-LIBERO, diffusion-pusht, pi05-, smolvla-, xvla-libero, every RLDX variant, template).
- All 11 robot manifests under robots/ already had intrinsics on every camera-bearing sensor (audit confirmed). No manifest changes needed there.
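The declare-up-front contract can be illustrated with a small, dependency-free sketch. DeclaredShapes and validate_frame_shapes are hypothetical names, and frames are reduced to plain shape tuples; the real sink works on arrays and its internals are not shown in this ADR. What the sketch preserves is the behaviour stated above: shapes are fixed at construction and any per-frame mismatch raises ValueError.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple

Shape = Tuple[int, ...]


@dataclass(frozen=True)
class DeclaredShapes:
    """Shapes declared before the first frame (hypothetical container)."""

    state_shape: Shape
    action_shape: Shape
    camera_shape: Shape  # assumed layout: (height, width, channels)


def validate_frame_shapes(declared: DeclaredShapes, frame: Mapping[str, Shape]) -> None:
    """Raise ValueError if any per-frame shape differs from its declaration."""
    expected = {
        "state": declared.state_shape,
        "action": declared.action_shape,
        "camera": declared.camera_shape,
    }
    for key, want in expected.items():
        if key not in frame:
            raise ValueError(f"frame is missing {key!r}")
        got = tuple(frame[key])
        if got != want:
            raise ValueError(f"{key} shape mismatch: declared {want}, got {got}")


# State 8 / action 7 as in the LIBERO rows; the camera size is an assumed example.
declared = DeclaredShapes(state_shape=(8,), action_shape=(7,), camera_shape=(256, 256, 3))
validate_frame_shapes(declared, {"state": (8,), "action": (7,), "camera": (256, 256, 3)})

try:
    # A policy emitting an 8-dim action is rejected instead of being written.
    validate_frame_shapes(declared, {"state": (8,), "action": (8,), "camera": (256, 256, 3)})
except ValueError as err:
    print(err)
```

Failing on the first bad frame turns a silently malformed dataset into an immediate, attributable error.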
Smoke verification (real GPU + real VLA weights + real sim envs):
| Config | rSkill | State | Action | Result |
|---|---|---|---|---|
| PushT + Diffusion | diffusion-pusht | 2 | 2 | ✅ |
| Franka + LIBERO + pi05 | pi05-libero-nf4 | 8 | 7 | ✅ |
| Franka + LIBERO + SmolVLA | smolvla-libero | 8 | 7 | ✅ |
| Franka + LIBERO + xVLA | xvla-libero | 8 | 7 | ✅ |
| Franka + LIBERO + ACT | act-libero | 8 | 7 | ✅ |
| Sawyer + MetaWorld + SmolVLA | smolvla-metaworld | 4 | 4 | ✅ |
| Aloha + ACT (cube) | act-aloha | 14 | 14 | blocked: upstream dm_control × mujoco 3.8.0 |
| Aloha + ACT (insertion) | act-aloha-insertion | 14 | 14 | same |
| RoboCasa / GR1 / rldx1 sidecar | (skipped: separate venv / sidecar) | n/a | n/a | not in this verification scope |
The two Aloha failures are upstream env issues ('MjModel' object
has no attribute 'flex_bendingadr', raised by dm_control 1.0.41 reading
a mujoco 3.8.0 model). The bridge code path itself is correct: it is the
same path that ACT-LIBERO uses for action emission, and the multi-robot
bridge test exercises it against Aloha at the schema-binding layer
(passes 4/4).
The MetaWorld config (scenes/benchmark/metaworld_push.yaml)
was updated from declaring observation_height/width: 256 to
480 because the MetaWorld backend adapter does not honour the
scene-level resize (the docstring claims it does; the code does
not). The bridge's strict shape validation caught the mismatch — a
real bug that previously would have silently produced a malformed
dataset.
Verification
Each PR ships its own verification commands per the bridge plan; the ADR itself is verified by:
- mkdocs build --strict: markdown link integrity.
- docs/architecture/repo-state-map.html carries the new python/dataset/ block adjacent to python/observability/.
- docs/roadmap/index.md:131 flips from 🔵 planned to 🟡 in flight on PR0 acceptance, and to ✅ on PR5 merge.
PR-1 verification (canonical, sim-side):
uv run pytest python/dataset/tests -v
uv run openral sim run --config scenes/sim/libero_spatial.yaml \
--rskill rskills/mock-1 \
--n-episodes 2 \
--dataset-out /tmp/ds
uv run python -c "
from lerobot.datasets import LeRobotDataset
d = LeRobotDataset('/tmp/ds')
assert len(d) > 0
print(d.meta.info['metadata']['dataset_success_rate'])
"
Per CLAUDE.md §1.11: every test loads a real
RobotDescription.from_yaml("robots/so100_follower/robot.yaml") and
exercises a real lerobot.datasets.LeRobotDatasetWriter. lerobot is
behind the libero / metaworld dependency groups today; on hosts without
it, tests call pytest.skip with a typed reason rather than falling back
to a mock.
Amendments — 2026-06-08 (three-tier scene paths)
ADR-0041 split scenes/ into deploy/sim/benchmark tiers and stripped
rSkill names from filenames. Two updates in this ADR:
- The MetaWorld config reference moved from scenes/benchmarks/smolvla_metaworld_push.yaml to scenes/benchmark/metaworld_push.yaml (singular benchmark/, rSkill name dropped). MetaWorld also has no SimScene-tier sibling post-refactor: metaworld_push.yaml exists only at the BenchmarkScene tier.
- The regression-reproduction example above (uv run openral sim run --config … --dataset-out …) switched simulator from MetaWorld to LIBERO, pointing at scenes/sim/libero_spatial.yaml. Reason: --dataset-out is exclusive to openral sim run (a SimScene-tier command), and MetaWorld lacks a SimScene sibling. The MetaWorld-specific bug coverage referenced by this amendment is preserved in the test suite; the demo command just needs a SimScene to drive end-to-end. See ADR-0041 and scenes/README.md for the per-tier strict-CLI matrix.
Amendments — 2026-06-09 (per-frame OTel correlation — issue #109)
Closes the last deferred OTel piece from the 2026-05-17 amendment on
ADR-0010: per-frame (trace_id, span_id) on
written dataset rows. The reverse link (the openral.dataset.repo_id /
episode_idx / frame_idx span attributes) already shipped; this adds
the forward link so a row pivots back into its trace.
- Capture point. RolloutRecorder.record_frame reads the active rskill.tick span's context (get_current_span().get_span_context()) and stamps the 32-hex trace_id + 16-hex span_id onto the DatasetFrame. Capture happens here, not inside a sink, because the Rosbag2Sink defers its mcap write to a worker thread where the OTel context is no longer in scope; the ids must ride on the frame. Absent a valid span the fields degrade to "".
- Persistence. LeRobotDatasetSink declares trace_id / span_id as v3 string features (plain datasets.Value("string") parquet columns, readable without decoding episode videos). Rosbag2Sink writes the same ids into the /openral/tick record (the schema already declared the fields).
- Offline fidelity. record_frame takes optional trace_id / span_id overrides; Rosbag2ToLeRobotConverter passes each bag tick's original ids so an offline bag→LeRobot conversion preserves the source rollout's trace rather than stamping the convert run's own (empty) context.
- Pivot. openral_dataset.read_frame_trace(root, episode_idx, frame_idx) reads a row's (trace_id, span_id) straight from parquet, and openral replay --frame <repo_id>/<ep>/<frame> --dataset-root <dir> resolves that trace_id as the bag↔span join key. The openral_observability bag reader learned the raw-trace_id + span_id Tick convention (it previously assumed every jsonschema payload packed a full W3C traceparent in one field).
- Dataset- and episode-level pointers. Because trace_id is run-constant (every rskill.tick shares the one cli.command root trace) but span_id is per-tick, the sink also writes coarser pointers so a consumer need not scan the data parquet: the distinct trace_ids + n_traces land in meta/info.json["metadata"] (dataset-level), and a meta/openral_traces.json sidecar maps every episode_index → trace_id (episode-level; kept out of meta/episodes/*.parquet because v3 drops string features from its per-episode stats). The episode map is the granularity that matters for datasets accumulated across multiple runs (resume-append), where episodes carry different traces.
- Not done (separate PR). The optional SemVer-major trace_id → traceparent rename + tracestate on openral_msgs (with a tools/schema_migrator/ entry per CLAUDE.md §1.6) is out of scope and deliberately deferred.
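The id formatting and the coarser pointers described above can be sketched without any OTel dependency. Both function names are hypothetical. The integer inputs stand in for the trace_id / span_id fields of the span context that record_frame reads, validity is assumed to mean "both ids non-zero", and the sidecar layout (a flat JSON object keyed by episode index) is an assumption; only the trace_ids / n_traces keys and the meta/openral_traces.json file name come from this ADR.

```python
import json
from typing import Mapping, Tuple


def format_trace_ids(trace_id: int, span_id: int) -> Tuple[str, str]:
    """Format integer ids as 32-hex / 16-hex strings, or ("", "") if invalid.

    Assumption: an all-zero trace or span id marks an invalid span context.
    """
    if trace_id == 0 or span_id == 0:
        return "", ""
    return format(trace_id, "032x"), format(span_id, "016x")


def build_trace_pointers(episode_traces: Mapping[int, str]) -> Tuple[dict, str]:
    """Return (info.json metadata fragment, openral_traces.json text)."""
    # Episodes recorded without a valid span carry "" and are not counted.
    distinct = sorted({t for t in episode_traces.values() if t})
    metadata = {"trace_ids": distinct, "n_traces": len(distinct)}
    # JSON object keys must be strings, so episode indices are stringified.
    sidecar = json.dumps(
        {str(ep): t for ep, t in sorted(episode_traces.items())}, indent=2
    )
    return metadata, sidecar


trace_hex, span_hex = format_trace_ids(
    0x4BF92F3577B34DA6A3CE929D0E0E4736, 0x00F067AA0BA902B7
)
print(trace_hex, span_hex)

# Two runs appended into one dataset: episodes 0-1 share a trace, episode 2 differs.
metadata, sidecar = build_trace_pointers(
    {0: trace_hex, 1: trace_hex, 2: "0af7651916cd43dd8448eb211c80319c"}
)
print(metadata["n_traces"])  # 2
```

The per-episode map is what keeps resume-appended datasets navigable: a consumer can find the trace for an episode without scanning the data parquet.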