
Sim Environments

Scene YAMLs under scenes/ follow the three-tier hierarchy introduced by ADR-0041: DeployScene ⊆ SimScene ⊆ BenchmarkScene. Each tier has its own directory, its own loader-strictness gate, and its own CLI consumer. The conceptual overview, decision matrix, authoring guide, and per-backend scene.id catalogue all live in the in-tree scenes/README.md; this page is the per-file catalogue — one row per YAML.
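
The directory-to-tier-to-CLI pairing described above can be written as a plain lookup. The sketch below is illustrative only: `tier_for` and `TIERS` are hypothetical names, not part of the openral package, and only the directory names and commands come from this page.

```python
from pathlib import PurePosixPath

# Directory under scenes/ -> (tier name, CLI command that consumes it).
# The values are the ones listed on this page; the helper is hypothetical.
TIERS = {
    "deploy": ("DeployScene", "openral deploy sim"),
    "sim": ("SimScene", "openral sim run"),
    "benchmark": ("BenchmarkScene", "openral benchmark scene"),
}


def tier_for(config_path: str) -> tuple[str, str]:
    """Return (tier, consuming command) for a path like scenes/sim/x.yaml."""
    parts = PurePosixPath(config_path).parts
    if len(parts) < 3 or parts[-3] != "scenes" or parts[-2] not in TIERS:
        raise ValueError(f"not a scene YAML under scenes/<tier>/: {config_path}")
    return TIERS[parts[-2]]


print(tier_for("scenes/sim/libero_spatial.yaml"))
# ('SimScene', 'openral sim run')
```

Suite YAMLs under benchmarks/ are deliberately outside this mapping, so the helper raises for them.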

Scene dependencies are installed on first use, after an install prompt. Set OPENRAL_AUTO_INSTALL_DEPS=1 to bypass the prompt in CI.

Quick CLI

# DeployScene — env-only playground (reasoner picks the rSkill at runtime).
openral deploy sim --config scenes/deploy/openarm_tabletop.yaml

# SimScene — single rollout; supply the policy at the CLI.
openral sim run --config scenes/sim/libero_spatial.yaml --rskill smolvla-libero

# BenchmarkScene — paper-comparable single-scene eval; writes
# rskills/<vla>/eval/<scene_id>.json with reproduced_locally=true.
openral benchmark scene --config scenes/benchmark/libero_spatial.yaml \
                        --rskill smolvla-libero

# Benchmark suite — multi-scene aggregate (lives in benchmarks/, not scenes/).
openral benchmark run --suite libero_spatial --rskill smolvla-libero

Override flags (--task, --instruction, --max-steps, --n-episodes, and --robot for free-axis scenes) work on every tier. The one exception is the suite runner, openral benchmark run, which intentionally rejects them to guarantee suite reproducibility. See scenes/README.md for the full override matrix.
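
To make the override rule concrete, here is a minimal sketch of the check. It is an assumption about the shape of the rule, not the CLI's actual argument parser: `rejected_overrides` is a hypothetical helper, and it does not model the free-axis restriction on --robot.

```python
# Override flags named on this page. The checking logic is a hypothetical
# sketch of the rule, not the real openral argument parser.
OVERRIDE_FLAGS = {"--task", "--instruction", "--max-steps", "--n-episodes", "--robot"}


def rejected_overrides(subcommand: tuple[str, ...], argv: list[str]) -> list[str]:
    """Return the override flags in argv that the given subcommand refuses."""
    if subcommand != ("benchmark", "run"):
        # deploy sim, sim run and benchmark scene accept overrides.
        return []
    # Normalise "--flag=value" to "--flag" before comparing.
    names = [arg.split("=", 1)[0] for arg in argv]
    return [name for name in names if name in OVERRIDE_FLAGS]


print(rejected_overrides(("benchmark", "run"), ["--suite", "libero_spatial", "--n-episodes=1"]))
# ['--n-episodes']
```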

DeployScene catalogue (scenes/deploy/)

Env-only "robot + scene" pins. No task: block, no eval; the runtime reasoner picks the rSkill. Consumed by openral deploy sim.

| Config | Fixed / declared robot | scene.id | Backend | Use |
| --- | --- | --- | --- | --- |
| libero_pnp.yaml | franka_panda (scene-fixed) | libero_spatial | LIBERO (robosuite + MuJoCo) | Boot LIBERO in deploy mode so a reasoner can issue arbitrary pick-and-place commands |
| openarm_tabletop.yaml | openarm (free-axis) | openarm_tabletop_pnp | Custom MJCF | OpenArm bimanual tabletop sandbox; default top camera matches the mddoai dataset POV |
| robocasa_pnp.yaml | panda_mobile (scene-fixed) | robocasa/PickPlaceCounterToCabinet | RoboCasa (MuJoCo) | Mobile-base kitchen pick-and-place sandbox; reasoner-driven |
| so101_box.yaml | so101_follower (scene-fixed) | so101_box | Custom MJCF | 100×61.5×75 cm box arena + OAK-D Pro overhead + wrist camera; deploy sandbox |

SimScene catalogue (scenes/sim/)

DeployScene + a single task: block. One CLI invocation, one or more EpisodeResults; sized for ad-hoc development and smoke tests. The policy is supplied at the CLI via --rskill <name> — scene YAMLs no longer pin a VLA. Consumed by openral sim run.

| Config | Fixed / declared robot | scene.id | task.id | Notes |
| --- | --- | --- | --- | --- |
| franka_libero_pnp.yaml | franka_panda (scene-fixed) | franka_libero_custom_bddl | custom_milk/0 | Custom BDDL (franka_libero_pnp.bddl) → robosuite OffScreenRenderEnv; pick milk into basket |
| libero_spatial.yaml | franka_panda (scene-fixed) | libero_spatial | libero_spatial/0 | LIBERO-Spatial smoke; ad-hoc sibling of scenes/benchmark/libero_spatial.yaml |
| openarm_tabletop.yaml | openarm (free-axis) | openarm_tabletop_pnp | openarm/pnp_cube_to_drawer | Bimanual cube-to-drawer; mirrors the mddoai dataset POV |
| robocasa_gr1_pnp_cup_to_drawer.yaml | gr1 (scene-fixed) | robocasa/gr1/PnPCupToDrawerClose | robocasa/gr1/PnPCupToDrawerClose/0 | RoboCasa GR1 humanoid tabletop pnp |
| robocasa_panda_mobile_kitchen.yaml | panda_mobile (scene-fixed) | robocasa/NavigateKitchen | robocasa/NavigateKitchen/0 | Mobile-base kitchen navigation; deploy sim Nav2 graph compatible |
| robocasa_pnp.yaml | panda_mobile (scene-fixed) | robocasa/PickPlaceCounterToCabinet | robocasa/PickPlaceCounterToCabinet/0 | RoboCasa kitchen pnp smoke |
| so101_tube_insertion.yaml | so101_follower (scene-fixed) | so101_box | so101_box/tube_insertion | Box-arena tube-insertion smoke; geometry/sensors/spawn ranges configurable via BoxSceneOptions |
| tabletop_cube_push.yaml | so101_follower (free-axis default; pass --robot to override) | tabletop_push | tabletop_push/push_to_goal | Robot-agnostic cube push-to-goal (ADR-0033) |

BenchmarkScene catalogue (scenes/benchmark/)

SimScene plus a required metadata: block (BenchmarkMetadata: paper URL and honest_scope) and non-None seed and n_episodes. The shipped values match the canonical paper protocol; running openral benchmark scene against one of these writes rskills/<vla>/eval/<scene_id>.json with reproduced_locally=true. Consumed by openral benchmark scene. Most are also aggregated into a multi-scene suite (bare list[BenchmarkScene] per ADR-0042) under benchmarks/.

| Config | Fixed / declared robot | scene.id | task.id | n_episodes | Paper |
| --- | --- | --- | --- | --- | --- |
| aloha_insertion.yaml | aloha_bimanual (scene-fixed) | aloha_insertion | aloha_insertion/0 | 200 | ALOHA / ACT |
| aloha_transfer_cube.yaml | aloha_bimanual (scene-fixed) | aloha_transfer_cube | aloha_transfer_cube/0 | 200 | ALOHA / ACT |
| libero_spatial.yaml | franka_panda (scene-fixed) | libero_spatial | libero_spatial/0 | 500 | LIBERO |
| maniskill_pick_cube.yaml | franka_panda (free-axis) | maniskill3 | maniskill3/PickCube-v1 | 500 | ManiSkill3 |
| metaworld_push.yaml | sawyer (scene-fixed) | metaworld | metaworld/push | 200 | MetaWorld MT50 |
| pusht.yaml | pusht_2d (scene-fixed; 2-D pymunk) | pusht | pusht/0 | 200 | Diffusion Policy |
| widowx_carrot_on_plate.yaml | widowx (scene-fixed) | simpler_env | simpler_env/widowx_carrot_on_plate | 200 | SimplerEnv |

Each file ships its n_episodes and seed fields at the paper-canonical values (the table lists only n_episodes). Overriding --n-episodes on openral benchmark scene is allowed, which is useful for cheap smoke runs that don't claim paper reproduction; the resulting RSkillEvalResult records the lowered count.
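
The way each tier adds requirements to the one before it can be sketched with plain dataclasses. This is not the loader's real schema: every field name other than seed, n_episodes, metadata and honest_scope is an assumption made for illustration, and the task: block is reduced to a single id.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BenchmarkMetadata:
    paper_url: str      # field name is an assumption
    honest_scope: str


@dataclass
class DeployScene:
    scene_id: str       # env-only pin: robot + scene, no task
    robot_id: str


@dataclass
class SimScene(DeployScene):
    task_id: str = ""   # stands in for the single task: block
    seed: Optional[int] = None
    n_episodes: Optional[int] = None

    def __post_init__(self) -> None:
        if not self.task_id:
            raise ValueError("SimScene requires a task")


@dataclass
class BenchmarkScene(SimScene):
    metadata: Optional[BenchmarkMetadata] = None

    def __post_init__(self) -> None:
        super().__post_init__()
        # Strictest gate: metadata, seed and n_episodes must all be set.
        missing = [n for n in ("metadata", "seed", "n_episodes") if getattr(self, n) is None]
        if missing:
            raise ValueError(f"BenchmarkScene requires non-None {', '.join(missing)}")
```

Inheritance is used here only because it makes the "previous tier plus extra requirements" relationship visible; the real models may be structured differently.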

Multi-scene aggregations (e.g. all 10 LIBERO-Spatial tasks, all 50 MetaWorld tasks, all 4 SimplerEnv WidowX tasks) live in benchmarks/. A suite YAML is a bare list[BenchmarkScene] at the YAML root (ADR-0042); suite-level invariants (uniform robot_id, seed, n_episodes, and full metadata block) are enforced by openral_core.raise_on_invalid_suite.
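
The suite-level invariants listed above amount to a uniformity check across the list. The sketch below illustrates that idea under stated assumptions: it is not the implementation or signature of openral_core.raise_on_invalid_suite, and `Scene` and `check_suite` are hypothetical stand-ins carrying only the fields the check needs.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Scene:
    """Stand-in for BenchmarkScene with only the fields the check needs."""
    scene_id: str
    robot_id: str
    seed: Optional[int]
    n_episodes: Optional[int]
    metadata: Optional[dict]


def check_suite(scenes: Sequence[Scene]) -> None:
    """Raise ValueError if the suite breaks a suite-level invariant."""
    for field in ("robot_id", "seed", "n_episodes"):
        values = {getattr(scene, field) for scene in scenes}
        if len(values) > 1:
            found = sorted(map(str, values))
            raise ValueError(f"{field} must be uniform across the suite, got {found}")
    no_meta = [scene.scene_id for scene in scenes if not scene.metadata]
    if no_meta:
        raise ValueError(f"scenes missing a metadata block: {no_meta}")
```

A suite mixing two robots, two seeds, or two episode counts fails on the first mismatched field; a scene without metadata fails the second check.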

Justfile shortcuts

The repo's Justfile groups sim-* recipes by which CLI they drive:

# SimScene-tier — `openral sim run --save-video` (debug smoke; no eval JSON).
just sim-libero                     # SmolVLA × LIBERO        (GPU + MUJOCO_GL)
just sim-xvla-libero                # xVLA × LIBERO           (Florence-2)
just sim-pi05-libero                # π0.5 × LIBERO           (≥8 GB VRAM)
just sim-act-libero                 # ACT × LIBERO            (paper protocol)
just sim-pi05-robocasa              # π0.5 × RoboCasa kitchen (≥8 GB VRAM)

# BenchmarkScene-tier — `openral benchmark scene --no-update-manifest \
#     --n-episodes 1 --save-dir` (paper protocol, single rollout for smoke).
just sim-metaworld --task metaworld/reach-v3
just sim-maniskill3                 # SAPIEN-backed PickCube-v1
just sim-simpler-widowx             # RLDX-1 × WidowX carrot-on-plate
just sim-act-aloha                  # ACT × gym-aloha bimanual cube-transfer
just sim-diffusion-pusht            # Diffusion Policy × gym-pusht (CPU)
just sim-custom                     # ACT × gym-aloha insertion (rskills/act-aloha-insertion)

just sim-audit runs tools/audit_sim_configs.py over the per-tier catalogue and reports row-by-row latency + success metrics. just sim-eval runs the full benchmark suites end-to-end.

See also

  • scenes/README.md — conceptual hierarchy, decision matrix, override flags, scene-id / fixed-robot tables, base_pose for free-axis scenes, rSkill compatibility, live MuJoCo viewer, policy_extras performance knobs.
  • Tutorial — Create a sim environment — long-form YAML authoring guide (new scene adapter, new robot manifest, custom policy).
  • ADR-0002 — original scene/eval design.
  • ADR-0041 — three-tier hierarchy (DeployScene ⊆ SimScene ⊆ BenchmarkScene) + loader strictness.
  • ADR-0009 — separation of sim run (debug) and benchmark * (paper-comparable eval).