Sim Environments

Scene YAMLs under scenes/ follow the three-tier hierarchy introduced by ADR-0041: DeployScene ⊆ SimScene ⊆ BenchmarkScene. Each tier has its own directory, its own loader-strictness gate, and its own CLI consumer. The conceptual overview, decision matrix, authoring guide, and per-backend scene.id catalogue all live in the in-tree scenes/README.md; this page is the per-file catalogue — one row per YAML.
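The directory-to-tier mapping described above can be sketched as a small lookup. This is an illustrative helper only: `TIERS` and `tier_for` are hypothetical names, not part of the openral codebase. The tier names and CLI consumers are the ones listed on this page.

```python
from pathlib import PurePosixPath

# Directory under scenes/ -> (tier name, CLI that consumes it), as listed on this page.
# TIERS and tier_for are hypothetical names used for illustration only.
TIERS = {
    "deploy": ("DeployScene", "openral deploy sim"),
    "sim": ("SimScene", "openral sim run"),
    "benchmark": ("BenchmarkScene", "openral benchmark scene"),
}


def tier_for(config_path):
    """Return (tier, consumer CLI) for a path such as scenes/sim/libero_spatial.yaml."""
    parts = PurePosixPath(config_path).parts
    # Expect .../scenes/<tier-dir>/<file>.yaml
    if len(parts) < 3 or parts[-3] != "scenes" or parts[-2] not in TIERS:
        raise ValueError(f"not a scenes/<tier>/<file>.yaml path: {config_path}")
    return TIERS[parts[-2]]
```

For example, `tier_for("scenes/benchmark/pusht.yaml")` yields the BenchmarkScene tier and its `openral benchmark scene` consumer.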

Scene dependencies are auto-installed on first use. Bypass the install prompt in CI with OPENRAL_AUTO_INSTALL_DEPS=1.
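When a CI job launches the CLI from a Python wrapper, the variable can be set on the child environment rather than globally. A minimal sketch, assuming the variable is read from the process environment at launch; `ci_env` is a hypothetical helper name.

```python
import os


def ci_env(base=None):
    """Return a copy of the environment with the install prompt bypassed.

    ci_env is a hypothetical helper; only the variable name comes from this page.
    """
    env = dict(os.environ if base is None else base)
    env["OPENRAL_AUTO_INSTALL_DEPS"] = "1"
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run` when invoking an `openral` command.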

Quick CLI

# DeployScene — env-only playground (reasoner picks the rSkill at runtime).
openral deploy sim --config scenes/deploy/openarm_tabletop.yaml

# SimScene — single rollout; supply the policy at the CLI.
openral sim run --config scenes/sim/libero_spatial.yaml --rskill smolvla-libero

# BenchmarkScene — paper-comparable single-scene eval; writes
# rskills/<vla>/eval/<scene_id>.json with reproduced_locally=true.
openral benchmark scene --config scenes/benchmark/libero_spatial.yaml \
                        --rskill smolvla-libero

# Benchmark suite — multi-scene aggregate (lives in benchmarks/, not scenes/).
openral benchmark run --suite libero_spatial --rskill smolvla-libero

Override flags (--task, --instruction, --max-steps, --n-episodes, and --robot for free-axis scenes) work on every tier. The one exception is the suite command openral benchmark run, which intentionally rejects them to guarantee suite reproducibility. See scenes/README.md for the full override matrix.
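The override rule can be mirrored in a wrapper script that assembles commands before launching them. The sketch below is not how the CLI enforces the rule internally; `build_command` and `OVERRIDE_FLAGS` are hypothetical names, and only the flags and subcommands come from this page.

```python
import shlex

# Override flags named on this page.
OVERRIDE_FLAGS = {"--task", "--instruction", "--max-steps", "--n-episodes", "--robot"}


def build_command(subcommand, base_args, overrides=None):
    """Assemble an openral command line, refusing overrides on `benchmark run`.

    Hypothetical wrapper-side check; the real CLI does its own validation.
    """
    overrides = overrides or {}
    unknown = set(overrides) - OVERRIDE_FLAGS
    if unknown:
        raise ValueError(f"not an override flag: {sorted(unknown)}")
    if tuple(subcommand) == ("benchmark", "run") and overrides:
        raise ValueError("openral benchmark run rejects override flags")
    argv = ["openral", *subcommand]
    for flag, value in {**base_args, **overrides}.items():
        argv += [flag, str(value)]
    return shlex.join(argv)
```

Calling it with `("sim", "run")`, the `--config` and `--rskill` arguments from the Quick CLI examples, and `{"--n-episodes": 1}` produces a single-episode smoke command, while any override passed with `("benchmark", "run")` raises `ValueError`.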

DeployScene catalogue (`scenes/deploy/`)

Env-only "robot + scene" pins. No task: block, no eval; the runtime reasoner picks the rSkill. Consumed by openral deploy sim.

Config	Fixed / declared robot	`scene.id`	Backend	Use
`libero_pnp.yaml`	`franka_panda` (scene-fixed)	`libero_spatial`	LIBERO (robosuite + MuJoCo)	Boot LIBERO in deploy mode so a reasoner can issue arbitrary pick-and-place commands
`openarm_tabletop.yaml`	`openarm` (free-axis)	`openarm_tabletop_pnp`	Custom MJCF	OpenArm bimanual tabletop sandbox; default top camera matches the mddoai dataset POV
`robocasa_pnp.yaml`	`panda_mobile` (scene-fixed)	`robocasa/PickPlaceCounterToCabinet`	RoboCasa (MuJoCo)	Mobile-base kitchen pick-and-place sandbox; reasoner-driven
`so101_box.yaml`	`so101_follower` (scene-fixed)	`so101_box`	Custom MJCF	100×61.5×75 cm box arena + OAK-D Pro overhead + wrist camera; deploy sandbox

SimScene catalogue (`scenes/sim/`)

DeployScene + a single task: block. One CLI invocation, one or more EpisodeResults; sized for ad-hoc development and smoke tests. The policy is supplied at the CLI via --rskill <name> — scene YAMLs no longer pin a VLA. Consumed by openral sim run.

Config	Fixed / declared robot	`scene.id`	`task.id`	Notes
`franka_libero_pnp.yaml`	`franka_panda` (scene-fixed)	`franka_libero_custom_bddl`	`custom_milk/0`	Custom BDDL (`franka_libero_pnp.bddl`) → robosuite `OffScreenRenderEnv`; pick milk into basket
`libero_spatial.yaml`	`franka_panda` (scene-fixed)	`libero_spatial`	`libero_spatial/0`	LIBERO-Spatial smoke; ad-hoc sibling of `scenes/benchmark/libero_spatial.yaml`
`openarm_tabletop.yaml`	`openarm` (free-axis)	`openarm_tabletop_pnp`	`openarm/pnp_cube_to_drawer`	Bimanual cube-to-drawer; mirrors the mddoai dataset POV
`robocasa_gr1_pnp_cup_to_drawer.yaml`	`gr1` (scene-fixed)	`robocasa/gr1/PnPCupToDrawerClose`	`robocasa/gr1/PnPCupToDrawerClose/0`	RoboCasa GR1 humanoid tabletop pnp
`robocasa_panda_mobile_kitchen.yaml`	`panda_mobile` (scene-fixed)	`robocasa/NavigateKitchen`	`robocasa/NavigateKitchen/0`	Mobile-base kitchen navigation; `deploy sim` Nav2 graph compatible
`robocasa_pnp.yaml`	`panda_mobile` (scene-fixed)	`robocasa/PickPlaceCounterToCabinet`	`robocasa/PickPlaceCounterToCabinet/0`	RoboCasa kitchen pnp smoke
`so101_tube_insertion.yaml`	`so101_follower` (scene-fixed)	`so101_box`	`so101_box/tube_insertion`	Box-arena tube-insertion smoke; geometry/sensors/spawn ranges configurable via `BoxSceneOptions`
`tabletop_cube_push.yaml`	`so101_follower` (free-axis default; pass `--robot` to override)	`tabletop_push`	`tabletop_push/push_to_goal`	Robot-agnostic cube push-to-goal (ADR-0033)

BenchmarkScene catalogue (`scenes/benchmark/`)

SimScene + required metadata: a BenchmarkMetadata block (paper URL + honest_scope) plus non-None seed and n_episodes. The shipped values match the canonical paper protocol; running openral benchmark scene against one of these files writes rskills/<vla>/eval/<scene_id>.json with reproduced_locally=true. Consumed by openral benchmark scene. Most are also aggregated into a multi-scene suite (a bare list[BenchmarkScene] per ADR-0042) under benchmarks/.
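The extra requirements of this tier can be illustrated with a small check. This is a sketch under stated assumptions, not the loader's actual schema: the class names, the `paper_url` field name, and `missing_benchmark_fields` are all hypothetical; only the required items (paper URL, honest_scope, non-None seed and n_episodes) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MetadataSketch:
    paper_url: str      # assumed field name for the paper URL
    honest_scope: str


@dataclass
class BenchmarkSceneSketch:
    scene_id: str
    task_id: str
    metadata: Optional[MetadataSketch] = None
    seed: Optional[int] = None
    n_episodes: Optional[int] = None


def missing_benchmark_fields(scene: BenchmarkSceneSketch) -> List[str]:
    """List the benchmark-tier requirements that the scene does not satisfy."""
    missing = []
    if scene.metadata is None:
        missing.append("metadata")
    else:
        if not scene.metadata.paper_url:
            missing.append("metadata.paper_url")
        if not scene.metadata.honest_scope:
            missing.append("metadata.honest_scope")
    if scene.seed is None:
        missing.append("seed")
    if scene.n_episodes is None:
        missing.append("n_episodes")
    return missing
```

A scene that would pass at the SimScene tier (no metadata, no seed) reports every missing item here, which is the practical difference between the two tiers.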

Config	Fixed / declared robot	`scene.id`	`task.id`	`n_episodes`	Paper
`aloha_insertion.yaml`	`aloha_bimanual` (scene-fixed)	`aloha_insertion`	`aloha_insertion/0`	200	ALOHA / ACT
`aloha_transfer_cube.yaml`	`aloha_bimanual` (scene-fixed)	`aloha_transfer_cube`	`aloha_transfer_cube/0`	200	ALOHA / ACT
`libero_spatial.yaml`	`franka_panda` (scene-fixed)	`libero_spatial`	`libero_spatial/0`	500	LIBERO
`maniskill_pick_cube.yaml`	`franka_panda` (free-axis)	`maniskill3`	`maniskill3/PickCube-v1`	500	ManiSkill3
`metaworld_push.yaml`	`sawyer` (scene-fixed)	`metaworld`	`metaworld/push`	200	MetaWorld MT50
`pusht.yaml`	`pusht_2d` (scene-fixed; 2-D pymunk)	`pusht`	`pusht/0`	200	Diffusion Policy
`widowx_carrot_on_plate.yaml`	`widowx` (scene-fixed)	`simpler_env`	`simpler_env/widowx_carrot_on_plate`	200	SimplerEnv

Each file ships n_episodes and seed at the paper-canonical value (the table above lists only n_episodes). Overriding --n-episodes on openral benchmark scene is allowed, which is useful for cheap smoke runs that do not claim paper reproduction; the resulting RSkillEvalResult records the lowered count.

Multi-scene aggregations (e.g. all 10 LIBERO-Spatial tasks, all 50 MetaWorld tasks, all 4 SimplerEnv WidowX tasks) live in benchmarks/. A suite YAML is a bare list[BenchmarkScene] at the YAML root (ADR-0042); suite-level invariants (uniform robot_id, seed, n_episodes, and full metadata block) are enforced by openral_core.raise_on_invalid_suite.
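The suite-level invariants can be pictured with a short uniformity check over the parsed scene list. This is not the implementation of openral_core.raise_on_invalid_suite; `suite_violations` is a hypothetical function, and treating robot_id, seed, n_episodes, and metadata as top-level keys of each scene mapping is an assumption made for the example.

```python
def suite_violations(scenes):
    """Return human-readable violations of the suite invariants.

    Hypothetical sketch: each scene is a dict with top-level robot_id, seed,
    n_episodes, and metadata keys (an assumed layout).
    """
    problems = []
    # robot_id, seed, and n_episodes must be identical across the suite.
    for key in ("robot_id", "seed", "n_episodes"):
        values = {scene.get(key) for scene in scenes}
        if len(values) > 1:
            problems.append(f"{key} is not uniform: {sorted(map(str, values))}")
    # Every scene must carry a metadata block.
    for index, scene in enumerate(scenes):
        if not scene.get("metadata"):
            problems.append(f"scene {index} has no metadata block")
    return problems
```

An empty result means the list satisfies the invariants checked here; a raising variant would turn a non-empty result into an exception.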

Justfile shortcuts

The repo's Justfile groups sim-* recipes by which CLI they drive:

# SimScene-tier — `openral sim run --save-video` (debug smoke; no eval JSON).
just sim-libero                     # SmolVLA × LIBERO        (GPU + MUJOCO_GL)
just sim-xvla-libero                # xVLA × LIBERO           (Florence-2)
just sim-pi05-libero                # π0.5 × LIBERO           (≥8 GB VRAM)
just sim-act-libero                 # ACT × LIBERO            (paper protocol)
just sim-pi05-robocasa              # π0.5 × RoboCasa kitchen (≥8 GB VRAM)

# BenchmarkScene-tier — `openral benchmark scene --no-update-manifest \
#     --n-episodes 1 --save-dir` (paper protocol, single rollout for smoke).
just sim-metaworld --task metaworld/reach-v3
just sim-maniskill3                 # SAPIEN-backed PickCube-v1
just sim-simpler-widowx             # RLDX-1 × WidowX carrot-on-plate
just sim-act-aloha                  # ACT × gym-aloha bimanual cube-transfer
just sim-diffusion-pusht            # Diffusion Policy × gym-pusht (CPU)
just sim-custom                     # ACT × gym-aloha insertion (rskills/act-aloha-insertion)

just sim-audit runs tools/audit_sim_configs.py over the per-tier catalogue and reports row-by-row latency + success metrics. just sim-eval runs the full benchmark suites end-to-end.
