Layer 3 — World State
Part of the OpenRAL public-symbol inventory. Hand-curated;
(LNN)markers are refreshed bytools/refresh_methods_linenos.py.
python/state_adapter/src/openral_state_adapter/
ADR-0027 — layout adapter registry. Assembles per-checkpoint state vectors from manifest-declared StateContractBindings + live /tf + live /joint_states. Pure-Python, rclpy-free; the skill_runner wraps tf2_ros.Buffer.lookup_transform into the TfLookup Protocol at call time.
@dataclass TransformView— rclpy-free view of ageometry_msgs/TransformStamped. Fields:position: tuple[float, float, float],quaternion_xyzw: tuple[float, float, float, float].Protocol TfLookup—__call__(target_frame: str, source_frame: str) -> TransformView. Implementations MUST raise on missing transforms — assembler never silently substitutes identity.Protocol Assembler—__call__(bindings: StateContractBindings, joint_positions: dict[str, float], tf_lookup: TfLookup) -> NDArray[float32]. Pure-function signature every layout file implements.register(layout: StateLayout, assembler: Assembler) -> None— Bindassemblertolayoutin the package-global registry. Layout files call this at module load.registered_layouts() -> frozenset[StateLayout]— Snapshot of currently-registered layouts. Reasoner palette filter consults this to admit wrapped-task-space rSkills.assemble_state(layout, bindings, joint_positions, tf_lookup) -> NDArray[float32]— Look up the assembler forlayoutand run it. RaisesROSConfigErrorwhen no assembler is registered.assemble_human300_16d(bindings, joint_positions, tf_lookup) -> NDArray[float32]— RoboCasa365 / pi05_pretrain_human300 layout:[base_to_eef.pos(3), base_to_eef.quat(4), world_to_base.pos(3), world_to_base.quat(4), gripper_qpos(2)] = 16. Registered as"human300_16d"at import.
python/world_state/src/openral_world_state/aggregator.py
WorldStateAggregator — tf2-aware, injectable snapshot producer.
class WorldStateAggregator— Aggregates sensor data and producesWorldStatesnapshots. (L107)__init__(description, *, staleness_limit_s=DEFAULT_STALENESS_S, clock_fn=None)(L160)update_joint_state(state) -> None— Record a fresh joint reading. (L235)update_image(sensor_name, topic, stamp_ns) -> None— Record image arrival. (L246)update_ee_pose(ee_name, pose) -> None— Record EE pose from tf2. (L302)update_base_pose(pose, twist=None) -> None— Record base pose (and optional twist). (L316)update_battery(pct) -> None— Record battery %. (L333)set_error(component, status='error') -> None— Latch a forced diagnostic. (L357)clear_error(component) -> None— Remove a forced diagnostic. (L373)snapshot() -> WorldState— Produce a typed snapshot (hot path, acquires lock). Emits aworld_state.snapshotOTel span withopenral.world_state.components_stale+openral.world_state.has_latched_errorattributes, firesopenral.event.staleness_latched/openral.event.error_latchedevents on first transition, records per-componentopenral.world_state.staleness_mshistogram +openral.world_state.components_staleup-down counter. (L384)update_detected_objects(objects: list[DetectedObject]) -> None— Replace the remembered detected-object set (ADR-0035). Thread-safe; the nextsnapshot()call returns the new list. Called by the world-state lifecycle node's memory tick. (L342)_emit_snapshot_telemetry(span, diag, ages_ms) -> None— Internal: lift the snapshot diagnostics onto the OTel span + meter instruments. (L501)
python/world_state/src/openral_world_state/spatial_memory.py
SpatialMemory — ADR-0038 Phase 2 persistent object-centric scene-graph memory (advisory; never a safety input).
compute_approach_viewpoint(target, *, standoff_m=DEFAULT_STANDOFF_M, camera_frame_id=DEFAULT_CAMERA_FRAME, approach_from=None) -> ApproachViewpoint— Standoff posestandoff_mfromtarget(on theapproach_fromside, else −X), yawed so the gripper camera faces it. (ADR-0038)class SpatialMemory— AccumulatesWorldState.detected_objectsinto a queryableSceneGraph; pure-Python (typed BFS for traversal, JSON persistence — no graph-engine dep). (ADR-0038)__init__(*, assoc_distance_m=DEFAULT_ASSOC_DISTANCE_M, default_standoff_m=DEFAULT_STANDOFF_M, camera_frame_id=DEFAULT_CAMERA_FRAME, map_frame=DEFAULT_MAP_FRAME, embedder=None, min_text_similarity=DEFAULT_MIN_TEXT_SIMILARITY)—embedder(ADR-0038 §5, optionalTextEmbedder) enables open-vocab matching: object labels are embedded on creation and free-text queries match by CLIP cosine ≥min_text_similarity.upsert_node(node) -> None/add_edge(src, dst, kind) -> None— Mutators (edge endpoints must exist).ingest_detected_objects(objects, *, now_ns) -> list[str]— Fold detections into object nodes; instance association bytrack_idelse label+proximity (assoc_distance_m); updates pose/last_seen/observation_count, keeps higher confidence; embeds the label when an embedder is set. Returns touched node ids.recall_object(query: RecallObjectQuery, *, now_ns) -> RecallObjectResult— Rank object nodes bymax(label-match, CLIP-cosine)× confidence (proximity/recency tiebreak); an embedding hit needs cosine ≥min_text_similarity, a label substring always qualifies. Each match carries anApproachViewpoint+inside_container_id. Empty result = unknown (caller may raiseROSObjectNotInMemory).resolve_place(query: ResolvePlaceQuery, *, from_node_id=None) -> ResolvePlaceResult— Resolve a place/room/agent reference to a goal pose (an object resolves to itsat_place) + atraversable_toBFS path. RaisesROSObjectNotInMemorywhen unresolved.to_scene_graph() -> SceneGraph/from_scene_graph(graph, *, embedder=None) -> SpatialMemory(classmethod) — Snapshot / rebuild; rebuild re-embeds labels when an embedder is given (embeddings aren't serialized).save(path) -> None/load(path, *, embedder=None) -> SpatialMemory(classmethod) — JSON persistence via theSceneGraphcontract.- Module constants:
DEFAULT_ASSOC_DISTANCE_M,DEFAULT_STANDOFF_M,DEFAULT_CAMERA_FRAME,DEFAULT_MAP_FRAME,DEFAULT_MIN_TEXT_SIMILARITY.
python/world_state/src/openral_world_state/geometry.py
ADR-0044 Phase 1 — shared gaze geometry; promotes the look-at math previously triplicated in the sim scene composers.
ViewAxis(TypeAlias =Literal["-z", "+z", "+x"]) — Camera forward-axis conventions:"-z"MuJoCo cameras,"+z"ROS optical frames (REP-103),"+x"body-frame forward (the ADR-0038 approach-viewpoint convention).look_at_quat_wxyz(eye, target, *, up=(0,0,1), view_axis="-z") -> tuple[float, float, float, float]— Unit(w, x, y, z)quaternion orienting a camera ateyesoview_axispoints attarget. Degenerate fallbacks (never raises):target == eye→ MuJoCo straight-down flip for"-z"/ identity otherwise; gaze near-parallel toup→ +Y alternate up. (ADR-0044)compute_gaze_pose(camera_xyz, target_xyz, *, frame_id="map", up=(0,0,1), view_axis="+z") -> Pose6D— Full 6-DOF camera pose whose view axis hitstarget_xyz; therskill-moveit-look-atrSkill's goal (defaults to the optical-frame"+z"convention). (ADR-0044)rotation_to_quat_wxyz(rot: 3x3) -> tuple[w, x, y, z]— Public matrix→quaternion conversion (Shepperd's method); used byLookAtRskillto re-express a gaze pose for the camera's mount link. (ADR-0044)
python/world_state/src/openral_world_state/grid.py
ADR-0044 Phase 2 — occupancy-grid queries + approach-pose refinement (planning-layer proposal; the ADR-0030 Phase 6 kernel check_nav_goal gate stays the enforcement).
FREE_MAX(module constant,25) — Highestnav_msgs/OccupancyGridvalue still treated as free;-1unknown and anything above block placement and sight (conservative).class OccupancyGridIndex— Queryable view over one grid snapshot; ROS-free (from_msgduck-types the message).__init__(data (h,w) int8, *, resolution_m, origin_xy, origin_yaw=0.0); handles rotated origins. (ADR-0044)from_msg(msg) -> OccupancyGridIndex(classmethod) — decode a (duck-typed)nav_msgs/OccupancyGrid.world_to_cell(x, y) -> tuple[row, col] | None—Noneoff-grid.resolution_mproperty.is_free(x, y, *, inflation_m=0.0) -> bool— point + world-space inflation disc all free; off-grid (or disc off-grid) counts blocked.line_of_sight(a_xy, b_xy) -> bool— Bresenham; every cell strictly before the endpoint free (the target's own footprint cell is exempt — a mug shares its cell with its counter).refine_approach_pose(grid, viewpoint, target_xyz, *, inflation_m=0.25, max_radius_m=2.0, min_standoff_m=None, max_standoff_m=None) -> ApproachViewpoint | None— Return the viewpoint unchanged when already free + sighted; else ring-search outward for the nearest admissible point (standoff within[0.5x, 2.0x]ideal by default), re-aim viacompute_approach_viewpoint.None= no reachable viewpoint (caller reports honestly, never fabricates). (ADR-0044)
python/world_state/src/openral_world_state/scene_objects_span.py
ADR-0038 — publish the durable object nodes as a dashboard OTel span (advisory; never a safety input).
scene_objects_payload(graph) -> list[dict[str, object]]— Project aSceneGraph'sobject-kind nodes to JSON-friendly dicts (id,label,x,y,z,frame_id,confidence,last_seen_ns,observation_count,is_container); non-object nodes skipped.emit_scene_objects_span(graph, *, source_node) -> int— Emit oneworld.scene_objectsspan carrying the object list (attrs underopenral.world_state.scene_objects.*); returns the object count. Emitted by the Reasoner from its preloaded map today; the World-State node post-producer (ADR-0035 / PR #229).
python/world_state/src/openral_world_state/embedder.py
ADR-0038 §5 open-vocabulary text embedder (optional; uv sync --group clip).
class TextEmbedder(Protocol)—dim: int+embed_text(texts) -> NDArray[float32](L2-normalized).class OpenClipEmbedder— OpenCLIPViT-B-32-quickgelu/openaiweights (MIT).__init__(*, model_name=DEFAULT_CLIP_MODEL, pretrained=DEFAULT_CLIP_PRETRAINED, device=None)raisesROSConfigErrorif open-clip-torch/torch missing or weights unfetchable; lazy-imports torch/open_clip so the base install stays light.dim(512);embed_text(texts).- Module constants:
DEFAULT_CLIP_MODEL("ViT-B-32-quickgelu"),DEFAULT_CLIP_PRETRAINED("openai").
python/world_state/src/openral_world_state/object_lift.py
2D→3D object-center lift helpers — pure, ROS-free, unit-testable (ADR-0035).
class ObjectsLiftError(ValueError)— Raised by geometry helpers on degenerate inputs (zero-norm quaternion; occupancy buffer size mismatch). (L37)homogeneous_from_quat_xyz(translation, quat_xyzw) -> NDArray[np.float64]— Build a 4×4 homogeneous transform from a translation + xyzw quaternion (normalised). RaisesObjectsLiftErrorif the quaternion norm is effectively zero. (L41)decode_occupied_centers(*, origin, resolution, size_xyz, occupancy) -> NDArray[np.float64]— Occupied voxel centres(N, 3)in the grid frame; row-major x-fastest. Center of cell(ix, iy, iz)=origin + (index + 0.5) * resolution. Returns(0, 3)when none are occupied. RaisesObjectsLiftErroriflen(occupancy) != size_x * size_y * size_z. (L80)depth_cloud_to_centers_base(points_cloud, t_base_from_cloud, *, max_points=0) -> NDArray[np.float64]— #11 octomap-free fallback depth source forVoxelFrustumLifter: drops non-finite returns, uniformly subsamples tomax_points(0 = no cap), and maps the(N, 3)cloud from its optical frame into the base frame. Output is interchangeable withdecode_occupied_centersas the lifter'soccupied_centers_base. Returns(0, 3)when the cloud has no finite points. (L126)aabb_iou_3d(a, b) -> float— 3D axis-aligned bbox IoU in[0, 1]. Boxes are(xmin, ymin, zmin, xmax, ymax, zmax). Returns 0.0 for disjoint or degenerate boxes. This is the single IoU helper in the repo — reuse it; do not add another. (L162)class VoxelFrustumLifter(*, k_nearest=25, min_voxels=3)— Lift 2D detections to 3D object centres via voxel-frustum K-nearest. Frustum-projects/openral/world_voxelsoccupied centres into the RGB camera image plane (usingSensorSpec.intrinsics+ TF2 transforms), selects the K voxels nearest each box centre, and returns oneDetectedObjectper detection withpose.frame_id = "map". Skips a detection when fewer thanmin_voxelsvoxels survive the frustum filter. (L188)lift(*, detections, occupied_centers_base, intrinsics, frame_size, t_cam_from_base, t_map_from_base, map_frame="map") -> list[DetectedObject]— Core lift call (vectorised over voxels; per-detection masking after). Returns[]whenoccupied_centers_baseis empty (best-effort).track_idisNone; assigned later byObjectMemory. (L207)build_in_fov_predicate(*, intrinsics, t_cam_from_map) -> Callable[[DetectedObject], bool]— Build a predicate that returnsTrueiff the object'smap-frame centre is in front of the camera and projects within the image bounds ((0, 0)to(intrinsics.width, intrinsics.height)); behind-camera or out-of-bounds →False. Used byObjectMemory.tickfor FOV-guarded eviction. (L298)
python/world_state/src/openral_world_state/object_memory.py
IoU-gated spatial object memory — pure, ROS-free, unit-testable (ADR-0035).
class ObjectMemory(*, iou_threshold=0.3, max_misses=1)— IoU-gated spatial memory with freeze-on-match, in-FOV-guarded eviction, and out-of-FOV retention. Maintains a monotonictrack_idcounter. (L30)tick(candidates: list[DetectedObject], *, stamp_ns: int, in_fov: Callable[[DetectedObject], bool]) -> list[DetectedObject]— Run one association+eviction step. Greedy highest-confidence-first matching: same-label +aabb_iou_3d ≥ iou_threshold→ freeze (leave pose/bbox unchanged, bump confidence, reset miss count); no match → new track. Unmatched existing tracks:in_fovtrue →miss_count += 1; evict whenmiss_count >= max_misses;in_fovfalse → retain unchanged. Returns surviving tracks aslist[DetectedObject]. (L64)