Inference Runner (executor of S1, ADR-0010)
Part of the OpenRAL public-symbol inventory. Hand-curated;
(LNN)markers are refreshed bytools/refresh_methods_linenos.py.
The hardware-side counterpart to openral_sim — closes
WorldState → Skill.step → SafetyClient.check → HAL.send_action at
RobotEnvironment.rate_hz (default 30 Hz). Schemas land in M1 / PR A;
this M2 / PR B introduces the openral_runner package with the
cadence helpers. InferenceRunnerBase, SensorReader, the OpenCV /
ROS / GStreamer backends, the openral deploy CLI, and the
SkillExecutorNode ROS wrapper land in subsequent PRs C–J.
python/runner/src/openral_runner/clock.py
High-precision cadence helpers for the inference runner.
precise_sleep(duration_s: float) -> None— Hybrid sleep:time.sleepfor the bulk + busy-wait ontime.perf_counterfor the final ~1 ms. Mirrors lerobot'sprecise_sleepshape. Non-positiveduration_sis a no-op. (L29)sleep_until(deadline_perf_counter_s: float) -> None— Convenience wrapper taking an absolutetime.perf_counterdeadline. Used byInferenceRunnerBase.runto enforce cadence. (L60)- module constant
_BUSY_LOOP_THRESHOLD_S = 1e-3— Busy-wait threshold (mirrors lerobot's private constant inlerobot_record.py). (L26)
python/runner/src/openral_runner/protocol.py
Inference runner Protocol (ADR-0010 PR C). The structural contract every runner shape satisfies.
class InferenceRunner(Protocol)—@runtime_checkableProtocol. (L24)- attr
rate_hz: float— foreground tick rate. activate() -> None— open sensors / HAL / executor. (L38)tick() -> TickResult— run one tick (record), no cadence enforcement here. (L42)run(max_ticks: int | None = None) -> RunResult— rate-limited loop returning the aggregate. (L52)deactivate() -> None— release resources opened byactivate; idempotent. (L62)
python/runner/src/openral_runner/world_cloud_bridge.py
ADR-0030 — rclpy → OTLP bridge rendering the octomap occupied-voxel cloud (/octomap_point_cloud_centers) as a robot-frame oblique "chase-view" PNG for the dashboard world.pointcloud card. Pure render core is rclpy-free (tested without ROS).
- module constant
WORLD_CLOUD_TOPIC_DEFAULT = "/octomap_point_cloud_centers"— default occupied-voxel-centers PointCloud2 topic (octomap_server). (L52) crop_points_to_box(points, *, xy_m, z_min, z_max) -> NDArray[float32]— keep(N,3)points inside the local box around base_link. (L78)distance_to_rgb(dist_m, *, range_max_m) -> tuple[int,int,int]— near=warm→far=cool color ramp. (L94)encode_world_cloud_png(points_base, *, range_max_m=4.0, image_w=480, image_h=360, xy_m=2.0, z_min=-0.2, z_max=2.0) -> str— crop→oblique-pinhole project→rasterize→base64 PNG. Pure; PIL-only. (L141)world_cloud_span_attributes(*, points_base, frame_id, source_node, range_max_m, xy_m, z_min, z_max) -> dict[str,Any]— assemble theopenral.world_cloud.*span attributes. (L214)class WorldCloudBridge— constructed against a hostrclpy.node.Node; subscribes the centers cloud (TRANSIENT_LOCAL), TF2-transforms tobase_link, throttles to 1 Hz, emits aworld.pointcloudspan. MirrorsSlamMapBridge.destroy()releases the subscription. (L260)
python/runner/src/openral_runner/sensor_reader.py
:class:SensorReader Protocol — seam between per-sensor capture backends and the inference runner (ADR-0010 PR D).
class SensorReader(Protocol)—@runtime_checkableProtocol; concrete backends live underopenral_runner.backends. (L29)- attr
sensor_id: str— matchesSensorReaderConfig.sensor_id. - attr
is_open: bool— True betweenopen()andclose(). open() -> None— Acquire device, start background workers. Idempotent. (L51)close() -> None— Release device, join workers. Idempotent. (L60)read_latest(max_age_ms: int | None = None) -> SensorFrame— Non-blocking peek at the most recent buffered frame; raisesROSPerceptionStaleif no frame yet or freshest exceeds budget. (L67)
python/runner/src/openral_runner/backends/opencv_thread.py
:class:OpenCVThreadSensorReader — default backend (ADR-0010 PR D). Mirrors lerobot's per-camera-thread pattern.
- module constant
_COLOR_NDIM = 3— Number of dims for an OpenCV colour frame ((H, W, 3)); mono is(H, W). Used to deriveSensorFrame.channels. (L38) class OpenCVThreadSensorReader— Per-camera background-thread reader on top ofcv2.VideoCapture. Importscv2lazily insideopen()(theopencvoptional-extra). (L41)__init__(*, sensor_id, device, fps=30, width=None, height=None, encoding=BGR8, default_max_age_ms=100)— Stash config; rejects non-positivefps/default_max_age_ms. (L73)open() -> None— Opencv2.VideoCapture, pincv2.setNumThreads(1)(lerobot parity), spawn daemon thread. Idempotent. (L111)close() -> None— Stop event, join thread (2 s timeout), release capture. Idempotent. (L149)__enter__() / __exit__()— Context-manager sugar; callsopen/close. (L167)read_latest(max_age_ms: int | None = None) -> SensorFrame— Lock-protected snapshot of the_latest_frameslot; constructs aSensorFramewith inlined raw bytes; raisesROSPerceptionStaleon no-frame-yet or staleness,RuntimeErroron closed reader. (L178)_read_loop()— Background daemon:cv2.VideoCapture.read→_latest_frame + _latest_stamp_*_nsunder lock; sleeps1/fpson read failure / EOF. (L233)
python/runner/src/openral_runner/backends/__init__.py
Per-backend SensorReader implementations. Default OpenCVThreadSensorReader is always available; GStreamerSensorReader (PR I) + Ros2ImageSensorReader gate on optional deps.
OpenCVThreadSensorReader— lazy-exported via PEP 562__getattr__(M8 PR I/8) so importingopenral_runner.backends.gstreamerdoes NOT eagerly pull incv2. cv2 initialises glib state that segfaults a subsequentrclpy.Node()inside the x86-ros Docker image; the lazy split keeps the gstreamer-only path importable in ROS-enabled processes.__getattr__(name) -> Any— PEP 562 attribute hook; resolvesOpenCVThreadSensorReaderon first access viaimportlib.import_module. (L27)
python/runner/src/openral_runner/backends/gstreamer/pipeline.py
GStreamer pipeline-string builder + platform detection (ADR-0010 PR I/1, ADR-0011, ADR-0018 F6). Pure-Python — does not import gi at module load.
TEE_NAME: Final[str](L60) —"openral_cam_tee". Name of the per-cameratee— the perception-bus attach point (ADR-0037) the runtimeTeeManagerlooks up viaGst.Bin.get_by_nameto request pads for reasoner-activated consumers at runtime.LEAKY_BRANCH_QUEUE: Final[str](L67) —"queue leaky=downstream max-size-buffers=2". The single definition of the per-branch isolation policy (ADR-0018 §3), shared by the static builder and the runtimeTeeManager.leaky_branch(elements, *, tee_name=TEE_NAME) -> str(L70) — Returns oneteebranch<tee>. ! <leaky queue> ! <elements>. The shared branch-construction primitive so the static builder and the dynamicTeeManager(ADR-0037) build branches identically.class PipelineSpec(BaseModel)(L164) — Validated description of a GStreamer ingest pipeline. Fields:source, device, width, height, fps, encoded, enable_nvmm, enable_ros_tee, enable_event_tee, appsink_name, ros_appsink_name, event_appsink_name, event_rate_hz, max_buffers. ADR-0018 F6 added the three event-tee fields; the validator onevent_appsink_nameenforces valid GStreamer element names.class Platform(str, Enum)(L104) —TEGRA | NVIDIA_DESKTOP | CPU_ONLY.class Source(str, Enum)(L140) —USB | CSI | RTSP | FILE | TESTSRC.detect_platform() -> Platform(L250) —lru_cached; reads/etc/nv_tegra_release, probesgst-inspect-1.0 nvh264dec.inspect_element_present(element_name) -> bool(L280) — Genericgst-inspect-1.0 --existsprobe with timeout.nvmm_convert_element() -> str | None(L310) — Probes for the host's NVMM colour-convert element:nvvideoconvert(DeepStream/x86) preferred, elsenvvidconv(Tegra/L4T), elseNone.ensure_appsink_name(pipeline, name) -> str(L330) — Rewrites a trailingappsinkto carryname=<name>.build_pipeline_string(spec, platform=None) -> str(L382) — Materialises the pipeline string; emits a 2- or 3-legtee name=openral_cam_teewhenenable_ros_tee/enable_event_teeare set, assembling each leg vialeaky_branchso a stalled observability / detector branch never backpressures the policy._build_event_tee_branch(spec, platform) -> str(L607) — ADR-0018 F6 — Returns the event leg of thetee: lifts NVMM to system memory, pinsformat=BGR, rate-caps viavideoratetoevent_rate_hz, terminates inappsink name=event_sink._build_ros_tee_branch(spec, platform) -> str(L591) — Returns the observability leg (system memory BGRappsink name=ros_sink).
python/runner/src/openral_runner/backends/gstreamer/perception_tee.py
Perception event tee for GStreamerSensorReader (ADR-0018 F6). Pulls frames from the event leg's appsink, runs EventDetectors, publishes openral_msgs/PromptStamped on /openral/perception/<kind>. rclpy lazy-imported in start() so the module stays import-safe on hosts without a sourced ROS env.
- module constant
TOPIC_PREFIX: Final[str] = "/openral/perception"(L62) — Locked by ADR-0018 §1; full topic isf"{TOPIC_PREFIX}/{detector.kind}". class EventDetector(Protocol)(L75) —kind: str,detect(frame_bgr, width, height, sensor_id) -> PerceptionEventMetadata | None,summarise(metadata) -> str.class MotionDetector(L107) — Pure-Python frame-diff motion detector over a BGR appsink (BT.601 luma, mean abs delta). Numpy lazy-imported indetect.__init__(*, threshold=0.02, downsample=1).class SceneChangeDetector(L215) — Grayscale-histogram scene-change detector (chisqr_altdistance, 32 bins).__init__(*, threshold=0.5).class _TokenBucket(L302) — Per-(sensor, kind)rate-limit primitive; mirrorsopenral_observability.failure_bus._TokenBucketbut independently implemented to keep the runner free of an observability-package dep.class PerceptionEventPublisher(L334) — Owns one event-sink appsink for one sensor; fans out to onePublisherper detector kind. Constructor enforces uniquekinds, absolutetopic_prefix, positiverate_hz. QoS:BEST_EFFORT + VOLATILE + KEEP_LAST=10per ADR-0018 §1. Methods:start(),stop(),is_started[property],dropped_counts[property].
python/runner/src/openral_runner/backends/gstreamer/tee_manager.py
Runtime tee-branch manager for the GStreamer perception bus (ADR-0037). Attaches / detaches consumer branches on a running pipeline's named tee (pipeline.TEE_NAME) via dynamic pad add/remove — the mechanism the S2 reasoner drives through ExecuteRskill. Imports gi at load (requires the gstreamer extra).
class BranchHandle(L67) — Opaque dataclass handle to an attached branch (name+ the privateteepad / branch bin); returned byattach, passed back todetach.class TeeManager(L83) —__init__(pipeline, *, tee_name=TEE_NAME)(raisesROSConfigErrorif the tee is absent).branch_count[property] (L117).attach(elements, *, name) -> BranchHandle(L122) — requests a tee pad, parsesLEAKY_BRANCH_QUEUE ! <elements>into a bin, links + syncs it live; rolls back on link failure.detach(handle)(L83) — IDLE-probe unlink + release-pad + NULL teardown; idempotent, blocks until removed (bounded by_DETACH_TIMEOUT_S).
python/runner/src/openral_runner/backends/gstreamer/objects_detector.py
CPU-tier object detector for the ADR-0037 perception event tee. Implements EventDetector via ONNXRuntime on system-memory BGR frames (RT-DETR / D-FINE ONNX signature). onnxruntime lazy-imported at construction time. Zero-copy NVMM tiers are ADR-0037 PR5b follow-ups; requesting them raises ROSConfigError.
class DetectorTier(str, Enum)(L75) —CPU_ONNX = "cpu_onnx",NVINFER = "nvinfer",NVMM_AGGREGATOR = "nvmm_aggregator",VLM_SIDECAR = "vlm_sidecar",ZEROSHOT_HF = "zeroshot_hf". Execution tier for the ADR-0037 object detector;VLM_SIDECARis the out-of-process open-vocab VLM tier (2026-06-09 amendment) andZEROSHOT_HFis the in-process Transformers zero-shot tier run over a fixed vocabulary (2026-06-12 amendment) — both reuse theCPU_ONNXBGR appsink branch.select_detector_tier(platform=None) -> DetectorTier(L121) — Probesgst-inspect-1.0 nvinfer(→NVINFER), then checks forPlatform.TEGRA(→NVMM_AGGREGATOR), elseCPU_ONNX.nvinferprobe always wins over explicitplatform.identify_rtdetr_outputs(named_shapes: list[tuple[str, tuple[Any, ...]]]) -> tuple[str, str](L209) — Tier-agnostic helper: from a list of(name, shape)output pairs, returns(logits_name, boxes_name). Among 3-D outputs, the one with last-dim==4 is boxes; if both (or neither) end in 4, falls back to index order (0=logits, 1=boxes). RaisesROSConfigErrorif fewer than two 3-D outputs are present.postprocess_rtdetr(logits, boxes, *, labels, model_id, sensor_id, score_threshold, frame_width, frame_height) -> ObjectsMetadata | None(L247) — Tier-agnostic decode (CLAUDE.md §13): sigmoid→argmax→threshold, cxcywh normalised→xyxy pixels, degenerate-bbox guard, label-index bounds check (warns), sorts descending by confidence, returnsNoneon zero survivors. Accepts(N,C)/(1,N,C)logits and(N,4)/(1,N,4)boxes.class ObjectsDetector(L346) —EventDetectorimplementation.__init__(onnx_path, *, labels, model_id, input_size=(640,640), score_threshold=0.5, device="cpu"). Delegates logits/boxes identification toidentify_rtdetr_outputs.detect(frame_bgr, width, height, sensor_id) -> ObjectsMetadata | None— BGR→RGB, NN-resize, float32/255, NCHW, ORT inference, delegates postprocessing topostprocess_rtdetr.summarise(metadata) -> str— aggregates label counts as"Nx label"string.make_objects_detector(onnx_path, *, labels, model_id, tier=None, **kwargs) -> ObjectsDetector | NvmmObjectsDetector(L554) — Auto-selects tier viaselect_detector_tier()whentier=None; returnsObjectsDetectorforCPU_ONNX; returnsNvmmObjectsDetector(lazy import) forNVMM_AGGREGATOR; raisesROSConfigErrorforNVINFER(spike-gated ADR-0037 PR5b PR D); raisesROSConfigErrorfor unknown tiers.
python/runner/src/openral_runner/backends/gstreamer/trt_nvmm.py
Clean-room zero-copy TensorRT executor (ADR-0037 PR5b) — runs a TRT engine directly on a CUDA device pointer (NvBufSurface.dataPtr) with no GPU->CPU copy. An nvrtc-compiled CUDA kernel (rgba_to_nchw_norm, built to a SASS CUBIN for the local sm_<cc> — no PTX JIT) converts the pitch-padded RGBA frame to planar float32 NCHW (/255) straight into the engine input buffer; engine + kernel run on the device's CUDA primary context (made current by cudaSetDevice; cuInit initializes the driver API). Deserializes engine bytes from TensorRTRuntime.serialized_engine. Imports cuda-python (cuda.bindings driver/runtime/nvrtc) + tensorrt lazily at construction — no pycuda, no shared context — so the NVMM tier deploys in the lean ds-on image (no nvcc/g++); requires the tensorrt group + nvrtc.
class TrtNvmmExecutor(L69) —__init__(engine_bytes, *, input_size=(h,w), device_index=0)— API unchanged from the prior pycuda impl; internals reworked to nvrtc + cuda-python. Selects the device viacudaSetDevice(device_index)(initializes + makes the primary context current), nvrtc-compiles the kernel for the device'ssm_<cc>to a SASS CUBIN (nvrtcGetCUBIN, no PTX JIT) and loads it viacuModuleLoadData/cuModuleGetFunction, deserializes the engine (trt.Runtime(...).deserialize_cuda_engine), creates the execution context, sets the input shape(1,3,h,w), allocates the input + per-output device buffers (cudaMalloc) viaset_tensor_address, and creates a cudart stream — with aBaseExceptionguard (_free_resources) that frees partially-allocated buffers / unloads the module / destroys the stream before re-raising (a failed__init__returns no instance toclose()); raisesROSConfigErrorif cuda-python/trt missing, engine deserialization fails, or the engine lacks exactly one input,ROSRuntimeErroron a CUDA/nvrtc setup failure.infer_rgba_devptr(src_ptr, *, width, height, pitch) -> dict[str, np.ndarray](L284) — launches the kernel fromsrc_ptr(pitch-strided rows; args packed as numpy scalars kept alive across the launch) into the input buffer viacuLaunchKernel, runsexecute_async_v3, copies outputs dtoh (cudaMemcpyAsync), syncs the stream, returns name->array; raisesROSConfigErroron a frame-size mismatch,ROSRuntimeErroron a CUDA failure or ifexecute_async_v3returns False.output_shapes() -> list[tuple[str, tuple[int, ...]]](L280) —(name, shape)per engine output, for output identification.close()(L70) — frees device buffers, unloads the kernel module, destroys the stream (shared best-effort_free_resourceshelper; non-zero CUDA teardown returns are logged, not raised); idempotent.
python/runner/src/openral_runner/backends/gstreamer/nvmm_detector.py
Clean-room NVMM zero-copy object detector (ADR-0037 PR5b) — composes TensorRTRuntime.serialized_engine, TrtNvmmExecutor (device-pointer inference + RGBA→NCHW kernel), and the shared postprocess_rtdetr / identify_rtdetr_outputs decode. Consumes an NvBufSurfaceHandle (the GPU dataPtr of an NVMM frame) and emits ObjectsMetadata — same output as the CPU tier with no GPU→CPU copy. Requires the tensorrt group (cuda-python + tensorrt) + nvrtc; the TrtNvmmExecutor it wraps uses nvrtc + cuda-python (no pycuda), so this tier deploys in the lean ds-on image.
class NvmmObjectsDetector(L36) —__init__(onnx_path, *, labels, model_id, input_size=(640,640), score_threshold=0.5, device_index=0, quantization=None)— validates labels non-empty, score_threshold in [0,1], ONNX path exists; builds the TRT engine viaTensorRTRuntime(device="cuda:<N>", rskill_id=model_id).serialized_engine(path), constructsTrtNvmmExecutor, identifies logits/boxes names viaidentify_rtdetr_outputs(executor.output_shapes())(closing the executor if identification raises, since a partial__init__returns no instance toclose()); raisesROSConfigErroron bad args or missing ONNX.detect_nvmm(handle, sensor_id) -> ObjectsMetadata | None(L104) — callsTrtNvmmExecutor.infer_rgba_devptrwith the handle'sgpu_ptr/width/height/pitch, then delegates topostprocess_rtdetr; returnsNonewhen no detection passes threshold.close()(L37) — delegates toTrtNvmmExecutor.close(); idempotent.
python/runner/src/openral_runner/backends/gstreamer/detector_factory.py
gi-free dispatch seam (ADR-0037 2026-06-09 amendment) so the manifest→detector-backend selection is unit-testable without a live pipeline. DetectorRunner delegates construction here. No gi/onnxruntime/zmq/torch at import; the pytorch branch lazy-imports LocateAnythingDetector and the zeroshot_hf branch lazy-imports OmDetTurboDetector.
weights_source_from_manifest(manifest) -> str(L97) — Resolves the HF repo the backend loads: preferssource_repo, falls back toweights_uri, elsenvidia/LocateAnything-3B; strips thehf://scheme and any@revisionto a bareorg/name.build_manifest_detector(manifest, *, onnx_path=None, tier=None) -> tuple[Any, DetectorTier](L110) — Dispatches onmanifest.detector.enginefirst, thenmanifest.runtime:engine: zeroshot_hf→OmDetTurboDetector(lazy import) +DetectorTier.ZEROSHOT_HF(noonnx_path); elseruntime: pytorch→LocateAnythingDetector(lazy import) +DetectorTier.VLM_SIDECAR(noonnx_path);onnx/tensorrt→make_objects_detector(onnx_path, ..., input_size=(net_h,net_w), score_threshold=...)+ the resolved tier. RaisesROSConfigErrorif the manifest is not akind:detectorwith a detector block, or an ONNX runtime is requested without anonnx_path.class DetectorNodeWiring(frozen dataclass) +detector_node_wiring(mode: DetectorMode) -> DetectorNodeWiring— ADR-0051 pure (rclpy-free, unit-testable) policy the perception node consumes:continuous→run_continuous_leg=True, serve_on_demand=False(publish leg, no query service);on_demand→run_continuous_leg=False, serve_on_demand=True(locate_in_view service +detector_querytopic, no continuous publishing).
python/runner/src/openral_runner/backends/gstreamer/omdet_turbo_detector.py
In-process Transformers open-vocabulary detector (ADR-0037 2026-06-12 amendment) — omlab/omdet-turbo-swin-tiny-hf (Apache-2.0). One backend serves both ADR-0051 detector modes (the manifest's detector.mode declares intent): continuous (fixed labels, unprompted background producer — omdet-turbo-indoor) or on_demand (prompted locator via set_query/detect_with_query — omdet-turbo-locator). Same detect(frame_bgr, width, height, sensor_id) -> ObjectsMetadata | None interface as ObjectsDetector, so it reuses the CPU BGR appsink branch (DetectorTier.ZEROSHOT_HF). Loads under the runtime's own transformers>=5 (no sidecar). torch/transformers/numpy/PIL lazy-imported (the omdet group); conversion + query parsing are pure functions (unit-testable, no GPU).
build_objects_metadata_from_results(*, labels, scores, boxes_xyxy, width, height, model_id, sensor_id, score_threshold) -> ObjectsMetadata | None(L63) — Pure (no torch): from decoded per-detectionlabels/scores/pixelboxes_xyxy, drops sub-threshold + degenerate/near-full-image (≥98%) boxes, clips + corner-orders to frame, sorts descending by confidence;Noneon zero survivors. RaisesROSConfigErroron length mismatch.query_to_classes(query) -> list[str](L148) — Pure (ADR-0051): parse a free-text on-demand query into OmDet's multi-label class list (split on commas /</c>; a single phrase is one class; whitespace dropped). RaisesROSConfigErrorif empty.class OmDetTurboDetector(L180) —__init__(*, labels, model_id, weights_source, score_threshold=0.3, nms_threshold=0.5, device="auto")— stores config; model/processor load deferred to firstdetect()(lazy, side-effect-free;device="auto"→ CUDA when available else CPU).set_query(text)— retarget the persistent vocabulary (thedetector_querytopic; on-demand).detect(frame_bgr, width, height, sensor_id) -> ObjectsMetadata | None— over the current vocabulary.detect_with_query(frame_bgr, width, height, sensor_id, query) -> ObjectsMetadata | None— one-shot detect forqueryWITHOUT mutating the persistent vocabulary (the read-onlylocate_in_viewservice, ADR-0043). Both delegate to_detect_classes(BGR→RGB PIL, processor over the class list,model(**inputs)underno_grad,post_process_grounded_object_detection, →build_objects_metadata_from_results).close()— releases the model +cuda.empty_cache()if loaded on GPU; idempotent.
python/runner/src/openral_runner/backends/gstreamer/locateanything_detector.py
Open-vocabulary detector backend (ADR-0037 2026-06-09 amendment) backed by the LocateAnything-3B sidecar. Same detect(frame_bgr, width, height, sensor_id) -> ObjectsMetadata interface as ObjectsDetector, so it reuses the CPU BGR appsink branch. Connects lazily on first detect(); auto-spawns the sidecar (ping → Popen → poll → close) mirroring the RLDX adapter. The model runs in an isolated transformers==4.57.1 venv (tools/locateanything_sidecar.py); this is the ZMQ/msgpack client. Parsing is pure-function + main-env (unit-testable, no GPU). No zmq/numpy/PIL at import (all lazy).
parse_grounding_answer(answer, *, fallback_label="object", norm=1000) -> list[tuple[str, tuple[int,int,int,int]]](L53) — Parses<ref>label</ref>+ 4-coord<box>tokens in document order; each box binds to the most recent<ref>. Coords stay normalized[0,norm], corner-ordered. Drops exact duplicates and degenerate boxes (side < 2% or area ≥ 85% of the image — the repeated-box tail a looping decode emits).build_objects_metadata(answer, *, width, height, model_id, sensor_id, fallback_label="object", norm=1000) -> ObjectsMetadata | None(L95) — Scalesparse_grounding_answerboxes intowidth×heightpixels (clipped), buildsObjectDetection2Datconfidence=1.0(grounding model — no per-box score, CLAUDE.md §1.2);Noneif no valid detections.class LocateAnythingDetector(L151) —__init__(*, labels, model_id, weights_source="nvidia/LocateAnything-3B", host="127.0.0.1", port=5757, query=None, auto_spawn=True, boot_timeout_s=1200.0, request_timeout_s=180.0, max_side=1024, max_new_tokens=1024, mode="hybrid")(L149) — stores config; static defaultquery = "</c>".join(labels); no connection (lazy).set_query(text)— runtime open-vocab override for the continuous leg.detect(frame_bgr, width, height, sensor_id) -> ObjectsMetadata | None— one-shot detect of the persistent query (delegates todetect_with_query).detect_with_query(frame_bgr, width, height, sensor_id, query) -> ObjectsMetadata | None(ADR-0043) — one-shot detect forqueryWITHOUT mutating the persistent query; used by thelocate_in_viewservice so an on-demand reasoner query doesn't change what the continuous leg grounds.close()— closes the socket and terminates the sidecar if spawned; idempotent.
python/runner/src/openral_runner/backends/gstreamer/qwen_scene_vlm.py
Scene-VLM backend (ADR-0047) backed by the Qwen3.5-4B sidecar — the scene-reasoning counterpart of LocateAnythingDetector. Returns text, not ObjectsMetadata (a reasoning aid for task-progress / success verification, not a localizer). Same ZMQ lifecycle (lazy connect, auto-spawn, teardown only the child). No zmq/numpy/PIL at import (all lazy).
class QwenSceneVlm—__init__(*, model_id, weights_source="Qwen/Qwen3.5-4B", host="127.0.0.1", port=5759, auto_spawn=True, boot_timeout_s=1200.0, request_timeout_s=180.0, max_side=1024, max_new_tokens=256)— stores config; no connection (lazy).query(frame_bgr, width, height, question) -> str— encode BGR→PNG, RPC{"op":"query",...}, return the whitespace-stripped answer; raisesROSConfigErroron empty question or sidecar error.close()— closes the socket + terminates the spawned sidecar; idempotent.build_scene_vlm(manifest, *, host="127.0.0.1", port=5759) -> QwenSceneVlm— build from akind:"vlm"manifest;model_id=manifest.name,weights_sourcefromweights_uri(the deployable pre-quant checkpoint) stripped ofhf:///@rev. RaisesROSConfigErrorifmanifest.kind != "vlm". Lazy.
openral_runner.backends.reward (ADR-0057 reward monitor)
class Frame(frozen dataclass) — one buffered camera frame:stamp_ns: int,bgr: bytes,width: int,height: int.class RollingFrameBuffer—__init__(*, window_s, max_frames=256, stale_after_s=3.0)— transport-agnostic node-side ring of recent frames (sim + real).push(frame)— append + evict frames older thanwindow_srelative to the newest / overmax_frames.window(seconds) -> list[Frame]— frames within the lastseconds(capped towindow_s).is_stale(now_ns) -> bool— True if no fresh frame withinstale_after_s.__len__. Pure stdlib (no numpy/torch); unit-tested without ROS.trend(series: list[float]) -> float— least-squares slope per sample (0.0 for < 2 points); used for progress/success trend +stalled.class RobometerReward—__init__(*, model_id, weights_source="robometer/Robometer-4B", host="127.0.0.1", port=5769, auto_spawn=True, boot_timeout_s=1200.0, request_timeout_s=180.0, num_bins=100, success_threshold=0.5)— ZMQ client + auto-managed lifecycle for the stateless reward sidecar (mirrorsQwenSceneVlm).score(frames, task) -> (progress, success)— RPC{"op":"score",...}, per-frame normalized arrays; raisesROSConfigErroron empty clip/task or mismatched frame sizes.assess(frames, task) -> dict— score + summarize (progress_now,success_now,progress_trend,success_trend,stalled,succeeded,frames_seen).close()— socket close + sidecar teardown; idempotent.build_reward_monitor(manifest, *, host="127.0.0.1", port=5769) -> RobometerReward— build from akind:"reward"manifest;weights_sourcefromweights_uristripped ofhf:///@rev; carriesnum_bins+success_thresholdfrom theRewardContract. RaisesROSConfigErrorifmanifest.kind != "reward". Lazy.
python/runner/src/openral_runner/backends/gstreamer/detector_runner.py
Runtime glue (ADR-0037) that wires a kind: detector rSkill to a live camera pipeline — loads the DetectorContract, delegates backend construction to build_manifest_detector (ONNX CPU/NVMM tiers or the VLM_SIDECAR open-vocab tier), attaches the appropriate branch to the bus tee via TeeManager, and fires the on_detection callback for each non-None ObjectsMetadata. Imports gi + DetectorTier/build_manifest_detector + nvmm_convert_element eagerly at load.
class DetectorRunner(L60) —__init__(pipeline, manifest, *, onnx_path=None, sensor_id, on_detection, tee_name=TEE_NAME, tier=None)(L101) — validatesmanifest.kind == "detector"+manifest.detector is not None(raisesROSConfigError); caches_net_w/_net_hfromDetectorContract.input_sizefor the NVMM caps; delegates tobuild_manifest_detector(manifest, onnx_path=onnx_path, tier=tier)→(detector, tier)(gi-free dispatch;onnx_pathoptional,Nonefor the VLM sidecar tier); createsTeeManager.start()(L178) — selects branch string + handler by tier: NVMM_AGGREGATOR resolves the platform's NVMM converter (nvvideoconvert/nvvidconv) vianvmm_convert_element()(raisesROSConfigErrorif neither registered) and attaches the NVMM RGBA appsink +_on_sample_nvmm; every other tier (CPU_ONNX, VLM_SIDECAR, ZEROSHOT_HF) attachesvideoconvert ! video/x-raw,format=BGR ! appsink+_on_sample_bgr; raisesROSRuntimeErrorif appsink not found after attach._on_sample_bgr(appsink) -> int(L242) — pulls BGR sample, format assert, buffer.map/unmap, callsdetector.detect, fireson_detectionon non-None; errors guarded._on_sample_nvmm(appsink) -> int(L301) — pulls NVMM sample,wrap_buffer, callsNvmmObjectsDetector.detect_nvmm, fireson_detection; always unmaps; errors guarded.stop()(L59) — disconnects signal + detaches branch + callsdetector.close()if present; idempotent.
python/runner/src/openral_runner/__init__.py
Public surface of the inference runner. Imports are PEP 562 lazy (M8 PR I/8): heavy symbols (InferenceRunnerBase, factory.*, HardwareRunner, safety.*) are resolved on first attribute access so importing any subpackage does not eagerly drag in torch (582 modules) or trigger downstream glib conflicts.
- light eager imports:
precise_sleep,sleep_until,InferenceRunner(Protocol),SensorReader(Protocol). _LAZY_ATTRS: dict[str, tuple[str, str]]—attr → (module, name)map driving the__getattr__resolver. (L80)__getattr__(name) -> Any— Resolves heavy symbols on first access (torch / glib-sensitive deferral). (L95)
python/runner/src/openral_runner/factory.py
Wires RobotEnvironment YAML → live HardwareRunner (ADR-0010 PR G). The single seam the openral deploy --config CLI goes through.
SKILL_REGISTRY: dict[str, Callable[[dict[str, object]], rSkillBase]]—vla.id→ skill factory. Today:hello,gpu_passthrough(M8 PR I/10). (L135)SENSOR_BACKEND_REGISTRY: dict[str, Callable[[SensorReaderConfig], SensorReader]]—backendid → reader factory. Today:opencv_thread,gstreamer. (L296)_to_int(value, *, field, sensor_id) -> int— YAMLobject→intcoercion helper used across factories; rejects bools explicitly. (L64)_repo_root_from(start) -> Path— Walk upwards fromstartto locate the repo root for manifest resolution. (L85)_load_robot_description(robot_id) -> RobotDescription— Resolverobots/<id>/robot.yaml;build_runnerfeeds it toopenral_hal.build_hal(description, mode="real")(ADR-0031 — the manifest'shal.realentry is the single source of truth; the oldHAL_REGISTRY+transport.digital_twintwin path is gone, usedeploy simfor twins). (L97)_make_gpu_passthrough_skill(extra) -> rSkillBase— BuildsGpuPassthroughSkill; recognisedextra:sensor_id(default"wrist_rgb"),n_joints,horizon,device(default"cuda", raises if unavailable). (L112)_make_opencv_thread_reader(cfg) -> SensorReader— BuildsOpenCVThreadSensorReaderfrom aSensorReaderConfig; requiresbackend_params.device. (L141)_make_gstreamer_reader(cfg) -> SensorReader— BuildsGStreamerSensorReaderfrom aSensorReaderConfig. Translatespublish_to_ros/publish_topic/publish_rate_hz→PipelineSpec.enable_ros_tee. (M8 PR I/2 + I/4.) (L181)build_runner(env: RobotEnvironment) -> tuple[HardwareRunner, rSkillBase]— Composes HAL + skill +WorldStateAggregator+SensorReader[]+NullSafetyClientinto aHardwareRunner. Returns the runner and the skill so the caller drives the skill lifecycle. RaisesROSConfigErroron unknown registry ids. (L303)
python/runner/src/openral_runner/hardware.py
:class:HardwareRunner — concrete InferenceRunnerBase subclass composing HAL + Skill + WorldStateAggregator + SensorReaders + SafetyClient (ADR-0010 PR F).
class HardwareRunner(InferenceRunnerBase)— First end-to-end closer of theWorldState → Skill → safety → HALloop on real hardware / digital twins. The runner is the safety-supervisor boundary per CLAUDE.md §10: catchesROSSafetyViolationfrom the SafetyClient, records it on theTickResult, withholds theHAL.send_actioncall (does not re-raise because withholding IS the mitigation today). (L76)__init__(*, hal, skill, aggregator, sensor_readers=(), safety_client=None, recorder=None, thumbnail_hz=25.0, **base_kwargs)— Caller must pre-configure()+activate()the skill; runner manages HAL + reader open/close. Defaultssafety_clienttoNullSafetyClient.thumbnail_hzgates dashboard JPEG-thumbnail emission per camera (0 disables), decoupled fromrate_hz. ADR-0019 PR3: optionalrecorderis aopenral_dataset.RolloutRecorder; when set,episode_start/episode_enddrive its lifecycle and every tick fans out viarecord_frame. (L128)episode_start(task_string: str) -> int— ADR-0019 PR3: open a new episode on the attached recorder; returns the newepisode_idx(or-1when no recorder is attached). RaisesRuntimeErrorif called twice withoutepisode_end. (L188)episode_end(*, success: bool) -> None— ADR-0019 PR3: close the current recorder episode with the success flag. No-op when no recorder is attached. RaisesRuntimeErrorif called withoutepisode_start. (L216)activate() -> None—super().activate()+hal.connect()+ open everySensorReader. (L242)deactivate() -> None— Close everySensorReader(best-effort; logs + continues),hal.disconnect(),super().deactivate(). (L260)_tick_impl(tick_idx) -> TickResult— Five-phase tick: sensors → world_state → inference → safety → hal. Per-phase*_mspopulated on theTickResult;InferenceRunnerBase.ticklifts them onto therskill.tickOTel parent span. Each sensorread_latestcall is wrapped in asensors.read_latestspan that recordsopenral.sensors.age_ms(frame age at read time) onto theopenral.sensors.age_mshistogram. WrapsHAL.read_statein ahal.read_statespan andHAL.send_actionin ahal.send_actionspan (labels:openral.hal.adapter,openral.hal.robot.model,openral.hal.control_mode); recordsopenral.hal.read_state.duration+openral.hal.send_action.durationhistograms keyed by adapter. CatchesROSPerceptionStaleper reader and emitsopenral.event.sensor_stale+openral.sensors.stale_readscounter. CatchesROSSafetyViolationat the supervisor boundary and emitsopenral.event.safety_violation+record_exception+openral.safety.violationscounter (labeled by exception type and severity). (L331)_tracer[@property] — Per-calltrace.get_tracer("openral")(never cached at__init__, would bind to the provider live at construction time). (L233)_hal_adapter_label— Lower-cased class name of the HAL adapter, used as the closed-setopenral.hal.adaptervalue on spans + metrics. (L172)
python/runner/src/openral_runner/safety.py
:class:SafetyClient stub (ADR-0010 PR E) — Python-side seam for the future C++ safety kernel (CLAUDE.md §6 Layer 6).
class SafetyClient(Protocol)—@runtime_checkableProtocol.check_action(action)returnsNoneto allow or raisesROSSafetyViolationto reject. The inference runner catches at its supervisor boundary; never silently caught per CLAUDE.md §10. (L48)- attr
envelope: SafetyEnvelope— the envelope checked against. check_action(action: Action) -> None(L64)class NullSafetyClient— no-op stub that always allows. Every call opens asafety.checkOTel span atseverity="info"carryingcontrol_mode,horizon,envelope_max_ee_speed_m_s,envelope_max_force_n. Used by digital-twin runs and pre-hardware tests so traces show the seam is wired before the C++ kernel arrives. (L80)__init__(envelope: SafetyEnvelope | None = None)— defaults to a stockSafetyEnvelope. (L107)check_action(action: Action) -> None(L111)
python/runner/src/openral_runner/base.py
Shared base for inference runners (ADR-0010 PR C). Subclasses override _tick_impl.
_percentile(samples: list[float], q: float) -> float— Linear-interpolation percentile (0.0for empty list). Used by_build_run_result. (L39)class InferenceRunnerBase(ABC)— Owns the rate-limited loop,rskill.tickOTel parent span,RunResultaggregation, deadline-overrun policy. (L57)__init__(*, rate_hz=30.0, deadline_overrun_policy=WARN, runner_name="inference_runner", latency_budget_ms=None, save_dir=None)— Rejectrate_hz <= 0. (L93)activate() -> None— Reset tick counter; mark active. (L115)deactivate() -> None— Stop ticking; idempotent. (L120)_tick_impl(tick_idx: int) -> TickResult[@abstractmethod] — Subclass hook; the base wraps it in arskill.tickspan. (L127)episode_start(task_string: str) -> int— ADR-0019 PR3: optional explicit episode boundary; default raisesNotImplementedError.HardwareRunneroverrides to drive the recorder;SimRunneroverrides as a no-op (sim derives episode boundaries fromenv.stepflags). (L164)episode_end(*, success: bool) -> None— ADR-0019 PR3: optional explicit episode boundary; default raisesNotImplementedError. Seeepisode_start. (L189)_should_terminate() -> bool— Subclass early-exit hook (default False) consulted after each tick insiderun().SimRunneroverrides to stop oncen_episodescomplete. (L141)tick() -> TickResult— Span-wrapped single-tick entry; attaches per-stage timings asskill.{tick_ms, inference_ms, sensors_ms, world_state_ms, safety_ms, hal_ms, action_applied, safety_violations}attributes plus sim-onlyskill.{step_idx, episode_idx, reward, terminated, truncated}when set, plusopenral.tick.idx. Recordsopenral.tick.duration/openral.inference.durationhistograms (label:skill.id) and incrementsopenral.safety.violations{check_name="runtime", severity="violation"}for each violation on the tick. (L208)run(max_ticks: int | None = None) -> RunResult— Rate-limited loop usingsleep_until. AppliesDeadlineOverrunPolicy(warn/drop/raise). Recordslatency_budget_msviolations and incrementsopenral.tick.budget_violationsper violation. Honors_should_terminate()after each tick. (L287)_current_trace_id() -> str | None[@staticmethod] — Active OTel trace id (hex) or None. (L348)_on_deadline_overrun(result: TickResult) -> None— Apply policy: structlog warn / drop / raiseROSDeadlineMissed. Always incrementsopenral.tick.deadline_missesand emitsopenral.event.deadline_missedon the current parent span; onRAISE, also callsrecord_exception+set_status(ERROR)on the parent span before re-raising. (L358)_build_run_result(results, *, budget_violations, trace_id) -> RunResult— Aggregate per-tick records intoRunResult(mean / p99). (L411)