sandbox_v2: transport cleanup + docs (transport T5)

Bring the wire-protocol docs to the current protobuf + pluggable-transport
reality and tick the transport trackers.

* OVERVIEW.md / architecture.html: rewrite the transport row, the spawn
  prose, and the channel section to describe the three-layer
  Channel/Codec/Transport split, ProtobufCodec as the production wire,
  the Ready-frame handshake (no stdout text marker), length-prefixed
  framing, and stdio + unix transports (websocket reserved/future). Drop
  the stale --url ws:// example and JSON-line wording.
* channel.py docstrings (both mirrors): ProtobufCodec is the production
  codec; JsonCodec is the registry-free channel-core test/debug wire.
* protocol.py docstring: messages are typed protobuf (REGISTRY +
  sandbox_v2.proto); the payload shapes listed are the logical contract.
* sandbox.py: SandboxRuntime docstrings note the --url-selected transport
  (stdio default, unix opt-in, ws reserved).
* whats-changed.md: tick the protobuf-wire + typed-handlers boxes (T2
  360e454330) and pluggable-transports box (T3 1eaa79d261).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Paulus Schoutsen
2026-06-03 09:10:36 -04:00
parent 1eaa79d261
commit 42560c6cd0
7 changed files with 134 additions and 66 deletions
@@ -8,9 +8,10 @@ dispatch core:
semaphore, ``register`` / ``call`` / ``push`` / ``close``. It speaks in
:class:`Frame` objects and never touches raw bytes.
* :class:`Codec` — turns a :class:`Frame` into bytes and back.
:class:`JsonCodec` is the line-compatible default (one JSON object per
frame); :mod:`.codec_protobuf` adds the protobuf wire on top of the same
seam.
:class:`~.codec_protobuf.ProtobufCodec` is the production wire (a typed
protobuf ``Frame`` envelope; the codec owns the ``type → message`` registry
so this dispatch core stays codec-agnostic). :class:`JsonCodec` (one JSON
object per frame) is retained only as the channel-core test/debug wire.
* :class:`Transport` — moves whole frame blobs over some byte channel.
:class:`StreamTransport` length-prefixes each frame (4-byte big-endian
length + body) over an :class:`asyncio.StreamReader` /
@@ -172,9 +173,11 @@ class Codec(Protocol):
class JsonCodec:
"""One-JSON-object-per-frame codec.
Line-compatible with the original wire shape (sans the trailing
newline, which the length prefix replaces). Kept as the default for
tests and debugging; production rides :class:`ProtobufCodec`.
The registry-free test/debug wire: it passes frame payloads through as
plain JSON (no ``type``-to-proto lookup), so the concurrency-critical
channel core can be exercised with synthetic message types and arbitrary
dict/int payloads. Production rides :class:`ProtobufCodec`; this stays
for the channel-core tests only.
"""
def encode(self, frame: Frame) -> bytes:
@@ -1,10 +1,20 @@
"""Phase 5 wire-protocol constants and payload helpers.
"""Wire-protocol message-type constants.
The integration and the sandbox runtime exchange JSON-line messages over
the :class:`Channel` set up in Phase 4. Each message type is namespaced
``sandbox_v2/…``. Both sides share the same names — kept here on the HA
side and mirrored verbatim in :mod:`hass_client.protocol` so neither has
to import the other.
The integration and the sandbox runtime exchange typed protobuf messages
over the :class:`Channel`. Each message type is namespaced ``sandbox_v2/…``;
this module holds the type-string constants. Both sides share the same
names — kept here on the HA side and mirrored verbatim in
:mod:`hass_client.protocol` so neither has to import the other.
The wire is protobuf (default codec :class:`~.codec_protobuf.ProtobufCodec`):
each ``type`` maps to a request/result proto message pair in
:mod:`.messages` (the ``REGISTRY``), generated from
``sandbox_v2/proto/sandbox_v2.proto``. The payload shapes described below
are the *logical* contract for each call — they are carried as those typed
proto messages, not free-form dicts (only genuinely dynamic fields, e.g.
``service_data`` / state attributes / serialized voluptuous schemas, cross
as ``Struct`` / ``ListValue``). The line-oriented :class:`~.channel.JsonCodec`
is retained only as the channel-core test/debug wire.
Main → Sandbox calls:
+18 -10
@@ -41,7 +41,7 @@ inside the sandbox.
| | v1 (`sandbox/`) | v2 (`sandbox_v2/`) |
|---|---|---|
| Routing | `entry.options["sandbox"]` set by hand | Computed at runtime from manifest + platform inspection ([`classifier.py`](../homeassistant/components/sandbox_v2/classifier.py)) |
| Transport | Live websocket connection back to main | JSON-line `Channel` over the subprocess's stdin/stdout |
| Transport | Live websocket connection back to main | Protobuf `Channel` over a pluggable transport (stdio by default, unix socket opt-in; websocket later) |
| Entity bridge | Bespoke `sandbox/update_state` + `sandbox/entity_command_result` (Option A) | Shared `sandbox_v2/call_service` (Option B) — see [`docs/entity-bridge-decision.md`](docs/entity-bridge-decision.md) |
| Config flow | Forwarded through host integration | Runs inside the sandbox; main owns the canonical `ConfigEntry` store |
| Auth | System-user token, full HA scope | System-user token (scope enforcement deferred until the sandbox→main connection lands) |
@@ -76,8 +76,8 @@ The design choices and the failure modes of v1 they fix are recorded in
└─────────────────────────┬─────────────────────────────────────────┼───────────────────┘
│ │
│ subprocess.Popen │ Channel
│ python -m hass_client.sandbox_v2 │ (JSON lines over
│ --name … --url … --token … │ stdin/stdout)
│ python -m hass_client.sandbox_v2 │ (protobuf frames over
│ --name … --url … --token … │ stdio / unix socket)
▼ │
┌──────────────────────────── Sandbox subprocess ──────────────────────────────────────┐
│ sandbox_v2/hass_client/hass_client/sandbox.py │
@@ -147,15 +147,23 @@ is:
```
python -m hass_client.sandbox_v2 \
--name <name> \
--url ws://localhost:8123/api/websocket \
--url stdio:// \
--token <sandbox access token>
```
The runtime prints `sandbox_v2:ready` on stdout once its
`HomeAssistant` instance is up; the manager parses that marker, then
opens a `Channel` over the subprocess's remaining stdin/stdout traffic.
Everything after the marker is JSON-line framed (one message per line,
length-delimited at the newline).
`--url` selects the control-channel transport: `stdio://` (the default —
frames ride the subprocess's stdin/stdout) or `unix://<path>` (the
manager opens a unix socket and the runtime dials back). `ws://` / `wss://`
are reserved for the deferred websocket transport and rejected for now.
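The scheme rules above can be sketched as a small dispatcher. This is an illustrative sketch only: `select_transport` and its return shape are hypothetical names, not the runtime's actual API, and it assumes a `unix://` URL carries the socket path directly after the scheme.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def select_transport(url: str = "stdio://") -> Tuple[str, Optional[str]]:
    """Map a --url value to (transport kind, unix socket path or None).

    Hypothetical helper for illustration; the real runtime may differ.
    """
    parts = urlsplit(url)
    if parts.scheme == "stdio":
        # Default: frames ride the subprocess's stdin/stdout.
        return ("stdio", None)
    if parts.scheme == "unix":
        # Accept both unix:///abs/path and unix://rel/path spellings.
        path = parts.netloc + parts.path
        if not path:
            raise ValueError("unix:// URL needs a socket path")
        return ("unix", path)
    if parts.scheme in ("ws", "wss"):
        # Reserved for the deferred websocket transport.
        raise ValueError("websocket transport is reserved, not yet supported")
    raise ValueError(f"unknown transport scheme: {parts.scheme!r}")
```

Rejecting `ws://` / `wss://` explicitly, rather than treating them as unknown, keeps the error message honest about the scheme being reserved.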
The runtime opens the channel and sends a `Ready` frame
(`sandbox_v2/ready`) as its first message; the manager treats its arrival
as "running" (there is no stdout text marker — stdout carries nothing but
channel frames). Frames are protobuf (a `Frame` envelope carrying one
typed message per `type`; `JsonCodec` is kept only for channel-core tests)
and length-prefixed (4-byte big-endian length + body) on the stream
transports. The three-layer split is `Channel` (dispatch core) → `Codec`
(`Frame` ↔ bytes; `ProtobufCodec` in production) → `Transport`
(`StreamTransport` length-prefixing over stdio / unix).
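The length-prefixed framing described above (4-byte big-endian length + body) can be sketched in a few lines. This is a minimal illustration, not the `StreamTransport` implementation: `frame` and `unframe` are hypothetical helper names, and the body is treated as opaque bytes (in production it would be an encoded protobuf `Frame`).

```python
import struct
from typing import List, Tuple

# 4-byte big-endian unsigned length header.
_LEN = struct.Struct(">I")


def frame(body: bytes) -> bytes:
    """Prefix one encoded frame body with its length."""
    return _LEN.pack(len(body)) + body


def unframe(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Split a byte buffer into complete frame bodies plus leftover bytes.

    The leftover is an incomplete header or body that needs more input.
    """
    frames = []
    offset = 0
    while len(buffer) - offset >= _LEN.size:
        (length,) = _LEN.unpack_from(buffer, offset)
        end = offset + _LEN.size + length
        if end > len(buffer):
            break  # body not fully received yet
        frames.append(buffer[offset + _LEN.size:end])
        offset = end
    return frames, buffer[offset:]
```

Because the length prefix delimits each message, the body can contain any bytes, including newlines, which is what lets a binary protobuf payload replace the old newline-delimited JSON.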
### Health & crash recovery
@@ -452,7 +460,7 @@ only thing that differs is how the channel pair is materialised.
| Plugin | Wire | When to use |
|---|---|---|
| `hass_client.testing.pytest_plugin` | in-memory channel pair, `SandboxRuntime` as an asyncio task | fast feedback, freezer-safe |
| `hass_client.testing.conftest_sandbox` | real stdio JSON-line (`python -m hass_client.sandbox_v2`) | pins the subprocess boundary, freezer tests auto-skip |
| `hass_client.testing.conftest_sandbox` | real stdio protobuf channel (`python -m hass_client.sandbox_v2`) | pins the subprocess boundary, freezer tests auto-skip |
The compat lane runner
[`run_compat.py`](run_compat.py) drives either plugin against a list of
+66 -25
@@ -366,7 +366,9 @@
>
</li>
<li>
<a href="#channel">The channel &mdash; wire protocol over stdio</a>
<a href="#channel"
>The channel &mdash; protobuf wire over pluggable transports</a
>
</li>
<li><a href="#flow">Config-flow forwarding</a></li>
<li>
@@ -438,8 +440,8 @@
<p>
Main holds the canonical view of the world. Each sandbox group is a
separate Python process spawned lazily on first need. The two sides
communicate over a JSON-line channel pinned to the subprocess's
stdin/stdout. Inside the sandbox, a private
communicate over a protobuf channel that rides a pluggable transport
(stdio by default, unix socket opt-in). Inside the sandbox, a private
<code>HomeAssistant</code> instance hosts the integration's real
<code>async_setup_entry</code>, <code>ConfigFlow</code>, entities,
services, and events.
@@ -882,7 +884,7 @@
font-size="10"
font-style="italic"
>
stdio JSON
stdio protobuf
</text>
<text
x="500"
@@ -1279,7 +1281,7 @@
fill="#a5f3fc"
font-size="9.5"
>
JSON-line framing, request/response,
protobuf framing, request/response,
</text>
<text
x="755"
@@ -1383,7 +1385,7 @@
>Any platform in <code>SANDBOX_INCOMPATIBLE_PLATFORMS</code> &rarr;
main.</strong
>
Audio / byte-stream platforms the JSON channel can't ferry:
Audio / byte-stream platforms the control channel can't ferry:
<code>stt</code>, <code>tts</code>, <code>conversation</code>,
<code>assist_satellite</code>, <code>wake_word</code>,
<code>camera</code>.
@@ -1463,16 +1465,21 @@
<pre><code>python -m hass_client.sandbox_v2 \
--name &lt;name&gt; \
--url ws://localhost:8123/api/websocket \
--token &lt;sandbox access token&gt; \
[--share-states] [--share-entity-registry] [--share-areas]</code></pre>
--url stdio:// \
--token &lt;sandbox access token&gt;</code></pre>
<p>
The runtime prints <code>sandbox_v2:ready</code> on stdout once its
<code>HomeAssistant</code> instance is up. The manager parses that
marker, then opens a JSON-line <code>Channel</code> over the remaining
stdin/stdout traffic. Everything after the marker is length-delimited at
the newline &mdash; one message per line.
<code>--url</code> selects the control-channel transport:
<code>stdio://</code> (the default &mdash; frames over the subprocess's
stdin/stdout) or <code>unix://&lt;path&gt;</code> (the manager opens a
unix socket and the runtime dials back). <code>ws://</code> /
<code>wss://</code> are reserved for the deferred websocket transport
and rejected for now. The runtime opens the channel and sends a
<code>Ready</code> frame (<code>sandbox_v2/ready</code>) as its first
message; the manager treats its arrival as &ldquo;running&rdquo; &mdash;
there is no stdout text marker, so stdout carries nothing but channel
frames. Frames are protobuf and length-prefixed (4-byte big-endian
length + body) on the stream transports.
</p>
<h3>Health and crash recovery</h3>
@@ -1553,19 +1560,52 @@
</div>
<!-- ======================================================== -->
<h2 id="channel">5. The channel &mdash; wire protocol over stdio</h2>
<h2 id="channel">
5. The channel &mdash; protobuf wire over pluggable transports
</h2>
<p>
The channel is split into three layers so the wire format and the byte
transport can each change without touching the concurrency-critical
dispatch core:
</p>
<ul>
<li>
<strong><code>Channel</code></strong> &mdash; the dispatch core
(pending-id map, inflight semaphore, register / call / push / close).
It speaks <code>Frame</code> objects and never touches raw bytes.
</li>
<li>
<strong><code>Codec</code></strong> &mdash; <code>Frame</code> &harr;
bytes. <code>ProtobufCodec</code> is the production wire (a typed
protobuf <code>Frame</code> envelope; the codec owns the
<code>type&nbsp;&rarr;&nbsp;message</code> registry). The
line-oriented <code>JsonCodec</code> is kept only as the channel-core
test/debug wire.
</li>
<li>
<strong><code>Transport</code></strong> &mdash; moves whole frame
blobs. <code>StreamTransport</code> length-prefixes each frame (4-byte
big-endian length + body) over a reader/writer pair (stdio, unix
socket); a future <code>WebSocketTransport</code> drops in via
<code>Channel.from_transport</code>.
</li>
</ul>
<p>
Both sides of the bridge share an identical
<code>protocol.py</code> &mdash; the constants are mirrored verbatim in
<code>protocol.py</code> (type-string constants, mirrored verbatim in
<code>homeassistant/components/sandbox_v2/protocol.py</code> and
<code>sandbox_v2/hass_client/hass_client/protocol.py</code>. Each
message is a single JSON object on its own line over the subprocess's
stdio:
<code>sandbox_v2/hass_client/hass_client/protocol.py</code>) and an
identical <code>messages.py</code> / generated <code>_pb2</code> pair. A
<code>call_service</code> request, for example, is a
<code>CallService</code> message (<code>domain</code>,
<code>service</code>, plus <code>Struct</code> <code>target</code> /
<code>service_data</code> for the dynamic fields) wrapped in the
<code>Frame</code> envelope.
</p>
<pre><code>{"id": 17, "type": "sandbox_v2/call_service", "domain": "light", "service": "turn_on", "target": {"entity_id": ["light.kitchen"]}, "service_data": {}}</code></pre>
<p>Three message shapes ride the channel:</p>
<ul>
@@ -1583,7 +1623,7 @@
</li>
<li>
<strong>Graceful close.</strong> Either end can flush a final reply
and then close stdin/stdout cleanly.
and then close the channel cleanly.
</li>
</ul>
@@ -2182,7 +2222,8 @@
<tr>
<td><code>hass_client.testing.conftest_sandbox</code></td>
<td>
Real stdio JSON-line (<code>python -m hass_client.sandbox_v2</code
Real stdio protobuf channel (<code
>python -m hass_client.sandbox_v2</code
>)
</td>
<td>Pins the subprocess boundary; freezer tests auto-skip</td>
@@ -2292,8 +2333,8 @@
<h4>Sandbox lifecycle</h4>
<p>
<code>SandboxManager</code> spawns one subprocess per group lazily;
restart-on-crash with a 3/60&nbsp;s budget;
<code>sandbox_v2:ready</code> handshake.
restart-on-crash with a 3/60&nbsp;s budget; <code>Ready</code>-frame
handshake.
</p>
</div>
@@ -147,9 +147,11 @@ class Codec(Protocol):
class JsonCodec:
"""One-JSON-object-per-frame codec.
Line-compatible with the original wire shape (sans the trailing
newline, which the length prefix replaces). Kept as the default for
tests and debugging; production rides :class:`ProtobufCodec`.
The registry-free test/debug wire: it passes frame payloads through as
plain JSON (no ``type``-to-proto lookup), so the concurrency-critical
channel core can be exercised with synthetic message types and arbitrary
dict/int payloads. Production rides :class:`ProtobufCodec`; this stays
for the channel-core tests only.
"""
def encode(self, frame: Frame) -> bytes:
+13 -10
@@ -12,10 +12,12 @@ Composes the sandbox's per-process services:
registrations and ``<owned_domain>_*`` events up to main, gated by
:class:`ApprovedDomains` (Phase 6).
The handshake: open the stdio channel, send a :data:`MSG_READY` frame
as the first message, warm-load restore state, register handlers, then
idle until SIGTERM (or until main asks for a graceful shutdown over the
channel — see Phase 9's :meth:`SandboxRuntime._handle_shutdown`).
The handshake: open the control channel (transport selected by the
``--url`` scheme — ``stdio://`` by default, ``unix://<path>`` to dial back
to the manager's unix socket), send a :data:`MSG_READY` frame as the first
message, warm-load restore state, register handlers, then idle until
SIGTERM (or until main asks for a graceful shutdown over the channel — see
Phase 9's :meth:`SandboxRuntime._handle_shutdown`).
"""
import asyncio
@@ -54,12 +56,13 @@ ChannelFactory = Callable[[], Awaitable[Channel | None]]
class SandboxRuntime:
"""Runtime: Ready-frame handshake + length-prefixed control channel.
The websocket URL/token still come in on the CLI for forward-compat
with the deferred WS transport (the scoped sandbox token will travel
that path), but today the runtime only uses the stdin/stdout control
channel that the manager opens. The handshake is a :data:`MSG_READY`
frame sent as the channel's first message — there is no stdout text
marker.
The control-channel transport is chosen from the ``--url`` scheme:
``stdio://`` (default — frames over the process's stdin/stdout) or
``unix://<path>`` (dial back to the manager's unix socket). ``ws://`` /
``wss://`` are reserved for the deferred websocket transport and
rejected for now; the token still travels the CLI for forward-compat
with it. The handshake is a :data:`MSG_READY` frame sent as the
channel's first message — there is no stdout text marker.
"""
def __init__(
+7 -6
@@ -70,14 +70,15 @@
opt-in is future work (`docs/design-share-states.md`).
## For sandbox / core contributors
- [ ] **Protobuf wire format.** Messages are protobuf (`Frame` envelope + typed
- [x] **Protobuf wire format.** Messages are protobuf (`Frame` envelope + typed
per-message bodies; `Struct`/`ListValue` only for voluptuous schemas and
`service_data`). `.proto` source + generated `_pb2` checked in; regen script
provided. (`plan-transport.md`)
- [ ] **Pluggable transports.** stdio + unix socket now; the `Transport` seam
accepts the deferred websocket drop-in (lands with share-states).
(`plan-transport.md`)
- [ ] **Handlers consume typed protobuf messages** (no dict adapters).
provided. (`plan-transport.md` T2 `360e4543300`)
- [x] **Pluggable transports.** stdio (default) + unix socket now; the
`Transport` seam accepts the deferred websocket drop-in (lands with
share-states). (`plan-transport.md` T3 `1eaa79d261e`)
- [x] **Handlers consume typed protobuf messages** (no dict adapters).
(`plan-transport.md` T2 `360e4543300`)
- [ ] **Test Dockerfile** for running the client runtime against main.
(`plan-docker.md`)
- [ ] v1 reference code lives only in git history now.