mirror of
https://github.com/NousResearch/hermes-agent.git
synced 2026-06-24 03:29:17 +00:00
docs(relay): add §3.4 — obligations on a future scale-to-zero behaviour layer (#51633)
The contract already documents the scale-to-zero PRIMITIVES (§3.2 going-idle/buffered-flip, §3.3 wake poke) and what's out of scope. This adds the missing half: the contract FROM the primitives TO the behaviour layer — the guarantees a separate scale-to-zero workstream must honour to consume them safely (register a wakeUrl before suspend; drain+ack before teardown; keep the reconnect loop live; treat suspended != down in the health model; don't assume exactly-once/prompt wake; suspend only when genuinely idle, composing with the existing drain machine). Docs-only; lets the independent scale-to-zero stream build against a written contract instead of re-reading the connector.
@@ -259,6 +259,57 @@ the wake SIGNAL so a future scale-to-zero behaviour layer can rely on "buffered
> primitive is "buffered event for a sleeping instance ⇒ its wakeUrl gets poked";
> WHAT makes the instance sleep (and wake-to-serve) is the behaviour layer.

### 3.4 Obligations on a future scale-to-zero behaviour layer

§3.2 and §3.3 ship the **primitives**; this section is the **contract a separate
scale-to-zero behaviour workstream must honour to consume them safely.** It owns
the *decision* to suspend, the actual machine suspend, and the platform/health
model — none of which live here — but it MUST hold these guarantees, which the
primitives assume:

1. **Register a `wakeUrl` before the instance can ever be suspended.** A
   suspended instance with no registered `wakeUrl` is a black hole — buffered
   inbound never triggers a poke, so it sleeps through its own traffic until
   something else reconnects it. The behaviour layer MUST ensure a reachable
   wake target is registered (self-hosted: `--wake-url`; managed: stamped) as a
   precondition of allowing suspend. A wake URL that is unreachable while the
   machine is suspended (e.g. points at the suspended machine itself with no
   platform autostart in front) is equivalent to none.
2. **Drain through `going_idle` → await `going_idle_ack` BEFORE tearing down the
   socket or suspending.** Never suspend with an un-acked flip in flight. The
   ack is the connector's confirmation that delivery for this instance is now
   buffered-only; a machine that suspends after sending `going_idle` but before
   the ack can drop the inbound that races the flip. The gateway already gates
   socket teardown on the ack (Q-5.3c); the suspend step MUST sit *after* a
   clean drain completes, not race it.
3. **Keep the NET-NEW reconnect loop live as a precondition of suspend.** The
   wake→drain contract is "poke ⇒ the gateway re-dials ⇒ the connector drains on
   the reconnect handshake." If the reconnect loop is disabled, a poke lands on a
   machine that never re-dials and the buffer strands. The behaviour layer must
   not suspend an instance whose relay transport won't reconnect on wake.
4. **Treat suspended ≠ down in the health model (Q-5.3b).** A suspended instance
   is healthy-asleep, not failed. The health/monitoring layer MUST distinguish
   the two (e.g. via the platform machine-state) so a suspended instance is not
   restarted, alerted on, or reaped as unhealthy — that would defeat the suspend
   and can race the wake/drain.
5. **The wake poke is best-effort and rate-limited — do not assume exactly-once
   or immediate wake.** At most one poke per cooldown window per instance, and a
   failed poke is swallowed. The behaviour layer must not rely on the poke as a
   guaranteed/prompt signal; correctness still rests on "the gateway drains
   whenever it next reconnects." A belt-and-suspenders wake (e.g. a scheduled
   job that also reconnects) is the behaviour layer's call, not the primitive's.
6. **Suspend only when genuinely idle — and idle is connector-observable, not
   gateway-guessed.** WHAT counts as idle (no in-flight turn + no inbound for N
   min) is the behaviour layer's policy, but it must compose with the existing
   drain machinery (`gateway_state` running→draining) rather than introduce a
   parallel relay-only idle path — the same integration constraint §3.2 places
   on `going_idle`.

These are guarantees the behaviour layer OWES the primitives; the primitives owe
the behaviour layer only what §3.2/§3.3 already specify (a flip-on-going_idle,
a durable per-instance buffer + ack-gated reconnect drain, and a poke on the
first buffered event for a flipped instance).
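
As an illustration only, the obligations above could be composed into a single suspend gate roughly as follows. This is a minimal sketch, not the connector's or the gateway's API: `InstanceView`, `suspend_blockers`, `try_suspend`, `health_status`, the `"suspended"` machine-state string, and every field name are hypothetical, and the three callbacks stand in for whatever the behaviour layer actually uses to send `going_idle`, wait for `going_idle_ack`, and suspend the machine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class InstanceView:
    """Hypothetical snapshot the behaviour layer assembles before suspending."""
    wake_url: Optional[str]                    # obligation 1: registered wake target
    wake_url_reachable_while_suspended: bool   # obligation 1: not a self-pointing URL
    reconnect_loop_enabled: bool               # obligation 3
    in_flight_turns: int                       # obligation 6
    seconds_since_last_inbound: float          # obligation 6


def suspend_blockers(view: InstanceView, idle_after_s: float) -> List[str]:
    """Return the reasons suspend is not allowed yet (empty list means allowed)."""
    blockers: List[str] = []
    if not view.wake_url or not view.wake_url_reachable_while_suspended:
        blockers.append("no reachable wakeUrl registered")
    if not view.reconnect_loop_enabled:
        blockers.append("reconnect loop disabled")
    if view.in_flight_turns > 0 or view.seconds_since_last_inbound < idle_after_s:
        blockers.append("not idle")
    return blockers


def try_suspend(
    view: InstanceView,
    idle_after_s: float,
    send_going_idle: Callable[[], None],
    wait_for_ack: Callable[[], bool],
    suspend_machine: Callable[[], None],
) -> List[str]:
    """Suspend only after the preconditions hold AND going_idle was acked."""
    blockers = suspend_blockers(view, idle_after_s)
    if blockers:
        return blockers
    send_going_idle()
    # Obligation 2: never suspend with an un-acked flip in flight.
    if not wait_for_ack():
        return ["going_idle_ack not received"]
    suspend_machine()
    return []


def health_status(machine_state: str, responding: bool) -> str:
    """Obligation 4: a suspended machine is healthy-asleep, not down."""
    if machine_state == "suspended":
        return "healthy-asleep"
    return "healthy" if responding else "down"
```

In this sketch a non-empty return value from `try_suspend` means the machine was left running. Obligation 5 has no code path here on purpose: nothing in the gate depends on the poke arriving, because correctness is assumed to rest on the drain that happens on the next reconnect.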

---

## 4. Outbound: action set