Commit Graph

12642 Commits

Author SHA1 Message Date
Teknium
ed0e2ab371 chore(providers): remove dead cloudcode-pa quota-fallback branches
The google-antigravity and google-gemini-cli OAuth providers were removed
in #50492. They were the only producers of a cloudcode-pa:// base_url, so
the account-level-quota early-returns in _pool_may_recover_from_rate_limit
and _credential_pool_may_recover_rate_limit are now unreachable.

- Drop the dead cloudcode-pa:// checks and the now-unused provider/base_url
  params on _pool_may_recover_from_rate_limit (only caller updated).
- Prune the obsolete CloudCode-specific regression tests; keep the live
  single/multi-entry pool-rotation invariants (#11314).
2026-06-23 11:26:03 -07:00
Teknium
70d28b62fb feat(cli): track background subagents in the status bar (#51441)
The classic prompt_toolkit status bar already shows two background
indicators: ▶ N (/background agent threads) and ⚙ N (shell processes
spawned by terminal(background=true)). Background/async subagents
(delegate_task batches and background single delegations) had no
indicator despite being long-running work the user should be able to
see at a glance.

Add a third indicator ⛓ N sourced from
tools.async_delegation.active_count() — the count of delegations still
in the 'running' state. Renders in the plain-text builder and the
styled-fragment builder across the same width tiers as the other two
(omitted on the narrow <52 tier), guarded so a raising active_count()
leaves the snapshot at 0.
2026-06-23 11:09:08 -07:00
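A minimal sketch of the guarded count described in 70d28b62fb. The helper names, and the choice to hide zero counts, are assumptions made for illustration; only the three glyphs, the <52 narrow tier, and the "raising active_count() leaves the snapshot at 0" behavior come from the commit message.

```python
from typing import Callable

NARROW_TIER_WIDTH = 52  # below this width the background indicators are omitted


def safe_active_count(active_count: Callable[[], int]) -> int:
    """Return the running-delegation count, or 0 if the lookup raises."""
    try:
        return max(0, int(active_count()))
    except Exception:
        # A failing counter must never break status-bar rendering.
        return 0


def background_indicators(threads: int, shells: int, subagents: int, width: int) -> str:
    """Build the plain-text indicator run for the status bar."""
    if width < NARROW_TIER_WIDTH:
        return ""
    pairs = (("▶", threads), ("⚙", shells), ("⛓", subagents))
    # Assumption: an indicator is shown only when its count is positive.
    return " ".join(f"{icon} {count}" for icon, count in pairs if count > 0)
```

Passing the counter in as a callable keeps the sketch independent of the real tools.async_delegation module.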
Teknium
6cc07b6cd0 feat(discord): render reasoning as -# subtext via display.reasoning_style (#51168)
Adds a per-platform display.reasoning_style setting (code | blockquote |
subtext) controlling how the show_reasoning summary renders on the gateway.
Discord defaults to "subtext" (-# small grey metadata text); every other
platform keeps the fenced code block. Resolves through the existing
display.platforms.<platform>.reasoning_style override chain.
2026-06-23 10:44:02 -07:00
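The three styles from 6cc07b6cd0 could be rendered roughly as follows. This is a sketch, not the gateway's code: the function names are hypothetical, and the lookup order (per-platform override, then the global setting, then the platform default) is an assumption about how the override chain resolves.

```python
PLATFORM_DEFAULTS = {"discord": "subtext"}  # every other platform falls back to "code"


def resolve_reasoning_style(display: dict, platform: str) -> str:
    """Pick the style for a platform (assumed order: platform, global, default)."""
    per_platform = (display.get("platforms") or {}).get(platform) or {}
    return (
        per_platform.get("reasoning_style")
        or display.get("reasoning_style")
        or PLATFORM_DEFAULTS.get(platform, "code")
    )


def render_reasoning(summary: str, style: str) -> str:
    """Format a reasoning summary as subtext, blockquote, or a fenced code block."""
    lines = summary.splitlines() or [""]
    if style == "subtext":
        # Discord renders lines starting with "-# " as small grey text.
        return "\n".join(f"-# {line}" for line in lines)
    if style == "blockquote":
        return "\n".join(f"> {line}" for line in lines)
    fence = "`" * 3
    return f"{fence}\n{summary}\n{fence}"
```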
xxxigm
f32be4439c test(install): assert no system-browser auto-detect + snap override repair
Replace the old "skips download when a system browser exists" assertions with
tests for the new behavior:
- no PATH scan for browser command names, and the "use the system browser" path
  is gone;
- find_system_browser consults only an explicit AGENT_BROWSER_EXECUTABLE_PATH
  override (which still skips the bundled download);
- strip_snap_browser_override runs on both install paths and a /snap/* path is
  rejected, so already-affected installs auto-recover on update.
2026-06-23 10:38:15 -07:00
xxxigm
97888fed48 fix(install): drop system-browser fallback + auto-repair stale snap override
The installer scanned PATH/well-known locations for a Chrome/Chromium binary
and, when found, skipped the bundled Playwright Chromium download and wrote that
path into ~/.hermes/.env as AGENT_BROWSER_EXECUTABLE_PATH. On Snap-based systems
`command -v chromium` resolves to /snap/bin/chromium, whose sandbox blocks
agent-browser's control socket under /tmp -- so every browser_navigate hung
until the 60s timeout fired ("opening web page failed").

Drop the system-browser fallback entirely (per maintainer direction):
find_system_browser()/Find-SystemBrowser now honor ONLY an explicit, user-set
AGENT_BROWSER_EXECUTABLE_PATH override -- no PATH scan, no well-known-path scan.
A /snap/* path is rejected even when set explicitly, since its confinement is
the bug. Applied to both install.sh (Linux/macOS) and install.ps1 (Windows).

Crucially, also auto-repair already-affected installs: the bad snap path
persists in .env and is read directly by the runtime, and the installer skips
re-config when AGENT_BROWSER_EXECUTABLE_PATH is already set ("already
configured"), so a plain reinstall/update never recovered an existing user. New
strip_snap_browser_override() removes a snap-pointing AGENT_BROWSER_EXECUTABLE_PATH
(and its auto-written comment) from .env on every install/update, run from both
browser-setup paths (install_node_deps and ensure_browser), so updating is
enough to recover. A deliberately-set non-snap override is left untouched.

docker/stage2-hook.sh is intentionally untouched: it discovers the bundled
Playwright Chromium, not a system browser.
2026-06-23 10:38:15 -07:00
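The repair step in 97888fed48 lives in install.sh and install.ps1; the Python below only illustrates the logic under stated assumptions. It assumes the auto-written comment is the comment line directly above the variable, which the commit message does not confirm, and the function name is hypothetical.

```python
ENV_KEY = "AGENT_BROWSER_EXECUTABLE_PATH"


def strip_snap_override(env_text: str) -> str:
    """Drop a snap-pointing browser override (and the comment above it) from .env text."""
    kept: list[str] = []
    for line in env_text.splitlines():
        key, sep, value = line.partition("=")
        value = value.strip().strip('"').strip("'")
        if sep and key.strip() == ENV_KEY and value.startswith("/snap/"):
            # Assumption: the installer's comment sits immediately above the entry.
            if kept and kept[-1].lstrip().startswith("#"):
                kept.pop()
            continue
        kept.append(line)
    trailing = "\n" if env_text.endswith("\n") else ""
    return "\n".join(kept) + trailing
```

A non-snap override passes through unchanged, matching the "deliberately-set non-snap override is left untouched" rule.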
ethernet
0089bd820f fix(ci): classify should default to no MCP 2026-06-23 10:32:27 -07:00
wnuuee1
9fd2b2cb9f fix(desktop): replace native title tooltips with styled Tip component 2026-06-23 10:19:30 -07:00
ethernet
a0471e2464 fix(ci): only run supplychain checks in pr 2026-06-23 09:46:25 -07:00
ethernet
c820eb6a5a ci: remove unused windows installer job 2026-06-23 09:30:50 -07:00
ethernet
05c896cf52 ci: refactor paths & clones
ci: centralize path-gating behind single orchestrator + all-checks-pass gate

Replace the scattered per-workflow detect-changes pattern with a single
ci.yml orchestrator that runs the classifier once, then conditionally
calls sub-workflows via workflow_call based on lane outputs. A final
all-checks-pass job (if: always()) aggregates all results so branch
protection only needs to require one check.

Changes:
- New .github/workflows/ci.yml orchestrator (detect + conditional calls
  + all-checks-pass gate)
- Extend classify_changes.py with scan/deps/mcp_catalog lanes, absorbing
  supply-chain-audit's internal changes job
- Update detect-changes/action.yml to expose the new lane outputs
- Convert all 10 PR-gated sub-workflows to workflow_call-only triggers,
  removing their push/pull_request triggers and per-step detect-changes
  guards (gating now happens at the orchestrator level)
- lint.yml + supply-chain-audit.yml receive event_name as a workflow_call
  input to replace github.event_name (which is "workflow_call" inside
  called workflows)
- supply-chain-audit.yml: remove internal changes job + *-gate jobs
  (orchestrator handles gating, booleans arrive as inputs)
- contributor-check.yml: remove internal filter step
- Update test_classify_changes.py for 6-lane output + new supply-chain
  test cases
2026-06-23 09:30:50 -07:00
Brooklyn Nicholson
56b4ef74a6 ci: make dependency installs resilient to transient flakes
`npm ci` / `uv sync` / toolchain header fetches occasionally die on
transient network blips — e.g. node-pty's node-gyp fetching Node headers
(an undici assert) during the typecheck job's `npm ci`, which killed the job
before `tsc` ever ran. "Re-run and it goes green" is exactly what CI should
do itself.

- New reusable `.github/actions/retry` composite action wraps a command and
  retries on failure (3x / 10s, command passed via env so it can't inject).
  Applied to every PR-path network install: npm ci (typecheck, desktop
  build, docs site), uv sync (tests, e2e), uv tool install (lint),
  pip install (docs site).
- typecheck now runs `npm ci --ignore-scripts`: `tsc` needs only sources +
  type defs, so skipping install scripts drops node-pty's native rebuild
  (whose header fetch was the flake) and is faster. Validated locally — tsc
  passes for ui-tui, apps/shared, and apps/desktop with scripts skipped.
- ripgrep download uses `curl --retry`.

Docker (main-only) and the release/windows workflows are intentionally left
for a follow-up.
2026-06-23 09:30:50 -07:00
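The retry behavior in 56b4ef74a6 is implemented as a composite GitHub action; the sketch below restates the same idea in Python. It reads "3x / 10s" as three attempts with a 10 second pause between them, which is an assumption, and the function name is hypothetical.

```python
import subprocess
import time
from typing import Callable


def run_with_retry(
    command: str,
    attempts: int = 3,
    delay: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run a shell command, retrying on a non-zero exit; return the last exit code."""
    returncode = 1
    for attempt in range(1, attempts + 1):
        returncode = subprocess.run(command, shell=True).returncode
        if returncode == 0:
            return 0
        if attempt < attempts:
            sleep(delay)  # back off before the next attempt
    return returncode
```

The sleep function is injectable so the pause can be stubbed out when exercising the loop.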
Brooklyn Nicholson
2977e74543 ci: build Docker on main + release only, never on PRs
The image build + smoke test + integration suite are the heaviest jobs in CI
(~9-11 min) and ran on every PR. Gate them to push-to-main and release: a
broken build surfaces on the main push, while the cheap pre-merge guards
(docker-lint hadolint/shellcheck, uv-lockfile-check) still run on PRs to
catch the common Dockerfile/lockfile breakage. Steps skip on PRs so the job
stays green; the dead PR-only arm64 cache-warm build is removed.
2026-06-23 09:30:50 -07:00
Brooklyn Nicholson
45540cfb5e ci: run only the lanes a PR affects (python/frontend/site)
Heavy PR checks run on every PR because the workflows deliberately avoid
`on.paths` filters — a path-gated workflow leaves its required check pending
forever when no matching file changes, blocking merge. So a docs-only PR
still spins up the TypeScript matrix, the full Python suite, and ruff/ty.

Keep every workflow triggering on every PR (checks always report) but gate
the expensive *steps* on what the PR touches. Skipping a step (not the job)
leaves the job green, so required checks never hang — the same idiom already
proven in contributor-check.yml.

A classifier (scripts/ci/classify_changes.py) maps the PR diff to three
lanes — python, frontend, site — surfaced as step outputs by a composite
action (.github/actions/detect-changes). Fail-open: an empty diff or any
.github/ change runs everything; python is a denylist (skipped only when
every file is provably prose or a frontend-only package); skills/**/SKILL.md
counts as python-relevant since the skill-doc tests read that tree. Non-PR
events always run the full pipeline.
2026-06-23 09:30:50 -07:00
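The lane rules in 45540cfb5e can be sketched as below. This is not the contents of scripts/ci/classify_changes.py: the path prefixes and the prose suffix list are placeholders chosen for illustration, while the fail-open cases, the denylist shape of the python lane, and the SKILL.md exception follow the commit message.

```python
from pathlib import PurePosixPath

FRONTEND_PREFIXES = ("ui-tui/", "apps/")  # assumed frontend-only packages
SITE_PREFIXES = ("website/",)             # assumed docs-site location
PROSE_SUFFIXES = {".md", ".txt"}          # assumed "provably prose" suffixes


def classify(paths: list[str]) -> dict[str, bool]:
    """Map changed paths to the python/frontend/site lanes."""
    # Fail open: an empty diff or any .github/ change runs everything.
    if not paths or any(p.startswith(".github/") for p in paths):
        return {"python": True, "frontend": True, "site": True}

    def is_frontend(p: str) -> bool:
        return p.startswith(FRONTEND_PREFIXES)

    def is_site(p: str) -> bool:
        return p.startswith(SITE_PREFIXES)

    def skippable_for_python(p: str) -> bool:
        # Skill docs are read by the Python tests, so they are never skippable.
        if p.startswith("skills/") and p.endswith("/SKILL.md"):
            return False
        return is_frontend(p) or is_site(p) or PurePosixPath(p).suffix in PROSE_SUFFIXES

    return {
        # Denylist: python is skipped only when every file is skippable.
        "python": not all(skippable_for_python(p) for p in paths),
        "frontend": any(is_frontend(p) for p in paths),
        "site": any(is_site(p) for p in paths),
    }
```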
Teknium
351afd353d docs(computer-use): document Windows UIPI elevated-window limitation (#51121)
A Medium-integrity Hermes agent cannot drive High-integrity (admin)
windows on Windows — UIPI blocks UIA enumeration and mouse injection
(SOM returns 0 elements, clicks silently no-op, screenshots still work,
keyboard partially bypasses). OS constraint affecting every Windows
automation stack, not a cua-driver bug. Document the symptom + the
run-elevated workaround. Closes #49067.
2026-06-23 08:41:33 -07:00
kshitijk4poor
5ecf3bf0e0 fix(slack): report ext-matched audio mimetype for rerouted voice clips
Follow-up to the salvaged voice-clip fix: the rerouted video/mp4 branch
used {".m4a": "audio/mp4"}.get(ext, "audio/mp4"), whose sole key's value
equals the default, so it always returned "audio/mp4" regardless of the
cached extension (dead lookup + a throwaway dict per inbound voice clip).

Replace it with a module-level _SLACK_EXT_TO_AUDIO_MIME map so the reported
media_type matches the bytes we cached (e.g. a clip cached as .wav now
reports audio/wav instead of audio/mp4). STT routing already keys on the
audio/ prefix + cached filename extension, so behavior is unchanged; this
just removes the dead construct and keeps the reported mimetype coherent.
2026-06-23 14:44:12 +05:30
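To make the dead lookup in 5ecf3bf0e0 concrete, here is a small comparison. The map contents are assumptions apart from the .wav and .m4a entries and the audio/mp4 default, which the commit message mentions; the real _SLACK_EXT_TO_AUDIO_MIME may list different extensions.

```python
def old_lookup(ext: str) -> str:
    """The replaced construct: the only key maps to the default, so ext is ignored."""
    return {".m4a": "audio/mp4"}.get(ext, "audio/mp4")


# Illustrative module-level map (entries other than .m4a and .wav are assumed).
_SLACK_EXT_TO_AUDIO_MIME = {
    ".m4a": "audio/mp4",
    ".mp4": "audio/mp4",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
    ".ogg": "audio/ogg",
    ".webm": "audio/webm",
}


def audio_mime_for(ext: str) -> str:
    """Report a mimetype that matches the cached file extension."""
    return _SLACK_EXT_TO_AUDIO_MIME.get(ext.lower(), "audio/mp4")
```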
Ben
2196584161 fix(slack): transcribe in-app voice messages (audio/mp4) instead of failing
Slack in-app voice clips ("record a clip") arrive as MP4/AAC containers
(mimetype audio/mp4, filename audio_message*.mp4), and Slack sometimes
labels them video/mp4. The inbound audio handler derived the cache
extension from the mimetype and fell back to ".ogg" for anything not in
{.ogg,.mp3,.wav,.webm,.m4a} — so audio/mp4 voice messages were cached as
.ogg. OpenAI STT (whisper-1, gpt-4o-transcribe) sniffs the container from
the FILENAME extension, so it received MP4 bytes named .ogg and rejected
them. WhatsApp .ogg and uploaded .m4a worked only because their extension
happened to match the bytes.

Fix:
- _resolve_slack_audio_ext(): pick the cache extension from the real
  filename first, then a mimetype map (audio/mp4 -> .m4a), defaulting to
  .m4a — never the bogus .ogg fallback. Mirrors the video branch and the
  audio map already in gateway/platforms/bluebubbles.py.
- _is_slack_voice_clip(): detect audio-only clips mislabeled video/mp4
  via the slack_audio subtype / audio_message* filename, and route them
  through the audio path (cached as audio, reported as audio/*) so they
  reach STT instead of video understanding. Genuine videos (and
  slack_video screen recordings) are left on the video path.

Verified end-to-end against a real audio-only MP4: old path cached it as
.ogg (ffprobe shows MP4 bytes -> container mismatch -> OpenAI rejects);
new path caches it as .mp4 (extension matches bytes -> accepted).

Adds inbound-audio tests (previously none): helper unit tests plus
_handle_slack_message E2E coverage for audio/mp4, video/mp4-mislabeled
voice clips, and a real video staying on the video path. Confirmed the
two voice-message tests fail without the fix (mutation check).
2026-06-23 14:44:12 +05:30
Ben Barclay
45bc4fb37f feat(relay): declare relevance policy to the connector + document the management plane (#51248)
The gateway half of Phase 6 Unit ζ: project the agent's existing relevance
knobs into the connector's platform-agnostic vocabulary and declare them at boot
over the /relay/policy route, so the SAME mention-gating / free-response /
allow-bots behavior the agent applies directly also governs relay delivery (and
excluded chatter never wakes a scaled-to-zero agent).

- gateway/relay/__init__.py:
  - relay_relevance_policy(): project require_mention -> requireAddress,
    free_response_channels -> freeResponseScopes, {PLATFORM}_ALLOW_BOTS in
    {mentions,all} -> allowOtherBots. Reads the fronted platform's config block
    + bridged top-level keys. Returns None when all-default (the connector's
    quiet default already matches) or no concrete platform is fronted.
  - send_relay_policy(): POST /relay/policy authenticated with the gateway's own
    per-gateway upgrade token (make_upgrade_token — same bearer as the WS
    upgrade), so the connector attaches it to the authenticated instance, never
    a body-asserted id. Re-declares every boot (self-healing, full replace).
    NEVER raises, NEVER blocks boot — relevance is an optimization layered on
    the δ/ε authorization gate. Reuses the per-gateway secret + the
    /relay/provision host; no new inbound surface, no new credential.
  - _policy_url(): ws(s)://…/relay -> http(s)://…/relay/policy.
- gateway/run.py: call send_relay_policy() after register_relay_adapter()
  succeeds (the secret is resolved by then).
- docs/relay-connector-contract.md: new §7 documenting per-instance delivery +
  the management plane (/manage/* + /relay/policy) + the relevance-declaration
  contract; versioning renumbered to §8. Contract conformance test stays green
  (§2/§3 tables untouched).

Tests: +12 (projection mapping incl. comma-string + top-level fallback; send
auth/skip/fail-soft/non-200). Full relay suite 118 pass. The connector route is
already E2E-proven (connector repo gateway_policy_driver.py); this adds the real
gateway send-path it pairs with.

This completes Phase 6 (Team Gateway per-user isolation) end to end.
2026-06-23 18:43:19 +10:00
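The URL mapping that 45bc4fb37f attributes to _policy_url() (ws(s)://…/relay -> http(s)://…/relay/policy) can be sketched with the standard library. The function below is a guess at the shape, not the gateway's implementation, and the host in the usage note is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit

_SCHEME_MAP = {"ws": "http", "wss": "https"}


def policy_url(relay_url: str) -> str:
    """Derive the HTTP policy endpoint from the relay WebSocket URL."""
    parts = urlsplit(relay_url)
    scheme = _SCHEME_MAP.get(parts.scheme, parts.scheme)
    path = parts.path.rstrip("/") + "/policy"
    # Query and fragment are dropped; the policy route takes neither in this sketch.
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

For example, policy_url("wss://connector.example/relay") yields "https://connector.example/relay/policy".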
brooklyn!
211ba9c7d3 feat(agent): one-shot LLM helper + llm.oneshot gateway RPC (#51261)
A "one-shot" is a single stateless model call that runs OUTSIDE any conversation:
it never touches session history, never breaks prompt caching, and returns plain
text. UI surfaces need this for small generative chores — a commit message from a
diff, a rename suggestion, a summary — where an agent turn would pollute the
thread and hand-rolling an LLM call at every call site would be worse.

- `agent/oneshot.py`: `run_oneshot(...)` over the existing auxiliary-client
  plumbing (same path as title generation). Two call shapes: explicit
  instructions/input, or a registered `template` + `variables` (templates own the
  prompt engineering so it stays consistent across CLI/TUI/desktop). Ships a
  `commit_message` template. Model selection inherits the live session via
  `main_runtime`, else the configured aux `task` backend.
- `tui_gateway/server.py`: `llm.oneshot` RPC (long-handler) inheriting the
  session's model when `session_id` resolves.

Stateless by construction — no session mutation, cache untouched.
2026-06-23 08:01:50 +00:00
brooklyn!
af7b7f6322 feat(agent): expose coding-context project facts as structured data + project.facts RPC (#51259)
Follow-up to the coding-context posture (#43316): that PR detects each repo's
verify loop (manifests, package manager, exact test/lint/build commands, context
files) and bakes it into the system-prompt snapshot — but only as a string, for
the model. Non-prompt consumers (the desktop verify UI) had no way to read it
without re-sniffing and drifting from the prompt.

Split detection from rendering, keeping one source of truth:

- `detect_project_facts(root) -> ProjectFacts` (frozen) holds the structured
  facts; `_project_facts()` now renders it into the same snapshot lines, so the
  prompt block stays byte-identical (cache-safe).
- `project_facts_for(cwd)` resolves the workspace root (git, else marker) and
  returns the structured facts, or None outside a workspace.
- `project.facts` gateway RPC surfaces it to any client (desktop/TUI/ACP).

Tests assert the structured output and that the UI-facing commands never drift
from what the prompt block renders (one detector feeds both).
2026-06-23 08:00:01 +00:00
Teknium
bb7ff7dc30 revert(cron): return cron job storage to per-profile (reverts #32117 + #50993) (#51116)
* Revert "fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993)"

This reverts commit 660e36f097.

* Revert "fix(cron): anchor cron storage at the default root home (not the active profile)"

This reverts commit a5c09fd176.
2026-06-22 17:53:50 -07:00
brooklyn!
2a10b8384a Merge pull request #51103 from NousResearch/bb/desktop-tool-preview-cleanup
fix(desktop): manual tool previews via status stack
2026-06-22 19:29:29 -05:00
Brooklyn Nicholson
7daa6d83fc style(desktop): soften inline code and expanded tool chrome
Drop the inline-code border; halve the expanded tool block radius.
2026-06-22 19:23:07 -05:00
Brooklyn Nicholson
48a8f84169 fix(desktop): toggle preview rail and open in browser
Status row opens/closes the preview pane; external link uses a dedicated
file:// browser bridge (openExternal, not openPath).
2026-06-22 19:22:11 -05:00
Brooklyn Nicholson
d0af7fc954 feat(desktop): detect tool previews into composer status stack
Register previewable artifacts from the tool row, feed a session-scoped store,
and render compact rows above the composer. Remove the inline preview card.
2026-06-22 19:22:11 -05:00
Brooklyn Nicholson
cb17a9efb2 fix(desktop): stop auto-opening tool previews
Drop gateway-event preview registration so HTML artifacts from tool results
no longer pop the rail. De-dupe the inline preview card label.
2026-06-22 19:21:20 -05:00
Eri Barrett
ba9e3a491b feat(memory): Honcho OAuth connect — desktop and CLI flows + token refresh (#44335)
* feat(memory): OAuth token storage and refresh for the Honcho provider

* feat(memory): refresh the Honcho OAuth token in the client and session

* feat(memory): zero-CLI loopback OAuth authorization flow

* feat(memory): generic memory-provider OAuth connect endpoints

* feat(desktop): memory-provider OAuth connect link

* feat(memory): CLI OAuth sign-in with source-tagged authorize links

* fix(memory): IP-literal loopback redirect and consent config_path on the authorize link

* fix(memory): profile-scope the memory-provider OAuth endpoints

* refactor(desktop): generic memory-provider OAuth client functions

* docs(memory): trim OAuth module docstrings to the invariants

* docs(memory): document OAuth connect as an optional auth method

* fix(memory): send home-relative display path to consent, not the absolute path

* perf(memory): cache OAuth token expiry in memory to skip the hot-path disk read

* fix(memory): log OAuth refresh failures at warning, not debug

* feat(memory): fall back to an OS-assigned loopback port when 8765 is taken

* test(memory): cover the desktop Connect launcher, status, and provider dispatch

* fix(desktop): keep the memory-provider dropdown one size regardless of connect state

* fix(desktop): move the memory connect link to the description line, leaving the dropdown untouched

* refactor(memory): move OAuth connect routes out of web_server into a memory-layer router

* refactor(desktop): import MemoryConnect directly, drop the single-export barrel

* fix(memory): launch CLI OAuth sign-in right after the auth choice, not after the wizard

* fix(desktop): auto-clear the OAuth error state instead of leaving it sticky

* test(honcho): isolate auth-method prompt from deployment-shape wizard tests

main's wizard suite scripts the cloud prompts without the OAuth auth-method step; auto-answer it in the shared helper so the answer lists stay shape-only.

* docs(honcho): document query-adaptive reasoning level (reasoningHeuristic)

README never mentioned reasoningHeuristic and listed reasoningLevelCap as an orphaned cap with the wrong default (— vs "high"). Add the query-adaptive scaling note + the reasoningHeuristic/reasoningLevelCap rows (grouped under Dialectic & Reasoning), matching the wording already on the hosted honcho.md page, and add a pointer from the memory-providers overview.

* fix(honcho): default the CLI peer prompt to the OAuth consent name

The CLI runs the grant with apply_config=False, so the peerName the user just entered at consent was dropped and the wizard's 'Your name' prompt fell back to $USER. Surface it as a transient OAuthCredential.consent_peer_name (set even when config isn't merged) and seed the prompt default from it.

* feat(honcho): split OAuth client_id by surface (cli=hermes-agent, desktop=hermes-desktop)

resolve_endpoints now picks the client_id from the initiating surface and
threads it through authorize -> token exchange -> persisted grant -> refresh,
so the CLI and desktop register as distinct OAuth clients. Surface-specific
env overrides (HONCHO_OAUTH_CLIENT_ID_CLI/_DESKTOP) win over the generic
HONCHO_OAUTH_CLIENT_ID, which still overrides every surface.

* feat(honcho): show OAuth vs API key in status; detect existing OAuth in setup

status now prints 'Auth: OAuth (clientId, token valid Xm/expired)' instead of
masking the OAuth access token as a generic API key; setup notes an existing
OAuth grant when re-run.

* docs(honcho): drop 'shared pool' wording from unified observation mode help

* fix(honcho): cross-process lock around OAuth refresh to prevent grant revocation

The in-process threading lock can't stop a sibling process (another profile or
the desktop app sharing honcho.json) from replaying the single-use refresh
token and tripping reuse-detection, which revokes the whole grant. Guard the
read-refresh-persist section with an OS file lock on <config>.lock so only one
process rotates at a time; the others re-read the freshly-persisted token.
Best-effort: platforms without flock degrade to in-process serialization.

* refactor(honcho): one OAuth client (hermes-agent) for all surfaces

Collapse the per-surface client_id split. CLI and desktop now use a single
client_id (hermes-agent); consent branding/UI still adapt via the source query
param. One grant identity means no clientId-vs-refresh-token desync that could
get the grant revoked. HONCHO_OAUTH_CLIENT_ID still overrides for self-hosting.

* fix(honcho): per-session resolves to session_id, never remapped by title

Reorder resolve_session_name so stable identifiers win over labels: gateway
per-chat key first, then the per-session session_id, then the cwd map / title.
A (possibly auto-generated) title can no longer remap a live per-session
conversation onto a second Honcho session mid-stream — fixes the desktop, which
is per-conversation via session_id. Consequence: a gateway's per-chat key now
also wins over a title (titles never remap a stable id).
2026-06-22 19:16:47 -05:00
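The cross-process lock around OAuth refresh described in ba9e3a491b could be approximated as below. This is a sketch under assumptions: the helper name is hypothetical, and the real code may open or name the lock file differently. It keeps the two properties the message states, an OS file lock on <config>.lock and a fallback to in-process serialization where flock is unavailable.

```python
import contextlib
import threading

try:
    import fcntl  # POSIX only
except ImportError:  # e.g. Windows: degrade to the in-process lock
    fcntl = None

_in_process_lock = threading.Lock()


@contextlib.contextmanager
def refresh_lock(config_path: str):
    """Serialize the read-refresh-persist section across threads and processes."""
    with _in_process_lock:
        if fcntl is None:
            yield
            return
        with open(f"{config_path}.lock", "a") as handle:
            fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other holders release
            try:
                yield
            finally:
                fcntl.flock(handle, fcntl.LOCK_UN)
```

A caller would re-read the persisted token after acquiring the lock, since a sibling process may have rotated it while this one was waiting.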
brooklyn!
672ea1f894 Merge pull request #50994 from NousResearch/hermes/hermes-9fb04abd
fix(computer-use): working vision capture + whole-screen/desktop target on Windows
2026-06-22 19:02:04 -05:00
Brooklyn Nicholson
833710d33e Merge remote-tracking branch 'origin/main' into pr-50994
# Conflicts:
#	tools/computer_use/cua_backend.py
2026-06-22 18:48:07 -05:00
brooklyn!
116331dd3f Merge pull request #51094 from NousResearch/bb/desktop-thread-timeline
feat(desktop): conversation timeline rail for long threads
2026-06-22 18:41:13 -05:00
brooklyn!
760fd9513e Merge pull request #51078 from NousResearch/bb/fix-vision-capture
fix(computer-use): vision capture returns an image on cua-driver >=0.5.x
2026-06-22 18:37:18 -05:00
brooklyn!
6780cee679 Merge pull request #51072 from NousResearch/bb/desktop-computer-use
feat(computer-use): add a cross-platform readiness preflight to the desktop
2026-06-22 18:37:07 -05:00
Brooklyn Nicholson
3fffecbdaf feat(desktop): add timeline rail for long chat threads
Adds a compact right-edge prompt timeline for long desktop chat sessions, with hover previews, click-to-jump, active/hover row states, and pane hover-reveal suppression so the rail can live at the hard edge without opening side panels.
2026-06-22 18:34:07 -05:00
brooklyn!
9bacd7d4bb Merge pull request #51096 from NousResearch/bb/desktop-oversized-image-replay
fix(agent): shrink anthropic-native image history
2026-06-22 18:30:18 -05:00
brooklyn!
b90f1e4ac0 Merge pull request #51093 from NousResearch/bb/desktop-string-stack-overflow
fix(desktop): avoid stack overflow on embedded image replay
2026-06-22 18:26:34 -05:00
Brooklyn Nicholson
88e136448d fix(agent): shrink anthropic-native image history
Retry image-size rejections by rewriting Anthropic base64 image source blocks, not just OpenAI-style image_url parts.
2026-06-22 18:23:21 -05:00
Brooklyn Nicholson
a6b670d4a2 fix(desktop): avoid stack overflow on embedded image replay
Replace the giant embedded-image regex with a bounded scanner so opening sessions with multi-megabyte data URLs does not crash the renderer.
2026-06-22 18:19:36 -05:00
Brooklyn Nicholson
3c1058e2e9 fix(computer-use): set stdin=DEVNULL on cua-driver subprocess calls
The subprocess-stdin guard (TUI gateway fd-inheritance protection) flagged
the `permissions grant` call. None of the cua-driver probes/grant read
stdin, so DEVNULL is correct; apply it to the shared `_run` helper and the
grant call.
2026-06-22 17:59:18 -05:00
Brooklyn Nicholson
2dfcead683 feat(computer-use): make the preflight cross-platform (win/linux)
The card was macOS-only. cua-driver also runs on Windows and Linux, so
fold `cua-driver doctor` (cross-platform binary/health probes) into a
single OS-aware `ready` signal:

- macOS: ready == both TCC grants; keeps the permission rows + grant flow.
- Windows/Linux: no TCC toggles, so ready == driver health, with a
  per-OS note (SmartScreen/UIAccess on Windows; X11/XWayland on Linux).

`computer_use_status()` replaces the macOS-only `permissions_status()` and
surfaces `platform`, `ready`, `can_grant`, and the doctor `checks` (non-ok
ones render as warnings). CLI `permissions status`, the REST endpoint, and
the desktop card all key off the one payload. Grant stays macOS-only (400
elsewhere — nothing to grant).
2026-06-22 17:48:43 -05:00
Brooklyn Nicholson
807b696295 fix(computer-use): vision capture returns an image on cua-driver >=0.5.x
Vision mode called a `screenshot` MCP tool that cua-driver dropped in
0.5.x (full-window PNG capture was folded into `get_window_state`). The
driver replied "Unknown tool: screenshot", so `images` came back empty,
`png_b64` stayed None, and capture returned a 0x0 result with no image
on every call. `som`/`ax` were unaffected because they already use
`get_window_state`, which masked the regression.

Route vision by capability:
- driver advertises `screenshot` (older builds) -> use it (no AX walk)
- otherwise -> call `get_window_state` but discard the AX tree/elements,
  returning only the PNG so vision stays free of element noise
- capabilities not yet discovered -> try `screenshot`, fall back to
  `get_window_state` on an empty image, so the path self-heals

Add `_image_from_tool_result` to pull the PNG from either an MCP image
content-part or `structuredContent.screenshot_png_b64`, and use it on
the som path too so the image won't silently drop on driver builds that
deliver it via structuredContent instead of a content part.

Verified live (vision: 1568x954, 0 elements; som: image + 527 elements)
and with unit coverage of all four routing cases.
2026-06-22 17:41:42 -05:00
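The capability routing in 807b696295 can be sketched as follows. The function names and the call_tool callback are stand-ins for the real backend, and the result dictionaries assume the MCP tool-result shape (a content list with image parts, plus an optional structuredContent object).

```python
from typing import Callable, Optional


def image_from_tool_result(result: dict) -> Optional[str]:
    """Pull base64 PNG data from an image content part or structuredContent."""
    for part in result.get("content") or []:
        if part.get("type") == "image" and part.get("data"):
            return part["data"]
    return (result.get("structuredContent") or {}).get("screenshot_png_b64")


def capture_vision(call_tool: Callable[[str], dict], capabilities: Optional[set]) -> Optional[str]:
    """Return only the PNG for vision mode, routing by what the driver advertises."""
    if capabilities is not None:
        tool = "screenshot" if "screenshot" in capabilities else "get_window_state"
        # With get_window_state, any AX tree in the result is simply ignored.
        return image_from_tool_result(call_tool(tool))
    # Capabilities unknown: try screenshot, fall back when no image comes back.
    image = image_from_tool_result(call_tool("screenshot"))
    return image or image_from_tool_result(call_tool("get_window_state"))
```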
Brooklyn Nicholson
0223ea5f59 feat(computer-use): surface macOS permission preflight in the desktop
Computer Use already worked through the desktop backend (the cua-driver
toolset enables + installs via Settings -> Skills & Tools), but there was
no in-app way to see or grant the two macOS permissions it needs, so "give
a model my Mac" was tribal knowledge.

The grants attach to cua-driver's OWN TCC identity (com.trycua.driver /
the installed CuaDriver.app), not Hermes -- so no app entitlement is
involved. cua-driver 0.5+ exposes `permissions status/grant`, which we wrap:

- tools/computer_use/permissions.py: thin client over the two subcommands
- hermes computer-use permissions {status,grant}: CLI parity
- GET /api/tools/computer-use/status, POST .../permissions/grant: desktop REST
- ComputerUsePanel: live Accessibility + Screen Recording state with a
  Grant button (dialog attributed to CuaDriver), shown in the expanded
  Computer Use toolset row. Binary install stays in the existing provider
  post-setup runner.

Follow-ups: i18n the card copy; a "Stop driver" control (cua-driver stop)
for the runaway-`serve` case.
2026-06-22 17:33:52 -05:00
Teknium
87c4a5ebb8 feat(background-review): aux-model selector for the self-improvement review (#49252)
Adds auxiliary.background_review.{provider,model} (default auto = main chat
model — unchanged). Set it to a different, cheaper model and the post-turn
self-improvement review runs there for ~3-5x lower cost.

Cache-aware by design: the main chat is warm in the prompt cache, so the
default full-history replay on the main model is cheap cache reads — left
exactly as-is. A different model can't reuse that cache (different key), so
when (and only when) routed to a different model the fork replays a compact
digest instead of the full transcript, minimising what it cold-writes on the
aux model. Same model -> full replay; different model -> digest.

Quality holds in benchmarks: memory capture identical, skill near-identical.
Nothing changes unless you opt in by naming a different model.

Co-authored-by: Hermes Agent <noreply@nousresearch.com>
2026-06-22 14:54:53 -07:00
Teknium
660e36f097 fix(cron): scope job execution to its owning profile (#32091 follow-up) (#50993)
The #32091 fix moved every profile's cron jobs into one shared root store,
but never wired the execution-scoping half it recommended: a job still ran
under whichever profile's ticker picked it up, not its owning profile. So a
job created under `hermes -p donna` could execute with the root profile's
.env / config.yaml / credentials.

- jobs.py: create_job auto-captures the active profile (explicit profile=
  override available) and stores it on the job; resolve_profile_home() maps a
  profile name to its HERMES_HOME; legacy jobs backfill to 'default'.
- scheduler.py: run_job applies the job's profile via a scoped HERMES_HOME
  override (env var + in-process ContextVar) before any .env/config/script
  load, restored in finally. tick() routes profile-mismatched jobs to the
  single-worker sequential pool so the env mutation can't race.
- cronjob tool threads profile through (NOT exposed in the model schema, to
  avoid cross-profile privilege escalation); hermes cron add gains --profile.

E2E verified against a temp HERMES_HOME with a real profile dir: a root-profile
ticker runs a profile='donna' job with HERMES_HOME=donna during execution and
restores the ticker env afterward.
2026-06-22 14:54:28 -07:00
Tranquil-Flow
15880da8bb fix(file_tools): resolve tilde using profile home for file operations (#48552)
File tools (read_file, write_file, patch, list_directory, etc.) used
os.path.expanduser() which reads the gateway process HOME env var.
In Docker/systemd/s6 deployments where the gateway HOME differs from
interactive sessions, tilde expanded to the wrong directory.

Add _expand_tilde() helper that delegates to get_subprocess_home() when
available, falling back to os.path.expanduser(). Replace all 9
expanduser() call sites in file_tools.py with _expand_tilde().
2026-06-23 03:17:47 +05:30
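A possible shape for the _expand_tilde() helper from 15880da8bb is shown below. The signature of get_subprocess_home() is not given in the commit message, so the sketch takes it as an optional zero-argument callable; that parameter is an assumption made to keep the example self-contained.

```python
import os
from typing import Callable, Optional


def expand_tilde(path: str, get_subprocess_home: Optional[Callable[[], Optional[str]]] = None) -> str:
    """Expand a leading ~ against the profile home, else fall back to expanduser."""
    if get_subprocess_home is not None and (path == "~" or path.startswith("~/")):
        home = get_subprocess_home()
        if home:
            return home + path[1:]
    # No profile home available (or a ~user form): use the process HOME.
    return os.path.expanduser(path)
```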
kshitijk4poor
c080b2dc3e fix(gateway): redact credentials from TUI approval prompts (#48456)
Follow-up to #50767, which redacted the chat-platform (_approval_notify_sync)
and SSE/API (_approval_notify) approval transports. The TUI JSON-RPC transport
is the third egress and was missed: three register_gateway_notify callbacks in
tui_gateway/server.py emitted the raw approval_data — including the unredacted
command Tirith flagged — straight to the TUI client via _emit.

Route all three registrations through a new module-level _emit_approval_request()
helper that redacts payload['command'] via the shared
gateway.run._redact_approval_command seam before emitting, matching the pattern
used for the other two transports. Completes the whole-bug-class fix for #48456.

Tests: assert the helper emits a redacted command (real credential pattern)
and handles a missing/None command, plus a wiring guard that no registration
emits the raw payload directly (only the helper may). Both mutation-checked.

The #48456 fix series originated from @liuhao1024's #48462 — credit to them for
the original report and chat-platform fix; this completes the remaining transport.

Co-authored-by: liuhao1024 <sunsky.lau@gmail.com>
2026-06-23 03:14:18 +05:30
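
The redact-then-emit pattern from c080b2dc3e might look something like the sketch below. The regex, the `redact_command` function, and the `"approval.request"` event name are all hypothetical; the real `gateway.run._redact_approval_command` is not reproduced here.

```python
import re

# Hypothetical credential pattern: KEY=value pairs with secret-looking keys.
_SECRET_PATTERN = re.compile(r"(?i)\b(token|api[_-]?key|password|secret)=(\S+)")


def redact_command(command):
    """Mask secret values in a command string; pass non-strings through."""
    if not isinstance(command, str):
        return command
    return _SECRET_PATTERN.sub(lambda m: f"{m.group(1)}=[REDACTED]", command)


def emit_approval_request(emit, payload):
    """Emit a copy of payload with payload['command'] redacted."""
    safe = dict(payload)  # shallow copy so the caller's dict is untouched
    if "command" in safe:
        safe["command"] = redact_command(safe["command"])
    emit("approval.request", safe)
```

Routing every registration through one helper is what makes a wiring guard feasible: a test only has to check that nothing else calls the raw emit with approval data.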
kshitijk4poor
0e69cd4b37 fix(memory): honor configured char limits in the no-agent on-disk store
Follow-up to the /memory approve fresh-store fix. Both the CLI fallback and
the messaging-gateway handler built a bare MemoryStore() with the hardcoded
default char limits (2200/1375), ignoring the user's configured
memory.memory_char_limit / user_char_limit. A live agent honors those
overrides (agent/agent_init.py), so an approval applied without a live agent
could accept a write the user's lower cap would reject, or vice versa.

Extract a shared tools.memory_tool.load_on_disk_store() factory that reads
the configured limits (falling back to defaults if config can't load) and
wire both the CLI and gateway handlers to it, closing the gap on both
surfaces and de-duplicating the construction block.
2026-06-23 03:10:53 +05:30
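
A minimal sketch of the shared factory idea in 0e69cd4b37, assuming 2200 is the memory default and 1375 the user default (the message lists them in that order). The `MemoryStore` dataclass and the `load_config` parameter are simplified stand-ins, not the project's real classes or signatures.

```python
from dataclasses import dataclass

DEFAULT_MEMORY_CHAR_LIMIT = 2200  # assumed mapping of the quoted defaults
DEFAULT_USER_CHAR_LIMIT = 1375


@dataclass
class MemoryStore:
    """Simplified stand-in holding only the two limits."""
    memory_char_limit: int = DEFAULT_MEMORY_CHAR_LIMIT
    user_char_limit: int = DEFAULT_USER_CHAR_LIMIT


def load_on_disk_store(load_config):
    """Build a store from configured limits, falling back to defaults."""
    try:
        memory_cfg = (load_config() or {}).get("memory") or {}
    except Exception:
        # Config could not be loaded: use the hardcoded defaults.
        memory_cfg = {}
    return MemoryStore(
        memory_char_limit=int(
            memory_cfg.get("memory_char_limit", DEFAULT_MEMORY_CHAR_LIMIT)
        ),
        user_char_limit=int(
            memory_cfg.get("user_char_limit", DEFAULT_USER_CHAR_LIMIT)
        ),
    )
```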
Max Hsu
3147cbb136 fix(memory): apply /memory approve against a fresh store when no live agent
The CLI /memory slash handler (cli_commands_mixin._handle_memory_command)
passed self.agent._memory_store straight through, which is None when the
command runs without a live agent — e.g. /memory approve from the Desktop
GUI. The shared write-approval handler then returns "memory store
unavailable" and applies nothing, even with built-in memory enabled and
pending writes present.

Fall back to a freshly loaded on-disk MemoryStore when no live store is
available, mirroring the gateway path (gateway/slash_commands.py). It
persists to the same MEMORY.md/USER.md files and creates MEMORY.md on the first
approved write.

Fixes #46783

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 03:10:53 +05:30
kshitijk4poor
100e7be20e fix(security): deny root-level credential stores in media delivery
The media-delivery denylist in gateway/platforms/base.py enumerated only
.env/auth.json/credentials/config.yaml under HERMES_HOME, so other
credential stores that live at the root fell through and could be
auto-attached to chat replies. The reported case: the Google Workspace
skill's google_token.json refreshes every turn, bumping its mtime to
'now', which kept passing the strict-mode recency window and re-sent the
OAuth token on every reply.

Extend the explicit per-file denylist to mirror the canonical credential
set already enforced by the read/write guards in agent/file_safety.py:
google_token.json, google_oauth_pending.json, auth/google_oauth.json,
.anthropic_oauth.json, webhook_subscriptions.json, cache/bws_cache.json,
auth.lock, and the pairing/ token directory.

Targeted per-file additions (not a blanket ~/.hermes deny, which was
declined in #32090/#34425 because it would block skills/, logs/, and
ad-hoc agent-written deliverables). mcp-tokens/ (#37222) and
state.db/kanban.db (#41071) are left to their sibling targeted PRs.

Reported-by: xxxigm (#50912)
2026-06-23 02:56:48 +05:30
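
The targeted per-file denylist from 100e7be20e could be sketched as below. The file and directory names come from the commit message; the function name, the set layout, and the matching logic are assumptions for illustration and do not reproduce gateway/platforms/base.py.

```python
from pathlib import Path

# Paths relative to HERMES_HOME, as listed in the commit message.
DENIED_FILES = {
    "google_token.json",
    "google_oauth_pending.json",
    "auth/google_oauth.json",
    ".anthropic_oauth.json",
    "webhook_subscriptions.json",
    "cache/bws_cache.json",
    "auth.lock",
}
DENIED_DIRS = {"pairing"}


def is_denied_for_delivery(path, hermes_home):
    """Return True if path is a listed credential store under hermes_home."""
    home = Path(hermes_home).resolve()
    try:
        rel = Path(path).resolve().relative_to(home)
    except ValueError:
        # Outside HERMES_HOME: this denylist has no opinion.
        return False
    if rel.as_posix() in DENIED_FILES:
        return True
    return bool(rel.parts) and rel.parts[0] in DENIED_DIRS
```

Matching exact relative paths rather than denying the whole home directory keeps skills/, logs/, and agent-written deliverables attachable, which is the trade-off the commit message describes.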
Teknium
e9b86f352f fix(discord): delete obsolete slash commands before creating new ones
Discord enforces a hard 100-command limit per app and rejects an upsert that would push the live total over 100 (error 30032), which silently breaks ALL slash commands. The sync deleted obsolete commands AFTER creating new ones, so an app already at the cap momentarily exceeded it and the whole sync failed.

Reorder: delete no-longer-desired commands up front, then create/update. Removes the now-redundant trailing delete loop. Adapts @infinitycrew39 PR #50890 to current main (the original adapter diff no longer applied after the platform refactor); test commit cherry-picked with authorship preserved.
2026-06-22 13:58:33 -07:00
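
The delete-before-create ordering in e9b86f352f can be illustrated with a small planning sketch. `plan_sync` and `peak_count` are hypothetical helpers that model only the ordering and the resulting command count; they make no Discord API calls.

```python
def plan_sync(live_names, desired_names):
    """Order operations so deletions run before any creation."""
    live, desired = set(live_names), set(desired_names)
    ops = [("delete", name) for name in sorted(live - desired)]
    ops += [("create", name) for name in sorted(desired - live)]
    ops += [("update", name) for name in sorted(live & desired)]
    return ops


def peak_count(live_names, ops):
    """Highest number of registered commands reached while applying ops."""
    count = peak = len(set(live_names))
    for op, _name in ops:
        if op == "delete":
            count -= 1
        elif op == "create":
            count += 1
        peak = max(peak, count)
    return peak
```

With an app already at 100 commands and one command being swapped for another, this ordering never goes above 100, whereas creating first would briefly reach 101.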
infinitycrew39
91c465f6e7 test(discord): add regression test for 100-command sync limit
Add a test to verify that _safe_sync_slash_commands deletes obsolete
commands before creating new ones. This ensures we never temporarily
exceed Discord's 100-command limit during sync, which would trigger
error 30032 and break all slash commands.

This test guards against the regression where sync could fail even though
the registration cap was properly enforced.
2026-06-22 13:58:33 -07:00
helix4u
ae7e857420 fix(cron): deliver max-iteration fallback reports
2026-06-22 13:57:59 -07:00