Commit Graph

4673 Commits

Author SHA1 Message Date
Peter Steinberger
ca200eb480 fix(providers): stabilize runtime normalization hooks 2026-04-04 19:34:56 +01:00
Peter Steinberger
e3ac0f43df feat(qwen): add qwen provider and video generation 2026-04-04 19:34:56 +01:00
Vincent Koc
6f2e804182 fix(agents): prefer background completion wake over polling (#60877)
* fix(agents): prefer completion wake over polling

* fix(changelog): note completion wake guidance

* fix(agents): qualify quiet exec completion wake

* fix(agents): qualify disabled exec completion wake

* fix(agents): split process polling from control actions
2026-04-05 03:17:10 +09:00
Peter Steinberger
83fe8efe3d fix(test): isolate ollama runtime test seams 2026-04-04 19:03:00 +01:00
Ayaan Zaidi
ef7c84ae92 style: trim live model switch comment noise 2026-04-04 22:42:30 +05:30
Ayaan Zaidi
e4bd4b8b49 style(agents): trim exec routing comments 2026-04-04 22:41:22 +05:30
Ayaan Zaidi
cde1e2d3a1 fix: preserve compaction split after trailing tool results 2026-04-04 22:34:05 +05:30
Ayaan Zaidi
3f7bd3bd7b fix: split before unfinished compaction tool turns 2026-04-04 22:30:27 +05:30
wangchunyue
08992e1dbc fix: keep tool calls paired during compaction (#58849) (thanks @openperf)
* fix(compaction): keep tool_use and toolResult together when splitting messages

* fix: keep displaced tool results in compaction chunks

* fix: keep tool calls paired during compaction (#58849) (thanks @openperf)

* fix: avoid stalled compaction splits on aborted tool calls (#58849) (thanks @openperf)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-04 22:12:43 +05:30
openperf
d98eaba4c3 fix(agents): resolve exec host=node routing regression and elevated gateway override 2026-04-04 21:59:57 +05:30
wangchunyue
17f086c021 fix: handle subagent live model switches (#58178) (thanks @openperf)
* fix(agents): handle LiveSessionModelSwitchError in subagent execution

Add retry loop for cross-provider model switches in the subagent
command path, mirroring the existing logic in agent-runner-execution.ts.

- Wrap runWithModelFallback in a while(true) loop inside agentCommandInternal
- Catch LiveSessionModelSwitchError and update provider, model,
  fallbackProvider, fallbackModel, providerForAuthProfileValidation,
  sessionEntry.authProfileOverride, and storedModelOverride before retrying
- Guard storedModelOverride update: only set when the model genuinely
  changed (compared before mutation) or a session override already existed
- Reset lifecycleEnded flag so the retried iteration can emit lifecycle events
- Add comprehensive tests covering retry success, error propagation,
  lifecycle reset, auth-profile forwarding, and fallback override state

Fixes #57998

* fix(agents): include provider change in storedModelOverride guard

* fix(agents): validate allowlist and clear stale compaction count on live model switch

* fix(agents): remove broken allowlist guard on live model switch

* fix(agents): address security review — bound retry loop, validate allowlist, redact error in lifecycle events

* fix(agents): restore error observability in lifecycle events using err.message

* fix(agents): sanitize log inputs and shallow-copy sessionEntry on live model switch

* fix(agents): enforce allowlist on empty set and sanitize error message

* fix: handle subagent live model switches (#58178) (thanks @openperf)

---------

Co-authored-by: Ayaan Zaidi <hi@obviy.us>
2026-04-04 21:56:11 +05:30
Altay
ae460eff84 fix(failover): scope openrouter-specific matchers (#60909) 2026-04-04 18:24:03 +03:00
Aaron Zhu
983909f826 fix(agents): classify generic provider errors for failover (#59325)
* fix(agents): classify generic provider errors for failover

Anthropic returns bare 'An unknown error occurred' during API instability
and OpenRouter wraps upstream failures as 'Provider returned error'. Neither
message was recognized by the failover classifier, so the error surfaced
directly to users instead of triggering the configured fallback chain.

Add both patterns to the serverError classifier so they are classified as
transient server errors (timeout) and trigger model failover.

Closes #49706
Closes #45834

* fix(agents): scope unknown-error failover by provider

* docs(changelog): note provider-scoped unknown-error failover

---------

Co-authored-by: Aaron Zhu <aaron@Aarons-MacBook-Air.local>
Co-authored-by: Altay <altay@uinaf.dev>
2026-04-04 18:11:46 +03:00
Peter Steinberger
4dbc66b1ed fix: remove bundled channel startup reentry 2026-04-04 15:39:12 +01:00
Peter Steinberger
b9201e8333 refactor: share announce test runtime seams 2026-04-04 23:38:36 +09:00
Peter Steinberger
f5cc6a101b style: reflow system prompt tool summary 2026-04-04 23:36:46 +09:00
Peter Steinberger
c09e128587 fix(gateway): include talk secrets in CLI pairing defaults (#56481) (thanks @maxpetrusenko) 2026-04-04 23:18:54 +09:00
Max P
8262078ee5 fix(agents): inherit completion announce delivery target (#56481) 2026-04-04 23:18:54 +09:00
Vincent Koc
1e7f9e8746 test(providers): cover transport family matrix 2026-04-04 23:14:02 +09:00
Peter Steinberger
e509c5c3ea fix(ci): avoid readonly embedded session mutation 2026-04-04 15:09:50 +01:00
Peter Steinberger
470b4452ce fix(ci): drop stale browser runtime imports 2026-04-04 15:09:49 +01:00
Peter Steinberger
dd771f1dc6 fix(ci): repair plugin boundary and bootstrap regressions 2026-04-04 15:09:48 +01:00
Peter Steinberger
c5c5c77ebb fix(ci): restore contract-safe core imports 2026-04-04 15:09:48 +01:00
Rockcent
b2f972e364 fix(failover): OpenRouter 403 Key limit exceeded triggers billing fallback (#59892)
Merged via squash.

Prepared head SHA: 7f8265231c
Co-authored-by: rockcent <128210877+rockcent@users.noreply.github.com>
Co-authored-by: altaywtf <9790196+altaywtf@users.noreply.github.com>
Reviewed-by: @altaywtf
2026-04-04 17:03:21 +03:00
Vincent Koc
9cc300be78 fix(ci): restore main follow-up checks 2026-04-04 22:51:31 +09:00
Peter Steinberger
f9717f2eae fix(agents): align runtime with updated deps 2026-04-04 22:40:08 +09:00
Vincent Koc
71ea82a4f4 fix(build): restore portable provider runtime types 2026-04-04 21:56:48 +09:00
Vincent Koc
b3faf20d91 perf(agents): avoid repeated subagent registry rescans 2026-04-04 21:56:48 +09:00
Vincent Koc
b742909dca fix(agents): prefer cron for deferred follow-ups (#60811)
* fix(agents): prefer cron for deferred follow-ups

* fix(agents): gate cron scheduling guidance

* fix(changelog): add scheduling guidance note

* fix(agents): restore exec approval agent hint
2026-04-04 21:11:27 +09:00
Vincent Koc
c75f82448f fix(google-cli): parse gemini json response and stats (#60801)
* fix(google-cli): restore gemini json reporting

* fix(google-cli): fall back to stats when usage is empty

* fix(changelog): note gemini cli cache reporting
2026-04-04 20:27:22 +09:00
Peter Steinberger
3dda75894b refactor(agents): centralize run wait helpers 2026-04-04 20:22:16 +09:00
moktamd
2701e75f40 fix: disable thinking for MiniMax anthropic-messages streaming
MiniMax M2.7 returns reasoning_content in OpenAI-style delta chunks
({delta: {content: "", reasoning_content: "..."}}) when thinking is
active, rather than native Anthropic thinking block SSE events. Pi-ai's
Anthropic provider does not handle this format, causing the model's
internal reasoning to appear as visible chat output.

Add createMinimaxThinkingDisabledWrapper that injects
thinking: {type: "disabled"} into the outgoing payload for any MiniMax
anthropic-messages request where thinking is not already explicitly
configured, preventing the provider from generating reasoning_content
deltas during streaming.

Fixes #55739
2026-04-04 20:21:37 +09:00
Peter Steinberger
1037af01ad style(agents): normalize runtime prompt formatting 2026-04-04 12:19:08 +01:00
Peter Steinberger
f3aad63f4e style(providers): normalize import and wrap formatting 2026-04-04 12:19:08 +01:00
Peter Steinberger
3207c5326a refactor: share native streaming compat helpers 2026-04-04 12:18:45 +01:00
Peter Steinberger
605f48556b refactor(browser): share lifecycle cleanup helpers 2026-04-04 12:17:46 +01:00
Peter Steinberger
c3f415ad6e fix: preserve node system.run approval plans 2026-04-04 20:16:53 +09:00
Peter Steinberger
53c33f8207 fix: forward node exec approval plans 2026-04-04 20:16:19 +09:00
Jasmine Zhang
b838ecf885 fix: add 60s timeout to MiniMax VLM fetch call
The VLM image analysis fetch had no timeout, causing sessions to hang
indefinitely when the MiniMax API is slow or unresponsive. Other
vision/model API calls in the codebase already use timeouts. Adds
AbortSignal.timeout(60_000) consistent with image upload workloads.

Fixes #54139
2026-04-04 20:15:13 +09:00
Peter Steinberger
91bac7cb83 fix(usage): restore provider auth fallback 2026-04-04 12:10:45 +01:00
Peter Steinberger
9b352ab5b0 test: isolate session status from provider runtime leak 2026-04-04 12:08:05 +01:00
Peter Steinberger
b7411ad594 refactor(cron): share descendant run quiescence wait 2026-04-04 20:07:33 +09:00
Peter Steinberger
7b6334b0f4 refactor(agents): share run wait reply helpers 2026-04-04 20:07:33 +09:00
Vincent Koc
d766465e38 fix(google): add direct cachedContent support (#60757)
* fix(google): restore gemini cache reporting

* fix(google): split cli parsing into separate PR

* fix(google): drop remaining cli overlap

* fix(google): honor cachedContent alias precedence
2026-04-04 20:07:13 +09:00
Peter Steinberger
3b09b58c5d test: cover browser cleanup for cron and subagents (#60146) (thanks @BrianWang1990) 2026-04-04 20:03:57 +09:00
BrianWang1990
72b2e413d6 fix(browser): clean up browser tabs/processes when cron tasks and subagents complete
When cron tasks or subagents use browser automation, the browser
processes were not cleaned up after the task completed. This caused
orphaned Chrome processes (PPID=1) to accumulate over time.

Root cause: closeTrackedBrowserTabsForSessions was only called during
session-reset/session-delete (via ensureSessionRuntimeCleanup), but
isolated cron runs and subagent completions never triggered these paths.

Fix: Add browser tab cleanup in two places:
1. server-cron.ts: wrap runCronIsolatedAgentTurn in try/finally to
   ensure browser tabs are cleaned up after every cron run.
2. subagent-registry-lifecycle.ts: call closeTrackedBrowserTabsForSessions
   when a subagent run completes, before the announce cleanup flow.

Both cleanup calls are best-effort (caught errors) so they never mask
the actual task result or break the completion flow.

Fixes #60104
2026-04-04 20:03:57 +09:00
Peter Steinberger
0b1c9c7057 fix: stabilize codex auth ownership and ws fallback cache 2026-04-04 20:03:15 +09:00
Peter Steinberger
e4ea3c03cf fix: scope live model switch pending state (#60266) (thanks @kiranvk-2011) 2026-04-04 19:45:53 +09:00
kiranvk2011
b36a3a3295 fix: add .catch() to fire-and-forget stale-flag clear to prevent unhandled rejection 2026-04-04 19:45:53 +09:00
kiranvk2011
e8f6ceedd4 fix: clear stale liveModelSwitchPending flag when model already matches
When the liveModelSwitchPending flag is set but the current model already
matches the persisted selection (e.g. the switch was applied as an override
and the current attempt is already using the new model), the flag is now
consumed eagerly via a fire-and-forget clearLiveModelSwitchPending() call.

Without this, the stale flag could persist across fallback iterations and
later cause a spurious LiveSessionModelSwitchError when the model rotates
to a fallback candidate that differs from the persisted selection.

Also expands JSDoc on shouldSwitchToLiveModel to document the stale-flag
clearing and deferral semantics.
2026-04-04 19:45:53 +09:00