- docs/tools/tts.md: alphabetize providers in three places that listed
them: the supported-providers table (Azure Speech ... Xiaomi MiMo),
the configuration Tabs (12 provider presets in A-Z), and the field
reference AccordionGroup. Top-level fields stay first; provider
tabs/accordions follow strict alphabetical order. Wording, schema,
and defaults unchanged.
- docs/docs.json: add tools/tts to the main Tools sidebar group
(slotted between trajectory and video-generation, matching the
alphabetical neighborhood with image-generation, music-generation,
video-generation). Previously tts only appeared under
Nodes > Media capabilities, which was a discoverability gap for
readers looking for TTS alongside the other generation tools.
The TTS doc had grown to 1008 lines with 11 separate flat 'X primary'
config blocks, a 100-line dense 'Notes on fields' bullet list, and
the new provider-personas feature (#70748) buried near the bottom.
Restructure for readability and feature visibility:
- Lead with a Steps-based 'Quick start' so first-time readers can
enable TTS in 4 explicit steps.
- Replace the 13-bullet provider list with a single 'Supported
providers' table that names auth env vars and per-provider notes
inline. Add a Warning callout for the Microsoft/edge legacy alias.
- Collapse the 11 'X primary' config blocks into one Tabs component
('OpenAI + ElevenLabs', 'Google Gemini', 'Azure Speech',
'Microsoft (no key)', 'MiniMax', 'Inworld', 'xAI', 'Volcengine',
'Xiaomi MiMo', 'OpenRouter', 'Gradium', 'Local CLI') so users see
one preset at a time and the page is scannable.
- Promote 'Personas' to its own top-level section with two examples
(minimal and the Alfred provider-neutral persona), and add a new
'How providers use persona prompts' AccordionGroup covering Google
(promptTemplate audio-profile-v1, personaPrompt), OpenAI
(instructions auto-mapping), and Other providers, plus a fallback
policy table.
- Note that agents.list[].tts.persona overrides global persona
per-agent (covers the recent feat(tts) per-agent voice-override
work).
- Convert the 100-line 'Notes on fields' wall into a per-provider
AccordionGroup using ParamField, so the field reference is
scannable and field types/defaults are visually distinct.
- Sentence-case headings, drop redundant body H1, fold the flow
diagram inline with Auto-TTS behavior, and refresh the Output
formats section to a table-first layout.
- Schema fields (label/description/provider/fallbackPolicy/prompt
with profile/scene/sampleContext/style/accent/pacing/constraints
and providers map) verified against src/config/types.tts.ts; all
defaults and env-var fallbacks preserved verbatim.
Net diff: 585 insertions, 684 deletions across the same surface
area.
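
The per-agent persona override noted above can be pictured with a minimal sketch. The interfaces and the `resolvePersona` helper are hypothetical and only illustrate the precedence rule (agent-level `tts.persona` wins, otherwise the global persona applies). They are not the shapes defined in src/config/types.tts.ts.

```typescript
// Hypothetical, simplified shapes for illustration only.
interface TtsSettings {
  persona?: string;
}

interface AgentEntry {
  id: string;
  tts?: TtsSettings;
}

// Return the persona for one agent: the agent-level value overrides the
// global one, and the global value is used when the agent sets nothing.
function resolvePersona(
  globalTts: TtsSettings,
  agents: AgentEntry[],
  agentId: string,
): string | undefined {
  const agent = agents.find((a) => a.id === agentId);
  return agent?.tts?.persona ?? globalTts.persona;
}
```

With a global persona of `alfred` and one agent that sets `narrator`, that agent would resolve to `narrator` while every other agent resolves to `alfred`.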
Two recent commits added user-facing surface that left signature-style
references in docs stale:
- 4428661779 Alvin Tang (#20721, thanks @alvinttang) extends the
configured model 'input' modality set to also accept 'audio' and
'video', matching what providers like LM Studio already report.
docs/plugins/manifest.md model-fields table listed only
'text | image | document', so add 'audio' and 'video'.
- 44da034516 Vincent (thanks @oc-factus) adds a bounded openclaw.agent
attribute on the openclaw.tokens counter so per-agent dashboards can
group usage. docs/gateway/opentelemetry.md metric reference omitted
it; add it to the attrs list.
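
For the widened model 'input' modality set, a rough sketch of what the documented values look like as a type. The constant and guard names are hypothetical and not taken from the codebase. Only the five modality strings come from the commit text.

```typescript
// The five documented values; names below are illustrative assumptions.
const INPUT_MODALITIES = ["text", "image", "document", "audio", "video"] as const;

type InputModality = (typeof INPUT_MODALITIES)[number];

// Narrow an arbitrary string to a documented input modality.
function isInputModality(value: string): value is InputModality {
  return (INPUT_MODALITIES as readonly string[]).includes(value);
}
```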
Honor the parent `models auth --agent <id>` flag across auth write commands: `add`, `login`, `setup-token`, `paste-token`, and `login-github-copilot`.
The auth helpers now resolve the requested configured agent before choosing the auth-profile store and provider workspace, while preserving default-agent behavior when `--agent` is omitted.
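
A minimal sketch of the resolution order described above, assuming a simplified config shape. `AgentEntry`, the `default` flag, `resolveAuthAgentId`, and the error on an unknown id are all hypothetical; the real helpers may differ.

```typescript
// Hypothetical, simplified config shape for illustration only.
interface AgentEntry {
  id: string;
  default?: boolean;
}

interface Config {
  agents: { list: AgentEntry[] };
}

// Pick the agent whose auth-profile store should receive the write.
function resolveAuthAgentId(cfg: Config, requested?: string): string {
  const list = cfg.agents.list;
  if (requested === undefined) {
    // No --agent flag: keep the default-agent behavior.
    const fallback = list.find((a) => a.default) ?? list[0];
    if (!fallback) throw new Error("No agents configured");
    return fallback.id;
  }
  // --agent given: it must name a configured agent (assumed behavior).
  const match = list.find((a) => a.id === requested);
  if (!match) throw new Error(`Unknown agent: ${requested}`);
  return match.id;
}
```

The resolved id would then be used to choose the auth-profile store and provider workspace before any credentials are written.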
Validation:
- `pnpm test src/cli/models-cli.test.ts src/commands/models/auth.test.ts`
- `pnpm test src/commands/models/auth.test.ts`
- `pnpm docs:check-mdx`
- `pnpm check:changed`
- `pnpm check`
- `pnpm build`
- `pnpm test src/cli/run-main.test.ts`
Full `pnpm test` was also run; it failed on unrelated `src/cli/run-main.test.ts` assertions in full-suite order, although that file passes when run on its own on both latest main and this branch. The PR diff only touches models auth CLI/auth files, docs, and changelog.
Fixes #71864.
Thanks @neeravmakwana.
* fix: add placeholder transcript for silent voice notes
* fix: handle placeholder transcripts per skipped attachment
* fix: preserve synthetic transcript attachment order
* fix: scope synthetic audio merge to audio slice only, preserve cross-capability and prefer ordering
Replace the global outputs.sort() with a targeted merge that:
1. Only sorts within the audio output slice (real + synthetic),
preserving CAPABILITY_ORDER and per-capability attachments.prefer
ordering for non-audio outputs.
2. Excludes synthetic placeholder indexes from audioAttachmentIndexes
used by extractFileBlocks, so tiny audio-MIME files with text
extensions can still be recovered via forcedTextMime.
Adds mergeAudioOutputsPreservingAttachmentOrder helper.
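
The slice-scoped sort in point 1 can be sketched roughly as follows. The `Output` shape and the `sortAudioSliceOnly` name are hypothetical and this is not the body of mergeAudioOutputsPreservingAttachmentOrder. It only shows the idea of reordering audio entries among themselves while every non-audio output keeps its position.

```typescript
// Hypothetical, simplified output shape for illustration only.
type Capability = "image" | "audio" | "video";

interface Output {
  capability: Capability;
  attachmentIndex: number;
  synthetic?: boolean;
  text: string;
}

// Sort real and synthetic audio outputs by attachment index, then write
// them back into the slots audio already occupied. Non-audio outputs are
// returned untouched, so cross-capability ordering is preserved.
function sortAudioSliceOnly(outputs: Output[]): Output[] {
  const audioSorted = outputs
    .filter((o) => o.capability === "audio")
    .sort((a, b) => a.attachmentIndex - b.attachmentIndex);
  let next = 0;
  return outputs.map((o) => (o.capability === "audio" ? audioSorted[next++] : o));
}
```

A global sort on the same input would also move the image and video entries, which is the behavior this change avoids.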
* fix: remove unused function and use toSorted() for oxlint compliance
* fix(media-understanding): preserve selected audio order for synthetic placeholders
- merge synthetic skipped-audio placeholders using audio decision order
instead of raw attachmentIndex sorting, preserving attachments.prefer
- insert synthetic-only audio outputs at the audio capability slot
(before video) when no real audio outputs were produced
* fix(media-understanding): use neutral too-small placeholder text
Clarify that this synthetic transcript path is triggered by attachment size,
not by a silence/no-speech detection result.
* test(media-understanding): update too-small audio placeholder expectations
* test(media-understanding): cover mixed too-small audio placeholder
* test(media-understanding): cover too-small audio context
* fix(tasks): preserve visible task title before internal context
* Revert "fix(tasks): preserve visible task title before internal context"
This reverts commit dc536fb4d3c8a01168de5d05e8562193dd68a88e.
---------
Co-authored-by: Eulices Lopez <eulices@users.noreply.github.com>
Co-authored-by: Peter Steinberger <steipete@gmail.com>
Terminalize Gateway-backed async task records from the run result while preserving aborted, failed, cancelled, and lost outcomes.

Thanks @likewen-tech.
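
One way to read the change above, as a sketch under stated assumptions: the status names other than aborted, failed, cancelled, and lost are invented here, and `terminalizeFromRunResult` is a hypothetical helper rather than the actual implementation.

```typescript
// Hypothetical status model; "running" and "succeeded" are assumed names.
type Outcome = "succeeded" | "failed" | "aborted" | "cancelled" | "lost";
type Status = "running" | Outcome;

interface TaskRecord {
  id: string;
  status: Status;
}

// Move a running record to the outcome reported by the run result.
// A record that is already terminal keeps the outcome it has, and a
// non-success outcome is carried over as-is instead of being flattened.
function terminalizeFromRunResult(record: TaskRecord, outcome: Outcome): TaskRecord {
  if (record.status !== "running") return record;
  return { ...record, status: outcome };
}
```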
Ayaan's 28e4cd81a9 (#70863, thanks @bidadh, source from Arthur Kazemi
8abbae0101) extended params.context1m:true so the configured 1M
context window override now applies to eligible Claude CLI Opus and
Sonnet models, not only direct API calls. The CHANGELOG entry covered
the change, but the '1M context window (beta)' Accordion in
docs/providers/anthropic.md only described direct-API behavior, so
Claude CLI users had no signal that the same param works for their
backend. Add a sentence inside the same Accordion.