Four pages open with weak meta-descriptions ('This page covers...')
that restate the frontmatter summary. Replace them with direct,
content-first openings, and sentence-case a stray 'Slash Commands' link
in configuration-reference.
- concepts/streaming.md: remove '# Streaming + chunking'.
- reference/session-management-compaction.md: remove the Title-Cased H1
'# Session Management & Compaction (Deep Dive)'.
- plugins/voice-call.md: remove '# Voice Call (plugin)'.
CLI pages keep their command-formatted body H1s since that is the repo
convention and the formatting is not expressible in frontmatter.
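As a rough illustration (not the tooling used for this pass), a check that
flags a body H1 duplicating the frontmatter title could look like this,
assuming a gray-matter dependency:

```ts
import { readFileSync } from "node:fs";
import matter from "gray-matter";

// Flag a leading body H1 on a page that already carries a frontmatter
// title: the rendered title comes from frontmatter, so the H1 is redundant.
// The frontmatter field name ("title") is an assumption.
export function redundantH1(docPath: string): string | null {
  const { data, content } = matter(readFileSync(docPath, "utf8"));
  if (!data.title) return null;
  const firstLine = content.trimStart().split("\n", 1)[0];
  return firstLine.startsWith("# ") ? firstLine : null;
}
```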
Sweep recent (last ~5h) doc edits for two readability/uniformity issues
(a sketch of both passes follows the list):
- Replace 42 path-as-text links of the form '[/foo/bar](/foo/bar)' with
descriptive labels derived from each target page's frontmatter title
(e.g. '[Anthropic]', '[Token use and costs]', '[OpenAI-compatible
endpoints]'). Affected files include gateway/troubleshooting,
concepts/oauth, reference/session-management-compaction, and
reference/transcript-hygiene.
- Sentence-case Title-Cased headings and link text in Related sections
across codex-harness, model-providers, tools/plugin, sdk-runtime,
sdk-setup, prompt-caching, ci, cli/config, google-meet, browser,
rich-output-protocol, subagents, web/control-ui, while preserving
brand and proper-noun capitalization (OpenAI, Codex, Chrome, Parallels,
Z.AI, etc.).
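A minimal sketch of both passes, assuming a gray-matter dependency and a
docs/-rooted source layout; the regex, path mapping, and brand allowlist
are illustrative rather than the exact script used for the sweep:

```ts
import { readFileSync } from "node:fs";
import matter from "gray-matter";

// Pass 1: relabel [/foo/bar](/foo/bar) links with the target page's
// frontmatter title. The site-path to source-file mapping is assumed.
const PATH_AS_TEXT = /\[(\/[^\]\s]+)\]\(\1\)/g;

function titleFor(sitePath: string): string {
  const source = `docs${sitePath}.md`; // hypothetical layout
  const { data } = matter(readFileSync(source, "utf8"));
  return data.title ?? sitePath; // keep the raw path when no title exists
}

export function relabelPathLinks(markdown: string): string {
  return markdown.replace(
    PATH_AS_TEXT,
    (_match, target: string) => `[${titleFor(target)}](${target})`
  );
}

// Pass 2: sentence-case headings and link text while preserving brand and
// proper-noun capitalization via an allowlist (extend as needed).
const KEEP = new Set(["OpenAI", "Codex", "Chrome", "Parallels", "Z.AI"]);

export function sentenceCase(heading: string): string {
  return heading
    .split(" ")
    .map((word, i) => (i === 0 || KEEP.has(word) ? word : word.toLowerCase()))
    .join(" ");
}
```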
Keep WebChat runtime context available to the model while persisting only the transcript-facing user prompt across gateway, CLI, queued follow-up, and embedded Pi paths.
Add regression coverage for history sanitization, CLI transcript persistence, media-only auto-reply prompts, and embedded Pi prompt rewrite against a real SessionManager file.
Co-authored-by: 91wan <91wan@users.noreply.github.com>
Port the Codex app-server harness onto the context-engine lifecycle, add Codex context projection and compaction integration, and cover bootstrap/history/compaction fallback behavior.
Thanks @jalehman.
node-llama-cpp defaults contextSize to "auto", which on large embedding
models like Qwen3-Embedding-8B (trained context 40,960) inflates gateway
VRAM from ~8.8 GB to ~32 GB and causes OOM on single-GPU hosts that share
the gateway with an LLM runtime.
Expose memorySearch.local.contextSize in openclaw.json (number | "auto"),
defaulting to 4096, which comfortably covers typical memory-search chunks
(128–512 tokens) while keeping non-weight VRAM bounded.
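For illustration, the bounded embedding context in node-llama-cpp looks like
the sketch below; the model path and surrounding wiring are assumptions, and
in openclaw.json the value would sit under memorySearch.local.contextSize:

```ts
import { getLlama } from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
  modelPath: "./models/Qwen3-Embedding-8B.gguf", // hypothetical path
});

// Without an explicit contextSize, "auto" resolves to the model's trained
// context (40,960 for Qwen3-Embedding-8B) and balloons non-weight VRAM.
// 4096 tokens covers typical 128-512-token memory-search chunks.
const embeddingContext = await model.createEmbeddingContext({
  contextSize: 4096,
});
```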
Closes #69667.
Replace legacy qrcode-terminal usage with shared qrcode-tui media helpers, bound QR PNG rendering options, and raise bundled plugin host floors for the new SDK runtime surface.