Files
openclaw/docs/reference
aalekh-sarvam d40dd9088e feat(memory): configurable local embedding contextSize (default 4096) (#70544)
node-llama-cpp defaults contextSize to "auto", which on large embedding
models like Qwen3-Embedding-8B (trained context 40,960) inflates gateway
VRAM from ~8.8 GB to ~32 GB and causes OOM on single-GPU hosts that share
the gateway with an LLM runtime.

Expose memorySearch.local.contextSize in openclaw.json (number | "auto"),
default to 4096 which comfortably covers typical memory-search chunks
(128–512 tokens) while keeping non-weight VRAM bounded.

Closes #69667.
2026-04-23 14:21:53 -07:00
..