openclaw/src/memory at a98d7c26df0285891c51e3df129c3e2c2f867060

mirror of https://github.com/openclaw/openclaw.git synced 2026-03-30 13:58:33 +02:00

Rodrigo Uroz 7f1712c1ba (fix): enforce embedding model token limit to prevent overflow (#13455)

* fix: enforce embedding model token limit to prevent 8192 overflow

- Replace EMBEDDING_APPROX_CHARS_PER_TOKEN=1 with UTF-8 byte length
  estimation (safe upper bound for tokenizer output)
- Add EMBEDDING_MODEL_MAX_TOKENS=8192 hard cap
- Add splitChunkToTokenLimit() that binary-searches for the largest
  safe split point, with surrogate pair handling
- Add enforceChunkTokenLimit() wrapper called in indexFile() after
  chunkMarkdown(), before any embedding API call
- Fixes: session files with large JSONL entries could produce chunks
  exceeding text-embedding-3-small's 8192 token limit

Tests: 2 new colocated tests in manager.embedding-token-limit.test.ts
- Verifies oversized ASCII chunks are split to <=8192 bytes each
- Verifies multibyte (emoji) content batching respects byte limits
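
The splitting approach described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names EMBEDDING_MODEL_MAX_TOKENS and splitChunkToTokenLimit come from the commit message, while estimateTokens, isHighSurrogate, and all function bodies are assumptions made for the example. It relies on the commit's stated premise that UTF-8 byte length is a safe upper bound on token count.

```typescript
// Illustrative sketch only: constant and function names follow the commit
// message, but the bodies are assumptions, not the code from PR #13455.
const EMBEDDING_MODEL_MAX_TOKENS = 8192;

// UTF-8 byte length, used as a conservative stand-in for the token count.
// (estimateTokens is a hypothetical helper name.)
function estimateTokens(text: string): number {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: true for the first half of a UTF-16 surrogate pair.
function isHighSurrogate(codeUnit: number): boolean {
  return codeUnit >= 0xd800 && codeUnit <= 0xdbff;
}

function splitChunkToTokenLimit(
  text: string,
  maxTokens: number = EMBEDDING_MODEL_MAX_TOKENS,
): string[] {
  const parts: string[] = [];
  let start = 0;
  while (start < text.length) {
    // Binary search for the largest `end` whose slice still fits the limit.
    let lo = start + 1;
    let hi = text.length;
    let end = start;
    while (lo <= hi) {
      const mid = Math.floor((lo + hi) / 2);
      if (estimateTokens(text.slice(start, mid)) <= maxTokens) {
        end = mid;
        lo = mid + 1;
      } else {
        hi = mid - 1;
      }
    }
    // Never cut between the two halves of a surrogate pair.
    if (end > start && end < text.length && isHighSurrogate(text.charCodeAt(end - 1))) {
      end -= 1;
    }
    // Guarantee progress even if a single code point exceeds the limit.
    if (end === start) {
      const pair = isHighSurrogate(text.charCodeAt(start)) && start + 1 < text.length;
      end = start + (pair ? 2 : 1);
    }
    parts.push(text.slice(start, end));
    start = end;
  }
  return parts;
}
```

With this sketch, a 20000-character ASCII string and a limit of 8192 yields three parts, and splitting "😀😀😀" with a limit of 7 yields one emoji per part, because the cut is moved back rather than landing inside a surrogate pair.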

* fix: make embedding token limit provider-aware

- Add optional maxInputTokens to EmbeddingProvider interface
- Each provider (openai, gemini, voyage) reports its own limit
- Known-limits map as fallback: openai 8192, gemini 2048, voyage 32K
- Resolution: provider field > known map > default 8192
- Backward compatible: local/llama uses fallback
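
The resolution order listed above (provider field, then known-limits map, then default) might look something like the sketch below. Only EmbeddingProvider and maxInputTokens are named in the commit message; the id field, KNOWN_LIMITS, DEFAULT_MAX_INPUT_TOKENS, and resolveMaxInputTokens are hypothetical names, and the exact value behind "32K" for voyage is an assumption.

```typescript
// Illustrative sketch only: shapes and names other than EmbeddingProvider and
// maxInputTokens are assumptions, not taken from the repository.
interface EmbeddingProvider {
  id: string; // assumed identifier field, e.g. "openai" or "local"
  maxInputTokens?: number; // optional limit reported by the provider itself
}

const DEFAULT_MAX_INPUT_TOKENS = 8192;

// Fallback limits from the commit message. "32K" is written as 32_000 here;
// the real constant may differ.
const KNOWN_LIMITS: Record<string, number> = {
  openai: 8192,
  gemini: 2048,
  voyage: 32_000,
};

// Resolution order: provider field > known map > default.
function resolveMaxInputTokens(provider: EmbeddingProvider): number {
  if (typeof provider.maxInputTokens === "number" && provider.maxInputTokens > 0) {
    return provider.maxInputTokens;
  }
  return KNOWN_LIMITS[provider.id] ?? DEFAULT_MAX_INPUT_TOKENS;
}
```

A provider that reports nothing and is absent from the map, such as a local llama backend, falls through to the default, which matches the backward-compatibility note in the commit message.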

* fix: enforce embedding input size limits (#13455) (thanks @rodrigouroz)

---------

Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>

2026-02-10 20:10:17 -06:00

backend-config.test.ts

Memory: harden QMD startup, timeouts, and fallback recovery

2026-02-07 17:55:34 -08:00

backend-config.ts

Memory: harden QMD startup, timeouts, and fallback recovery

2026-02-07 17:55:34 -08:00

batch-gemini.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

batch-openai.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

batch-voyage.test.ts

fix(memory): add input_type to Voyage AI embeddings for improved retrieval (#10818)

2026-02-06 21:55:09 -06:00

batch-voyage.ts

Centralize date/time formatting utilities (#11831)

2026-02-08 04:53:31 -08:00

embedding-chunk-limits.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embedding-input-limits.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embedding-model-limits.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embeddings-gemini.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embeddings-openai.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embeddings-voyage.test.ts

Centralize date/time formatting utilities (#11831)

2026-02-08 04:53:31 -08:00

embeddings-voyage.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

embeddings.test.ts

fix: L2-normalize local embedding vectors to fix semantic search (#5332)

2026-02-01 22:56:44 -05:00

embeddings.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

headers-fingerprint.ts

chore: Enable "curly" rule to avoid single-statement if confusion/errors.

2026-01-31 16:19:20 +09:00

hybrid.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

hybrid.ts

chore: Enable "curly" rule to avoid single-statement if confusion/errors.

2026-01-31 16:19:20 +09:00

index.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

index.ts

Add more tests; make fall back more resilient and visible

2026-02-02 23:45:05 -08:00

internal.test.ts

fix: remap session JSONL chunk line numbers to original source positions (#12102)

2026-02-10 18:09:24 -06:00

internal.ts

fix: remap session JSONL chunk line numbers to original source positions (#12102)

2026-02-10 18:09:24 -06:00

manager-cache-key.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager-search.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.async-search.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.atomic-reindex.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.batch.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.embedding-batches.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.embedding-token-limit.test.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

manager.sync-errors-do-not-crash.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

manager.ts

(fix): enforce embedding model token limit to prevent overflow (#13455)

2026-02-10 20:10:17 -06:00

manager.vector-dedupe.test.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

memory-schema.ts

chore: Enable "curly" rule to avoid single-statement if confusion/errors.

2026-01-31 16:19:20 +09:00

node-llama.ts

fix: make node-llama-cpp optional

2026-01-15 18:37:02 +00:00

openai-batch.ts

feat(memory): add gemini batches + safe reindex

2026-01-18 16:12:10 +00:00

provider-key.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

qmd-manager.test.ts

fix(memory/qmd): scope query to managed collections (#11645)

2026-02-09 23:35:27 -08:00

qmd-manager.ts

fix(memory/qmd): scope query to managed collections (#11645)

2026-02-09 23:35:27 -08:00

search-manager.test.ts

Memory: add SQLITE_BUSY fallback regression test

2026-02-07 17:55:34 -08:00

search-manager.ts

Memory: make QMD cache eviction callback idempotent

2026-02-07 17:55:34 -08:00

session-files.test.ts

fix: remap session JSONL chunk line numbers to original source positions (#12102)

2026-02-10 18:09:24 -06:00

session-files.ts

fix: remap session JSONL chunk line numbers to original source positions (#12102)

2026-02-10 18:09:24 -06:00

sqlite-vec.ts

chore(gate): fix lint and formatting

2026-01-18 06:01:25 +00:00

sqlite.ts

Centralize date/time formatting utilities (#11831)

2026-02-08 04:53:31 -08:00

status-format.ts

chore: Enable "curly" rule to avoid single-statement if confusion/errors.

2026-01-31 16:19:20 +09:00

sync-memory-files.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

sync-session-files.ts

chore: Enable "experimentalSortImports" in Oxfmt and reformat all imports.

2026-02-01 10:03:47 +09:00

types.ts

Fix build regressions after merge

2026-02-02 23:45:05 -08:00