Smart text trimming to keep input within LLM context window token limits.
A Dify tool plugin that ensures text fits within LLM context window limits
via intelligent extractive summarization. Supports Chinese (Simplified &
Traditional), Japanese, and English text.
Locally deployed or resource-constrained LLM instances often have a smaller
effective context window than the model's official specification — due to
hardware limits (GPU VRAM), concurrency requirements, or serving
parameters. When input text exceeds the window, the LLM fails
with a context-length error.
This plugin acts as a pre-processing guard: it measures the input, and if it
exceeds a user-configured threshold, trims it by extracting only the
highest-scoring sentences — before the text ever reaches the LLM. No API
keys, no network calls, no external dependencies.
The configured limit is a character count threshold: the plugin checks the
character length of the input against it. It does not measure tokens.
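A minimal sketch of what a character-based check looks like in Python. The function name `exceeds_limit` and the parameter name `max_chars` are hypothetical and are not taken from the plugin's source.

```python
def exceeds_limit(text: str, max_chars: int) -> bool:
    """Return True when the text is longer than the character threshold.

    len() counts Unicode code points, so one CJK character counts as one
    character regardless of how many bytes or tokens it occupies.
    """
    return len(text) > max_chars


if __name__ == "__main__":
    print(exceeds_limit("你好，世界", 4))   # five characters against a limit of four
    print(exceeds_limit("hello", 5))        # exactly at the limit
```

Because the check is on characters, the same threshold corresponds to different token counts depending on the language mix and the tokenizer.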
As a rough guide, using the Qwen3 BBPE tokenizer (~151K vocab; Qwen3 Technical Report, 2025):
Token-to-character ratios vary across tokenizers (source: TokLens, ACL 2026 SRW).
Always verify with your specific model when precise budgeting is critical.
Reserve ~80% of the context window for input text, leaving headroom for
prompt templates and output generation.
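The budgeting advice above can be turned into a small calculation. This is a sketch under stated assumptions: `char_budget` is a hypothetical helper, and `chars_per_token` is a ratio you measure yourself for your model and language mix. The value used in the example is a placeholder, not a measured figure.

```python
def char_budget(context_tokens: int, chars_per_token: float,
                input_share: float = 0.8) -> int:
    """Convert a context window in tokens to a character threshold.

    input_share is the fraction of the window reserved for input text;
    the remainder is headroom for prompt templates and generated output.
    """
    if context_tokens <= 0 or chars_per_token <= 0:
        raise ValueError("context_tokens and chars_per_token must be positive")
    if not 0 < input_share <= 1:
        raise ValueError("input_share must be in (0, 1]")
    return int(context_tokens * input_share * chars_per_token)


if __name__ == "__main__":
    # Placeholder ratio of 1.5 characters per token, for illustration only.
    print(char_budget(8192, chars_per_token=1.5))
```

Rounding down keeps the result on the conservative side of the budget.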
The plugin interface supports four locales:
All parameter labels, descriptions, and option values are translated across
the supported locales.
Text processing handles Chinese, Japanese, and English, with CJK-aware
sentence splitting and abbreviation protection.
This plugin is not a replacement for LLM summarization. It selects complete
verbatim sentences from the original text using extractive summarization, a
long-established NLP approach — it never rewrites or paraphrases.
Extractive summarization is implemented with the Python standard library only,
with no external NLP dependencies. Output is built from complete sentences of
the original text, never from rewritten, paraphrased, or mid-sentence fragments.
Regex-based sentence splitting aware of Chinese, Japanese, and English
punctuation conventions, with abbreviation protection.
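To illustrate the idea, here is a simplified regex-based splitter. It is a sketch, not the plugin's actual implementation: the abbreviation list, the pattern, and the name `split_sentences` are all assumptions chosen for the example.

```python
import re

# Illustrative abbreviation list (assumption); a real list would be longer.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "prof.", "e.g.", "i.e.", "etc.", "vs."}

# CJK enders terminate a sentence on their own; ASCII enders only count
# when followed by whitespace or the end of the text (so "3.14" is safe).
_ENDER = re.compile(r"[。！？]+|[.!?]+(?=\s|$)")


def split_sentences(text: str) -> list[str]:
    sentences, start = [], 0
    for match in _ENDER.finditer(text):
        candidate = text[start:match.end()]
        words = candidate.split()
        # Skip the boundary if the period belongs to a known abbreviation.
        if words and words[-1].lower() in ABBREVIATIONS:
            continue
        if candidate.strip():
            sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)  # trailing text without a terminator
    return sentences


if __name__ == "__main__":
    sample = "Dr. Smith arrived. He was late! 今天天气很好。明天呢？これはテストです。"
    for sentence in split_sentences(sample):
        print(sentence)
```

The sketch ignores closing quotes and brackets after a terminator, which a production splitter would likely need to handle.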
Two selection strategies are available, chosen via a parameter:
Greedy: Sort sentences by score in descending order and pick from the top until
the character budget is exhausted. O(n log n).
MMR (Maximal Marginal Relevance): Iteratively select the sentence that best
balances its relevance score against its similarity to the sentences already
selected. A weight λ controls the relevance–diversity trade-off.
Diversity is measured as token overlap (Jaccard-like) between candidate and
already-selected sentences, updated incrementally per round. O(k × n) overall.
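Both strategies can be sketched in a few lines. The code below is an illustration under assumptions, not the plugin's source: scores are passed in rather than computed, the MMR objective is the textbook form `λ · score − (1 − λ) · max_similarity`, tokens are split on whitespace (which does not segment CJK text), the default `lam=0.7` is arbitrary, and sentences that do not fit are skipped rather than ending the loop. All function and parameter names are hypothetical.

```python
def greedy_select(sentences, scores, max_chars):
    """Pick highest-scoring sentences that fit; return indices in document order."""
    order = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    chosen, used = [], 0
    for i in order:
        if used + len(sentences[i]) <= max_chars:
            chosen.append(i)
            used += len(sentences[i])
    return sorted(chosen)


def jaccard(a: set, b: set) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(sentences, scores, max_chars, lam=0.7):
    """Trade relevance against overlap with sentences already selected."""
    tokens = [set(s.lower().split()) for s in sentences]
    max_sim = [0.0] * len(sentences)  # highest overlap with any selected sentence
    remaining = set(range(len(sentences)))
    chosen, used = [], 0
    while remaining:
        fitting = [i for i in remaining if used + len(sentences[i]) <= max_chars]
        if not fitting:
            break
        # Ties are broken in favour of the earlier sentence.
        best = max(fitting,
                   key=lambda i: (lam * scores[i] - (1 - lam) * max_sim[i], -i))
        chosen.append(best)
        used += len(sentences[best])
        remaining.discard(best)
        # Incremental update: only compare against the newly selected sentence.
        for i in remaining:
            max_sim[i] = max(max_sim[i], jaccard(tokens[i], tokens[best]))
    return sorted(chosen)


if __name__ == "__main__":
    docs = ["the cat sat", "the cat sat down", "dogs bark loudly"]
    weights = [1.0, 0.9, 0.5]
    print(greedy_select(docs, weights, max_chars=30))
    print(mmr_select(docs, weights, max_chars=30, lam=0.5))
```

With a lower `lam`, the near-duplicate second sentence loses out to the less similar third one; with `lam=1.0` the MMR loop reduces to score-only selection.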
An output variable records which variant was actually used.
Selected sentences are re-sorted by original document order for coherent
output. If no sentence fits within the character limit, a boundary-aware
fallback truncates at sentence-ending punctuation, then at whitespace, and
finally with a hard cut followed by an ellipsis.
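The fallback order described above can be sketched as follows. This is an assumption-laden illustration: `fallback_truncate`, its parameters, and the set of terminators are hypothetical, and the plugin may treat edge cases differently.

```python
SENTENCE_ENDERS = "。！？.!?"


def fallback_truncate(text: str, max_chars: int, ellipsis: str = "…") -> str:
    """Truncate at the best available boundary within max_chars."""
    if max_chars <= 0:
        return ""
    if len(text) <= max_chars:
        return text
    window = text[:max_chars]

    # 1. Prefer the last sentence-ending punctuation inside the window.
    cut = max(window.rfind(ch) for ch in SENTENCE_ENDERS)
    if cut >= 0:
        return window[:cut + 1]

    # 2. Otherwise cut at the last whitespace, dropping the partial word.
    last_space = max((i for i, ch in enumerate(window) if ch.isspace()),
                     default=-1)
    if last_space > 0:
        return window[:last_space].rstrip()

    # 3. Hard cut, reserving room for the ellipsis within the limit.
    keep = max(max_chars - len(ellipsis), 0)
    return (text[:keep] + ellipsis)[:max_chars]


if __name__ == "__main__":
    print(fallback_truncate("Hello world. This is long", 15))
    print(fallback_truncate("alpha beta gamma", 12))
    print(fallback_truncate("abcdefghij", 5))
```

In every branch the result stays within `max_chars`, which is the property the guard needs before text is passed to the LLM.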
This plugin processes all text locally. No data is transmitted to external
servers, APIs, or third-party services. See PRIVACY.md for details.
GitHub profile: https://github.com/AlexMultiAgent
GitHub Issues: https://github.com/AlexMultiAgent/dify-plugin-text-fitter/issues
MIT