vLLM Dify Provider
Dify custom model provider for vLLM's OpenAI-Compatible Server, supporting extra parameters and thinking mode features.
Based on the official Dify OpenAI-API-compatible plugin, extended for vLLM OpenAI-Compatible Server.
Latest: v0.2.3
Fixes
- #34: Fixed thinking mode markup tags — / → /, aligned 1:1 with dify-official-plugins openai_api_compatible
- #31: Fixed parameter delivery — JSON contents now merged into top-level request body instead of nested under key
Features (since v0.2.0)
- extra_body: Pass any vLLM extra parameters directly as JSON
- Thinking Mode: toggle with (strict/extended)
- Reasoning Effort: Support (none/low/medium/high), natively supported by vLLM
- Compatibility Mode: Extended mode injects , , at top level
- Structured Output: Support , ,
- Thinking Content Filter: Auto-filter when thinking is disabled
- Thinking Content Cleanup: Strip thinking content from history before requests
- vLLM Reasoning Field: Priority read (vLLM >= 0.17.1), fallback to
v0.2.0 Baseline
Breaking Change: Removed all legacy parameters, use for all extra parameter needs.
Usage
Add model
Same as OpenAI-API-compatible, select "Vllm" provider:
Configure extra_body
Pass extra parameters via the JSON text field:
Example:
Repo
https://github.com/yangyaofei/dify-vllm-provider