vllm

0.2.3

vllm provider for extra_body support https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#id5

yangyaofei/vllm78356 installs

vLLM Dify Provider

Dify custom model provider for vLLM's OpenAI-Compatible Server, supporting extra parameters and thinking mode features.

Based on the official Dify OpenAI-API-compatible plugin, extended for vLLM OpenAI-Compatible Server.

Latest: v0.2.3

Fixes

#34: Fixed thinking mode markup tags — / → /, aligned 1:1 with dify-official-plugins openai_api_compatible
#31: Fixed parameter delivery — JSON contents now merged into top-level request body instead of nested under key

Features (since v0.2.0)

extra_body: Pass any vLLM extra parameters directly as JSON
Thinking Mode: toggle with (strict/extended)
Reasoning Effort: Support (none/low/medium/high), natively supported by vLLM
Compatibility Mode: Extended mode injects , , at top level
Structured Output: Support , ,
Thinking Content Filter: Auto-filter when thinking is disabled
Thinking Content Cleanup: Strip thinking content from history before requests
vLLM Reasoning Field: Priority read (vLLM >= 0.17.1), fallback to

v0.2.0 Baseline

Breaking Change: Removed all legacy parameters, use for all extra parameter needs.

Usage

Add model

Same as OpenAI-API-compatible, select "Vllm" provider:

Configure extra_body

Pass extra parameters via the JSON text field:

Example:

Repo

https://github.com/yangyaofei/dify-vllm-provider

CATEGORY

Model

VERSION

0.2.3

yangyaofei·05/13/2026 01:48 AM

REQUIREMENTS

LLM invocation

Maximum memory

256MB