app icon
vllm
0.2.3

vllm provider for extra_body support https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html#id5

yangyaofei/vllm78356 installs

vLLM Dify Provider

Dify custom model provider for vLLM's OpenAI-Compatible Server, supporting extra parameters and thinking mode features.

Based on the official Dify OpenAI-API-compatible plugin, extended for vLLM OpenAI-Compatible Server.

Latest: v0.2.3

Fixes

  • #34: Fixed thinking mode markup tags — //, aligned 1:1 with dify-official-plugins openai_api_compatible
  • #31: Fixed parameter delivery — JSON contents now merged into top-level request body instead of nested under key

Features (since v0.2.0)

  • extra_body: Pass any vLLM extra parameters directly as JSON
  • Thinking Mode: toggle with (strict/extended)
  • Reasoning Effort: Support (none/low/medium/high), natively supported by vLLM
  • Compatibility Mode: Extended mode injects , , at top level
  • Structured Output: Support , ,
  • Thinking Content Filter: Auto-filter when thinking is disabled
  • Thinking Content Cleanup: Strip thinking content from history before requests
  • vLLM Reasoning Field: Priority read (vLLM >= 0.17.1), fallback to

v0.2.0 Baseline

Breaking Change: Removed all legacy parameters, use for all extra parameter needs.

Usage

Add model

Same as OpenAI-API-compatible, select "Vllm" provider:

Configure extra_body

Pass extra parameters via the JSON text field:

Example:

Repo

https://github.com/yangyaofei/dify-vllm-provider

CATEGORY
Model
VERSION
0.2.3
yangyaofei·05/13/2026 01:48 AM
REQUIREMENTS
LLM invocation
Maximum memory
256MB