Overview

OrcaRouter is an OpenAI-compatible LLM gateway that routes requests across 40+ upstream providers (OpenAI, Anthropic, Google, DeepSeek, Qwen, Grok, and more) to the cheapest or fastest path that can serve the model you asked for. You pay below list price, and the dashboard tracks your savings in real time.

Configuration

After installation, sign up at orcarouter.ai, create an API key in the console, and enter it under Settings → Model Provider in Dify.

Adaptive routing —

The model is a virtual router that picks the best upstream per request. Configure its strategy from the routing console. Available strategies:

Strategy	Behavior
	Lowest-priced upstream that can serve the request (default)
	Trades off price vs latency vs quality
	Highest-quality upstream
	Linear contextual bandit picks among candidates based on per-request features (prompt length, code/math/JSON density, declared budget tier, MinHash-LSH similarity to recent traffic)
	Layers a task-difficulty score on top of — mundane prompts restricted to a "weak" model pool, hard prompts to a "strong" pool
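
To make the bandit-based strategy more concrete, here is a minimal sketch of a linear contextual bandit in the LinUCB style. It is not OrcaRouter's implementation: the source does not say which bandit algorithm is used, and the class name, the feature set, the `alpha` exploration weight, and the upstream names are all assumptions chosen for illustration. Only two of the listed feature types (prompt length and a rough code/JSON density) are approximated here.

```python
import math


def extract_features(prompt):
    """Hypothetical per-request features: bias, scaled length, code/JSON density."""
    length = min(len(prompt) / 4000.0, 1.0)
    symbol_count = sum(prompt.count(c) for c in "{}();=[]")
    density = symbol_count / max(len(prompt), 1)
    return [1.0, length, density]


class LinearBanditArm:
    """One candidate upstream with a ridge-regression reward model (assumed LinUCB-style)."""

    def __init__(self, dim, alpha=1.0):
        self.alpha = alpha  # exploration weight, an assumed default
        # Inverse of the design matrix A, starting from the identity.
        self.a_inv = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * features

    def _matvec(self, vec):
        return [sum(row[j] * vec[j] for j in range(len(vec))) for row in self.a_inv]

    def score(self, x):
        """Upper confidence bound: estimated reward plus an exploration bonus."""
        theta = self._matvec(self.b)
        a_inv_x = self._matvec(x)
        mean = sum(t * xi for t, xi in zip(theta, x))
        variance = sum(xi * v for xi, v in zip(x, a_inv_x))
        return mean + self.alpha * math.sqrt(max(variance, 0.0))

    def update(self, x, reward):
        """Closed-form Sherman-Morrison update of the inverse, then accumulate b."""
        a_inv_x = self._matvec(x)
        denom = 1.0 + sum(xi * v for xi, v in zip(x, a_inv_x))
        n = len(x)
        for i in range(n):
            for j in range(n):
                self.a_inv[i][j] -= a_inv_x[i] * a_inv_x[j] / denom
        for i in range(n):
            self.b[i] += reward * x[i]


def pick_upstream(arms, features):
    """Return the name of the arm with the highest upper confidence bound."""
    return max(arms, key=lambda name: arms[name].score(features))
```

In this sketch each request costs a handful of small matrix-vector products and no model inference. After an upstream responds, the caller would feed the observed reward back with `arms[name].update(features, reward)`.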

Why this matters

Self-tuning: adaptive strategies learn from your own traffic, so routing adjusts as your workload changes
Microsecond overhead: feature extraction and the bandit update are closed-form math, so routing adds no measurable latency
Workload-aware in one call: the same endpoint serves cheap summarization and premium code-refactor requests with no client-side dispatch logic
Cost and reliability guardrails by construction: the reward explicitly penalizes cost, latency, rate-limit errors, and format failures
Admin-tunable without redeploys: strategies, pools, thresholds, and reward weights are changed from the console, not in client code
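
The reward mentioned above can be pictured as a quality signal minus weighted penalties. The following is only a sketch under stated assumptions: the field names, the default weights, and the linear form are hypothetical and are not taken from OrcaRouter.

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    """Observed result of one routed request (hypothetical fields)."""
    quality: float      # quality signal in the range 0..1
    cost_usd: float     # price paid for the request
    latency_s: float    # end-to-end latency in seconds
    rate_limited: bool  # upstream returned a rate-limit error
    format_ok: bool     # response matched the requested format


@dataclass
class RewardWeights:
    """Penalty weights; the defaults are illustrative, not real values."""
    cost: float = 10.0
    latency: float = 0.05
    rate_limit: float = 1.0
    format_failure: float = 1.0


def reward(outcome, weights=None):
    """Quality minus weighted penalties for cost, latency, and failures."""
    w = weights if weights is not None else RewardWeights()
    value = outcome.quality
    value -= w.cost * outcome.cost_usd
    value -= w.latency * outcome.latency_s
    if outcome.rate_limited:
        value -= w.rate_limit
    if not outcome.format_ok:
        value -= w.format_failure
    return value
```

Because the weights sit in a plain data object, changing them alters routing behavior without touching client code, which is the property the last bullet describes.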

Fallback routing ()

OrcaRouter supports an OpenAI-compatible extension to specify per-request fallback models. Use the and parameters in the model node:

Parameter	Example	Effect
(string, JSON array)		If the primary upstream fails, try these in order
		Activate the fallback list above

These are translated to the request body's key — see the API reference.
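
The "try these in order" behavior happens inside the gateway, but its semantics can be illustrated with a small client-side sketch. Everything named here is hypothetical: `parse_fallbacks`, `call_with_fallbacks`, the `send` callback, and the model names are placeholders, and the real parameter and request-body key names are the ones given in the API reference.

```python
import json


def parse_fallbacks(raw):
    """Parse the fallback list, which is supplied as a string holding a JSON array."""
    models = json.loads(raw)
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError("fallback list must be a JSON array of model names")
    return models


def call_with_fallbacks(primary, fallbacks, send):
    """Try the primary model first, then each fallback in the order given.

    `send` is a caller-supplied function that takes a model name and either
    returns a response or raises an exception on failure.
    """
    failed = []
    for model in [primary, *fallbacks]:
        try:
            return model, send(model)
        except Exception:
            failed.append(model)
    raise RuntimeError("all models failed: " + ", ".join(failed))
```

The function returns both the model that answered and its response, so a caller can tell whether a fallback was used.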

Reasoning models

Some models (OpenAI //, Anthropic , DeepSeek /) expose reasoning controls:

OpenAI o-style: (high / medium / low / minimal), ,
Anthropic thinking: , (token budget),
DeepSeek r-style: (the model reasons by default)

These map onto the upstream provider's native reasoning protocol — OrcaRouter handles the translation.

Reasoning models do not accept . The Dify UI will show only for non-reasoning models.

Pricing

All prices are configured at the per-model level in this plugin and reflect the effective price you pay through OrcaRouter (which may be below the upstream's list price). See the live models page or for the authoritative source.

Privacy

This plugin is a thin client for the OrcaRouter API. It only transmits:

Your API key (entered in Dify settings) — sent in the header on every request
The prompt / messages / parameters you submit through Dify (forwarded to OrcaRouter, which routes them to the upstream provider you selected)
The input text for embedding and TTS endpoints

The plugin itself does not collect, store, log, or transmit any personal data beyond the request payloads required for the OrcaRouter API call. No telemetry, analytics, or third-party data sharing happens in plugin code.

OrcaRouter's own privacy practices (request logging, billing data retention, etc.) are governed by the OrcaRouter privacy policy: https://www.orcarouter.ai/privacy.html

Repository

Plugin source code: https://github.com/Continuum-AI-Corp/dify-plugin-orcarouter