Overview
GMI Cloud is a cloud-based GPU infrastructure platform that provides high-performance AI model inference services. With an OpenAI-compatible API, GMI Cloud delivers fast and reliable access to popular large language models including DeepSeek, Llama, Qwen, and Zhipu models.
Key Features
- OpenAI-Compatible API: Seamless integration with standard OpenAI client libraries and tools.
- Multiple Model Families: Access to DeepSeek, Meta Llama, Qwen, OpenAI OSS, and Zhipu (ZAI) models.
- High Performance: Optimized GPU infrastructure for fast inference and low latency.
- Streaming Support: Real-time streaming responses for chat completions.
- Tool Calling: Built-in support for function calling and tool use.
- Custom Model Support: Deploy and use your own fine-tuned models.
- Flexible Endpoints: Support for custom API endpoints for enterprise deployments.
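Because the API is OpenAI-compatible, a standard chat-completion request body works unchanged. The sketch below builds such a payload with only the standard library; the model id is a placeholder, not a real GMI Cloud model name, so substitute one from your console.

```python
import json

# Build a standard OpenAI-style chat-completion payload.
# "example-model" is a placeholder id -- replace it with any
# model listed in your GMI Cloud console.
payload = {
    "model": "example-model",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "stream": True,  # request real-time streaming chunks
}

body = json.dumps(payload)
print(body)
```

The same body can be sent with any OpenAI client library by pointing its base URL at your GMI Cloud endpoint.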
Configure
After installing the plugin, complete the following steps to configure GMI Cloud:
- Get your API Key: Sign in to the GMI Cloud console and create an API key.
- Configure in Dify: Open Settings → Model Provider, find GMI Cloud, and enter your API key in the field.
- Custom Endpoint (Optional): If your organization uses a custom endpoint, enter it here. Otherwise, the plugin uses the default GMI Cloud endpoint.
- Save: Click "Save" to activate the plugin. Dify validates your credentials with a test API call.
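The validation call amounts to an authenticated request against your configured endpoint. Here is a minimal sketch of that request, assuming the OpenAI convention of a Bearer token and a `/models` listing path; the base URL shown is illustrative, not the real default.

```python
import urllib.request

API_KEY = "your-api-key"  # the key created in the GMI Cloud console
# Illustrative base URL -- use your custom endpoint if your
# organization has one configured.
BASE_URL = "https://api.example-endpoint.com/v1"

req = urllib.request.Request(
    f"{BASE_URL}/models",
    headers={"Authorization": f"Bearer {API_KEY}"},
)
# urllib.request.urlopen(req) would perform the actual call;
# a 200 response with a model list means the key is valid.
print(req.get_header("Authorization"))
```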
Built-in Models
The plugin ships with the following preset models you can use immediately:
- DeepSeek
- OpenAI OSS
- Meta Llama
- Qwen
- Zhipu (ZAI)
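When streaming is enabled, responses arrive as server-sent events: each `data:` line carries one JSON chunk, and a `[DONE]` sentinel closes the stream. The chunk shape below is assumed from the common OpenAI streaming format, with hard-coded sample lines standing in for a live response.

```python
import json

# Sample SSE lines as an OpenAI-compatible server would emit them
# (chunk shape assumed from the OpenAI streaming convention).
raw_stream = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo!"}}]}',
    "data: [DONE]",
]

parts = []
for line in raw_stream:
    data = line[len("data: "):]
    if data == "[DONE]":  # sentinel marking end of stream
        break
    chunk = json.loads(data)
    delta = chunk["choices"][0]["delta"]
    parts.append(delta.get("content", ""))

text = "".join(parts)
print(text)  # Hello!
```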