Project source code address:
Qwen Text2Image & Image2Image Dify Plugin
Project Overview
This is a comprehensive Dify plugin based on ModelScope Qwen-Image models that supports both text-to-image generation and image-to-image editing. Generate high-quality images from text descriptions or edit existing images with text prompts. The plugin uses asynchronous task processing to ensure stable and reliable image generation.
Key Features
- High-Quality Image Generation: Supports advanced Qwen-Image-2512 and other models
- Image Editing: Edit existing images with text prompts using Qwen-Image-Edit-2511, Qwen-Image-Edit-2509 and other models
- Custom Image Size Support: Flexible image dimensions with custom size configuration (WxH format)
- Automatic Size Detection: Image2Image tool automatically detects input image dimensions as default
- Asynchronous Processing: Uses task submission + polling async mode to avoid timeouts
- Real-time Feedback: Provides detailed generation progress and status information
- Error Handling: Comprehensive exception handling with user-friendly error messages
- Bilingual Support: Supports both English and Chinese interface and messages
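As an illustration of the WxH size format and the "input image dimensions as default" behaviour listed above, here is a minimal sketch of a size parser. The function name `parse_size`, the 1024x1024 fallback, and the exact validation rules are assumptions made for this example, not the plugin's actual code.

```python
import re

# Assumed fallback size for illustration; the plugin's real default may differ.
DEFAULT_SIZE = (1024, 1024)

# Accepts strings such as "1024x768" or "1024X768", with optional spaces.
_SIZE_RE = re.compile(r"^\s*(\d+)\s*[xX]\s*(\d+)\s*$")


def parse_size(value, default=DEFAULT_SIZE):
    """Parse a 'WxH' string into (width, height).

    Returns `default` when the value is empty, malformed, or has a zero
    dimension. For Image2Image, `default` could be the detected size of
    the input image.
    """
    if not value:
        return default
    match = _SIZE_RE.match(str(value))
    if match is None:
        return default
    width, height = int(match.group(1)), int(match.group(2))
    if width == 0 or height == 0:
        return default
    return (width, height)
```

For example, `parse_size("1328x1328")` yields `(1328, 1328)`, while `parse_size("large", default=(800, 600))` falls back to `(800, 600)`.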
Project Architecture
Quick Start
1. Get ModelScope API Key
- Visit the ModelScope official website
- Register and log in to your account
- Go to the My Access Token page
- Create a new API Key (format: )
2. Install Dependencies
3. Configure Environment
Copy to and configure the parameters:
4. Install Plugin in Dify
- Upload the plugin folder to the Dify plugin directory
- Enable the plugin in the Dify management interface
- Configure the ModelScope API Key
Usage
Basic Usage
- Add "Qwen Text2Image" tool in Dify workflow
- Configure ModelScope API Key
- Input image description prompt
- Select model (default: Qwen-Image)
- Run the tool to generate image
Workflow DSL example:
Prompt Suggestions
For best image generation results, we recommend:
- Detailed Description: Provide specific information about scene, objects, colors, styles, etc.
- Clear Expression: Use concise and clear language for description
- Style Specification: You can specify artistic styles like "oil painting style", "cartoon style", etc.
Example prompt:
Technical Implementation
Core Workflow
- Task Submission: Submit an asynchronous image generation task to the ModelScope API
- Status Polling: Query the task status every 5 seconds, waiting up to 5 minutes
- Image Download: Download the generated image once the task completes
- Format Conversion: Use PIL to convert the image to PNG format and return it
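The status-polling step above (every 5 seconds, up to 5 minutes) can be sketched as follows. This is a minimal sketch, not the plugin's implementation: `poll_task` and `fetch_status` are hypothetical names, and the `task_status` values `"SUCCEED"` and `"FAILED"` are assumptions about the response format. The status fetcher, sleep function, and clock are passed in so the loop can be exercised without network access.

```python
import time


def poll_task(fetch_status, interval=5.0, timeout=300.0,
              sleep=time.sleep, clock=time.monotonic):
    """Poll until the task finishes, fails, or the timeout elapses.

    fetch_status: hypothetical callable returning a dict that contains
                  a "task_status" key (assumed response shape).
    interval:     seconds to wait between queries (5 s in the README).
    timeout:      maximum total wait in seconds (5 minutes in the README).
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        state = status.get("task_status")
        if state == "SUCCEED":      # assumed success marker
            return status
        if state == "FAILED":       # assumed failure marker
            raise RuntimeError(f"image generation task failed: {status}")
        if clock() >= deadline:
            raise TimeoutError(f"task did not finish within {timeout} seconds")
        sleep(interval)
```

Injecting `sleep` and `clock` is a design choice for testability; in normal use the defaults apply and only `fetch_status` needs to be supplied.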
API Call Pattern
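A rough sketch of what the two requests in a submit-then-poll pattern might look like. Everything specific here is an assumption that should be checked against the current ModelScope API documentation: the base URL, the endpoint paths, the `X-ModelScope-Async-Mode` and `X-ModelScope-Task-Type` headers, and the payload fields. The helper names are hypothetical, and the functions only describe the requests as plain dicts rather than sending them.

```python
import json

# Assumed base URL of the ModelScope inference API; verify before use.
BASE_URL = "https://api-inference.modelscope.cn/"


def _common_headers(api_key):
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }


def build_submit_request(api_key, model, prompt, size=None):
    """Describe the asynchronous task submission request (assumed shape)."""
    payload = {"model": model, "prompt": prompt}
    if size:
        payload["size"] = size  # assumed field name, e.g. "1024x1024"
    headers = _common_headers(api_key)
    headers["X-ModelScope-Async-Mode"] = "true"  # assumed async switch
    return {
        "method": "POST",
        "url": BASE_URL + "v1/images/generations",
        "headers": headers,
        "body": json.dumps(payload, ensure_ascii=False).encode("utf-8"),
    }


def build_status_request(api_key, task_id):
    """Describe the task status query request (assumed shape)."""
    headers = _common_headers(api_key)
    headers["X-ModelScope-Task-Type"] = "image_generation"  # assumed header
    return {
        "method": "GET",
        "url": f"{BASE_URL}v1/tasks/{task_id}",
        "headers": headers,
        "body": None,
    }
```

The returned dicts can be handed to any HTTP client; the submission response would be expected to carry a task identifier that is then passed to `build_status_request`.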
Troubleshooting
Common Issues
- Invalid API Key
  - Check if API Key format starts with
  - Confirm API Key is valid and not expired
- Generation Timeout
  - Check if network connection is normal
  - Try simplifying prompt description
  - Retry later
- Image Download Failed
  - Check network connection
  - Confirm firewall settings allow access to ModelScope domains
Error Codes
- : Invalid or unauthorized API Key
- : API call rate limit exceeded
- : Internal server error
Development Standards
This plugin strictly follows the Dify text-to-image plugin development standards defined in CLAUDE2.md:
- Asynchronous task processing mode
- Complete error handling mechanism
- Real-time progress feedback
- Bilingual support (English/Chinese)
- Standard ModelScope API calls
Contributing
Issues and Pull Requests that improve this plugin are welcome!
License
This project is licensed under the MIT License.
Related Links
Release Notes
0.0.4
- Fixed Image2Image Functionality: Resolved issue where ModelScope server couldn't access Dify internal image URLs causing task failures
- Added Temporary Image Hosting: Integrated litterbox.catbox.moe temporary image hosting service to automatically upload images and obtain publicly accessible URLs
- Improved Image Processing:
  - Automatically handles RGBA, LA, P and other color modes, converting to RGB format
  - Supports image URLs from various sources (including Dify internal addresses, intranet addresses, etc.)
- Increased Network Timeout:
  - API submission request timeout: 300 seconds
  - Task status polling timeout: 120 seconds
  - Image hosting upload timeout: 120 seconds
- Updated Default Model: Image2Image default model updated to
- Improved Error Handling: Optimized error message extraction logic, providing more detailed debugging information
0.0.3
- Enhanced Custom Image Size Support: Both Text2Image and Image2Image tools now support flexible custom image dimensions
- Automatic Size Detection: Image2Image tool automatically detects and uses input image dimensions as default size
- Improved Size Validation: Added comprehensive size format validation with user-friendly error messages
- Better Error Handling: Enhanced error messages for invalid size parameters with automatic fallback
- Code Optimization: Improved parameter handling and validation logic in both tools
- Updated Documentation: Enhanced README with detailed size configuration examples and usage guidelines
0.0.2
- Added Image-to-Image tool (Image2Image) based on ModelScope Qwen-Image-Edit
- New files: ,
- Registered the tool in and imported in
- Updated description and labels to reflect both text-to-image and image-to-image
- Updated README docs (EN/ZH)
- Backward compatible; no breaking changes; existing Text2Image workflows are unaffected
- Usage: In Dify, choose the "Image to Image" tool, then provide a prompt and a public image URL
0.0.1
- Initial release with Text2Image tool based on ModelScope Qwen-Image