
Sora2 Text-to-Video / Image-to-Video Dify Plugin

📖 Project Overview

This Dify plugin is built on the JXINCM Sora-2 API and supports both text-to-video and image-to-video generation. It generates videos from text descriptions or animates images supplied by URL, with real-time progress tracking. Options include landscape or portrait orientation, watermark control, and a choice between two models.

✨ Key Features

🎬 Dual Mode Video Generation: Support both text-to-video and image-to-video modes
📸 Multi-Image Support: Image-to-video mode supports multiple image URLs
🔄 Smart Mode Switching: Automatically switch generation mode based on image URL input
📐 Multiple Orientations: Support landscape and portrait video formats
🎯 High-Quality Output: Powered by JXINCM Sora-2 and Sora-2-Pro models
🔗 Complete Result Return: Returns video URL, thumbnail, and GIF preview
🔄 Real-time Progress Tracking: Display full process status from queued to completed
🛡️ Comprehensive Error Handling: User-friendly error messages and solutions
🌐 Bilingual Support: English and Chinese interfaces

🏗️ Project Architecture

🚀 Quick Start

1. Get JXINCM API Key

Visit the JXINCM official website
Register and log in to your account
Get your API Key

2. Install Dependencies

3. Install Plugin in Dify

Upload the plugin folder to the Dify plugin directory
Enable the plugin in the Dify management interface
Configure the JXINCM API Key

🔧 Usage

Mode 1: Text-to-Video

Add "Text to Video" tool in Dify workflow
Configure JXINCM API Key in plugin settings
Input video description prompt
Select video orientation:
- portrait: Vertical (suitable for mobile short videos)
- landscape: Horizontal (suitable for widescreen playback)
Select video size: large (high quality)
Select model:
- sora-2: Standard quality model
- sora-2-pro: High quality model
Video duration: fixed at 15 seconds (not configurable)
Choose whether to add watermark
Choose whether to make it private
Leave image URL parameter empty
Run the tool to generate video
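
The choices in the steps above can be collected and validated in one place before the tool runs. The following is a minimal sketch, not the plugin's actual code: the function name and every dictionary key are hypothetical, and only the allowed values (portrait/landscape, sora-2/sora-2-pro, large, 15 seconds) come from this README.

```python
from typing import Any, Dict

ORIENTATIONS = {"portrait", "landscape"}
MODELS = {"sora-2", "sora-2-pro"}
FIXED_DURATION_SECONDS = 15  # the README states the duration is fixed


def build_text_to_video_params(
    prompt: str,
    orientation: str = "landscape",
    model: str = "sora-2",
    watermark: bool = False,
    private: bool = False,
) -> Dict[str, Any]:
    """Validate the documented choices and gather them into one dict.

    The key names are hypothetical; the real plugin may name them differently.
    """
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if orientation not in ORIENTATIONS:
        raise ValueError(f"orientation must be one of {sorted(ORIENTATIONS)}")
    if model not in MODELS:
        raise ValueError(f"model must be one of {sorted(MODELS)}")
    return {
        "prompt": prompt.strip(),
        "orientation": orientation,
        "size": "large",
        "model": model,
        "duration": FIXED_DURATION_SECONDS,
        "watermark": watermark,
        "private": private,
        "image_urls": [],  # left empty so the tool stays in text-to-video mode
    }
```

Rejecting an unknown orientation or model up front gives the user an immediate, readable error instead of a failed generation task several minutes later.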

Example:

Mode 2: Image-to-Video

Add "Text to Video" tool in Dify workflow
Configure JXINCM API Key in plugin settings
Input animation description prompt
Input image URLs (supports the following formats):
- Single URL:
- Multiple URLs (comma-separated):
- Multiple URLs (newline-separated):
Configure other parameters (orientation, model, etc.)
Run the tool to generate animated video
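
Because the image URL field accepts a single URL, a comma-separated list, or a newline-separated list, the input has to be normalized before the mode is chosen. One possible way to do that is sketched below; the function names are illustrative assumptions, not the plugin's own helpers.

```python
import re
from typing import List, Optional


def parse_image_urls(raw: Optional[str]) -> List[str]:
    """Split a comma- or newline-separated string into a clean list of URLs."""
    if not raw:
        return []
    parts = re.split(r"[,\n]+", raw)
    # strip() also removes a trailing "\r" left over from Windows line endings
    return [part.strip() for part in parts if part.strip()]


def detect_mode(raw_image_urls: Optional[str]) -> str:
    """Pick image-to-video when at least one URL is present, else text-to-video."""
    return "image-to-video" if parse_image_urls(raw_image_urls) else "text-to-video"
```

With this approach, an empty or whitespace-only field falls back to text-to-video, which matches the instruction to leave the image URL parameter empty in Mode 1.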

Example:

Prompt Suggestions

For best video generation results, we recommend:

Text-to-Video Prompts:

Detailed Description: Provide specific information about scenes, actions, camera movements, lighting
Clear Expression: Use concise and clear language
Camera Direction: Specify camera movements like "slow push-in", "orbit shot"

Example:

Image-to-Video Prompts:

Action Description: Describe how elements in the image should animate
Keep It Simple: Short action commands usually work best
Stay Consistent: Prompt should relate to the image content

Example:

⚙️ Technical Implementation

Core Workflow

Video Generation Flow:

Parameter Parsing: Parse input parameters, determine generation mode (text or image)
Task Creation: Submit video generation request to JXINCM API
Status Polling: Monitor task status via periodic API calls
Progress Tracking: Display generation progress (queued → processing → completed)
Result Extraction: Auto-extract video URL, thumbnail, GIF preview
Result Return: Return complete video information
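
The status polling and progress tracking steps can be sketched as a loop with a fixed interval and an overall deadline. This is an illustration under stated assumptions rather than the plugin's implementation: `fetch_status` is a hypothetical callable that wraps the status request, the `status`, `error`, and `video_url` keys are assumed, and the `failed` status is an assumption (the README names only queued, processing, and completed). The 5-second interval and 10-minute budget are taken from the Performance Metrics section.

```python
import time
from typing import Any, Callable, Dict

POLL_INTERVAL_SECONDS = 5    # polling interval stated in this README
POLL_TIMEOUT_SECONDS = 600   # 10-minute total polling budget


def wait_for_video(
    fetch_status: Callable[[], Dict[str, Any]],
    interval: float = POLL_INTERVAL_SECONDS,
    timeout: float = POLL_TIMEOUT_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the task completes, fails, or the time budget runs out."""
    deadline = clock() + timeout
    while True:
        task = fetch_status()
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":  # assumed status name, not confirmed by the README
            raise RuntimeError(task.get("error", "video generation failed"))
        # Stop early if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError("video generation did not finish in time")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real waiting. With the default values, the loop makes at most about 120 status requests per task.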

API Call Pattern

Key Implementation Details

Task Creation:

Progress Polling:

🔍 Troubleshooting

Common Issues

Invalid API Key
- Check if API Key format is correct
- Confirm API Key is valid and has sufficient quota
- Verify JXINCM platform account status
Generation Timeout
- Check network connection stability
- Video generation typically takes 2-10 minutes
- Try simplifying prompt description
- Retry later if server is busy
Invalid Image URL
- Ensure image URL is publicly accessible
- Check if image format is supported (JPG, PNG, WebP, etc.)
- Verify URL format is correct
Prompt Rejected
- Avoid sensitive or inappropriate content
- Use more general descriptions
- Follow content policy guidelines

Error Codes

: Invalid or unauthorized API Key
: API call rate limit exceeded
: Internal server error
: Request timeout (network or server issue)

📊 Performance Metrics

Request Timeout: 30 seconds (API calls), 10 minutes (total polling)
Average Generation Time: 2-10 minutes (varies by complexity and queue length)
Video Duration: 15 seconds (fixed)
Supported Format: MP4
Video Orientation:
- portrait (vertical)
- landscape (horizontal)
Video Size: large (high quality)
Model Selection:
- sora-2 (standard quality)
- sora-2-pro (high quality)
Generation Modes:
- Text-to-video (no image URLs)
- Image-to-video (with image URLs)
Output Content:
- Video URL (main playback link)
- Thumbnail (preview image)
- GIF preview (animated preview)
Polling Interval: 5 seconds (real-time progress updates)

🔒 Privacy & Security

Please refer to PRIVACY.md for detailed information about data handling and privacy policy.

Key points:

No local storage of prompts or videos
Temporary processing only during generation
API Key stored securely in Dify environment
Data processed by JXINCM according to their privacy policy
Image URLs are supplied by users; the plugin does not store or cache them

📋 Development Standards

This plugin follows Dify plugin development best practices:

✅ Generator response processing
✅ Real-time progress tracking
✅ Automatic URL extraction
✅ Complete error handling mechanism
✅ Bilingual support (English/Chinese)
✅ Standard JXINCM API integration
✅ Dual mode support (text/image)

🤝 Contributing

Issues and Pull Requests that improve this plugin are welcome!

📄 License

This project is licensed under the MIT License.

🔗 Related Links

📦 Release Notes

0.0.4 (2025-11-01) 🆕

API Migration: Migrated from 302.AI to JXINCM API
Dual Mode Support: Support both text-to-video and image-to-video modes
Multi-Image Support: Image-to-video mode supports multiple image URLs
Parameter Optimization: Simplified parameter configuration, fixed 15-second duration
Enhanced Output: Returns video URL, thumbnail, and GIF preview
Progress Display: Complete task creation and progress tracking information
Smart Mode: Auto-switch generation mode based on image URL parameter

0.0.1 (2025-10-02)

Initial release with text-to-video generation