🎙️ Podcast Studio Plugin
Author: gaurav0651
Plugin Name: podcast_studio
Repository URL: https://github.com/gaurav0651/podcast_studio
Overview
Transform text scripts into professional podcast audio with AI-powered voices. The plugin lets content creators generate high-quality, multi-host podcast conversations directly within their Dify workflows, without expensive recording equipment or voice actors.
Key Value Propositions
- Workflow Integration: Seamlessly generate podcast audio within Dify workflows
- Multi-TTS Support: Choose between OpenAI and ElevenLabs for optimal quality/cost balance
- Professional Quality: Studio-grade voices with various accents and characteristics
- Time Savings: Convert scripts to audio in minutes instead of hours of recording/editing
- Cost Effective: No need for professional voice actors or recording studios
Features
- Generate podcast audio using OpenAI or ElevenLabs Text-to-Speech
- Support for multiple host voices with distinct characteristics
- Australian accent voices (Stuart, Lee, Amelia, Maya, Sophia)
- American premium voices (Rachel, Drew, Clyde, Paul, Domi)
- British voices (Dave)
- Mixed TTS service support (use different services for each host)
- Production-grade audio generation optimized for podcast content
Supported TTS Services
OpenAI TTS
- Voices: Alloy, Echo, Fable, Onyx, Nova, Shimmer
- Languages: 29+ supported languages
- Quality: High-quality neural voices
- Cost: Pay-per-character usage
ElevenLabs
- Voices: 11 premium voices with various accents and styles
- Specialties: Australian, American, and British accents
- Quality: Studio-grade voice synthesis
- Cost: Subscription-based with character limits
Setup Instructions
Step 1: Install the Plugin
- Download the latest file from the releases page
- In your Dify instance, navigate to Tools → Plugins
- Click "Install Plugin" and upload the file
- Wait for installation confirmation message
- Verify the plugin appears in your available tools list
Step 2: Configure API Credentials
For OpenAI TTS:
- Visit OpenAI API Keys
- Create a new API key with TTS permissions
- In Dify, go to plugin settings and select "OpenAI TTS"
- Enter your OpenAI API key in the API Key field
- (Optional) Set custom base URL if using a proxy or custom endpoint
- Click "Test Connection" to verify setup
For ElevenLabs:
- Visit ElevenLabs API Keys
- Create a new API key
- In Dify, go to plugin settings and select "ElevenLabs"
- Enter your ElevenLabs API key in the API Key field
- Click "Test Connection" to verify setup
Step 3: Verify Installation
- Create a new workflow in Dify
- Add the "Podcast Studio" tool from the tools panel
- Configure your desired voices for Host 1 and Host 2
- Test with a short sample script that uses "Host 1:" and "Host 2:" speaker labels
- Run the workflow and verify audio generation
Usage Instructions
Basic Usage
- Add the Tool: Drag the "Podcast Studio" tool into your Dify workflow
- Configure Voices:
- Select voice for Host 1 (e.g., "Stuart - Australian Male, Energetic")
- Select voice for Host 2 (e.g., "Sophia - Australian Female, Bright")
- Format Your Script: Start every speaker turn with a consistent label, "Host 1:" or "Host 2:"
- Generate Audio: Connect your script input and run the workflow
- Download Result: The tool outputs an audio file ready for podcast distribution
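As an illustration of the script format described above, the following Python sketch splits a script into speaker turns. It is not the plugin's own parser: the regular expression, the `parse_script` name, and the rule that unlabelled lines continue the previous turn are all assumptions made for this example.

```python
import re
from typing import List, Tuple

# Assumed label format, based on the "Host 1:" / "Host 2:" convention in this README.
LABEL_RE = re.compile(r"^(Host [12]):\s*(.*)$")


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split a script into (speaker, text) segments.

    Unlabelled lines are appended to the previous speaker's turn (an
    assumption of this sketch); text before the first label is an error.
    """
    segments: List[Tuple[str, str]] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        match = LABEL_RE.match(line)
        if match:
            segments.append((match.group(1), match.group(2)))
        elif segments:
            speaker, text = segments[-1]
            segments[-1] = (speaker, f"{text} {line}".strip())
        else:
            raise ValueError(f"Unlabelled text before first speaker: {line!r}")
    return segments


SAMPLE = """\
Host 1: Welcome to the show!
Host 2: Thanks, great to be here.
It's been a busy week.
Host 1: Let's get started.
"""

if __name__ == "__main__":
    for speaker, text in parse_script(SAMPLE):
        print(f"{speaker} -> {text}")
```

Running a check like this before the workflow can catch a mislabelled or unlabelled line early, before any TTS characters are spent.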
Advanced Configuration
- Mixed Services: Use OpenAI voices for one host and ElevenLabs for another to optimize cost/quality
- Voice Characteristics: Each voice has specific traits (age, accent, tone) - choose combinations that create natural conversations
- Script Formatting: Ensure consistent "Host 1:" and "Host 2:" labels for proper voice assignment
- Workflow Integration: Connect with other Dify tools for automated content generation pipelines
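The mixed-service idea can be sketched as a small lookup that pairs each host with a service and a voice. The voice names come from the lists in this README; the `HostVoice` class, the service keys, and the grouping of the accent voices under ElevenLabs are assumptions for illustration, and the identifiers the plugin actually expects may differ.

```python
from dataclasses import dataclass

# Voice names as listed in this README; grouping them by service is inferred
# from the "Supported TTS Services" section and is an assumption.
VOICES = {
    "openai": {"Alloy", "Echo", "Fable", "Onyx", "Nova", "Shimmer"},
    "elevenlabs": {
        "Stuart", "Lee", "Amelia", "Maya", "Sophia",  # Australian
        "Rachel", "Drew", "Clyde", "Paul", "Domi",    # American
        "Dave",                                        # British
    },
}


@dataclass(frozen=True)
class HostVoice:
    """Hypothetical pairing of a TTS service with one of its voices."""

    service: str
    voice: str

    def __post_init__(self) -> None:
        if self.service not in VOICES:
            raise ValueError(f"Unknown TTS service: {self.service!r}")
        if self.voice not in VOICES[self.service]:
            raise ValueError(f"{self.voice!r} is not a {self.service} voice")


# One host per service: the mixed setup described above.
hosts = {
    "Host 1": HostVoice("openai", "Onyx"),
    "Host 2": HostVoice("elevenlabs", "Sophia"),
}
```

Validating the pairing up front makes a mismatch, such as an ElevenLabs voice selected with the OpenAI service, fail with a clear message instead of a failed generation.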
Required APIs and Credentials
OpenAI TTS (Option 1)
- API Key: Required from OpenAI Platform
- Permissions: Text-to-Speech API access
- Pricing: Pay-per-use based on character count (~$15/1M characters)
- Base URL: Optional custom endpoint support
- Rate Limits: 50 requests per minute (default)
ElevenLabs (Option 2)
- API Key: Required from ElevenLabs Dashboard
- Subscription: Starter plan or higher recommended
- Pricing: Character-based limits vary by plan
- Character Limits: 10K (free) to 500K+ (paid plans) per month
- Rate Limits: Varies by subscription tier
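To make the pricing figures above concrete, here is a rough estimator. It uses only the approximate numbers quoted in this README (about $15 per 1M characters for OpenAI, 10K characters per month on the ElevenLabs free tier); the function names are hypothetical, and whether speaker labels count toward billed characters is not specified here, so the sketch simply counts every character.

```python
OPENAI_USD_PER_MILLION_CHARS = 15.0       # approximate figure quoted above
ELEVENLABS_FREE_CHARS_PER_MONTH = 10_000  # free-tier figure quoted above


def openai_cost_usd(script: str) -> float:
    """Estimate OpenAI TTS cost from the raw character count."""
    return len(script) * OPENAI_USD_PER_MILLION_CHARS / 1_000_000


def fits_quota(script: str, monthly_limit: int, used: int = 0) -> bool:
    """Check whether a script fits in the remaining monthly character quota."""
    return used + len(script) <= monthly_limit


if __name__ == "__main__":
    script = "Host 1: Hello and welcome. " * 400
    print(f"{len(script)} characters, about ${openai_cost_usd(script):.4f}")
    print("Fits free tier:", fits_quota(script, ELEVENLABS_FREE_CHARS_PER_MONTH))
```

Actual prices and limits change over time, so treat the constants as placeholders and confirm them against each provider's current pricing page.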
Connection Requirements and Configuration
Network Requirements
- Outbound HTTPS connections are required to:
- The OpenAI API endpoint (for OpenAI TTS service)
- The ElevenLabs API endpoint (for ElevenLabs service)
- Ports: 443 (HTTPS)
- Firewall: Ensure outbound connections are allowed
- Proxy Support: OpenAI base URL can be customized for proxy usage
System Requirements
- Memory: 256MB minimum for plugin operation
- CPU Architecture: AMD64 or ARM64 supported
- Python Runtime: 3.12+ (managed by Dify)
- Dify Version: Compatible with Community Edition and Cloud Version
Configuration Details
- Plugin Memory Allocation: 268MB (256 MiB) reserved
- Concurrent Requests: Supports parallel processing
- Audio Output Format: WAV format, 22kHz sample rate
- Maximum Script Length: Limited by TTS service character limits
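One way to confirm the output format is to inspect a generated file with Python's standard `wave` module. This sketch assumes the output is an uncompressed PCM WAV file; the `describe_wav` and `write_silence` helpers are hypothetical, and the 22,050 Hz value in the demo is only an assumed reading of "22kHz".

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Read basic format information from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": rate,
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
        }


def write_silence(path: str, seconds: float, sample_rate: int) -> None:
    """Write a mono 16-bit silent WAV file, used here only as demo input."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.wav")
    write_silence(demo, seconds=1.0, sample_rate=22050)  # assumed rate
    print(describe_wav(demo))
```

Point `describe_wav` at a file downloaded from the workflow to check its sample rate and duration before publishing.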
Testing and Compatibility
✅ Tested Environments:
- Dify Community Edition v0.6.0+
- Dify Cloud Version (latest)
- Docker deployments
- Local development environments
✅ Functionality Verified:
- OpenAI TTS integration and voice generation
- ElevenLabs TTS integration and voice generation
- Mixed service usage (different services per host)
- Error handling for invalid API keys
- Handling of network connectivity issues
- Large script processing
- Audio quality and format consistency
Troubleshooting
Common Issues
"Invalid API key" Error:
- Verify your API key is correctly entered without extra spaces
- Check that the API key has proper permissions for TTS services
- Ensure you've selected the correct TTS service (OpenAI vs ElevenLabs)
- Test the API key directly with the service provider's documentation
"Plugin verification failed" Error:
- For local development: Disable plugin verification in Dify settings
- For production: Ensure plugin is properly signed and approved
- Check plugin installation logs for specific error details
Audio generation fails:
- Verify your script follows the correct format with "Host 1:" and "Host 2:" labels
- Check API key has sufficient credits/quota remaining
- Ensure network connectivity to TTS service endpoints
- Verify script length doesn't exceed service character limits
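If a single speaker turn exceeds a service's per-request character limit, one workaround is to split the text at sentence boundaries before sending it. The sketch below is an assumption-laden illustration, not the plugin's behavior: `split_text` is a hypothetical helper, and `max_chars` must be taken from the provider's current documentation.

```python
import re
from typing import List


def split_text(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split any single sentence that is itself longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    for chunk in split_text("One. Two. Three.", max_chars=10):
        print(repr(chunk))
```

Splitting at sentence boundaries keeps each request under the limit while avoiding cuts in the middle of a phrase, which tends to sound less abrupt when the audio is joined.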
Poor audio quality:
- Try different voice combinations for better contrast
- Ensure script has natural conversation flow
- Check if using mixed services improves quality for your use case
Getting Support
For technical issues, feature requests, or contributions:
- GitHub Issues: Create an issue
- Documentation: Review this README and GUIDE.md
- Community: Check existing issues for similar problems
Repository and Source Code
Source Code Repository: https://github.com/gaurav0651/podcast_studio
The complete source code, documentation, and examples are available in the GitHub repository. Contributions are welcome!
License
This plugin is released under the MIT License.
This plugin complies with Dify Plugin Privacy Protection Guidelines and has been thoroughly tested for completeness and functionality on both Dify Community Edition and Cloud Version.