🚀 Bright Data Plugin for Dify
A web scraping and data extraction plugin for the Dify platform, powered by Bright Data's enterprise-grade infrastructure. It handles anti-bot protection for you, so workflows can fetch pages and structured data that plain HTTP requests often cannot reach.
✨ New! Smart Data Extractor
Our latest addition automatically determines the best extraction method for your needs:
- 🧠 Intelligent Auto-Detection: Just describe what you want or provide any URL
- 🎯 20+ Data Sources: Amazon, LinkedIn, Instagram, YouTube, TikTok, Crunchbase, and more!
- ⚡ One Tool for Everything: No need to choose between multiple extractors
🔧 Features
Smart Data Extractor (Recommended)
- E-commerce: Amazon products/reviews/search, Walmart, eBay, Best Buy
- Social Media: LinkedIn profiles/companies/jobs, Instagram, TikTok, Facebook, X/Twitter
- Business Intelligence: Crunchbase companies, Google Maps reviews, ZoomInfo
- Content & Media: YouTube videos/comments, Reddit posts, GitHub repositories
- Real Estate: Zillow property listings
Classic Tools
- Web Page Scraping: Extract content as Markdown or HTML with anti-bot protection
- Search Engine Scraping: Google, Bing, Yandex search results
🚀 Quick Start
Option 1: Download Pre-built Package (Easiest)
- Download Plugin Package
- Install in Dify
  - Open your Dify dashboard
  - Go to Plugins → Install from file
  - Upload the file
  - Click Install
- Configure API Token
  - Click on the installed Bright Data plugin
  - Add your Bright Data API Token
  - Save configuration
- Start Using!
  - Create workflows and use the Smart Data Extractor
  - Or use individual tools like Web Scraper and Search Engine
Option 2: Development/Debug Mode
Only use this if you want to modify the plugin:
- Clone Repository
- Set Up Environment
- Configure Debug Connection
- Run Development Server
🔑 Getting Bright Data API Token
- Sign up at Bright Data
- Go to your dashboard → Zones
- Create a new zone or use an existing one
- Copy the API Token
- Use this token when configuring the plugin in Dify
💡 Usage Examples
Smart Data Extractor Usage
The Smart Data Extractor automatically detects what you want to extract:
Example 1: Amazon Product Analysis
Example 2: LinkedIn Profile Research
Example 3: Instagram Analytics
Example 4: Business Intelligence
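The four scenarios above can be written out as inputs to the Smart Data Extractor. The sketch below is illustrative only: the field names `request` and `url` mirror the Request and URL parameters described in this README but are assumptions about naming, and the URLs are placeholders rather than real pages.

```python
# Hypothetical inputs for the four scenarios. Field names and URLs are
# placeholders chosen for illustration, not taken from the plugin.
EXAMPLES = [
    {"name": "Amazon Product Analysis",
     "request": "Get Amazon product info",
     "url": "https://www.amazon.com/dp/EXAMPLE"},
    {"name": "LinkedIn Profile Research",
     "request": "Get LinkedIn profile details",
     "url": "https://www.linkedin.com/in/example-profile"},
    {"name": "Instagram Analytics",
     "request": "Get Instagram profile data",
     "url": "https://www.instagram.com/example_account/"},
    {"name": "Business Intelligence",
     "request": "Get Crunchbase company information for Example Corp",
     "url": None},  # request only, no URL
]


def describe(example):
    """Return a one-line summary of an example input."""
    target = example.get("url") or "(no URL, request only)"
    return f"{example['name']}: {example['request']!r} -> {target}"


for example in EXAMPLES:
    print(describe(example))
```

Each request names the platform explicitly, which follows the troubleshooting advice to prefer "Get Amazon product info" over "get data".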
Workflow Integration
- Create a new Workflow in Dify
- Add a Smart Data Extractor node
- Configure parameters:
  - Request: Describe what you want
  - URL: Paste any supported URL
  - Additional Parameters: Optional; leave empty if not needed
- Connect it to an LLM node to process the extracted data
- Run and enjoy!
🎯 Supported Platforms
| Category | Platforms |
|---|---|
| E-commerce | Amazon, Walmart, eBay, Best Buy, Home Depot, Zara, Etsy |
| Social Media | LinkedIn, Instagram, TikTok, Facebook, X/Twitter, YouTube, Reddit |
| Business Intel | Crunchbase, ZoomInfo, Google Maps, Yahoo Finance |
| Content & Apps | GitHub, Reuters News, Google Play, Apple App Store |
| Real Estate | Zillow, Google Shopping |
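One way to picture URL-based detection is as a lookup from hostname to the categories in the table above. The following is a minimal sketch under that assumption; the `CATEGORY_BY_DOMAIN` mapping and the `category_for` helper are hypothetical and cover only a few rows, and the plugin's real routing logic may differ.

```python
from urllib.parse import urlparse

# Domain -> category, copied from a few rows of the table above.
# Illustrative assumption only, not the plugin's actual routing table.
CATEGORY_BY_DOMAIN = {
    "amazon.com": "E-commerce",
    "walmart.com": "E-commerce",
    "linkedin.com": "Social Media",
    "instagram.com": "Social Media",
    "crunchbase.com": "Business Intel",
    "github.com": "Content & Apps",
    "zillow.com": "Real Estate",
}


def category_for(url):
    """Return the table category for a URL, or None if it is not listed."""
    host = (urlparse(url).hostname or "").lower()
    for domain, category in CATEGORY_BY_DOMAIN.items():
        # Match the bare domain or any subdomain such as www.amazon.com.
        if host == domain or host.endswith("." + domain):
            return category
    return None
```

Matching on the dot-prefixed suffix avoids false positives such as `notamazon.com`, and a URL without a scheme yields no hostname and therefore no match.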
🔧 Configuration Parameters
Smart Data Extractor
- Request (required): Describe what data you want to extract
- URL (optional): Specific URL to extract from
- Additional Parameters (optional): JSON object with extra settings
Web Scraper
- URL (required): Webpage to scrape
- Output format: Markdown or HTML
Search Engine
- Query (required): Search terms
- Engine: Google (default), Bing, or Yandex
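Before wiring values into a workflow node, it can help to check them against the required and optional rules listed above. The sketch below is an assumption about how such a pre-check might look: `parse_additional_params` and `check_extractor_inputs` are hypothetical helpers, and the `limit` key in the usage note is invented for illustration.

```python
import json


def parse_additional_params(raw):
    """Parse the optional Additional Parameters field into a dict."""
    if raw is None or not raw.strip():
        return {}  # the field was left empty
    value = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on bad JSON
    if not isinstance(value, dict):
        raise ValueError("Additional Parameters must be a JSON object")
    return value


def check_extractor_inputs(request, url=None, additional_params=None):
    """Validate Smart Data Extractor inputs and return them in normalized form."""
    if not request or not request.strip():
        raise ValueError("Request is required")
    return {
        "request": request.strip(),
        "url": url.strip() if url and url.strip() else None,
        "additional_params": parse_additional_params(additional_params),
    }
```

For example, `check_extractor_inputs("Get Amazon product info", additional_params='{"limit": 5}')` returns a dict whose `url` is `None` and whose `additional_params` is `{"limit": 5}`.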
🚨 Troubleshooting
Common Issues
"Could not determine appropriate tool"
- Be more specific in your request
- Use full URLs with https://
- Try: "Get Amazon product info" instead of "get data"
"API Error 400: Invalid input"
- Check that URLs are complete and valid
- Ensure your Bright Data API token is correct
- Verify you have sufficient credits
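Both errors above often come down to an incomplete URL. A small local check, sketched here with a hypothetical `is_complete_url` helper, can catch a missing scheme or host before the request is sent; it does not confirm that the page exists or that the platform is supported.

```python
from urllib.parse import urlparse


def is_complete_url(url):
    """Return True if url has an http(s) scheme and a dotted hostname."""
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https"):
        return False  # e.g. "amazon.com/dp/X" has no scheme
    host = parts.hostname or ""
    return "." in host  # rejects empty hosts and bare names like "localhost"
```

With this check, `is_complete_url("amazon.com/dp/X")` is false, while the same address prefixed with `https://` passes.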
Plugin not connecting
- Restart the plugin: then
- Check your file has correct debug credentials
- Refresh Dify and re-add the plugin
Debug Mode Issues
"Failed to connect to localhost:5003"
- Make sure Dify is running with plugin daemon
- Get fresh debug credentials from Dify
- Update your file with new credentials
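To tell a daemon that is not running apart from a credentials problem, you can first test whether anything is listening on the port from the error message. This is a generic TCP check using Python's standard library, not part of the plugin; the host and port defaults are taken from the error text above.

```python
import socket


def can_connect(host="localhost", port=5003, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

If `can_connect()` returns `False`, nothing is accepting connections on that port, so start Dify with the plugin daemon first. If it returns `True` and the plugin still fails, stale debug credentials are the more likely cause.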
📊 What's New in v0.1.0
- ✅ Smart Data Extractor: Auto-detects best extraction method
- ✅ 20+ Data Sources: Expanded from 3 to more than 20 platforms
- ✅ Intelligent URL Detection: Automatically selects tools based on URLs
- ✅ Natural Language Requests: Describe what you want in plain English
- ✅ Better Error Handling: Helpful suggestions when things go wrong
- ✅ Robust API Integration: Handles various response formats gracefully
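As an illustration of what handling varied response formats can involve, the sketch below normalizes a response body into a list of records. The set of formats it covers (a JSON object, a JSON array, newline-delimited JSON, and plain text such as Markdown) is an assumption, and `normalize_response` is a hypothetical helper rather than the plugin's own code.

```python
import json


def normalize_response(body):
    """Turn a response body into a list of records."""
    text = body.strip()
    if not text:
        return []
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        pass  # not a single JSON document, fall through
    else:
        return data if isinstance(data, list) else [data]

    # Try newline-delimited JSON: one JSON value per non-empty line.
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            # Plain text or Markdown: wrap the whole body as one record.
            return [{"text": text}]
    return records
```

Returning a list in every case means a downstream LLM node can treat all results the same way, whatever shape the upstream response took.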
🤝 Contributing
- Fork the repository
- Create a feature branch:
- Make your changes and test thoroughly
- Commit:
- Push:
- Submit a pull request
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🆘 Support
🙏 Credits
- Bright Data: For providing the enterprise-grade web scraping infrastructure
- Dify Team: For creating an amazing AI workflow platform
- Contributors: Thank you to everyone who helps improve this plugin!