Firecrawl API integration for web crawling and scraping.
Firecrawl lets you extract URLs, scrape website content, and retrieve structured data from web pages. Its tools are modular, so you can use each one on its own in your application workflows for automated web data extraction and analysis.
To set up Firecrawl, follow these steps:
1. Install the Firecrawl tool. Access the Plugin Marketplace, locate the Firecrawl tool, and install it.
2. Apply for a Firecrawl API Key. Go to the Firecrawl API Keys page, create a new API Key, and make sure your account has sufficient balance.
3. Authorize Firecrawl. In Dify, navigate to Plugins > Firecrawl > To Authorize, and enter your API Key to enable the tool.

The Firecrawl tool provides four actions for web crawling and scraping:
Retrieve scraping results based on a Job ID or cancel an ongoing scraping task. This is ideal for managing and monitoring your workflows.
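
As a rough illustration of what managing a job by ID involves outside of Dify, the sketch below builds (but does not send) the HTTP requests for checking or cancelling a crawl job. The base URL, the `/crawl/{id}` path, and the Bearer-token header are assumptions based on Firecrawl's hosted v1 REST API and are not stated in this guide; `crawl_job_request` is a hypothetical helper name, and the Dify plugin may call the service differently.

```python
import urllib.request

# Assumption: base URL of Firecrawl's hosted REST API (v1); not given in this guide.
API_BASE = "https://api.firecrawl.dev/v1"


def crawl_job_request(job_id: str, api_key: str, cancel: bool = False) -> urllib.request.Request:
    """Build, without sending, a request that checks or cancels a crawl job.

    GET retrieves the job's status and results; DELETE cancels it
    (assumed semantics of the v1 /crawl/{id} endpoint).
    """
    method = "DELETE" if cancel else "GET"
    return urllib.request.Request(
        url=f"{API_BASE}/crawl/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method=method,
    )


if __name__ == "__main__":
    # Placeholder values; actually sending the request needs a real key and Job ID.
    req = crawl_job_request("your-job-id", "fc-YOUR-API-KEY")
    print(req.get_method(), req.full_url)
```

Passing the built request to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the request easy to inspect first.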

Perform a recursive crawl of a website's subdomains to gather content. Perfect for extracting extensive datasets from interconnected pages.

Generate a complete map of all URLs present on a site with the Map action. Before running it, you need to configure its input parameters.

Convert any URL into clean, structured data extracted from the page's raw HTML.
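
For readers curious about what the crawl, map, and scrape actions might look like as raw API calls, here is a minimal sketch that builds one JSON POST request per action. The endpoint paths (`/crawl`, `/map`, `/scrape`), the `formats` and `limit` options, and the base URL are assumptions drawn from Firecrawl's hosted v1 REST API rather than from this guide, and `action_request` is a hypothetical helper; the Dify tool exposes its own parameter form.

```python
import json
import urllib.request

# Assumption: base URL of Firecrawl's hosted REST API (v1); not given in this guide.
API_BASE = "https://api.firecrawl.dev/v1"
ACTIONS = {"crawl", "map", "scrape"}


def action_request(action: str, url: str, api_key: str, **options) -> urllib.request.Request:
    """Build, without sending, a POST request for one of the three actions.

    Extra keyword arguments are merged into the JSON body as action options.
    """
    if action not in ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    body = {"url": url, **options}
    return urllib.request.Request(
        url=f"{API_BASE}/{action}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder key and target; the option names are assumed, not confirmed here.
    scrape = action_request("scrape", "https://example.com", "fc-YOUR-API-KEY", formats=["markdown"])
    crawl = action_request("crawl", "https://example.com", "fc-YOUR-API-KEY", limit=10)
    print(scrape.full_url, scrape.data.decode("utf-8"))
    print(crawl.full_url, crawl.data.decode("utf-8"))
```

Routing all three actions through one builder reflects that they share the same authentication and take a target URL plus action-specific options.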

Firecrawl can be seamlessly integrated into both Chatflow / Workflow Apps and Agent Apps.
Integrate Firecrawl into your pipeline by following these steps:

Add the Firecrawl tool to the Agent application, then send a URL. The tool scrapes the content at that URL, giving the LLM access to up-to-date online data.
