A powerful e-commerce data extraction tool that effortlessly scrapes real-time Amazon product details, keyword rankings, and BSR top lists into structured JSON for your AI agents.
Author: pangolinfo
Version: 0.0.2
Type: tool (e-commerce_data_extraction)
Pangolinfo Amazon Data Scraper gives your AI agents access to live Amazon e-commerce data. This official plugin converts complex Amazon web pages into clean, LLM-ready JSON and handles CAPTCHAs and enterprise-grade anti-bot systems in the background. Its 7 specialized parsers provide a data foundation for building autonomous market research agents, real-time pricing monitors, and e-commerce RAG workflows.
🎁 Welcome Bonus: Start your AI journey with 60 complimentary credits! New users receive free credits upon registration at the Pangolinfo Portal, allowing you to test and deploy your first autonomous agents at zero initial cost.
Powered by Pangolinfo API. 👉 Get your API Key & Documentation: https://www.pangolinfo.com/amazon-scraper-skill/?referrer=dify_plugin_amazon
Comprehensive Amazon Coverage: 7 specialized parsers covering Keywords, ASIN Details, BSR (Best Sellers Rank), New Releases, Categories, Storefronts, and Hijacker Tracking (Follow Sellers).
Anti-Bot & CAPTCHA Bypass: Enterprise-grade infrastructure featuring automated CAPTCHA solving and proxy rotation to ensure uninterrupted, high-success data retrieval.
Global Marketplace Support: Scrape data from regional Amazon domains (e.g., .com, .co.uk, .de, .co.jp).
Real-Time Market Intelligence: Instant access to live pricing, organic rankings, and BuyBox ownership status for immediate tactical decision-making.
LLM-Ready Structured Data: Transforms unstructured Amazon content into high-density, clean JSON. Optimized to minimize token consumption and maximize prompt accuracy for RAG and AI Agents.
Granular Product Insights: Extract deep-level specifications including bullet points, high-resolution imagery, SKU variations, and comprehensive review metrics.
Before using this plugin, you need a Pangolinfo account and an API key, both available from the link above.
1. Basic Keyword Search Extraction
To extract organic rankings, prices, and sponsored ads from a search page:
Parameters:
url: https://www.amazon.com/s?k=wireless+mouse
format: json
parserName: amzKeyword
bizContext: {}
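As a rough illustration, the parameter set above could be assembled in Python as follows. This is only a sketch: the helper name `build_keyword_params` and the `domain` argument are hypothetical and not part of the plugin; only the four parameter names and their values come from the example above.

```python
from urllib.parse import quote_plus


def build_keyword_params(keyword: str, domain: str = "www.amazon.com") -> dict:
    """Build the four tool parameters for an amzKeyword request.

    Hypothetical helper for illustration; the plugin itself only
    expects the resulting url/format/parserName/bizContext values.
    """
    # quote_plus encodes spaces as '+', matching the search URL shown above.
    url = f"https://{domain}/s?k={quote_plus(keyword)}"
    return {
        "url": url,
        "format": "json",
        "parserName": "amzKeyword",
        "bizContext": {},
    }


params = build_keyword_params("wireless mouse")
print(params["url"])  # https://www.amazon.com/s?k=wireless+mouse
```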
2. Deep ASIN Detail Scraping
To extract comprehensive details (bullet points, BuyBox, variations) of a specific product:
Parameters:
url: https://www.amazon.com/dp/B08H93ZRK9
format: json
parserName: amzProductDetail
bizContext: {}
3. Niche Discovery (Best Sellers)
To scrape the Top 100 products in a specific category:
Parameters:
url: https://www.amazon.com/Best-Sellers-Electronics/zgbs/electronics/
format: json
parserName: amzBestSellers
bizContext: {}
Understanding Parsers (parserName)
You must select the correct parser that matches your provided url:
amzKeyword: Extracts product lists from search result URLs.
amzProductDetail: Extracts deep specs from standard product detail (/dp/ASIN) URLs.
amzBestSellers: Extracts Top 100 lists from Amazon BSR category URLs.
amzNewReleases: Extracts trending products from Amazon New Releases URLs.
amzProductOfCategory: Extracts macro product lists from Amazon category navigation URLs.
amzProductOfSeller: Extracts storefront listings from a specific Seller ID's URL.
amzFollowSeller: Monitors hijackers/competitors from the "Other Sellers on Amazon" URL of an ASIN.
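Because the parser must match the URL, it can help to pick the parserName programmatically before calling the tool. The sketch below is an assumption-laden example rather than the plugin's own logic: `guess_parser` is a hypothetical helper, and the URL patterns it checks are inferred from the sample URLs in this document plus the common `/new-releases` path. Category, storefront, and follow-seller URLs are left out because their exact shapes are not shown here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def guess_parser(url: str) -> Optional[str]:
    """Return a likely parserName for an Amazon URL, or None if unsure.

    Hypothetical helper: the patterns below are assumptions based on
    the example URLs in this document, not rules from the plugin.
    """
    parts = urlparse(url)
    path = parts.path
    query = parse_qs(parts.query)

    if "/dp/" in path:
        return "amzProductDetail"
    if "/zgbs/" in path:
        return "amzBestSellers"
    if "/new-releases" in path:
        return "amzNewReleases"
    if path == "/s" and "k" in query:
        return "amzKeyword"
    # Category, storefront, and follow-seller URLs are not covered here.
    return None


print(guess_parser("https://www.amazon.com/dp/B08H93ZRK9"))  # amzProductDetail
```

Returning `None` for unrecognized URLs lets the caller stop early instead of sending a mismatched parser and URL to the tool.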
Output Format
The plugin returns structured JSON data for the LLM to parse. Example output for amzKeyword:
Crawl a competitor's storefront to analyze their pricing strategy:
Set up an Agent to regularly check if unauthorized sellers are hijacking your product:
Create a workflow that first calls amzBestSellers to get 10 top-selling ASINs, builds their product URLs, and uses a Loop (Iteration) node to call amzProductDetail on each one to generate a comprehensive niche report.
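The chaining step in this workflow can be sketched in Python as below. This assumes each Best Sellers item exposes its ASIN under an `asin` key, which is a guess because the output schema is not shown in this document; `plan_detail_requests` is a hypothetical helper, and in Dify the same logic would more likely live in a Code node feeding the Iteration node.

```python
from typing import Iterable


def plan_detail_requests(best_sellers: Iterable[dict], limit: int = 10) -> list[dict]:
    """Turn Best Sellers items into amzProductDetail parameter sets.

    Assumes each item carries its ASIN under the key "asin"
    (an assumption; adjust to the real output schema).
    """
    requests: list[dict] = []
    seen: set[str] = set()
    for item in best_sellers:
        if len(requests) >= limit:
            break
        asin = item.get("asin")
        if not asin or asin in seen:
            continue  # skip items without an ASIN and duplicates
        seen.add(asin)
        requests.append({
            "url": f"https://www.amazon.com/dp/{asin}",
            "format": "json",
            "parserName": "amzProductDetail",
            "bizContext": {},
        })
    return requests


sample = [{"asin": "B08H93ZRK9"}, {"asin": "B08H93ZRK9"}, {"title": "no asin"}]
for request in plan_detail_requests(sample):
    print(request["url"])  # https://www.amazon.com/dp/B08H93ZRK9
```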
Match URLs to Parsers Strictly: Ensure the URL you provide structurally matches the parserName. Passing a product detail URL to a keyword parser will result in an error.
Utilize Workflows for Scale: Use Dify's Workflow Loop (Iteration) nodes for bulk ASIN processing instead of prompting an Agent to do it all at once.
Prompt the LLM Properly: In Agent apps, instruct the LLM explicitly: "You must construct a valid Amazon URL (e.g., https://www.amazon.com/dp/ASIN) before calling the tool."
Use Long-Term Keys: Always use an API key obtained via the Auth API to prevent sudden workflow failures due to token expiration.
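For the URL-construction advice above, a small validation step can catch malformed ASINs before a request is sent. The sketch below assumes the usual ASIN shape of 10 uppercase alphanumeric characters; `product_url` is a hypothetical helper and not part of the plugin.

```python
import re

# Assumption: ASINs are 10 characters drawn from A-Z and 0-9.
ASIN_PATTERN = re.compile(r"[A-Z0-9]{10}")


def product_url(asin: str, domain: str = "www.amazon.com") -> str:
    """Build a /dp/ product URL, rejecting strings that do not look like ASINs."""
    asin = asin.strip().upper()
    if not ASIN_PATTERN.fullmatch(asin):
        raise ValueError(f"not a valid-looking ASIN: {asin!r}")
    return f"https://{domain}/dp/{asin}"


print(product_url("b08h93zrk9"))  # https://www.amazon.com/dp/B08H93ZRK9
print(product_url("B08H93ZRK9", domain="www.amazon.de"))  # https://www.amazon.de/dp/B08H93ZRK9
```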
Real-time Latency: Since Pangolinfo fetches live data and solves CAPTCHAs on the fly, a single request may take anywhere from 3 to 15 seconds.
Workflow Timeouts: When looping through multiple URLs in a Dify Workflow, ensure you increase the overall node/app timeout settings to accommodate the combined processing time.
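To size a timeout, a back-of-the-envelope estimate based on the 3 to 15 second range above may be enough. The sketch assumes the loop processes URLs one at a time; `estimate_loop_seconds` is a hypothetical helper, and the result is a rough bound rather than a guarantee.

```python
def estimate_loop_seconds(url_count: int,
                          fastest: float = 3.0,
                          slowest: float = 15.0) -> tuple[float, float]:
    """Estimate best-case and worst-case total time for a sequential loop.

    Assumes requests run one after another and each takes between
    `fastest` and `slowest` seconds, as stated above.
    """
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    return (url_count * fastest, url_count * slowest)


best, worst = estimate_loop_seconds(10)
print(best, worst)  # 30.0 150.0
```

Under these assumptions, a 10-URL loop should be given a timeout of at least 150 seconds, plus some margin for the workflow's other nodes.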
Troubleshooting
Common Issues
"Authentication Failed" error:
"Invalid URL / Parsing Error":
"Rate Limit Exceeded" / Insufficient Balance:
Empty Data Returned:
API Limits
Privacy
Please refer to the Pangolinfo Privacy Policy for information on how your data is handled when using this plugin.
Last updated: March 2026
Need higher rate limits, custom endpoints, or enterprise solutions? 🌐 Visit Pangolinfo Official Website: https://pangolinfo.com/?referrer=dify_plugin_amazon