SchemaRAG Database Schema RAG Plugin
Author: joto
Version: 0.1.7
Type: tool
Repository: https://github.com/JOTO-AI/SchemaRAG-dify-plugin
Chinese documentation (中文文档)
Overview
SchemaRAG is a database schema RAG plugin built for the Dify platform. It automatically analyzes database structures, builds a schema knowledge base, and translates natural-language questions into SQL queries. The plugin covers schema analysis and query generation end to end and works out of the box.
Example workflow download
✨ Core Features
- Multi-Database Support: MySQL, PostgreSQL, MSSQL, Oracle, DM (达梦), automatic syntax adaptation
- Schema Auto-Analysis: One-click data dictionary generation, structure visualization
- Knowledge Base Upload: Automatic upload to Dify, supports incremental updates
- Natural Language to SQL: Ready to use out of the box, supports complex queries
- AI Data Analysis: Analyze query data, supports custom rules
- Data Visualization: Provides visualization tools, LLM recommends charts and fields
- Security Mechanism: SELECT-only access, supports field whitelist, minimum privilege principle
- Flexible Support: Compatible with mainstream large language models
📋 Configuration Parameters
Supported Database Types
🚀 Quick Start
Method 1: Command Line
Method 2: Dify Plugin Integration
- Fill in the above parameters in the Dify platform plugin configuration interface
- Once the configuration is complete and correct, click Save; the plugin automatically builds the schema knowledge base for the configured database in Dify
- Add the tools to your workflow and set the ID of the knowledge base that was just created (the knowledge base ID appears in the URL of the knowledge base page)
- Use the SQL execution tool to run the generated SQL directly; it supports Markdown and JSON output
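Since the knowledge base ID has to be copied out of the page URL, a small helper can reduce copy-paste mistakes. The sketch below is illustrative only: it assumes the URL contains a `datasets` path segment followed by a UUID, and both the function name `extract_dataset_id` and the example host are hypothetical, not part of the plugin.

```python
import re
from urllib.parse import urlsplit

# Assumption: the knowledge base page URL looks like
# https://<dify-host>/datasets/<uuid>/documents
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")


def extract_dataset_id(url: str) -> str:
    """Return the UUID that follows the 'datasets' path segment."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    # Walk over adjacent pairs of path segments: ("datasets", "<uuid>"), ...
    for name, value in zip(segments, segments[1:]):
        if name == "datasets" and _UUID.match(value):
            return value
    raise ValueError(f"no knowledge base ID found in URL: {url}")


if __name__ == "__main__":
    # Hypothetical URL used only for illustration.
    page = "https://dify.example.com/datasets/0f4c2e6a-1b3d-4c5e-8f90-123456789abc/documents"
    print(extract_dataset_id(page))
```

If your Dify deployment uses a different URL layout, adjust the segment name accordingly.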
Method 3: Code Invocation
🛠️ Tool Components
1. text2sql Tool
Natural Language to SQL Query Tool - Convert natural language questions to SQL queries using database schema knowledge base
Core Features
- Intelligent Query Conversion: Automatically convert natural language questions to accurate SQL query statements
- Multi-Database Support: Supports MySQL, PostgreSQL, MSSQL, Oracle, and DM SQL dialects
- Knowledge Base Retrieval: Intelligent retrieval and matching based on database schema knowledge base
- Ready to Use: Can be used directly after configuring the knowledge base, no additional setup required
- Custom Prompt Rules: Add custom text to the prompt and configure custom rules
Parameter Configuration
2. sql_executer Tool
SQL Query Execution Tool - Safely execute SQL queries and return formatted results
Core Features
- Safe Execution: Only supports SELECT queries to ensure data security
- Output Control: Lets you cap the maximum number of rows returned, preventing oversized result sets
- Multi-Format Output: Supports JSON and Markdown output formats
- Direct Connection: Direct database connection for query execution, real-time results
- Error Handling: Comprehensive error handling mechanism with detailed error information
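To illustrate what a SELECT-only policy can look like, here is a simplified sketch. It is not the plugin's actual implementation: the deny-list, the helper names, and the decision to also allow `WITH ... SELECT` are assumptions made for this example.

```python
import re

# Hypothetical deny-list of statements that modify data or schema.
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|create|grant|revoke|merge|exec|execute|call)\b",
    re.IGNORECASE,
)


def strip_strings_and_comments(sql: str) -> str:
    """Blank out string literals and drop comments so their content is ignored."""
    sql = re.sub(r"'(?:[^']|'')*'", "''", sql)              # single-quoted literals
    sql = re.sub(r"--[^\n]*", " ", sql)                     # line comments
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)   # block comments
    return sql


def is_safe_select(sql: str) -> bool:
    """Accept a single SELECT (or WITH ... SELECT) statement and nothing else."""
    cleaned = strip_strings_and_comments(sql).strip().rstrip(";").strip()
    if ";" in cleaned:                                      # more than one statement
        return False
    if not re.match(r"(?i)^(select|with)\b", cleaned):
        return False
    return _FORBIDDEN.search(cleaned) is None


if __name__ == "__main__":
    for statement in ("SELECT id FROM users", "DELETE FROM users"):
        print(statement, "->", is_safe_select(statement))
```

A text-level check like this is only a first line of defense and can reject legitimate queries (for example, a column literally named `update`). Connecting with a read-only database account, in line with the minimum privilege principle, remains the stronger safeguard.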
Parameter Configuration
3. sql_executer_cust Tool
Custom SQL Query Execution Tool - Connect to a database through a custom connection URL and safely execute SQL queries, returning formatted results
Core Features
- Custom Database Connection: Supports multiple databases without plugin configuration
- Safe Execution: Only supports SELECT queries to ensure data security
- Output Control: Lets you cap the maximum number of rows returned, preventing oversized result sets
- Multi-Format Output: Supports JSON and Markdown output formats
- Direct Connection: Direct database connection for query execution, real-time results
- Error Handling: Comprehensive error handling mechanism with detailed error information
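The row cap and the two output formats can be pictured with a short sketch. The function `format_rows` and its parameters `output_format` and `max_rows` are hypothetical names chosen for this example; the plugin's own parameter names may differ.

```python
import json


def format_rows(columns, rows, output_format="markdown", max_rows=100):
    """Render query results as a JSON string or a Markdown table, capped at max_rows."""
    limited = rows[:max_rows]

    if output_format == "json":
        records = [dict(zip(columns, row)) for row in limited]
        # default=str keeps dates, decimals, etc. serializable.
        return json.dumps(records, ensure_ascii=False, default=str)

    if output_format == "markdown":
        def cell(value):
            # Escape pipes and flatten newlines so the table stays well-formed.
            return "" if value is None else str(value).replace("|", "\\|").replace("\n", " ")

        lines = [
            "| " + " | ".join(columns) + " |",
            "| " + " | ".join("---" for _ in columns) + " |",
        ]
        lines += ["| " + " | ".join(cell(v) for v in row) + " |" for row in limited]
        return "\n".join(lines)

    raise ValueError(f"unsupported output format: {output_format}")


if __name__ == "__main__":
    cols = ["id", "name"]
    data = [(1, "Ann"), (2, None), (3, "Bob")]
    print(format_rows(cols, data, "markdown", max_rows=2))
```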
Parameter Configuration
Database connection URL examples:
- mysql: mysql://user:password@host:port/dbname
- postgresql: postgresql://user:password@host:port/dbname
- DM: dameng://user:password@host:port/dbname
- mssql: mssql://user:password@host:port/dbname
- oracle: oracle://user:password@host:port/dbname
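A connection URL in the shape shown above can be checked before it is handed to the tool. The following sketch uses only Python's standard library; the function name `parse_db_url` and the returned dictionary keys are hypothetical, and the scheme list simply mirrors the examples above.

```python
from urllib.parse import unquote, urlsplit

# Schemes taken from the URL examples above.
SUPPORTED_SCHEMES = {"mysql", "postgresql", "dameng", "mssql", "oracle"}


def parse_db_url(url: str) -> dict:
    """Split scheme://user:password@host:port/dbname into its parts."""
    parts = urlsplit(url)
    if parts.scheme not in SUPPORTED_SCHEMES:
        raise ValueError(f"unsupported database type: {parts.scheme!r}")
    if not parts.hostname or not parts.path.strip("/"):
        raise ValueError("URL must include a host and a database name")
    return {
        "db_type": parts.scheme,
        # Special characters in credentials must be percent-encoded in the URL.
        "user": unquote(parts.username or ""),
        "password": unquote(parts.password or ""),
        "host": parts.hostname,
        "port": parts.port,  # None when the URL omits the port
        "database": parts.path.lstrip("/"),
    }


if __name__ == "__main__":
    print(parse_db_url("postgresql://reader:p%40ss@db.example.com:5432/sales"))
```

Note that a password containing characters such as `@`, `:` or `/` needs percent-encoding (for example `%40` for `@`), otherwise the URL is split in the wrong place.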
4. text2data Tool (recommended)
Natural Language to Data Query Tool - Integrates text2sql and sql_executer functionality for one-stop conversion from questions to data
Core Features
- End-to-End Query: Convert natural language questions directly to query results without intermediate steps
- Multi-Database Support: Supports MySQL, PostgreSQL, MSSQL, Oracle, and DM databases
- Smart Output: Supports JSON, Markdown, and Summary output formats
- SQL Auto-Repair: Experimental feature that automatically analyzes and fixes SQL errors when execution fails (requires enablement)
- Safe Execution: Built-in SQL security policies to prevent dangerous operations
- Optimized Experience: Uses tags to fold intermediate processes, with clear result display
Parameter Configuration
SQL Auto-Repair Feature (Experimental)
When auto-repair is enabled and execution of the generated SQL fails, the system will:
- Auto-Analyze Errors: Capture database error messages and specific causes
- Intelligent Repair: Use LLM to analyze errors and generate repaired SQL
- Iterative Optimization: Support up to N repair attempts (configurable)
- Transparent Process: Display repair process within tags
Repair Scenario Examples:
- ✅ Column name spelling errors
- ✅ Table name does not exist or is incorrect
- ✅ JOIN condition errors
- ✅ Data type mismatches
- ✅ Syntax errors (dialect-specific syntax)
Usage Recommendations:
- 🧪 Experimental feature; enabling it increases token consumption
- 📝 Better results in complex Schema scenarios
- ⚡ Adds 2-10 seconds to response time
- 💰 Each repair consumes approximately 2000-3000 tokens
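The execute, capture error, repair, retry cycle described in this section can be sketched as a bounded loop. This is an illustration under stated assumptions, not the plugin's code: `run_with_repair`, `execute`, `repair` and `max_attempts` are hypothetical names, and `repair` stands in for the LLM call.

```python
from typing import Callable, List, Tuple


def run_with_repair(
    sql: str,
    execute: Callable[[str], list],
    repair: Callable[[str, str], str],
    max_attempts: int = 3,
) -> Tuple[list, List[Tuple[str, str]]]:
    """Run sql; on failure ask `repair` for a fixed statement, up to max_attempts times.

    Returns the rows plus a history of (failed_sql, error_message) pairs.
    """
    history: List[Tuple[str, str]] = []
    current = sql
    for attempt in range(max_attempts + 1):  # one initial run + N repair attempts
        try:
            return execute(current), history
        except Exception as exc:
            history.append((current, str(exc)))
            if attempt == max_attempts:
                raise RuntimeError(
                    f"SQL still failing after {max_attempts} repair attempts"
                ) from exc
            # In the real feature this step would be an LLM call that sees the error.
            current = repair(current, str(exc))
    raise AssertionError("unreachable")


if __name__ == "__main__":
    def fake_execute(statement):
        if "nmae" in statement:
            raise ValueError("Unknown column 'nmae'")
        return [("Ann",)]

    def fake_repair(statement, error):
        return statement.replace("nmae", "name")

    print(run_with_repair("SELECT nmae FROM users", fake_execute, fake_repair))
```

Bounding the loop matters because every repair attempt costs an extra LLM call, which is where the added latency and token consumption noted above come from.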
5. data_summary Tool
Data Summary Analysis Tool - Intelligent data content analysis and summarization using large language models
Analysis Capabilities
- Custom Rules: Supports user-defined analysis rules and guidelines
- Smart Data Format Recognition: Automatically identifies JSON and other data formats for optimized processing
- Performance Optimized: Cached common configurations to reduce response time
Configuration Options
6. llm_chart_generator Tool
LLM Intelligent Chart Generation Module - Uses large language models to recommend chart types and fields and AntV to render the charts, providing a maintainable end-to-end chart solution
Features
- Intelligent Analysis: Automatically analyzes user questions and data, intelligently selects the most suitable chart type
- Multi-Chart Support: Supports mainstream charts such as bar charts, line charts, pie charts, scatter plots, histograms
- High Maintainability: Modular design with clear interfaces, easy to extend and maintain
- Unified Standards: Chart configuration uses standardized JSON format for easy integration and parsing
- Fallback Solutions: Automatically falls back to table display when chart generation fails
- Configuration Validation: Comprehensive configuration validation and error handling mechanisms to ensure stability
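Configuration validation with a table fallback might look roughly like the sketch below. The config keys `chart_type`, `x_field` and `y_field` are hypothetical, since the plugin's actual JSON schema is not shown here; only the list of chart types comes from the feature list above.

```python
# Chart types named in the feature list above.
SUPPORTED_CHARTS = {"bar", "line", "pie", "scatter", "histogram"}

TABLE_FALLBACK = {"chart_type": "table"}


def resolve_chart(config, rows):
    """Return config if it is usable for rows, otherwise fall back to a table.

    `config` is the (hypothetical) JSON object recommended by the LLM;
    `rows` is a list of dicts, one per result row.
    """
    if not isinstance(config, dict) or not rows:
        return TABLE_FALLBACK
    chart_type = config.get("chart_type")
    if chart_type not in SUPPORTED_CHARTS:
        return TABLE_FALLBACK

    # A histogram needs one field; the other chart types need two.
    required = [config.get("x_field")]
    if chart_type != "histogram":
        required.append(config.get("y_field"))

    available = set(rows[0].keys())
    if any(field not in available for field in required):
        return TABLE_FALLBACK
    return config


if __name__ == "__main__":
    data = [{"month": "Jan", "revenue": 120}, {"month": "Feb", "revenue": 90}]
    print(resolve_chart({"chart_type": "bar", "x_field": "month", "y_field": "revenue"}, data))
    print(resolve_chart({"chart_type": "bar", "x_field": "month", "y_field": "profit"}, data))
```

Checking the recommended fields against the actual result columns catches the most common failure: the LLM naming a field that the query did not return.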
Configuration Options
❓ FAQ
Q: Which databases are supported?
A: Currently supports MySQL, PostgreSQL, MSSQL, Oracle, and DM (达梦).
Q: Is the data secure?
A: The plugin only reads database structure information to build the Dify knowledge base. Sensitive information is not uploaded.
Q: How to configure the database?
A: Configure database and knowledge base related information in the Dify plugin page. After configuration, it will automatically build the schema knowledge base in Dify.
Q: How to use the text2sql tool?
A: After configuring the database and generating the schema knowledge base, copy the dataset_id from the knowledge base URL into the tool so that it knows which indexed knowledge base to search, then fill in the remaining parameters.
Q: What data formats does the data_summary tool support?
A: Supports multiple data formats including text and JSON. The tool automatically recognizes and optimizes processing. Supports data content up to 50,000 characters.
Q: How to use custom rules?
A: You can specify specific analysis requirements, focus points, or constraints in the custom_rules parameter, supporting up to 2,000 characters.
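The two size limits mentioned in this FAQ (50,000 characters of data, 2,000 characters of custom rules) can be checked before calling the tool. This is a minimal sketch: `check_summary_inputs` and the `data_content` argument are hypothetical names, and how the plugin itself reacts to oversized input is not documented here.

```python
MAX_DATA_CHARS = 50_000   # data content limit stated in the FAQ
MAX_RULES_CHARS = 2_000   # custom_rules limit stated in the FAQ


def check_summary_inputs(data_content: str, custom_rules: str = "") -> list:
    """Return a list of problems; an empty list means the inputs fit the limits."""
    problems = []
    if not data_content.strip():
        problems.append("data content is empty")
    if len(data_content) > MAX_DATA_CHARS:
        problems.append(
            f"data content has {len(data_content)} characters (limit {MAX_DATA_CHARS})"
        )
    if len(custom_rules) > MAX_RULES_CHARS:
        problems.append(
            f"custom_rules has {len(custom_rules)} characters (limit {MAX_RULES_CHARS})"
        )
    return problems


if __name__ == "__main__":
    rules = "Focus on month-over-month revenue changes; report amounts in USD."
    print(check_summary_inputs('[{"month": "Jan", "revenue": 120}]', rules))
```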
📸 Example Screenshots
📞 Contact
📄 License
Apache-2.0 license