Automatically builds a database schema knowledge base for RAG, and provides workflow nodes that convert natural language to SQL based on SchemaRAG.
Author: joto
Version: 0.1.6
Type: tool
Repository: https://github.com/JOTO-AI/SchemaRAG-dify-plugin

SchemaRAG is a database schema RAG plugin designed specifically for the Dify platform. It can automatically analyze database structures, build knowledge bases, and implement natural language to SQL queries. This plugin provides a complete database schema analysis and intelligent query solution, ready to use out of the box.
Example workflow download
| Parameter Name | Type | Required | Description | Example |
|---|---|---|---|---|
| Dataset API Key | secret | Yes | Dify knowledge base API key | dataset-xxx |
| Database Type | select | Yes | Database type (MySQL/PostgreSQL/MSSQL/Oracle/DM) | MySQL |
| Database Host | string | Yes | Database host/IP | 127.0.0.1 |
| Database Port | number | Yes | Database port | 3306/5432 |
| Database User | string | Yes | Database username | root |
| Database Password | secret | Yes | Database password | ****** |
| Database Name | string | Yes | Database name | mydb |
| Dify Base URL | string | No | Dify API base URL | |

| Database Type | Default Port | Driver | Connection String Format |
|---|---|---|---|
| MySQL | 3306 | pymysql | |
| PostgreSQL | 5432 | psycopg2-binary | |
| Microsoft SQL Server | 1433 | pymssql | |
| Oracle | 1521 | oracledb | |
| DM Database (达梦) | 5236 | dm+pymysql | |

Fill in the parameters above in the plugin configuration interface on the Dify platform.

Once the configuration is complete and correct, click Save. The plugin then automatically builds the schema knowledge base for the configured database in Dify.

In your workflow, add the tools and set the ID of the knowledge base that was just created (the ID appears in the URL of the knowledge base page).
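
As an illustration of where the knowledge base ID lives, the sketch below pulls it out of a knowledge base page URL. It assumes the URL has the form `https://<dify-host>/datasets/<id>/documents`; the helper name `extract_dataset_id` and the example URL are hypothetical, so adjust the path pattern if your Dify deployment differs.

```python
from urllib.parse import urlparse


def extract_dataset_id(kb_url: str) -> str:
    """Return the path segment that follows 'datasets' in a knowledge base URL.

    Assumption: the URL looks like https://<host>/datasets/<id>/documents.
    """
    segments = [s for s in urlparse(kb_url).path.split("/") if s]
    try:
        return segments[segments.index("datasets") + 1]
    except (ValueError, IndexError):
        raise ValueError(f"no dataset ID found in URL: {kb_url!r}") from None


# Hypothetical URL, for illustration only.
example_url = "https://dify.example.com/datasets/3f2a9c1e-0000-4000-8000-123456789abc/documents"
print(extract_dataset_id(example_url))
```

The returned string is the value to paste into the `dataset_id` parameter of the tools described below.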

The plugin also provides a SQL execution tool: pass in the generated SQL to execute it directly. Output is available in Markdown or JSON.

Natural Language to SQL Query Tool - Convert natural language questions to SQL queries using database schema knowledge base

| Parameter | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | Yes | Dify knowledge base ID containing database schema |
| llm | model-selector | Yes | Large language model for SQL generation |
| content | string | Yes | Natural language question to convert to SQL |
| dialect | select | Yes | SQL dialect (MySQL/PostgreSQL/MSSQL/Oracle/DM) |
| top_k | number | No | Number of results to retrieve from knowledge base (default 5) |
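
Conceptually, the tool retrieves up to `top_k` schema snippets from the knowledge base and then asks the LLM to write SQL in the chosen dialect. The sketch below shows only the prompt assembly step under that reading. The function `build_text2sql_prompt` and the prompt wording are hypothetical and are not taken from the plugin; retrieval and the LLM call are omitted.

```python
def build_text2sql_prompt(schema_chunks: list[str], content: str,
                          dialect: str, top_k: int = 5) -> str:
    """Assemble an LLM prompt from retrieved schema snippets and a question."""
    # Keep at most top_k snippets, mirroring the top_k parameter (default 5).
    selected = schema_chunks[:top_k]
    schema_text = "\n\n".join(selected)
    return (
        f"You are an expert in {dialect} SQL.\n"
        f"Database schema:\n{schema_text}\n\n"
        f"Question: {content}\n"
        "Return a single SQL query and nothing else."
    )


# Hypothetical schema snippet and question.
chunks = ["CREATE TABLE orders (id INT, amount DECIMAL(10,2), created_at DATE)"]
print(build_text2sql_prompt(chunks, "What was the total order amount last month?", "MySQL"))
```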

SQL Query Execution Tool - Safely execute SQL queries and return formatted results

| Parameter | Type | Required | Description |
|---|---|---|---|
| sql | string | Yes | SQL query statement to execute |
| output_format | select | Yes | Output format (JSON/Markdown) |
| max_line | int | No | Maximum number of query rows (default 1000) |
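
To make the interplay of `sql`, `output_format`, and `max_line` concrete, here is a minimal sketch that uses SQLite from the Python standard library in place of a real target database. Two things are assumptions: "safe" execution is modelled as allowing only `SELECT` statements, and the helper `run_query` is hypothetical rather than the plugin's implementation.

```python
import json
import sqlite3


def run_query(conn: sqlite3.Connection, sql: str,
              output_format: str = "JSON", max_line: int = 1000) -> str:
    """Run a read-only query and format at most max_line rows."""
    # Assumption: "safe" execution is modelled here as allowing SELECT only.
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed in this sketch")
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description]
    rows = cursor.fetchmany(max_line)  # cap the number of returned rows
    if output_format.lower() == "json":
        return json.dumps([dict(zip(columns, r)) for r in rows], default=str)
    # Markdown table: header, separator, then one line per row.
    lines = ["| " + " | ".join(columns) + " |",
             "|" + "---|" * len(columns)]
    lines += ["| " + " | ".join(str(v) for v in r) + " |" for r in rows]
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
print(run_query(conn, "SELECT id, name FROM t ORDER BY id", "Markdown", max_line=2))
```

Capping rows at fetch time rather than after loading the full result keeps memory use bounded even when the query matches far more rows than `max_line`.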

Custom SQL Query Execution Tool - Connect to a custom database and safely execute SQL queries, returning formatted results

| Parameter | Type | Required | Description |
|---|---|---|---|
| database_url | string | Yes | Database connection URL |
| sql | string | Yes | SQL query statement to execute |
| output_format | select | Yes | Output format (JSON/Markdown) |
| max_line | int | No | Maximum number of query rows (default 1000) |

Database connection URL examples:
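
As a hedged illustration, the sketch below assembles connection URLs in the common SQLAlchemy style, `scheme://user:password@host:port/database`. The scheme prefixes are assumptions inferred from the driver table above, and `build_database_url` is a hypothetical helper; check the plugin repository for the exact formats it accepts.

```python
from urllib.parse import quote_plus

# Assumed URL schemes, inferred from the drivers listed in the table above.
SCHEMES = {
    "mysql": "mysql+pymysql",
    "postgresql": "postgresql+psycopg2",
    "mssql": "mssql+pymssql",
    "oracle": "oracle+oracledb",
    "dm": "dm+pymysql",
}


def build_database_url(db_type: str, user: str, password: str,
                       host: str, port: int, database: str) -> str:
    """Build a connection URL for one of the database types listed above."""
    scheme = SCHEMES[db_type.lower()]
    # Percent-encode credentials so characters like '@' or '/' stay valid.
    return (f"{scheme}://{quote_plus(user)}:{quote_plus(password)}"
            f"@{host}:{port}/{database}")


print(build_database_url("MySQL", "root", "p@ss/word", "127.0.0.1", 3306, "mydb"))
# prints mysql+pymysql://root:p%40ss%2Fword@127.0.0.1:3306/mydb
```

Encoding the password matters in practice: an unescaped `@` or `/` in the credentials would otherwise be read as part of the host or path.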

Natural Language to Data Query Tool - Integrates text2sql and sql_executer functionality for one-stop conversion from questions to data

| Parameter | Type | Required | Description |
|---|---|---|---|
| dataset_id | string | Yes | Dify knowledge base ID containing database schema, supports multiple IDs separated by commas |
| llm | model-selector | Yes | Large language model for SQL generation and analysis |
| content | string | Yes | Natural language question to convert to SQL |
| dialect | select | Yes | SQL dialect (MySQL/PostgreSQL/MSSQL/Oracle/DM) |
| output_format | select | Yes | Output format (JSON/Markdown/Summary) |
| top_k | number | No | Number of results to retrieve from knowledge base (default 5) |
| max_rows | number | No | Maximum number of rows to return (default 500, prevents excessive data) |
| example_dataset_id | string | No | Example knowledge base ID, can provide SQL examples to improve generation quality |
| enable_refiner | boolean | No | Enable SQL auto-repair feature (experimental, default false) |
| max_refine_iterations | number | No | Maximum SQL repair attempts (1-5, default 3) |

When `enable_refiner` is enabled and the generated SQL fails to execute, the system attempts to repair the SQL automatically, up to `max_refine_iterations` times.
Repair Scenario Examples:
Usage Recommendations:
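
The auto-repair behaviour controlled by `enable_refiner` and `max_refine_iterations` can be pictured as a bounded retry loop. The sketch below is an assumption about the general shape of such a loop, not the plugin's code: it uses SQLite from the standard library, and the `repair` callback stands in for the LLM call that would rewrite the failing SQL from the error message.

```python
import sqlite3
from typing import Callable


def execute_with_refiner(conn: sqlite3.Connection, sql: str,
                         repair: Callable[[str, str], str],
                         max_refine_iterations: int = 3) -> list:
    """Execute sql; on failure, ask repair(sql, error) for a fix and retry."""
    for attempt in range(max_refine_iterations + 1):
        try:
            return conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            if attempt == max_refine_iterations:
                raise  # repair budget exhausted; surface the last error
            # In a real setup this step would call the LLM with the error text.
            sql = repair(sql, str(exc))


# Stand-in for the LLM: fixes one known typo in a column name.
def fake_repair(sql: str, error: str) -> str:
    return sql.replace("nmae", "name")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada')")
print(execute_with_refiner(conn, "SELECT nmae FROM users", fake_repair))
```

Bounding the loop keeps a persistently failing query from triggering an unbounded number of LLM calls, which is presumably why the parameter is limited to the range 1-5.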

Data Summary Analysis Tool - Intelligent data content analysis and summarization using large language models

| Parameter | Type | Required | Description |
|---|---|---|---|
| data_content | string | Yes | Data content to be analyzed |
| llm | model-selector | Yes | Large language model for analysis |
| query | string | Yes | Analysis query or focus area |
| custom_rules | string | No | Custom analysis rules |
| user_prompt | string | No | Custom prompt |

LLM Intelligent Chart Generation Module - Uses a large language model to recommend chart types and fields, and renders charts with antv, providing a maintainable end-to-end charting solution

| Parameter | Type | Required | Description |
|---|---|---|---|
| user_question | string | Yes | User question describing the chart type and requirements (e.g., sales trends, market share) |
| data | string | Yes | Data for visualization, supports JSON, CSV, or structured data |
| llm | model-selector | Yes | Large language model for analysis and chart generation |
| sql_query | string | Yes | SQL query statement used to recommend charts and fields |
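
Because `data` is passed as a string that may hold JSON or CSV, a caller often needs to normalise it into records before reasoning about fields. The sketch below shows one way to do that; `parse_chart_data` is a hypothetical helper, and how the plugin itself parses the input is not documented here.

```python
import csv
import io
import json


def parse_chart_data(data: str) -> list[dict]:
    """Parse a JSON array of objects, falling back to CSV with a header row."""
    try:
        parsed = json.loads(data)
        if isinstance(parsed, list) and all(isinstance(r, dict) for r in parsed):
            return parsed
    except json.JSONDecodeError:
        pass
    # CSV fallback: every value is read as a string.
    return list(csv.DictReader(io.StringIO(data.strip())))


print(parse_chart_data('[{"month": "Jan", "sales": 100}]'))
print(parse_chart_data("month,sales\nJan,100\nFeb,120"))
```

Note that the CSV path yields string values, so numeric columns would need converting before they are used as chart measures.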

Q: Which databases are supported?
A: MySQL, PostgreSQL, MSSQL, Oracle, and DM (达梦) are currently supported.

Q: Is the data secure?
A: The plugin reads only database structure information to build the Dify knowledge base. Sensitive information is not uploaded.

Q: How do I configure the database?
A: Enter the database and knowledge base settings on the Dify plugin page. Once saved, the plugin automatically builds the schema knowledge base in Dify.

Q: How do I use the text2sql tool?
A: After configuring the database and generating the schema knowledge base, copy the dataset_id from the URL of the generated knowledge base and enter it in the tool so that it knows which indexed knowledge base to query. Then configure the remaining parameters.

Q: What data formats does the data_summary tool support?
A: It supports multiple data formats, including plain text and JSON. The tool recognizes the format automatically and optimizes processing accordingly. Data content of up to 50,000 characters is supported.

Q: How do I use custom rules?
A: Specify analysis requirements, focus points, or constraints in the custom_rules parameter, which accepts up to 2,000 characters.
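
The two length limits mentioned above (50,000 characters for `data_content`, 2,000 for `custom_rules`) can be checked before calling the data_summary tool. The sketch below is a hypothetical pre-flight check; it rejects oversized input, which is an assumption, since the plugin may instead truncate or handle it differently.

```python
MAX_DATA_CHARS = 50_000   # limit stated for data_content
MAX_RULES_CHARS = 2_000   # limit stated for custom_rules


def check_summary_inputs(data_content: str, custom_rules: str = "") -> None:
    """Raise ValueError if either input exceeds its documented length limit."""
    if len(data_content) > MAX_DATA_CHARS:
        raise ValueError(
            f"data_content has {len(data_content)} characters; limit is {MAX_DATA_CHARS}")
    if len(custom_rules) > MAX_RULES_CHARS:
        raise ValueError(
            f"custom_rules has {len(custom_rules)} characters; limit is {MAX_RULES_CHARS}")


check_summary_inputs('{"region": "EU", "sales": 1200}', "Focus on year-over-year change.")
print("inputs are within the documented limits")
```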




Apache-2.0 license