Document Review Agent
A Dify plugin for AI-powered review of tender documents, official documents, contracts, and other materials, including detection of non-compliant documents. It supports document parsing, rule-based auditing, risk aggregation, and annotated document generation, with configurable options at each stage.
Version Information
- Current Version: v0.0.2
- Release Date: 2026-04-13
- Compatibility: Dify Plugin Framework
- Python Version: 3.12
Version History
- v0.0.2 (2026-04-13):
  - Added integrated slice audit tool (parse -> load rules -> audit -> aggregate -> annotate -> revise)
  - Added integrated simple/full-text audit tool for short document single-loop auditing
  - Added template slice audit tool with required and optional
  - Added template full-text audit tool with required and optional
  - Added template comparators: and
  - Added template risk code normalization to style and aligned output fields for aggregation/annotation
  - Improved no-risk handling in (returns original reviewed file with instead of failing)
  - Reorganized provider tool exposure and YAML definitions around integrated top-level tools
- v0.0.1 (2026-04-05): Initial release with local document review capabilities
Quick Start
- Install the plugin in your Dify environment.
- Download the rules template and sample files: https://github.com/sawyer-shi/awsome-dify-agents/blob/master/src/doc-review-agent/agent_test_files/review_rules_research_en.csv
- Configure your LLM model settings. To prevent timeouts on long reviews, increase the PLUGIN_MAX_EXECUTION_TIMEOUT parameter.
- Upload your document and start the review process.
Key Features
- Four Integrated Audit Tools: Slice/non-template, full-text/non-template, slice/template, and full-text/template workflows
- Template Baseline Review: Template-based findings use normalized risk codes like for consistent downstream tagging
- Hybrid Rule + Template Aggregation: Optional can run together with template audit and merge into one unified risk payload
- Structured Risk Pipeline: Audit -> aggregation -> annotation -> revision with consistent data schema across workflows
- High-Quality Output Files: Reviewed (annotated) and revised outputs with configurable JSON/file output modes
- Flexible Control Knobs: Slice strategy, audit strategy, merge policy, merge strategy, language, and output settings
- No-Risk Safe Handling: When no risks are found, the workflow returns a valid reviewed file instead of failing
- Multi-Language Reasoning: Supports zh/en/ja/ko/es/fr/de/pt/ru/ar outputs
Core Features
1) Doc Slice Audit ()
Integrated non-template slice auditing for larger documents.
- Required: , ,
- What it does (6 steps):
  - Document slicing
  - Rule loading
  - Chunk auditing
  - Risk aggregation
  - Document annotation
  - File revision
- Best for: contracts/tenders where chunk-level analysis is preferred
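The document slicing step can be pictured with a minimal sketch. It assumes the document has already been read into a list of paragraph strings; the function name `slice_paragraphs` and the `max_chars` parameter are hypothetical and are not taken from the plugin's code, whose actual slice strategies may differ.

```python
def slice_paragraphs(paragraphs: list[str], max_chars: int = 2000) -> list[str]:
    """Greedily pack consecutive paragraphs into chunks of roughly max_chars.

    Hypothetical helper for illustration only. A single paragraph longer
    than max_chars is kept whole as its own oversized chunk.
    """
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for para in paragraphs:
        text = para.strip()
        if not text:
            continue  # skip empty paragraphs
        # +1 accounts for the newline that joins paragraphs inside a chunk
        added = len(text) + (1 if current else 0)
        if current and current_len + added > max_chars:
            # The chunk is full: flush it and start a new one
            chunks.append("\n".join(current))
            current, current_len = [], 0
            added = len(text)
        current.append(text)
        current_len += added
    if current:
        chunks.append("\n".join(current))
    return chunks
```

Packing whole paragraphs keeps clauses intact, so each chunk sent to the audit step still reads as coherent text.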
2) Doc Audit ()
Integrated non-template full-text auditing for short documents.
- Required: , ,
- What it does (6 steps):
  - Load review document
  - Rule loading
  - Full-text rule audit
  - Risk aggregation
  - Document annotation
  - File revision
- Best for: shorter documents where whole-text context is important
3) Doc Slice Audit Template ()
Integrated template-based slice auditing.
- Required: , ,
- Optional: (runs rule audit + template audit together when provided)
- What it does (8-step pipeline):
  - Slice review document
  - Slice template document
  - Rule loading (optional input; step kept in progress output)
  - Rule-based chunk audit (runs when is provided, otherwise marked as skipped)
  - Template chunk comparison audit
  - Risk aggregation
  - Document annotation
  - File revision
- Output semantics: template findings use normalized codes (, , ...), severity from LLM ()
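The way the optional rule input affects the 8-step pipeline can be sketched as follows. This is an illustration under stated assumptions: the step names come from the list above, but `plan_steps`, the `status` values, and the choice to leave the rule loading step as `pending` when no rules are supplied are hypothetical and not confirmed by the plugin's code.

```python
TEMPLATE_SLICE_STEPS = [
    "Slice review document",
    "Slice template document",
    "Rule loading",
    "Rule-based chunk audit",
    "Template chunk comparison audit",
    "Risk aggregation",
    "Document annotation",
    "File revision",
]


def plan_steps(has_rules: bool) -> list[dict]:
    """Build a progress plan that always lists all eight steps.

    Hypothetical sketch: only the rule-based audit is marked as skipped
    when no rules are provided; every other step stays pending.
    """
    plan = []
    for index, name in enumerate(TEMPLATE_SLICE_STEPS, start=1):
        status = "pending"
        if name == "Rule-based chunk audit" and not has_rules:
            status = "skipped"
        plan.append({"step": index, "name": name, "status": status})
    return plan
```

Keeping skipped steps in the plan means the progress output has the same shape whether or not a rule file is supplied.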
4) Doc Audit Template ()
Integrated template-based full-text auditing.
- Required: , ,
- Optional: (runs rule audit + template audit together when provided)
- What it does (8-step pipeline):
  - Load review document
  - Load template document
  - Rule loading (optional input; step kept in progress output)
  - Rule-based full-text audit (runs when is provided, otherwise marked as skipped)
  - Full-text template comparison audit
  - Risk aggregation
  - Document annotation
  - File revision
- Best for: short-form baseline checks against a model template
Shared Output and Controls
- JSON output: or
- File output: revised only, or reviewed + revised
- Revision behavior: choose merge strategy and whether to apply revisions back to source text
- No-risk behavior: returns a valid reviewed file with
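The no-risk behavior can be illustrated with a small sketch. All names here (`ReviewResult`, `finalize_review`, the `no_risk` flag, and the `code`/`message` keys) are assumptions made for illustration; the plugin's real output fields are not shown in this document.

```python
from dataclasses import dataclass, field


@dataclass
class ReviewResult:
    reviewed_file: str
    risk_count: int
    no_risk: bool  # hypothetical flag name, not the plugin's actual field
    annotations: list[str] = field(default_factory=list)


def finalize_review(source_file: str, risks: list[dict]) -> ReviewResult:
    """Return a valid result even when the audit finds nothing."""
    if not risks:
        # No findings: hand back the original file unchanged instead of failing
        return ReviewResult(reviewed_file=source_file, risk_count=0, no_risk=True)
    # Findings exist: one annotation string per risk (annotation writing is out of scope here)
    annotations = [f"{risk['code']}: {risk['message']}" for risk in risks]
    return ReviewResult(
        reviewed_file=source_file,
        risk_count=len(risks),
        no_risk=False,
        annotations=annotations,
    )
```

Returning a normal result object in the empty case lets downstream workflow nodes treat "nothing found" as a success path rather than an error.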
Technical Advantages
- LLM-Powered Analysis: Leverages advanced LLM models for intelligent document understanding
- Rule-Based Auditing: Flexible rule system for customizable review criteria
- Chunk-Based Processing: Efficient handling of large documents through intelligent slicing
- Risk Deduplication: Smart aggregation to eliminate redundant findings
- Annotated Output: Professional document output with clear risk indicators
- Format Support: Optimized for the docx format, with room to extend to other formats
- Configurable Audit Levels: Support for strict and lenient auditing modes
- Real-Time Processing: Efficient workflow for timely document review
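Risk deduplication, as mentioned above, could work along these lines. This is a minimal sketch rather than the plugin's aggregation logic: the `code`, `excerpt`, and `severity` keys and the three-level severity scale are assumptions, and the real merge policy and merge strategy options may behave differently.

```python
# Hypothetical severity scale used only for this illustration
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2}


def aggregate_risks(risks: list[dict]) -> list[dict]:
    """Collapse findings that share a risk code and the same excerpt text.

    When duplicates disagree on severity, the most severe finding is kept.
    """
    merged: dict[tuple[str, str], dict] = {}
    for risk in risks:
        # Normalize case and whitespace so trivially different excerpts match
        key = (risk["code"], " ".join(risk["excerpt"].lower().split()))
        kept = merged.get(key)
        if kept is None or SEVERITY_ORDER[risk["severity"]] > SEVERITY_ORDER[kept["severity"]]:
            merged[key] = risk
    return list(merged.values())
```

Overlapping chunks tend to report the same clause twice, so keying on the code plus normalized excerpt removes those repeats before annotation.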
Requirements
- Python 3.12
- Dify Platform access
- Configured LLM model
- Required Python packages (installed via requirements.txt):
  - dify_plugin>=0.5.0
  - python-docx>=1.1.2
  - openpyxl>=3.1.5
Installation & Configuration
- Install the required dependencies listed in requirements.txt
- Configure your LLM model in the plugin settings
- Install the plugin in your Dify environment
Usage
Choose the Right Tool
A) Non-template Slice Audit
Use when you have a rule file and need chunk-level review.
- Required: , ,
- Recommended options: , , ,
B) Non-template Full-Text Audit
Use when you have a rule file and the document is short enough for full-text auditing.
- Required: , ,
- Recommended options: , ,
C) Template Slice Audit
Use when template compliance is required at chunk level.
- Required: , ,
- Optional: for hybrid rule + template audit
- Notes: template findings are normalized to style risk codes
D) Template Full-Text Audit
Use when template compliance is required for the full document.
- Required: , ,
- Optional: for hybrid rule + template audit
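The choice among the four tools above can be summarized as a small decision helper. The tool display names come from this document, but `choose_tool` and the `short_limit` threshold are hypothetical: the plugin does not define a fixed cutoff for what counts as a short document, so treat the number as a placeholder to tune.

```python
def choose_tool(has_template: bool, char_count: int, short_limit: int = 8000) -> str:
    """Pick one of the four integrated tools by template use and document length.

    short_limit is an assumed cutoff for "short enough for full-text auditing".
    """
    use_slices = char_count > short_limit
    if has_template:
        return "Doc Slice Audit Template" if use_slices else "Doc Audit Template"
    return "Doc Slice Audit" if use_slices else "Doc Audit"
```

For example, a long tender reviewed against rules only would map to Doc Slice Audit, while a short notice checked against a model template would map to Doc Audit Template.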
Typical Output
- A JSON summary (or detailed JSON if enabled)
- A reviewed file (with annotations)
- A revised file (with merged or applied revisions)
Supported Document Formats
- Input: .docx (Microsoft Word)
- Output: .docx (Microsoft Word with annotations)
Notes
- Document parsing is optimized for docx format
- Chunk size can be adjusted based on document complexity
- Audit level affects the strictness of rule application
- Risk aggregation uses intelligent deduplication to avoid redundant findings
- Annotation style currently supports comment-based annotations
- Large documents are processed efficiently through chunking
- All tools require a configured LLM model for operation
Developer Information
- Author:
- Email: [email protected]
- License: Apache License 2.0
- Source Code:
- Support: Through Dify platform and GitHub Issues
License Notice
This project is licensed under Apache License 2.0. See the LICENSE file for the full license text.
Ready to review your documents with AI-powered intelligence?