Python tool for converting files and office documents to Markdown.
A document conversion plugin for Dify based on MarkItDown 0.0.2a1. It converts a range of file formats into Markdown.
This plugin can be used in place of Dify's traditional document extraction nodes. It relies on MarkItDown's plugin-based architecture to convert multiple file formats to Markdown.
Documents
.pdf
.doc, .docx
.ppt, .pptx
.xls, .xlsx
.html, .htm
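As a rough illustration, a workflow could check file names against the document extensions listed above before passing them to the plugin. This is a minimal sketch: the names `DOCUMENT_EXTENSIONS`, `is_listed_document`, and `split_by_listing` are hypothetical, and the set covers only the extensions enumerated in this README, not everything the plugin may accept.

```python
from pathlib import Path

# Only the document extensions enumerated in this README (an assumption:
# the plugin may accept more formats than are listed here).
DOCUMENT_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".ppt", ".pptx",
    ".xls", ".xlsx", ".html", ".htm",
}


def is_listed_document(filename: str) -> bool:
    """Return True if the file name ends in one of the listed extensions."""
    return Path(filename).suffix.lower() in DOCUMENT_EXTENSIONS


def split_by_listing(filenames):
    """Partition file names into (listed, unlisted), preserving order."""
    listed = [name for name in filenames if is_listed_document(name)]
    unlisted = [name for name in filenames if not is_listed_document(name)]
    return listed, unlisted
```

The comparison is case-insensitive, so `Report.PDF` is treated the same as `report.pdf`.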
Media Files
Data Formats
Archives
The plugin accepts the following parameters in the Dify interface:
files:
  type: files
  required: false
  description: "Array of files to be converted to Markdown"
The plugin provides three types of response formats for maximum flexibility:
JSON Response (New)
{
  "status": "success|error",
  "total_files": 2,
  "successful_conversions": 2,
  "results": [
    {
      "filename": "example1.pdf",
      "original_format": "pdf",
      "markdown_content": "# Content...",
      "status": "success"
    },
    {
      "filename": "example2.docx",
      "original_format": "docx",
      "markdown_content": "# Content...",
      "status": "success"
    }
  ]
}
Error response example:
{
  "filename": "failed.pdf",
  "original_format": "pdf",
  "error": "Error message",
  "status": "error"
}
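A downstream step might consume the JSON response roughly as follows. This is a sketch rather than part of the plugin: `collect_markdown` is a hypothetical helper, and it assumes that failed files appear as entries in `results` with the shape of the error example above.

```python
import json


def collect_markdown(response_text: str):
    """Split a JSON response into converted documents and per-file errors.

    Returns (converted, failed): two dicts keyed by file name.
    """
    payload = json.loads(response_text)
    converted, failed = {}, {}
    for item in payload.get("results", []):
        name = item.get("filename", "<unknown>")
        if item.get("status") == "success":
            converted[name] = item.get("markdown_content", "")
        else:
            # Assumption: error entries carry an "error" message field.
            failed[name] = item.get("error", "unknown error")
    return converted, failed
```

Checking the per-file `status` rather than only the top-level one lets a workflow continue with the files that did convert.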
Blob Response
Text Response (Legacy)
[Markdown content of the file]
==================================================
File 1: example1.pdf
==================================================
[Markdown content of file 1]
==================================================
File 2: example2.docx
==================================================
[Markdown content of file 2]
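If a workflow still relies on the legacy text response, the multi-file layout above can be split back into per-file sections. The sketch below assumes each section starts with a rule of `=` characters, a `File N: name` line, and a second rule, exactly as in the example; `split_text_response` is a hypothetical helper, not something the plugin provides.

```python
import re

# Header block: rule line, "File N: <name>", rule line.
HEADER = re.compile(
    r"^={10,}\nFile \d+: (?P<name>.+)\n={10,}\n?",
    re.MULTILINE,
)


def split_text_response(text: str):
    """Return a list of (filename, markdown) pairs from a text response.

    A response without any header block is treated as a single file
    whose name is unknown (None).
    """
    matches = list(HEADER.finditer(text))
    if not matches:
        return [(None, text.strip())]
    sections = []
    for index, match in enumerate(matches):
        # Content runs until the next header block or the end of the text.
        end = matches[index + 1].start() if index + 1 < len(matches) else len(text)
        sections.append((match.group("name").strip(), text[match.end():end].strip()))
    return sections
```

Markdown content that itself contains the same three-line header pattern would confuse this parser, which is one reason to prefer the JSON response where possible.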
Please convert the attached files to Markdown format.
{@markitdown files=["document.pdf", "presentation.pptx"]}
Here's how to integrate the plugin into your Dify workflow:
Document Analysis Flow
Input: {files} -> MarkItDown Plugin -> LLM Analysis
Content Extraction Flow
Input: {files} -> MarkItDown Plugin -> Text Extraction -> Database Storage
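The final storage step of the Content Extraction Flow could look something like the sketch below. It is only an illustration: the `documents` table, its columns, and the `store_results` helper are hypothetical, SQLite stands in for whatever database the workflow actually uses, and the input is assumed to be the `results` list from the JSON response.

```python
import sqlite3


def store_results(conn: sqlite3.Connection, results) -> int:
    """Store successful conversions in a hypothetical `documents` table.

    Returns the number of rows written.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS documents ("
        "filename TEXT PRIMARY KEY, original_format TEXT, markdown TEXT)"
    )
    rows = [
        (item["filename"], item.get("original_format", ""), item["markdown_content"])
        for item in results
        if item.get("status") == "success"
    ]
    # The connection context manager commits on success, rolls back on error.
    with conn:
        conn.executemany("INSERT OR REPLACE INTO documents VALUES (?, ?, ?)", rows)
    return len(rows)
```

Using the file name as the primary key means re-running the flow on the same file overwrites the earlier row instead of duplicating it.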
For issues and feature requests, please create an issue in the repository or contact the plugin maintainer.
Note: This plugin is based on MarkItDown 0.0.2a1 and may receive updates as the base library evolves to its first non-alpha release.
Contact: evanchen@dify.ai