Skip to content

Latest commit

 

History

History
119 lines (67 loc) · 6.17 KB

File metadata and controls

119 lines (67 loc) · 6.17 KB
graph LR
    CLI_Entrypoint["CLI Entrypoint"]
    MarkItDown_Engine["MarkItDown Engine"]
    Converter_Registry_Dispatcher["Converter Registry & Dispatcher"]
    PlainTextConverter["PlainTextConverter"]
    PdfConverter["PdfConverter"]
    ImageConverter["ImageConverter"]
    DocumentIntelligenceConverter["DocumentIntelligenceConverter"]
    llm_caption["llm_caption"]
    DocumentConverterResult["DocumentConverterResult"]
    Unclassified["Unclassified"]
    CLI_Entrypoint -- "invokes" --> MarkItDown_Engine
    MarkItDown_Engine -- "registers and dispatches via" --> Converter_Registry_Dispatcher
    Converter_Registry_Dispatcher -- "dispatches to" --> PlainTextConverter
    Converter_Registry_Dispatcher -- "dispatches to" --> PdfConverter
    Converter_Registry_Dispatcher -- "dispatches to" --> ImageConverter
    Converter_Registry_Dispatcher -- "dispatches to" --> DocumentIntelligenceConverter
    ImageConverter -- "optionally uses" --> llm_caption
    PlainTextConverter -- "produces" --> DocumentConverterResult
    PdfConverter -- "produces" --> DocumentConverterResult
    ImageConverter -- "produces" --> DocumentConverterResult
    DocumentIntelligenceConverter -- "produces" --> DocumentConverterResult
    CLI_Entrypoint -- "receives output from" --> DocumentConverterResult
Loading

CodeBoardingDemoContact

Details

MarkItDown is a thin, command‑line‑driven document‑to‑markdown engine. The main() entry‑point parses user options, builds a StreamInfo hint object and instantiates the MarkItDown façade. The façade registers a prioritized list of converter objects (plain‑text, PDF, image, Azure Document‑Intelligence, etc.) in a registry. When a conversion request arrives, the registry walks the list, asks each converter whether it accepts the supplied stream, and dispatches to the first matching converter. Converters perform the core transformation and may invoke optional AI enrichment helpers – an OpenAI‑style LLM for image captioning (llm_caption) or Azure Document‑Intelligence for OCR‑rich documents. The resulting markdown is returned to the CLI, which writes it to stdout or a user‑specified file. This layered design (CLI → Engine → Registry → Converters → Optional AI) yields a clear, modular data‑flow that maps directly onto a compact flow‑graph with distinct visual boundaries for each architectural component.

CLI Entrypoint

Entry point that parses command‑line arguments, builds StreamInfo, creates MarkItDown engine, invokes conversion, and writes markdown output.

Related Classes/Methods:

MarkItDown Engine

Facade that holds the converter registry, registers built‑in converters (and optional plugins) on construction.

Related Classes/Methods:

Converter Registry & Dispatcher

Ordered list of ConverterRegistration objects; dispatches conversion request to the first converter that accepts the stream.

Related Classes/Methods:

PlainTextConverter

Handles plain‑text, JSON, and markdown files; implements accepts() and convert().

Related Classes/Methods:

PdfConverter

Handles PDF files; implements accepts() and convert().

Related Classes/Methods:

ImageConverter

Handles JPEG/PNG images; may add EXIF metadata and optional LLM caption.

Related Classes/Methods:

DocumentIntelligenceConverter

Uses Azure Document‑Intelligence service for OCR and layout extraction; also a converter.

Related Classes/Methods:

llm_caption

Helper that generates a natural‑language caption for an image via an OpenAI‑compatible LLM.

Related Classes/Methods:

DocumentConverterResult

Result object containing generated markdown and related metadata.

Related Classes/Methods: None

Unclassified

Component for all unclassified files and utility functions (Utility functions/External Libraries/Dependencies)

Related Classes/Methods: None