graph LR
CLI_Entrypoint["CLI Entrypoint"]
MarkItDown_Engine["MarkItDown Engine"]
Converter_Registry_Dispatcher["Converter Registry & Dispatcher"]
PlainTextConverter["PlainTextConverter"]
PdfConverter["PdfConverter"]
ImageConverter["ImageConverter"]
DocumentIntelligenceConverter["DocumentIntelligenceConverter"]
llm_caption["llm_caption"]
DocumentConverterResult["DocumentConverterResult"]
Unclassified["Unclassified"]
CLI_Entrypoint -- "invokes" --> MarkItDown_Engine
MarkItDown_Engine -- "registers and dispatches via" --> Converter_Registry_Dispatcher
Converter_Registry_Dispatcher -- "dispatches to" --> PlainTextConverter
Converter_Registry_Dispatcher -- "dispatches to" --> PdfConverter
Converter_Registry_Dispatcher -- "dispatches to" --> ImageConverter
Converter_Registry_Dispatcher -- "dispatches to" --> DocumentIntelligenceConverter
ImageConverter -- "optionally uses" --> llm_caption
PlainTextConverter -- "produces" --> DocumentConverterResult
PdfConverter -- "produces" --> DocumentConverterResult
ImageConverter -- "produces" --> DocumentConverterResult
DocumentIntelligenceConverter -- "produces" --> DocumentConverterResult
CLI_Entrypoint -- "receives output from" --> DocumentConverterResult
MarkItDown is a thin, command‑line‑driven document‑to‑markdown engine. The main() entry‑point parses user options, builds a StreamInfo hint object and instantiates the MarkItDown façade. The façade registers a prioritized list of converter objects (plain‑text, PDF, image, Azure Document‑Intelligence, etc.) in a registry. When a conversion request arrives, the registry walks the list, asks each converter whether it accepts the supplied stream, and dispatches to the first matching converter. Converters perform the core transformation and may invoke optional AI enrichment helpers – an OpenAI‑style LLM for image captioning (llm_caption) or Azure Document‑Intelligence for OCR‑rich documents. The resulting markdown is returned to the CLI, which writes it to stdout or a user‑specified file. This layered design (CLI → Engine → Registry → Converters → Optional AI) yields a clear, modular data‑flow that maps directly onto a compact flow‑graph with distinct visual boundaries for each architectural component.
Entry point that parses command‑line arguments, builds StreamInfo, creates MarkItDown engine, invokes conversion, and writes markdown output.
Related Classes/Methods:
Facade that holds the converter registry, registers built‑in converters (and optional plugins) on construction.
Related Classes/Methods:
Ordered list of ConverterRegistration objects; dispatches conversion request to the first converter that accepts the stream.
Related Classes/Methods:
Handles plain‑text, JSON, and markdown files; implements accepts() and convert().
Related Classes/Methods:
Handles PDF files; implements accepts() and convert().
Related Classes/Methods:
Handles JPEG/PNG images; may add EXIF metadata and optional LLM caption.
Related Classes/Methods:
Uses Azure Document‑Intelligence service for OCR and layout extraction; also a converter.
Related Classes/Methods:
Helper that generates a natural‑language caption for an image via an OpenAI‑compatible LLM.
Related Classes/Methods:
Result object containing generated markdown and related metadata.
Related Classes/Methods: None
Component for all unclassified files and utility functions (Utility functions/External Libraries/Dependencies)
Related Classes/Methods: None