Home

Since its initial release, PaddleOCR has gained widespread acclaim across academia, industry, and research communities, thanks to its cutting-edge algorithms and proven performance in real-world applications. It’s already powering popular open-source projects like Umi-OCR, OmniParser, MinerU, and RAGFlow, making it the go-to OCR toolkit for developers worldwide.

On January 29, 2026, PaddleOCR open-sourced the advanced and efficient document parsing model PaddleOCR-VL-1.5. PaddleOCR-VL-1.5 is a new iterative version of the PaddleOCR-VL series. Based on comprehensive optimization of the core capabilities of version 1.0, the model achieves 94.5% accuracy on the authoritative document parsing benchmark OmniDocBench v1.5, surpassing top global general-purpose large models and document parsing–specific models.

PaddleOCR-VL-1.5 innovatively supports irregular-shaped bounding box localization of document elements, enabling excellent performance in real-world application scenarios such as scanning, skew, warping, screen-photography, and complex illumination, achieving comprehensive SOTA performance. In addition, the model further integrates seal recognition and spotting tasks, with key metrics continuing to lead mainstream models.

You can use it online on the PaddleOCR official website or call the model API.

Major Features in PaddleOCR 3.x:

PaddleOCR-VL - Multilingual Document Parsing via a 0.9B VLM
The SOTA and resource-efficient model tailored for document parsing, that supports 109 languages and excels in recognizing complex elements (e.g., text, tables, formulas, and charts), while maintaining minimal resource consumption.
PP-OCRv5 — Universal Scene Text Recognition
Single model supports five text types (Simplified Chinese, Traditional Chinese, English, Japanese, and Pinyin) with 13% accuracy improvement. Solves multilingual mixed document recognition challenges.
PP-StructureV3 — Complex Document Parsing
Intelligently converts complex PDFs and document images into Markdown and JSON files that preserve original structure. Outperforms numerous commercial solutions in public benchmarks. Perfectly maintains document layout and hierarchical structure.
PP-ChatOCRv4 — Intelligent Information Extraction
Natively integrates ERNIE 4.5 to precisely extract key information from massive documents, with 15% accuracy improvement over previous generation. Makes documents "understand" your questions and provide accurate answers.

[!TIP]

On October 24, 2025, the PaddleOCR official website Beta version was launched, offering a more convenient online experience and large-scale PDF file parsing, as well as free API and MCP services. For more details, please visit the PaddleOCR official website.

In addition to providing an outstanding model library, PaddleOCR 3.0 also offers user-friendly tools covering model training, inference, and service deployment, so developers can rapidly bring AI applications to production.

You can Quick Start directly, find comprehensive documentation in the PaddleOCR Docs, get support via Github Issues, and explore our OCR courses on OCR courses on AIStudio.

Special Note: PaddleOCR 3.x introduces several significant interface changes. Old code written based on PaddleOCR 2.x is likely incompatible with PaddleOCR 3.x. Please ensure that the documentation you are reading matches the version of PaddleOCR you are using. This document explains the reasons for the upgrade and the major changes from PaddleOCR 2.x to 3.x.

🔄 Quick Overview of Execution Results¶

PP-OCRv5¶

PP-OCRv5 Demo

PP-StructureV3¶

PP-StructureV3 Demo

PaddleOCR-VL¶