What makes LLMWhisperer a more scalable solution compared to self-hosted open source OCR models?

Self-hosted open source OCR models, especially LLM-based ones, require significant infrastructure to scale. Developers must manage GPU resources, model serving, batching, and error handling. LLMWhisperer OCR is offered as a managed API, handling all the complexity of scalability, reliability, and uptime behind a single endpoint. This allows developers to integrate a high-performance OCR capability without the operational overhead of maintaining their own inference infrastructure, making it suitable for processing millions of documents.

What are some popular open source OCR tools available in 2025?

Some of the most widely used open source OCR tools in 2025 include Tesseract, PaddleOCR, EasyOCR, Surya, and docTR. These tools are supported by active communities, and each has strengths in different document types and workflows. For example, Tesseract is known for its stability and language support, while PaddleOCR performs well with complex layouts and multilingual documents. If you are looking for enterprise-grade OCR software, try LLMWhisperer.

How are modern open source OCR models different from traditional ones?

Modern open source OCR models, especially those powered by large language models (LLMs) such as olmOCR and Qwen2.5-VL, go beyond simple character recognition. They can interpret document context, handle irregular or complex layouts, preserve tables and sections, and even recognize handwriting. In contrast, traditional open source OCR tools typically excel at clean, structured text but may struggle with messy documents or handwritten input.

What is LLMWhisperer OCR and how does it enhance open source OCR capabilities?

LLMWhisperer OCR is a modern OCR solution designed and managed by Unstract to deliver enterprise-grade document processing. Building on the strengths of open source OCR, LLMWhisperer offers production-ready infrastructure, scalability, advanced compliance features, and support for complex workflows. It combines the flexibility of open source OCR with the robustness and reliability required by large organizations.

How is LLMWhisperer OCR better than other open OCR models or tools?

LLMWhisperer OCR stands out from other open OCR options by providing a scalable and secure pipeline ready for enterprise deployment. Unlike most open source OCR tools or models, LLMWhisperer handles high document volumes, offers dedicated uptime and support, includes compliance capabilities, and leverages the latest LLM-based approaches to ensure semantic accuracy even for documents with irregular layouts or handwriting. This makes it suitable for mission-critical and high-volume environments where traditional open source OCR solutions might be insufficient.

Last updated on September 19, 2025
Nuno Bispo

Best Open-Source OCR Tools in 2025: A Comparison

Introduction

Optical Character Recognition (OCR) is the backbone of modern digital workflows. From scanning historical archives and automating data entry to enabling AI-powered document analysis, OCR transforms static images and PDFs into actionable, searchable text. As businesses and researchers continue to digitize processes, the demand for accurate and flexible OCR solutions is only growing.

Open-source OCR software is not only free to use but also offers transparency, community-driven improvements, and the flexibility to customize solutions for specific needs. It has become a powerful option for developers, researchers, and teams who want control over their technology stack without relying on proprietary platforms.

This guide is written for developers, researchers, small teams, and even enterprises looking to experiment with or adopt open-source OCR in their workflows. We’ll cover both traditional OCR engines and modern LLM-powered approaches, giving you a practical sense of their strengths and limitations.

What is Open-Source OCR?

Optical Character Recognition (OCR) is the process of converting text contained in images, scanned documents, or PDFs into machine-readable text. It’s the technology that makes it possible to digitize books, extract information from invoices, process forms automatically, or enable search across large document collections. In short, OCR bridges the gap between the physical and digital worlds.

When we talk about open-source OCR, we mean software that is freely available under open-source licenses. These tools are cost-free, transparent in how they work, and often highly customizable. Developers can adapt them to specific use cases, researchers can experiment with algorithms, and organizations can integrate them into their workflows without being locked into proprietary solutions.

That said, open-source OCR comes with its own set of challenges. Setup and configuration can be more complex compared to polished commercial tools. Maintenance often depends on community activity, which can vary widely across projects. And performance may differ significantly between tools, with some excelling at structured text while struggling with handwriting or complex document layouts.

Top Open-Source OCR Libraries Selected for Evaluation:

Tesseract

PaddleOCR

Docling

EasyOCR

Surya OCR

Mistral OCR (LLM-based OCR)

olmOCR (LLM-based OCR)

Qwen 2.5-VL (LLM-based OCR)

How We Evaluated the OCR Tools

To make this guide practical, we focused on how each OCR tool performs against real-world document challenges rather than running exhaustive benchmarks.

Our evaluation centered on three key criteria:

Tables – Can the tool correctly recognize and preserve tabular structures?
Forms – How well does it handle checkboxes, radio buttons, and handwriting elements commonly found in forms?
Complex layouts – Does it cope with multi-column text, mixed fonts, and non-standard document designs?

In terms of approach, we’re not re-testing every tool from scratch. For OCR engines we’ve already explored in a previous article, we will provide a summary of past results. For the others, we conducted fresh evaluations.

It’s worth noting that this is not intended as a comprehensive benchmark study. Instead, our goal is to deliver a practical guide that helps developers, researchers, and teams quickly understand which open-source OCR tools might fit their use cases and what trade-offs they should expect.
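For readers who want a rough number to go with the qualitative comparison, a simple similarity score between an OCR output and a hand-typed reference can be computed with Python's standard library. This is a minimal sketch and not the method used in this guide; the function name `similarity` and the sample strings are illustrative assumptions.

```python
from difflib import SequenceMatcher


def similarity(ocr_text: str, reference: str) -> float:
    """Return a 0..1 similarity ratio between OCR output and a reference text."""
    # Collapse whitespace so line-break differences do not dominate the score
    a = " ".join(ocr_text.split())
    b = " ".join(reference.split())
    return SequenceMatcher(None, a, b).ratio()


reference = "Deductible, then 40%"
print(similarity("Deductible, then 40%", reference))  # exact match
print(similarity("Deductible then 47%", reference))   # misread digit and missing comma
```

A score like this only measures character overlap. It says nothing about whether table structure or reading order was preserved, so it complements the criteria above rather than replacing them.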

LLMWhisperer: The Best OCR for LLMs

If your solution uses Large Language Models (LLMs) to process and extract document data:

LLMs are powerful, but their output is only as good as the input you provide. Documents can be a mess: widely varying formats and encodings, scans of images, numbered sections, and complex tables.

LLMWhisperer is a technology that presents data from complex documents to LLMs in a way they can best understand.

If you want to quickly take it for a test drive, you can check out our free playground.

Traditional Open-Source OCR Tools

Traditional open-source OCR engines form the base of many document digitization workflows. These tools have been battle-tested across countless projects, from academic research to enterprise automation, and remain some of the most reliable free options available. While they may lack the adaptability of newer LLM-based approaches, they excel in stability, performance, and community support.

In this section, we’ll first provide a brief recap of the tools we’ve already evaluated in detail, before moving on to highlight new additions worth exploring.

Previously Reviewed OCR Engines

Tesseract
→ Recap: Learn more about the pros and cons of using Tesseract OCR

Tesseract remains one of the most widely used and battle-tested OCR engines. Backed by Google, it has a long history and an active community. Its strength lies in handling clean, high-quality images, making it a dependable choice for straightforward OCR tasks. Tesseract supports more than 100 languages and even allows custom training for domain-specific vocabularies.

Strengths:

Solid accuracy with clean, well-structured documents.
Extensive language support.
Custom training options to improve recognition.

Weaknesses:

Struggles with complex layouts and formatting-heavy documents.
Poor performance on handwritten text.

Explore the full comparison: LLMWhisperer vs. Tesseract

PaddleOCR
→ Recap: Learn more about the pros and cons of using PaddleOCR

Developed by Baidu, PaddleOCR has quickly gained traction as a robust open-source alternative for multilingual and layout-aware OCR. It performs significantly better than Tesseract when dealing with multi-language documents and complex layouts such as tables and mixed formatting. It’s also optimized for speed, making it a practical option for real-time applications.

Strengths:

Excels at multi-language recognition.
Handles complex layouts better than Tesseract.
Lightweight and fast, suitable for real-time use cases.

Weaknesses:

Structured data extraction is less advanced compared to enterprise cloud services.

See how LLMWhisperer’s OCR approach stacks up against PaddleOCR

Docling
→ Recap: Learn more about the pros and cons of using Docling

IBM’s Docling is designed for converting documents into lightweight, markdown-like formats. It’s particularly effective when dealing with digital-born PDFs where the layout is relatively simple. Its strength lies in its ability to output clean, human-readable markdown, which makes it appealing for workflows that rely on lightweight text processing. However, its markdown-first design limits its ability to preserve complex layouts or extract structured data, and it struggles with scanned or handwritten inputs.

See the side-by-side comparison of LLMWhisperer and Docling

Strengths:

Simple markdown output for digital documents.
Lightweight integration, easy to adopt for basic workflows.

Weaknesses:

Markdown syntax limits complex layout preservation.
Struggles with non-digital inputs (scans, handwriting).

Open-Source OCR Comparison: Updated with New Tools

EasyOCR

EasyOCR is one of the most widely used open-source OCR libraries, with support for 80+ languages. Built on PyTorch, it’s straightforward to install and use; developers can get started with just a few lines of Python. It works well for quick text extraction tasks and provides decent results on scanned documents and images with standard fonts.

Installation:

pip install easyocr pypdfium2

If on Windows, torch and torchvision must be installed first, as per PyTorch documentation.

Usage:

import easyocr
import pypdfium2 as pdfium
import sys

# Create a reader object (multiple languages can be specified, and here it will run on GPU)
reader = easyocr.Reader(['en'], gpu=True)

# Process the pdf file
def process_pdf(pdf_path):
    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)

    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()

        # Save image
        file_name = f"docs/output/page_{i+1}.png"
        image.save(file_name)
        
        # Read the text from the image
        print(f"Processing page {i+1}...")
        result = reader.readtext(file_name, detail=0, paragraph=True)
        
        # Print the result
        print(f"Page {i+1} OCR Results:")
        print(result)

    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])

This script demonstrates how to run EasyOCR on PDF files by first converting each page into an image with pypdfium2, then passing the images to EasyOCR for text extraction. It supports multiple languages (configurable at initialization), runs on GPU by default, and outputs recognized text per page.
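With `detail=0`, the script above discards the confidence scores that EasyOCR produces. When `readtext` is called with its default `detail=1` (and without `paragraph=True`), each result is a `(bounding_box, text, confidence)` tuple, which makes it possible to flag weak recognitions for review. The following is a minimal sketch: `split_by_confidence` and the 0.5 threshold are illustrative assumptions, and the sample data is made up.

```python
def split_by_confidence(results, min_confidence=0.5):
    """Split EasyOCR-style (bbox, text, confidence) tuples into kept and flagged text."""
    kept, flagged = [], []
    for _bbox, text, confidence in results:
        (kept if confidence >= min_confidence else flagged).append(text)
    return kept, flagged


# Made-up sample in the shape returned by reader.readtext(file_name)
sample = [
    ([[0, 0], [90, 0], [90, 20], [0, 20]], "Annual deductible", 0.93),
    ([[0, 30], [90, 30], [90, 50], [0, 50]], "Dut orcconmutmum", 0.12),
]
kept, flagged = split_by_confidence(sample)
print("kept:", kept)
print("flagged for review:", flagged)
```

The right threshold depends on the documents being processed, so treat the default here as a starting point only.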

Surya OCR

Surya is a modern, open-source OCR system designed for document layout analysis and advanced text extraction. It supports 90+ languages and is built to handle complex documents with tables, multi-column layouts, and varied formatting. Unlike simpler OCR libraries, Surya emphasizes structural understanding of documents, making it more useful for use cases where layout preservation is important.

Installation:

pip install surya-ocr pypdfium2

Usage:

from surya.foundation import FoundationPredictor
from surya.recognition import RecognitionPredictor
from surya.detection import DetectionPredictor
import pypdfium2 as pdfium
import sys


# Process the pdf file
def process_pdf(pdf_path):
    # Create the predictors once so the models are not reloaded for every page
    foundation_predictor = FoundationPredictor()
    recognition_predictor = RecognitionPredictor(foundation_predictor)
    detection_predictor = DetectionPredictor()

    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)

    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()

        print(f"Processing page {i+1}...")

        predictions = recognition_predictor([image], det_predictor=detection_predictor)
        print(f"Page {i+1} OCR Results:")
        for prediction in predictions:
            for text_lines in prediction.text_lines:
                print(text_lines.text)

    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])

This script shows how to apply Surya OCR to PDF documents by rendering each page into an image with pypdfium2 and then passing it through Surya’s Foundation, Recognition, and Detection predictors. The pipeline detects text regions and then recognizes the textual content line by line.
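Because the recognized lines are printed one after another, cells from the same table row end up on separate lines. If each line also carries a bounding box, lines can be regrouped into rows by vertical position. This sketch assumes an `[x1, y1, x2, y2]` box per line, which should be verified against the Surya version you install. `group_into_rows` and the tolerance value are hypothetical, and the sample data is made up.

```python
def group_into_rows(lines, y_tolerance=5):
    """Group (text, [x1, y1, x2, y2]) pairs into rows of left-to-right cells."""
    rows = []
    # Sort by the vertical center of each box so rows come out top to bottom
    for text, bbox in sorted(lines, key=lambda item: (item[1][1] + item[1][3]) / 2):
        center_y = (bbox[1] + bbox[3]) / 2
        if rows and abs(rows[-1]["y"] - center_y) <= y_tolerance:
            rows[-1]["cells"].append((bbox[0], text))
        else:
            rows.append({"y": center_y, "cells": [(bbox[0], text)]})
    # Within each row, order cells by their left edge
    return [[text for _x, text in sorted(row["cells"])] for row in rows]


# Made-up sample: three cells on one row, two on the next
sample_lines = [
    ("$60 copay", [300, 100, 380, 112]),
    ("Urgent care", [20, 101, 110, 113]),
    ("Deductible, then 40%", [420, 99, 560, 111]),
    ("Emergency care", [20, 130, 130, 142]),
    ("Deductible, then 30%", [300, 131, 440, 143]),
]
for row in group_into_rows(sample_lines):
    print(" | ".join(row))
```

This simple grouping breaks down on skewed scans or cells that wrap onto several lines, so it is a starting point rather than a full table reconstruction.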

docTR OCR

docTR (Document Text Recognition) is a deep-learning–based OCR library that integrates text detection and recognition into a single pipeline. Built on TensorFlow and PyTorch, it is designed for handling scanned documents, multi-column layouts, and mixed formatting, making it suitable for more complex document processing tasks than traditional OCR engines.

Installation:

pip install python-doctr

Usage:

from doctr.io import DocumentFile
from doctr.models import ocr_predictor
import sys

# Process the pdf file
def process_pdf(pdf_path):
    model = ocr_predictor(pretrained=True)
    # PDF
    doc = DocumentFile.from_pdf(pdf_path)
    # Analyze
    result = model(doc)
    # Export the result
    json_output = result.export()
    for page in json_output['pages']:
        for block in page['blocks']:
            for line in block['lines']:
                for word in line['words']:
                    print(word['value'])

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])

This script demonstrates how to use docTR for PDF OCR by loading the file with DocumentFile, running it through a pretrained ocr_predictor, and exporting the recognized content as structured JSON. The pipeline captures text at multiple levels (blocks, lines, words), allowing granular access to document content. In this example, the recognized words are iterated and printed directly.
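Printing one word per line discards the line structure that the export already contains. As a small variation, the words of each line can be joined back together. This is a sketch that relies only on the `pages`, `blocks`, `lines`, `words`, and `value` keys used in the script above; `export_to_lines` is a hypothetical helper and the sample dictionary is made up.

```python
def export_to_lines(exported):
    """Rebuild text lines from a docTR-style export dictionary."""
    lines = []
    for page in exported["pages"]:
        for block in page["blocks"]:
            for line in block["lines"]:
                lines.append(" ".join(word["value"] for word in line["words"]))
    return lines


# Made-up sample in the same shape the script iterates over
sample = {
    "pages": [{"blocks": [{"lines": [
        {"words": [{"value": "Annual"}, {"value": "deductible"}]},
        {"words": [{"value": "$1,500"}]},
    ]}]}]
}
print("\n".join(export_to_lines(sample)))
```

In the script above, this would be called as `export_to_lines(result.export())`.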

Modern LLM-Based Open-Source OCR Tools

Large Language Models (LLMs) are beginning to reshape how OCR is performed. Unlike traditional OCR engines, which focus on character recognition, LLM-based tools can interpret context, adapt to irregular layouts, and even infer structure when documents don’t follow a standard pattern. These approaches are still new and come with trade-offs, such as higher resource demands and the risk of hallucinations, but they represent a significant step forward in OCR capabilities.

Recap of Past Evaluations

Mistral OCR

Mistral OCR is a modern LLM-based open-source OCR tool designed to extract text from documents while interpreting context and layout. Unlike traditional OCR engines, it leverages large language models to understand structure and content, providing better results on irregular layouts and mixed-format documents. However, as with most LLM-based systems, it has limitations, especially with structured data, handwriting, and low-quality scans.

When to Use Mistral OCR:

✅ For clean, digital documents (e.g., basic PDFs with standard fonts and layout).
⚡ When you need fast Markdown output without additional processing.
Suitable for simple extraction tasks where layout fidelity and field grouping aren’t critical.

When not to use Mistral OCR:

In document-heavy domains like logistics, legal, healthcare, and finance where accuracy and data structure are essential.
For scanned, skewed, or handwritten documents requiring layout-aware parsing.
When tables, checkboxes, or multi-format inputs (PDF, DOCX, XLSX) are involved.
In automation pipelines where structured or schema-mapped output is required.

Mistral OCR vs. LLMWhisperer OCR
→ Learn more about the pros and cons of using Mistral OCR

Feature / Document Type	Mistral OCR	LLMWhisperer (Unstract)
Layout Preservation	⚠️ Collapses layout on scans or OCR noise	✅ Reconstructs layout even in skewed scans
Table Extraction	⚠️ Struggles with structured tables	✅ Schema-aware tabular output
Checkboxes / Radios	⚠️ Detected but lacks structure	✅ Parsed with semantic structure
Handwriting Support	⚠️ Very limited, poor accuracy	✅ Parses mixed print + handwriting
Document Boundaries	❌ Flattened, lacks clear sections	✅ Maintains headers, sections, totals
Field Deduplication	❌ Redundant values repeated excessively	✅ Clean de-duplication and semantic grouping
Numerical Accuracy	⚠️ Often ambiguous due to layout collapse	✅ Preserves precision across columns
Image/Scan Robustness	⚠️ Struggles with low quality or skew	✅ Layout and data still reconstructed
Excel File Support	❌ Not supported (API error)	✅ Reads .xlsx directly, extracts data
Hallucination Risk	❌ High — spurious or repeated data	✅ Controlled, factual extraction

New additions to the evaluation:

In the following examples, both olmOCR and Qwen2.5-VL will be executed using the Hugging Face Transformers library. This ensures standardized model loading, tokenization, and inference across different architectures, while making it easy to swap between models or run them locally/with GPU acceleration.

olmOCR

olmOCR is an open-source OCR system developed by Allen AI, built upon the Qwen-2-VL 7B vision-language model. It specializes in converting rasterized PDFs and scanned documents into clean, structured text, preserving layout, tables, equations, and even handwriting. Unlike many proprietary solutions, olmOCR is fully open-source, offering transparency in training data, code, and methodologies.

Installation:

pip install transformers torch pypdfium2 pillow

Usage:

import base64
import sys
import pypdfium2 as pdfium
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

# Process the pdf file
def process_pdf(pdf_path):
    # Load the model and processor once, before the page loop
    model_id = "allenai/olmOCR-7B-0725"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda").eval()

    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)

    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()

        # Save image (the docs/output directory must already exist)
        file_name = f"docs/output/page_{i+1}.png"
        image.save(file_name)

        # Encode image to base64
        with open(file_name, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

        # Define the prompt
        PROMPT = """
                Attached is one page of a document that you must process. 
                Just return the plain text representation of this document as if you were reading it naturally. 
                Convert equations to LaTeX and tables to markdown.
                Return your output as markdown
            """

        # Define the messages
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image": encoded_string,
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ]

        # Apply the chat template
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt"
        ).to(model.device)
        # Generate the output
        output_ids = model.generate(**inputs, max_new_tokens=1000)
        generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, output_ids)]
        # Decode the output
        output_text = processor.batch_decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        # Print the output
        print(output_text)

    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])

This script showcases how to use olmOCR-7B by converting PDF pages into images, encoding them in Base64, and sending them with a natural language prompt to the multimodal model. Unlike traditional OCR tools, olmOCR interprets the document contextually, returning plain text output in Markdown (defined in this prompt). Running on GPU via Hugging Face’s transformers, it leverages the AutoProcessor and AutoModelForImageTextToText pipeline to generate rich, structured text representations directly from scanned PDFs.

Qwen2.5-VL

Qwen2.5-VL is a state-of-the-art vision-language model developed by Alibaba Group. It integrates advanced visual perception with deep language understanding, enabling it to process and interpret images and documents at native resolutions. The model is designed to handle complex document layouts, multi-language text, and various orientations, making it highly effective for OCR tasks.

Installation:

pip install transformers torch pypdfium2 pillow

Usage:

import base64
import sys
import pypdfium2 as pdfium
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

# Process the pdf file
def process_pdf(pdf_path):
    # Load the model and processor once, before the page loop
    model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda").eval()

    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)

    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()

        # Save image (the docs/output directory must already exist)
        file_name = f"docs/output/page_{i+1}.png"
        image.save(file_name)

        # Encode image to base64
        with open(file_name, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')

        # Define the prompt
        PROMPT = """
                Attached is one page of a document that you must process. 
                Just return the plain text representation of this document as if you were reading it naturally. 
                Convert equations to LaTeX and tables to markdown.
                Return your output as markdown
            """

        # Define the messages
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image": encoded_string,
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ]

        # Apply the chat template
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt"
        ).to(model.device)

        # Generate the output
        output_ids = model.generate(**inputs, max_new_tokens=1000)
        generated_ids = [output_ids[len(input_ids):] for input_ids, output_ids in zip(inputs.input_ids, output_ids)]
        # Decode the output
        output_text = processor.batch_decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        # Print the output
        print(output_text)

    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])

This script demonstrates how to use Qwen2.5-VL-7B-Instruct via Hugging Face Transformers for document OCR and structured extraction. Each page of a PDF is rendered to an image with pypdfium2, encoded to base64, and passed along with a text prompt that instructs the model to output markdown-formatted text.
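Since the prompt asks the model to return tables as Markdown, downstream code usually needs to turn that Markdown back into rows. The following is a minimal sketch of such a step. `parse_markdown_table` is a hypothetical helper, the sample response is made up, and the parser deliberately ignores edge cases such as escaped pipe characters inside cells.

```python
def parse_markdown_table(markdown: str):
    """Extract rows from pipe-delimited Markdown table lines in a model response."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if len(line) < 2 or not (line.startswith("|") and line.endswith("|")):
            continue  # not a table line
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the header separator row, e.g. |---|:---:|
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    return rows


# Made-up model response containing one small table
response = """Plan summary

| Service | In-network | Out-of-network |
|---|---|---|
| Urgent care | $60 copay | Deductible, then 60% |
"""
for row in parse_markdown_table(response):
    print(row)
```

With the scripts above, the input would be the decoded string in `output_text[0]`. Because model output is not guaranteed to be well-formed Markdown, it is worth checking that every row has the same number of cells before using the result.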

The Role of LLMs in OCR

Large Language Models bring a fundamentally different approach to OCR. Instead of only recognizing characters, they can interpret structure, context, and intent, making them especially effective for documents with irregular layouts, multi-column text, or a mix of tables, handwriting, and images. In essence, LLMs can “read” documents more like humans do, reconstructing meaning rather than just transcribing shapes.

Advantages

Context-aware extraction – LLMs understand not just words, but how they relate, enabling better handling of tables, forms, and multi-modal inputs.
Flexible layouts – Work well with semi-structured or messy documents where traditional OCR struggles.
Beyond text – Can interpret metadata, infer relationships, and sometimes even detect errors or missing content.

Drawbacks

But these strengths come with real challenges:

Hallucinations – LLMs may invent words, numbers, or structures not present in the source. For example, an invoice total might be “corrected” incorrectly because the model inferred a pattern.
Resource intensive – Running LLM-based OCR often requires GPUs, large memory, and careful optimization, making it harder to deploy at scale.
Unpredictable outputs – Unlike traditional OCR, which is deterministic, LLMs may produce slightly different results for the same input. This is problematic in compliance-heavy industries.
Maintenance complexity – Fine-tuning or prompting strategies may be needed to keep accuracy consistent across diverse document types.
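The unpredictability point can be checked empirically before committing to a tool: run the same page through the OCR function several times and count how many distinct outputs come back. This is a minimal sketch in which `distinct_outputs` is a hypothetical helper and the two stand-in functions simulate a stable and an unstable engine; in practice `run_ocr` would wrap a real model call.

```python
import hashlib
from typing import Callable


def distinct_outputs(run_ocr: Callable[[], str], runs: int = 3) -> int:
    """Call an OCR function several times and count distinct normalized outputs."""
    digests = set()
    for _ in range(runs):
        # Normalize whitespace so trivial spacing differences are not counted
        normalized = " ".join(run_ocr().split())
        digests.add(hashlib.sha256(normalized.encode("utf-8")).hexdigest())
    return len(digests)


# Stand-ins for real engines: one stable, one that changes a value on the third run
unstable_results = iter(["Total: $1,500", "Total: $1,500", "Total: $1,800"])
print(distinct_outputs(lambda: "Total: $1,500"))
print(distinct_outputs(lambda: next(unstable_results)))
```

A count above 1 means the output varies between runs and needs either tighter decoding settings or a validation step downstream.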

When to Use LLM/AI-based OCR (and When Not To)

LLM-based OCR is best suited for R&D, experimental projects, or innovation-driven use cases where flexibility and interpretive power matter more than strict reproducibility. It’s a great choice when documents are highly varied or contain unstructured information.

However, for enterprise-scale, compliance-driven, or mission-critical workflows, relying solely on LLMs is risky. In those cases, a hybrid approach, using traditional OCR for core text extraction, with LLMs layered on top for context-aware interpretation, often strikes the best balance.
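The hybrid approach can be sketched as a small pipeline: a deterministic engine extracts the text, an LLM structures it into fields, and each field value is then checked against the extracted text. This is an illustration under stated assumptions, not a production design. `hybrid_extract`, the two callables, and the field names are all hypothetical, and the stand-in functions replace real OCR and LLM calls.

```python
from typing import Callable, Dict


def hybrid_extract(
    page_image,
    ocr_engine: Callable[[object], str],
    llm_interpreter: Callable[[str], Dict[str, str]],
) -> Dict[str, dict]:
    """Run OCR first, let an LLM structure the text, then verify each value."""
    raw_text = ocr_engine(page_image)    # deterministic text extraction
    fields = llm_interpreter(raw_text)   # context-aware interpretation
    # Flag whether each value literally occurs in the OCR text
    return {
        name: {"value": value, "grounded": value in raw_text}
        for name, value in fields.items()
    }


# Stand-ins for real engines (made-up data)
def fake_ocr(_image):
    return "Urgent care $60 copay Deductible, then 40%"


def fake_llm(_text):
    return {"urgent_care_copay": "$60", "annual_deductible": "$1,500"}


result = hybrid_extract(None, fake_ocr, fake_llm)
for name, info in result.items():
    print(name, info)
```

Values that are not grounded in the OCR text, such as the invented deductible in this example, can be routed to manual review instead of being trusted automatically.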

LLMWhisperer: Best OCR for PDF Checkbox Extraction

PDF forms have checkboxes and radio buttons that can be filled out by the user. These form elements are used to collect data from the user. In this video, we will show how to extract these form elements using LLMWhisperer in a way that LLMs can understand.

Best Open-Source OCR: Results & Insights

For evaluating and comparing the different open-source OCR tools, we selected a set of test documents designed to reflect real-world challenges and common use cases.

Insurance Plan – Complex Tables

Tests the ability to correctly extract structured data from multi-row, multi-column tables with nested details.

Download sample →

(Page 2 not relevant for this test, shortened for brevity in the outputs)

EasyOCR

Processing page 1... 
Page 1 OCR Results: 
['Premera Blue Cross Preferred Gold 1500 Alaska plan for Indivlduals end famllles Slarg QaleJanuani 1,2023', 'PREMERA blul cmoby ULur ALAbr Atdtdelieeuriie', 'YouhatcucoeatoteLajqraDur J Sckci NcncnrdterjbonalBl Cmas Wccane plondcr nCeol PKJ:c [eict :Dene neat panctor Impolart plan andretwolk Intormallon', 'Loeluddat', 'a cedma', 'Laneaneaaldporau', 'dendb', 'Pu caenjz APc| Famk - %mnanoui Ancum CUPivanc Oudcirtobemx Tcne Gcto Gcan Famk - %mnanaudIrTeteonvi', '57 enn', '63027', '8om', 'Dut orcconmutmum', '63uo', 'Acon', 'A coannt', 'Feudes', 'Ambuatxymaban Eantea DlcVatr', 'Cteitaeti', 'Icoleten Zur SD)coly St0 coaly Eoconi', 'Dencnihit 772', 'drtchloein 3T4', 'Deranzd FCPomie Sicsomoum', 'Drrlclin aa Ihtr 4J1 Drrlclin aa Ihtr 4J1 Drrlclin aa Ihtr 4J1', 'Deqctela Enen GTt Drouctlo Ehtn LTe Drouctlo Ehtn LTe', 'Urceni CC eunalmancuecn: HEMs PCY, Cum 1772-', 'drtcr', '3u4anot54o4', 'EMttacnc Ein', 'Deouibl tenzot Deouibl tenzot', 'Drrlclina Ihtr 371 Buctja Ihr', 'Drouctlo Ehtn 37 Drouctlo Ehtn LTe', 'embuance [aranonalon atoomunji', 'rcailtietale', 'Icoleten Zur', 'LELLCD Iner 44', 'Drouctlo Ehtn LTe', 'Cinam[EAL', 'Deouibl tren?o', 'DOCL ', 'DA Cj.l', 'ndnamoomonn', 'Prentdrdonraulcam', 'Icoleten Zur', 'Mencnihit 47x', 'DECUctEio Fhtn LTe', 'Icpaltrt Lai aniberthdenaeal Gtn = E4 Indudro Troaikrt hEcatmrtlochigalhaalh', 'Uten Zrr', 'SLocool', 'Mencn', 'DtuctEle e drtchloeirn TE', 'Icoleten Zur', 'Mencnihit 47x', 'Cteitaeti', 'Ichleten', 'Deofin FtfI', 'Araodrnonunt', 'Prelered Ener', '515cooly', '315 Cpa;', '515Cooz', 'RGti Epcral 37-d2/supdly', 'Prleied Erand', '525Cooi', '545 Cpa;', '525Cooz', 'Vallomo 70-d2/Suddy', 'Rorrpretcntd Cnqa', 'Ichln', 'Con', 'Ro coent', 'S1camy', 'FidffhiR [FAn I', 'Mencnihit 47x', 'Deofin Foet ETF', 'Onols', 'Aebnenaencen andoato', 'roalktt (chablezacn 3007+ FCY Pnuacd sceth cc_lal maz Enedy 25VEECTEMpCY Durhemedc', 'Deouibl tenzot', 'Mencnihit 47x', 'drtchloeirn TE', 'Qd Emm?', 'SL] ccpan', 'Mencnihit 47x', 'drtchloeirn TE', 'Ichl_', 
'drtcr Forn UTI', 'Voxqon54r', 'Tccide Majtq IC Cajncba slndzdlnsorc', 'Icoleten Zur', 'Mencnihit 47x', 'drtchloeirn TE', 'WeomeananCMOCoR CPaT', 'Icoleten Zur', 'LOedo', 'RMci4', 'Rucljo', 'Dtuctble', 'Aratuaaerelnatd entkoet', '33nngs', 'Corttunn Aae | SD)coly', 'Mencnihit 47x', 'DECUctEio Fhtn LTe drtctoeirn LT 531 cooy', 'Thit 47 den4a', 'Fethaeyral Uaand Yra Beeam 1PCY ndim enoal EXcABir Carocee_PcfOmian POIf @tade ICIEeser accOicrcisert RuotJu20 tamerdkna', 'Corttunn', 'CovdddintuI', 'Coerdinhull', 'bentat Petcu Qbomacr', 'Coittuni', 'Kucllja Ihn 3J*', 'Drouctlo Ehtn 37', 'Cnnodor-a melcalk', 'Icoleten Borl', 'mecnnchit 57x', 'DECUctEio encn 579 A coannt', 'LECID Un DoL', 'Muanc', 'Conrdnn', 'Acon', 'Brcecmeo odt Hettrmennlnetn inducrg ELbalanco CEcre', 'Eoconi', 'Acon', 'Ro coent', 'LULCi', 'Clid Ihtr 4J7', 'Dtuctble enen', 'Thr deiecnonecoles Wnlaonnt', 'Jij] D-15-_l'] 

Processing page 2... 
Page 2 OCR Results: 
...

Surya OCR

Processing page 1...
Page 1 OCR Results:
Premera Blue Cross Preferred Gold 1500
PREMERA |
Alaska plan for individuals and families
BLUE CROSS BLUE SHIELD OF ALASKA
Start date January 1, 2023
dent Licenses of the Else Cross New Yorkid Jacobian
You have access to the Legacy and Dental Select Network and the national Blue
Legacy and Dental
Non-preferred providers
Non-participating providera
Cross Blue Shield BlueCard(r) provider network. Please refer to the next page for
ot network
important plan and network information.
Annual deductible
Per calendar year (PCY)
$1,500
$3,000
$3,000
Family = 2x individual
Amount you pay after your deductible is met
30%
4000
60%
Out-of-pockat maximum
Includes deductible, copays
Family = 2x individual (in-network only)
$6,300
Not covered
Not covered
10 essential health benefits
1 Ambulatory patient services
Outpatient services
Deductible, then 30%
Deductible, then 30%
Deductible, then 30%
Office visits
Designated PCP office visit
$30 copay
Deductible then 47%
Deductible then 60%
Specialist office visit
$60 copay
Reductible then 40%
Deductible, then 60%
Urgent care
$60 copay
Deductible, then 40%
Deductible, then 60%
Spinal manipulation: 12 visits PCY;
$30 copav
Deductible then 47%
Deductible then 60%
Acupuncture: 12 visits PCY
2 Emergency services
Emergency care
Deductible, then 30%
Deductible then 30%
Deductible then 30%
Ambulance transportation (air and ground)
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
3 Hospitalization
Inpatient services
Deductible then 30%
Deductible, then 40%
Deductible, then 60%
Organ and tissue transplants, inpatient
Deductible, then 30%
Not covered
Not covered
4 Meternity and newborn care
Prenatal and postnatal care
Deductible then 30%
Deductible then 47%
Deductible, then 60%
Inpatient delivery and services
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
5 Mental heelth and substance use
Office visit
$60 copay
Deductible, then 40%
Deductible, then 60%
disorder services, including
Inpatient hospital: mental/behavioral health
Deductible then 30%
Deductible then 47%
Deductible then 60%
behavioral health treatment
Deductible, then 30%
Outpatient services
Deductible then 47%
Deductible then 60%
6 Prescription drugs
Preferred generic
$15 copay
$15 copay
$15 copay
Retail/Specialty: 30-day supply
Preferred brand
$45 copay
$45 copay
$45 copav
Mail order: 90-day supply
Non-preferred drugs
Deductible then 50%
Not covered
Not covered
Deductible then 40%
Deductible then 47%
Reductible then 40%
Specialty
Drug list
Inpatient rehabilitation: 30 days PCY
7 Rehabilitative and habilitative
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
services and devices
Physical, speech, occupational, massage
Deductible, then $60 copey
Deductible then 47%
Deductible then 60%
therapy: 25 visits combined PCY
Durable medical equipment
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
Endoratory services
Includes x-ray, pathology, imaging and
Deductible, then 30%
Deductible, then 60%
diagnostic, standard ultrasound
Deductible then 47%
Major imaging, including MRI, CT, PET
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
(preapproval required for certain services)
Preventive/wellness services
Screenings
Covered in full
Deductible then 47%
Deductible, then 60%
Exams and vaccinations
Covered in full
Deductible, then 40%
Deductible, then 60%
10 Pediatric services, including vision Eye exam: 1 PCY
$30 copay
$30 copay
$30 copay
and dental
Eyewear: 1 pair of glasses PCY (frames and
under 19 years of age
lenses); 12-month supply of contacts PCY,
Covered in full
Covered in full
Covered in full
in lieu of glasses (frames and lenses)
Covered in full
Deductible, then 30%
Deductible, then 30%
Rental: nreventive/hasic/maine
Orthodontia (medically necessary only)
Deductible, then 50%
Deductible, then 50%
Deductible, then 50%
Virtual care
Doctor On Demand: general medicine
Covered in full
Not covered
Not covered
Boulder Care or Workit Health: mental health
$60 copay
Not covered
Not covered
including substance use disorder
All other virtual providers
Deductible, then 40%
Deductible, then 60%
$60 copay
* The deductible applies unless otherwise noted
028509 (09.15.2022)

Processing page 2...
Page 2 OCR Results:
...

docTR OCR

['Premera', 'Blue', 'Cross', 'Preferred', 'Gold', '1500', 'PREMERA', '6', 'Alaska', 'plan', 'for', 'individuals', 'and', 'families', 'BLUE', 'CROSS', 'BLUE', 'SHIELD', 'OF', 'ALASKA', 'Start', 'date', 'January', '1,', '2023', 'Anindependentl', 'LicenseeoftheB', 'Blue', 'Cross', 'Blues', 'Shield', 'Association', 'You', 'have', 'access', 'tot', 'thel', 'Legacy', 'and', 'Dental', 'Select', 'Network', 'and', 'the', 'national', 'Blue', 'and', 'Dental', 'Cross', 'Blue:', 'Shield', 'BlueCardo', 'provider', 'network.', 'Please', 'refer', 'to', 'the', 'next', 'page', 'for', 'Legacy', 'Selectr', 'network', 'Non-preferred', 'providers', 'Non-participating', 'providers', 'important', 'plan', 'and', 'networki', 'information.', 'Annual', 'deductible', 'Per', 'calendary', 'year', '(PCY)', '$1,500', '$3,000', 'Family', '=', '2xi', 'individual', '$3,000', 'Amounty', 'your', 'pay:', 'after', 'your', 'deductiblei', 'is', 'met', '30%', '40%', '60%', 'Out-of-pocket', 'maximum', 'Includes', 'deductible,', 'copays', 'Family', '-', '2xi', 'individual', '(in-network', 'only)', '$6,300', 'Not', 'covered', 'Not', 'covered', '106', 'essentiall', 'health', 'benefits', '1', 'Ambulatory', 'patients', 'services', 'Outpatient:', 'services', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '30%', 'Deductible,', 'then:', '30%', 'Office', 'visits', 'Designated', 'PCP', 'office', 'visit', '$30', 'copay', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Specialist', 'office', 'visit', '$60', 'copay', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Urgent', 'care', '$60', 'copay', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Spinal', 'manipulation:', '12', 'visits', 'PCY;', '$30', 'copay', 'Acupuncture:', '12', 'visits', 'PCY', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '21', 'Emergencys', 'services', 'Emergency', 'care', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '30%', 'Deductible,', 'then:', '30%', 'Ambulancet', 'transportation', 
'(air', 'and', 'ground)', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '3', 'Hospitalization', 'Inpatients', 'services', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Organ', 'and', 'tissuet', 'transplants,', 'inpatient', 'Deductible,', 'then', '30%', 'Not', 'covered', 'Not', 'covered', '41', 'Maternity', 'andr', 'newborn', 'care', 'Prenatala', 'and', 'postnatal', 'care', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Inpatient', 'delivery', 'and:', 'services', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '51', 'Mental', 'health', 'and', 'substance', 'use', 'Office', 'visit', '$60', 'copay', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'disorder', 'behavioral', 'services,', 'health', 'treatment', 'including', 'Inpatient', 'hospital:', 'mental/behaviorall', 'health', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Outpatient:', 'services', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '61', 'Prescription', 'drugs', 'Preferred', 'generic', '$15', 'copay', '$15', 'copay', '$150', 'copay', 'Retail/Specialty::', '30-days', 'supply', 'Preferredi', 'brand', '$45', 'copay', '$45', 'copay', '$45', 'copay', 'Mail', 'order:', '90-day', 'supply', 'Non-preferred', 'drugs', 'Deductible,', 'then', '50%', 'Not', 'covered', 'Not', 'covered', 'Specialty', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '40%', 'Drugl', 'list', 'M4', '7', 'Rehabilitative:', 'and', 'habilitative', 'Inpatient', 'rehabilitation::', '30', 'days', 'PCY', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'services', 'and', 'devices', 'Physical,', 'speech,', 'occupational,', 'massage', 'Deductible,', 'then', '$60', 'copay', 'therapy:', '25', 'visits', 
'combined', 'PCY', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Durable', 'medicale', 'equipment', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '81', 'Laboratory', 'services', 'Includes', 'x-ray,', 'pathology,', 'imaging', 'and', 'Deductible,', 'then', '30%', 'diagnostic,', 'standard', 'ultrasound', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Majori', 'imaging,', 'including', 'MRI,', 'CT,', 'PET', 'Deductible,', 'then', '30%', '(preapproval', 'required', 'for', 'certain:', 'services)', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '9', 'Preventive/wellness:', 'services', 'Screenings', 'Coveredi', 'in', 'full', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', 'Exams', 'and', 'vaccinations', 'Coveredi', 'in', 'full', 'Deductible,', 'then', '40%', 'Deductible,', 'then', '60%', '10', 'Pediatric', 'services,', 'including', 'vision', 'Eye', 'exam:', '1', 'PCY', '$30', 'copay', '$30', 'copay', '$30', 'copay', 'and', 'under', 'dental', '19', 'years', 'of', 'age', 'Eyewear:', '1', 'pair', 'of', 'glasses', 'PCY', '(frames', 'and', 'lenses);', '12-month', 'supply', 'of', 'contacts', 'PCY,', 'Coveredi', 'int', 'full', 'Coveredi', 'int', 'full', 'Coveredi', 'int', 'full', 'inl', 'lieu', 'of', 'glasses', '(frames', 'and', 'lenses)', 'Dental:', 'preventive/basic/major', 'Coveredi', 'int', 'full', 'Deductible,', 'then', '30%', 'Deductible,', 'then', '30%', 'Orthodontia', '(medically', 'necessary', 'only)', 'Deductible,', 'then', '50%', 'Deductible,', 'then', '50%', 'Deductible,', 'then', '50%', 'Virtual', 'care', 'Doctor', 'On', 'Demand:', 'general', 'medicine', 'Coveredi', 'int', 'full', 'Not', 'covered', 'Not', 'covered', 'Boulder', 'Care', 'or', 'Workit', 'Health:', 'mental', 'health', '$60', 'copay', 'Not', 'covered', 'Not', 'covered', 'including', 'substance', 'use', 'disorder', 'All', 'other', 'virtual', 'providers', '$60', 'copay', 'Deductible,', 'then', '40%', 
'Deductible,', 'then', '60%', '*The', 'deductible', 'applies', 'unless', 'otherwiser', 'noted', '028509', '(09-15-2022)'
...

olmOCR

['| Benefit Category | Legacy and Dental Select network | Non-preferred providers | Non-participating providers |\n|------------------|---------------------------------|------------------------|---------------------------|\n| **Annual deductible** | $1,500 (Family - 2x individual) | $3,000 | $3,000 |\n| **Out-of-pocket maximum** | Includes deductible, copays | Not covered | Not covered |\n| **10 essential health benefits** | | | |\n| **Ambulatory patient services** | Outpatient services | Deductible, then 30% | Deductible, then 30% |\n| Office visits | Designated PCP office visit | $30 copay | Deductible, then 40% |\n| | Specialist office visit | $60 copay | Deductible, then 40% |\n| | Urgent care | $90 copay | Deductible, then 40% |\n| | Special population: 12 visits PCY | $30 copay | Deductible, then 40% |\n| | Acupuncture: 12 visits PCY* | | |\n| **Emergency services** | Emergency care | Deductible, then 30% | Deductible, then 30% |\n| | Ambulance transportation (air and ground) | Deductible, then 30% | Deductible, then 60% |\n| **Hospitalization** | Inpatient services | Deductible, then 30% | Deductible, then 60% |\n| | Organ and tissue transplants, inpatient | Not covered | Not covered |\n| **Maternity and newborn care** | Pregnancy and postpartum care | Deductible, then 30% | Deductible, then 60% |\n| | Inpatient delivery and services | Deductible, then 30% | Deductible, then 60% |\n| **Mental health and substance use disorder services, including behavioral health treatment** | Inpatient hospital: mental/behavioral health | $60 copay | Deductible, then 40% |\n| | Outpatient services | Deductible, then 30% | Deductible, then 40% |\n| **Prescription drugs** | Preferred generic | $15 copay | $15 copay |\n| | Retail/Specialty: 30-day supply | $45 copay | $45 copay |\n| | Mail order: 90-day supply | Not covered | Not covered |\n| | Specialty | Deductible, then 40% | Deductible, then 40% |\n| | Drug list | M4 | |\n| **Rehabilitative and habilitative services and 
devices** | Inpatient rehabilitation: 30 days PCY | Deductible, then 30% | Deductible, then 60% |\n| | Physical, speech, occupational, massage therapy: 25 visits combined PCY | Deductible, then $60 copay | Deductible, then 60% |\n| | Durable medical equipment | Deductible, then 30% | Deductible, then 60% |\n| **Laboratory services** | Includes x-ray, pathology, imaging and diagnostic, standard ultrasound | Deductible, then 30% | Deductible, then 60% |\n| | Major imaging, including MRI, CT, PET (geographic area required for certain services) | Deductible, then 30% | Deductible, then 60% |\n| **Preventive/wellness services** | Screenings | Covered in full | Deductible, then 40% |\n| | Exams and vaccinations | Covered in full | Deductible, then 60% |\n| **Pediatric services, including vision and dental under 19 years of age** | Eye exam: 1 PCY | $30 copay | $30 copay |\n| | Eyewear: 1 pair of glasses PCY (frames and lenses): 12 month supply of contacts PCY, in lieu of glasses (frames and lenses) | Covered in full | Covered in full |\n| | Dental preventive/basic/major | Covered in full | Deductible, then 30% |\n| | Orthodontics (medically necessary only) | Deductible, then 50% | Deductible, then 50% |\n| **Virtual care** | Doctor on Demand: primary medicine | Covered in full | Not covered |\n| | Boulder Care or World Health: mental health including substance use disorder |']

Qwen2.5-VL OCR

['Premera Blue Cross Preferred Gold 1500\nAlaska plan for individuals and families\nStart date January 1, 2023\n\nYou have access to the Legacy and Dental Select Network and the national Blue Cross Blue Shield BlueCard(r) provider network. Please refer to the next page for important plan and network information.\n\n| Annual deductible | Legacy and Dental Select network | Non-preferred providers | Non-participating providers |\n|---|---|---|---|\n| Family = 2+ individual | $1,500 | $3,000 | $3,000 |\n| Amount you pay after your deductible is met | 30% | 40% | 60% |\n| Out-of-pocket maximum | Includes deductible, copays | Not covered | Not covered |\n| Family = 2+ individual (in-network only) | $6,300 | | |\n\n## 10 essential health benefits\n\n| 1 | Ambulatory patient services | Outpatient services | Designated PCP office visit | Specialist office visit | Urgent care | Special consultation: 12 visits PCY; Acupuncture: 12 visits PCY |\n|---|---|---|---|---|---|---|\n| | | Deductible, then 30% | $30 copay | Deductible, then 40% | Deductible, then 40% | Deductible, then 40% |\n| | | | | | | |\n| 2 | Emergency services | Emergency care | Deductible, then 30% | Deductible, then 30% | Deductible, then 30% | Deductible, then 30% |\n| | | Ambulance transportation (air and ground) | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| 3 | Hospitalization | Inpatient services | Deductible, then 30% | Deductible, then 30% | Not covered | Deductible, then 60% |\n| | | Organ and tissue transplants, inpatient | Deductible, then 30% | | | |\n| 4 | Maternity and newborn care | Prenatal and postpartum care | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Inpatient delivery and services | Deductible, then 30% | | | |\n| 5 | Mental health and substance use disorder services, including behavioral health treatment | Inpatient hospital: mental/behavioral health | Deductible, then 30% | 
Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Outpatient services | Deductible, then 30% | | | |\n| 6 | Prescription drugs | Preferred generic | $15 copay | $15 copay | $15 copay | |\n| | | Retail/Specialty: 30-day supply | Preferred brand | $45 copay | $45 copay | $45 copay |\n| | | Mail order: 90-day supply | Non-preferred drugs | Deductible, then 50% | Not covered | Not covered |\n| | | Specialty | Deductible, then 40% | Deductible, then 40% | Deductible, then 40% |\n| | | Drug list | M4 | | | |\n| 7 | Rehabilitative and habilitative services and devices | Inpatient rehabilitation: 30 days PCY | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Physical, speech, occupational, massage therapy: 25 visits combined PCY | Deductible, then $60 copay | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Duraplasty: 1 visit | Deductible, then 30% | | | |\n| 8 | Laboratory services | Includes x-ray, pathology, imaging and diagnostic, standard ultrasound | Deductible, then 30% | Deductible, then 40% | Deductible, then']
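Markdown tables produced by LLM-based models, like the Qwen2.5-VL output above, can be sanity-checked before they are passed downstream. The following is a minimal sketch, not part of any OCR library: `split_row` and `ragged_rows` are hypothetical helper names, and the sample rows are a short excerpt copied from that output. It flags table rows whose cell count differs from the first row of the same table, which is one simple signal that columns have shifted.

```python
def split_row(line):
    """Split one Markdown table row into trimmed cell strings."""
    line = line.strip()
    if line.startswith("|"):
        line = line[1:]
    if line.endswith("|"):
        line = line[:-1]
    return [cell.strip() for cell in line.split("|")]


def ragged_rows(markdown):
    """Return (line_index, cell_count, expected_count) for rows that
    do not match the width of the first row in their table."""
    problems = []
    width = None
    for index, line in enumerate(markdown.splitlines()):
        if not line.strip().startswith("|"):
            width = None  # a non-table line ends the current table
            continue
        cells = split_row(line)
        if width is None:
            width = len(cells)  # the first row of a table sets the width
        elif len(cells) != width:
            problems.append((index, len(cells), width))
    return problems


# Excerpt copied from the Qwen2.5-VL output above (not the full table).
sample = "\n".join([
    "| 1 | Ambulatory patient services | Outpatient services | Designated PCP office visit | Specialist office visit | Urgent care | Special consultation: 12 visits PCY; Acupuncture: 12 visits PCY |",
    "|---|---|---|---|---|---|---|",
    "| 6 | Prescription drugs | Preferred generic | $15 copay | $15 copay | $15 copay | |",
    "| | | Specialty | Deductible, then 40% | Deductible, then 40% | Deductible, then 40% |",
])

for index, found, expected in ragged_rows(sample):
    print(f"line {index}: {found} cells, expected {expected}")
```

A consistent cell count does not prove that values sit in the right columns, so this check only catches the most obvious structural drift.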

Loan Application – Checkboxes

Evaluates how well tools detect and interpret form fields, checkboxes, and radio buttons.

EasyOCR

Processing page 1... 
Page 1 OCR Results: 
['Te ScoTpnrdorirljtr CFTEFAdenmnm dezhi', '""genzy Laa %a', 'Uniform Residential Loan Application Verity andcompletc thcintomationon thu applitaton Inyorl nio marico Jircccd DrYCurLenozr', '"PPnnotcrin65 Icunathothan Cao7 ocditicnalJcTC MEI PIOvice', 'Section 1: Borrower Information: This section asks about your personal information &ndyour ircorne from emiplovmneril othe! sources such asreluernen(al vou wvani coisideredlooualifytor this loan-', 'peronaiintonaton', 'Rame ethdal', 'Alno', 'Logalquntynumar (cr Individual Toxpoycr IdentuicohonNumBen) Date 0f Wirth Cmrenknin Yayyl I Cibcen JermeacntpengentAen Ton-ermenentpenoentMen', 'Jchn Acama nteratenama Lufaninamc Drmhiyou cnonotongcama Lindcr Which credtmasprypuilYrrrered (Fetuidle Last Jumx', 'Typc of Credi Ianapalng for individual credit Ianapatina foricint credi Totl Numocr cf Bcutowcix Womo2Nintendr poly tr joint crede + Your initialz', 'List Nimelilof Other Vomowerkl Appking forthis Loan Afirtn Wnste Use @icpuro D( benvrenname', 'attaletatus', 'Depcnocnts(notMtez Numjcr', 'anomeriomoe', 'contacntomaticn', 'Vetkd Scponted Unmetico (Singke Drvced Wiowed,Uvil Urion Danestk Partnersh P; Reguterd heoprcoi BenchncyrachonshIp_', 'Homc Phcnc Ucpacor', 'Monk Phone', 'Lnorareccm', 'cumten Addness Stiect 145 Fro Hijjo Pliza Cit; Ham Kjw Lcngat Cunent nddres!', 'Unit $ 23101 Ccuntn ORent ($ 53oo', 'Montha Aousino', 'No pnnaYnouo EDeOte', 'Umorai', 'Mat CurrcntAddress torLESsthanzycatsIut Formcr Addtes= Etteet', 'Dousnotopply', 'Wnie', 'Ccuntn', 'Kjw cngattorrn Adole', 'Montha Aousino', 'nenenouag ecrc', 'monthi', 'Milling^0orczs oumerenimomeutentaddmess Stiect 345 Fro Hijjo Pliza', 'Dogsnotapny', 'Unit $ Gqwrtn', 'Cit; Ham', 'cuntcnt Lmploymentzcil-EmploymcntandIncomt', 'dDoes', 'GrosMonthly Incomt', 'Employcr or Buzincs; Name Strect Fonddn Djdia', 'Talc CCMciG Sacms', 'Fnonc', 'eraccon Keeeh', 'nm=', 'Oraimt', 'Mnneh', '@ 3101', 'Lountr', 'ecdommeren', 'Posticn or Tiex Enjreerra Manjocr stan Datc Ui', 'Cncorm thu 
ntementappler Dlam tndoytdbyafamik Ncmb; Cropttly tcllt Gale JenL 0i ott FJI; [0 thetnac', 'Commia', 'Keeeh', 'Lan', 'Kjw IcngInthiinc Ciork?', 'Montha', 'EEtiLnan 5 5', 'Jnexen', 'Mnneh', 'Ocheck ifyou are the Dusiness Owncr or Sem-Employco', 'Inave an ovncnnip tnarc ofleaathan *4_ EcnthlvinromeiGTlon Totes navc an ovncnnip tnarc 0f184mcrc;', '0loo AFuneh', 'Medenn D _Annnnnrocie Ficddicenc Fomai Fanna Auc Fom ICO BtTir [JM1']

Surya OCR

Processing page 1...
Page 1 OCR Results:
To be completed by the Lender:
Lender Loan No./Universal Loan Identifier 8849949003
Agency Case No. 7748800
Uniform Residential Loan Application
Verify and complete the information on this application. If you are applying for this loan with others, each additional Borrower must provide
information as directed by your Lender.
Section 1: Borrower Information. This section asks about your personal information and your income from
employment and other sources, such as retirement, that you want considered to qualify for this loan.
1a. Personal Information
Name (First, Middle, Last, Suffix)
Social Security Number
(or Individual Taxpayer Identification Number)
John Adams
Date of Birth
Alternate Names - List any names by which you are known or any names
Citizenship
(mm/dd/yyyy)
under which credit was previously received (First, Middle, Last, Suffix)
U.S. Citizen
02 / 02 /
Ö Permanent Resident Alien
1976
Non-Permanent Resident Alien
Type of Credit
List Name(s) of Other Borrower(s) Applying for this Loan
(First, Middle, Last, Suffix) - Use a separator between names
I am applying for individual credit.
O I am applying for joint credit. Total Number of Borrowers:
Each Borrower intends to apply for joint credit. Your initials:
Marital Status
Dependents (not listed by another Borrower)
Contact Information
Married
Number 3
Home Phone (<u>884</u>) <u>443</u> -
3433
O Separated
Ages 12,6,4
Cell Phone
(01)233
3454
O Unmarried
Work Phone
(
Ext.
(Single, Divorced, Widowed, Civil Union, Domestic Partnership, Registered
Email johna@infoware.com
Reciprocal Beneficiary Relationship)
Current Address
Street 345, Fire Ridge Plaza
Unit #
City Miami
State FL
ZIP 33101
Country US
Months Housing () No primary housing expense () Own () Rent ($
How Long at Current Address?
Years
(month)
$4300
If at Current Address for LESS than 2 years, list Former Address
Does not apply
Street
Unit #
City
State
ZIP
Country
Months Housing ONo primary housing expense O Own O Rent ($
How Long at Former Address?
Years
(month)
Mailing Address - if different from Current Address Does not apply
Street 345, Fire Ridge Plaza
Unit #
City Miami
7IP 33101
Country US
State EL
Does not apply
1b. Current Employment/Self-Employment and Income
Gross Monthly Income
Employer or Business Name Info ware computer systems
Phone
Base
$ $140000 /month
Street 23, Rohde Building
Unit #
Overtime
/month
S
State FL
ZIP 33101
Country US
City Miami
Bonus
$2000 /month
ς
Position or Title Engineering Manager
Check if this statement applies:
Commission S
/month
I am employed by a family member,
Start Date
/ /
(mm/dd/yyyy)
Military
property seller, real estate agent, or other
Entitlements S
/month
How long in this line of work? 4 Years 3 Months
party to the transaction.
Other
/month
Check if you are the Business I have an ownership share of less than 25%. Monthly Income (or Loss)
TOTAL $
0.00/month
Owner or Self-Employed
O I have an ownership share of 25% or more. $
Uniform Residential Loan Application
Freddie Mac Form 65 • Fannie Mae Form 1003
Effective 1/2021

docTR OCR

['Tobec', 'completedbythel', 'Lender:', 'Lenderl', 'LoanN', 'No./Universall', 'Loanl', 'Identifier', '8849949003', 'Agency', 'Casel', 'No.', '7748833', 'Uniform', 'Residential', 'Loan', 'Application', 'Verify', 'and', 'complete', 'thei', 'information', 'on', 'this', 'application.', 'Ifyou', 'are', 'applying', 'for', 'this', 'loan', 'with', 'others,', 'each', 'additional', 'Borrower', 'must', 'provide', 'information', 'as', 'directed', 'by', 'yourl', 'Lender.', 'Section', '1:', 'Borrower', 'Information.This:', 'section', 'asks', 'about', 'your', 'personal', 'information', 'and', 'your', 'income', 'from', 'employment.', 'and', 'other', 'sources,', 'such', 'as', 'retirement,', 'that', 'you', 'want', 'considered', 'to', 'qualify', 'for', 'thisl', 'loan.', '1a.', 'Personall', 'Information', 'Name', '(First,', 'Middle,', 'Last,:', 'Suffix)', 'Social', 'Security', 'Number', 'John', 'Adams', '(orl', 'Individual', 'Taxpayer', 'Identification7', 'Number)', 'Alternatel', 'Names', '-', 'Listo', 'anynamest', 'by', 'which)', 'your', 'arel', 'known', 'or', 'any', 'names', 'Date', 'ofl', 'Birth', 'Citizenship', 'under', 'which', 'credit', 'wasp', 'previously', 'received', '(First,', 'Middle,', 'Last,', 'Suffix)', '(mm/dd/yyyy)', 'U.S.', 'Citizen', '02', '02', '1976', 'Permanentl', 'Resident', 'Alien', 'Non-Permanent', 'Resident', 'Alien', 'Type', 'of', 'Credit', 'Listl', 'Name(s)', 'of', 'Other', 'Borrower(s).', 'Applying', 'for', 'this', 'Loan', 'am', 'applying', 'fori', 'individual', 'credit.', '(First,', 'Middle,', 'Last,', 'Suffix).', '-', 'Use', 'a', 'separator', 'between', 'names', 'am', 'applying', 'forj', 'joint', 'credit.', 'Totall', 'Number', 'of', 'Borrowers:', 'Each', 'Borroweri', 'intendst', 'to', 'applyf', 'forj', 'joint', 'credit.)', 'Youri', 'initials:', 'Marital', 'Status', 'Dependents', '(notl', 'listed', 'by', 'another', 'Borrower)', 'Contactl', 'Information', 'Married', 'Number', '3', 'Homel', 'Phone', '(', '884', ')', '443', '3433', 'Separated', 'Ages', 
'12,6,4', 'Cell', 'Phone', '(', '01', '233', '3454', 'Unmarried', 'Workl', 'Phone', 'Ext.', '(Single,', 'Divorced,', 'Widowed,', 'Civil', 'Union,', 'Domestic', 'Partnership,', 'Registered', 'Reciprocal', 'Beneficiary', 'Relationship)', 'Email', '[johna@infoware.com](mailto:johna@infoware.com)', 'Current', 'Address', 'Street', '345,', 'Fire', 'Ridge', 'Plaza', 'Unit', '#', 'City', 'Miami', 'State', 'FL', 'ZIP', '33101', 'Country', 'US', 'Howl', 'Long', 'at', 'Current', 'Address?', 'Years', 'Months', 'Housing', 'No', 'primaryl', 'housing', 'expense', 'Own', 'Rent', '$', '$4300', '/month)', 'Ifat', 'Current', 'Address', 'for', 'LESS', 'than', '2', 'years,', 'list', 'Former', 'Address', 'Doesr', 'not', 'apply', 'Street', 'Unitt', '#', 'City', 'State', 'ZIP', 'Country', 'Howl', 'Long', 'at', 'Former', 'Address?', 'Years', 'Months', 'Housing', 'No', 'primaryl', 'housing', 'expense', 'Own', 'Rent', '($', '/month)', 'Mailing', 'Address-', 'ifdifferentf', 'from', 'Current', 'Address', 'Does', 'not', 'apply', 'Street', '345,', 'Firel', 'Ridge', 'Plaza', 'Unit', '#', 'City', 'Miami', 'State', 'FL', 'ZIP:', '33101', 'Country', 'US', '1b.', 'Current', 'Employment/Self-Employment:', 'andl', 'Income', 'Does', 'not', 'apply', 'Employer', 'orl', 'Business', 'Name', 'Info', 'ware', 'computers', 'systems', 'Phone', 'Gross', 'Monthly', 'Income', 'Street', '23,', 'Rohdel', 'Building', 'Unit', '#', 'Base', '$', '$140000', '/month', 'City', 'Miami', 'State_', 'FL', 'ZIP', '33101', 'Country', 'US', 'Overtime', '$', '/month', 'Bonus', '$', '$2000', '/month', 'Position', 'or', 'Title', 'Engineering!', 'Manager', 'Checki', 'ifthis', 'statement', 'applies:', 'Commission', '$', '/month', 'Startl', 'Date', '(mm/dd/yyyy)', 'lam', 'employed', 'byaf', 'family', 'member,', 'Military', 'property', 'seller,', 'real', 'estate', 'agent,', 'or', 'other', 'Howl', 'longi', 'int', 'this', 'line', 'of', 'work?', '4', 'Years', '3', 'Months', 'partyt', 'tot', 'the', 'transaction.', 'Entitlements', '$', 
'/month', 'Other', '$', '/month', 'Checki', 'if', 'you', 'aret', 'thel', 'Business', 'Ihave', 'an', 'ownership', 'share', 'ofl', 'less', 'than', '25%.', 'Monthly', 'Income', '(orl', 'Loss)', 'Owner', 'or', 'Self-Employed', 'Ihavea', 'an', 'ownership:', 'share', 'of', '25%', 'or', 'more.', '$', 'TOTAL:', '$', '0.00/month', 'Uniform', 'Residentiall', 'Loan', 'Application', 'Freddiel', 'Mac', 'Form', '65', 'Fanniel', 'Mae', 'Form', '1003', 'Effective', '1/2021']

olmOCR

['Uniform Residential Loan Application\n\nVerify and complete the information on this application. If you are applying for this loan with others, each additional Borrower must provide information as directed by your Lender.\n\nSection 1: Borrower Information. This section asks about your personal information and your income from employment and other sources, such as retirement, that you want considered to qualify for this loan.\n\n1a. Personal Information\n\nName (First, Middle, Last, Suffix) John Adams\nAlternate Names - List any names by which you are known or any names under which credit was previously received (First, Middle, Last, Suffix)\n\nSocial Security Number (or Individual Taxpayer Identification Number)\nDate of Birth (mm/dd/yyyy) 02 / 02 / 1976\nCitizenship\n☐ US Citizen ☐ Permanent Resident Alien ☐ Non-Permanent Resident Alien\n\nType of Credit\n☐ I am applying for individual credit.\n☐ I am applying for joint credit. Total Number of Borrowers: Each Borrower intends to apply for joint credit. Your initials:\n\nList Name(s) of Other Borrower(s) Applying for this Loan (First, Middle, Last, Suffix) - Use a separator between names\n\nMarital Status\n☐ Married\n☐ Separated\n☐ Unmarried\nDependents (not listed by another Borrower)\nNumber 3 Ages 12, 8, 4\n\nContact Information\nHome Phone ( ) 884 - 443 - 3433\nCell Phone ( ) 01 - 233 - 3454\nWork Phone ( ) - Ext.\nEmail [johns@infoware.com](mailto:johns@infoware.com)\n\nCurrent Address\nStreet 345, Fire Ridge Plaza\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\nHow Long at Current Address? Years Months Housing No primary housing expense Own Rent ($) /month\n\nIf at Current Address for LESS than 2 years, list Former Address Does not apply\nStreet\nCity\nState\nZIP\nCountry\nUnit #\nHow Long at Former Address? 
Years Months Housing No primary housing expense Own Rent ($) /month\n\nMailing Address - if different from Current Address Does not apply\nStreet 345, Fire Ridge Plaza\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\n\n1b. Current Employment/Self-Employment and Income\n\nEmployer or Business Name Into ware computer systems\nPhone ( )\nStreet 23, Rodeo Building\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\nGross Monthly Income\nBase $14000/month\nOvertime $/month\nBonus $2000/month\nCommission $/month\nMilitary Entitlements $/month\nOther $/month\nTOTAL $0.00/month\n\nPosition or Title Engineering Manager\nStart Date (mm/dd/yyyy)\nHow long in this line of work? 4 Years 2 Months\n\nCheck if this statement applies:\n☐ I am employed by a family member, property seller, real estate agent, or other party to the transaction\n☐ Check if you are the Business Owner or Self-Employed\n☐ I have an ownership share of less than 25% Monthly income (or Loss)\n☐ I have an ownership share of 25% or more\n\nUniform Residential Loan Application\nFreddie Mac Form 65 - Fannie Mae Form 1003\nEffective 1/2021']

Qwen2.5-VL OCR

['\n# Uniform Residential Loan Application\n\n**To be completed by the Lender:**\n- Lender Loan No./Universal Loan Identifier: 884549023\n- Agency Case No.: 7148833\n\n## Section 1: Borrower Information\n\n### 1a. Personal Information\n\n| Field | Value |\n|-------|-------|\n| Name (First, Middle, Last, Suffix) | John Adams |\n| Social Security Number (or Individual Taxpayer Identification Number) | - |\n| Date of Birth (mm/dd/yyyy) | 02/02/1976 |\n| Citizenship | U.S. Citizen |\n| Type of Credit | I am applying for individual credit. |\n| List Name(s) of Other Borrower(s) Applying for this Loan | - |\n| Marital Status | Married |\n| Dependents (not listed by another Borrower) | Number 3 |\n| Contact Information | Home Phone: 884-443-3433<br>Cell Phone: 01-233-3545<br>Email: [johna@infoware.com](mailto:johna@infoware.com) |\n| Current Address | Street: 345, Fire Ridge Plaza<br>City: Miami<br>State: FL<br>ZIP: 33101<br>Country: US |\n| How Long at Current Address? | Years: 4<br>Months: 2 |\n| Housing | No primary housing expense |\n| Mailing Address | Same as current address |\n\n### 1b. Current Employment/Self-Employment and Income\n\n| Field | Value |\n|-------|-------|\n| Employer or Business Name | Info ware computer systems |\n| Position or Title | Engineering Manager |\n| Start Date | 02/02/1976 |\n| How Long in this Line of Work? | 4 years, 2 months |\n| Gross Monthly Income | Base: $140000/month<br>Bonus: $2000/month<br>Total: $0.00/month |\n\n**Check if this statement applies:** I am employed by a family member, property seller, real estate agent, or other party to the transaction.\n\n**Check if you are the Business Owner or Self-Employed:** I have an ownership share of less than 25%.\n\n**Monthly Income (or Loss):** $0.00/month\n']
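The loan application outputs above do not agree on every field value, so a simple cross-engine comparison can show which fields need manual review. The sketch below is illustrative only: `extract_case_no` and `consensus` are hypothetical helpers, the regular expression is an assumption tuned to these three excerpts, and the docTR tokens are joined with spaces for convenience. It does not say which engine is correct, only whether a majority agrees.

```python
import re
from collections import Counter

# Short excerpts copied from the outputs above (not the full text).
outputs = {
    "Surya": "Agency Case No. 7748800",
    "docTR": " ".join(["Agency", "Casel", "No.", "7748833"]),
    "Qwen2.5-VL": "- Agency Case No.: 7148833",
}

# Tolerates a stray trailing letter on "Case" and an optional colon.
CASE_NO = re.compile(r"Agency\s+Case\w*\s+No\.?:?\s*(\d+)")


def extract_case_no(text):
    """Return the digits following 'Agency Case No.', or None."""
    match = CASE_NO.search(text)
    return match.group(1) if match else None


def consensus(values):
    """Return the value shared by more than half of the engines, else None."""
    counts = Counter(v for v in values if v is not None)
    if not counts:
        return None
    value, votes = counts.most_common(1)[0]
    return value if votes > len(values) / 2 else None


extracted = {name: extract_case_no(text) for name, text in outputs.items()}
agreed = consensus(list(extracted.values()))

for name, value in extracted.items():
    print(f"{name}: {value}")
print("consensus:", agreed if agreed else "none, flag for manual review")
```

With these three excerpts no two engines return the same number, so the field would be flagged rather than silently accepted.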

Bank Statement – Complex Layout

Challenges OCR engines with mixed layouts including headers, transaction tables, and scattered text.

EasyOCR

Processing page 1... 
Page 1 OCR Results: 
['freedom', 'Donu', '95422', '15dt', 'CHASE FREEDOM UTIMATE REWARDSD SUMMARY', '5508429 Ad S5Q0O', '022824', 'Toblpoints avaiable for etmpiion 45,553', '435EF sdent0 €', 'Z7726 Lopmin', 'aor Fannan Wimng: HieUfLamyuarnn 7zt Anedau Ea4to- 02 *u', 'Etn Chnni', 'Jeuerard LP', 'IEG77E chmonaou 24', 'FITETNThS', 'Paid +603 642124 517179 Juelddlng Previous balanae 1enr2JI', 'ACCOUNT SUMMARY Aecumeanmt AHiTtImean', 'FTEAel Balica Fe-Crda Fucnn Canh Lnern', '421A4 E7Jox', 'JBAE', 'Euna [mtam Lnn Chieend', 'Balanze', 'MDULEA FITE-FETDE', 'Cpand OeinqD', 'GUJD: Rtel', 'Mhbacidt Can 4Zn AaetetG=', 'Fp LudAmaun', 'Ena CredeAEzean ma', 'YOURACCOUNT MESSACES EDOTnmd COT ie4 enretarimede Y%y Rd5u55= L4tRL29/E maEluhcdlclt Lad,Eo (oOXyETa Eonindomloyur-e', 'freedom" 3237710l074170d50d5og42*ddddz KDX 1917 ttOMAT DDae: 03224 MLHATO E0*a ReentDeC FT Udh S012 mmprmant LuS SULU EEGGMTANATT Al7Jin Keen Fa', 'LIIETENSetAD', 'Ajtopatisor', '114z9atorxas', 'CUICUEUJEIIEZHAC BE3EaL = EOlat-Ea', "ESO01G028 3 23 70L06076177v'", 'dott']

Surya OCR

Processing page 1...
Page 1 OCR Results:
Interpretation of the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the secon
Customer Service:<br>1-000-031-000
C.
Mible Doubleding
freedom
Channe blocks Furger locks
New Balance
CHASE FREEDOM: ULTIMATE
February 2024
$5,084,29
REWARDS(r) SUMMARY
1.4
т
<b>W</b>
Ŧ
.
.....
8
Minimum Payment Due
28 29 30 31 1 2 3
Previous points balance
40,458
$50.00
+ 1% (1 Pt)/$1 earned on all purchases
5,085
5 6 7 8 9
4
10
Payment Due Date
Total points available for
11 12 13 14 15 16 17
02/28/24
redemption
45,553
18 19 20 21 22 23 24
25 26 27 28 29 1 2
Start receering today. Vait Ultimate Rewards(r) at
[www.ultimaterewards.com](http://www.ultimaterewards.com/)
3 4 5 6 7 8 9
You always earn unlimited 1% cash back on all your purchases.<br>Activate new bonus categories every quarter. You'll earn an
Late Payment Warning: If we do not receive your minimum payment
by the date listed above, you may have to pay a late fee of up to<br>$40.00 and your APIY's will be subject to increase to a maximum
additional 4% cash back, for a total of 5% cash back on up to
$1,500 in combined bonus calegory purchases each quarter.
Penalty APR of 29.99%.
Activate for free at chase com/freedom, visit a Chase branch or
call the number on the back of your card.
Minimum Payment Warning: If you make only the minimum
payment each period, you will pay more in interest and it will take you <br>longer to pay off your balance. For example:
If you make no
You will pay off the
And you will end up
additional charpes
balance shown on this
paying an estimated
using this card and
statement in about ...
total of ...
ach month you pay.
Paid $6000 on oct 25/24
Only the minimum
15 years
$12.128
including previous balance
Destrent
$189
$5,817
3 years
(Savings=$5,311)
If you would like information about credit counseling services, call
1-866-797-2885
ACCOUNT SUMMARY
Account Number: 4342 3780 1050 7320
President Balance
$4 233 99
Payment, Credits
.$4 233 93
+$5,084.29
Purchases
Cash Advances
$0.00
Balance Transfers
$0.00
Fees Charged
$0.00
Interest Charged
$0.00
$5,084.29
New Balance
Opening/Closing Date
01/04/24 - 02/03/24
Courlet Accesse Line
$31 700
Available Credit
$26,615
Cash Access Line
$1,585
Available for Cash
$1,585
Past Due Amount
$0.00
Balance over the Credit Access Line
$0.00
YOUR ACCOUNT MESSAGES
Reminder: It is important to continue making your payments on time. Your APRs may increase if the minimum payment is not made on
time or payments are returned.
Your next AutoPay payment for $5,081,29 will be deducted from your Pay From account and orbidied on your due <br>date. If your due date fails on a Saturday, well credit your payment the Friday before.
0000001 F6833339 D 12
Y # 03 24/02/02
Page 1 of 2
08810 MA MA 26942 02410000120000694201
Freedom
433232701040241/20000506429000002
P.O. BOX 15123
Payment Due Date:
02/28/24
AUTOPAY IS ON
WILMINGTON, DE 19850-5123
$5,084.29
New Balance:
See Your Account <br>Messages for details.
For Undeliverable Mail Only
$50.00
Minimum Payment Due:
Account number: 4342 3780 1050 7320
$_
_ Amount Enclosed
AUTOPAY IS ON
NEET BEY & FAMILIE
LARRY PAGE<br>C PAGE
24917 KEYSTONE AVE<br>LOS ANGELES CA 97015-5505
CARDMEMBER SERVICE
PO BOX 6294<br>CAROL STREAM IL 60197-6294
19000 160 280 3 23 20 1060 21 17 20

docTR OCR

['Managey', 'youra', 'accountorineat:', 'Mobile:', 'Downloadthe', 'wwwctesaormcarchat', 'Chasel', 'castoneremnee', 'freedom', 'MobilsPapptoday', 'February', '2024', 'New', 'Balance', 'CHASE', 'FREEDOM:', 'ULTIMATE', 'S', 'M', 'T', 'W', 'T', 'F', 'S', '$5,084.29', 'REWARDSO', 'SUMMARY', 'Minimum', 'Payment', 'Due', '28', '29', '30', '31', '2', '3', '$50.00', 'Previous', 'points', 'balance', '40,468', '10', '+1%(1P', 'PI)/S1', 'earned', 'on', 'allpurchases', '5,085', 'Payment', 'Due', 'Date', '11', '12', '13', '14', '15', '16', '17', '02/28/24', 'Total', 'points', 'available', 'for', '18', '19', '20', '21', '22', '23', '24', 'redemption', '45,553', '25', '26', '27', '28', '29', '2', 'Start', 'redeemingt', 'today.', 'Visit', 'Ultimate', 'Rewardso', 'at', '[www.utimatronards.com](http://www.utimatronards.com/)', '9', 'Youa', 'always', 'earn', 'unlimited', '1%', 'cash', 'back', 'on', 'purchases.', 'Late', 'Payment', 'Warning:', 'Ifwe', 'dor', 'notr', 'receivey', 'minimum', 'payment', 'Activate', 'new', 'bonus', 'every', 'quarter.', 'AMOMEO', 'earn', 'an', 'byt', 'the', 'datel', 'listeda', 'mayt', 'have', 'top', 'pay', 'ytarer', 'late', 'fee', 'ofupt', 'to', 'additional', '4%', 'casht', 'poateroeras', 'total', 'of', '5%', 'casht', 'back', 'to', '$40.00', 'andy', 'Apesanurs', "APR's", 'subjectt', 'toir', 'increaset', 'toa', 'am', 'maximum', '$1,500ir', 'inc', 'combinedt', 'borhuise', 'each', 'quspis', 'Penalty', 'APR', 'AOYEAPRSM', 'Activateforf', 'freeatc', 'chase.c', 'ancaonerenad', 'Chase', 'branch', 'or', 'Minimum', 'Payment', 'Ifyour', 'make', 'only', 'ther', 'minimum', 'callt', 'ther', 'number', 'onthet', 'back', 'ofy', 'your', 'card.', 'paymente', 'each', 'period,y', 'Memningiliye', 'morei', 'ininteresta', 'andi', 'itwillt', 'take', 'you', 'longert', 'to', 'pay', 'offy', 'your', 'balance.', 'Fore', 'example:', 'no', 'pay', 'offt', 'Andyouwille', 'endup', 'abdcwemders', 'charges', 'asovealle', 'shown', 'diteia', 'payinga', 'ane', 'estimated', 'usingt', 'this', 
'card', 'and', 'statementi', 'inabout...', 'total', 'of..', 'each', 'monthy', 'youp', 'pay..', 'Paid', '46000', 'ou', '025124', 'Onlyther', 'minimum', '15years', '$12,128', 'payment', 'incloding', 'previous', 'balance', '$189', '3years', '$6,817', '(Savings=$5,311)', 'Ifyouv', 'wouldi', 'likei', 'informationa', 'about', 'credit', 'counselings', 'services,', 'call', '1-866-797-2885.', 'ACCOUNT', 'SUMMARY', 'Account', 'Number:', '43423', '3780', '10507', '7320', 'Previous', 'Balance', '$4,233.99', 'Payment,', 'Credits', '-$4,233.99', 'Purchases', '+$5,084.29', 'Cash', 'Advances', '$0.00', 'Balance', 'Transfers', '$0.00', 'Fees', 'Charged', '$0.00', 'Interest', 'Charged', '$0.00', 'New', 'Balance', '$5,084.29', 'Opening/Closing', 'Date', '01/04/24', '02/03/24', 'Credit', 'Access', 'Line', '$31,700', 'Available', 'Credit', '$26,615', 'Cash', 'Access', 'Line', '$1,585', 'Availablef', 'for', 'Cash', '$1,585', 'Past', 'Due', 'Amount', '$0.00', 'Balance', 'overt', 'the', 'Credit', 'Access', 'Line', '$0.00', 'YOUR', 'ACCOUNT', 'MESSAGES', 'Reminder:', 'Itisir', 'important', 'to', 'continue', 'making', 'your', 'payments', 'ont', 'time.', 'Your', 'APRS', 'mayi', 'increasei', 'ifther', 'minimump', 'paymentis', 'notr', 'made', 'on', 'time', 'orp', 'payments', 'are', 'returned.', 'Your', 'next', 'AutoPay', 'payment', 'for', '$5,084.29willbe', 'deductedi', 'fromy', 'your', 'From', 'account', 'and', 'creditedo', 'on', 'your', 'due', 'date.', 'Ifyour', 'due', 'datef', 'falls', 'on', 'as', 'Saturday,', "we'll", 'credity', 'your', 'payment', 'Mhor', 'Fridayt', 'before.', '0000001', 'FIS33339D12', '24/02/03', 'Page1of3', '06610', 'MAMA', '34942', '0341000012000344201', 'freedom', '43323770106074170000500000508429000000002', 'P.O.', 'BOX1', '15123', 'AUTOPAY', 'IS', 'ON', 'Payment', 'Due', 'Date:', '02/28/24', 'WILMINGTON,', 'DE', '19850-5123', 'See', 'Your/', 'Account', 'Newl', 'Balance:', 'For', 'Undeliverable', 'Mail', 'Only', 'Messagest', 'for', 'details.', 'Minimuml', 'Payment', 
'Due:', '$5,084.29', '$50.00', 'Account', 'number:', '43423', '37801', '10507', '7320', '$', 'Amount', 'Enclosed', '34942BEX9', '03424D', 'AUTOPAYI', 'IS', 'ON', 'PAGE', 'Y0155605', 'Eammss', 'SERVICE', '6294', 'STREAM', 'IL', '60197-6294', 'COENMRETIN', '1:5000', 'L6O', '28::3', '23', '70', 'LOGO', '74177']

olmOCR

['**CHASE FREEDOM: ULTIMATE REWARDS(r) SUMMARY**\n\n**February 2024**\n\nNew Balance: $5,084.29\n\nMinimum Payment Due: $50.00\n\nDue Date: 02/28/24\n\n**Late Payment Warning:** If we do not receive your minimum payment by the date listed above, you may have to pay a late fee in addition to your regular interest rate. Your APR will be subject to increase to a maximum Penalty APR of 29.99%.\n\n**Minimum Payment Warning:** If you make only the minimum payment, you could end up paying much more than you expect because it will take you longer to pay off your balance. For example:\n\n- If you make only the minimum payment using this card and don't pay it off, you'll pay:\n  - 15 years: $12,128\n  - 3 years: $5,617 (Savings: $6,511)\n\nIf you would like information about credit counseling services, call 1-800-350-2227.\n\n**ACCOUNT SUMMARY**\n\nAccount Number: 0432 3780 1050 7320\n\nPrevious Balance: $4,233.99\n\nPayment Credits: -$4,233.99\n\nCash Advances: -$5,084.29\n\nCash Advances: $0.00\n\nBalance Transfers: $0.00\n\nInterest Charged: $0.00\n\nNew Balance: $5,084.29\n\n**YOUR ACCOUNT MESSAGES**\n\nReminder: It is important to continue making your payments on time. Your APRs may increase if the minimum payment is not made on time or payments are returned.\n\nYou can AutoPay by calling for $50.00 to be deducted from your Pay From account and directed to your Chase card. If your AutoPay fails one day, they will recall your payment the next day.\n\n**AUTOPAY IS ON**\n\nPayment Due Date: 02/28/24\n\nNew Balance: $5,084.29\n\nMinimum Payment Due: $50.00\n\nAccount number: 0432 3780 1050 7320\n\nAmount Enclosed: $0\n\nAUTOPAY IS ON\n\nCardmember Service\nP.O. Box 16324\nCarmel, IN 46032-6324\n\n**5000 460 2813 23 70 1650 7147?**']

Qwen2.5-VL OCR

\n# Chase Freedom Ultimate Rewards Summary\n\n## February 2024\n\n| Date       | S | M | T | W | T | F | S |\n|------------|---|---|---|----|----|----|----|\n| 1          |   |   |   |    |    |    |    |\n| 2          |   |   |   |    |    |    |    |\n| 3          |   |   |   |    |    |    |    |\n| 4          |   |   |   |    |    |    |    |\n| 5          |   |   |   |    |    |    |    |\n| 6          |   |   |   |    |    |    |    |\n| 7          |   |   |   |    |    |    |    |\n| 8          |   |   |   |    |    |    |    |\n| 9          |   |   |   |    |    |    |    |\n| 10         |   |   |   |    |    |    |    |\n| 11         |   |   |   |    |    |    |    |\n| 12         |   |   |   |    |    |    |    |\n| 13         |   |   |   |    |    |    |    |\n| 14         |   |   |   |    |    |    |    |\n| 15         |   |   |   |    |    |    |    |\n| 16         |   |   |   |    |    |    |    |\n| 17         |   |   |   |    |    |    |    |\n| 18         |   |   |   |    |    |    |    |\n| 19         |   |   |   |    |    |    |    |\n| 20         |   |   |   |    |    |    |    |\n| 21         |   |   |   |    |    |    |    |\n| 22         |   |   |   |    |    |    |    |\n| 23         |   |   |   |    |    |    |    |\n| 24         |   |   |   |    |    |    |    |\n| 25         |   |   |   |    |    |    |    |\n| 26         |   |   |   |    |    |    |    |\n| 27         |   |   |   |    |    |    |    |\n| 28         |   |   |   |    |    |    |    |\n| 29         |   |   |   |    |    |    |    |\n| 30         |   |   |   |    |    |    |    |\n| 1          |   |   |   |    |    |    |    |\n\n**New Balance:** $5,084.29  \n**Minimum Payment Due:** $50.00  \n**Payment Due Date:** 02/28/24  \n\n## Total Points Available for Redemption: 45,553\n\n### Late Payment Warning:\nIf you do not receive your minimum payment by the date listed above, you may have to pay a late fee of up to $25. 
Your APR will be subject to increase to a maximum Penalty APR of 29.99%.\n\n### Minimum Payment Warning:\nIf you make only the minimum payment, it could take you longer to pay off your balance. For example:\n\n- If you make no payments at all using this card and only pay the minimum payment, you will pay off the balance in about 15 years and end up with an additional total of $12,128.\n- If you make only the minimum payment, you will pay off the balance in about 3 years and end up with an additional total of $5,617 (Saving $6,511).\n\nIf you would like information about credit counseling services, call 1-866-757-2885.\n\n## Account Summary\n\n- **Previous Balance:** $4,233.99\n- **Payment, Credits:** -$4,233.99\n- **Cash Advances:** +$5,084.29\n- **Cash Advances:** $0.00\n- **Balance Transfers:** $0.00\n- **Interest Charged:** $0.0']

Receipt – Scanned and Poorly Aligned

Tests OCR resilience on low-quality scans, skewed text, and imperfect document alignment.


EasyOCR

Processing page 1... 
Page 1 OCR Results: 
['Fioon) sh Grand ecr 9 (Lott ecd 26-10 UxIT KARETCITY 33440l#aecdizb WATX RofD FHOEMI VElacERY 057 16286 #X442 Ove#NA] - velaotery 76 DLCe] Cr8td [micl F21zze 724-c58 = Hetar-2024( 15*41} Invoice kenon cate IamI LNNn( %1 Casnier Yalue Sply @acrtotion Diech peca piace 41thc9 W Qate Ravnor Cole 4801/010 E Pobl 829 . Threc Keon 034535 490ioio 0773 . 1298 1728 Total 120 Tot : Gcand pnmx Paytweic 5831 Sreh C6st 0.Qu Taweale 0.t0 0.W 1288 - 4228 "Kefind Tot " Exchanat Tar Logic Soft Softrerd -Ioiceott _']

Surya OCR

Processing page 1...
Page 1 OCR Results:
RESISTANCE CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF C
un oun star<br>181 Bark Star<br>1911 Bark Star<br>1911 I Wit Na Star<br>Unterna waarterna
OM BOOK SHOP
PHOENIX WARKETCITY
нтилики маниски (UTY<br>65 Tin _ 35MC060990128<br>142 - 66716296<br>044 - 66716296<br>044 - 66716296
VELACHERN CHENNAL-BOUGAS
cherna Bosbooks.com
TAX INVOICE
. Un-car-usb-c1ccb<br>: 08-06:08:08:07-7024(15:41) 181
Involce No : 01-24-038-21226
PIBER OF SUPPLY : TANTI MODIGES
Date
Value
Cashier
BK (0544555, 450) 1010 Restances a first face
899.00
Code
un 1990 1990 1990 1990 1990 1990 1990 199
GST
N voonous 1 INA
1298.00
1298.00
10%
1298.00
Grand Total
2
Tot:
PAYIMETE NOOSCONDOCK
IGST
SEST
0.00
BST SUMMARY
CEST
0.00
0.00
Taxable
0.00
0.00
GST
1298.00
0.00
0%
1298.00
No Exchangel No Retund<br>Thank You
Tot:
REPRESENTATION CONTRACTOR

docTR OCR

['*', 'Floor)', 'SHOP', 'pvt', 'Ltd', 'Ground', 'OM', 'BOOK', 'Book', '-6-10', 'Shop', '(Lower', 'OBI', 'UNIT', 'NO', 'MANCIETTE', 'ROAD', '-', 'MAIN', 'SEanEe', 'PHOENK', 'Tin', 'GS', 'VELADIERT', '142', '-', '-', 'BETIEMEN', 'Doetiane', '044', 'VALADERT', 'LOReIOn', 'Chennais', 'TAX', 'INVOICE', 'T#1', ':', 'CBECATIEN', 'No', ':', 'EERLIOnT', 'Invoice', ':', 'kanden', 'Date', ':', 'Value', 'TIARLARORD', 'Cashier', 'Supply', 'Of', 'Desrinion', 'Disch', 'Rama', 'Place', 'HSN', 'with', '698.00', 'Code', 'Qty', 'Rate', 'Tamenpba', '0.00%', 'Probl', 'GST', '1821010', '688.00', '508.00', '1', 'INR', '0394355', 'The', 'hvoe-sent', 'BK', '0.00%6', '0%', 'A0010100', '599.00', '1298.00', 'BK', '032753', '1', 'INR', '0%', '12983.00', ':', '2', 'Total', '12983.00', 'Tot:', 'Grand', 'DORSRERUREL', 'IGST', 'PANTHEDC', 'SGST', '0.00', 'Summany', 'CGST', '0.00', 'GST', 'Tauabie', '0.00', '0.00', 'GST', '0.00', '1298300', '0.00', '0%', '1298.00', 'Refund', 'Tot:', 'No', 'Eechenvel', 'You', 'No', 'Thank', 'Soft', 'by', 'Logic', 'Sortwere', 'CERLION']

olmOCR

['**TAX INVOICE**\n\nInvoice No.: Ch-24-GSR-21228  \nDate: 08-December-2024 (5:41)  \nCashier: Kandan  \nPlace Of Supply: Tamilnadu (03)\n\n| Code | HSN | Description | Rate | Quantity | Value |\n|------|-----|-------------|------|----------|-------|\n| BK 0949955 | 48011010 | Reversible With Base | 698.00 | 0.00% | 698.00 |\n| BK 0327753 | 49011010 | The Three-Body Problem | 598.00 | 0.00% | 598.00 |\n\nTotal: 1296.00\n\nGrand Total: 1296.00\n\nPAYMENT: XXXXXXXXXX\n\nGST Summary:\n\n| GST Taxable | CGST | SGST | IGST |\n|-------------|------|------|------|\n| 0% | 1296.00 | 0.00 | 0.00 | 0.00 |\n\nTotal: 1296.00\n\nNo Exchange/No Refund\n\nThank You\n\nSoftware by LogicSoft\nwww.logicsoft.co.in']

Qwen2.5-VL OCR

['\n**CM BOOK SHOP**\n**OM Book Shop Pvt Ltd**\n**UNIT NO: 10-10 (Lower Ground Floor)**\n**PHOENIX MARKETCITY**\n**GS TIN : 33ABAN090217B**\n**142, VELACHERY MAIN ROAD**\n**VELACHERY, CHENNAI - 600042**\[n**chennai@ombooks.com](mailto:n**chennai@ombooks.com)**\n\n**TAX INVOICE**\n\n**Invoice No:** Ch-24-GSR-21226\n**Date:** 08-December-2024 (5:41) TIR\n**Cashier:** Kandan\n\n**Place Of Supply:** TAMILNADU (03)\n\n| Code | HSN | Description | Qty | Rate | Disc | Value |\n|------|-----|-------------|-----|------|------|-------|\n| BK 0949955 | 49011010 | Rendezvous With Rama | 1 | INR 698.00 | 0.00% | 698.00 |\n| BK 0927753 | 49011010 | The Three-Body Problem | 1 | INR 598.00 | 0.00% | 598.00 |\n\n**Tot:** 2\n**Grand Total:** 1296.00\n\n**PAYMENTDC:** XXXXXXXXXX\n\n**GST Summary**\n| GST Taxable | CGST | SGST | IGST |\n|--------------|------|------|------|\n| 1296.00      | 0.00 | 0.00 | 0.00 |\n\n**Tot:** 1296.00\n**No Exchange/ No Refund**\n**Thank You**\n\n**Software by Logic Soft**\n**[www.logisoft.co.in**\n```](http://www.logisoft.co.in%2A%2A/n%60%60%60)']

Handwritten Document

Evaluates recognition of cursive and handwritten text, one of the toughest OCR use cases.


EasyOCR

Processing page 1... 
Page 1 OCR Results: 
['Ewdizoo', 'Only', 'matker', 'syle', 'For eclucalioncl purposes 062 anclyse #he enins Pges Pe9: arfi ci Jha} \'Peare The American Mathemctica Monthly, Volume {02 Number February 1995 We nave added line numbers #ne right margin line Since 4his article squares don\'F gel alter nating co lours cou la be argued #hat the +erm chessboard misplaced, line 4: The introdluction o? the name "B seems unnece sscry usre ~in +he combinalion the board #he tex} 6r figure and Ilne boin cascs Jus} the bocrd ould have done Rne In line 77 occurs tne Ics} use U, "XcB which du bious since B wGs board and not set ; Iine would have Preferred Siven set e8 ce Ils Iine 7/8 The frst move being move like ans other does not deserve separate discrip Tion - The term \'step redundant line 3 : why no} move consists of" line {0 /11 At This stage 4he iFalics ere puzzling , since m ove Poss; ble 18'] 

Processing page 2... 
Page 2 OCR Results: 
['ELD1200 -1', '{r Som0 \'j cell (4) contains pebble and cells (inj) and "Cijt\') ure empty line 10 Tvice the term pesi Hi on s whct eveyywhere else called cells line 12 : Why not After moves the board ncs T+ pebbles Iine 12 /14 In the one sentence counts moves in the tner counts Pels bles Since #he prose does not indicote the scope 8 dummies Inis double use tne same Iitfle bil unfergivable line 14 ; and we set Tr := Uxz` RCk) Ve remark the use ~8 +he Jerb 440 se} when deRning (+he set can considered cinfkrtunate since 7ot used On 4ne nex F two peges, #he name seems 40 be introdu ced Jo0 esrly #he inioauelion ~8 the name \'R seems nnceess- ari in #he rest op +he Paper 5a ) used once any Anere reccnable con Igure Tion LoL have dune Nele Tn +ne contert ques Hicm 116 _ +ne reacnable cantex} can remain ano- mnous the quoted ccurrente \'33 tre Iy occurrence tn2 idenhi Rec thar context Ns Conclusisn Inat 4he rea chable con- \'fgureF Jn ncs been']

Surya OCR

Processing page 1...
Page 1 OCR Results:
EWD1200 -0
Only a matter of style?
For educational purposes we analyse the
opening pages of an 11-page article that
appeared in The American Mathematical
Monthly, Volume 102 Number 2/February 1995.
We have added line numbers in the right
margin.
line 4: Since in this article, squares don't get
alternating colours, it could be argued that
the term "chessboard" is misplaced.
line 4. The introduction of the name "B"
seems unnecessary: it is used -in the
combination "the board B" - in the text
for Figure 1 and in line 71; in both cases
just "the board" would have done fine.
In line 77 occurs the last use of B,
viz. in "XCB", which is dubious since
B was a board and not a set; in line
77. I would have preferred "Given a set X
of cells ".
line 7/8 : The first move, being a move<br>like any other, does not deserve a separate
discription. The term "step" is redundant.
line 8: Why not "a move consists of "?
line 10/11: At this stage the italics are
puzzling, since a move is possible if,

Processing page 2...
Page 2 OCR Results:
EWD 1200-1
för some i,j , cell (i,j) contains a pebble<br>and cells (i+1,j) and (i,j+1) are empty.
line 10: Twice the term "positions" for
what everywhere else is called "cells".
line 12: Why not "After k moves the
board has k+1 pebbles on it."?
line 12/14: In the one sentence, k counts
moves, in the other k counts pebbles.<br>Since the prose does not indicate the<br>scope of dummies, this double use of<br>the same k is a little bit unforgivable.
line 14: "and we set <math>R := \bigcup_{k>1} R(k)</math> ". We
remark
· the use of the verb "to set" when defining
(the set!) <math display="inline">{\mathbb R}</math> can be considered unfortunate <math display="inline">\bullet</math> since <math display="inline">{\mathbb R}^n</math> is not used on the next two
pages, the name seems to be introduced
too early
· the introduction of the name R seems
unnecessary; in the rest of the paper I<br>saw it used once in "any C e R", where<br>"any reachable configuration" would have
done. (Note. In the context in question
-p 116- the reachable context can remain
anonymous: the quoted occurrence of<br>C is the only occurrence of the identi-<br>fier C in that context. My conclusion is<br>that the reachable configuration has been

docTR OCR

['EWDI200', '-0', 'Only', 'a', 'matter', 'of', 'style', '?', 'For', 'educational', 'purposes', 'we', 'analyse', 'the', 'opening', 'pages', 'ofan', 'I-page', 'article', 'that', 'appeared', 'in', 'The', 'American', 'Mathematical', 'Monthly,', 'Volume', '102', 'Number', '2/1', 'February', '1995.', 'We', 'have', 'added', 'line', 'numbers', 'in', 'the', 'right', 'margin.', 'line', '4:', 'Since', 'in', 'this', 'article,', 'squares', "don't", 'get', 'alternating', 'colours,', 'it', 'could', 'be', 'argued', 'that', 'the', 'term', '11', "chessboard'", 'is', 'misplaced.', 'line', '4:', 'The', 'introduction', 'of', 'the', 'name', '"B"', 'seems', 'unneces', 'ssary:', 'it', 'is', 'used', '-in', 'the', 'combination', 'I', 'the', 'board', 'B"-', 'in', 'the', 'text', 'for', 'tigure', '1', 'and', 'in', 'line', '71', 'i', 'in', 'both', 'cases', 'just', 'K', 'the', 'board"', 'would', 'have', 'done', 'fine.', 'In', 'line', '77', 'occurs', 'the', 'last', 'use', 'op', 'B,', 'vi2.', 'in', '"XcB",', 'which', 'is', 'dubious', 'since', 'B', 'was', 'C', 'board', 'and', 'not', 'a', 'set;', 'in', 'line', '77,', 'I', 'would', 'have', 'preferred', '11', 'Given', 'a', 'set', 'X', 'of', 'cells', 'I1', 'line', '7/8:', 'The', 'first', 'move', '7', 'being', 'a', 'move', 'like', 'any', 'other,', 'does', 'not', 'deserve', 'a', 'separate', 'discription.', 'The', 'term', '"step\'', 'is', 'redundant.', 'line', '8:', 'Why', 'not', 'I1', 'a', 'move', 'consists', 'of"?', 'line', '10/11:', 'At', 'this', 'stage', 'the', 'italics', 'are', 'puzzling,', 'Since', 'C', 'move', 'is', 'possible', 'F,', 'EWD', '12', '00-1', 'for', 'Some', 'csj,', 'cell', 'Cij', 'contains', 'a', 'pebble', 'and', 'cells', '(inj)', 'and', 'Ki,j+1)', 'Cs', 're', 'empty.', 'line', '10:', 'Twice', 'the', 'term', 'positions"', 'for', 'what', 'everywhere', 'else', 'is', 'called', '11', '\'cells".', 'line', '12:', 'Why', 'not', 'n', 'Afer', 'k', 'moves', 'the', 'board', 'has', 'k+l', 'pebbles', 'on', 'it.', "'1", '?', 'line', '12/14:', 'In', 
'the', 'one', 'sentence,', 'k', 'counts', 'moves', '?', 'in', 'the', 'other', 'k', 'counts', 'pebbles.', 'Since', 'the', 'prose', 'does', 'not', 'indicate', 'the', 'scope', 'of', 'dummies,', 'this', 'double', 'use', 'of', 'the', 'Same', 'k', 'is', 'a', 'litHle', 'bit', 'unforgivable.', 'line', '14:', '11', 'and', 'we', 'set', 'R:=', 'Uxai', 'RCk)', '11I', 'We', 'remark', 'the', 'use', 'of', 'the', 'verb', '"to', 'set', '11', 'when', 'defining', '(the', 'set!)', 'R', 'can', 'be', 'considered', 'unfortunate', 'Since', 'R"', 'is', 'not', 'used', 'on', 'the', 'next', 'two', 'pages,', 'the', 'nai', 'me', 'seems', 'to', 'be', 'introduced', 'too', 'early', 'the', 'intro', 'duction', 'of', 'the', 'name', 'R', 'Seems', 'i', 'n', 'necessaryi', 'in', 'the', 'rest', 'of', 'the', 'paper', 'I', 'saw', 'it', 'used', 'once', 'in', 'I', 'any', 'CER"', 'where', 'H', 'ary', 'reachable', 'C', 'omiiguration', 'would', 'have', 'dune.', '(Note.', 'Tn', 'the', 'context', 'in', 'question', '-P', '116-', 'the', 'reachable', 'context', 'can', 'remain', "anony'", 'mous:', ':', 'the', 'quoted', 'occurrens', 'ce', 'of', 'C', 'is', 'the', 'only', 'occurrence', 'of', 'the', 'identin', 'fier', 'C', 'in', 'that', 'context.', 'My', 'Conclusisn', 'is', 'that', 'the', 'reachable', 'configuration', 'has', 'been']

olmOCR

['Only a matter of style?\n\nFor educational purposes we analyse the opening pages of an 11-page article that appeared in The American Mathematical Monthly, Volume 102 Number 2/February 1995. We have added line numbers in the right margin.\n\nline 4: Since in this article, squares don\'t get alternating colours, it could be argued that the term "chessboard" is misplaced.\n\nline 4: The introduction of the name "B" seems unnecessary: it is used -in the combination "the board B"- in the text for Figure 1 and in line 71; in both cases just "the board" would have done fine. In line 77 occurs the last use of B, viz. in "X c B", which is dubious since B was a board and not a set; in line 77, I would have preferred "Given a set X of cells".\n\nline 7/8: The first move, being a move like any other, does not deserve a separate description. The term "step" is redundant.\n\nline 8: Why not "a move consists of"?\n\nline 10/11: At this stage the italics are puzzling, since a move is possible if,'] 

['for some \\( c_{ij} \\), cell \\((c_{ij})\\) contains a pebble and cells \\((i+1,j)\\) and \\((i,j+1)\\) are empty.\n\nline 10: Twice the term "positions" for what everywhere else is called "cells".\n\nline 12: Why not "After k moves the board has \\( k+1 \\) pebbles on it."?\n\nline 12/14: In the one sentence, \\( k \\) counts moves, in the other \\( k \\) counts pebbles. Since the prose does not indicate the scope of dummies, this double use of the same \\( k \\) is a little bit unforgivable.\n\nline 14: "and we set \\( R := \\bigcup_{k \\geq 1} R(k) \\)". We remark\n\n- the use of the verb "to set" when defining (the set!) \\( R \\) can be considered unfortunate\n- since "R" is not used on the next two pages, the name seems to be introduced too early\n- the introduction of the name \\( R \\) seems unnecessary; in the rest of the paper I saw it used once in "any \\( C \\in R \\)", where "any reachable configuration" would have done. (Note. In the context in question - p116 - the reachable context can remain anonymous: the quoted occurrence of \\( C \\) is the only occurrence of the identifier \\( C \\) in that context. My conclusion is that the reachable configuration has been']

Qwen2.5-VL OCR

['# Only a matter of style?\n\nFor educational purposes we analyse the opening pages of an 11-page article that appeared in The American Mathematical Monthly, Volume 102 Number 2/Febuary 1995. We have added line numbers in the right margin.\n\n- **Line 4:** Since in this article, squares don\'t get alternating colours, it could be argued that the term "chessboard" is misplaced.\n- **Line 4:** The introduction of the name "B" seems unnecessary: it is used -in the combination "the board B"- in the text for Figure 1 and in line 71; in both cases just "the board" would have done fine.\n- **Line 77:** In line 77 occurs the last use of B, viz. in "X ∈ B", which is dubious since B was a board and not a set; in line 77, I would have preferred "Given a set X of cells".\n- **Line 7/8:** The first move, being a move like any other, does not deserve a separate description. The term "step" is redundant.\n- **Line 8:** Why not "a move consists of"?\n- **Line 10/11:** At this stage the italics are puzzling, since a move is possible if,\n\n'] ['EWD1200-1\n\nFor some \\(i, j\\), cell \\((i, j)\\) contains a pebble and cells \\((i+1, j)\\) and \\((i, j+1)\\) are empty.\n\nline 10: Twice the term "positions" for what everywhere else is called "cells".\n\nline 12: Why not "After k moves the board has \\(k+1\\) pebbles on it."?\n\nline 12/14: In the one sentence, k counts moves, in the other k counts pebbles. Since the prose does not indicate the scope of dummies, this double use of the same k is a little bit unforgivable.\n\nline 14: "and we set R := \\(\\bigcup_{k \\geq 1} R(k)\\)". We remark\n\n- the use of the verb "to set" when defining (the set!) R can be considered unfortunate\n- since "R" is not used on the next two pages, the name seems to be introduced too early\n- the introduction of the name R seems unnecessary; in the rest of the paper I saw it used once in "any C ∈ R", where "any reachable configuration" would have done. (Note. 
In the context in question - p 116 - the reachable context can remain anonymous: the quoted occurrence of C is the only occurrence of the identifier C in that context. My conclusion is that the reachable configuration has been']

Best Open-Source OCR: Key Insights

The results show clear differences between traditional OCR engines (e.g., EasyOCR) and modern LLM-driven systems (e.g., Qwen2.5-VL).
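One way to put a number on these differences is character error rate (CER): the edit distance between an engine's output and a trusted transcription, divided by the length of that transcription. The sketch below is illustrative only. It assumes the title line of the handwritten sample reads "Only a matter of style?" (the reading Surya, olmOCR, and Qwen2.5-VL agree on), and the function names are our own rather than part of any OCR library.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute (or keep) the character
            ))
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate of `hypothesis` measured against `reference`."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Assumed ground truth for the first line of the handwritten sample.
reference = "Only a matter of style?"
# Corresponding fragments taken from the engine outputs shown above.
easyocr_text = "Only matker syle"
surya_text = "Only a matter of style?"

print(f"EasyOCR CER: {cer(reference, easyocr_text):.2f}")
print(f"Surya CER:   {cer(reference, surya_text):.2f}")
```

A lower CER is better, and 0.0 means an exact match. For a real benchmark you would compare full pages against a hand-checked transcription rather than a single line.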

Observed Strengths & Weaknesses

Traditional OCR Engines (EasyOCR, Surya, docTR) are lightweight, fast, and well-suited for clean, digital documents with simple layouts. However, they generally struggle with layout-heavy inputs, handwritten text, and low-quality scans: on the receipt sample, all three returned fragmented or garbled text.
LLM-Enhanced Systems (olmOCR, Qwen2.5-VL) excel at reconstructing complex layouts, preserving semantic structure, and interpreting handwritten or noisy documents. Their primary drawbacks are higher computational requirements and the risk of hallucination, that is, fluent and plausible-looking text that does not actually appear in the source document.
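A simple guard against hallucinated values is to cross-check the numeric fields in an LLM transcription against a second engine's output and flag anything the second engine never produced. The sketch below is one possible approach, not a feature of any of the tools tested. It uses two fragments from the bank statement results above, where olmOCR and Surya report different credit counseling phone numbers; the regular expression and function names are our own assumptions.

```python
import re

# One or more digits, optionally joined by ',', '.', '/', or '-'
# (covers phone numbers, amounts, and dates).
NUMERIC_TOKEN = re.compile(r"\d[\d,./-]*\d|\d")


def numeric_tokens(text: str) -> set[str]:
    """All number-like tokens found in `text`."""
    return set(NUMERIC_TOKEN.findall(text))


def unsupported_numbers(llm_text: str, baseline_text: str) -> list[str]:
    """Numeric tokens in the LLM output that the baseline engine never produced."""
    return sorted(numeric_tokens(llm_text) - numeric_tokens(baseline_text))


# Fragment of the olmOCR output for the bank statement.
llm_text = (
    "If you would like information about credit counseling services, "
    "call 1-800-350-2227. New Balance: $5,084.29"
)
# The same region as read by Surya.
baseline_text = (
    "If you would like information about credit counseling services, call "
    "1-866-797-2885 New Balance: $5,084.29"
)

print(unsupported_numbers(llm_text, baseline_text))  # ['1-800-350-2227']
```

A flagged value only signals disagreement; it does not tell you which engine is right. In the same sample, Surya reads the AutoPay amount as $5,081,29, so the baseline can be the one at fault. Flagged fields are candidates for human review or a third reading.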

Notable Trends

A clear trade-off exists between speed/efficiency (traditional OCR) and accuracy/semantic richness (LLM-based OCR).
Layout preservation emerges as the biggest differentiator: traditional engines flatten documents, while LLM-based approaches maintain sections, headers, and structured tables.
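Handwriting recognition remains a weak spot for most traditional tools: on the handwritten sample, EasyOCR's output is largely unreadable and docTR's is word-level and noisy. Surya is the exception, producing a near-complete transcription. The LLM-enhanced solutions were consistently strong here.
Structured output formats (e.g., schema-aware tables) are becoming standard in newer tools, highlighting the shift from plain-text OCR toward document intelligence.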
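Structured output pays off because downstream code can consume it directly. As a rough illustration, the sketch below parses the line-item table from the Qwen2.5-VL receipt output into dictionaries and checks that the item values add up to the printed grand total of 1296.00. The parser is a minimal one written for this example and handles only simple pipe-delimited Markdown tables.

```python
from decimal import Decimal


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Parse a simple pipe-delimited Markdown table into one dict per row."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        # Skip the header separator row, e.g. |------|-----|
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    # Drop rows whose cell count does not match the header.
    return [dict(zip(header, row)) for row in body if len(row) == len(header)]


# Line-item table copied from the Qwen2.5-VL receipt output above.
receipt_table = """
| Code | HSN | Description | Qty | Rate | Disc | Value |
|------|-----|-------------|-----|------|------|-------|
| BK 0949955 | 49011010 | Rendezvous With Rama | 1 | INR 698.00 | 0.00% | 698.00 |
| BK 0927753 | 49011010 | The Three-Body Problem | 1 | INR 598.00 | 0.00% | 598.00 |
"""

items = parse_markdown_table(receipt_table)
total = sum(Decimal(item["Value"]) for item in items)

for item in items:
    print(item["Description"], item["Value"])
print("Items sum to grand total:", total == Decimal("1296.00"))
```

The row-length check matters in practice: in the olmOCR receipt output, the GST summary row carries five cells under four headers, and a strict parser surfaces that mismatch instead of silently shifting columns.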

When Open-Source OCR Isn’t Enough

Open-source OCR tools are excellent for experimentation, prototyping, and smaller-scale projects. They provide developers with flexibility, researchers with a testbed for innovation, and organizations with a cost-effective way to start digitizing their documents. For many use cases, such as extracting tables from PDFs or automating simple form data entry, these solutions can work remarkably well.

At the same time, open-source OCR has matured significantly, with both traditional engines and modern LLM-based approaches offering practical options. Tools like Tesseract, PaddleOCR, and Docling remain strong choices for structured, predictable documents, while MistralOCR and other LLM-driven solutions shine when dealing with irregular layouts or complex content.

The trade-off is clear:

Traditional OCR tools deliver speed, stability, and simplicity.
LLM-based OCR tools offer flexibility and context awareness, but may struggle with consistency and efficiency at scale.

Looking ahead, the future of open-source OCR likely lies in hybrid pipelines that combine the efficiency of traditional engines with the adaptability of AI-driven methods. Such pipelines should soften the trade-offs and strike a better balance between accuracy, speed, and scalability.
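One plausible shape for such a hybrid is a confidence-gated fallback: run the fast engine first and call the LLM-based engine only when the fast result looks unreliable. The sketch below is an assumption-laden illustration. The engine callables, the stub confidence values, and the 0.8 threshold are all hypothetical placeholders, so you would wrap your real engines behind the same two signatures and tune the threshold on your own documents.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class OcrResult:
    text: str
    engine: str  # which path produced the text: "fast" or "llm"


# Hypothetical interfaces: the fast engine yields (word, confidence) pairs,
# the LLM engine yields a finished transcription.
FastEngine = Callable[[bytes], List[Tuple[str, float]]]
LlmEngine = Callable[[bytes], str]


def hybrid_ocr(
    image: bytes,
    fast: FastEngine,
    llm: LlmEngine,
    min_confidence: float = 0.8,  # arbitrary illustrative threshold
) -> OcrResult:
    """Use the fast engine when its mean confidence is high enough,
    otherwise fall back to the LLM engine."""
    words = fast(image)
    if words:
        mean_confidence = sum(conf for _, conf in words) / len(words)
        if mean_confidence >= min_confidence:
            return OcrResult(" ".join(word for word, _ in words), "fast")
    return OcrResult(llm(image), "llm")


# Stub engines standing in for real ones; the confidences are made up.
def clean_scan_engine(image: bytes) -> List[Tuple[str, float]]:
    return [("TAX", 0.97), ("INVOICE", 0.95)]


def noisy_scan_engine(image: bytes) -> List[Tuple[str, float]]:
    return [("Tot", 0.41), ("Gcand", 0.22)]


def llm_engine(image: bytes) -> str:
    return "**TAX INVOICE**"


print(hybrid_ocr(b"", clean_scan_engine, llm_engine))
print(hybrid_ocr(b"", noisy_scan_engine, llm_engine))
```

The design keeps GPU-heavy inference off the common path, so clean documents are handled at traditional-engine speed and only difficult pages pay the LLM cost.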

However, for enterprise-scale operations, the limitations of open-source OCR become more evident. High document volumes, compliance-heavy industries, and mission-critical workloads demand guaranteed reliability, uptime, and dedicated support, which are areas where community-driven projects may fall short.

That’s why many organizations turn to LLMWhisperer as the next step. It builds on the strengths of open-source OCR while providing a scalable, production-ready pipeline designed for enterprise needs. With managed infrastructure, compliance features, and the ability to process millions of documents efficiently, LLMWhisperer transforms OCR from a helpful tool into a mission-critical capability.

For teams exploring open-source OCR today, the path is clear: start with open source, experiment, learn, and when the time comes to scale, move forward with solutions like LLMWhisperer.


Best Open-Source OCR 2025: FAQ

Which open source OCR tools does the comparison highlight as the strongest performers in 2025?

The article lists several open source OCR tools—Tesseract, PaddleOCR, EasyOCR, Surya, and docTR on the traditional side, plus olmOCR and Qwen2.5-VL on the LLM side. Each shines in a different area: for instance, PaddleOCR handles complex layouts well, while olmOCR preserves tables and markdown structure via multimodal LLMs.

How do the newest AI open source OCR models differ from older engines?

Traditional engines focus on pixel-level character extraction, whereas the latest open source OCR models (like Qwen2.5-VL or olmOCR) integrate vision-language LLMs. These models understand context, infer structure, and can even interpret handwriting, but they require more GPU resources and careful prompt design to avoid hallucinations.

When should a team move from a basic open source OCR solution to something more robust, like LLMWhisperer OCR?

According to the article, a basic open source OCR solution is great for prototypes or low-volume workloads. However, once you face millions of pages, strict compliance, or 24/7 uptime requirements, you’ll need managed infrastructure, stronger SLAs, and advanced post-processing—features that community projects seldom guarantee.

Is LLMWhisperer OCR better than traditional open source OCR models?

In contexts that require semantic understanding, layout preservation, and handling of irregular documents, yes. Traditional models are faster and more deterministic, but LLMWhisperer offers stronger accuracy and contextual interpretation on such documents, which makes it well suited to complex real-world use cases.

When should I consider an open source OCR solution over a commercial one?

An open source OCR solution is ideal for prototyping, research, or projects with limited budgets. It offers flexibility and customization. However, for enterprise-scale, high-volume, or compliance-critical applications, a commercial solution like LLMWhisperer OCR may be more suitable.


About Author

Nuno Bispo

Nuno Bispo is a Senior Software Engineer with more than 15 years of experience in software development. He has worked in various industries such as insurance, banking, and airlines, where he focused on building software using low-code platforms. Currently, Nuno works as an Integration Architect for a major multinational corporation. He has a degree in Computer Engineering.
