Optical Character Recognition (OCR) is the backbone of modern digital workflows. From scanning historical archives and automating data entry to enabling AI-powered document analysis, OCR transforms static images and PDFs into actionable, searchable text. As businesses and researchers continue to digitize processes, the demand for accurate and flexible OCR solutions is only growing.
Open-source OCR tools are not only free to use; they also offer transparency, community-driven improvements, and the flexibility to customize solutions for specific needs. They have become a powerful option for developers, researchers, and teams who want control over their technology stack without relying on proprietary platforms.
This guide is written for developers, researchers, small teams, and even enterprises looking to experiment with or adopt open-source OCR in their workflows. We’ll cover both traditional OCR engines and modern LLM-powered approaches, giving you a practical sense of their strengths and limitations.
What is Open-Source OCR?
Optical Character Recognition (OCR) is the process of converting text contained in images, scanned documents, or PDFs into machine-readable text. It’s the technology that makes it possible to digitize books, extract information from invoices, process forms automatically, or enable search across large document collections. In short, OCR bridges the gap between the physical and digital worlds.
When we talk about open-source OCR, we mean software that is freely available under open-source licenses. These tools are cost-free, transparent in how they work, and often highly customizable. Developers can adapt them to specific use cases, researchers can experiment with algorithms, and organizations can integrate them into their workflows without being locked into proprietary solutions.
That said, open-source OCR comes with its own set of challenges. Setup and configuration can be more complex compared to polished commercial tools. Maintenance often depends on community activity, which can vary widely across projects. And performance may differ significantly between tools, with some excelling at structured text while struggling with handwriting or complex document layouts.
How We Evaluated the Tools
To make this guide practical, we focused on how each OCR tool performs against real-world document challenges rather than running exhaustive benchmarks.
Our evaluation centered on three key criteria:
Tables – Can the tool correctly recognize and preserve tabular structures?
Forms – How well does it handle checkboxes, radio buttons, and handwritten entries commonly found in forms?
Complex layouts – Does it cope with multi-column text, mixed fonts, and non-standard document designs?
In terms of approach, we’re not re-testing every tool from scratch. For OCR engines we’ve already explored in a previous article, we will provide a summary of past results. For the others, we conducted fresh evaluations.
It’s worth noting that this is not intended as a comprehensive benchmark study. Instead, our goal is to deliver a practical guide that helps developers, researchers, and teams quickly understand which open-source OCR tools might fit their use cases and what trade-offs they should expect.
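Short of a full benchmark, a quick spot check can still make side-by-side comparisons less subjective. The sketch below is an illustration rather than part of the evaluation described above: it uses Python's standard difflib module to score how closely an OCR line matches a hand-typed reference. The function name line_similarity and the sample strings are assumptions made for this example.

```python
from difflib import SequenceMatcher


def line_similarity(ocr_line: str, reference_line: str) -> float:
    """Return a 0..1 similarity ratio between an OCR line and a reference line."""
    # Normalize case and whitespace so cosmetic differences are not penalized.
    a = " ".join(ocr_line.lower().split())
    b = " ".join(reference_line.lower().split())
    return SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    # Hypothetical reference line, typed by hand from the source document.
    reference = "Deductible, then 40%"
    for candidate in ["Deductible, then 40%", "Reductible then 40%", "$60 copay"]:
        print(f"{candidate!r}: {line_similarity(candidate, reference):.2f}")
```

A ratio near 1 means the line survived OCR almost intact; lower ratios point to lines worth inspecting by hand.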
Traditional Open-Source OCR Tools
Traditional open-source OCR engines form the base of many document digitization workflows. These tools have been battle-tested across countless projects, from academic research to enterprise automation, and remain some of the most reliable free options available. While they may lack the adaptability of newer LLM-based approaches, they excel in stability, performance, and community support.
In this section, we’ll first provide a brief recap of the tools we’ve already evaluated in detail, before moving on to highlight new additions worth exploring.
Previously Evaluated
Tesseract
Tesseract remains one of the most widely used and battle-tested OCR engines. Originally developed at Hewlett-Packard and later sponsored by Google, it has a long history and an active community. Its strength lies in handling clean, high-quality images, making it a dependable choice for straightforward OCR tasks. Tesseract supports more than 100 languages and even allows custom training for domain-specific vocabularies.
Strengths:
Solid accuracy with clean, well-structured documents.
Extensive language support.
Custom training options to improve recognition.
Weaknesses:
Struggles with complex layouts and formatting-heavy documents.
Poor performance on handwritten text.
PaddleOCR
Developed by Baidu, PaddleOCR has quickly gained traction as a robust open-source alternative for multilingual and layout-aware OCR. It performs significantly better than Tesseract when dealing with multi-language documents and complex layouts such as tables and mixed formatting. It’s also optimized for speed, making it a practical option for real-time applications.
Strengths:
Lightweight and fast, suitable for real-time use cases.
Weaknesses:
Structured data extraction is less advanced compared to enterprise cloud services.
Docling
IBM’s Docling is designed for converting documents into lightweight, markdown-like formats. It’s particularly effective when dealing with digital-born PDFs where the layout is relatively simple. Its strength lies in its ability to output clean, human-readable markdown, which makes it appealing for workflows that rely on lightweight text processing. However, its markdown-first design limits its ability to preserve complex layouts or extract structured data, and it struggles with scanned or handwritten inputs.
Weaknesses:
Struggles with non-digital inputs (scans, handwriting).
New Additions
EasyOCR
EasyOCR is one of the most widely used open-source OCR libraries, with support for 80+ languages. Built on PyTorch, it’s straightforward to install and use; developers can get started with just a few lines of Python. It works well for quick text extraction tasks and provides decent results on scanned documents and images with standard fonts.
Installation:
pip install easyocr pypdfium2
If on Windows, torch and torchvision must be installed first, as per PyTorch documentation.
Usage:
import os
import sys

import easyocr
import pypdfium2 as pdfium

# Create a reader object (multiple languages can be specified; gpu=True requests GPU execution)
reader = easyocr.Reader(['en'], gpu=True)

# Directory where the rendered page images are written
OUTPUT_DIR = "docs/output"

# Process the pdf file
def process_pdf(pdf_path):
    # Make sure the output directory exists before saving images
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)
    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        # scale=1 renders at 72 DPI; a higher scale may help with small text
        image = page.render(scale=1).to_pil()
        # Save image
        file_name = os.path.join(OUTPUT_DIR, f"page_{i+1}.png")
        image.save(file_name)
        # Read the text from the image
        print(f"Processing page {i+1}...")
        result = reader.readtext(file_name, detail=0, paragraph=True)
        # Print the result
        print(f"Page {i+1} OCR Results:")
        print(result)
    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])
This script demonstrates how to run EasyOCR on PDF files by first converting each page into an image with pypdfium2, then passing the images to EasyOCR for text extraction. It supports multiple languages (configurable at initialization), runs on GPU by default, and outputs recognized text per page.
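The script above asks for plain strings (detail=0). With detail=1 and paragraph=False, EasyOCR instead returns one (bounding box, text, confidence) entry per detected region, which makes it possible to discard low-confidence noise. The sketch below works on that result shape with hand-written sample data, so it runs without EasyOCR installed; keep_confident, the 0.5 threshold, and the sample values are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detail=1 entry: (four corner points, text, confidence).
Detection = Tuple[Sequence[Sequence[float]], str, float]


def keep_confident(detections: List[Detection], min_confidence: float = 0.5) -> List[str]:
    """Drop low-confidence regions and return the remaining text in a rough reading order."""
    kept = [d for d in detections if d[2] >= min_confidence]
    # Crude ordering by the top-left corner: y first (rows), then x (columns).
    kept.sort(key=lambda d: (d[0][0][1], d[0][0][0]))
    return [text for _, text, _ in kept]


if __name__ == "__main__":
    # Made-up detections in the shape readtext(..., detail=1) produces.
    sample = [
        ([[10, 50], [90, 50], [90, 70], [10, 70]], "Urgent care", 0.93),
        ([[10, 10], [120, 10], [120, 30], [10, 30]], "Office visits", 0.97),
        ([[200, 50], [260, 50], [260, 70], [200, 70]], "$60 copay", 0.88),
        ([[300, 50], [380, 50], [380, 70], [300, 70]], "Reductible", 0.21),
    ]
    print(keep_confident(sample))
```

Sorting on exact corner coordinates is deliberately simple; skewed scans would need row grouping with a tolerance.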
Surya
Surya is a modern, open-source OCR system designed for document layout analysis and advanced text extraction. It supports 90+ languages and is built to handle complex documents with tables, multi-column layouts, and varied formatting. Unlike simpler OCR libraries, Surya emphasizes structural understanding of documents, making it more useful for use cases where layout preservation is important.
Installation:
pip install surya-ocr pypdfium2
Usage:
import sys

import pypdfium2 as pdfium
from surya.detection import DetectionPredictor
from surya.foundation import FoundationPredictor
from surya.recognition import RecognitionPredictor

# Process the pdf file
def process_pdf(pdf_path):
    # Load the models once, instead of once per page
    foundation_predictor = FoundationPredictor()
    recognition_predictor = RecognitionPredictor(foundation_predictor)
    detection_predictor = DetectionPredictor()
    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)
    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()
        print(f"Processing page {i+1}...")
        # Detect text lines, then recognize the text in each line
        predictions = recognition_predictor([image], det_predictor=detection_predictor)
        print(f"Page {i+1} OCR Results:")
        for prediction in predictions:
            for text_line in prediction.text_lines:
                print(text_line.text)
    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])
This script shows how to apply Surya OCR to PDF documents by rendering each page into an image with pypdfium2 and then passing it to Surya's recognition predictor, which is built on the shared foundation model and uses the detection predictor to locate text. The pipeline first detects text regions and then recognizes the textual content line by line.
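Both scripts so far render pages with scale=1. In pypdfium2 the scale is the number of pixels per PDF point, and a point is 1/72 inch, so scale=1 corresponds to 72 DPI, which can be coarse for small print. A minimal sketch of the conversion follows; the helper names and the DPI targets are illustrative assumptions, not settings used in our tests.

```python
POINTS_PER_INCH = 72  # PDF user space: one point is 1/72 inch


def scale_for_dpi(dpi: float) -> float:
    """Convert a target DPI into a render scale factor."""
    return dpi / POINTS_PER_INCH


def rendered_size(width_pt: float, height_pt: float, scale: float) -> tuple:
    """Approximate pixel size of a page rendered at the given scale."""
    return round(width_pt * scale), round(height_pt * scale)


if __name__ == "__main__":
    letter = (612, 792)  # US Letter page size in points
    for dpi in (72, 150, 300):
        scale = scale_for_dpi(dpi)
        print(dpi, round(scale, 2), rendered_size(*letter, scale))
```

The resulting value would be passed as page.render(scale=...); larger images cost more memory and OCR time, so the right setting depends on the document.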
docTR
docTR (Document Text Recognition) is a deep-learning–based OCR library that integrates text detection and recognition into a single pipeline. Built on TensorFlow and PyTorch, it is designed for handling scanned documents, multi-column layouts, and mixed formatting, making it suitable for more complex document processing tasks than traditional OCR engines.
Installation:
pip install python-doctr
Usage:
import sys

from doctr.io import DocumentFile
from doctr.models import ocr_predictor

# Process the pdf file
def process_pdf(pdf_path):
    # Load a pretrained detection + recognition pipeline
    model = ocr_predictor(pretrained=True)
    # Read the PDF pages
    doc = DocumentFile.from_pdf(pdf_path)
    # Analyze
    result = model(doc)
    # Export the result as a nested dictionary
    json_output = result.export()
    # Walk pages -> blocks -> lines -> words and print each word
    for page in json_output['pages']:
        for block in page['blocks']:
            for line in block['lines']:
                for word in line['words']:
                    print(word['value'])

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])
This script demonstrates how to use docTR for PDF OCR by loading the file with DocumentFile, running it through a pretrained ocr_predictor, and exporting the recognized content as structured JSON. The pipeline captures text at multiple levels (blocks, lines, words), allowing granular access to document content. In this example, the recognized words are iterated and printed directly.
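Printing one word per line discards the line structure that the export already contains. As a rough sketch, the helper below rebuilds line-level text from a dictionary shaped like the one the script iterates over (pages, blocks, lines, words, value). The function name and the sample dictionary are illustrative assumptions; a real export carries additional fields, such as geometry and confidence, that are ignored here.

```python
from typing import Any, Dict, List


def lines_from_export(export: Dict[str, Any]) -> List[List[str]]:
    """Return, for each page, the text of every line with its words joined by spaces."""
    pages = []
    for page in export.get("pages", []):
        page_lines = []
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                words = [word["value"] for word in line.get("words", [])]
                if words:
                    page_lines.append(" ".join(words))
        pages.append(page_lines)
    return pages


if __name__ == "__main__":
    # Hand-written stand-in for result.export(); only the keys used above are present.
    sample = {
        "pages": [
            {"blocks": [{"lines": [
                {"words": [{"value": "Annual"}, {"value": "deductible"}]},
                {"words": [{"value": "$1,500"}]},
            ]}]}
        ]
    }
    for page_number, page_lines in enumerate(lines_from_export(sample), start=1):
        print(f"Page {page_number}: {page_lines}")
```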
Modern LLM-Based Open-Source OCR Tools
Large Language Models (LLMs) are beginning to reshape how OCR is performed. Unlike traditional OCR engines, which focus on character recognition, LLM-based tools can interpret context, adapt to irregular layouts, and even infer structure when documents don’t follow a standard pattern. These approaches are still new and come with trade-offs, such as higher resource demands and the risk of hallucinations, but they represent a significant step forward in OCR capabilities.
Previously Evaluated
MistralOCR
MistralOCR is a modern LLM-based OCR tool from Mistral AI designed to extract text from documents while interpreting context and layout. Unlike the other tools in this guide, it is offered through a hosted API rather than as open-source code or open weights; we include it here as a reference point from our earlier evaluation. Unlike traditional OCR engines, it leverages large language models to understand structure and content, providing better results on irregular layouts and mixed-format documents. However, as with most LLM-based systems, it has limitations, especially with structured data, handwriting, and low-quality scans.
New Additions
In the following examples, both olmOCR and Qwen2.5-VL will be executed using the Hugging Face Transformers library. This ensures standardized model loading, tokenization, and inference across different architectures, while making it easy to swap between models or run them locally/with GPU acceleration.
olmOCR
olmOCR is an open-source OCR system developed by the Allen Institute for AI (Ai2), built on the Qwen2-VL family of 7B vision-language models; the olmOCR-7B-0725 checkpoint used below is fine-tuned from Qwen2.5-VL-7B-Instruct. It specializes in converting rasterized PDFs and scanned documents into clean, structured text, preserving layout, tables, equations, and even handwriting. Unlike many proprietary solutions, olmOCR is fully open-source, offering transparency in training data, code, and methodologies.
Installation:
pip install transformers torch torchvision pypdfium2 pillow
Usage:
import base64
import os
import sys

import pypdfium2 as pdfium
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

# Directory where the rendered page images are written
OUTPUT_DIR = "docs/output"

# Define the prompt
PROMPT = """
Attached is one page of a document that you must process.
Just return the plain text representation of this document as if you were reading it naturally.
Convert equations to LaTeX and tables to markdown.
Return your output as markdown
"""

# Process the pdf file
def process_pdf(pdf_path):
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    # Load the model once, before looping over pages
    model_id = "allenai/olmOCR-7B-0725"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda").eval()
    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)
    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()
        # Save image
        file_name = os.path.join(OUTPUT_DIR, f"page_{i+1}.png")
        image.save(file_name)
        # Encode image to base64
        with open(file_name, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
        # Define the messages
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image": encoded_string,
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ]
        # Apply the chat template
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt"
        ).to(model.device)
        # Generate the output (dense pages can be cut off at max_new_tokens)
        output_ids = model.generate(**inputs, max_new_tokens=1000)
        # Drop the prompt tokens so only the newly generated tokens remain
        generated_ids = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
        # Decode the output
        output_text = processor.batch_decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        # Print the output
        print(output_text)
    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])
This script showcases how to use olmOCR-7B by converting PDF pages into images, encoding them in Base64, and sending them with a natural language prompt to the multimodal model. Unlike traditional OCR tools, olmOCR interprets the document contextually, returning plain text output in Markdown (defined in this prompt). Running on GPU via Hugging Face’s transformers, it leverages the AutoProcessor and AutoModelForImageTextToText pipeline to generate rich, structured text representations directly from scanned PDFs.
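One step in the script is easy to misread: generate returns each prompt's tokens followed by the newly generated tokens, so the prompt has to be sliced off every sequence before decoding. The sketch below reproduces that slicing with plain Python lists standing in for tensors; the function name and the token IDs are made up for illustration.

```python
from typing import List


def strip_prompt(input_ids: List[List[int]], output_ids: List[List[int]]) -> List[List[int]]:
    """Remove the leading prompt tokens from each generated sequence."""
    return [out[len(inp):] for inp, out in zip(input_ids, output_ids)]


if __name__ == "__main__":
    prompts = [[101, 7, 8, 9]]              # hypothetical prompt token IDs
    generated = [[101, 7, 8, 9, 501, 502]]  # prompt followed by two new tokens
    print(strip_prompt(prompts, generated))
```

Without this step the decoded text would begin with the prompt itself rather than with the model's transcription.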
Qwen2.5-VL
Qwen2.5-VL is a state-of-the-art vision-language model developed by Alibaba Group. It integrates advanced visual perception with deep language understanding, enabling it to process and interpret images and documents at native resolutions. The model is designed to handle complex document layouts, multi-language text, and various orientations, making it highly effective for OCR tasks.
Installation:
pip install transformers torch torchvision pypdfium2 pillow
Usage:
import base64
import os
import sys

import pypdfium2 as pdfium
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

# Directory where the rendered page images are written
OUTPUT_DIR = "docs/output"

# Define the prompt
PROMPT = """
Attached is one page of a document that you must process.
Just return the plain text representation of this document as if you were reading it naturally.
Convert equations to LaTeX and tables to markdown.
Return your output as markdown
"""

# Process the pdf file
def process_pdf(pdf_path):
    os.makedirs(OUTPUT_DIR, exist_ok=True)
    # Load the model once, before looping over pages
    model_id = "Qwen/Qwen2.5-VL-7B-Instruct"
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda").eval()
    # Load a document
    pdf = pdfium.PdfDocument(pdf_path)
    # Loop over pages and render
    for i in range(len(pdf)):
        page = pdf[i]
        image = page.render(scale=1).to_pil()
        # Save image
        file_name = os.path.join(OUTPUT_DIR, f"page_{i+1}.png")
        image.save(file_name)
        # Encode image to base64
        with open(file_name, "rb") as image_file:
            encoded_string = base64.b64encode(image_file.read()).decode('utf-8')
        # Define the messages
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "image",
                        "image": encoded_string,
                    },
                    {"type": "text", "text": PROMPT},
                ],
            }
        ]
        # Apply the chat template
        inputs = processor.apply_chat_template(
            messages,
            add_generation_prompt=True,
            tokenize=True,
            return_dict=True,
            return_tensors="pt"
        ).to(model.device)
        # Generate the output (dense pages can be cut off at max_new_tokens)
        output_ids = model.generate(**inputs, max_new_tokens=1000)
        # Drop the prompt tokens so only the newly generated tokens remain
        generated_ids = [out[len(inp):] for inp, out in zip(inputs.input_ids, output_ids)]
        # Decode the output
        output_text = processor.batch_decode(generated_ids, skip_special_tokens=True, clean_up_tokenization_spaces=True)
        # Print the output
        print(output_text)
    # Close the PDF
    pdf.close()

# Main function, receives the path to the pdf file
if __name__ == "__main__":
    process_pdf(sys.argv[1])
This script demonstrates how to use Qwen2.5-VL-7B-Instruct via Hugging Face Transformers for document OCR and structured extraction. Each page of a PDF is rendered to an image with pypdfium2, encoded to base64, and passed along with a text prompt that instructs the model to output markdown-formatted text.
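Because the prompt asks for markdown, tables come back as pipe-delimited text that still has to be parsed before it can be used as data. The following is a minimal sketch, assuming simple tables without escaped pipes inside cells; parse_markdown_table and ragged_rows are hypothetical helper names, and the sample table is made up. The second helper flags rows whose cell count differs from the header, a cheap signal that the model lost track of the column structure.

```python
from typing import List


def parse_markdown_table(markdown: str) -> List[List[str]]:
    """Parse pipe-delimited table rows, skipping the |---|---| separator row."""
    rows = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue
        # Remove the outer pipes, then split the remainder into cells.
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        # Separator rows contain only dashes, colons and spaces.
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    return rows


def ragged_rows(rows: List[List[str]]) -> List[int]:
    """Return the indexes of rows whose cell count differs from the header row."""
    if not rows:
        return []
    width = len(rows[0])
    return [i for i, row in enumerate(rows) if len(row) != width]


if __name__ == "__main__":
    sample = """
| Benefit | In-network | Out-of-network |
|---|---|---|
| Urgent care | $60 copay | Deductible, then 40% |
| Drug list | M4 |
"""
    table = parse_markdown_table(sample)
    print(table)
    print("Rows with a mismatched column count:", ragged_rows(table))
```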
The Role of LLMs in OCR
Large Language Models bring a fundamentally different approach to OCR. Instead of only recognizing characters, they can interpret structure, context, and intent, making them especially effective for documents with irregular layouts, multi-column text, or a mix of tables, handwriting, and images. In essence, LLMs can “read” documents more like humans do, reconstructing meaning rather than just transcribing shapes.
Advantages
Context-aware extraction – LLMs understand not just words, but how they relate, enabling better handling of tables, forms, and multi-modal inputs.
Flexible layouts – Work well with semi-structured or messy documents where traditional OCR struggles.
Beyond text – Can interpret metadata, infer relationships, and sometimes even detect errors or missing content.
Drawbacks
But these strengths come with real challenges:
Hallucinations – LLMs may invent words, numbers, or structures not present in the source. For example, an invoice total might be “corrected” incorrectly because the model inferred a pattern.
Resource intensive – Running LLM-based OCR often requires GPUs, large memory, and careful optimization, making it harder to deploy at scale.
Unpredictable outputs – Unlike traditional OCR, which is deterministic, LLMs may produce slightly different results for the same input. This is problematic in compliance-heavy industries.
Maintenance complexity – Fine-tuning or prompting strategies may be needed to keep accuracy consistent across diverse document types.
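One inexpensive guard against hallucinated figures is to cross-check every number in the LLM output against a traditional OCR pass over the same page. The sketch below is one possible approach, not a method used in our tests: unsupported_numbers is a hypothetical helper, and the sample strings are made up. Since the baseline OCR can itself misread digits, a flagged number means "review this", not "this is wrong".

```python
import re
from typing import List

# Digits with optional thousands separators and an optional decimal part.
NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")


def unsupported_numbers(llm_text: str, ocr_text: str) -> List[str]:
    """List numbers in the LLM output that never appear in the baseline OCR text."""
    seen = {n.replace(",", "") for n in NUMBER.findall(ocr_text)}
    flagged = []
    for match in NUMBER.findall(llm_text):
        value = match.replace(",", "")  # compare 1,500 and 1500 as equal
        if value not in seen and value not in flagged:
            flagged.append(value)
    return flagged


if __name__ == "__main__":
    llm_output = "| Urgent care | $90 copay | Annual deductible $1,500 |"
    baseline = "Urgent care $60 copay Annual deductible $1,500"
    print(unsupported_numbers(llm_output, baseline))
```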
When to Use (and When Not To)
LLM-based OCR is best suited for R&D, experimental projects, or innovation-driven use cases where flexibility and interpretive power matter more than strict reproducibility. It’s a great choice when documents are highly varied or contain unstructured information.
However, for enterprise-scale, compliance-driven, or mission-critical workflows, relying solely on LLMs is risky. In those cases, a hybrid approach, using traditional OCR for core text extraction, with LLMs layered on top for context-aware interpretation, often strikes the best balance.
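As a rough illustration of such a hybrid setup, the sketch below runs a traditional OCR function first, lets an LLM function restructure the page, and keeps the LLM result only if most of its words also occur in the OCR text. Everything here is an assumption for illustration: ocr_fn and llm_fn are placeholders for whatever engines you use, and the 0.8 threshold is arbitrary.

```python
import re
from typing import Any, Callable


def token_overlap(candidate: str, baseline: str) -> float:
    """Share of the candidate's distinct words that also occur in the baseline."""
    cand = set(re.findall(r"\w+", candidate.lower()))
    base = set(re.findall(r"\w+", baseline.lower()))
    return len(cand & base) / len(cand) if cand else 0.0


def hybrid_extract(
    page: Any,
    ocr_fn: Callable[[Any], str],
    llm_fn: Callable[[Any, str], str],
    min_overlap: float = 0.8,
) -> str:
    """Prefer the LLM's structured text, but fall back to plain OCR if it drifts too far."""
    baseline = ocr_fn(page)
    structured = llm_fn(page, baseline)
    if token_overlap(structured, baseline) >= min_overlap:
        return structured
    return baseline


if __name__ == "__main__":
    # Stub engines standing in for a real OCR library and a real vision-language model.
    fake_ocr = lambda page: "Urgent care $60 copay"
    fake_llm = lambda page, text: "| Urgent care | $60 copay |"
    print(hybrid_extract("page-1", fake_ocr, fake_llm))
```

The overlap check is intentionally crude; it catches output that diverges wholesale from the page, not a single altered digit.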
Results & Insights
For evaluating and comparing the different open-source OCR tools, we selected a set of test documents designed to reflect real-world challenges and common use cases.
Insurance Plan – Complex Tables
Tests the ability to correctly extract structured data from multi-row, multi-column tables with nested details.
Processing page 1...
Page 1 OCR Results:
Premera Blue Cross Preferred Gold 1500
PREMERA |
Alaska plan for individuals and families
BLUE CROSS BLUE SHIELD OF ALASKA
Start date January 1, 2023
dent Licenses of the Else Cross New Yorkid Jacobian
You have access to the Legacy and Dental Select Network and the national Blue
Legacy and Dental
Non-preferred providers
Non-participating providera
Cross Blue Shield BlueCard(r) provider network. Please refer to the next page for
ot network
important plan and network information.
Annual deductible
Per calendar year (PCY)
$1,500
$3,000
$3,000
Family = 2x individual
Amount you pay after your deductible is met
30%
4000
60%
Out-of-pockat maximum
Includes deductible, copays
Family = 2x individual (in-network only)
$6,300
Not covered
Not covered
10 essential health benefits
1 Ambulatory patient services
Outpatient services
Deductible, then 30%
Deductible, then 30%
Deductible, then 30%
Office visits
Designated PCP office visit
$30 copay
Deductible then 47%
Deductible then 60%
Specialist office visit
$60 copay
Reductible then 40%
Deductible, then 60%
Urgent care
$60 copay
Deductible, then 40%
Deductible, then 60%
Spinal manipulation: 12 visits PCY;
$30 copav
Deductible then 47%
Deductible then 60%
Acupuncture: 12 visits PCY
2 Emergency services
Emergency care
Deductible, then 30%
Deductible then 30%
Deductible then 30%
Ambulance transportation (air and ground)
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
3 Hospitalization
Inpatient services
Deductible then 30%
Deductible, then 40%
Deductible, then 60%
Organ and tissue transplants, inpatient
Deductible, then 30%
Not covered
Not covered
4 Meternity and newborn care
Prenatal and postnatal care
Deductible then 30%
Deductible then 47%
Deductible, then 60%
Inpatient delivery and services
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
5 Mental heelth and substance use
Office visit
$60 copay
Deductible, then 40%
Deductible, then 60%
disorder services, including
Inpatient hospital: mental/behavioral health
Deductible then 30%
Deductible then 47%
Deductible then 60%
behavioral health treatment
Deductible, then 30%
Outpatient services
Deductible then 47%
Deductible then 60%
6 Prescription drugs
Preferred generic
$15 copay
$15 copay
$15 copay
Retail/Specialty: 30-day supply
Preferred brand
$45 copay
$45 copay
$45 copav
Mail order: 90-day supply
Non-preferred drugs
Deductible then 50%
Not covered
Not covered
Deductible then 40%
Deductible then 47%
Reductible then 40%
Specialty
Drug list
Inpatient rehabilitation: 30 days PCY
7 Rehabilitative and habilitative
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
services and devices
Physical, speech, occupational, massage
Deductible, then $60 copey
Deductible then 47%
Deductible then 60%
therapy: 25 visits combined PCY
Durable medical equipment
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
Endoratory services
Includes x-ray, pathology, imaging and
Deductible, then 30%
Deductible, then 60%
diagnostic, standard ultrasound
Deductible then 47%
Major imaging, including MRI, CT, PET
Deductible, then 30%
Deductible, then 40%
Deductible, then 60%
(preapproval required for certain services)
Preventive/wellness services
Screenings
Covered in full
Deductible then 47%
Deductible, then 60%
Exams and vaccinations
Covered in full
Deductible, then 40%
Deductible, then 60%
10 Pediatric services, including vision Eye exam: 1 PCY
$30 copay
$30 copay
$30 copay
and dental
Eyewear: 1 pair of glasses PCY (frames and
under 19 years of age
lenses); 12-month supply of contacts PCY,
Covered in full
Covered in full
Covered in full
in lieu of glasses (frames and lenses)
Covered in full
Deductible, then 30%
Deductible, then 30%
Rental: nreventive/hasic/maine
Orthodontia (medically necessary only)
Deductible, then 50%
Deductible, then 50%
Deductible, then 50%
Virtual care
Doctor On Demand: general medicine
Covered in full
Not covered
Not covered
Boulder Care or Workit Health: mental health
$60 copay
Not covered
Not covered
including substance use disorder
All other virtual providers
Deductible, then 40%
Deductible, then 60%
$60 copay
* The deductible applies unless otherwise noted
028509 (09.15.2022)
Processing page 2...
Page 2 OCR Results:
...
['| Benefit Category | Legacy and Dental Select network | Non-preferred providers | Non-participating providers |\n|------------------|---------------------------------|------------------------|---------------------------|\n| **Annual deductible** | $1,500 (Family - 2x individual) | $3,000 | $3,000 |\n| **Out-of-pocket maximum** | Includes deductible, copays | Not covered | Not covered |\n| **10 essential health benefits** | | | |\n| **Ambulatory patient services** | Outpatient services | Deductible, then 30% | Deductible, then 30% |\n| Office visits | Designated PCP office visit | $30 copay | Deductible, then 40% |\n| | Specialist office visit | $60 copay | Deductible, then 40% |\n| | Urgent care | $90 copay | Deductible, then 40% |\n| | Special population: 12 visits PCY | $30 copay | Deductible, then 40% |\n| | Acupuncture: 12 visits PCY* | | |\n| **Emergency services** | Emergency care | Deductible, then 30% | Deductible, then 30% |\n| | Ambulance transportation (air and ground) | Deductible, then 30% | Deductible, then 60% |\n| **Hospitalization** | Inpatient services | Deductible, then 30% | Deductible, then 60% |\n| | Organ and tissue transplants, inpatient | Not covered | Not covered |\n| **Maternity and newborn care** | Pregnancy and postpartum care | Deductible, then 30% | Deductible, then 60% |\n| | Inpatient delivery and services | Deductible, then 30% | Deductible, then 60% |\n| **Mental health and substance use disorder services, including behavioral health treatment** | Inpatient hospital: mental/behavioral health | $60 copay | Deductible, then 40% |\n| | Outpatient services | Deductible, then 30% | Deductible, then 40% |\n| **Prescription drugs** | Preferred generic | $15 copay | $15 copay |\n| | Retail/Specialty: 30-day supply | $45 copay | $45 copay |\n| | Mail order: 90-day supply | Not covered | Not covered |\n| | Specialty | Deductible, then 40% | Deductible, then 40% |\n| | Drug list | M4 | |\n| **Rehabilitative and habilitative services and 
devices** | Inpatient rehabilitation: 30 days PCY | Deductible, then 30% | Deductible, then 60% |\n| | Physical, speech, occupational, massage therapy: 25 visits combined PCY | Deductible, then $60 copay | Deductible, then 60% |\n| | Durable medical equipment | Deductible, then 30% | Deductible, then 60% |\n| **Laboratory services** | Includes x-ray, pathology, imaging and diagnostic, standard ultrasound | Deductible, then 30% | Deductible, then 60% |\n| | Major imaging, including MRI, CT, PET (geographic area required for certain services) | Deductible, then 30% | Deductible, then 60% |\n| **Preventive/wellness services** | Screenings | Covered in full | Deductible, then 40% |\n| | Exams and vaccinations | Covered in full | Deductible, then 60% |\n| **Pediatric services, including vision and dental under 19 years of age** | Eye exam: 1 PCY | $30 copay | $30 copay |\n| | Eyewear: 1 pair of glasses PCY (frames and lenses): 12 month supply of contacts PCY, in lieu of glasses (frames and lenses) | Covered in full | Covered in full |\n| | Dental preventive/basic/major | Covered in full | Deductible, then 30% |\n| | Orthodontics (medically necessary only) | Deductible, then 50% | Deductible, then 50% |\n| **Virtual care** | Doctor on Demand: primary medicine | Covered in full | Not covered |\n| | Boulder Care or World Health: mental health including substance use disorder |']
Qwen2.5-VL
['Premera Blue Cross Preferred Gold 1500\nAlaska plan for individuals and families\nStart date January 1, 2023\n\nYou have access to the Legacy and Dental Select Network and the national Blue Cross Blue Shield BlueCard(r) provider network. Please refer to the next page for important plan and network information.\n\n| Annual deductible | Legacy and Dental Select network | Non-preferred providers | Non-participating providers |\n|---|---|---|---|\n| Family = 2+ individual | $1,500 | $3,000 | $3,000 |\n| Amount you pay after your deductible is met | 30% | 40% | 60% |\n| Out-of-pocket maximum | Includes deductible, copays | Not covered | Not covered |\n| Family = 2+ individual (in-network only) | $6,300 | | |\n\n## 10 essential health benefits\n\n| 1 | Ambulatory patient services | Outpatient services | Designated PCP office visit | Specialist office visit | Urgent care | Special consultation: 12 visits PCY; Acupuncture: 12 visits PCY |\n|---|---|---|---|---|---|---|\n| | | Deductible, then 30% | $30 copay | Deductible, then 40% | Deductible, then 40% | Deductible, then 40% |\n| | | | | | | |\n| 2 | Emergency services | Emergency care | Deductible, then 30% | Deductible, then 30% | Deductible, then 30% | Deductible, then 30% |\n| | | Ambulance transportation (air and ground) | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| 3 | Hospitalization | Inpatient services | Deductible, then 30% | Deductible, then 30% | Not covered | Deductible, then 60% |\n| | | Organ and tissue transplants, inpatient | Deductible, then 30% | | | |\n| 4 | Maternity and newborn care | Prenatal and postpartum care | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Inpatient delivery and services | Deductible, then 30% | | | |\n| 5 | Mental health and substance use disorder services, including behavioral health treatment | Inpatient hospital: mental/behavioral health | Deductible, then 30% | 
Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Outpatient services | Deductible, then 30% | | | |\n| 6 | Prescription drugs | Preferred generic | $15 copay | $15 copay | $15 copay | |\n| | | Retail/Specialty: 30-day supply | Preferred brand | $45 copay | $45 copay | $45 copay |\n| | | Mail order: 90-day supply | Non-preferred drugs | Deductible, then 50% | Not covered | Not covered |\n| | | Specialty | Deductible, then 40% | Deductible, then 40% | Deductible, then 40% |\n| | | Drug list | M4 | | | |\n| 7 | Rehabilitative and habilitative services and devices | Inpatient rehabilitation: 30 days PCY | Deductible, then 30% | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Physical, speech, occupational, massage therapy: 25 visits combined PCY | Deductible, then $60 copay | Deductible, then 40% | Deductible, then 60% | Deductible, then 60% |\n| | | Duraplasty: 1 visit | Deductible, then 30% | | | |\n| 8 | Laboratory services | Includes x-ray, pathology, imaging and diagnostic, standard ultrasound | Deductible, then 30% | Deductible, then 40% | Deductible, then']
Loan Application – Checkboxes
Evaluates how well tools detect and interpret form fields, checkboxes, and radio buttons.
Processing page 1...
Page 1 OCR Results:
['Te ScoTpnrdorirljtr CFTEFAdenmnm dezhi', '""genzy Laa %a', 'Uniform Residential Loan Application Verity andcompletc thcintomationon thu applitaton Inyorl nio marico Jircccd DrYCurLenozr', '"PPnnotcrin65 Icunathothan Cao7 ocditicnalJcTC MEI PIOvice', 'Section 1: Borrower Information: This section asks about your personal information &ndyour ircorne from emiplovmneril othe! sources such asreluernen(al vou wvani coisideredlooualifytor this loan-', 'peronaiintonaton', 'Rame ethdal', 'Alno', 'Logalquntynumar (cr Individual Toxpoycr IdentuicohonNumBen) Date 0f Wirth Cmrenknin Yayyl I Cibcen JermeacntpengentAen Ton-ermenentpenoentMen', 'Jchn Acama nteratenama Lufaninamc Drmhiyou cnonotongcama Lindcr Which credtmasprypuilYrrrered (Fetuidle Last Jumx', 'Typc of Credi Ianapalng for individual credit Ianapatina foricint credi Totl Numocr cf Bcutowcix Womo2Nintendr poly tr joint crede + Your initialz', 'List Nimelilof Other Vomowerkl Appking forthis Loan Afirtn Wnste Use @icpuro D( benvrenname', 'attaletatus', 'Depcnocnts(notMtez Numjcr', 'anomeriomoe', 'contacntomaticn', 'Vetkd Scponted Unmetico (Singke Drvced Wiowed,Uvil Urion Danestk Partnersh P; Reguterd heoprcoi BenchncyrachonshIp_', 'Homc Phcnc Ucpacor', 'Monk Phone', 'Lnorareccm', 'cumten Addness Stiect 145 Fro Hijjo Pliza Cit; Ham Kjw Lcngat Cunent nddres!', 'Unit $ 23101 Ccuntn ORent ($ 53oo', 'Montha Aousino', 'No pnnaYnouo EDeOte', 'Umorai', 'Mat CurrcntAddress torLESsthanzycatsIut Formcr Addtes= Etteet', 'Dousnotopply', 'Wnie', 'Ccuntn', 'Kjw cngattorrn Adole', 'Montha Aousino', 'nenenouag ecrc', 'monthi', 'Milling^0orczs oumerenimomeutentaddmess Stiect 345 Fro Hijjo Pliza', 'Dogsnotapny', 'Unit $ Gqwrtn', 'Cit; Ham', 'cuntcnt Lmploymentzcil-EmploymcntandIncomt', 'dDoes', 'GrosMonthly Incomt', 'Employcr or Buzincs; Name Strect Fonddn Djdia', 'Talc CCMciG Sacms', 'Fnonc', 'eraccon Keeeh', 'nm=', 'Oraimt', 'Mnneh', '@ 3101', 'Lountr', 'ecdommeren', 'Posticn or Tiex Enjreerra Manjocr stan Datc Ui', 'Cncorm thu 
ntementappler Dlam tndoytdbyafamik Ncmb; Cropttly tcllt Gale JenL 0i ott FJI; [0 thetnac', 'Commia', 'Keeeh', 'Lan', 'Kjw IcngInthiinc Ciork?', 'Montha', 'EEtiLnan 5 5', 'Jnexen', 'Mnneh', 'Ocheck ifyou are the Dusiness Owncr or Sem-Employco', 'Inave an ovncnnip tnarc ofleaathan *4_ EcnthlvinromeiGTlon Totes navc an ovncnnip tnarc 0f184mcrc;', '0loo AFuneh', 'Medenn D _Annnnnrocie Ficddicenc Fomai Fanna Auc Fom ICO BtTir [JM1']
Surya
Processing page 1...
Page 1 OCR Results:
To be completed by the Lender:
Lender Loan No./Universal Loan Identifier 8849949003
Agency Case No. 7748800
Uniform Residential Loan Application
Verify and complete the information on this application. If you are applying for this loan with others, each additional Borrower must provide
information as directed by your Lender.
Section 1: Borrower Information. This section asks about your personal information and your income from
employment and other sources, such as retirement, that you want considered to qualify for this loan.
1a. Personal Information
Name (First, Middle, Last, Suffix)
Social Security Number
(or Individual Taxpayer Identification Number)
John Adams
Date of Birth
Alternate Names - List any names by which you are known or any names
Citizenship
(mm/dd/yyyy)
under which credit was previously received (First, Middle, Last, Suffix)
U.S. Citizen
02 / 02 /
Ö Permanent Resident Alien
1976
Non-Permanent Resident Alien
Type of Credit
List Name(s) of Other Borrower(s) Applying for this Loan
(First, Middle, Last, Suffix) - Use a separator between names
I am applying for individual credit.
O I am applying for joint credit. Total Number of Borrowers:
Each Borrower intends to apply for joint credit. Your initials:
Marital Status
Dependents (not listed by another Borrower)
Contact Information
Married
Number 3
Home Phone (<u>884</u>) <u>443</u> -
3433
O Separated
Ages 12,6,4
Cell Phone
(01)233
3454
O Unmarried
Work Phone
(
Ext.
(Single, Divorced, Widowed, Civil Union, Domestic Partnership, Registered
Email johna@infoware.com
Reciprocal Beneficiary Relationship)
Current Address
Street 345, Fire Ridge Plaza
Unit #
City Miami
State FL
ZIP 33101
Country US
Months Housing () No primary housing expense () Own () Rent ($
How Long at Current Address?
Years
(month)
$4300
If at Current Address for LESS than 2 years, list Former Address
Does not apply
Street
Unit #
City
State
ZIP
Country
Months Housing ONo primary housing expense O Own O Rent ($
How Long at Former Address?
Years
(month)
Mailing Address - if different from Current Address Does not apply
Street 345, Fire Ridge Plaza
Unit #
City Miami
7IP 33101
Country US
State EL
Does not apply
1b. Current Employment/Self-Employment and Income
Gross Monthly Income
Employer or Business Name Info ware computer systems
Phone
Base
$ $140000 /month
Street 23, Rohde Building
Unit #
Overtime
/month
S
State FL
ZIP 33101
Country US
City Miami
Bonus
$2000 /month
ς
Position or Title Engineering Manager
Check if this statement applies:
Commission S
/month
I am employed by a family member,
Start Date
/ /
(mm/dd/yyyy)
Military
property seller, real estate agent, or other
Entitlements S
/month
How long in this line of work? 4 Years 3 Months
party to the transaction.
Other
/month
Check if you are the Business I have an ownership share of less than 25%. Monthly Income (or Loss)
TOTAL $
0.00/month
Owner or Self-Employed
O I have an ownership share of 25% or more. $
Uniform Residential Loan Application
Freddie Mac Form 65 • Fannie Mae Form 1003
Effective 1/2021
['Uniform Residential Loan Application\n\nVerify and complete the information on this application. If you are applying for this loan with others, each additional Borrower must provide information as directed by your Lender.\n\nSection 1: Borrower Information. This section asks about your personal information and your income from employment and other sources, such as retirement, that you want considered to qualify for this loan.\n\n1a. Personal Information\n\nName (First, Middle, Last, Suffix) John Adams\nAlternate Names - List any names by which you are known or any names under which credit was previously received (First, Middle, Last, Suffix)\n\nSocial Security Number (or Individual Taxpayer Identification Number)\nDate of Birth (mm/dd/yyyy) 02 / 02 / 1976\nCitizenship\n☐ US Citizen ☐ Permanent Resident Alien ☐ Non-Permanent Resident Alien\n\nType of Credit\n☐ I am applying for individual credit.\n☐ I am applying for joint credit. Total Number of Borrowers: Each Borrower intends to apply for joint credit. Your initials:\n\nList Name(s) of Other Borrower(s) Applying for this Loan (First, Middle, Last, Suffix) - Use a separator between names\n\nMarital Status\n☐ Married\n☐ Separated\n☐ Unmarried\nDependents (not listed by another Borrower)\nNumber 3 Ages 12, 8, 4\n\nContact Information\nHome Phone ( ) 884 - 443 - 3433\nCell Phone ( ) 01 - 233 - 3454\nWork Phone ( ) - Ext.\nEmail [johns@infoware.com](mailto:johns@infoware.com)\n\nCurrent Address\nStreet 345, Fire Ridge Plaza\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\nHow Long at Current Address? Years Months Housing No primary housing expense Own Rent ($) /month\n\nIf at Current Address for LESS than 2 years, list Former Address Does not apply\nStreet\nCity\nState\nZIP\nCountry\nUnit #\nHow Long at Former Address? 
Years Months Housing No primary housing expense Own Rent ($) /month\n\nMailing Address - if different from Current Address Does not apply\nStreet 345, Fire Ridge Plaza\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\n\n1b. Current Employment/Self-Employment and Income\n\nEmployer or Business Name Into ware computer systems\nPhone ( )\nStreet 23, Rodeo Building\nCity Miami\nState FL\nZIP 33101\nCountry US\nUnit #\nGross Monthly Income\nBase $14000/month\nOvertime $/month\nBonus $2000/month\nCommission $/month\nMilitary Entitlements $/month\nOther $/month\nTOTAL $0.00/month\n\nPosition or Title Engineering Manager\nStart Date (mm/dd/yyyy)\nHow long in this line of work? 4 Years 2 Months\n\nCheck if this statement applies:\n☐ I am employed by a family member, property seller, real estate agent, or other party to the transaction\n☐ Check if you are the Business Owner or Self-Employed\n☐ I have an ownership share of less than 25% Monthly income (or Loss)\n☐ I have an ownership share of 25% or more\n\nUniform Residential Loan Application\nFreddie Mac Form 65 - Fannie Mae Form 1003\nEffective 1/2021']
Qwen2.5vl
['\n# Uniform Residential Loan Application\n\n**To be completed by the Lender:**\n- Lender Loan No./Universal Loan Identifier: 884549023\n- Agency Case No.: 7148833\n\n## Section 1: Borrower Information\n\n### 1a. Personal Information\n\n| Field | Value |\n|-------|-------|\n| Name (First, Middle, Last, Suffix) | John Adams |\n| Social Security Number (or Individual Taxpayer Identification Number) | - |\n| Date of Birth (mm/dd/yyyy) | 02/02/1976 |\n| Citizenship | U.S. Citizen |\n| Type of Credit | I am applying for individual credit. |\n| List Name(s) of Other Borrower(s) Applying for this Loan | - |\n| Marital Status | Married |\n| Dependents (not listed by another Borrower) | Number 3 |\n| Contact Information | Home Phone: 884-443-3433<br>Cell Phone: 01-233-3545<br>Email: [johna@infoware.com](mailto:johna@infoware.com) |\n| Current Address | Street: 345, Fire Ridge Plaza<br>City: Miami<br>State: FL<br>ZIP: 33101<br>Country: US |\n| How Long at Current Address? | Years: 4<br>Months: 2 |\n| Housing | No primary housing expense |\n| Mailing Address | Same as current address |\n\n### 1b. Current Employment/Self-Employment and Income\n\n| Field | Value |\n|-------|-------|\n| Employer or Business Name | Info ware computer systems |\n| Position or Title | Engineering Manager |\n| Start Date | 02/02/1976 |\n| How Long in this Line of Work? | 4 years, 2 months |\n| Gross Monthly Income | Base: $140000/month<br>Bonus: $2000/month<br>Total: $0.00/month |\n\n**Check if this statement applies:** I am employed by a family member, property seller, real estate agent, or other party to the transaction.\n\n**Check if you are the Business Owner or Self-Employed:** I have an ownership share of less than 25%.\n\n**Monthly Income (or Loss):** $0.00/month\n']
Bank Statement – Complex Layout
Challenges OCR engines with mixed layouts including headers, transaction tables, and scattered text.
Processing page 1...
Page 1 OCR Results:
Interpretation of the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the second and the secon
Customer Service:<br>1-000-031-000
C.
Mible Doubleding
freedom
Channe blocks Furger locks
New Balance
CHASE FREEDOM: ULTIMATE
February 2024
$5,084,29
REWARDS(r) SUMMARY
1.4
т
<b>W</b>
Ŧ
.
.....
8
Minimum Payment Due
28 29 30 31 1 2 3
Previous points balance
40,458
$50.00
+ 1% (1 Pt)/$1 earned on all purchases
5,085
5 6 7 8 9
4
10
Payment Due Date
Total points available for
11 12 13 14 15 16 17
02/28/24
redemption
45,553
18 19 20 21 22 23 24
25 26 27 28 29 1 2
Start receering today. Vait Ultimate Rewards(r) at
www.ultimaterewards.com
3 4 5 6 7 8 9
You always earn unlimited 1% cash back on all your purchases.<br>Activate new bonus categories every quarter. You'll earn an
Late Payment Warning: If we do not receive your minimum payment
by the date listed above, you may have to pay a late fee of up to<br>$40.00 and your APIY's will be subject to increase to a maximum
additional 4% cash back, for a total of 5% cash back on up to
$1,500 in combined bonus calegory purchases each quarter.
Penalty APR of 29.99%.
Activate for free at chase com/freedom, visit a Chase branch or
call the number on the back of your card.
Minimum Payment Warning: If you make only the minimum
payment each period, you will pay more in interest and it will take you <br>longer to pay off your balance. For example:
If you make no
You will pay off the
And you will end up
additional charpes
balance shown on this
paying an estimated
using this card and
statement in about ...
total of ...
ach month you pay.
Paid $6000 on oct 25/24
Only the minimum
15 years
$12.128
including previous balance
Destrent
$189
$5,817
3 years
(Savings=$5,311)
If you would like information about credit counseling services, call
1-866-797-2885
ACCOUNT SUMMARY
Account Number: 4342 3780 1050 7320
President Balance
$4 233 99
Payment, Credits
.$4 233 93
+$5,084.29
Purchases
Cash Advances
$0.00
Balance Transfers
$0.00
Fees Charged
$0.00
Interest Charged
$0.00
$5,084.29
New Balance
Opening/Closing Date
01/04/24 - 02/03/24
Courlet Accesse Line
$31 700
Available Credit
$26,615
Cash Access Line
$1,585
Available for Cash
$1,585
Past Due Amount
$0.00
Balance over the Credit Access Line
$0.00
YOUR ACCOUNT MESSAGES
Reminder: It is important to continue making your payments on time. Your APRs may increase if the minimum payment is not made on
time or payments are returned.
Your next AutoPay payment for $5,081,29 will be deducted from your Pay From account and orbidied on your due <br>date. If your due date fails on a Saturday, well credit your payment the Friday before.
0000001 F6833339 D 12
Y # 03 24/02/02
Page 1 of 2
08810 MA MA 26942 02410000120000694201
Freedom
433232701040241/20000506429000002
P.O. BOX 15123
Payment Due Date:
02/28/24
AUTOPAY IS ON
WILMINGTON, DE 19850-5123
$5,084.29
New Balance:
See Your Account <br>Messages for details.
For Undeliverable Mail Only
$50.00
Minimum Payment Due:
Account number: 4342 3780 1050 7320
$_
_ Amount Enclosed
AUTOPAY IS ON
NEET BEY & FAMILIE
LARRY PAGE<br>C PAGE
24917 KEYSTONE AVE<br>LOS ANGELES CA 97015-5505
CARDMEMBER SERVICE
PO BOX 6294<br>CAROL STREAM IL 60197-6294
19000 160 280 3 23 20 1060 21 17 20
['**CHASE FREEDOM: ULTIMATE REWARDS(r) SUMMARY**\n\n**February 2024**\n\nNew Balance: $5,084.29\n\nMinimum Payment Due: $50.00\n\nDue Date: 02/28/24\n\n**Late Payment Warning:** If we do not receive your minimum payment by the date listed above, you may have to pay a late fee in addition to your regular interest rate. Your APR will be subject to increase to a maximum Penalty APR of 29.99%.\n\n**Minimum Payment Warning:** If you make only the minimum payment, you could end up paying much more than you expect because it will take you longer to pay off your balance. For example:\n\n- If you make only the minimum payment using this card and don't pay it off, you'll pay:\n - 15 years: $12,128\n - 3 years: $5,617 (Savings: $6,511)\n\nIf you would like information about credit counseling services, call 1-800-350-2227.\n\n**ACCOUNT SUMMARY**\n\nAccount Number: 0432 3780 1050 7320\n\nPrevious Balance: $4,233.99\n\nPayment Credits: -$4,233.99\n\nCash Advances: -$5,084.29\n\nCash Advances: $0.00\n\nBalance Transfers: $0.00\n\nInterest Charged: $0.00\n\nNew Balance: $5,084.29\n\n**YOUR ACCOUNT MESSAGES**\n\nReminder: It is important to continue making your payments on time. Your APRs may increase if the minimum payment is not made on time or payments are returned.\n\nYou can AutoPay by calling for $50.00 to be deducted from your Pay From account and directed to your Chase card. If your AutoPay fails one day, they will recall your payment the next day.\n\n**AUTOPAY IS ON**\n\nPayment Due Date: 02/28/24\n\nNew Balance: $5,084.29\n\nMinimum Payment Due: $50.00\n\nAccount number: 0432 3780 1050 7320\n\nAmount Enclosed: $0\n\nAUTOPAY IS ON\n\nCardmember Service\nP.O. Box 16324\nCarmel, IN 46032-6324\n\n**5000 460 2813 23 70 1650 7147?**']
Qwen2.5vl
\n# Chase Freedom Ultimate Rewards Summary\n\n## February 2024\n\n| Date | S | M | T | W | T | F | S |\n|------------|---|---|---|----|----|----|----|\n| 1 | | | | | | | |\n| 2 | | | | | | | |\n| 3 | | | | | | | |\n| 4 | | | | | | | |\n| 5 | | | | | | | |\n| 6 | | | | | | | |\n| 7 | | | | | | | |\n| 8 | | | | | | | |\n| 9 | | | | | | | |\n| 10 | | | | | | | |\n| 11 | | | | | | | |\n| 12 | | | | | | | |\n| 13 | | | | | | | |\n| 14 | | | | | | | |\n| 15 | | | | | | | |\n| 16 | | | | | | | |\n| 17 | | | | | | | |\n| 18 | | | | | | | |\n| 19 | | | | | | | |\n| 20 | | | | | | | |\n| 21 | | | | | | | |\n| 22 | | | | | | | |\n| 23 | | | | | | | |\n| 24 | | | | | | | |\n| 25 | | | | | | | |\n| 26 | | | | | | | |\n| 27 | | | | | | | |\n| 28 | | | | | | | |\n| 29 | | | | | | | |\n| 30 | | | | | | | |\n| 1 | | | | | | | |\n\n**New Balance:** $5,084.29 \n**Minimum Payment Due:** $50.00 \n**Payment Due Date:** 02/28/24 \n\n## Total Points Available for Redemption: 45,553\n\n### Late Payment Warning:\nIf you do not receive your minimum payment by the date listed above, you may have to pay a late fee of up to $25. Your APR will be subject to increase to a maximum Penalty APR of 29.99%.\n\n### Minimum Payment Warning:\nIf you make only the minimum payment, it could take you longer to pay off your balance. For example:\n\n- If you make no payments at all using this card and only pay the minimum payment, you will pay off the balance in about 15 years and end up with an additional total of $12,128.\n- If you make only the minimum payment, you will pay off the balance in about 3 years and end up with an additional total of $5,617 (Saving $6,511).\n\nIf you would like information about credit counseling services, call 1-866-757-2885.\n\n## Account Summary\n\n- **Previous Balance:** $4,233.99\n- **Payment, Credits:** -$4,233.99\n- **Cash Advances:** +$5,084.29\n- **Cash Advances:** $0.00\n- **Balance Transfers:** $0.00\n- **Interest Charged:** $0.0']
Receipt – Scan and Poorly Aligned
Tests OCR resilience on low-quality scans, skewed text, and imperfect document alignment.
Processing page 1...
Page 1 OCR Results:
RESISTANCE CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF CONTRACTOR OF C
un oun star<br>181 Bark Star<br>1911 Bark Star<br>1911 I Wit Na Star<br>Unterna waarterna
OM BOOK SHOP
PHOENIX WARKETCITY
нтилики маниски (UTY<br>65 Tin _ 35MC060990128<br>142 - 66716296<br>044 - 66716296<br>044 - 66716296
VELACHERN CHENNAL-BOUGAS
cherna Bosbooks.com
TAX INVOICE
. Un-car-usb-c1ccb<br>: 08-06:08:08:07-7024(15:41) 181
Involce No : 01-24-038-21226
PIBER OF SUPPLY : TANTI MODIGES
Date
Value
Cashier
BK (0544555, 450) 1010 Restances a first face
899.00
Code
un 1990 1990 1990 1990 1990 1990 1990 199
GST
N voonous 1 INA
1298.00
1298.00
10%
1298.00
Grand Total
2
Tot:
PAYIMETE NOOSCONDOCK
IGST
SEST
0.00
BST SUMMARY
CEST
0.00
0.00
Taxable
0.00
0.00
GST
1298.00
0.00
0%
1298.00
No Exchangel No Retund<br>Thank You
Tot:
REPRESENTATION CONTRACTOR
['**TAX INVOICE**\n\nInvoice No.: Ch-24-GSR-21228 \nDate: 08-December-2024 (5:41) \nCashier: Kandan \nPlace Of Supply: Tamilnadu (03)\n\n| Code | HSN | Description | Rate | Quantity | Value |\n|------|-----|-------------|------|----------|-------|\n| BK 0949955 | 48011010 | Reversible With Base | 698.00 | 0.00% | 698.00 |\n| BK 0327753 | 49011010 | The Three-Body Problem | 598.00 | 0.00% | 598.00 |\n\nTotal: 1296.00\n\nGrand Total: 1296.00\n\nPAYMENT: XXXXXXXXXX\n\nGST Summary:\n\n| GST Taxable | CGST | SGST | IGST |\n|-------------|------|------|------|\n| 0% | 1296.00 | 0.00 | 0.00 | 0.00 |\n\nTotal: 1296.00\n\nNo Exchange/No Refund\n\nThank You\n\nSoftware by LogicSoft\nwww.logicsoft.co.in']
Qwen2.5vl
['\n**CM BOOK SHOP**\n**OM Book Shop Pvt Ltd**\n**UNIT NO: 10-10 (Lower Ground Floor)**\n**PHOENIX MARKETCITY**\n**GS TIN : 33ABAN090217B**\n**142, VELACHERY MAIN ROAD**\n**VELACHERY, CHENNAI - 600042**\n**chennai@ombooks.com**\n\n**TAX INVOICE**\n\n**Invoice No:** Ch-24-GSR-21226\n**Date:** 08-December-2024 (5:41) TIR\n**Cashier:** Kandan\n\n**Place Of Supply:** TAMILNADU (03)\n\n| Code | HSN | Description | Qty | Rate | Disc | Value |\n|------|-----|-------------|-----|------|------|-------|\n| BK 0949955 | 49011010 | Rendezvous With Rama | 1 | INR 698.00 | 0.00% | 698.00 |\n| BK 0927753 | 49011010 | The Three-Body Problem | 1 | INR 598.00 | 0.00% | 598.00 |\n\n**Tot:** 2\n**Grand Total:** 1296.00\n\n**PAYMENTDC:** XXXXXXXXXX\n\n**GST Summary**\n| GST Taxable | CGST | SGST | IGST |\n|--------------|------|------|------|\n| 1296.00 | 0.00 | 0.00 | 0.00 |\n\n**Tot:** 1296.00\n**No Exchange/ No Refund**\n**Thank You**\n\n**Software by Logic Soft**\n**www.logisoft.co.in**\n```']
Handwritten Document
Evaluates recognition of cursive and handwritten text, one of the toughest OCR use cases.
![[handwritten-document-1 (3).pdf]]
EasyOCR
Processing page 1...
Page 1 OCR Results:
['Ewdizoo', 'Only', 'matker', 'syle', 'For eclucalioncl purposes 062 anclyse #he enins Pges Pe9: arfi ci Jha} \'Peare The American Mathemctica Monthly, Volume {02 Number February 1995 We nave added line numbers #ne right margin line Since 4his article squares don\'F gel alter nating co lours cou la be argued #hat the +erm chessboard misplaced, line 4: The introdluction o? the name "B seems unnece sscry usre ~in +he combinalion the board #he tex} 6r figure and Ilne boin cascs Jus} the bocrd ould have done Rne In line 77 occurs tne Ics} use U, "XcB which du bious since B wGs board and not set ; Iine would have Preferred Siven set e8 ce Ils Iine 7/8 The frst move being move like ans other does not deserve separate discrip Tion - The term \'step redundant line 3 : why no} move consists of" line {0 /11 At This stage 4he iFalics ere puzzling , since m ove Poss; ble 18']
Processing page 2...
Page 2 OCR Results:
['ELD1200 -1', '{r Som0 \'j cell (4) contains pebble and cells (inj) and "Cijt\') ure empty line 10 Tvice the term pesi Hi on s whct eveyywhere else called cells line 12 : Why not After moves the board ncs T+ pebbles Iine 12 /14 In the one sentence counts moves in the tner counts Pels bles Since #he prose does not indicote the scope 8 dummies Inis double use tne same Iitfle bil unfergivable line 14 ; and we set Tr := Uxz` RCk) Ve remark the use ~8 +he Jerb 440 se} when deRning (+he set can considered cinfkrtunate since 7ot used On 4ne nex F two peges, #he name seems 40 be introdu ced Jo0 esrly #he inioauelion ~8 the name \'R seems nnceess- ari in #he rest op +he Paper 5a ) used once any Anere reccnable con Igure Tion LoL have dune Nele Tn +ne contert ques Hicm 116 _ +ne reacnable cantex} can remain ano- mnous the quoted ccurrente \'33 tre Iy occurrence tn2 idenhi Rec thar context Ns Conclusisn Inat 4he rea chable con- \'fgureF Jn ncs been']
Surya
Processing page 1...
Page 1 OCR Results:
EWD1200 -0
Only a matter of style?
For educational purposes we analyse the
opening pages of an 11-page article that
appeared in The American Mathematical
Monthly, Volume 102 Number 2/February 1995.
We have added line numbers in the right
margin.
line 4: Since in this article, squares don't get
alternating colours, it could be argued that
the term "chessboard" is misplaced.
line 4. The introduction of the name "B"
seems unnecessary: it is used -in the
combination "the board B" - in the text
for Figure 1 and in line 71; in both cases
just "the board" would have done fine.
In line 77 occurs the last use of B,
viz. in "XCB", which is dubious since
B was a board and not a set; in line
77. I would have preferred "Given a set X
of cells ".
line 7/8 : The first move, being a move<br>like any other, does not deserve a separate
discription. The term "step" is redundant.
line 8: Why not "a move consists of "?
line 10/11: At this stage the italics are
puzzling, since a move is possible if,
Processing page 2...
Page 2 OCR Results:
EWD 1200-1
för some i,j , cell (i,j) contains a pebble<br>and cells (i+1,j) and (i,j+1) are empty.
line 10: Twice the term "positions" for
what everywhere else is called "cells".
line 12: Why not "After k moves the
board has k+1 pebbles on it."?
line 12/14: In the one sentence, k counts
moves, in the other k counts pebbles.<br>Since the prose does not indicate the<br>scope of dummies, this double use of<br>the same k is a little bit unforgivable.
line 14: "and we set <math>R := \bigcup_{k>1} R(k)</math> ". We
remark
· the use of the verb "to set" when defining
(the set!) <math display="inline">{\mathbb R}</math> can be considered unfortunate <math display="inline">\bullet</math> since <math display="inline">{\mathbb R}^n</math> is not used on the next two
pages, the name seems to be introduced
too early
· the introduction of the name R seems
unnecessary; in the rest of the paper I<br>saw it used once in "any C e R", where<br>"any reachable configuration" would have
done. (Note. In the context in question
-p 116- the reachable context can remain
anonymous: the quoted occurrence of<br>C is the only occurrence of the identi-<br>fier C in that context. My conclusion is<br>that the reachable configuration has been
['Only a matter of style?\n\nFor educational purposes we analyse the opening pages of an 11-page article that appeared in The American Mathematical Monthly, Volume 102 Number 2/February 1995. We have added line numbers in the right margin.\n\nline 4: Since in this article, squares don\'t get alternating colours, it could be argued that the term "chessboard" is misplaced.\n\nline 4: The introduction of the name "B" seems unnecessary: it is used -in the combination "the board B"- in the text for Figure 1 and in line 71; in both cases just "the board" would have done fine. In line 77 occurs the last use of B, viz. in "X c B", which is dubious since B was a board and not a set; in line 77, I would have preferred "Given a set X of cells".\n\nline 7/8: The first move, being a move like any other, does not deserve a separate description. The term "step" is redundant.\n\nline 8: Why not "a move consists of"?\n\nline 10/11: At this stage the italics are puzzling, since a move is possible if,']
['for some \\( c_{ij} \\), cell \\((c_{ij})\\) contains a pebble and cells \\((i+1,j)\\) and \\((i,j+1)\\) are empty.\n\nline 10: Twice the term "positions" for what everywhere else is called "cells".\n\nline 12: Why not "After k moves the board has \\( k+1 \\) pebbles on it."?\n\nline 12/14: In the one sentence, \\( k \\) counts moves, in the other \\( k \\) counts pebbles. Since the prose does not indicate the scope of dummies, this double use of the same \\( k \\) is a little bit unforgivable.\n\nline 14: "and we set \\( R := \\bigcup_{k \\geq 1} R(k) \\)". We remark\n\n- the use of the verb "to set" when defining (the set!) \\( R \\) can be considered unfortunate\n- since "R" is not used on the next two pages, the name seems to be introduced too early\n- the introduction of the name \\( R \\) seems unnecessary; in the rest of the paper I saw it used once in "any \\( C \\in R \\)", where "any reachable configuration" would have done. (Note. In the context in question - p116 - the reachable context can remain anonymous: the quoted occurrence of \\( C \\) is the only occurrence of the identifier \\( C \\) in that context. My conclusion is that the reachable configuration has been']
Qwen2.5vl
['# Only a matter of style?\n\nFor educational purposes we analyse the opening pages of an 11-page article that appeared in The American Mathematical Monthly, Volume 102 Number 2/Febuary 1995. We have added line numbers in the right margin.\n\n- **Line 4:** Since in this article, squares don\'t get alternating colours, it could be argued that the term "chessboard" is misplaced.\n- **Line 4:** The introduction of the name "B" seems unnecessary: it is used -in the combination "the board B"- in the text for Figure 1 and in line 71; in both cases just "the board" would have done fine.\n- **Line 77:** In line 77 occurs the last use of B, viz. in "X ∈ B", which is dubious since B was a board and not a set; in line 77, I would have preferred "Given a set X of cells".\n- **Line 7/8:** The first move, being a move like any other, does not deserve a separate description. The term "step" is redundant.\n- **Line 8:** Why not "a move consists of"?\n- **Line 10/11:** At this stage the italics are puzzling, since a move is possible if,\n\n'] ['EWD1200-1\n\nFor some \\(i, j\\), cell \\((i, j)\\) contains a pebble and cells \\((i+1, j)\\) and \\((i, j+1)\\) are empty.\n\nline 10: Twice the term "positions" for what everywhere else is called "cells".\n\nline 12: Why not "After k moves the board has \\(k+1\\) pebbles on it."?\n\nline 12/14: In the one sentence, k counts moves, in the other k counts pebbles. Since the prose does not indicate the scope of dummies, this double use of the same k is a little bit unforgivable.\n\nline 14: "and we set R := \\(\\bigcup_{k \\geq 1} R(k)\\)". We remark\n\n- the use of the verb "to set" when defining (the set!) R can be considered unfortunate\n- since "R" is not used on the next two pages, the name seems to be introduced too early\n- the introduction of the name R seems unnecessary; in the rest of the paper I saw it used once in "any C ∈ R", where "any reachable configuration" would have done. (Note. 
In the context in question - p 116 - the reachable context can remain anonymous: the quoted occurrence of C is the only occurrence of the identifier C in that context. My conclusion is that the reachable configuration has been']
Insights
The results show clear differences between traditional OCR engines (e.g., EasyOCR) and modern LLM-driven systems (e.g., Qwen2.5-VL).
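One way to put a number on these differences is to score each engine's output against a hand-checked reference transcription. The sketch below uses Python's standard-library `difflib` for this. The reference string is only assumed to be correct (it is the title line as Surya read it in the handwritten sample), and `normalize` and `similarity` are illustrative helper names, not part of any OCR library.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so layout differences don't dominate."""
    return " ".join(text.lower().split())


def similarity(reference: str, hypothesis: str) -> float:
    """Return a 0..1 similarity ratio between a reference and an OCR output."""
    return SequenceMatcher(None, normalize(reference), normalize(hypothesis)).ratio()


# Assumed reference: treated as ground truth for illustration only.
reference = "Only a matter of style?"

# Title line as read by two engines in the handwritten sample.
outputs = {
    "Surya": "Only a matter of style?",
    "EasyOCR": "Only matker syle",
}

for engine, text in outputs.items():
    print(f"{engine}: {similarity(reference, text):.2f}")
```

A character-level ratio like this ignores layout entirely, so it is best read alongside a visual check of how well headers and tables survived.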
Observed Strengths & Weaknesses
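Traditional OCR Engines (EasyOCR, Surya, docTR) are lightweight, fast, and well-suited for clean, digital documents with simple layouts. However, they struggle with layout-heavy inputs and low-quality scans, and their handwriting results vary widely: in the handwritten sample above, EasyOCR's output is largely unreadable, while Surya transcribes most of the text correctly.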
LLM-Enhanced Systems (olmOCR, Qwen2.5-VL) excel at reconstructing complex layouts, preserving semantic structure, and interpreting handwritten or noisy documents. Their primary drawbacks are higher computational requirements and potential risks of hallucination if not carefully controlled.
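The hallucination risk is easiest to see in numeric fields. In the bank statement sample, for instance, the Qwen2.5-VL output reads the late fee as "up to $25", while another output of the same page reads "$40.00". A simple guard, sketched below under stated assumptions, is to flag any number in the LLM output that a second engine did not also read. The function names and the regular expression are illustrative, and the two input strings are shortened excerpts rather than full outputs.

```python
import re

# Number-like tokens: optional $, digits with optional thousands separators and decimals.
NUMBER = re.compile(r"\$?\d[\d,]*(?:\.\d+)?")


def numeric_tokens(text: str) -> set[str]:
    """Extract number-like tokens, dropping $ signs and thousands separators."""
    return {m.group().lstrip("$").replace(",", "") for m in NUMBER.finditer(text)}


def unconfirmed_numbers(llm_text: str, baseline_text: str) -> set[str]:
    """Numbers present in the LLM output that the baseline OCR output lacks."""
    return numeric_tokens(llm_text) - numeric_tokens(baseline_text)


# Shortened excerpts modelled on the bank statement outputs above.
baseline = "New Balance $5,084.29 Minimum Payment Due $50.00 late fee of up to $40.00"
llm = "New Balance: $5,084.29 Minimum Payment Due: $50.00 late fee of up to $25"

print(unconfirmed_numbers(llm, baseline))
```

A flagged value is not necessarily wrong, since the baseline engine can misread digits too, but it marks a field worth checking by hand. The comparison is purely textual, so `50` and `50.00` would count as different tokens.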
Notable Trends
A clear trade-off exists between speed/efficiency (traditional OCR) and accuracy/semantic richness (LLM-based OCR).
Layout preservation emerges as the biggest differentiator: traditional engines flatten documents, while LLM-based approaches maintain sections, headers, and structured tables.
Handwriting recognition remains a weak spot for most traditional tools, with notable improvements only in advanced LLM-enhanced solutions.
Structured output formats (e.g., schema-aware tables) are becoming standard in newer tools, highlighting the shift from plain-text OCR toward document intelligence.
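Structured output matters because downstream code can consume it directly. As a rough illustration, the sketch below turns a Markdown table, like the ones Qwen2.5-VL produced for the loan application, into a list of records. `parse_markdown_table` is a hypothetical helper written for this example. It assumes a simple pipe-delimited table with a single header row and a separator row, and no escaped pipes inside cells.

```python
def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Turn a simple pipe-delimited Markdown table into a list of row dicts."""
    rows = []
    for line in markdown.strip().splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip anything that is not a table row
        inner = line.removeprefix("|").removesuffix("|")
        rows.append([cell.strip() for cell in inner.split("|")])
    if len(rows) < 2:
        return []
    header, body = rows[0], rows[2:]  # rows[1] is the |---|---| separator
    return [dict(zip(header, row)) for row in body]


# Two rows taken from the Qwen2.5-VL loan application output above.
table = """
| Field | Value |
|-------|-------|
| Employer or Business Name | Info ware computer systems |
| Position or Title | Engineering Manager |
"""

records = parse_markdown_table(table)
print(records[1]["Value"])
```

With plain-text output, the same lookup would need position-based heuristics to work out which value belongs to which label.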
When Open-Source OCR Isn’t Enough
Open-source OCR tools are excellent for experimentation, prototyping, and smaller-scale projects. They provide developers with flexibility, researchers with a testbed for innovation, and organizations with a cost-effective way to start digitizing their documents. For many use cases, such as extracting tables from PDFs or automating simple form data entry, these solutions can work remarkably well.
At the same time, open-source OCR has matured significantly, with both traditional engines and modern LLM-based approaches offering practical options. Tools like Tesseract, PaddleOCR, and Docling remain strong choices for structured, predictable documents, while MistralOCR and other LLM-driven solutions shine when dealing with irregular layouts or complex content.
The trade-off is clear:
Traditional OCR tools deliver speed, stability, and simplicity.
LLM-based OCR tools offer flexibility and context awareness, but may struggle with consistency and efficiency at scale.
Looking ahead, the future of open-source OCR likely lies in hybrid models that combine the efficiency of traditional engines with the adaptability of AI-driven methods. Such hybrids should narrow the trade-offs between accuracy, speed, and scalability.
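One simple way to approximate a hybrid setup today is confidence-based routing: run the fast engine first and send a page to the LLM-based engine only when the fast result looks unreliable. The sketch below shows one possible arrangement, not an established design. `OcrResult`, `hybrid_ocr`, the stub engines, and the 0.85 threshold are all assumptions made for illustration, and real engines report confidence in different ways, if at all.

```python
from typing import Callable, NamedTuple


class OcrResult(NamedTuple):
    text: str
    confidence: float  # 0..1; assumes the fast engine reports a score


def hybrid_ocr(
    page: bytes,
    fast_engine: Callable[[bytes], OcrResult],
    llm_engine: Callable[[bytes], str],
    threshold: float = 0.85,  # illustrative cut-off, tune per document type
) -> tuple[str, str]:
    """Try the fast engine first; fall back to the LLM engine on low confidence."""
    result = fast_engine(page)
    if result.confidence >= threshold:
        return result.text, "fast"
    return llm_engine(page), "llm"


# Stub engines standing in for real ones so the sketch runs on its own.
def stub_fast(page: bytes) -> OcrResult:
    # Pretend that short inputs are clean scans and long ones are messy.
    return OcrResult("fast text", 0.95 if len(page) < 10 else 0.40)


def stub_llm(page: bytes) -> str:
    return "llm text"


print(hybrid_ocr(b"clean", stub_fast, stub_llm))
print(hybrid_ocr(b"messy handwritten page", stub_fast, stub_llm))
```

Routing this way keeps the expensive model off the pages that do not need it, so compute cost scales with the share of difficult documents rather than with total volume.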
However, for enterprise-scale operations, the limitations of open-source OCR become more evident. High document volumes, compliance-heavy industries, and mission-critical workloads demand guaranteed reliability, uptime, and dedicated support, which are areas where community-driven projects may fall short.
That’s why many organizations turn to LLMWhisperer as the next step. It builds on the strengths of open-source OCR while providing a scalable, production-ready pipeline designed for enterprise needs. With managed infrastructure, compliance features, and the ability to process millions of documents efficiently, LLMWhisperer transforms OCR from a helpful tool into a mission-critical capability.
For teams exploring open-source OCR today, the path is clear: start with open source, experiment, learn, and when the time comes to scale, move forward with solutions like LLMWhisperer.
Nuno Bispo is a Senior Software Engineer with more than 15 years of experience in software development.
He has worked in various industries such as insurance, banking, and airlines, where he focused on building software using low-code platforms.
Currently, Nuno works as an Integration Architect for a major multinational corporation.
He has a degree in Computer Engineering.