Contract OCR: Guide to Extracting Data from Contracts

What is Contract Processing?

Contract processing refers to the systematic management of contracts throughout their lifecycle, from creation and negotiation to execution, storage, and renewal. Contracts govern business transactions, partnerships, and compliance obligations, making their accurate management essential for operational efficiency.

Key aspects of contract processing include:

  • Contract Review & Approval: Assessing obligations, risks, and compliance before signing.
  • Validation & Compliance Checks: Ensuring legal and regulatory alignment.
  • Contract Data Extraction: Capturing key details such as parties involved, contract dates, obligations, and payment terms.
  • Storage & Retrieval: Keeping contracts accessible for audits, legal cases, or negotiations.
  • Renewal & Expiry Management: Tracking expiration dates to prevent lapses or missed renegotiations.

Industries like finance, healthcare, real estate, supply chain, and legal services rely heavily on contract processing to maintain regulatory compliance and operational integrity.

Why Contract Processing Matters

  • Enterprises Handle High Volumes of Contracts – Manually managing thousands of contracts is inefficient, leading to missed deadlines and compliance risks. Automated contract OCR enables businesses to process large volumes accurately.
  • Contracts Contain Critical Business Data – Key contract terms impact financial planning, vendor obligations, and risk management. OCR contract management software extracts and organizes this data for better decision-making.
  • Ensuring Compliance & Risk Mitigation – Contracts are legally binding, and missed deadlines or misinterpreted clauses can result in disputes or penalties. Automated contract data extraction ensures adherence to payment terms, renewal clauses, and service obligations.
  • Digital Transformation & AI Adoption – AI-driven contract extraction replaces outdated manual workflows, improving contract accessibility, searchability, and integration with enterprise systems.

Example: A procurement team can use contract OCR to extract vendor details, pricing structures, and renewal dates, ensuring timely renegotiations and compliance.
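
To make the procurement example concrete, here is a minimal sketch of how extracted renewal dates could drive a renegotiation reminder. The ContractRecord class, the vendor names, and the 60-day window are illustrative assumptions, not part of any specific product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContractRecord:
    """Hypothetical record holding fields extracted from one contract."""
    vendor: str
    renewal_date: date


def due_for_renewal(records, today, window_days=60):
    """Return records whose renewal date falls within the next window_days."""
    cutoff = today + timedelta(days=window_days)
    return [r for r in records if today <= r.renewal_date <= cutoff]


# Illustrative data only.
records = [
    ContractRecord("Acme Supplies", date(2025, 3, 1)),
    ContractRecord("Globex Logistics", date(2025, 9, 15)),
]
upcoming = due_for_renewal(records, today=date(2025, 2, 1))
print([r.vendor for r in upcoming])
```

Passing today as an argument, rather than reading the system clock inside the function, keeps the check easy to test.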


Importance of Automating Contract Processing

Business Benefits of Contract OCR & Data Extraction

  • Faster Contract Processing – AI-powered contract OCR software scans and extracts key contract details within minutes, compared to hours of manual review.
  • Improved Accuracy & Reduced Errors – AI-driven contract extraction minimizes human errors in financial terms, renewal clauses, and compliance checks.
  • Cost Savings in Legal & Administrative Operations – Automating contract data extraction reduces the need for manual data entry, document storage, and compliance audits, leading to a 40-50% reduction in administrative expenses.
  • Better Compliance & Risk Management – AI-based contract OCR management alerts businesses about contract obligations, expiration dates, and legal risks, preventing penalties or disputes.
  • Enhanced Searchability & Document Retrieval – OCR contract management converts scanned PDFs into structured data formats (JSON, Excel, or databases), allowing quick search and retrieval of specific contract terms.
  • Seamless Integration with Enterprise Systems – Contract data integrates smoothly into ERP, CRM, and CLM platforms, allowing automated workflows for payment approvals, compliance tracking, and contract renewals.

Example: A finance department can automate contract OCR to extract payment schedules and set up automated invoice processing, reducing late fees and financial mismanagement.

Impact on Contract Management

  • Centralized Insights for Decision-Making – AI-driven contract OCR allows businesses to track obligations, vendor performance, and negotiation opportunities.
  • Better Lifecycle Tracking – Automated contract data extraction ensures smooth renewals, amendments, and audits, reducing contract disputes.
  • Scalability for Large Enterprises – AI-powered contract extraction software handles thousands of contracts without additional workforce costs.
  • AI-Driven Contract Analytics – Machine learning detects risky clauses, frequent disputes, and supplier performance trends, enhancing contract intelligence.

Challenges in Automating Contract Data Extraction

1. Dealing with Unstructured Data in Contracts

Contracts exist in multiple formats:

  • Standardized digital contracts – Easily extractable structured data.
  • Scanned PDFs and faxed documents – Require OCR conversion and image enhancement.
  • Handwritten agreements – Demand advanced handwriting recognition.

AI-powered contract OCR solutions leverage NLP and deep learning to interpret legal language and extract structured contract data.

2. OCR Limitations in Contract Data Extraction

  • Poor Scan Quality & Misalignment Issues – Contracts may be faded, distorted, or misaligned, leading to text recognition errors. Advanced OCR contract software applies image enhancement, noise reduction, and de-skewing techniques to improve text clarity.
  • Inconsistent Formatting & Complex Layouts – Contracts contain tables, clauses, and checklists, making structured data extraction difficult.
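
As an illustration of the noise reduction step mentioned above, the sketch below applies a 3x3 median filter to a tiny grayscale image stored as a list of rows. This is a generic textbook technique, shown here under the assumption that isolated dark specks are the noise to remove; it is not a description of how any particular OCR product cleans scans.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighbourhood.

    Pixels on the border use only the neighbours that exist.
    """
    height, width = len(image), len(image[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median() averages the two middle values for even counts,
            # so convert back to an integer pixel value.
            row.append(int(median(neighbours)))
        result.append(row)
    return result


# A white patch (255) with a single black speck (0) in the middle.
noisy = [
    [255, 255, 255],
    [255, 0, 255],
    [255, 255, 255],
]
cleaned = median_filter_3x3(noisy)
```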

3. Data Privacy & Security Concerns

Contracts often contain confidential financial data, legal terms, and intellectual property rights. Businesses need contract extraction tools that comply with GDPR, SOC 2, HIPAA, and other data protection regulations.

4. Scalability & High-Volume Contract Processing

Organizations processing thousands of contracts across multiple departments require high-speed AI-driven OCR contract solutions that:

  • Process bulk contracts efficiently.
  • Support multiple languages and legal terminologies.
  • Integrate with contract management and ERP systems.

Example: A global enterprise using automated contract data extraction can process multi-language contracts from different vendors, ensuring compliance across all legal jurisdictions.

Automating contract OCR and data extraction is no longer a luxury—it’s a necessity for businesses handling large contract volumes. AI-powered contract OCR management solutions provide:

  • Speed & Efficiency – Extract contract data in minutes, reducing manual review time.
  • Accuracy & Compliance – Minimize contract risks, legal disputes, and missed obligations.
  • Cost Savings – Cut down administrative costs by 40-50%.
  • Scalability – Handle thousands of contracts without increasing manual workload.

Businesses leveraging contract extraction software gain structured, searchable, and actionable contract data, enabling faster decision-making, risk mitigation, and seamless contract lifecycle management.


Introducing Unstract: AI-Powered Contract Data Extraction

Overview: Unstract’s Role in Structuring Unstructured Data

Contracts are inherently unstructured documents, often presented as scanned PDFs, images, or handwritten agreements. Extracting critical data manually from these contracts is not only time-consuming but also prone to human errors. This is where Unstract comes into play.

Unstract is an AI-powered contract extraction software that automates contract OCR and contract data extraction, transforming unstructured contract documents into structured, machine-readable formats such as JSON, Excel, or databases. By leveraging advanced AI models, NLP, and OCR technology, Unstract ensures high accuracy, speed, and compliance in contract processing.

With Unstract, businesses can:

  • Extract data from contracts in seconds, eliminating the need for manual review.
  • Process handwritten, scanned, and complex contract layouts with AI-enhanced accuracy.
  • Ensure compliance and legal transparency with automated tracking and structured data.
  • Integrate extracted contract data into CLM (Contract Lifecycle Management), ERP, and CRM platforms seamlessly with APIs.

From procurement and vendor contracts to employment agreements and legal documents, Unstract streamlines OCR contract management, ensuring businesses capture, analyze, and manage contract data effortlessly.

Capabilities That Simplify Contract Data Extraction

1. Contract OCR for Text & Layout Preservation

Traditional OCR contract tools often struggle with complex contract layouts, multi-column text, and tabular data. Unstract’s contract OCR offers:

  • Preservation of contract layouts, including sections, tables, and clauses.
  • Accurate contract data extraction, even from badly aligned, low-quality scans.
  • Recognition of handwritten notes, annotations, and checkboxes in contracts.

Example: A procurement contract containing pricing tables, clauses, and multiple signatures can be extracted with its structure intact, allowing seamless contract analysis and automation.

2. Extraction from Scanned & Handwritten Contracts

Handwritten contracts, scanned documents, and faxed agreements are common in industries such as real estate, legal services, and finance. Unstract can:

  • Recognize printed and handwritten text with high accuracy.
  • Extract contract metadata, including signature blocks, handwritten notes, and key financial figures.
  • Process distorted, smudged, or low-resolution contract documents without data loss.

Example: A financial institution processing mortgage agreements can use Unstract to extract handwritten borrower details, loan terms, and repayment schedules, ensuring seamless contract compliance tracking.

Key Features of Unstract for Contract Data Extraction

1. LLMWhisperer: A general-purpose contract OCR/parser built specifically for LLM consumption

At the core of Unstract’s contract OCR solution is LLMWhisperer, an advanced AI engine built for high-accuracy contract data extraction. LLMWhisperer ensures:

  • Multi-format contract recognition (scanned PDFs, digital PDFs, images, and handwritten contracts).
  • Layout-preserving extraction, maintaining contract tables, line items, and structured data.
  • Intelligent clause detection, allowing businesses to search, analyze, and retrieve contract obligations easily.

Example: A legal team managing franchise agreements can use LLMWhisperer to scan and extract all renewal clauses automatically, ensuring no deadlines are missed.

2. Handling Low-Quality Scans & Handwritten Contracts

Most OCR contract extraction tools fail when dealing with blurred scans, handwritten annotations, or document misalignment. LLMWhisperer overcomes these challenges with:

  • Automated text enhancement (contrast adjustment, de-skewing, noise reduction).
  • Intelligent text recognition for damaged or aged documents.
  • Auto-mode switching between OCR and NLP-based extraction to ensure optimal contract data capture.

Example: A real estate firm digitizing old property lease contracts can use Unstract to extract rental agreements, even from faded or handwritten documents, ensuring historical data remains searchable.

3. Structured Data Output: JSON, Excel & API Integration

Extracted contract data needs to be usable and accessible for business intelligence, legal analytics, and contract automation. Unstract enables:

  • Automated contract data extraction into JSON, Excel, or database-ready formats.
  • Seamless integration with contract management platforms (CLM, ERP, CRM) through APIs.

Example: A procurement team using SAP ERP can export extracted vendor contract data from Unstract into their finance system, ensuring automatic reconciliation of purchase orders and payments.
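
For teams that want extracted JSON in a spreadsheet, a minimal sketch of the JSON-to-CSV step is shown below. The field names and the sample record are assumptions made for illustration; a real export would use whatever fields your extraction returns.

```python
import csv
import io


def records_to_csv(records, fieldnames):
    """Serialise a list of flat dicts to CSV text that Excel can open."""
    buffer = io.StringIO()
    # extrasaction="ignore" drops keys that are not listed in fieldnames.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Illustrative record only.
records = [
    {"vendor": "Acme Supplies", "contract_number": "C-001", "total": "10000"},
]
csv_text = records_to_csv(records, ["vendor", "contract_number", "total"])
print(csv_text)
```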

Unstract is an open-source no-code LLM platform to launch APIs and ETL pipelines to structure unstructured documents. Get started with this quick guide.

Why Unstract is the Future of Contract OCR & Contract Data Extraction

With businesses handling thousands of contracts, the need for automated contract data extraction has never been more critical. Unstract revolutionizes OCR contract management with:

  • Faster, AI-driven contract processing, reducing manual workload.
  • High-accuracy contract OCR, handling complex layouts, scanned PDFs, and handwritten agreements.
  • Structured contract data extraction, enabling seamless integration with enterprise systems via API.
  • Advanced AI models for contract intelligence, improving compliance, negotiation, and risk management.

Whether you’re a legal firm, procurement team, financial institution, or enterprise managing vendor contracts, Unstract ensures error-free, scalable, and automated contract data extraction—empowering businesses to make data-driven decisions with confidence.

Step-by-Step: Extracting Data from Contracts Using Unstract

Contract data extraction no longer has to be a slow, error-prone manual process. With Unstract’s LLMWhisperer, businesses can now extract structured and highly accurate contract data automatically, even from scanned, handwritten, or misaligned documents.

In this section, we will explore two efficient methods to extract contract data:

  • Using the LLMWhisperer Playground – A no-code solution for testing real-time OCR contract extraction.
  • Using the LLMWhisperer API with Postman – Extract contract data programmatically with structured JSON output.

Real-Time Contract OCR Extraction Using LLMWhisperer Playground

Unstract’s LLMWhisperer Playground provides a quick and interactive way to upload and extract data from contracts in real-time. The technology ensures that even badly aligned, tilted, or handwritten contracts are processed with high accuracy while maintaining a structured format.


Uploading a Misaligned Contract for Testing

  1. Open the LLMWhisperer Playground.
  2. Click on Upload Document and select a contract that is tilted or misaligned (e.g., a sales contract tilted at 25 degrees).
  3. The system will automatically process the document and output a layout-preserved text version of the contract.

Key Observations from the Extraction

  • Extracted data is well-aligned, fully structured, and complete.
  • No details are missing in this test: every section of the contract is preserved.
  • Tables, clauses, and metadata are extracted and formatted properly for further processing.

This demonstrates the power of Unstract’s LLMWhisperer, which can handle skewed, rotated, or complex contract layouts while preserving formatting—something traditional OCR tools often struggle with.

LLMWhisperer: A general-purpose PDF-to-text converter service from Unstract.


LLMs are powerful, but their output is only as good as the input you provide. LLMWhisperer is a technology that presents data from complex documents (different designs and formats) to LLMs in a way that they can best understand.

If you want to quickly test LLMWhisperer with your own documents, you can check our free playground. Alternatively, you can sign up for our free trial which allows you to process up to 100 pages a day for free.

Contract Data OCR via LLMWhisperer API

Unstract’s LLMWhisperer API enables businesses to extract contract data programmatically, making it ideal for batch processing and integration with contract lifecycle management (CLM), enterprise resource planning (ERP), and customer relationship management (CRM) systems.

Why Use the API?

  • Automate large-scale contract processing efficiently.
  • Preserve tables, checkboxes, and structured elements using table border detection settings.

Step 2: Upload a Contract via Postman

  1. Open Postman and create a new POST request.
  2. Set the Request Type to POST.
  3. Paste the API URL: https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper
  4. Go to the Headers tab and add:
    • Key: unstract-key
    • Value: (Paste your API Key here)
  5. Navigate to the Body tab and select binary.
  6. Click Select File and upload a contract document (e.g., a tilted Services Agreement with handwritten text).
  7. Click Send to initiate the request.
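
The same upload can be scripted instead of sent from Postman. The sketch below builds the POST request with Python’s standard library, using the URL and the unstract-key header from the steps above. The Content-Type value is an assumption about what a binary upload should declare, so check it against the API reference before relying on it.

```python
import urllib.request

WHISPER_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper"


def build_whisper_request(api_key, document_bytes):
    """Build (but do not send) the POST request that uploads a document."""
    return urllib.request.Request(
        WHISPER_URL,
        data=document_bytes,
        headers={
            "unstract-key": api_key,
            # Assumption: raw binary body, mirroring Postman's "binary" option.
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


request = build_whisper_request("YOUR_API_KEY", b"%PDF-1.4 sample bytes")

# To send it for real, read the file and call urlopen:
# with open("contract.pdf", "rb") as f:
#     request = build_whisper_request("YOUR_API_KEY", f.read())
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```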

API Response

If the request is successful, you will receive a processing confirmation message with a whisper_hash, which is required for retrieving the extracted data.

{
  "message": "Whisper Job Accepted",
  "status": "processing",
  "whisper_hash": "xxxxxa96|xxxxxxxxxxxxxxxxxxx4ed3da759ef670f"
}

Step 3: Retrieve the Extracted Contract Data

  1. Create a new GET request in Postman.
  2. Paste the Retrieval URL:  https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-retrieve
  3. Go to the Headers tab and add:
    • Key: unstract-key
    • Value: (Paste your API Key here)
  4. In the Params tab, add the following parameters:
    • Key: whisper_hash → (Paste the whisper_hash from Step 2)
    • Key: textonly → true (Extracts plain text from the contract)
    • Key: mark_vertical_lines → true (Preserves vertical table borders for better structure)
    • Key: mark_horizontal_lines → true (Preserves horizontal table borders for better structure)
  5. Click Send to retrieve the structured contract data.
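
A scripted version of this retrieval step might look like the sketch below. It reads the whisper_hash from the acceptance response and builds the GET request. The parameter names are copied from the steps above and should be confirmed against the current API reference; the hash value in the example is a made-up placeholder.

```python
import json
import urllib.parse
import urllib.request

RETRIEVE_URL = "https://llmwhisperer-api.us-central.unstract.com/api/v2/whisper-retrieve"


def build_retrieve_request(api_key, accepted_response_json):
    """Build (but do not send) the GET request that fetches the extracted text."""
    whisper_hash = json.loads(accepted_response_json)["whisper_hash"]
    params = {
        "whisper_hash": whisper_hash,
        # Parameter names below follow the article's Postman steps.
        "textonly": "true",
        "mark_vertical_lines": "true",
        "mark_horizontal_lines": "true",
    }
    # urlencode escapes characters such as "|" that may appear in the hash.
    url = RETRIEVE_URL + "?" + urllib.parse.urlencode(params)
    return urllib.request.Request(url, headers={"unstract-key": api_key}, method="GET")


# Placeholder acceptance response; the hash is not a real one.
accepted = '{"message": "Whisper Job Accepted", "status": "processing", "whisper_hash": "abc|123"}'
retrieve = build_retrieve_request("YOUR_API_KEY", accepted)
print(retrieve.full_url)
```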

Extracted Contract Data Output

This output demonstrates how Unstract’s LLMWhisperer API effectively extracts and organizes contract data, preserving all relevant details, including tables, metadata, and financial terms.



                                                                  FormNO: 05672AT3 

State of California 

                 SERVICES                  AGREEMENT 

 This Services Agreement (this "Agreement") is entered into as of the 5 day of March 
 2024, by and among/between: 

 Service Provider(s): Valley wood        works Inc                    [Name], located at 
            23, rosewood avenue, CA 96162 [Address] (collectively "Service 
 Provider") and 

 Buyer(s):    Twinings         threads Inc                  [Name], located at 
        123B, Beach walk               avenue, CA [Address] (collectively "Buyer"). 

 Each Service Provider and Buyer may be referred to in this Agreement individually as a "Party" and 
 collectively as the "Parties." 

 1. Services. Service Provider agrees to provide and Buyer agrees to purchase the following services for 
 the specific projects described below: 

                      Description of Services                 Number Projects of Price Project per 
         Building     Paint    work,      external                1      $   2350 
        Building      paint work,          internal               1      $   3000 
                       logistics         estimate                 1      $   430 
                 Tools      cost    estimate                             $ 3000 
                                                                         $ 
                                                                         $ 

 2. Purchase Price. Buyer will pay to Service Provider and for all obligations specified in this Agreement, 
if any, as the full and complete purchase price, the sum of $ 10000 

Unless otherwise stated, (Check one) [X] Service Provider [ ] Buyer shall be responsible for all taxes 
in connection with the purchase of Services in this Agreement. 

3. Payment. Payment for the Services will be by: (Check on) 

   [ ] Cash                                        [X] Credit or debit card 
   [ ] Personal check                              [ ] Wire transfer 
   [ ] Cashier's check                             [ ] Other: 
   [ ] Money order 
<<<

 according to the following schedule: (Check all that apply) 

         [ ] Amount previously paid by the Buyer. $            previously paid by Buyer. 
         [ ] Down payment. $              upon the execution of this Agreement. 
         [X] Payment for the Services. Full payment: $ 10000    upon the completion of the 
        services. OR Installments: $           on                                  [Due day of 
        installment payments], until the purchase price has been paid in full. 

 4. Right of Inspection. (Check one) 

 [X] There is NO right to inspection. 

 [X] Buyer shall be allowed to examine the final products once received and shall do so within 
 days after the receipt of the final products. In the event that Buyer discovers any problems, shortcomings, 
 errors, or other nonconformance of the services, Buyer shall notify Service Provider within days 
 after completion of the services or discovery of the problems, whichever is sooner. Failure to notify 
 Service Provider by such date shall constitute an acceptance of Services. In the event the services do not 
meet the standards of this contract, Buyer may at its option: (Check all that apply) 

    [X] Request one revision of the product provided 
    [X] Terminate the contract following payment for 50% of the services 

    The above shall be the sole remedies of Buyer and only obligations of Service Provider with respect to 
   any Services. 

5. Security Interest. Buyer hereby grants to Service Provider a security interest in any final products 
resulting from said services, until Buyer has paid Service Provider in full. Buyer shall sign and deliver any 
document needed to perfect the security interest that Service Provider reasonably requests. 

6. Force Majeure. Service Provider shall not be responsible for any claims or damages resulting from any 
delays in performance or for non-performance due to unforeseen circumstances or causes beyond 
Service Provider's reasonable control. 

7. Limitation of Liability. Service Provider will not be liable for any indirect, special, consequential, or 
punitive damages (including lost profits) arising out of or relating to this Agreement or the transactions it 
contemplates (whether for breach of contract, tort, negligence, or other form of action) and irrespective of 
whether Service Provider has been advised of the possibility of any such damage. In no event will Service 
Provider's action. liability exceed the price paid by Buyer for the Services giving rise to the claim or cause of 

8. Assignment. (Check one) 

[X] SERVICE PROVIDER needs permission to assign to a third party. Seller may not assign any of its 
rights under this Agreement or delegate any performance under this Agreement, except with the prior 

 L 
<<<
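
Because the layout-preserved text keeps checkbox markers such as [X] and [ ], downstream code can read them directly. The sketch below is one hedged way to list the checked options with a regular expression. It assumes each option label ends where the next checkbox starts or where the line ends, which holds for the payment section above but may not hold for every form.

```python
import re

# A checkbox, its label, then either another checkbox or the end of the line.
CHECKBOX = re.compile(r"\[([ X])\]\s+(.*?)(?=\s+\[[ X]\]|\s*$)")


def checked_options(text):
    """Return the labels of all checked boxes, scanning line by line."""
    found = []
    for line in text.splitlines():
        for mark, label in CHECKBOX.findall(line):
            if mark == "X":
                found.append(label)
    return found


# Two lines taken from the payment section of the output above.
sample = (
    "   [ ] Cash                                        [X] Credit or debit card \n"
    "   [ ] Personal check                              [ ] Wire transfer \n"
)
print(checked_options(sample))
```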

Why Use LLMWhisperer for Contract OCR?

  • High-accuracy contract OCR & data extraction for structured contract processing.
  • Handles scanned, misaligned, and handwritten contracts with near-perfect accuracy.
  • Preserves structured contract layouts, including tables, checkboxes, and metadata.

Extracting contract data no longer needs to be a manual, time-consuming process. Whether dealing with procurement agreements, vendor contracts, or sales agreements, Unstract’s LLMWhisperer allows businesses to digitize, structure, and analyze contract data effortlessly.

  • Test in Playground – Upload a contract and experience real-time OCR extraction.
  • API Access – Start automating contract data extraction immediately.

With Unstract’s AI-powered contract extraction, businesses can streamline contract management, reduce processing time, enhance compliance, and improve decision-making.


Setting Up Unstract for Automated Contract Data Extraction

Extracting data from contracts efficiently and accurately is crucial for legal, financial, and operational processes. Unstract simplifies this by integrating AI-powered OCR, embeddings, vector databases, and LLMs to process contracts seamlessly.

In this section, we will set up Unstract, configure LLMWhisperer, and use the LLMChallenge feature to compare multiple LLMs for contract OCR and automated contract data extraction.

Step 1: Setting Up Unstract

To begin contract extraction, follow these steps:

1.1 Sign Up for Unstract

  • Go to the Unstract platform → Sign Up Here
  • Create a free account – Gain access to Prompt Studio, LLMWhisperer, and API integrations.

1.2 Configure OpenAI for LLM Integration

Unstract uses Large Language Models (LLMs) to extract structured contract data. We will set up two LLM profiles to enable LLMChallenge mode, allowing us to compare results from different AI models.

Steps to Configure LLM Profiles:

  1. Navigate to Settings → LLMs.
  2. Click on “New LLM Profile” and select an LLM provider.
  3. Enter the required details, such as:
    • LLM Name: (e.g., OpenAI GPT-3.5 Turbo)
    • Provider: Choose OpenAI or another provider.
    • API Key: Paste the corresponding API key.
    • Vector Database: Select the configured vector DB.
    • Embedding Model: Choose an embedding model for better data mapping.
  4. Click “Add” to finalize the profile.

Now, set up a second LLM (e.g., Gemini or Claude) to use in LLMChallenge.

1.3 Configure Embeddings for Contract Data Mapping

Embeddings convert contract text into structured numerical vectors, allowing for better data retrieval and relationship mapping.

Steps to Set Up Embeddings:

  1. Go to Settings → Embedding Profiles.
  2. Click on “New Embedding Profile”.
  3. Select an embedding provider (e.g., OpenAI, Cohere, or Hugging Face).
  4. Configure the profile by entering the necessary API details.
  5. Click “Save” to complete the setup.

Embeddings help detect contract patterns, terms, and clauses more efficiently.

1.4 Connect a Vector Database for Contract Data Storage

A Vector Database (Vector DB) stores and retrieves embeddings efficiently, making it faster to extract contract data from scanned and unstructured documents.

Steps to Set Up a Vector Database:

  1. Go to Settings → Vector DBs.
  2. Click on “New Vector DB Profile”.
  3. Choose a database type (e.g., PostgreSQL, Pinecone, or Weaviate).
  4. Enter the configuration details (e.g., connection URL, API key).
  5. Click “Save” to integrate the database with Unstract.

Vector DB ensures that extracted contract data is stored, indexed, and retrieved quickly.
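
To show what embeddings and a vector database do conceptually, the toy sketch below ranks contract chunks by cosine similarity to a query. The bag-of-words embed function is a deliberately simple stand-in for a real embedding model, and the in-memory list stands in for a vector database; neither reflects how Unstract implements these components.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query, chunks):
    """Return the chunk whose vector is closest to the query vector."""
    query_vec = embed(query)
    return max(chunks, key=lambda chunk: cosine_similarity(query_vec, embed(chunk)))


chunks = [
    "Payment for the Services will be by credit or debit card",
    "Service Provider shall not be responsible for delays beyond its control",
]
best = most_similar("how will payment be made", chunks)
print(best)
```

A real setup replaces embed with calls to the configured embedding provider and stores the vectors in the configured database, but the retrieval idea is the same.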

1.5 Integrate LLMWhisperer for Contract OCR & Data Extraction

LLMWhisperer processes scanned contracts, preserves layout, and extracts structured data with high precision.

Steps to Set Up LLMWhisperer:

  1. Go to Settings → Text Extractor.
  2. Click on “New Text Extractor”.
  3. Select “LLMWhisperer” as the extraction engine.

Now, we are ready to extract contract data using Unstract! 🚀

Step 2: Creating a Prompt Studio Project

2.1 Create a New Project in Prompt Studio

  1. Navigate to Prompt Studio in Unstract.
  2. Click on “New Project”.
  3. Enter project details (e.g., “Contract Data Extraction”).
  4. Click “Save” to create the project.

2.2 Upload Contracts for Testing

  1. Go to Manage Documents within Prompt Studio.
  2. Click on “Upload Files”.
  3. Upload sample contract PDFs (both structured and handwritten).

The uploaded contracts will be used to develop and test the contract extraction prompts.

Step 3: Enabling LLMChallenge for Comparing Multiple LLMs

3.1 Add a Second LLM Profile for Comparison

  1. Go to Settings → LLM Profiles.
  2. Click on “Add New LLM Profile”.
  3. Enter required details:
    • Name (e.g., “Gemini Contract Extractor”).
    • LLM Provider (e.g., OpenAI GPT-4, Claude, Gemini).
    • Vector DB, Embedding Model, and Text Extractor.
    • Define chunk size and overlap settings.
  4. Click “Add” to save the second LLM profile. This enables side-by-side comparisons between multiple LLMs for contract extraction.
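
The chunk size and overlap settings control how a long contract is split before it is embedded. The sketch below illustrates the idea with a character-based splitter. Splitting by characters is an assumption made to keep the example short; production systems often split by tokens or sentences instead.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into chunks of chunk_size characters that share overlap characters."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    chunks = []
    start = 0
    while True:
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


chunks = chunk_text("abcdefghij", chunk_size=4, overlap=2)
print(chunks)
```

The overlap means a clause that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.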

3.2 Activate LLMChallenge Mode

Unstract’s LLMChallenge allows you to compare multiple LLMs for contract OCR, helping you pick the best-performing model based on accuracy, cost, and speed.

Steps to Enable LLMChallenge Mode:

  1. Click the Settings button (top-right corner in Prompt Studio).
  2. Go to the LLMChallenge tab.
  3. Select a secondary LLM provider (e.g., Gemini or Claude).
  4. Tick the box to Enable LLMChallenge.
  5. Click Save to apply settings.

Now, contract data will be extracted using multiple LLMs, and we can compare their outputs!

Step 4: Writing Prompts for Key Contract Fields

In Prompt Studio, we define key contract fields and their extraction prompts.

4.1 Define Key Fields & Prompts (Output Format: JSON)

  1. Field: contract_number
    • Prompt: “Extract the contract number mentioned in the document in JSON format.”
  2. Field: contract_date
    • Prompt: “Extract the date on which the contract was signed in JSON format.”
  3. Field: buyer_details
    • Prompt: “Extract full buyer details from the contract in structured JSON format.”
  4. Field: vendor_information
    • Prompt: “Extract vendor details such as company name, address, and contact info in JSON format.”
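
If you keep the same field definitions in code, for example to validate API responses later, a small mapping is enough. The sketch below mirrors the four fields above and reports which ones an output is missing. The missing_fields helper and the sample output are illustrative assumptions, not part of Prompt Studio.

```python
FIELD_PROMPTS = {
    "contract_number": "Extract the contract number mentioned in the document in JSON format.",
    "contract_date": "Extract the date on which the contract was signed in JSON format.",
    "buyer_details": "Extract full buyer details from the contract in structured JSON format.",
    "vendor_information": "Extract vendor details such as company name, address, and contact info in JSON format.",
}


def missing_fields(output, expected=FIELD_PROMPTS):
    """Return expected field names that are absent or empty in an extraction output."""
    return [name for name in expected if output.get(name) in (None, "", {}, [])]


# Illustrative partial output.
partial_output = {
    "contract_number": {"contract_number": "BM-SK/18/04"},
    "contract_date": {"date": "18/06/2024"},
}
print(missing_fields(partial_output))
```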

4.2 Comparing LLM Outputs

  • View results from both LLMs (default + challenger).
  • Click “Combined Output” to see a merged JSON response.
  • Compare accuracy, cost, and processing time.

This step ensures you select the most accurate and cost-efficient LLM for contract OCR.
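
When both outputs are exported as JSON, the comparison can also be done in code. The sketch below lists the fields on which two outputs disagree. It is a plain dictionary comparison written for illustration and does not reproduce how LLMChallenge evaluates models; the sample values are made up.

```python
def disagreements(default_output, challenger_output):
    """Map each field with differing values to a (default, challenger) pair."""
    fields = sorted(set(default_output) | set(challenger_output))
    return {
        field: (default_output.get(field), challenger_output.get(field))
        for field in fields
        if default_output.get(field) != challenger_output.get(field)
    }


# Illustrative outputs from two models.
default_output = {"contract_number": "BM-SK/18/04", "contract_date": "18/06/2024"}
challenger_output = {"contract_number": "BM-SK/18/04", "contract_date": "18/04/2024"}
diff = disagreements(default_output, challenger_output)
print(diff)
```

Fields that appear here are the ones worth checking against the source document by hand.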

Step 5: Exporting & Deploying the Contract Extraction API

5.1 Export as a Reusable API Tool

  1. Click “Export as Tool” in Prompt Studio.
  2. Assign a name (e.g., “Contract Parser API”).

Create a Workflow

  • Navigate to Build → Workflows and click New Workflow.
  • Drag and drop the exported tool into the workflow area.
  • Set both the input and the output to API.

5.2 Deploy the API in Unstract

  1. Navigate to Manage → API Deployments.
  2. Click “+ API Deployment”.
  3. Enter API details:
    • Name: “Contract Extraction API”
    • Description: Extract structured contract data.
    • API Name: Set a unique identifier.
  4. Click “Save”.

Copy the API endpoint & API key for integration.

Now, your Contract OCR API is live and ready for automation! 

Testing the Deployed API with Postman for Contract Data Extraction

Once we have successfully deployed the contract extraction API using Unstract, the next step is to validate the workflow using Postman. This ensures that the API extracts key contract details accurately, maintains formatting, and aligns all extracted fields properly.

In this section, we will go through the Postman testing process, including:

  • Uploading a sample contract to test extraction.
  • Reviewing extracted data in structured JSON format.
  • Confirming correct metadata, clauses, formatting, and alignment.

Step 1: Open Postman & Create a New Request

  1. Go to Postman and sign in (or create a free account).
  2. Click “New Request” and set the request method to POST.
  3. Paste the API deployment URL (copied from Unstract’s API deployment step).

This URL will be used to send contract files for extraction.

Step 2: Configure Authorization Header

  1. Navigate to the “Headers” tab.
  2. Add the following key-value pair:
    • Key: unstract-key
    • Value: YOUR_API_KEY (copied from Unstract’s “Manage Keys” section).

This authenticates your API request and allows Unstract to process the document.

Step 3: Upload the Contract File

  1. Go to the “Body” tab.
  2. Select the “form-data” option.
  3. Add a new key-value pair:
    • Key: files
    • Type: File
    • Value: Upload the contract document (e.g., a scanned or unstructured contract PDF).

Now, we are ready to send the request and extract contract data!
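
Outside Postman, the same form-data upload can be assembled with Python’s standard library. The sketch below builds a multipart request with a files field and the header described above. The API_URL value is a hypothetical placeholder for the endpoint you copied from your deployment, and the per-file Content-Type is an assumption.

```python
import uuid
import urllib.request

# Hypothetical placeholder: use the endpoint copied from your API deployment.
API_URL = "https://example.invalid/deployment/api/your_org/contract_ocr/"


def build_upload_request(api_url, api_key, filename, file_bytes):
    """Build (but do not send) a multipart/form-data POST with one 'files' part."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        api_url,
        data=head + file_bytes + tail,
        headers={
            # Header name as given in the steps above.
            "unstract-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )


upload = build_upload_request(API_URL, "YOUR_API_KEY", "contract.pdf", b"%PDF-1.4 sample")

# To send it for real:
# with urllib.request.urlopen(upload) as response:
#     print(response.read().decode("utf-8"))
```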

Step 4: Send the API Request

  1. Click the “Send” button to initiate the contract data extraction.
  2. The API will process the uploaded document and return an initial response like this:
{
    "message": {
        "execution_status": "PENDING",
        "status_api": "/deployment/api/org_sfEFu2S9NcBVUTaR/contract_ocr/?execution_id=3a434d2a-e323-4546-b3ff-b88a8d95e97d",
        "error": null,
        "result": null
    }
}

Step 5: Retrieve Extracted Data in JSON Format

1. Open the Status API

  • Click the status_api link from the response above.
  • It will automatically open a new GET request in Postman.
  • Ensure that the Authorization header is still set correctly.

2. Send the GET Request

  • Click “Send” to check the extraction status.
  • If processing is complete, the API will return extracted contract data in JSON format.

The API efficiently extracts key contract details, including structured metadata and clauses.
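
In an automated pipeline, the status check is usually repeated until the job finishes. The sketch below shows one way to poll. The fetch_status callable is a hypothetical function that performs the GET request and returns the parsed JSON; the shape of the in-progress response is an assumption, and only the COMPLETED status is taken from the example response in this guide.

```python
import time


def wait_for_result(fetch_status, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status until it reports COMPLETED, then return that payload."""
    for attempt in range(max_attempts):
        payload = fetch_status()
        if payload.get("status") == "COMPLETED":
            return payload
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError("extraction did not complete in time")


# Demo with canned responses instead of real HTTP calls.
responses = iter([{"status": "PENDING"}, {"status": "COMPLETED", "message": []}])
result = wait_for_result(lambda: next(responses), sleep=lambda seconds: None)
print(result["status"])
```

Injecting fetch_status and sleep keeps the loop independent of any HTTP library and lets it run instantly in tests.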

Step 6: Reviewing the JSON Output

Once processing is complete, the API will return structured contract data in JSON format.

Example JSON Response (Extracted Contract Data)

{
    "status": "COMPLETED",
    "message": [
        {
            "file": "contract-ocr-1.pdf",
            "status": "Success",
            "result": {
                "output": {
                    "buyer_details": {
                        "address": "240 Hau Giang, Ward 9 District 6 Ho Chi Minh, Vietnam",
                        "company_name": "BINH MINH PLASTIC JOINT STOCK COMPANY",
                        "fax": "84-28-3960 6814",
                        "represented_by": {
                            "name": "Mr. Nguyen Hoang Ngan",
                            "position": "Sales Manager"
                        },
                        "tel": "84-28-3969 0973"
                    },
                    "commodity_details": {
                        "Commodity being sold": "High-density Polyethylene (HDPE)",
                        "Price": "1,000 USD/TON",
                        "Quantity": 3000,
                        "Total price": "3,000,000 USD",
                        "Unit": "TON"
                    },
                    "contract_date": {
                        "date": "18/06/2024"
                    },
                    "contract_number": {
                        "contract_number": "BM-SK/18/04"
                    },
                    "documents": {
                        "documents": [
                            "A full set (3/3) of original Bill of Lading",
                            "Commercial invoice in triplicate",
                            "Certificate of origin form KV",
                            "Packing list in triplicate",
                            "Certificate of quality"
                        ]
                    },
                    "packing_and_marking_details": {
                        "marking": {
                            "labeling": {
                                "address": "280, Jangpyeong-ro, Nam-gu, Ulsan, Korea",
                                "batch_number": "[Batch number]",
                                "manufacturer": "SK GLOBAL CHEMICAL CO., LTD",
                                "net_weight": "500 kg",
                                "product_name": "High-density polyethylene (HDPE)",
                                "warning_signs": [
                                    "Keep away from heat sources",
                                    "Do not tear",
                                    "Avoid moisture"
                                ]
                            }
                        },
                        "packing": {
                            "packaging_material": "PE plastic pellets will be packaged in bags made from polyethylene or polypropylene, often with an inner lining to prevent moisture.",
                            "packaging_method": "Use PP Jumbo Bags: Pack and seal 4 sides carefully to ensure the goods are not damaged during transportation and stacking on pallets.",
                            "packaging_quantity": "Plastic pellets will be packed into bags, each bag contains 500 kg"
                        }
                    },
                    "payment_details": {
                        "Banking charge": "All expenses and charges inside Vietnam are for buyer's account. All banking charges outside Vietnam are for seller's account.",
                        "Payment term": "Payment L/C 60 days from Bill of Lading date",
                        "Seller bank details": {
                            "Address": "280, Jangpyeong-ro, Nam-gu, Ulsan, Korea",
                            "Bank address": "1027-2, Dal-dong, Nam-gu, Ulsan, South Korea",
                            "Bank name": "Korea Exchange Bank (KEB Hana Bank) - Ulsan Branch",
                            "Beneficiary": "SK GLOBAL CHEMICAL CO., LTD",
                            "Seller bank account": "123456789",
                            "Swift code": "KOEXKRSEXXX"
                        }
                    },
                    "seller_details": {
                        "address": "280 Jangpyeong-ro, Nam-gu, Ulsan, South Korea",
                        "company": "SK GLOBAL CHEMICAL CO., LTD",
                        "fax": "82-2-2121-7001",
                        "represented": {
                            "name": "Mr. Gang In-gu",
                            "position": "Sales manager"
                        },
                        "tel": "82-32-6824311"
                    },
                    "shipment_details": {
                        "Notice of shipment": "Within 24 hours after the date of B/L, the Seller shall notify the Buyer by fax.",
                        "Partial shipment": "not allowed",
                        "Port of Departure": "Ulsan Seaport (KRUSN) of South Korea",
                        "Port of Destination": "Cat Lai Port of Vietnam",
                        "Time of delivery": "Within 50 days after receiving advance payment."
                    },
                    "third_party_involvement": {
                        "companies": [
                            "Saigon Port Shipping & Maritime Services Joint Stock Company (SAMSET JSC)",
                            "Korea Exchange Bank (KEB Hana Bank) - Ulsan Branch"
                        ],
                        "third_party": {
                            "documents": [
                                "Invoice",
                                "Packing",
                                "C/O",
                                "Bill of Lading"
                            ],
                            "responsibilities": [
                                "connecting with shipping line in Vietnam",
                                "taking cargoes instead of the Buyer"
                            ],
                            "role": "final buyer"
                        }
                    }
                }
            },
            "metadata": {
                "source_name": "contract-ocr-1.pdf",
                "source_hash": "17b1032099f3d32bb475a2982e05023fdea542ebce711ec245e75c3e1a92bdc3",
                "organization_id": "org_sfEFu2S9NcBVUTaR",
                "workflow_id": "f15ada33-b337-40f1-b7ae-c73a69d856eb",
                "execution_id": "3a434d2a-e323-4546-b3ff-b88a8d95e97d",
                "total_elapsed_time": 64.905783,
                "tool_metadata": [
                    {
                        "tool_name": "structure_tool",
                        "elapsed_time": 64.905774,
                        "output_type": "JSON"
                    }
                ]
            }
        }
    ]
}
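
To show how this response might be consumed downstream, here is a small sketch that pulls a few headline fields out of each successfully processed file. The summarize helper is hypothetical, and the field names are specific to this example output; a different extraction prompt would produce different keys.

```python
from datetime import datetime

def summarize(response):
    """Pull a few headline fields from each successfully processed file."""
    rows = []
    for item in response["message"]:
        if item["status"] != "Success":
            continue
        out = item["result"]["output"]
        # The sample date is day/month/year; convert it to ISO 8601.
        iso_date = datetime.strptime(
            out["contract_date"]["date"], "%d/%m/%Y"
        ).date().isoformat()
        rows.append({
            "file": item["file"],
            "contract_number": out["contract_number"]["contract_number"],
            "contract_date": iso_date,
            "buyer": out["buyer_details"]["company_name"],
            "seller": out["seller_details"]["company"],
        })
    return rows

# Trimmed copy of the example response above.
sample = {
    "status": "COMPLETED",
    "message": [{
        "file": "contract-ocr-1.pdf",
        "status": "Success",
        "result": {"output": {
            "buyer_details": {"company_name": "BINH MINH PLASTIC JOINT STOCK COMPANY"},
            "contract_date": {"date": "18/06/2024"},
            "contract_number": {"contract_number": "BM-SK/18/04"},
            "seller_details": {"company": "SK GLOBAL CHEMICAL CO., LTD"},
        }},
    }],
}
print(summarize(sample))
```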

What This Shows:

  • Key contract fields (number, date, parties) are captured as named fields.
  • Clauses, payment terms, and obligations are returned as nested, structured objects.
  • The extraction is designed to preserve this level of detail even for handwritten or scanned contracts.
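
One practical way to gain confidence in an extraction is to cross-check fields against each other. The sketch below verifies that quantity times unit price equals the total price in the commodity_details block of the example. The helper names parse_amount and totals_match are hypothetical, and the parsing assumes amounts start with a comma-grouped whole number, as they do in this sample.

```python
import re

def parse_amount(text):
    """Return the leading number in strings such as '3,000,000 USD' as an int."""
    match = re.match(r"\s*(\d[\d,]*)", text)
    if match is None:
        raise ValueError(f"no leading number in {text!r}")
    return int(match.group(1).replace(",", ""))

def totals_match(commodity):
    """Check that quantity x unit price equals the extracted total price."""
    unit_price = parse_amount(commodity["Price"])
    total = parse_amount(commodity["Total price"])
    return commodity["Quantity"] * unit_price == total

# commodity_details as returned in the example response.
commodity = {
    "Commodity being sold": "High-density Polyethylene (HDPE)",
    "Price": "1,000 USD/TON",
    "Quantity": 3000,
    "Total price": "3,000,000 USD",
    "Unit": "TON",
}
print(totals_match(commodity))
```

A mismatch here would not say which field is wrong, only that the document deserves a manual look.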

Demo of Unstract for processing unstructured documents

In this webinar, we’ll dive deep into all the features of Unstract and guide you through setting up an ETL pipeline tailored for unstructured data extraction.


Conclusion: Modernizing Contract Data Extraction with Unstract and LLMWhisperer

In today’s fast-paced business world, contract processing is a critical yet challenging task. Organizations deal with high volumes of unstructured contracts, requiring precise data extraction for compliance, negotiation, and financial management. Manual contract data extraction is slow, error-prone, and expensive, leading to compliance risks and operational inefficiencies.

Unstract’s AI-powered Contract OCR, coupled with LLMWhisperer, revolutionizes contract data extraction by enabling automated contract processing with near 100% accuracy. Whether it’s scanned agreements, handwritten documents, or structured digital contracts, Unstract ensures that key contract clauses, vendor details, obligations, and payment terms are extracted efficiently—ready for integration into CRM, ERP, and contract management systems.

Key Benefits of Using Unstract & LLMWhisperer for Contract OCR and Contract Data Extraction

Faster, Automated Contract Processing

Traditional contract management requires manual reviews, clause identification, and data entry. With AI-driven contract processing, Unstract processes hundreds of contracts in minutes, significantly reducing turnaround times.

Reduction in Manual Errors

Human errors in contract extraction can lead to legal disputes, financial losses, and compliance risks. By leveraging automated contract data extraction, Unstract reduces the risk of misinterpretation and minimizes data-entry errors.

Seamless Integration with Contract Management Systems

Extracted contract data is available in structured JSON format, allowing easy integration with contract lifecycle management (CLM) software, ERP, CRM, and compliance platforms. This enables organizations to automate contract approvals, payment processing, and renewal tracking.
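
Many downstream systems expect flat rows rather than nested JSON. As an illustration of one possible integration step, the sketch below flattens the nested output into dotted keys suitable for a CSV row or a database record. The flatten helper and the dotted-key naming scheme are assumptions for this example, not part of Unstract.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single dict with dotted keys."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else str(key)))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value  # leaf value: string, number, bool, or None
    return flat

# Trimmed copy of the "output" object from the example response.
output = {
    "contract_number": {"contract_number": "BM-SK/18/04"},
    "seller_details": {
        "company": "SK GLOBAL CHEMICAL CO., LTD",
        "represented": {"name": "Mr. Gang In-gu", "position": "Sales manager"},
    },
    "documents": {"documents": ["Commercial invoice in triplicate",
                                "Certificate of quality"]},
}
row = flatten(output)
for key, val in row.items():
    print(f"{key} = {val}")
```

Note that empty lists and empty objects produce no keys in this sketch, so a real integration may want to handle them explicitly.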

Cost-Effective & Scalable Solution

By automating contract extraction, organizations reduce the need for manual processing teams, saving on labor costs while improving efficiency. Scalability allows businesses to process thousands of contracts effortlessly, adapting to high-volume workflows.

Enhanced Compliance & Audit Readiness

Contracts often include confidential clauses, NDAs, and regulatory terms. Unstract’s contract extraction software ensures that all relevant compliance clauses are accurately captured, enabling easy regulatory reporting and audit tracking.

Say goodbye to manual contract data entry and unlock efficiency, accuracy, and compliance with an AI-driven contract extraction solution!

While you’re here, be sure to explore Unstract, our open-source, no-code LLM platform to launch APIs and ETL Pipelines to structure unstructured documents.



Even better, schedule a call with us. We’ll help you understand how Unstract uses AI to automate document processing and how it differs from traditional OCR and RPA solutions.