Automating Bank Statement Extraction: A Guide to AI-Driven Statement Processing


TL;DR

This article examines the challenges of processing bank statements and shows how Large Language Models (LLMs) enable new ways of parsing bank documents.

If you wish to skip directly to the solution, go to the section “Steps to Extract Data from Bank Statements Using Unstract”, which shows how Unstract uses AI to extract data from various types of bank statements.

Introduction: What is Bank Statement Processing/Extraction?

Bank statement processing, also referred to as bank statement extraction, is a fundamental process in various financial and business workflows. It involves converting detailed transaction data from bank statements into structured formats that can be easily analyzed and used for decision-making.

A bank statement is a comprehensive document issued by financial institutions, which provides an account of all the transactions—deposits, withdrawals, payments, and transfers—carried out within a specific period. This document offers insights into the financial behavior of individuals or businesses and is critical for assessing financial health, cash flow, and creditworthiness.

In essence, bank statement extraction refers to the process of capturing and organizing the financial data found in these statements. This process can be manual, which is slow and prone to errors, or automated, which allows businesses to handle large volumes of data quickly and accurately. The latter has become increasingly popular, especially in sectors like lending, auditing, and financial analysis, where high volumes of data need to be processed efficiently.

The Importance of Bank Statement Extraction in Business Processes

Extracting data from bank statements plays an integral role in various industries. For instance, mortgage providers, private lenders, and financial institutions rely heavily on this data to assess a borrower’s financial standing, while tax services and auditors use it to verify income, expenses, and overall financial activity. Bank statement extraction allows businesses to harness critical information without the tedious process of manually going through each transaction.

The significance of bank statement processing extends far beyond basic record-keeping. It enables financial institutions and other businesses to automate their workflows, improve decision-making, and enhance risk management. For instance, a bank processing a loan application will use the borrower’s bank statements to assess their ability to repay the loan. The extracted data provides insights into monthly income, spending habits, and financial reliability, all of which are crucial for determining creditworthiness.

Who Uses Bank Statement Data?

Various industries depend on bank statement extraction to validate claims, assess creditworthiness, and streamline operations. Some of the primary users of bank statement data include:

  • Mortgage Providers: Analyze bank statements to evaluate an applicant’s financial history and ability to manage long-term debt.
  • Government Agencies: Verify income, process subsidy applications, and ensure financial compliance.
  • Banks: Assess loan eligibility, approve credit cards, and manage accounts.
  • Private Lenders: Determine the financial health of borrowers before approving loans.
  • Tax Services and Auditors: Verify income and expenses during tax filing and audits.
  • Insurance Providers: Validate claims, especially those involving large sums of money, such as life insurance payouts or accident claims.

These sectors use bank statement extraction as a means to enhance accuracy in their workflows, minimize manual processing errors, and ensure regulatory compliance.

Common Use Cases and Industries That Rely on Automated Bank Statement Processing

The use cases for bank statement extraction span a wide range of industries and business needs:

  • Mortgage Applications and Refinancing: Mortgage providers require comprehensive insights into a borrower’s financial habits and stability. Extracting and analyzing bank statements enables a clear understanding of a borrower’s income, liabilities, and spending habits, essential for assessing loan eligibility.
  • Small Business Loans: For small businesses, providing bank statements is crucial for securing financing. Lenders extract financial data to verify cash flow, assess risk, and evaluate creditworthiness.
  • Visa Applications: Many countries require applicants to submit bank statements to prove financial stability. Automating the extraction of these statements ensures a faster and more accurate assessment.
  • Credit Card Applications: Financial institutions use bank statement processing to evaluate the financial behavior of potential credit cardholders, ensuring that they have the necessary income and creditworthiness to manage credit responsibly.
  • Tax Filing and Auditing: Tax professionals and auditors use bank statement parsing to verify income, expenses, and other financial activities during tax preparation and audits, helping to ensure compliance with tax laws.
  • Insurance Claims: In the insurance industry, bank statement extraction is critical for validating claims, particularly for large payouts. By automating the extraction process, insurance providers can quickly and accurately assess a claimant’s financial transactions, reducing the risk of fraud.

Automated bank statement extraction simplifies these processes by converting unstructured data into actionable insights, reducing errors, and improving operational efficiency.

Why Is Bank Statement Processing a Critical Business Process That Needs Automation?

In a fast-paced digital world, manual processing of bank statements is no longer viable for businesses that handle high volumes of transactions. The labor-intensive process of manually sifting through pages of transactions, categorizing each entry, and inputting data into systems is not only time-consuming but also prone to human error. Automating bank statement extraction addresses these challenges by streamlining the process and enhancing accuracy. Here’s why automating this critical business process is essential:

Time-Consuming and Error-Prone Nature of Manual Data Extraction

Manually extracting and analyzing data from bank statements is a laborious task. It requires financial professionals or data entry operators to meticulously comb through each line of a statement, categorize the transactions, and enter the data into their respective systems. Given the repetitive nature of this task, the risk of mistakes—such as misentered amounts or overlooked transactions—is high. These errors can have significant consequences, especially in the financial sector, where even small inaccuracies can lead to incorrect assessments of creditworthiness, loan eligibility, or financial health.

Benefits of Automating Bank Statement Processing

  1. Improved Accuracy and Consistency: Automated systems perform bank statement extraction with a high degree of precision. They reduce the inconsistencies that arise from human error, such as duplicated data, incorrect entries, or missed transactions. For industries like banking and lending, where financial data is critical to decision-making, automation provides much-needed reliability.
  2. Faster Data Availability and Analysis: Automation dramatically speeds up the process of extracting data from bank statements. What once took hours or even days can now be done in a matter of minutes. This rapid availability of data enables businesses to make quicker, more informed decisions, particularly in time-sensitive situations like loan approvals, credit evaluations, or auditing processes.
  3. Enhanced Compliance and Audit Readiness: Financial institutions are subject to stringent regulatory requirements, including maintaining accurate financial records and being audit-ready at all times. Automation ensures that the bank statement extraction process is not only fast and accurate but also fully auditable. It creates a digital trail that can be easily reviewed during compliance checks, helping businesses stay on the right side of the law.
  4. Integration with Financial Software and CRM Systems: One of the key benefits of automation is its ability to integrate seamlessly with other business systems, such as financial management tools, CRMs, or enterprise resource planning (ERP) software. By automating the extraction of bank statement data, businesses can ensure that their financial systems are always up-to-date, allowing for real-time analysis, reporting, and decision-making.

Challenges in Bank Statement Processing (Automation)

While automation solves many of the problems associated with manual bank statement processing, there are still some inherent challenges, particularly due to the complexity and diversity of bank statement formats, unstructured data, and regional variations. To ensure successful automation, businesses must address these challenges head-on.

Variety of Formats and Structures in Bank Statements

One of the main challenges in automating bank statement extraction is the variety of formats used by different financial institutions. Each bank has its own unique layout, which can include different columns, transaction descriptions, and ways of presenting financial information. Some bank statements follow a simple format, while others are more complex, featuring multiple columns for transaction details, fees, and taxes.

Additionally, businesses may receive bank statements in different file formats, such as PDFs, scanned documents, or images, which adds another layer of complexity to the extraction process. Automation systems need to be flexible enough to handle this diversity without sacrificing accuracy.
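As a rough illustration of handling mixed file formats, the sketch below classifies an incoming file from its leading bytes instead of trusting its extension. The names `StatementKind`, `sniff_statement_kind`, and `needs_ocr` are hypothetical and exist only for this example; the byte signatures themselves are the standard ones for PDF, PNG, and JPEG files.

```python
from enum import Enum


class StatementKind(Enum):
    PDF = "pdf"
    PNG = "png"
    JPEG = "jpeg"
    UNKNOWN = "unknown"


# Standard leading byte signatures ("magic numbers") for each file type.
_SIGNATURES = (
    (b"%PDF-", StatementKind.PDF),
    (b"\x89PNG\r\n\x1a\n", StatementKind.PNG),
    (b"\xff\xd8\xff", StatementKind.JPEG),
)


def sniff_statement_kind(header: bytes) -> StatementKind:
    """Classify a file from its first bytes rather than its extension."""
    for signature, kind in _SIGNATURES:
        if header.startswith(signature):
            return kind
    return StatementKind.UNKNOWN


def needs_ocr(kind: StatementKind) -> bool:
    """Image files always need OCR; a PDF may or may not carry a text layer,
    so this sketch leaves that decision to a later step."""
    return kind in (StatementKind.PNG, StatementKind.JPEG)
```

Reading only the first few bytes keeps the check cheap, and a PDF still needs a second look later because scanned PDFs contain images rather than text.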

Handling Unstructured Data Like Transactions, Summaries, and Balances

Bank statements often contain unstructured data, especially in the transaction descriptions, which vary widely from one transaction to another. For example, a transaction might be listed as “Coffee shop” in one case and “Merchant XYZ” in another, even though both are payments for the same type of expense. Automated bank statement extraction systems must be equipped with the ability to interpret and categorize these descriptions accurately.
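To make the difficulty concrete, here is a minimal sketch of a keyword-based categorizer, the kind of rule-based baseline that struggles with free-form descriptions. The categories and keywords are made up for illustration, and `normalize` and `categorize` are hypothetical helper names.

```python
import re

# Hypothetical keyword rules; a real deployment would need far more of them.
CATEGORY_KEYWORDS = {
    "food_and_drink": ("coffee", "cafe", "restaurant"),
    "utilities": ("electric", "water", "gas bill"),
}


def normalize(description: str) -> str:
    """Lower-case the text and collapse punctuation and repeated spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", description.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def categorize(description: str) -> str:
    """Return the first category whose keyword appears in the description."""
    text = normalize(description)
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "uncategorized"
```

With these rules, `categorize("COFFEE SHOP #12")` is recognized, while `categorize("Merchant XYZ")` falls through to `"uncategorized"` because nothing in the text reveals the type of expense. Closing that gap is where contextual models are expected to help.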

Summaries and balances, such as monthly or year-end summaries, also vary in their presentation. Some bank statements include these details prominently, while others may hide them in small print or use different formatting. Automated systems must be able to identify and extract these critical pieces of information reliably, ensuring that the resulting data is both accurate and useful.

Complexity in Extracting Data from Scanned or Low-Quality Documents

Many bank statements are provided as scanned documents, especially when they come from older or smaller financial institutions. The quality of these scans can vary, with some being poorly aligned, blurry, or faded over time. This poses a challenge for bank statement OCR (Optical Character Recognition) systems, which must convert these scanned images into structured, machine-readable text.

Even advanced bank statement OCR solutions may struggle with low-quality documents, leading to errors in data extraction. These inaccuracies can result in incomplete or incorrect data being passed on for analysis, so it is worth investing in AI-powered OCR tools that handle such documents more reliably.

Managing Different Currencies, Languages, and Regional Variations

As businesses operate on a global scale, they often deal with bank statements from different countries, each with its own currency, language, and financial reporting standards. For example, a company based in the United States might receive bank statements in USD, EUR, or GBP, depending on where their clients or subsidiaries are located. Additionally, these statements may be issued in different languages, adding another layer of complexity to the bank statement extraction process.

Automated systems need to account for these variations, ensuring that the extracted data is correctly converted and categorized. This includes converting currencies, translating transaction descriptions, and ensuring compliance with local financial regulations. Advanced AI and machine learning models are essential for handling these regional differences and ensuring that the extracted data is accurate and usable across borders.
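One small but common regional difference is number formatting: “1,234.56” in one country is written “1.234,56” in another. The sketch below parses an amount into a `Decimal` once the decimal separator for the statement’s locale is known. The function name `parse_amount` and the rule that a leading minus sign or surrounding parentheses mean a negative amount are assumptions made for this example.

```python
from decimal import Decimal, InvalidOperation

_DIGITS = "0123456789"


def parse_amount(raw: str, decimal_sep: str = ".") -> Decimal:
    """Parse a printed amount, given the locale's decimal separator.

    Currency symbols, spaces, and thousands separators are discarded.
    """
    if decimal_sep not in (".", ","):
        raise ValueError("decimal_sep must be '.' or ','")
    text = raw.strip()
    # Assumption: "-12.00" and "(12.00)" both denote a negative amount.
    negative = text.startswith("-") or (text.startswith("(") and text.endswith(")"))
    kept = "".join(ch for ch in text if ch in _DIGITS or ch == decimal_sep)
    try:
        value = Decimal(kept.replace(decimal_sep, "."))
    except InvalidOperation as exc:
        raise ValueError(f"not a monetary amount: {raw!r}") from exc
    return -value if negative else value
```

Passing the separator explicitly avoids guessing: a string such as “1.234” is ambiguous on its own, so the locale has to come from the statement’s context. `Decimal` is used instead of `float` so that cents are not lost to binary rounding.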

In conclusion, automating bank statement extraction is not only a necessity but also a strategic advantage for businesses and financial institutions. By addressing the challenges of diverse formats, unstructured data, and regional differences, companies can streamline their financial workflows, enhance decision-making, and improve overall operational efficiency.


The Advent of AI and How LLMs Help in Processing Unstructured Documents

AI and Large Language Models (LLMs) have revolutionized the way businesses process unstructured documents, such as bank statements, contracts, invoices, and more. Traditional data extraction methods, often reliant on templates or rule-based systems, have limitations in handling the complexity and variability of unstructured data.

AI-driven solutions bring significant advancements in accuracy, speed, and flexibility, enabling businesses to automate these tasks more effectively.

Introduction to LLMs in Document Processing

  • AI plays a crucial role in extracting structured data from complex, unstructured documents. It uses machine learning models to analyze and interpret document content, transforming it into usable data.
  • LLMs, such as OpenAI’s GPT models, have advanced the way unstructured text is processed. These models are trained on massive amounts of data and can understand and generate human-like text, making them well suited to handling diverse document formats and extracting information accurately.
  • AI-powered models can go beyond basic OCR (Optical Character Recognition) and extract context-rich information from documents, understanding relationships between different elements like transactions, summaries, and headers in a bank statement.

How AI Enhances the Accuracy and Efficiency of Data Extraction

  • Automated learning and adaptability: AI models learn and adapt over time, improving their accuracy in extracting data from different document types. This reduces the need for manual intervention, allowing businesses to handle larger volumes of documents efficiently.
  • Improved precision: AI can identify key data points (such as transaction amounts, dates, and balances) with higher precision, especially when handling complex layouts like multi-column documents or mixed formats.
  • Time efficiency: AI processes documents much faster than manual methods, reducing turnaround times for critical business processes like loan approvals, risk assessments, and audits.

Role of LLMs in Understanding and Structuring Unstructured Data

  • Natural language understanding: LLMs can comprehend the context and meaning of the text, allowing them to extract key information, even from loosely structured or varied documents.
  • Contextual data extraction: LLMs can understand not just individual words but the relationships between different parts of a document. For example, they can connect a transaction entry with its associated balance or categorize expenses correctly.
  • Flexibility: Unlike traditional rule-based systems, LLMs do not require rigid templates. They can handle various document layouts and formats, which is essential for extracting data from different types of bank statements.

AI in Processing Bank Statements

Processing bank statements manually or using template-based methods often leads to errors, missed data, or lengthy delays. By leveraging the rapid innovations in AI, financial institutions can extract and structure data from bank statements more efficiently and accurately. This automation brings immense value to businesses dealing with large volumes of statements, such as lenders, auditors, and banks.

How LLMs are Applied to Extract Data from Bank Statements

  • Handling complex layouts: Bank statements often come in different formats, with a variety of transaction types, account summaries, and annotations. AI models, especially LLMs, can seamlessly navigate these variations, identifying relevant data points and structuring them into a usable format.
  • Advanced OCR: AI models, when combined with OCR technology, significantly enhance the accuracy of text extraction from scanned or low-quality documents. This is particularly important for older or scanned bank statements that may not be in optimal condition.
  • Real-time data extraction: AI models allow for near-instant extraction and processing of data, enabling businesses to make quicker decisions regarding loans, credit assessments, or audits.

Benefits of Using AI for Bank Statement Processing

  • Flexibility to handle diverse formats and structures:
    • AI models can process various bank statement formats from different financial institutions without the need for custom templates or reprogramming.
    • LLMs adapt to different languages, currencies, and regional differences in bank statement layouts.
  • Improved OCR accuracy for scanned documents:
    • AI-powered OCR bank statement tools enhance the ability to extract accurate text from scanned or low-quality bank statements.
    • These tools can also handle handwritten notes or annotations that may appear on scanned bank statements.
  • Contextual understanding of complex financial data:
    • AI models, especially LLMs, understand the financial context of transactions, balances, and account summaries.
    • This contextual insight allows for the extraction of critical information like transaction types, overdraft details, or recurring payments, providing a more comprehensive financial picture.

By leveraging LLMs in bank statement parsing, businesses can greatly reduce manual labor, enhance accuracy, and increase the speed of financial workflows, ensuring better customer service and more reliable data-driven decision-making.


Introducing Unstract and How It Leverages AI in Structuring Unstructured Data

Unstract is a cutting-edge, no-code platform designed to automate the extraction of structured data from complex, unstructured documents like bank statements, invoices, contracts, and more. By leveraging AI and Large Language Models (LLMs), Unstract goes beyond traditional data extraction tools, providing businesses with a highly efficient, scalable, and flexible solution for document processing.

Brief Introduction to Unstract and Its Capabilities

Unstract is more than just a tool for basic document parsing; it’s a complete open-source document automation platform. Its core strength lies in its ability to handle a wide variety of unstructured documents, regardless of their complexity, and transform them into structured, machine-readable formats like JSON or CSV.

Some of its standout features include:

  • No-Code Environment: Unstract offers a powerful yet intuitive platform where users can set up document processing workflows without needing programming knowledge. This makes it accessible to a wide range of users, from technical teams to business operations.
  • AI and LLM-Powered: The platform uses AI and LLMs to understand the context within documents, enabling it to extract key information more accurately and with greater precision.
  • Seamless Integration: Unstract can connect with various storage systems, like Google Drive or AWS S3, send processed data to platforms like PostgreSQL or Snowflake, or expose it through APIs for downstream applications.

How Unstract Uses AI and LLMs to Extract Structured Data from Unstructured Bank Statements

Unstract uses a combination of LLMs and text extractors to parse structured data from bank statements, regardless of the document’s layout or format. Here’s how it works:

  • LLMs for Contextual Understanding: Unlike rule-based systems that rely on predefined templates, Unstract’s LLMs can interpret the context within bank statements. For instance, the models can identify and extract key elements such as transaction amounts, dates, balances, and descriptions, even when they are embedded within a complicated table or multi-column layout.
  • No Template Reliance: One of the major advantages of Unstract is that it does not rely on predefined templates. You can set up extraction for one or two bank statements, and the system will be able to handle future statements of a similar nature. This adaptability is especially useful for handling bank statements from different banks or in different formats.

Advantages of Using Unstract Over Traditional Extraction Tools

  1. Flexibility with Document Formats:
    • Traditional tools often rely on fixed templates, which require constant updates when a document’s format changes. Unstract’s AI-driven approach eliminates this need, allowing it to adapt to different formats and layouts seamlessly.
  2. High Accuracy:
    • Unstract’s combination of LLMs and AI-enhanced OCR ensures that even the most complex documents, like multi-page bank statements with various columns and transaction types, are processed with high accuracy.
  3. Faster Processing:
    • With its ability to handle large volumes of documents simultaneously, Unstract significantly reduces processing time, making it ideal for businesses that handle high volumes of bank statement extraction.
  4. End-to-End Automation:
    • From data extraction to structuring and sending to downstream systems, Unstract provides a complete solution that can be deployed as APIs or integrated with existing systems, ensuring a seamless workflow for businesses.
  5. Cost and Time Efficiency:
    • The no-code environment and AI automation mean fewer resources are required to manage the process, allowing businesses to scale without increasing operational costs.

Unstract’s AI-driven approach to document extraction makes it a powerful solution for businesses that need reliable, fast, and accurate bank statement extraction, offering significant advantages over traditional tools that rely on rigid templates or manual data entry.

Steps to Extract Data from Bank Statements Using Unstract

In this section, we’ll walk through the steps involved in extracting data from bank statements using Unstract. This process leverages the power of LLMWhisperer—a highly specialized technology integrated into Unstract that optimizes the extraction and preparation of complex documents for Large Language Models (LLMs). We will also explore how LLMWhisperer enhances document processing and how you can deploy the extraction as an API for flexible integration.

LLMWhisperer: A Deep Dive into Pre-Processing Documents for LLMs

At the heart of Unstract’s ability to handle complex documents like bank statements is LLMWhisperer, a technology designed to present documents in a way that Large Language Models can process with maximum accuracy. LLMWhisperer doesn’t just extract data; it prepares the document’s layout and structure in a way that LLMs can best understand.

Key features of LLMWhisperer include:

  • Auto Mode Switching: LLMWhisperer automatically switches between OCR (Optical Character Recognition) and text extraction modes based on the document type. This feature ensures that both scanned PDFs and text-based documents are processed correctly without any manual intervention. For example, a scanned bank statement will automatically trigger OCR mode to extract text accurately from images.
  • Layout Preserving Mode: One of the challenges in processing documents like bank statements is preserving the structure, such as columns and tables. LLMWhisperer’s layout-preserving mode ensures that important formatting details are retained in the extraction, allowing the LLM to better understand the document’s structure. This is crucial when dealing with bank statement extraction, where precise organization is needed for transactional data.
  • Auto-Compaction: Processing costs with LLMs are often influenced by the number of tokens in the input document. LLMWhisperer features auto-compaction, which removes unnecessary tokens while preserving the layout. This reduces the cost and time for processing without compromising on the document’s readability for the LLM.
  • Handling Checkboxes and Radio Buttons: Many documents, especially forms, include checkboxes or radio buttons. LLMWhisperer processes these elements correctly, ensuring that their values are extracted and presented in an LLM-friendly format. For bank statements, this feature ensures that any additional form data, such as signatures or manual selections, is accurately interpreted.
  • Pre-Processing Capabilities: LLMWhisperer allows you to apply various pre-processing techniques, such as Median Filtering and Gaussian Blur, to enhance the quality of scanned images before they are passed to the LLM for extraction. This pre-processing step is especially helpful when working with low-quality or faded documents.

With these advanced capabilities, LLMWhisperer ensures that the complex structures within bank statements are presented in a clean, understandable format to the LLMs for maximum extraction accuracy.
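To show what a pre-processing step such as median filtering does, here is a minimal pure-Python sketch. It is not LLMWhisperer’s implementation, only an illustration of the technique: each pixel is replaced by the median of its neighborhood, which removes isolated specks while keeping edges reasonably sharp. The image is assumed to be a grayscale grid given as a list of rows, and `median_filter` is a name chosen for this example.

```python
from statistics import median_low


def median_filter(image, size=3):
    """Apply a size x size median filter to a grayscale image.

    `image` is a list of equally long rows of pixel values. Border pixels
    use only the neighbours that actually exist.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be a positive odd number")
    height = len(image)
    width = len(image[0]) if height else 0
    radius = size // 2
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns an existing pixel value, never an average.
            row.append(median_low(window))
        result.append(row)
    return result
```

On a light background with a single dark speck in the middle, the filter replaces the speck with the surrounding value, which is the effect that helps OCR on noisy scans.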

Using LLMWhisperer Playground to Showcase Extraction Capabilities

Before deploying any workflows or setting up APIs, it’s essential to test your document processing with LLMWhisperer Playground. This is a free, easy-to-use platform where you can upload bank statements and test how LLMWhisperer processes and prepares them for LLM consumption.

In the Playground, you can:

  • Upload Bank Statements: Upload different formats of bank statements (scanned or digital) to see how LLMWhisperer extracts key information like transaction history, account balances, and other financial data.
  • View OCR in Action: For scanned documents, the Playground allows you to see how LLMWhisperer automatically switches to OCR mode, ensuring that even non-text PDFs are processed accurately.
  • Test Layout Preservation: Validate how LLMWhisperer retains the original structure of your document, including tables, multiple columns, and other layout elements crucial for accurate bank statement extraction.

Step 1: Setting up Unstract

Before beginning the bank statement extraction process, you need to configure the essential components within Unstract. This involves setting up OpenAI for LLMs, embeddings for document structure, a vector database for efficient retrieval, and LLMWhisperer for pre-processing the document data.

  1. LLM Service:
    • Use OpenAI to process the raw text and structure data fields extracted from the bank statements.
  2. Embedding Model:
    • Leverage OpenAI’s embedding service to organize and structure data within the document.
  3. Vector Database:
    • Use the Postgres Free Trial VectorDB provided by Unstract to store and retrieve extracted data from the bank statements efficiently.
  4. Text Extractor:
    • Configure LLMWhisperer for extracting text from the bank statements, especially for scanned PDFs, using OCR capabilities.

Step 2: Setting Up the OpenAI LLM in Unstract

  1. Sign in to the Unstract Platform.
  2. Navigate to Settings > LLMs.
  3. Click on New LLM Profile.
  4. Select OpenAI from the list and enter your API key.


Step 3: Setting Up the OpenAI Embedding Model

  1. In Unstract, go to Settings > Embedding.
  2. Click on New Embedding Profile.
  3. Select OpenAI and enter the required details, including the API key.


Step 4: Setting Up the Postgres Free Trial Vector Database

  1. Go to Settings > Vector DBs in Unstract.
  2. Click on New Vector DB Profile.
  3. Select Postgres Free Trial VectorDB from the list.
  4. Enter a name for the connector and set the URL and API key as needed.

Step 5: Set Up LLMWhisperer Text Extractor

  1. Navigate to Settings > Text Extractor in Unstract.
  2. Click on New Text Extractor.
  3. Select LLMWhisperer and enter the API key.
  4. Set the processing mode to OCR (for scanned bank statements) and the output mode to line-printer.

Setting Up and Deploying the Unstract Workflow

Now that you’ve configured the necessary components, you can proceed to extract data from bank statements by setting up a Prompt Studio Project and deploying it as an API.

Step 1: Setting Up the Prompt Studio Project

  1. Access Prompt Studio:
    • Click on Prompt Studio from the left pane, which will open the Prompt Studio user interface.
  2. Create a New Project:
    • Click on the New Project button at the top right corner of the Prompt Studio UI.
    • Fill in the details:
      • Tool Name: Provide a name for your tool (e.g., “Bank Statement Parser”).
      • Author/Org Name: Enter your name or the organization’s name.
      • Description: Add a brief description of the tool’s functionality.
      • Tool Icon (Optional): You can upload an icon for your tool if you have one.
    • Click Save to create the project.
  3. Upload Bank Statements:
    • Click on Manage Documents within the project interface.
    • Upload the sample bank statements you wish to process by clicking on the Upload Files button. For example, upload two sample bank statements to test.
  4. Add Prompts:
    • Click on the Add Prompts button to create extraction prompts.
    • Write prompts based on the key information you want to extract. For example:
      • Field Name: Enter names such as total_credit, total_debit, closing_balance.
      • Prompt Text: Examples include “What is the total credit?”, “What is the total debit?”, “What is the closing balance?”
    • Continue adding prompts for other relevant information such as transaction history, account balances, and dates.
  5. Run the Prompts:
    • After setting up the prompts, click on the Run button to execute them.
    • The extracted information from the bank statements will appear in the Output section.
  6. View Combined JSON Output:
    • If you want to see the JSON format of all your responses, switch to the Combined Output tab.
    • This tab will display the structured JSON output for all the bank statements processed by your tool.
  7. Export the Tool:
    • Once you have completed the extraction setup, click on Export as Tool to finalize and export your tool for use in workflows.
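Once the prompts return values, a quick sanity check on the extracted fields can catch extraction mistakes early. The sketch below uses the field names and prompt texts from the walkthrough and adds one hypothetical field, `opening_balance`, so that the totals can be cross-checked. The sample JSON is made up, and the flat key-value shape is an assumption about how the combined output might look, not a documented Unstract format.

```python
import json
from decimal import Decimal

# Field names and prompts from the walkthrough. "opening_balance" is a
# hypothetical extra field, added so the totals can be reconciled.
PROMPTS = {
    "opening_balance": "What is the opening balance?",
    "total_credit": "What is the total credit?",
    "total_debit": "What is the total debit?",
    "closing_balance": "What is the closing balance?",
}


def missing_fields(extracted: dict) -> list:
    """List the prompt fields that came back empty or are absent."""
    return [name for name in PROMPTS if extracted.get(name) in (None, "")]


def reconciles(extracted: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Check that opening + credits - debits matches the closing balance."""
    values = {name: Decimal(str(extracted[name])) for name in PROMPTS}
    expected = (
        values["opening_balance"] + values["total_credit"] - values["total_debit"]
    )
    return abs(expected - values["closing_balance"]) <= tolerance


# Made-up sample in an assumed flat JSON shape.
sample = json.loads(
    '{"opening_balance": "1000.00", "total_credit": "2500.50",'
    ' "total_debit": "1200.25", "closing_balance": "2300.25"}'
)
```

For the sample, `missing_fields(sample)` is empty and `reconciles(sample)` holds, since 1000.00 + 2500.50 - 1200.25 = 2300.25. A statement that fails the check is a good candidate for manual review.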

Step 2: Deploying the Workflow as an API

Once your extraction tool is ready, the next step is to deploy it as an API to allow for integration into other systems.

  1. Navigate to the Workflow Section:
    • Click on Workflows on the left pane of Unstract.
  2. Create a New Workflow:
    • Click on the New Workflow button.
    • Enter the Name and Description of your workflow (e.g., “Bank Statement Parser API”).
    • Click Create Workflow.
  3. Configure the Workflow:
    • Drag and drop your tool (e.g., Bank Statement Parser) into the workflow area.
  4. Set API Connectors:
    • In the workflow builder, set the connectors:
      • API Input: Select the input connector to receive PDF bank statements.
      • API Output: Select the output connector to return the structured JSON data.
  5. Save the Workflow:
    • Add the Display Name, Description, and API Name for the workflow.
    • Click Save to finalize the workflow.
  6. Deploy the API:
    • Once the workflow is saved, deploy it by clicking on the Deploy API button.
    • You’ll receive a success message indicating that the API is deployed and ready to accept bank statements.

By following these steps, you can configure Unstract, set up prompts that target specific data points, extract structured data from bank statements, and deploy the solution as an API for easy integration into other systems.

Use Case: Parsing Different Bank Statements

In this section, we will walk through how to test the API created in Unstract using Postman with different bank statements. This will demonstrate how the same API setup can handle various bank statement formats without needing additional configurations.

Steps to Test the API with Different Bank Statements:

  1. Navigate to the API Deployment Section:
    • On the left pane of the Unstract interface, select the API Deployment button located at the top.
    • This will display all the details of the deployed API, including its endpoint, status, and other relevant information.
  2. Copy the API URL:
    • Find the API URL that was provided during the workflow deployment.
    • Copy this URL, as it will be used to make requests in Postman.

  3. Retrieve the API Key:
    • In Unstract, click on the three dots located next to the deployed API in the details tab.
    • From the dropdown, select Manage Keys.
    • Copy the API key displayed in the dialog box. This key will be required to authenticate requests made via Postman.
  4. Open Postman:
    • Open Postman in your browser or desktop application.
    • Create a new POST request and paste the copied API URL into the URL field.
  5. Set Authorization in Postman:
    • In Postman, navigate to the Authorization tab.
    • From the dropdown menu, select Bearer Token as the authentication type.
    • Paste the copied API key into the Token field.
  6. Prepare the Request Body:
    • Move to the Body tab in Postman.
    • Select form-data as the format.
    • In the Key field, enter files.
    • Change the value type from Text to File by selecting it from the dropdown menu.
    • Click the Value field, choose the New File from Local Machine option, and upload the bank statement PDF you want to test.
  7. Upload a Different Bank Statement:
    • To test another format, replace the file in the Value field with a different sample bank statement from your local system.
    • This could be a statement from a different bank, a scanned document, or any other layout, to showcase the flexibility of the Unstract API.
  8. Send the Request:
    • After setting up the file and request details, click the Send button located at the top right corner of Postman.
    • Initially, you will receive a response indicating the status as “executing”.
  9. Check the Status of the Request:
    • Within the response, you will find a status_api link.
    • Create a new GET request in Postman with this link to check the status of the original request.
    • Once the extraction process is complete, you will receive the JSON response with structured data from the uploaded bank statement.
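
The Postman steps above can also be scripted. The following is a minimal sketch in Python using only the standard library; it is not an official Unstract client. The function names are hypothetical, and the "status" field used to detect completion is an assumption, so check the actual response from your own deployment and adjust the key accordingly. The form-data key files and the Bearer token header mirror the Postman setup described above.

```python
import json
import time
import uuid
import urllib.request


def build_upload_request(api_url, api_key, pdf_bytes, filename="statement.pdf"):
    """Build (but do not send) a multipart POST that mirrors the Postman setup."""
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        # "files" is the form-data key entered in Postman's Body tab.
        f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/pdf\r\n\r\n",
        pdf_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        # Equivalent to choosing "Bearer Token" in Postman's Authorization tab.
        "Authorization": f"Bearer {api_key}",
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(api_url, data=body, headers=headers, method="POST")


def fetch_json(request):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def poll_until_done(fetch_status, status_key="status", interval_seconds=2.0, max_attempts=30):
    """Call fetch_status() until the reported status is no longer "executing".

    fetch_status is any zero-argument callable that returns the decoded status
    response. status_key is an ASSUMED field name, not confirmed by the article.
    """
    for _ in range(max_attempts):
        payload = fetch_status()
        if str(payload.get(status_key, "")).lower() != "executing":
            return payload
        time.sleep(interval_seconds)
    raise TimeoutError("extraction did not finish within the polling budget")
```

To use it, send the upload with fetch_json(build_upload_request(...)), read the status_api link from the response, and pass poll_until_done a function that issues a GET to that link with the same Bearer header. Keeping the polling logic separate from the network call makes it easy to test without a live deployment.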

Outcome:

By following these steps, you’ll be able to see that the API can handle various bank statement formats, from scanned PDFs to different structured statements. The same prompt and workflow setup can be used for any bank statement format, demonstrating Unstract’s ability to manage complex document formats seamlessly.

The JSON output will include the key data fields that were defined in the Prompt Studio setup, such as total credits, total debits, closing balance, and more. This flexibility allows financial institutions and businesses to easily integrate Unstract’s bank statement extraction capabilities into their workflows, regardless of the bank statement’s format or design.
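
If the extracted values arrive as formatted strings, it can help to normalize them before passing them to downstream systems. The following is a minimal sketch under stated assumptions: the field names total_credit, total_debit, and closing_balance are hypothetical (the real keys are whatever you named your prompts in Prompt Studio), and amounts are assumed to use a period as the decimal separator and commas as thousands separators.

```python
import re
from decimal import Decimal, InvalidOperation

# Hypothetical field names; replace with the prompt keys from your own tool.
SUMMARY_FIELDS = ("total_credit", "total_debit", "closing_balance")


def parse_amount(raw):
    """Convert a string such as "$1,234.56" or "(45.00)" to a Decimal.

    Returns None when the value is missing or cannot be parsed.
    Parentheses are treated as accounting notation for a negative amount.
    """
    if raw is None:
        return None
    text = str(raw).strip()
    negative = text.startswith("(") and text.endswith(")")
    # Drop currency symbols, thousands separators, and any other non-numeric text.
    cleaned = re.sub(r"[^0-9.\-]", "", text)
    if not cleaned:
        return None
    try:
        value = Decimal(cleaned)
    except InvalidOperation:
        return None
    return -abs(value) if negative else value


def normalize_summary(output, fields=SUMMARY_FIELDS):
    """Return a dict mapping each expected field to a Decimal or None."""
    return {name: parse_amount(output.get(name)) for name in fields}
```

Using Decimal rather than float avoids rounding surprises with currency values, and returning None for unparseable fields lets the caller decide whether to flag the statement for manual review.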

Benefits of Using Unstract for Bank Statement Processing

Unstract offers a powerful solution for automating bank statement processing. Here are the key benefits that make Unstract an excellent tool for this task:

  • Single Setup for Various Document Formats and Designs: Unstract is designed to handle multiple document formats and layouts without the need for creating new templates or retraining models. Whether you’re dealing with different bank statement formats or scanned documents, Unstract can seamlessly extract data from all of them.
  • No Need for Retraining or Re-Templating: Traditional data extraction tools often require constant updates to templates or models when document formats change. With Unstract’s AI-driven approach, the system adapts to these changes without needing to retrain models or create new templates, reducing both cost and manual effort.
  • High Accuracy and Flexibility in Data Extraction: Unstract leverages advanced AI techniques and LLMWhisperer to enhance the accuracy of the extraction process. Even with complex layouts, such as tables, summaries, or multi-page documents, Unstract ensures reliable data extraction.
  • Seamless Integration with Financial Systems and APIs: The extracted data can be easily integrated with financial systems, such as accounting platforms, CRMs, and other applications through APIs. This flexibility simplifies the workflow for businesses looking to streamline their financial data processing.
  • Improved OCR Accuracy: Unstract’s AI technology, combined with LLMWhisperer, improves Optical Character Recognition (OCR) accuracy for scanned bank statements, including poor-quality scans and documents with handwritten notes. This is crucial for businesses handling large volumes of scanned or non-standardized documents.
  • Scalability: Once you deploy your workflow, Unstract’s API can process large volumes of documents across various platforms. This is especially useful for financial institutions dealing with high volumes of statements or needing to process documents in real time.
  • Security and Compliance: Unstract prioritizes the security of sensitive financial data, ensuring that customer information is processed securely and in compliance with industry standards.


Conclusion

Unstract offers a robust solution for automating bank statement processing, providing businesses with a streamlined, efficient, and highly accurate method for extracting data. Its ability to handle different document formats without requiring retraining or re-templating saves time, resources, and effort. Additionally, the flexibility of integrating the extracted data with various financial systems ensures that businesses can seamlessly adopt Unstract into their existing workflows.

With an API deployment, you can expose an API to which you send a PDF or an image and get back structured data in JSON format. With an ETL deployment, you can place files in Google Drive, an Amazon S3 bucket, or one of a variety of other sources, and the platform will run the extractions and automatically store the extracted data in a database or a warehouse such as Snowflake. Unstract is open-source software and is available at https://github.com/Zipstack/unstract.

If you want to try it out quickly, sign up for our free trial.