layoutlmv3 finetuned invoice

Theivaprakasham

Introduction

This repository contains a fine-tuned version of Microsoft's LayoutLMv3 model, adapted for token classification on invoice data. The model extracts fields such as Biller Name, Address, Invoice Date, and Total from invoices. On the evaluation set it reports precision, recall, F1, and accuracy of 1.0.

Architecture

The model is based on LayoutLMv3, a layout-aware transformer for document understanding that takes the text, the bounding-box position of each word, and the page image as input. This version has been fine-tuned for invoice data.

Training

The model was trained with the following hyperparameters:

  • Learning Rate: 1e-05
  • Train Batch Size: 2
  • Eval Batch Size: 2
  • Seed: 42
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • Learning Rate Scheduler Type: Linear
  • Training Steps: 2000
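The linear scheduler listed above lowers the learning rate from its initial value toward zero over the training run. The following is a minimal sketch of that decay, assuming no warmup steps (the source does not state a warmup value). The helper name `linear_lr` is hypothetical and is not part of the repository.

```python
def linear_lr(step: int, base_lr: float = 1e-5, total_steps: int = 2000,
              warmup_steps: int = 0) -> float:
    """Learning rate at `step` under linear warmup followed by linear decay.

    Defaults mirror the listed hyperparameters; warmup_steps=0 is an assumption.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly to 0 at total_steps, and stay at 0 afterwards.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 500, 1000, 1500, 2000):
    print(f"step {step:4d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the rate is halved by step 1000 and reaches zero at step 2000.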

Training reached a final loss of 0.0012, with precision, recall, F1, and accuracy of 1.0 on the evaluation set.
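To make the reported metrics concrete, the sketch below computes precision, recall, F1, and accuracy at the token level. It is a simplified illustration, not the evaluation code used for this model, which may score whole entities rather than individual tokens. The function name `token_metrics` and the label names are hypothetical.

```python
def token_metrics(y_true, y_pred, outside="O"):
    """Token-level precision, recall, and F1 over field labels, plus accuracy.

    Tokens labelled `outside` belong to no field and are excluded from
    precision and recall, but still count toward accuracy.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty label sequences of equal length")
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(t == p and p != outside for t, p in pairs)
    predicted = sum(p != outside for _, p in pairs)
    actual = sum(t != outside for t, _ in pairs)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


# Hypothetical labels for a four-token invoice snippet.
gold = ["B-BILLER", "I-BILLER", "O", "B-TOTAL"]
print(token_metrics(gold, gold))                               # every token correct
print(token_metrics(gold, ["B-BILLER", "O", "O", "B-TOTAL"]))  # one field token missed
```

A score of 1.0 on all four metrics means every token in the evaluation set received its gold label; a single missed field token already lowers recall and F1.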

Guide: Running Locally

  1. Clone the Repository:

    git clone https://github.com/Theivaprakasham/layoutlmv3
    cd layoutlmv3
    
  2. Install Dependencies:
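    Ensure you have Python installed. The listed transformers version, 4.20.0.dev0, is a development build that is not available from PyPI, so the commands below install the stable 4.20.0 release instead. The +cu113 PyTorch build is served from the PyTorch wheel index rather than PyPI:

    pip install transformers==4.20.0 datasets==2.2.2 tokenizers==0.12.1
    pip install torch==1.11.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113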
    
  3. Run the Model:
    Use the provided scripts to load and test the model on your data.

  4. Cloud GPUs:
    For better performance, consider using cloud services such as AWS, Google Cloud, or Azure, which provide powerful GPU instances suitable for model training and inference.
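As an illustration of step 3, the sketch below loads the model with the transformers library and predicts one label per word. Several parts are assumptions: the Hub identifier `Theivaprakasham/layoutlmv3-finetuned-invoice`, the presence of processor files under that identifier, and the helper names `normalize_box` and `predict_labels`, none of which come from the repository's own scripts. It also assumes that words and pixel-space boxes have already been obtained from an OCR step.

```python
def normalize_box(box, width, height):
    """Scale a pixel-space (x0, y0, x1, y1) box to the 0-1000 grid LayoutLM models expect."""
    x0, y0, x1, y1 = box
    return [
        int(1000 * x0 / width),
        int(1000 * y0 / height),
        int(1000 * x1 / width),
        int(1000 * y1 / height),
    ]


def predict_labels(image_path, words, boxes,
                   model_id="Theivaprakasham/layoutlmv3-finetuned-invoice"):
    """Return one predicted label per input word (model_id is an assumed Hub name)."""
    # Heavy imports stay local so the box helper can be used without them.
    import torch
    from PIL import Image
    from transformers import AutoModelForTokenClassification, AutoProcessor

    image = Image.open(image_path).convert("RGB")
    width, height = image.size
    norm_boxes = [normalize_box(b, width, height) for b in boxes]

    # apply_ocr=False because words and boxes are supplied by the caller.
    processor = AutoProcessor.from_pretrained(model_id, apply_ocr=False)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    encoding = processor(image, words, boxes=norm_boxes,
                         truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits
    pred_ids = logits.argmax(-1).squeeze(0).tolist()

    # Keep the prediction of the first sub-token of each word.
    labels, seen = [], set()
    for token_idx, word_idx in enumerate(encoding.word_ids(0)):
        if word_idx is not None and word_idx not in seen:
            seen.add(word_idx)
            labels.append(model.config.id2label[pred_ids[token_idx]])
    return labels
```

A call such as `predict_labels("invoice.png", words, boxes)` would return labels aligned with `words`; the file name here is a placeholder. If the fine-tuned checkpoint does not ship processor files, loading the processor from the base LayoutLMv3 checkpoint is a possible fallback.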

License

The repository does not specify a license. It is advisable to contact the author or check the repository for any updates regarding licensing.
