lmv2-g-passport-197-doc-09-13

Sebabrata

Introduction

The LMV2-G-PASSPORT-197-DOC-09-13 model is a fine-tuned version of microsoft/layoutlmv2-base-uncased for token classification, trained on a passport-related dataset. It achieves high precision, recall, and F1 scores across fields such as Country Code, Date Of Birth, and Passport Number.

Architecture

The model is based on the LayoutLMv2 architecture, a multimodal Transformer for document understanding that combines a BERT-style text encoder with 2-D layout and visual embeddings. Because it leverages the spatial layout of text within a document, it is particularly well suited to tasks where a token's position on the page carries meaning, such as extracting passport fields.
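
As a concrete illustration of the layout signal, each token is paired with a bounding box normalized to a 0-1000 coordinate grid relative to the page dimensions. The helper below is a minimal sketch of that normalization; the box coordinates and page size are hypothetical.

    def normalize_box(box, width, height):
        # Scale pixel coordinates (x0, y0, x1, y1) to LayoutLMv2's 0-1000 grid.
        x0, y0, x1, y1 = box
        return [
            int(1000 * x0 / width),
            int(1000 * y0 / height),
            int(1000 * x1 / width),
            int(1000 * y1 / height),
        ]

    # Hypothetical word box on a 600x800-pixel passport scan.
    print(normalize_box((50, 120, 210, 150), width=600, height=800))
    # -> [83, 150, 350, 187]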

Training

The model was trained using the following hyperparameters:

  • Learning Rate: 4e-05
  • Train Batch Size: 1
  • Eval Batch Size: 1
  • Seed: 42
  • Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
  • Learning Rate Scheduler: Constant
  • Number of Epochs: 30

Training was performed using PyTorch with the Transformers library, and the performance metrics indicate high accuracy and precision for the targeted document fields.
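
For reference, the listed hyperparameters map onto transformers.TrainingArguments roughly as in the sketch below; the output directory is a placeholder, and the dataset and Trainer wiring are omitted.

    from transformers import TrainingArguments

    # Hypothetical mapping of the reported hyperparameters; "outputs" is a placeholder path.
    training_args = TrainingArguments(
        output_dir="outputs",
        learning_rate=4e-5,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        seed=42,
        adam_beta1=0.9,
        adam_beta2=0.999,
        adam_epsilon=1e-8,
        lr_scheduler_type="constant",
        num_train_epochs=30,
    )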

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python and pip installed, then install the necessary packages:

    pip install torch transformers datasets

    Note that the LayoutLMv2 implementation in Transformers additionally depends on detectron2 and torchvision for its visual backbone, and the processor's built-in OCR requires pytesseract (with Tesseract installed).
    
  2. Load the Model: Use the transformers library to load the model:

    from transformers import LayoutLMv2Tokenizer, LayoutLMv2ForTokenClassification

    # The tokenizer is shared with the base checkpoint the model was fine-tuned from.
    tokenizer = LayoutLMv2Tokenizer.from_pretrained("microsoft/layoutlmv2-base-uncased")
    # Fine-tuned token-classification head for passport fields.
    model = LayoutLMv2ForTokenClassification.from_pretrained("Sebabrata/lmv2-g-passport-197-doc-09-13")
    
  3. Prepare Data: Format your inputs as LayoutLMv2 expects: tokenized words, their bounding boxes normalized to a 0-1000 coordinate grid, and the page image. The LayoutLMv2Processor can handle this end to end, as shown in the sketch after this list.

  4. Inference: Run the model on the encoded inputs and map the predicted label IDs back to field names.
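
A minimal end-to-end sketch is shown below. It assumes detectron2, torchvision, and pytesseract are installed alongside the packages above; the image path is a placeholder, and LayoutLMv2Processor runs its built-in OCR by default.

    from PIL import Image
    import torch
    from transformers import LayoutLMv2Processor, LayoutLMv2ForTokenClassification

    # The processor bundles OCR, tokenization, bbox normalization, and image resizing.
    processor = LayoutLMv2Processor.from_pretrained("microsoft/layoutlmv2-base-uncased")
    model = LayoutLMv2ForTokenClassification.from_pretrained("Sebabrata/lmv2-g-passport-197-doc-09-13")

    image = Image.open("passport.png").convert("RGB")  # placeholder path
    encoding = processor(image, return_tensors="pt")

    with torch.no_grad():
        outputs = model(**encoding)

    # Map each token's highest-scoring class ID to its label name.
    predicted_ids = outputs.logits.argmax(-1).squeeze(0).tolist()
    print([model.config.id2label[i] for i in predicted_ids])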

For better performance, especially with larger datasets, consider using a cloud GPU service such as AWS, GCP, or Azure.

License

The model is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). This means you can share and adapt the model for non-commercial purposes, provided you give appropriate credit and distribute any derivative works under the same license.
