XLM-RoBERTa NER Japanese

tsmatz

Introduction

The xlm-roberta-ner-japanese model is a fine-tuned version of xlm-roberta-base for named entity recognition (NER) in Japanese text. It was trained on an NER dataset provided by Stockmark Inc., derived from Japanese Wikipedia articles.

Architecture

This model builds on xlm-roberta-base, a cross-lingual RoBERTa model pre-trained on text in many languages, adding a token-classification head for NER. The model assigns a label to each token, identifying entities such as persons, organizations, and locations.

Token Labels:

  • O: Others or nothing
  • PER: Person
  • ORG: General corporation organization
  • ORG-P: Political organization
  • ORG-O: Other organization
  • LOC: Location
  • INS: Institution, facility
  • PRD: Product
  • EVT: Event
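As a quick reference, the label set above can be captured as a plain mapping (the descriptions paraphrase the list; the model's authoritative id2label mapping lives in its configuration):

```python
# Entity labels used by the model, paraphrasing the list above.
NER_LABELS = {
    "O": "Others or nothing",
    "PER": "Person",
    "ORG": "General corporation organization",
    "ORG-P": "Political organization",
    "ORG-O": "Other organization",
    "LOC": "Location",
    "INS": "Institution, facility",
    "PRD": "Product",
    "EVT": "Event",
}

def describe(label: str) -> str:
    """Return a human-readable description for a predicted label."""
    return NER_LABELS.get(label, "Unknown label")

print(describe("LOC"))  # Location
```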

Training

The model was fine-tuned using specific hyperparameters:

  • Learning Rate: 5e-05
  • Train and Eval Batch Size: 12
  • Seed: 42
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Learning Rate Scheduler: Linear
  • Number of Epochs: 5
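The linear scheduler decays the learning rate from its initial value to zero over the course of training. A minimal sketch of that decay (assuming no warmup steps, which the card does not mention; the step count is hypothetical):

```python
def linear_lr(step: int, total_steps: int, initial_lr: float = 5e-05) -> float:
    """Linearly decay the learning rate from initial_lr to 0 over total_steps."""
    remaining = max(0, total_steps - step)
    return initial_lr * remaining / total_steps

total = 1000  # hypothetical number of optimizer steps across the 5 epochs
print(linear_lr(0, total))      # 5e-05 at the start
print(linear_lr(500, total))    # 2.5e-05 halfway through
print(linear_lr(total, total))  # 0.0 at the end
```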

Training Results:

  • Epoch 1: Validation Loss: 0.1510, F1 Score: 0.8457
  • Epoch 2: Validation Loss: 0.0626, F1 Score: 0.9261
  • Epoch 3: Validation Loss: 0.0366, F1 Score: 0.9580
  • Epoch 4: Validation Loss: 0.0196, F1 Score: 0.9792
  • Epoch 5: Validation Loss: 0.0173, F1 Score: 0.9864
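The F1 score reported above is the harmonic mean of precision and recall. A small sketch of the formula, with hypothetical precision/recall values (not taken from the card):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical values for illustration only.
print(round(f1_score(0.99, 0.983), 4))  # 0.9865
```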

Framework Versions:

  • Transformers: 4.23.1
  • PyTorch: 1.12.1+cu102
  • Datasets: 2.6.1
  • Tokenizers: 0.13.1

Guide: Running Locally

To run this model locally:

  1. Install Dependencies: Ensure you have transformers, torch, and datasets installed.

    pip install transformers torch datasets
    
  2. Load the Model:

    from transformers import pipeline
    
    model_name = "tsmatz/xlm-roberta-ner-japanese"
    classifier = pipeline("token-classification", model=model_name)
    # "Suzui, wearing a bell, climbed Tomuraushi in Hokkaido on a mild April day."
    result = classifier("鈴井は4月の陽気の良い日に、鈴をつけて北海道のトムラウシへと登った")
    print(result)
    
  3. Cloud GPUs: For efficient processing, consider using cloud GPU services like AWS, Google Cloud, or Azure.
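The pipeline returns one dict per detected entity span. As a sketch of post-processing, the snippet below groups entities by label; the result list is hypothetical but shaped like the pipeline's output when aggregation_strategy="simple" is passed (which yields an "entity_group" key):

```python
from collections import defaultdict

# Hypothetical pipeline output for the example sentence above.
result = [
    {"entity_group": "PER", "score": 0.99, "word": "鈴井", "start": 0, "end": 2},
    {"entity_group": "LOC", "score": 0.99, "word": "北海道", "start": 17, "end": 20},
    {"entity_group": "LOC", "score": 0.98, "word": "トムラウシ", "start": 21, "end": 26},
]

# Collect the recognized surface forms under each entity label.
by_label = defaultdict(list)
for entity in result:
    by_label[entity["entity_group"]].append(entity["word"])

print(dict(by_label))  # {'PER': ['鈴井'], 'LOC': ['北海道', 'トムラウシ']}
```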

License

The model and its components are released under the MIT License, allowing for broad usage and modification with proper attribution.
