xlm-roberta-large-masakhaner

Davlan

Introduction

xlm-roberta-large-masakhaner is a Named Entity Recognition (NER) model for 10 African languages, obtained by fine-tuning the XLM-RoBERTa large model. It identifies four entity types: dates and times (DATE), persons (PER), organizations (ORG), and locations (LOC). It was trained on data from the Masakhane MasakhaNER project.

Architecture

The model uses the XLM-RoBERTa large architecture, a multilingual transformer pretrained on text in about 100 languages. A token-classification head on top of this encoder was fine-tuned on the MasakhaNER dataset, reaching state-of-the-art performance on the MasakhaNER benchmarks at the time of release.
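
The label set exposed by the checkpoint can be inspected from its configuration; a minimal sketch (the mapping shown in the comment is indicative and should be confirmed against the downloaded config):

    from transformers import AutoConfig

    # Load only the configuration of the fine-tuned checkpoint.
    config = AutoConfig.from_pretrained("Davlan/xlm-roberta-large-masakhaner")
    print(config.num_labels)  # size of the token-classification head
    print(config.id2label)    # e.g. {0: "O", 1: "B-DATE", 2: "I-DATE", ...}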

Training

The model was fine-tuned on the aggregate of the 10 African-language NER datasets, with training conducted on a single NVIDIA V100 GPU using the hyperparameters recommended by the original MasakhaNER paper. The data follows the standard BIO tagging scheme, in which a B- prefix marks the beginning of an entity and an I- prefix its continuation, so the model can separate entities of the same type even when they appear back to back.
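
A short illustration of the scheme (tag names follow the standard MasakhaNER label set; the token sequence is a made-up example):

    # BIO tagging of back-to-back entities of the same type: the second
    # B-PER marks where a new person entity begins, so the two names are
    # not merged into a single span.
    tokens = ["Ahmed", "Musa", "Sadio", "Mane"]
    tags   = ["B-PER", "I-PER", "B-PER", "I-PER"]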

Guide: Running Locally

  1. Set Up Environment
    Ensure you have Python and PyTorch installed. Using a virtual environment is recommended.

  2. Install Transformers

    pip install transformers
    
  3. Load the Model

    from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

    # Download the tokenizer and fine-tuned NER model from the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained("Davlan/xlm-roberta-large-masakhaner")
    model = AutoModelForTokenClassification.from_pretrained("Davlan/xlm-roberta-large-masakhaner")
    # Wrap both in a token-classification (NER) pipeline.
    nlp = pipeline("ner", model=model, tokenizer=tokenizer)
    
  4. Perform Inference

    example = "Emir of Kano turban Zhang wey don spend 18 years for Nigeria"  # Nigerian Pidgin
    ner_results = nlp(example)
    print(ner_results)  # per-token predictions with BIO tags and confidence scores
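
    By default the pipeline returns one prediction per subword token. To merge B-/I- pieces into whole entity spans, the pipeline's aggregation_strategy parameter can be used; a minimal sketch (the output fields noted in the comment are indicative):

    nlp_grouped = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")
    print(nlp_grouped(example))  # each item has entity_group, score, word, start, end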
    
  5. Utilize Cloud GPUs
    For improved performance, consider cloud GPU services such as AWS EC2, Google Cloud, or Azure when training or fine-tuning models. Once a GPU is available, the pipeline can be placed on it, as sketched below.
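
    A minimal sketch, assuming a CUDA device and the model/tokenizer loaded in step 3:

    import torch

    # Run on GPU 0 when CUDA is available, otherwise fall back to CPU.
    device = 0 if torch.cuda.is_available() else -1
    nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=device)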

License

The model and its training data follow the licensing agreements as specified by the Masakhane project and Hugging Face. Users should ensure compliance with these licenses when using the model.
