Davlan/xlm-roberta-large-masakhaner

Introduction
xlm-roberta-large-masakhaner is a Named Entity Recognition (NER) model obtained by fine-tuning XLM-RoBERTa large for 10 African languages. It recognizes four entity types: dates and times (DATE), locations (LOC), organizations (ORG), and persons (PER). The model is built on datasets from the Masakhane MasakhaNER project.
Architecture
The model uses the XLM-RoBERTa large architecture, a multilingual transformer pre-trained on text from 100 languages. It is fine-tuned specifically for NER in African languages on the MasakhaNER dataset, on which it achieves state-of-the-art performance.
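To see the backbone and label set from code, the model configuration can be inspected without downloading the full weights. The snippet below is a minimal sketch; it assumes a working transformers install and network access to the Hugging Face Hub.

```python
from transformers import AutoConfig

# Fetch only the configuration of the fine-tuned checkpoint.
config = AutoConfig.from_pretrained("Davlan/xlm-roberta-large-masakhaner")

print(config.model_type)   # backbone architecture, e.g. "xlm-roberta"
print(config.num_labels)   # size of the token-classification head
print(config.id2label)     # mapping from class index to the O / B- / I- NER tags
```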
Training
The model was fine-tuned on an aggregation of 10 African-language NER datasets, with training run on a single NVIDIA V100 GPU using the hyperparameters recommended by the original MasakhaNER paper. The training data use the B-/I- (BIO) tagging scheme, so the model learns where one entity ends and the next begins, which is what allows it to separate entities that appear consecutively.
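As an illustration of that point, the B-/I- tags are what let the NER pipeline split two adjacent entities of the same type instead of merging them. The sketch below uses the standard aggregation_strategy option of the transformers pipeline; the example sentence is invented purely for illustration.

```python
from transformers import pipeline

# Group sub-word predictions into entity spans; a fresh B- tag starts a new
# span, so two names appearing back to back come out as separate PER entities.
nlp = pipeline(
    "ner",
    model="Davlan/xlm-roberta-large-masakhaner",
    aggregation_strategy="simple",
)

# Invented example: two consecutive person names followed by a location.
print(nlp("Muhammadu Buhari Bola Tinubu dey Abuja"))
```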
Guide: Running Locally
- Set Up Environment
  Ensure Python and PyTorch are installed. Use a virtual environment for best practice.

- Install Transformers

  ```
  pip install transformers
  ```

- Load the Model

  ```python
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

  tokenizer = AutoTokenizer.from_pretrained("Davlan/xlm-roberta-large-masakhaner")
  model = AutoModelForTokenClassification.from_pretrained("Davlan/xlm-roberta-large-masakhaner")
  nlp = pipeline("ner", model=model, tokenizer=tokenizer)
  ```

- Perform Inference

  ```python
  example = "Emir of Kano turban Zhang wey don spend 18 years for Nigeria"
  ner_results = nlp(example)
  print(ner_results)
  ```

- Utilize Cloud GPUs
  For improved performance, consider cloud GPU services such as AWS EC2, Google Cloud, or Azure when training or fine-tuning; a sketch of GPU inference follows this list.
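The same pipeline can be moved onto a GPU when one is available, for example on a cloud instance as suggested above. This is a minimal sketch using the pipeline's device argument, not a prescribed setup.

```python
import torch
from transformers import pipeline

# Use the first CUDA device when available (e.g., on a cloud GPU instance),
# otherwise fall back to CPU (device=-1).
device = 0 if torch.cuda.is_available() else -1

nlp = pipeline(
    "ner",
    model="Davlan/xlm-roberta-large-masakhaner",
    device=device,
)

print(nlp("Emir of Kano turban Zhang wey don spend 18 years for Nigeria"))
```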
License
The model and its training data follow the licensing agreements as specified by the Masakhane project and Hugging Face. Users should ensure compliance with these licenses when using the model.