xlm-roberta-base-finetuned-swahili-finetuned-ner-swahili
by mbeukman

Introduction
The XLM-RoBERTa-Base Finetuned Swahili model is a token classification model fine-tuned for Named Entity Recognition (NER) on Swahili. It builds on the xlm-roberta-base architecture, starting (as the model name indicates) from a checkpoint already adapted to Swahili text, and is then fine-tuned on the Swahili portion of the MasakhaNER dataset, which annotates four entity types: DATE, LOC, ORG, and PER.
Architecture
This model is based on the Transformer architecture and is a variant of XLM-RoBERTa. Fine-tuning ran for 50 epochs on the MasakhaNER dataset, a collection of annotated news articles in ten African languages. The configuration uses a maximum sequence length of 200, a batch size of 32, and a learning rate of 5e-5. Training was repeated with several random seeds, and the best-performing seed was selected for the released model.
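The hyperparameters above map naturally onto a Hugging Face TrainingArguments object. The sketch below is illustrative only: the author's exact training script is not reproduced here, and the output directory and seed values are placeholders.

```python
from transformers import TrainingArguments

# Hyperparameters taken from the model description; output_dir and seed are
# illustrative placeholders, not the author's exact values. The maximum
# sequence length of 200 is applied at tokenization time, not here.
training_args = TrainingArguments(
    output_dir="xlmr-swahili-ner",    # placeholder path
    num_train_epochs=50,
    per_device_train_batch_size=32,
    learning_rate=5e-5,
    seed=42,                          # several seeds were tried; best one kept
)
```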
Training
The model was fine-tuned on the Swahili portion of the MasakhaNER dataset. Training was conducted on an NVIDIA RTX 3090 GPU and took approximately 10 to 30 minutes per model. A batch size of 32 required about 14 GB of GPU memory, which could be reduced to roughly 6.5 GB of VRAM with a batch size of one. Performance is reported in terms of F1 score, precision, and recall.
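To reproduce the data preparation, the Swahili split of MasakhaNER can be loaded from the Hugging Face Hub. This is a hedged sketch: the dataset id "masakhaner" and config name "swa" are assumptions based on the dataset's Hub listing, label alignment for subword pieces is omitted for brevity, and depending on your datasets version this script-based dataset may require trust_remote_code=True.

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the Swahili ("swa") config of MasakhaNER; recent versions of the
# datasets library may need trust_remote_code=True for script-based datasets.
dataset = load_dataset("masakhaner", "swa")

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")

def tokenize(batch):
    # MasakhaNER provides pre-split word lists; truncate to the stated
    # maximum sequence length of 200.
    return tokenizer(
        batch["tokens"],
        is_split_into_words=True,
        truncation=True,
        max_length=200,
    )

tokenized = dataset.map(tokenize, batched=True)
print(tokenized["train"][0]["input_ids"][:10])
```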
Guide: Running Locally
- Setup Environment:
  - Install the Transformers library from Hugging Face (e.g., pip install transformers).
  - Ensure PyTorch is installed in your environment.
- Load Model and Tokenizer:

  ```python
  from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

  model_name = 'mbeukman/xlm-roberta-base-finetuned-swahili-finetuned-ner-swahili'
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForTokenClassification.from_pretrained(model_name)
  nlp = pipeline("ner", model=model, tokenizer=tokenizer)
  ```
- Run Inference:

  ```python
  example = "Wizara ya afya ya Tanzania imeripoti Jumatatu kuwa , watu takriban 14 zaidi wamepata maambukizi ya Covid - 19 ."
  ner_results = nlp(example)
  print(ner_results)
  ```
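  Because XLM-RoBERTa tokenizes text into subword pieces, the raw output contains one entry per tagged piece. As a follow-up sketch, assuming a recent Transformers version, the aggregation_strategy argument merges pieces into whole entity spans:

  ```python
  # Optional: group subword pieces into whole entities (e.g. 'Tanzania'
  # as a single LOC span instead of per-piece tags).
  nlp_grouped = pipeline(
      "ner",
      model=model,
      tokenizer=tokenizer,
      aggregation_strategy="simple",
  )
  print(nlp_grouped(example))
  ```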
- Cloud GPUs: For faster training and inference, consider using cloud GPU services such as AWS EC2, Google Cloud, or Azure.
License
This model is licensed under the Apache License, Version 2.0. For the full terms, see https://www.apache.org/licenses/LICENSE-2.0.