xlm-roberta-base-finetuned-swahili (Davlan)
Introduction
The xlm-roberta-base-finetuned-swahili model is a Swahili language model obtained by fine-tuning xlm-roberta-base on Swahili text. Compared with the original XLM-RoBERTa model, it performs better on downstream Swahili tasks such as text classification and named entity recognition.
Architecture
The model builds upon the xlm-roberta-base architecture, a multilingual transformer encoder. It has been fine-tuned specifically on Swahili text to optimize its performance on Swahili language tasks.
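As a minimal sketch of how the checkpoint can be loaded with the lower-level transformers APIs (the Swahili example sentence and the top-5 decoding are illustrative additions, not taken from the model card):

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

# Load the Swahili fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model_name = "Davlan/xlm-roberta-base-finetuned-swahili"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Illustrative Swahili sentence: "The capital city of Tanzania is <mask>."
text = "Mji mkuu wa Tanzania ni <mask>."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Rank the vocabulary at the <mask> position and print the top-5 candidate tokens.
mask_pos = (inputs["input_ids"] == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
top_ids = logits[0, mask_pos].topk(5).indices[0]
print(tokenizer.convert_ids_to_tokens(top_ids.tolist()))
```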
Training
- Training Data: The model was fine-tuned using the Swahili CC-100 corpus.
- Training Procedure: The model was fine-tuned on a single NVIDIA V100 GPU.
- Evaluation: On the MasakhaNER dataset, the model achieved an F1 score of 89.46, surpassing XLM-R's score of 87.55 (a rough starting-point sketch for such NER fine-tuning follows this list).
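The MasakhaNER number comes from further fine-tuning on labelled NER data; that training script is not part of this card. Purely as a hedged sketch of the starting point, one could attach a token-classification head to this checkpoint; the label list below is a hypothetical IOB2 set, not necessarily the exact one used:

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer

# Hypothetical IOB2 label set in the style of MasakhaNER-like corpora.
labels = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-DATE", "I-DATE"]

model_name = "Davlan/xlm-roberta-base-finetuned-swahili"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Attach a freshly initialised token-classification head on top of the
# Swahili-adapted encoder; the head still has to be trained on NER data.
model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    num_labels=len(labels),
    id2label=dict(enumerate(labels)),
    label2id={label: i for i, label in enumerate(labels)},
)
print(model.config.num_labels)
```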
Guide: Running Locally
- Installation: Ensure you have the transformers library installed:

  ```bash
  pip install transformers
  ```
- Usage: Load the model through the Hugging Face transformers fill-mask pipeline for masked token prediction:

  ```python
  from transformers import pipeline

  # Fill-mask pipeline backed by the Swahili fine-tuned checkpoint.
  unmasker = pipeline('fill-mask', model='Davlan/xlm-roberta-base-finetuned-swahili')

  results = unmasker("Jumatatu, Bwana Kagame alielezea shirika la France24 huko <mask> kwamba hakuna uhalifu ulitendwa")
  print(results)
  ```
- Cloud GPUs: For heavier workloads, consider cloud-based GPU services such as AWS EC2, Google Cloud Platform, or Azure; a sketch of GPU-backed inference follows this list.
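As a minimal sketch of running the same pipeline on a GPU instance (assuming a PyTorch build with CUDA support is installed; the device-selection logic is an assumption, not part of the original card):

```python
import torch
from transformers import pipeline

# Use the first CUDA device when available, otherwise fall back to CPU (-1).
device = 0 if torch.cuda.is_available() else -1

unmasker = pipeline(
    "fill-mask",
    model="Davlan/xlm-roberta-base-finetuned-swahili",
    device=device,
)
print(unmasker("Jumatatu, Bwana Kagame alielezea shirika la France24 huko <mask> kwamba hakuna uhalifu ulitendwa"))
```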
License
The licensing terms for this model are not specified in the provided document. Check the Hugging Face model repository or contact the author for detailed licensing information.