xlm-roberta-base-finetuned-panx-all
Introduction
The xlm-roberta-base-finetuned-panx-all model is a fine-tuned version of the xlm-roberta-base model, adapted for multilingual named entity recognition (NER). The fine-tuning process is documented in Chapter 4 of the "NLP with Transformers" book. The model reaches an F1 score of 0.8581 on the PAN-X evaluation set.
Architecture
The model is based on the XLM-RoBERTa architecture, which is known for robust performance across many languages. It has been fine-tuned on the PAN-X dataset for token classification, the task formulation used for NER.
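As a quick sanity check of the architecture, the checkpoint's configuration can be inspected after download. This is a minimal sketch using standard Hugging Face config attributes; the exact labels printed depend on the published checkpoint:

from transformers import AutoConfig

# Load the configuration of the published checkpoint.
config = AutoConfig.from_pretrained("transformersbook/xlm-roberta-base-finetuned-panx-all")
print(config.model_type)  # architecture family ("xlm-roberta")
print(config.id2label)    # label set used by the token-classification head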
Training
The model was fine-tuned with a learning rate of 5e-05, a batch size of 24 for both training and evaluation, and the Adam optimizer, over three epochs, reaching an F1 score of 0.8581 on the evaluation set. A sketch of how these hyperparameters map onto the Trainer API follows the list below. The training used the following framework versions:
- Transformers 4.12.0.dev0
- PyTorch 1.9.1+cu102
- Datasets 1.12.1
- Tokenizers 0.10.3
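For reference, here is a minimal sketch of the reported hyperparameters expressed as TrainingArguments; the output directory is illustrative, and the dataset preparation, data collator, and Trainer call are omitted:

from transformers import TrainingArguments

# Hyperparameters reported on the model card; output_dir is illustrative.
training_args = TrainingArguments(
    output_dir="xlm-roberta-base-finetuned-panx-all",
    learning_rate=5e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    num_train_epochs=3,
)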
Guide: Running Locally
To run the model locally, you need to have the appropriate environment set up. Follow these basic steps:
- Install Dependencies: Ensure you have Python installed and set up a virtual environment. Install the required libraries (the training environment used a development snapshot of Transformers, so the nearest released versions are pinned here):
pip install transformers==4.12.0 torch==1.9.1 datasets==1.12.1 tokenizers==0.10.3
- Download the Model: You can download the model and tokenizer using Hugging Face's transformers library:
from transformers import AutoModelForTokenClassification, AutoTokenizer
model = AutoModelForTokenClassification.from_pretrained("transformersbook/xlm-roberta-base-finetuned-panx-all")
tokenizer = AutoTokenizer.from_pretrained("transformersbook/xlm-roberta-base-finetuned-panx-all")
- Run Inference: Use the model and tokenizer to perform inference on your data; a minimal example is sketched after this list.
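The sketch below uses the token-classification pipeline; the example sentence is illustrative, and the pipeline falls back to CPU when no GPU is available:

import torch
from transformers import pipeline

# Use the first GPU if available, otherwise run on CPU.
device = 0 if torch.cuda.is_available() else -1

ner = pipeline(
    "token-classification",
    model="transformersbook/xlm-roberta-base-finetuned-panx-all",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entities
    device=device,
)

# The model is multilingual, so non-English text works as well (German here).
print(ner("Jeff Dean arbeitet bei Google in Kalifornien."))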
For efficient processing, especially with large datasets or longer sequences, it is advisable to use cloud GPUs such as those provided by AWS, GCP, or Azure.
License
The model is licensed under the MIT License, allowing for broad use and modification with minimal restrictions.