RoBERTa NER Multilingual (julian-schelb)
Introduction
The RoBERTa NER Multilingual model is designed for named entity recognition (NER), classifying tokens in text according to the IOB format. It is a fine-tuned version of XLM-RoBERTa, capable of recognizing entities such as persons, organizations, and locations across 21 languages.
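To illustrate what IOB2-style token labels look like, here is a small hand-written example; the sentence and tags are purely illustrative and are not output from this model.

```python
# Hand-written IOB2 tags for an example sentence (illustrative, not model output).
tokens = ["Angela", "Merkel", "visited", "Paris", "."]
tags   = ["B-PER",  "I-PER",  "O",       "B-LOC", "O"]

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```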
Architecture
The model is based on XLM-RoBERTa, a transformer pre-trained with a masked language modeling (MLM) objective: words in a sentence are masked and the model learns to predict them, yielding bidirectional representations learned from large multilingual corpora covering 100 languages.
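As a rough sketch of the MLM objective, the publicly available xlm-roberta-base checkpoint (the base that this model is fine-tuned from) can be queried through the fill-mask pipeline; the example sentence is illustrative.

```python
from transformers import pipeline

# Illustrates masked language modeling with the base XLM-RoBERTa checkpoint,
# not the fine-tuned NER model itself.
fill_mask = pipeline("fill-mask", model="xlm-roberta-base")

# XLM-RoBERTa uses "<mask>" as its mask token.
for prediction in fill_mask("The capital of France is <mask>."):
    print(prediction["token_str"], round(prediction["score"], 3))
```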
Training
The model is fine-tuned on the WikiANN dataset, which provides entity-annotated examples across the 21 supported languages. The training set comprises 375,100 sentences and the validation set 173,100 examples. The NER tags follow the IOB2 format and categorize entities as locations (LOC), persons (PER), or organizations (ORG). The evaluation results indicate high precision and recall, particularly for person entities.
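For a quick look at the training data, WikiANN can be loaded through the datasets library; the language code below is just an example, and the exact label ordering should be checked against the dataset itself.

```python
from datasets import load_dataset

# Load one WikiANN language split (the language code is an example).
wikiann = load_dataset("wikiann", "en")

# Each example provides tokens plus IOB2 tag ids for LOC, PER, and ORG spans.
print(wikiann["train"].features["ner_tags"].feature.names)
print(wikiann["train"][0]["tokens"])
print(wikiann["train"][0]["ner_tags"])
```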
Guide: Running Locally
- Install Dependencies: Ensure you have Python and PyTorch installed, along with the `transformers` library.

  ```bash
  pip install torch transformers
  ```
- Load the Model: Use the `AutoTokenizer` and `AutoModelForTokenClassification` classes to load the model.

  ```python
  from transformers import AutoTokenizer, AutoModelForTokenClassification

  tokenizer = AutoTokenizer.from_pretrained(
      "julian-schelb/roberta-ner-multilingual", add_prefix_space=True
  )
  model = AutoModelForTokenClassification.from_pretrained(
      "julian-schelb/roberta-ner-multilingual"
  )
  ```
- Prepare Input: Tokenize your input text.

  ```python
  text = "Your text here."
  inputs = tokenizer(text, add_special_tokens=False, return_tensors="pt")
  ```
- Inference: Run the model and get predictions.

  ```python
  import torch

  with torch.no_grad():
      logits = model(**inputs).logits

  predicted_token_class_ids = logits.argmax(-1)
  ```
- Interpret Results: Map predicted IDs to token classes.

  ```python
  predicted_tokens_classes = [
      model.config.id2label[t.item()] for t in predicted_token_class_ids[0]
  ]
  print(predicted_tokens_classes)
  ```
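As an alternative to the step-by-step code above, the model can also be wrapped in a token-classification pipeline, which handles tokenization and groups subword predictions into entity spans; the example sentence is illustrative.

```python
from transformers import pipeline

# High-level sketch of the same workflow; aggregation_strategy="simple"
# merges subword pieces into whole entity spans.
ner = pipeline(
    "token-classification",
    model="julian-schelb/roberta-ner-multilingual",
    aggregation_strategy="simple",
)

for entity in ner("Angela Merkel besuchte Paris im Mai."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```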
Cloud GPUs: For faster processing, consider using cloud-based GPU services like AWS, Google Cloud, or Azure.
License
The model is licensed under the MIT License, allowing wide use and modification with attribution.