ner-german-large (Flair)

Introduction
The NER-GERMAN-LARGE
model is a Named Entity Recognition (NER) model for German, built with the Flair NLP library. It recognizes four entity types: person names (PER), location names (LOC), organization names (ORG), and miscellaneous names (MISC). The model achieves an F1-score of 92.31 on the revised CoNLL-03 German dataset, using document-level XLM-R embeddings and the FLERT approach.
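To make the four entity classes concrete, the following minimal sketch (not part of the model itself) shows how per-token BIO tags such as B-PER or I-LOC are grouped into the entity spans a NER model ultimately outputs; the helper function `bio_to_spans` is purely illustrative:

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) spans.

    Illustrative helper, not Flair API: B-X starts an entity of class X,
    I-X continues it, and O (or an inconsistent tag) ends it.
    """
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):  # a new entity begins
            if current:
                spans.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(token)  # the current entity continues
        else:  # "O" or a mismatched tag closes any open entity
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans

tokens = ["George", "Washington", "ging", "nach", "Washington"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# → [('George Washington', 'PER'), ('Washington', 'LOC')]
```

The same sentence appears in the usage example further below, where Flair performs the tagging itself.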
Architecture
The model uses a sequence tagger without a Conditional Random Field (CRF) or Recurrent Neural Network (RNN). It relies on fine-tunable transformer embeddings with document context, specifically the xlm-roberta-large
model. The architecture is deliberately lean, with a hidden size of 256 and no reprojection of embeddings.
Training
The training process involves:
- Loading the CoNLL-03 German corpus.
- Defining the NER tag type.
- Creating a tag dictionary from the corpus.
- Initializing transformer embeddings with context.
- Configuring a sequence tagger with these embeddings.
- Using the AdamW optimizer for training.
- Training the model over 20 epochs with a small learning rate, leveraging the OneCycleLR scheduler.
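The steps above can be sketched in Flair as follows. This is an illustrative reconstruction, not the canonical training script: argument names follow older Flair releases and may differ in current versions, and the hyperparameter values in the comments are assumptions you should adjust for your setup. The imports are placed inside the function so the sketch can be read without Flair installed; actually running it also requires the CoNLL-03 German data, which must be obtained separately.

```python
def build_and_train_tagger():
    """Sketch of the training recipe described above (hedged: signatures
    vary across Flair versions; treat as illustrative, not canonical)."""
    import torch
    from torch.optim.lr_scheduler import OneCycleLR
    from flair.datasets import CONLL_03_GERMAN
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # 1. Load the CoNLL-03 German corpus (requires the dataset files locally).
    corpus = CONLL_03_GERMAN()

    # 2.-3. Define the tag type and build the tag dictionary from the corpus.
    tag_type = "ner"
    tag_dictionary = corpus.make_label_dictionary(label_type=tag_type)

    # 4. Fine-tunable transformer embeddings with document context (FLERT).
    embeddings = TransformerWordEmbeddings(
        model="xlm-roberta-large",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
        use_context=True,
    )

    # 5. Sequence tagger without CRF/RNN, hidden size 256, no reprojection,
    # matching the architecture described above.
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,
        tag_type=tag_type,
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # 6.-7. Train with AdamW, a small learning rate, 20 epochs, OneCycleLR.
    trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)
    trainer.train(
        "resources/taggers/ner-german-large",
        learning_rate=5.0e-6,  # assumed value; "small learning rate" per the text
        mini_batch_size=4,     # assumed; reduce with mini_batch_chunk_size if OOM
        max_epochs=20,
        scheduler=OneCycleLR,
        embeddings_storage_mode="none",
    )
    return tagger
```

The function is a recipe, not a quick experiment: fine-tuning xlm-roberta-large at this scale is GPU-bound, which is why the local-run guide below suggests cloud GPUs.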
Guide: Running Locally
- Install Flair:
pip install flair
- Load the model and predict:
from flair.data import Sentence
from flair.models import SequenceTagger

# Load tagger
tagger = SequenceTagger.load("flair/ner-german-large")

# Make example sentence
sentence = Sentence("George Washington ging nach Washington")

# Predict NER tags
tagger.predict(sentence)

# Print results
for entity in sentence.get_spans('ner'):
    print(entity)
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure to handle the computational demands of the model.
Citation
When using this model, please cite the paper "FLERT: Document-Level Features for Named Entity Recognition" by Stefan Schweter and Alan Akbik, available on arXiv (eprint: 2011.06993).
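In BibTeX form, reconstructed from the citation above (the citation key is a made-up convenience; the year follows from the 2011.x arXiv eprint number):

```bibtex
@misc{schweter2020flert,
  title         = {FLERT: Document-Level Features for Named Entity Recognition},
  author        = {Stefan Schweter and Alan Akbik},
  year          = {2020},
  eprint        = {2011.06993},
  archivePrefix = {arXiv}
}
```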