ner-german-large (Flair)

Introduction
The NER-GERMAN-LARGE
model is a Named Entity Recognition (NER) model for German, built with the Flair NLP library. It recognizes four entity types: person names (PER), location names (LOC), organization names (ORG), and miscellaneous names (MISC). The model achieves an F1-score of 92.31 on the revised CoNLL-03 German dataset, using document-level XLM-R embeddings and the FLERT approach.
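To make the four entity classes concrete, the following minimal sketch (not part of the model itself) shows how per-token BIO tags such as B-PER or I-LOC are grouped into the entity spans a NER model ultimately outputs; the helper function `bio_to_spans` is purely illustrative:

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_text, label) spans.

    Illustrative helper, not Flair API: B-X starts an entity of class X,
    I-X continues it, and O (or an inconsistent tag) ends it.
    """
    spans, current, label = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):  # a new entity begins
            if current:
                spans.append((" ".join(current), label))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(token)  # the current entity continues
        else:  # "O" or a mismatched tag closes any open entity
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans

tokens = ["George", "Washington", "ging", "nach", "Washington"]
tags = ["B-PER", "I-PER", "O", "O", "B-LOC"]
print(bio_to_spans(tokens, tags))
# → [('George Washington', 'PER'), ('Washington', 'LOC')]
```

The same sentence appears in the usage example further below, where Flair performs the tagging itself.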
Architecture
The model uses a sequence tagger without a Conditional Random Field (CRF) or Recurrent Neural Network (RNN). It relies on fine-tunable transformer embeddings with document context, specifically the xlm-roberta-large
model. The architecture is deliberately lean, with a hidden size of 256 and no reprojection of embeddings.
Training
The training process involves:
- Loading the CoNLL-03 German corpus.
- Defining the NER tag type.
- Creating a tag dictionary from the corpus.
- Initializing transformer embeddings with context.
- Configuring a sequence tagger with these embeddings.
- Using the AdamW optimizer for training.
- Training the model over 20 epochs with a small learning rate, leveraging the OneCycleLR scheduler.
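The steps above can be sketched in Flair as follows. This is an illustrative reconstruction, not the canonical training script: argument names follow older Flair releases and may differ in current versions, and the hyperparameter values in the comments are assumptions you should adjust for your setup. The imports are placed inside the function so the sketch can be read without Flair installed; actually running it also requires the CoNLL-03 German data, which must be obtained separately.

```python
def build_and_train_tagger():
    """Sketch of the training recipe described above (hedged: signatures
    vary across Flair versions; treat as illustrative, not canonical)."""
    import torch
    from torch.optim.lr_scheduler import OneCycleLR
    from flair.datasets import CONLL_03_GERMAN
    from flair.embeddings import TransformerWordEmbeddings
    from flair.models import SequenceTagger
    from flair.trainers import ModelTrainer

    # 1. Load the CoNLL-03 German corpus (requires the dataset files locally).
    corpus = CONLL_03_GERMAN()

    # 2.-3. Define the tag type and build the tag dictionary from the corpus.
    tag_type = "ner"
    tag_dictionary = corpus.make_label_dictionary(label_type=tag_type)

    # 4. Fine-tunable transformer embeddings with document context (FLERT).
    embeddings = TransformerWordEmbeddings(
        model="xlm-roberta-large",
        layers="-1",
        subtoken_pooling="first",
        fine_tune=True,
        use_context=True,
    )

    # 5. Sequence tagger without CRF/RNN, hidden size 256, no reprojection,
    # matching the architecture described above.
    tagger = SequenceTagger(
        hidden_size=256,
        embeddings=embeddings,
        tag_dictionary=tag_dictionary,
        tag_type=tag_type,
        use_crf=False,
        use_rnn=False,
        reproject_embeddings=False,
    )

    # 6.-7. Train with AdamW, a small learning rate, 20 epochs, OneCycleLR.
    trainer = ModelTrainer(tagger, corpus, optimizer=torch.optim.AdamW)
    trainer.train(
        "resources/taggers/ner-german-large",
        learning_rate=5.0e-6,  # assumed value; "small learning rate" per the text
        mini_batch_size=4,     # assumed; reduce with mini_batch_chunk_size if OOM
        max_epochs=20,
        scheduler=OneCycleLR,
        embeddings_storage_mode="none",
    )
    return tagger
```

The function is a recipe, not a quick experiment: fine-tuning xlm-roberta-large at this scale is GPU-bound, which is why the local-run guide below suggests cloud GPUs.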
Guide: Running Locally
- Install Flair:
pip install flair
- Load the model and predict:
from flair.data import Sentence
from flair.models import SequenceTagger

# Load tagger
tagger = SequenceTagger.load("flair/ner-german-large")

# Make example sentence
sentence = Sentence("George Washington ging nach Washington")

# Predict NER tags
tagger.predict(sentence)

# Print results
for entity in sentence.get_spans('ner'):
    print(entity)
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure to handle the computational demands of the model.
Citation
When using this model, please cite the paper "FLERT: Document-Level Features for Named Entity Recognition" by Stefan Schweter and Alan Akbik, available on arXiv (eprint: 2011.06993).
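In BibTeX form, reconstructed from the citation above (the citation key is a made-up convenience; the year follows from the 2011.x arXiv eprint number):

```bibtex
@misc{schweter2020flert,
  title         = {FLERT: Document-Level Features for Named Entity Recognition},
  author        = {Stefan Schweter and Alan Akbik},
  year          = {2020},
  eprint        = {2011.06993},
  archivePrefix = {arXiv}
}
```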