bert-large-NER
dslim

Introduction
The bert-large-NER model is a fine-tuned BERT model designed for Named Entity Recognition (NER), achieving state-of-the-art performance on the task. It can recognize four types of entities: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC). It is based on the bert-large-cased model and fine-tuned on the English CoNLL-2003 NER dataset.
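The model tags tokens using the CoNLL-2003 IOB scheme: an O tag for non-entity tokens plus B- (beginning) and I- (inside) variants of each entity type. A minimal sketch for inspecting the label set the checkpoint ships with, assuming only the Transformers library:

    from transformers import AutoConfig

    # Load just the configuration; no model weights are downloaded.
    config = AutoConfig.from_pretrained("dslim/bert-large-NER")

    # id2label maps class indices to IOB tags such as B-PER and I-LOC.
    print(config.id2label)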
Architecture
The model uses the BERT architecture, specifically the bert-large-cased variant (24 Transformer layers, a hidden size of 1024, and 16 attention heads). This architecture enables the model to capture complex patterns in text, making it well suited to token-level tasks like NER.
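Those dimensions can be confirmed from the checkpoint itself; a short sketch, assuming the Transformers library with a PyTorch backend:

    from transformers import AutoModelForTokenClassification

    model = AutoModelForTokenClassification.from_pretrained("dslim/bert-large-NER")

    # bert-large: 24 layers, hidden size 1024, 16 attention heads.
    print(model.config.num_hidden_layers,
          model.config.hidden_size,
          model.config.num_attention_heads)

    # Total parameter count (roughly 334M for bert-large plus the token-classification head).
    print(sum(p.numel() for p in model.parameters()))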
Training
The bert-large-NER model was fine-tuned on the English version of the CoNLL-2003 NER dataset, which is derived from the Reuters news corpus. Training frames NER as token classification: each token is labeled as the beginning (B-) or continuation (I-) of an entity, or as outside any entity (O). The model was trained on a single NVIDIA V100 GPU using the hyperparameters recommended in the original BERT paper.
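To make the labeling concrete, here is a hand-written IOB annotation for one sentence (illustrative only, not model output):

    # Hypothetical gold annotation in the CoNLL-2003 IOB scheme.
    tokens = ["George", "Washington", "visited", "New", "York", "."]
    labels = ["B-PER", "I-PER", "O", "B-LOC", "I-LOC", "O"]

    # B- marks the first token of an entity, I- a continuation, O a non-entity token.
    for token, label in zip(tokens, labels):
        print(f"{token}\t{label}")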
Training Data
- Dataset: CoNLL-2003
- Entities: LOC, MISC, ORG, PER
- Entity counts (train split): 7,140 LOC, 3,438 MISC, 6,321 ORG, 6,600 PER
- Train split size: 946 articles, 14,987 sentences, 203,621 tokens
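To inspect the data yourself, the corpus can be loaded with the Hugging Face datasets library; a sketch assuming the conll2003 dataset id on the Hub:

    from datasets import load_dataset

    # Download the English CoNLL-2003 train/validation/test splits.
    dataset = load_dataset("conll2003")

    # Each example holds tokens plus integer NER tags in the IOB scheme.
    example = dataset["train"][0]
    print(example["tokens"])
    print(example["ner_tags"])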
Guide: Running Locally
Basic Steps
- Install Dependencies: Ensure you have Python installed, along with the Transformers library and a backend such as PyTorch.

    pip install transformers torch
- Load the Model:

    from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

    tokenizer = AutoTokenizer.from_pretrained("dslim/bert-large-NER")
    model = AutoModelForTokenClassification.from_pretrained("dslim/bert-large-NER")
    nlp = pipeline("ner", model=model, tokenizer=tokenizer)
- Run an Example:

    example = "My name is Wolfgang and I live in Berlin"
    ner_results = nlp(example)
    print(ner_results)
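By default the pipeline returns one prediction per subword piece, so a single name can come back as several fragments. The pipeline's aggregation_strategy argument merges the pieces back into whole entities; a minimal sketch:

    # "simple" groups contiguous subword predictions into one span per entity.
    nlp = pipeline("ner", model=model, tokenizer=tokenizer, aggregation_strategy="simple")

    # Each result is a dict with keys such as entity_group, word, and score.
    print(nlp("My name is Wolfgang and I live in Berlin"))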
Cloud GPUs
For better performance, especially with a large model like bert-large-NER, consider using cloud-based GPUs such as those offered by AWS, Google Cloud, or Azure.
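Once a GPU instance is provisioned, the pipeline can be placed on it via its device argument; a sketch assuming a CUDA device at index 0:

    # device=0 runs inference on the first CUDA GPU; device=-1 falls back to CPU.
    nlp = pipeline("ner", model=model, tokenizer=tokenizer, device=0)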
License
The bert-large-NER model is licensed under the MIT License, allowing for extensive reuse and modification with proper attribution.