roberta-base-corener

Introduction
The roberta-base-corener model by aiola is a multi-task model for named-entity recognition (NER), relation extraction (RE), entity mention detection (EMD), and coreference resolution (CR). It is built on the RoBERTa architecture and trained on English datasets such as OntoNotes and CoNLL04.
Architecture
The model leverages the RoBERTa architecture and formulates NER as a span classification task. It treats relation extraction as a multi-label classification problem of NER span tuples. Similarly, EMD is modeled as a span classification task, and CR is addressed using binary classification of EMD span tuples. CR clusters are constructed by identifying the top antecedent of each mention and computing the connected components of the mentions' undirected graph.
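To make the clustering step concrete, here is a minimal illustrative sketch of deriving coreference clusters from predicted top antecedents via connected components. This is not CoReNer's actual implementation; build_clusters and its input format are assumptions for illustration only.

```python
# Illustrative sketch, not CoReNer's code: build coreference clusters from
# each mention's predicted top antecedent by computing the connected
# components of the resulting undirected graph.
from collections import defaultdict

def build_clusters(top_antecedent):
    # top_antecedent maps a mention index to its predicted antecedent
    # index, or None when the mention has no antecedent.
    graph = defaultdict(set)
    for mention, antecedent in top_antecedent.items():
        if antecedent is not None:
            graph[mention].add(antecedent)
            graph[antecedent].add(mention)

    clusters, visited = [], set()
    for mention in top_antecedent:
        if mention in visited:
            continue
        # A depth-first walk over the undirected graph yields one
        # connected component, i.e. one coreference cluster.
        stack, component = [mention], set()
        while stack:
            node = stack.pop()
            if node in visited:
                continue
            visited.add(node)
            component.add(node)
            stack.extend(graph[node] - visited)
        clusters.append(sorted(component))
    return clusters

# Example: mentions 1 and 2 chain back to mention 0; mention 3 is a singleton.
print(build_clusters({0: None, 1: 0, 2: 1, 3: None}))  # [[0, 1, 2], [3]]
```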
Training
The model is trained to recognize entity types such as GPE, ORG, PERSON, and DATE, along with relation types such as Kill, Live_In, Located_In, OrgBased_In, and Work_For. For instance, given a sentence like "John Smith works for Acme Corp.", the expected output tags John Smith as PERSON, Acme Corp. as ORG, and relates the two via Work_For.
Guide: Running Locally
To run the model locally, follow these steps:
- Install the necessary libraries:

```bash
pip install transformers
pip install git+https://github.com/aiola-lab/corener.git
```
- Import the required modules and load the model:

```python
from transformers import AutoTokenizer

from corener.models import Corener

tokenizer = AutoTokenizer.from_pretrained("aiola/roberta-base-corener")
model = Corener.from_pretrained("aiola/roberta-base-corener")
```
- Prepare your dataset and make predictions:

```python
import json

from corener.data import MTLDataset
from corener.utils.prediction import convert_model_output

# Build an inference dataset over the input texts, using the entity and
# relation types from the model's configuration.
examples = ["Your text here"]
dataset = MTLDataset(types=model.config.types, tokenizer=tokenizer, train_mode=False)
dataset.read_dataset(examples)
example = dataset.get_example(0)

# Run the model in inference mode and print the converted output as JSON.
output = model(input_ids=example.encodings, context_masks=example.context_masks, inference=True)
print(json.dumps(convert_model_output(output=output, batch=example, dataset=dataset), indent=2))
```
- Consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure for better performance if required; see the sketch below this list.
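As a rough sketch of GPU inference, assuming Corener behaves like a standard PyTorch module and that example.encodings and example.context_masks are ordinary tensors (assumptions for illustration, not confirmed by the library's documentation):

```python
import torch

# Pick a GPU when available; fall back to CPU otherwise.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Assumption: Corener is a torch.nn.Module, so .to() and .eval() apply.
model.to(device)
model.eval()

with torch.no_grad():
    output = model(
        input_ids=example.encodings.to(device),        # assumes a torch.Tensor
        context_masks=example.context_masks.to(device),
        inference=True,
    )
```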
License
The model is distributed under the Apache-2.0 license, which allows for modification and distribution with proper attribution.