roberta-base-corener

aiola

Introduction

The roberta-base-corener model by aiola is a multi-task model for named-entity recognition (NER), relation extraction (RE), entity mention detection (EMD), and coreference resolution (CR). It is built on the RoBERTa architecture and trained on English datasets such as OntoNotes and CoNLL04.

Architecture

The model leverages the RoBERTa architecture and formulates NER as a span classification task. It treats relation extraction as a multi-label classification problem of NER span tuples. Similarly, EMD is modeled as a span classification task, and CR is addressed using binary classification of EMD span tuples. CR clusters are constructed by identifying the top antecedent of each mention and computing the connected components of the mentions' undirected graph.
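
As an illustration of the clustering step described above, here is a minimal sketch, not the library's implementation: the coref_clusters function, the mention indices, and the antecedents mapping are all hypothetical, and clusters are recovered as connected components of the undirected mention graph.

    from collections import defaultdict

    def coref_clusters(num_mentions, antecedents):
        # antecedents maps each mention index to its predicted top
        # antecedent (or None). Each link is an undirected edge, so
        # clusters are the connected components of the mention graph.
        graph = defaultdict(set)
        for mention, antecedent in antecedents.items():
            if antecedent is not None:
                graph[mention].add(antecedent)
                graph[antecedent].add(mention)

        seen, clusters = set(), []
        for start in range(num_mentions):
            if start in seen:
                continue
            # traverse the component containing `start`
            component, stack = set(), [start]
            while stack:
                node = stack.pop()
                if node in component:
                    continue
                component.add(node)
                stack.extend(graph[node] - component)
            seen |= component
            clusters.append(sorted(component))
        return clusters

    # Mentions 2 and 3 point back to antecedents 0 and 1, respectively.
    print(coref_clusters(4, {2: 0, 3: 1}))  # [[0, 2], [1, 3]]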

Training

The model is trained to recognize entity types such as GPE, ORG, PERSON, and DATE, and relation types such as Kill, Live_In, Located_In, OrgBased_In, and Work_For.
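
For intuition, the snippet below shows a hypothetical annotation using these label inventories; it illustrates the tag sets only and is not actual model output.

    # Hypothetical annotation illustrating the entity and relation
    # label inventories; not actual model output.
    text = "John Wilkes Booth, who assassinated President Lincoln, was an actor."
    entities = [("John Wilkes Booth", "PERSON"), ("Lincoln", "PERSON")]
    relations = [("John Wilkes Booth", "Kill", "Lincoln")]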

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the necessary libraries:

    pip install transformers
    pip install git+https://github.com/aiola-lab/corener.git
    
  2. Import the required modules and load the model:

    from transformers import AutoTokenizer
    from corener.models import Corener

    # Load the tokenizer and the multi-task CoReNer model
    tokenizer = AutoTokenizer.from_pretrained("aiola/roberta-base-corener")
    model = Corener.from_pretrained("aiola/roberta-base-corener")
    model.eval()  # disable dropout for inference
    
  3. Prepare your dataset and make predictions:

    from corener.data import MTLDataset
    from corener.utils.prediction import convert_model_output
    import json

    examples = ["Your text here"]

    # Build an inference-mode dataset from the raw texts
    dataset = MTLDataset(types=model.config.types, tokenizer=tokenizer, train_mode=False)
    dataset.read_dataset(examples)
    example = dataset.get_example(0)  # first (and only) example

    # Run all four tasks (NER, RE, EMD, CR) in a single forward pass
    output = model(input_ids=example.encodings, context_masks=example.context_masks, inference=True)

    # Convert the raw model output into a readable JSON structure
    print(json.dumps(convert_model_output(output=output, batch=example, dataset=dataset), indent=2))
    
  4. For larger workloads, consider a cloud GPU from AWS, Google Cloud, or Azure; a sketch of running inference on a GPU follows this list.
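
If a GPU is available, the forward pass from step 3 can be moved onto it. The sketch below assumes example.encodings and example.context_masks are PyTorch tensors, which is an assumption about CoReNer's internals rather than documented behavior.

    import torch

    # Hypothetical GPU inference sketch; assumes the example fields
    # are torch tensors (not confirmed by the CoReNer docs).
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    output = model(
        input_ids=example.encodings.to(device),
        context_masks=example.context_masks.to(device),
        inference=True,
    )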

License

The model is distributed under the Apache-2.0 license, which allows for modification and distribution with proper attribution.
