T5-Base-Finetuned-Question-Answering

MaRiOrOsSi

Introduction

T5-Base-Finetuned-Question-Answering is a generative question answering model developed by Christian Di Maio and Giacomo Nunziati. It fine-tunes Google's T5 on the DuoRC dataset, prepending each question to its context to form a single input text.

Architecture

The model uses the T5 architecture, a transformer-based encoder-decoder that treats every task as text-to-text: the input is a string and the answer is generated as a string. The base checkpoint is fine-tuned on the DuoRC dataset to improve its question-answering ability.

Training

The script used for fine-tuning is available in a GitHub repository. The model is evaluated on the DuoRC SelfRC and ParaphraseRC test subsets and on the SQuAD v1 validation subset, where its F1 and exact match (EM) scores compare favorably with a BERT baseline.
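
The source does not say how F1 and EM were computed. As a rough illustration only, the sketch below assumes the common SQuAD-style definitions: answers are normalized (lowercased, punctuation and English articles removed), EM checks for string equality, and F1 is the harmonic mean of token-level precision and recall. The function names are hypothetical and are not taken from the authors' training code.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumed definitions, a generated answer that shares only part of its wording with the reference scores 0 on EM but still earns partial credit on F1, which is why the two are usually reported together.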

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the Transformers Library:

    pip install transformers torch sentencepiece
    
  2. Load the Model and Tokenizer:

    # AutoModelWithLMHead is deprecated; T5 is a sequence-to-sequence model.
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    
    model_name = "MaRiOrOsSi/t5-base-finetuned-question-answering"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    
  3. Prepare Input and Generate Output:

    question = "What is 42?"
    context = "42 is the answer to life, the universe and everything"
    # The question is prepended to the context, as in fine-tuning.
    input_text = f"question: {question} context: {context}"
    encoded_input = tokenizer([input_text], return_tensors='pt', max_length=512, truncation=True)
    output = model.generate(input_ids=encoded_input.input_ids, attention_mask=encoded_input.attention_mask)
    # Decode the generated token IDs back into a plain-text answer.
    answer = tokenizer.decode(output[0], skip_special_tokens=True)
    print(answer)
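
When several questions need answering, it can help to keep the input format in one place. The sketch below is a minimal illustration that reuses the "question: ... context: ..." layout from step 3; the helper names `build_qa_input` and `build_qa_inputs` are hypothetical and not part of the model or the Transformers library.

```python
from typing import Iterable, List, Tuple


def build_qa_input(question: str, context: str) -> str:
    """Format one (question, context) pair in the layout shown in step 3."""
    return f"question: {question.strip()} context: {context.strip()}"


def build_qa_inputs(pairs: Iterable[Tuple[str, str]]) -> List[str]:
    """Format many (question, context) pairs for a single tokenizer call."""
    return [build_qa_input(question, context) for question, context in pairs]


batch = build_qa_inputs([
    ("What is 42?", "42 is the answer to life, the universe and everything"),
    ("Who developed the model?", "The model was developed by Christian Di Maio and Giacomo Nunziati."),
])
# With the tokenizer from step 2, the batch could then be encoded as
# (not run here; padding=True pads the shorter inputs to a common length):
# encoded = tokenizer(batch, return_tensors="pt", padding=True,
#                     max_length=512, truncation=True)
```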
    

Cloud GPU Recommendation: The model runs on CPU, but generation is faster on a GPU. If no local GPU is available, cloud platforms such as AWS, Google Cloud, or Azure offer GPU instances.
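
On a machine with a GPU, the model and its inputs must be placed on the same device before generation. The following is a minimal sketch, assuming PyTorch is the backend; `pick_device` is a hypothetical helper, and the commented lines refer to the `model` and `encoded_input` objects from the steps above.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to move; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
# With `model` and `encoded_input` from the steps above (not run here):
# model = model.to(device)
# encoded_input = encoded_input.to(device)
# output = model.generate(**encoded_input)
```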

License

The model was created by Christian Di Maio and Giacomo Nunziati, as indicated in the citation, with their professional LinkedIn profiles mentioned for further reference.
