t5 small finetuned contradiction

domenicrosati

Introduction

The t5-small-finetuned-contradiction model is a fine-tuned version of T5 (small) trained on the SNLI dataset for sequence-to-sequence, text-to-text generation. On the evaluation set it reports a Rouge1 score of 34.4237.

Architecture

This model is built upon the T5 architecture, which is designed for text-to-text tasks. The architecture allows for the transformation of one text sequence into another, making it suitable for a variety of language processing tasks including summarization and contradiction detection.
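The text-to-text framing can be illustrated with a minimal sketch. It assumes, without confirmation from the model card, that training pairs map a premise to a contradicting hypothesis; the records and the helper name `to_text_pairs` are invented for illustration and are not taken from SNLI or from the author's code.

```python
from typing import Dict, List, Tuple

# Toy records invented for illustration; they are not taken from SNLI.
records: List[Dict[str, str]] = [
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "A man is sleeping at home.",
     "label": "contradiction"},
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "A musician is performing.",
     "label": "entailment"},
]


def to_text_pairs(rows: List[Dict[str, str]],
                  label: str = "contradiction") -> List[Tuple[str, str]]:
    """Keep rows with the given label and return (source, target) text pairs."""
    return [(r["premise"], r["hypothesis"]) for r in rows if r["label"] == label]


pairs = to_text_pairs(records)
for source, target in pairs:
    print(f"source: {source}\ntarget: {target}")
```

Under this assumption, both the model input and the expected output are plain strings, which is what lets a single T5 architecture cover different tasks.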

Training

Training Procedure

The model was trained using the Adam optimizer with the following hyperparameters:

  • Learning Rate: 5.6e-05
  • Batch Size: 64 for both training and evaluation
  • Epochs: 8
  • Seed: 42
  • Mixed Precision Training: Native AMP

Loss and Rouge scores were evaluated during training and improved incrementally from epoch to epoch.
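The hyperparameters listed above could be collected as keyword arguments for the Transformers `Seq2SeqTrainingArguments` class. This is a sketch rather than the author's training script, which is not shown here. It assumes the listed batch size is per device, and `"out"` is a placeholder output directory.

```python
# Keyword arguments mirroring the listed hyperparameters.
training_kwargs = {
    "learning_rate": 5.6e-05,
    "per_device_train_batch_size": 64,  # assumption: the listed batch size is per device
    "per_device_eval_batch_size": 64,
    "num_train_epochs": 8,
    "seed": 42,
    "fp16": True,  # native AMP mixed precision; requires a CUDA GPU
}

# With transformers installed, these could be passed on as:
#   from transformers import Seq2SeqTrainingArguments
#   args = Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)
# ("out" is a placeholder directory name.)
for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```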

Training Results

Throughout 8 epochs, the model demonstrated a reduction in loss and improved Rouge scores. The final epoch achieved a loss of 2.0458 with the following Rouge metrics:

  • Rouge1: 34.4237
  • Rouge2: 14.5695
  • Rougel: 32.6219
  • Rougelsum: 32.6478
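For readers unfamiliar with the metric, Rouge1 measures unigram overlap between a generated text and a reference. The sketch below is a simplified version that assumes whitespace tokenization and lowercasing only; the scores reported above were presumably produced by a full ROUGE implementation (with its own tokenization and optional stemming) and appear to be on a 0 to 100 scale. The function name `rouge1_f1` is illustrative.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate string."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("the cat sat on the mat", "the cat lay on a mat")
print(round(100 * score, 2))  # 66.67
```

Rouge2 applies the same idea to bigrams, while RougeL is based on the longest common subsequence rather than n-gram counts.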

Guide: Running Locally

To run the t5-small-finetuned-contradiction model locally, follow these steps:

  1. Install Dependencies:
    Ensure you have the following libraries installed (sentencepiece is needed by T5Tokenizer, which is used in step 2):

    pip install transformers==4.18.0 torch==1.11.0 datasets==2.1.0 tokenizers==0.12.1 sentencepiece
    
  2. Load the Model: Use the Hugging Face Transformers library to load the model and its tokenizer:

    from transformers import T5ForConditionalGeneration, T5Tokenizer
    
    model = T5ForConditionalGeneration.from_pretrained("domenicrosati/t5-small-finetuned-contradiction")
    tokenizer = T5Tokenizer.from_pretrained("domenicrosati/t5-small-finetuned-contradiction")
    
  3. Inference: Prepare your input text and generate predictions:

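    # Tokenize the input and generate with the model loaded in step 2
    input_text = "Your input text here"
    input_ids = tokenizer.encode(input_text, return_tensors="pt")
    outputs = model.generate(input_ids)
    # skip_special_tokens drops markers such as <pad> and </s> from the decoded text
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))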
    
  4. Hardware Recommendations: For optimal performance, especially during inference, consider using cloud GPUs such as those provided by AWS, GCP, or Azure.
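Whether a GPU is used can be decided at run time. The following is a minimal sketch, assuming PyTorch is the backend; the helper name `pick_device` is illustrative and falls back to the CPU when no GPU (or no PyTorch install) is found.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch sees a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch not installed; fall back to the CPU label
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Using device: {device}")

# With the model and tokenizer from step 2 loaded, inference could then run as:
#   model = model.to(device)
#   input_ids = tokenizer.encode("Your input text here", return_tensors="pt").to(device)
#   outputs = model.generate(input_ids)
```

Since T5-small is a compact model (roughly 60 million parameters), CPU inference may be workable for short inputs, with a GPU mainly helping for larger batches.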

License

The model is available under the Apache 2.0 license, permitting free use, modification, and distribution with proper attribution.
