t5-small-finetuned-contradiction
Introduction
The t5-small-finetuned-contradiction model, published by domenicrosati, is a fine-tuned version of T5 optimized on the SNLI dataset for sequence-to-sequence language modeling and text-to-text generation tasks. It achieves notable evaluation metrics, including a Rouge1 score of 34.4237.
Architecture
This model is built upon the T5 architecture, which is designed for text-to-text tasks. The architecture allows for the transformation of one text sequence into another, making it suitable for a variety of language processing tasks including summarization and contradiction detection.
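In the text-to-text framing, a contradiction-detection input such as an SNLI premise/hypothesis pair is serialized into a single string before being passed to the model. The template below is an illustrative assumption (the card does not document the exact prompt format used during fine-tuning):

```python
def format_nli_input(premise: str, hypothesis: str) -> str:
    # Hypothetical serialization of a premise/hypothesis pair into one
    # text-to-text input string; the "premise:"/"hypothesis:" field names
    # are assumptions, not the card's documented format.
    return f"premise: {premise} hypothesis: {hypothesis}"

example = format_nli_input(
    "A man is playing a guitar on stage.",
    "The man is asleep at home.",
)
print(example)
```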
Training
Training Procedure
The model was trained using the Adam optimizer with the following hyperparameters:
- Learning Rate: 5.6e-05
- Batch Size: 64 for both training and evaluation
- Epochs: 8
- Seed: 42
- Mixed Precision Training: Native AMP
Detailed per-epoch results showed incremental improvements in loss and Rouge scores.
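The hyperparameters above determine the optimizer step count. As a rough illustration, assuming the standard SNLI training split of about 550,152 examples (an assumption; the card does not state the training-set size):

```python
import math

# Hyperparameters from the card.
batch_size = 64
epochs = 8

# SNLI training-set size is an assumption, used only to illustrate
# how many optimizer steps the stated settings imply.
snli_train_examples = 550_152

steps_per_epoch = math.ceil(snli_train_examples / batch_size)
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # → 8597 68776
```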
Training Results
Throughout 8 epochs, the model demonstrated a steady reduction in loss and improving Rouge scores. The final epoch achieved a loss of 2.0458 with the following Rouge metrics:
- Rouge1: 34.5
- Rouge2: 14.5695
- RougeL: 32.6219
- RougeLsum: 32.6478
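For intuition, Rouge1 is the F1 score of unigram overlap between generated and reference text. A minimal, simplified sketch (whitespace tokenization, count-clipped overlap; real implementations such as the rouge_score package add stemming and other preprocessing):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    # Count-clipped unigram overlap between candidate and reference,
    # then the F1 of the resulting precision and recall.
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

print(round(rouge1_f1("the cat sat", "the cat sat on the mat"), 4))  # → 0.6667
```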
Guide: Running Locally
To run the t5-small-finetuned-contradiction model locally, follow these steps:

1. Install Dependencies: Ensure the following libraries are installed:

```bash
pip install transformers==4.18.0 torch==1.11.0 datasets==2.1.0 tokenizers==0.12.1
```

2. Load the Model: Use the Hugging Face Transformers library to load the model and tokenizer:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("domenicrosati/t5-small-finetuned-contradiction")
tokenizer = T5Tokenizer.from_pretrained("domenicrosati/t5-small-finetuned-contradiction")
```

3. Inference: Prepare your input text and generate predictions:

```python
input_text = "Your input text here"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

4. Hardware Recommendations: For optimal performance, especially during inference, consider using cloud GPUs such as those provided by AWS, GCP, or Azure.
License
The model is available under the Apache 2.0 license, permitting free use, modification, and distribution with proper attribution.