cnn_dailymail-summarization-t5-small-2022-09-05

farleyknight

Introduction

The cnn_dailymail-summarization-t5-small-2022-09-05 model is a fine-tuned version of t5-small on version 3.0.0 of the CNN/DailyMail dataset. It is designed for abstractive text summarization and achieves solid ROUGE scores for a model of its size.

Architecture

The model is based on the T5 architecture, a text-to-text transformer. It is implemented in PyTorch and was produced using Hugging Face's Trainer.

Training

The model was trained using the following hyperparameters:

  • Learning Rate: 5e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Seed: 42
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • LR Scheduler Type: Linear
  • Number of Epochs: 3.0
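In Hugging Face terms, the hyperparameters above correspond roughly to the following training-arguments sketch. The argument names come from transformers' `Seq2SeqTrainingArguments`; the output directory is a placeholder, not taken from the model card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration implied by the hyperparameters above.
# output_dir is a placeholder; all other values mirror the list in this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="./cnn_dailymail-t5-small",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3.0,
)
```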

The evaluation results on the dataset show:

  • Loss: 1.6455
  • ROUGE-1: 41.4235
  • ROUGE-2: 19.0263
  • ROUGE-L: 29.2892
  • ROUGE-Lsum: 38.6338
  • Gen Len (average generated length): 73.7273

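For intuition, ROUGE-1 measures unigram overlap between a generated summary and its reference. The following is a minimal illustration of the ROUGE-1 F1 idea, not the actual scoring implementation (which typically uses the `rouge_score` package with stemming and other normalization):

```python
from collections import Counter

def rouge1_f1(prediction: str, reference: str) -> float:
    """Unigram-overlap F1 between a predicted and a reference summary."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Clipped overlap: each unigram counts at most as often as it
    # appears in the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```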
Guide: Running Locally

To run the model locally, follow these steps:

  1. Clone the repository and navigate to the main directory.
  2. Install the necessary Python packages using pip:
    pip install transformers torch datasets sentencepiece
    
  3. Load the model using the transformers library:
    from transformers import T5Tokenizer, T5ForConditionalGeneration
    
    model_id = "farleyknight/cnn_dailymail-summarization-t5-small-2022-09-05"
    tokenizer = T5Tokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    
  4. Use the tokenizer and model for summarization tasks.
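Putting the steps together, a summarization call might look like the sketch below. The `summarize:` prefix is T5's convention for the summarization task; the article text and the generation settings (beam count, length limits) are placeholders, not values specified by this model card.

```python
from transformers import T5Tokenizer, T5ForConditionalGeneration

model_id = "farleyknight/cnn_dailymail-summarization-t5-small-2022-09-05"
tokenizer = T5Tokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

article = "..."  # placeholder: the news article to summarize

# T5 is a text-to-text model, so the task is signalled via a text prefix.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   max_length=512, truncation=True)
summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```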

For enhanced performance, consider using cloud GPUs such as AWS EC2 instances with NVIDIA GPUs, Google Cloud Platform, or Azure.

License

This model is licensed under the Apache 2.0 License.
