FLAN-T5-Base-SAMSum

philschmid

Introduction

The FLAN-T5-Base-SAMSum model is a fine-tuned version of Google's FLAN-T5-Base, trained on the SAMSum dataset of messenger-style dialogues for text-to-text generation. It performs sequence-to-sequence language modeling for dialogue summarization and achieves strong ROUGE scores on the SAMSum evaluation set.

Architecture

FLAN-T5-Base-SAMSum is built on the T5 encoder-decoder architecture and fine-tuned to improve performance on dialogue summarization using the SAMSum dataset. The model integrates with the Hugging Face Transformers library and is compatible with PyTorch.
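
For a quick check that the model loads and summarizes, it can be used through the Transformers summarization pipeline. A minimal sketch; the dialogue string is a made-up example input, not taken from SAMSum:

    from transformers import pipeline

    # Load the fine-tuned checkpoint as a summarization pipeline
    summarizer = pipeline("summarization", model="philschmid/flan-t5-base-samsum")

    # Hypothetical chat transcript in the SAMSum style
    dialogue = "Anna: Are we still on for lunch today?\nBen: Yes, 12:30 at the usual place.\nAnna: Perfect, see you there!"

    print(summarizer(dialogue)[0]["summary_text"])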

Training

The model was trained on the SAMSum dataset with the following hyperparameters; a sketch mapping them onto a Transformers training configuration follows the list:

  • Learning Rate: 5e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Seed: 42
  • Optimizer: Adam with betas (0.9, 0.999) and epsilon 1e-08
  • LR Scheduler Type: Linear
  • Number of Epochs: 5
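
As a reference point, these settings map directly onto Transformers' Seq2SeqTrainingArguments. A minimal sketch, assuming a hypothetical output directory; the Adam betas and epsilon above match the library defaults and so need no explicit arguments:

    from transformers import Seq2SeqTrainingArguments

    training_args = Seq2SeqTrainingArguments(
        output_dir="./flan-t5-base-samsum",  # hypothetical output path
        learning_rate=5e-5,
        per_device_train_batch_size=8,
        per_device_eval_batch_size=8,
        seed=42,
        lr_scheduler_type="linear",
        num_train_epochs=5,
        predict_with_generate=True,  # generate summaries during evaluation
    )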

Training Results

  • Loss: 1.3716
  • ROUGE-1: 47.2358
  • ROUGE-2: 23.5135
  • ROUGE-L: 39.6266
  • ROUGE-Lsum: 43.3458
  • Gen Len (average generated summary length, in tokens): 17.3907
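
Scores of this kind can be reproduced with the evaluate library's ROUGE metric. A minimal sketch; the prediction and reference strings are made-up examples, not actual model output:

    import evaluate

    rouge = evaluate.load("rouge")

    # Hypothetical prediction/reference pair, for illustration only
    scores = rouge.compute(
        predictions=["Anna and Ben will meet for lunch at 12:30."],
        references=["Anna and Ben are meeting for lunch at 12:30 today."],
    )
    print(scores)  # keys: rouge1, rouge2, rougeL, rougeLsum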

Guide: Running Locally

  1. Setup Environment: Ensure you have Python installed, then create and activate a virtual environment for the project.
  2. Install Dependencies: Use pip to install the Transformers, Datasets, and PyTorch libraries, along with SentencePiece, which the T5 tokenizer requires:
    pip install transformers datasets torch sentencepiece
    
  3. Download the Model: Use the Hugging Face Transformers library to download the model:
    from transformers import T5ForConditionalGeneration, T5Tokenizer

    # Download the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
    model = T5ForConditionalGeneration.from_pretrained("philschmid/flan-t5-base-samsum")
    tokenizer = T5Tokenizer.from_pretrained("philschmid/flan-t5-base-samsum")
    
  4. Run Inference: Use the model and tokenizer from the previous step to generate a summary; a minimal sketch with a made-up dialogue:
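    # Minimal inference sketch; the dialogue below is a made-up example input
    dialogue = "Jeff: Can I train a model on your laptop?\nKim: Sure, but it might take a while!"

    inputs = tokenizer(dialogue, return_tensors="pt")
    summary_ids = model.generate(**inputs, max_new_tokens=60)
    print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
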
  5. Cloud GPU: For faster processing, consider using cloud GPU services such as AWS, GCP, or Azure.

License

The FLAN-T5-Base-SAMSum model is licensed under the Apache-2.0 License. This license permits use, distribution, and modification under certain conditions; further details can be found in the repository's LICENSE file.
