bart-large-cnn-samsum
philschmid

Introduction
The bart-large-cnn-samsum model is a fine-tuned version of the BART model for summarization, trained on the SAMSum Corpus. It is optimized for text-to-text generation in English and can be deployed on various platforms, including Amazon SageMaker.
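Deployment on SageMaker typically uses the Hugging Face Inference Toolkit, which serves the model directly from the Hub. The sketch below is illustrative only: the framework versions and instance type are assumptions, not values taken from this card.

```python
# Minimal sketch of deploying the model on Amazon SageMaker.
# Framework versions and instance type are illustrative assumptions;
# check the SageMaker documentation for currently supported values.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

hub = {
    "HF_MODEL_ID": "philschmid/bart-large-cnn-samsum",
    "HF_TASK": "summarization",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",  # assumed version
    pytorch_version="1.13",       # assumed version
    py_version="py39",            # assumed version
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",  # assumed GPU instance
)

print(predictor.predict({"inputs": "Jeff: Can I train a model on SageMaker?"}))
```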
Architecture
The model is based on the BART (Bidirectional and Auto-Regressive Transformers) architecture, designed for text generation and summarization. It uses facebook/bart-large-cnn as the base model, fine-tuned on the SAMSum dataset to improve performance on dialogue summarization.
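Because it is a standard seq2seq checkpoint, it can also be loaded outside the pipeline API shown later. This is a minimal sketch; the example dialogue and the generation parameters (max_length, num_beams) are illustrative assumptions.

```python
# Sketch: loading the fine-tuned checkpoint as a standard seq2seq model.
# The dialogue and generation parameters are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum")

dialogue = "Jeff: Can I train a model on SageMaker?\nPhilipp: Sure, use the Hugging Face DLC."
inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```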
Training
This model was trained on Amazon SageMaker using the Hugging Face Deep Learning Container. Key hyperparameters include:
- Dataset: SAMSum
- Training epochs: 3
- Learning rate: 5e-05
- Batch size: 4 (for both training and evaluation)
- Mixed precision training enabled (fp16)
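As a rough sketch, these hyperparameters map onto the transformers trainer API as follows; the output directory is an assumed placeholder, and the original training script may have set additional options.

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir and any unlisted settings are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./bart-large-cnn-samsum",  # assumed path
    num_train_epochs=3,                    # from the card
    learning_rate=5e-5,                    # from the card
    per_device_train_batch_size=4,         # from the card
    per_device_eval_batch_size=4,          # from the card
    fp16=True,                             # mixed precision, from the card
)
```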
The model achieves strong ROUGE-1, ROUGE-2, and ROUGE-L scores on the SAMSum evaluation and test sets, reflecting its dialogue-summarization quality.
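For reference, ROUGE scores of this kind can be computed with the evaluate library. This is a minimal sketch with placeholder strings, not the model's actual predictions.

```python
# Sketch: computing ROUGE with the `evaluate` library.
# The prediction/reference strings are placeholders, not real model outputs.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["Jeff wants to train a model on SageMaker."],
    references=["Jeff asks how to train a Transformers model on SageMaker."],
)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum
```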
Guide: Running Locally
To run the model locally, follow these steps:
1. Install the Transformers Library:

   Ensure you have the transformers library installed:

   ```bash
   pip install transformers
   ```

2. Load the Model:

   ```python
   from transformers import pipeline

   summarizer = pipeline("summarization", model="philschmid/bart-large-cnn-samsum")
   ```

3. Summarize a Conversation:

   ```python
   conversation = '''Jeff: Can I train a 🤗 Transformers model on Amazon SageMaker?
   Philipp: Sure you can use the new Hugging Face Deep Learning Container.
   Jeff: ok.
   Jeff: and how can I get started?
   Jeff: where can I find documentation?
   Philipp: ok, ok you can find everything here. https://huggingface.co/blog/the-partnership-amazon-sagemaker-and-hugging-face
   '''
   print(summarizer(conversation))
   ```
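The pipeline returns a list of dictionaries, each with a summary_text key. Generation keyword arguments can also be passed through the call to control summary length; the values below are illustrative assumptions, continuing from step 3.

```python
# Continues from step 3 above; the length values are illustrative assumptions.
result = summarizer(conversation, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```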
For optimal performance, consider using cloud GPUs from providers like AWS, GCP, or Azure.
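If a CUDA GPU is available, the pipeline can be pinned to it explicitly. A minimal sketch, assuming a single-GPU machine:

```python
# Sketch: run the summarization pipeline on the first CUDA GPU.
# device=-1 (the default) runs on CPU; device=0 assumes one GPU is present.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="philschmid/bart-large-cnn-samsum",
    device=0,
)
```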
License
This model is released under the MIT License, allowing for wide usage and modification with minimal restrictions.