bart-large-cnn-samsum
philschmid

Introduction
The bart-large-cnn-samsum model is a fine-tuned version of the BART model for summarization, trained on the SAMSum Corpus. It is optimized for text-to-text generation in English and can be deployed on various platforms, including Amazon SageMaker.
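Deployment on SageMaker typically uses the Hugging Face Inference Toolkit, which serves the model directly from the Hub. The sketch below is illustrative only: the framework versions and instance type are assumptions, not values taken from this card.

```python
# Minimal sketch of deploying the model on Amazon SageMaker.
# Framework versions and instance type are illustrative assumptions;
# check the SageMaker documentation for currently supported values.
import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()  # IAM role with SageMaker permissions

hub = {
    "HF_MODEL_ID": "philschmid/bart-large-cnn-samsum",
    "HF_TASK": "summarization",
}

huggingface_model = HuggingFaceModel(
    env=hub,
    role=role,
    transformers_version="4.26",  # assumed version
    pytorch_version="1.13",       # assumed version
    py_version="py39",            # assumed version
)

predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",  # assumed GPU instance
)

print(predictor.predict({"inputs": "Jeff: Can I train a model on SageMaker?"}))
```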
Architecture
The model is based on the BART (Bidirectional and Auto-Regressive Transformers) architecture, designed for text generation and summarization. It uses facebook/bart-large-cnn as the base model, fine-tuned on the SAMSum dataset to improve performance on dialogue summarization.
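Because it is a standard seq2seq checkpoint, it can also be loaded outside the pipeline API shown later. This is a minimal sketch; the example dialogue and the generation parameters (max_length, num_beams) are illustrative assumptions.

```python
# Sketch: loading the fine-tuned checkpoint as a standard seq2seq model.
# The dialogue and generation parameters are illustrative assumptions.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("philschmid/bart-large-cnn-samsum")
model = AutoModelForSeq2SeqLM.from_pretrained("philschmid/bart-large-cnn-samsum")

dialogue = "Jeff: Can I train a model on SageMaker?\nPhilipp: Sure, use the Hugging Face DLC."
inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_length=60, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```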
Training
This model was trained on Amazon SageMaker using the Hugging Face Deep Learning Container. Key hyperparameters include:
- Dataset: SAMSum
- Training epochs: 3
- Learning rate: 5e-05
- Batch size: 4 (for both training and evaluation)
- Mixed precision training enabled (fp16)
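As a rough sketch, these hyperparameters map onto the transformers trainer API as follows; the output directory is an assumed placeholder, and the original training script may have set additional options.

```python
# Sketch: the hyperparameters above expressed as Seq2SeqTrainingArguments.
# output_dir and any unlisted settings are illustrative assumptions.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./bart-large-cnn-samsum",  # assumed path
    num_train_epochs=3,                    # from the card
    learning_rate=5e-5,                    # from the card
    per_device_train_batch_size=4,         # from the card
    per_device_eval_batch_size=4,          # from the card
    fp16=True,                             # mixed precision, from the card
)
```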
The model achieves strong ROUGE-1, ROUGE-2, and ROUGE-L scores on the SAMSum evaluation and test sets, reflecting its dialogue-summarization quality.
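For reference, ROUGE scores of this kind can be computed with the evaluate library. This is a minimal sketch with placeholder strings, not the model's actual predictions.

```python
# Sketch: computing ROUGE with the `evaluate` library.
# The prediction/reference strings are placeholders, not real model outputs.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["Jeff wants to train a model on SageMaker."],
    references=["Jeff asks how to train a Transformers model on SageMaker."],
)
print(scores)  # dict with rouge1, rouge2, rougeL, rougeLsum
```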
Guide: Running Locally
To run the model locally, follow these steps:
1. Install the Transformers Library:

   Ensure you have the transformers library installed:

   ```bash
   pip install transformers
   ```

2. Load the Model:

   ```python
   from transformers import pipeline

   summarizer = pipeline("summarization", model="philschmid/bart-large-cnn-samsum")
   ```

3. Summarize a Conversation:

   ```python
   conversation = '''Jeff: Can I train a 🤗 Transformers model on Amazon SageMaker?
   Philipp: Sure you can use the new Hugging Face Deep Learning Container.
   Jeff: ok.
   Jeff: and how can I get started?
   Jeff: where can I find documentation?
   Philipp: ok, ok you can find everything here. https://huggingface.co/blog/the-partnership-amazon-sagemaker-and-hugging-face
   '''
   print(summarizer(conversation))
   ```
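The pipeline returns a list of dictionaries, each with a summary_text key. Generation keyword arguments can also be passed through the call to control summary length; the values below are illustrative assumptions, continuing from step 3.

```python
# Continues from step 3 above; the length values are illustrative assumptions.
result = summarizer(conversation, max_length=60, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```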
For optimal performance, consider using cloud GPUs from providers like AWS, GCP, or Azure.
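If a CUDA GPU is available, the pipeline can be pinned to it explicitly. A minimal sketch, assuming a single-GPU machine:

```python
# Sketch: run the summarization pipeline on the first CUDA GPU.
# device=-1 (the default) runs on CPU; device=0 assumes one GPU is present.
from transformers import pipeline

summarizer = pipeline(
    "summarization",
    model="philschmid/bart-large-cnn-samsum",
    device=0,
)
```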
License
This model is released under the MIT License, allowing for wide usage and modification with minimal restrictions.