bsc_roberta2roberta_shared spanish finetuned mlsum summarization
Narrativa
Introduction
The bsc_roberta2roberta_shared-spanish-finetuned-mlsum-summarization model is a Spanish summarization model fine-tuned on the MLSUM dataset. It generates summaries of news articles using an encoder-decoder architecture based on RoBERTa.
Architecture
This model uses a RoBERTa encoder-decoder setup initialized from the roberta-base-bne checkpoint from BSC-TeMU. It is implemented in PyTorch and loaded through the Hugging Face Transformers library, where it supports text-to-text generation tasks.
Training
The model was fine-tuned on MLSUM, a multilingual summarization dataset containing over 1.5 million article-summary pairs in five languages: French, German, Spanish, Russian, and Turkish. The model achieves the following ROUGE scores:
- ROUGE-1 F1: 28.83
- ROUGE-2 F1: 10.69
- ROUGE-L F1: 23.15
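To make the metrics concrete, here is a minimal sketch of how ROUGE-1, ROUGE-2, and ROUGE-L F1 can be computed for a single reference/candidate pair. It assumes plain lowercase whitespace tokenization and returns values in the 0 to 1 range (the scores above are on a 0 to 100 scale). The function names are illustrative, and this is not the evaluation script used for the reported numbers, so it should not be expected to reproduce them exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, ref_total, cand_total):
    """Harmonic mean of precision (overlap/cand) and recall (overlap/ref)."""
    if overlap == 0 or ref_total == 0 or cand_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(reference, candidate, n):
    """ROUGE-N F1 based on clipped n-gram overlap."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # Counter intersection clips counts
    return f1(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    return f1(lcs_length(ref, cand), len(ref), len(cand))


reference = "el gobierno aprueba la nueva ley"
candidate = "el gobierno aprueba una ley"
print(rouge_n_f1(reference, candidate, 1))
print(rouge_n_f1(reference, candidate, 2))
print(rouge_l_f1(reference, candidate))
```

ROUGE-2 is typically the lowest of the three because matching consecutive word pairs is stricter than matching single words or an in-order subsequence, which is consistent with the ordering of the reported scores.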
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure the transformers and torch libraries are installed.

      pip install transformers torch
- Load the Model and Tokenizer:

      import torch
      from transformers import RobertaTokenizerFast, EncoderDecoderModel

      # Use a GPU when one is available, otherwise fall back to the CPU
      device = 'cuda' if torch.cuda.is_available() else 'cpu'
      ckpt = 'Narrativa/bsc_roberta2roberta_shared-spanish-finetuned-mlsum-summarization'
      tokenizer = RobertaTokenizerFast.from_pretrained(ckpt)
      model = EncoderDecoderModel.from_pretrained(ckpt).to(device)
- Generate a Summary:

      def generate_summary(text):
          # Tokenize the article, truncating or padding it to 512 tokens
          inputs = tokenizer([text], padding="max_length", truncation=True,
                             max_length=512, return_tensors="pt")
          input_ids = inputs.input_ids.to(device)
          attention_mask = inputs.attention_mask.to(device)
          # Generate summary token IDs and decode them back to text
          output = model.generate(input_ids, attention_mask=attention_mask)
          return tokenizer.decode(output[0], skip_special_tokens=True)

      text = "Your text here..."
      print(generate_summary(text))
- Use Cloud GPUs: For faster inference, consider running the model on a cloud GPU service such as AWS EC2, Google Cloud, or Azure.
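When summarizing many articles, the steps above are usually applied in batches so that each call to the model handles several inputs at once. The following is a minimal sketch of that pattern. The names `batched`, `summarize_all`, and `summarize_batch` are hypothetical and not part of the model card or of Transformers, and the stand-in summarizer just returns the first word of each text so the example runs without downloading the model.

```python
from typing import Callable, Iterator, List


def batched(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def summarize_all(
    texts: List[str],
    summarize_batch: Callable[[List[str]], List[str]],
    batch_size: int = 8,
) -> List[str]:
    """Apply a batch summarizer to all texts, preserving input order."""
    summaries: List[str] = []
    for batch in batched(texts, batch_size):
        summaries.extend(summarize_batch(batch))
    return summaries


# Stand-in summarizer (assumption): keeps only the first word of each text.
def first_word(batch: List[str]) -> List[str]:
    return [text.split()[0] for text in batch]


print(summarize_all(["uno dos tres", "cuatro cinco", "seis"], first_word, batch_size=2))
```

In practice, `summarize_batch` would wrap the tokenizer and `model.generate` calls from the guide, passing the whole batch to the tokenizer and decoding the outputs with `tokenizer.batch_decode`. A suitable batch size depends on available memory.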
License
The usage of this model is subject to the licensing terms provided by the creators, Narrativa. Ensure compliance with these terms when integrating the model into your applications or services.