bsc_roberta2roberta_shared-spanish-finetuned-mlsum-summarization Model

Introduction

The bsc_roberta2roberta_shared-spanish-finetuned-mlsum-summarization model is a Spanish summarization model fine-tuned on the MLSUM dataset. It is designed to generate summaries of news articles using an encoder-decoder architecture based on RoBERTa.

Architecture

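This model pairs a RoBERTa encoder with a RoBERTa decoder, both initialized from the roberta-base-bne checkpoint from BSC-TeMU. The "shared" in its name suggests that the encoder and decoder share their weights. It is implemented in PyTorch and loads through the EncoderDecoderModel class of the Hugging Face Transformers library for text-to-text generation tasks.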

Training

The model was fine-tuned on MLSUM, a multilingual summarization dataset. MLSUM contains over 1.5 million article-summary pairs in five languages: French, German, Spanish, Russian, and Turkish. The model achieves the following ROUGE scores:

ROUGE-1 F1: 28.83
ROUGE-L F1: 23.15
ROUGE-2 F1: 10.69
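
To make the metrics above concrete, here is a minimal sketch of how ROUGE-N and ROUGE-L F1 can be computed from n-gram overlap and the longest common subsequence. It assumes lowercased whitespace tokenization and returns values between 0 and 1, whereas the scores listed above appear to be on a 0 to 100 scale. The function names and the example sentences are illustrative only; the evaluation setup actually used for this model is not stated here and most likely relied on a dedicated ROUGE package, so results will not match exactly.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter (multiset) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (whitespace tokens, lowercased)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # Counter intersection clips counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    ref, cand = reference.lower().split(), candidate.lower().split()
    lcs = lcs_length(ref, cand)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(cand), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


# Illustrative sentences, not taken from MLSUM.
reference = "el gobierno aprueba la nueva ley de vivienda"
candidate = "el gobierno aprueba una ley de vivienda"
print(rouge_n_f1(reference, candidate, n=1))
print(rouge_n_f1(reference, candidate, n=2))
print(rouge_l_f1(reference, candidate))
```

ROUGE-2 is typically the lowest of the three because matching consecutive word pairs is stricter than matching single words or an in-order subsequence, which is consistent with the ordering of the scores reported above.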

Guide: Running Locally

To run the model locally, follow these steps:

Install Dependencies: Ensure you have the transformers and torch libraries installed.
```
pip install transformers torch
```

Load the Model and Tokenizer:

```python
import torch
from transformers import RobertaTokenizerFast, EncoderDecoderModel

# Use a GPU when one is available, otherwise fall back to the CPU.
device = 'cuda' if torch.cuda.is_available() else 'cpu'
ckpt = 'Narrativa/bsc_roberta2roberta_shared-spanish-finetuned-mlsum-summarization'
# Download (or load from the local cache) the tokenizer and model weights.
tokenizer = RobertaTokenizerFast.from_pretrained(ckpt)
model = EncoderDecoderModel.from_pretrained(ckpt).to(device)
```

Generate a Summary:

```python
def generate_summary(text):
    # Tokenize the article; inputs longer than 512 tokens are truncated.
    inputs = tokenizer([text], padding="max_length", truncation=True, max_length=512, return_tensors="pt")
    input_ids = inputs.input_ids.to(device)
    attention_mask = inputs.attention_mask.to(device)
    # Generate summary token IDs, then decode them back into text.
    output = model.generate(input_ids, attention_mask=attention_mask)
    return tokenizer.decode(output[0], skip_special_tokens=True)

text = "Your text here..."
print(generate_summary(text))
```

Use Cloud GPUs: If no local GPU is available, cloud GPU instances from providers such as AWS EC2, Google Cloud, or Azure can speed up inference.
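
Because generate_summary truncates its input to 512 tokens, the end of a long article never reaches the model. One possible workaround, sketched below, is to split the article into sentence-aligned chunks and summarize each chunk separately. The helper split_into_chunks and its max_words parameter are hypothetical and not part of the model or the Transformers library. The word budget is only a rough stand-in for the token limit, since one word can map to several tokens.

```python
import re


def split_into_chunks(text, max_words=300):
    """Greedily pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept as its own chunk,
    so the tokenizer's truncation still acts as a safety net.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Start a new chunk if adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


# Illustrative input with a deliberately small budget.
article = "Uno dos tres. Cuatro cinco seis. Siete ocho."
print(split_into_chunks(article, max_words=6))
```

Each chunk could then be passed to generate_summary from the step above, for example with a list comprehension over split_into_chunks(article). Summaries of separate chunks are produced independently, so the combined result may repeat information or read less smoothly than a single summary.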

License

The usage of this model is subject to the licensing terms provided by the creators, Narrativa. Ensure compliance with these terms when integrating the model into your applications or services.
