bert2bert_shared german finetuned summarization
mrm8488
Introduction
The BERT2BERT_SHARED-GERMAN-FINETUNED-SUMMARIZATION model is a German language model fine-tuned for text summarization tasks. It is based on a BERT encoder-decoder architecture and is designed to generate concise summaries of German texts, particularly in the context of news articles.
Architecture
The model uses the BERT base architecture, specifically the bert-base-german-cased checkpoint. It employs an encoder-decoder setup in which both components are derived from BERT, allowing the model to handle text-to-text generation tasks.
Training
This model was fine-tuned on the MLSUM dataset, a large-scale multilingual summarization dataset of article-summary pairs from online newspapers in multiple languages, including German. Its performance was evaluated with the ROUGE-2 metric (bigram overlap between generated and reference summaries), reported as precision, recall, and F-measure, with reasonable scores on each.
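To make the three ROUGE-2 figures concrete, the following is a minimal sketch of how bigram precision, recall, and F-measure can be computed for one summary pair. It is not the evaluation script used for this model: the helper names `bigrams` and `rouge2` are hypothetical, and the whitespace tokenization and lowercasing are simplifying assumptions (standard ROUGE tooling may tokenize and stem differently).

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of adjacent lowercase word pairs (whitespace tokens)."""
    tokens = text.lower().split()
    return Counter(zip(tokens, tokens[1:]))


def rouge2(candidate, reference):
    """Return (precision, recall, f_measure) over overlapping bigrams."""
    cand, ref = bigrams(candidate), bigrams(reference)
    # Counter intersection keeps the minimum count per bigram (clipped matches).
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    if overlap == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


if __name__ == "__main__":
    generated = "die katze sitzt auf der matte"
    reference = "die katze liegt auf der matte"
    print(rouge2(generated, reference))
```

Precision divides the matching bigrams by the number of bigrams in the generated summary, recall divides them by the number in the reference, and the F-measure is their harmonic mean.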
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies: Ensure you have Python and PyTorch installed. Install the transformers library from Hugging Face:
    pip install transformers
- Setup Device: Determine if a GPU is available:
    import torch
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
- Load Model and Tokenizer:
    from transformers import BertTokenizerFast, EncoderDecoderModel
    ckpt = 'mrm8488/bert2bert_shared-german-finetuned-summarization'
    tokenizer = BertTokenizerFast.from_pretrained(ckpt)
    model = EncoderDecoderModel.from_pretrained(ckpt).to(device)
- Generate Summary:
    def generate_summary(text):
        # Tokenize the input; anything beyond 512 tokens is truncated
        inputs = tokenizer([text], padding="max_length", truncation=True, max_length=512, return_tensors="pt")
        input_ids = inputs.input_ids.to(device)
        attention_mask = inputs.attention_mask.to(device)
        # Generate summary token IDs and decode them back to text
        output = model.generate(input_ids, attention_mask=attention_mask)
        return tokenizer.decode(output[0], skip_special_tokens=True)

    text = "Your text here..."
    summary = generate_summary(text)
For better performance, consider using cloud computing platforms with GPU support such as Google Colab or AWS EC2.
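Because the tokenizer call in the guide truncates input at 512 tokens, text beyond that limit is silently dropped. One possible workaround, shown here as a rough sketch and not as part of the original model card, is to split a long article into sentence-aligned chunks and summarize each one separately. The helper name `split_into_chunks` and the 300-word budget are assumptions; word count is only a loose proxy for the tokenizer's subword count.

```python
import re


def split_into_chunks(text, max_words=300):
    """Group sentences into chunks of at most max_words words.

    A single sentence longer than max_words is kept as its own chunk
    and may still be truncated by the tokenizer.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current, count = [], [], 0
    for sentence in sentences:
        if not sentence:
            continue
        n = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    sample = "Eins zwei drei. Vier fünf. Sechs sieben acht neun."
    print(split_into_chunks(sample, max_words=5))
```

Each returned chunk could then be passed to generate_summary from the guide. Whether summaries of separate chunks combine into a good overall summary is not something the source evaluates.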
License
The model is created by Manuel Romero and is supported by Narrativa. It is made with care in Spain, and users should refer to the specific licensing terms provided in the model repository on Hugging Face for detailed usage rights and permissions.