bert2bert_shared-german-finetuned-summarization

mrm8488

Introduction

The bert2bert_shared-german-finetuned-summarization model is a German-language model fine-tuned for text summarization. It is based on a BERT encoder-decoder architecture and is designed to generate concise summaries of German texts, particularly news articles.

Architecture

The model is built on the BERT base architecture, warm-started from the bert-base-german-cased checkpoint. It uses an encoder-decoder setup in which both components are initialized from BERT and, as the "shared" in the name indicates, the encoder and decoder share their weights, reducing the parameter count while still handling text-to-text generation effectively.
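
A minimal sketch of how such a shared bert2bert model can be warm-started with the transformers EncoderDecoderModel API (this illustrates the construction only; the exact fine-tuning setup used for this checkpoint may differ):

    from transformers import BertTokenizerFast, EncoderDecoderModel

    ckpt = 'bert-base-german-cased'
    tokenizer = BertTokenizerFast.from_pretrained(ckpt)

    # tie_encoder_decoder=True shares the weights between encoder and
    # decoder -- the "shared" in the model name.
    model = EncoderDecoderModel.from_encoder_decoder_pretrained(
        ckpt, ckpt, tie_encoder_decoder=True
    )

    # Generation needs to know how sequences start, end, and are padded.
    model.config.decoder_start_token_id = tokenizer.cls_token_id
    model.config.eos_token_id = tokenizer.sep_token_id
    model.config.pad_token_id = tokenizer.pad_token_id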

Training

This model was fine-tuned on the German portion of MLSUM, a large-scale multilingual summarization dataset of article-summary pairs collected from online newspapers in several languages. Its performance has been evaluated with the ROUGE-2 metric, which measures bigram overlap between generated and reference summaries, showing reasonable precision, recall, and F-measure scores.
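
A hedged sketch of loading the German split of MLSUM and scoring with ROUGE, using the datasets and evaluate libraries (the dataset and metric names are as published on Hugging Face; the author's actual training and evaluation scripts are not reproduced here):

    from datasets import load_dataset
    import evaluate

    # German portion of MLSUM; each example has 'text' and 'summary' fields.
    # Depending on your datasets version, this script-based dataset may
    # require trust_remote_code=True.
    mlsum_de = load_dataset('mlsum', 'de', split='test')

    # ROUGE-2 measures bigram overlap (requires the rouge_score package).
    rouge = evaluate.load('rouge')
    # Toy call: a reference scored against itself gives ROUGE-2 = 1.0;
    # in practice, predictions would hold model-generated summaries.
    scores = rouge.compute(predictions=[mlsum_de[0]['summary']],
                           references=[mlsum_de[0]['summary']])
    print(scores['rouge2'])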

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python and PyTorch installed. Install the transformers library from Hugging Face:

    pip install transformers
    
  2. Set Up the Device: Determine whether a GPU is available:

    import torch
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    
  3. Load Model and Tokenizer:

    from transformers import BertTokenizerFast, EncoderDecoderModel
    ckpt = 'mrm8488/bert2bert_shared-german-finetuned-summarization'
    tokenizer = BertTokenizerFast.from_pretrained(ckpt)
    model = EncoderDecoderModel.from_pretrained(ckpt).to(device)
    
  4. Generate Summary:

    def generate_summary(text):
        # Tokenize, truncating/padding to BERT's 512-token input limit.
        inputs = tokenizer([text], padding="max_length", truncation=True, max_length=512, return_tensors="pt")
        input_ids = inputs.input_ids.to(device)
        attention_mask = inputs.attention_mask.to(device)
        # Autoregressively generate the summary token ids.
        output = model.generate(input_ids, attention_mask=attention_mask)
        # Decode back to text, dropping [CLS]/[SEP]/[PAD] special tokens.
        return tokenizer.decode(output[0], skip_special_tokens=True)
    
    text = "Your text here..."
    summary = generate_summary(text)
    

For better performance, consider using a cloud platform with GPU support, such as Google Colab or AWS EC2.
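
Summary length and quality can also be tuned through standard generate() arguments. The values below are illustrative assumptions, not the settings used by the model author; input_ids and attention_mask are as prepared in step 4:

    # Beam search with illustrative (assumed) values.
    output = model.generate(
        input_ids,
        attention_mask=attention_mask,
        num_beams=4,             # explore several candidate summaries
        max_length=64,           # cap the summary length
        early_stopping=True,     # stop when all beams are finished
        no_repeat_ngram_size=3,  # avoid repeated trigrams
    )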

License

The model was created by Manuel Romero (mrm8488) with support from Narrativa, and was made with care in Spain. For detailed usage rights and permissions, refer to the licensing terms provided in the model repository on Hugging Face.
