Financial Summarization PEGASUS

human-centered-summarization

Introduction

The Financial Summarization PEGASUS model, developed by Human-Centered Summarization, is designed for summarizing financial news. It is based on the PEGASUS architecture and fine-tuned on a dataset of Bloomberg articles covering topics such as stocks, markets, currencies, and cryptocurrencies. An advanced version of the model, with improved ROUGE scores, is available on Rapid API.

Architecture

The model utilizes the PEGASUS architecture, specifically the variant fine-tuned on the Extreme Summarization (XSum) dataset. PEGASUS is known for its pre-training with extracted gap-sentences for abstractive summarization. The original PEGASUS model was introduced by Jingqing Zhang et al. in their paper "PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization."
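To make the gap-sentence idea more concrete, here is a rough, simplified sketch of how one pre-training example could be built: the sentence that overlaps most with the rest of the document is removed and becomes the target the model must generate. The helper names (`unigram_f1`, `make_gap_sentence_example`), the naive sentence splitter, and the whitespace tokenization are assumptions made for illustration; this is not the original PEGASUS pre-training code.

```python
import re
from collections import Counter

MASK = "<mask_1>"  # illustrative placeholder for a masked sentence


def unigram_f1(a_tokens, b_tokens):
    """F1 of the unigram overlap between two token lists."""
    overlap = sum((Counter(a_tokens) & Counter(b_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(a_tokens)
    recall = overlap / len(b_tokens)
    return 2 * precision * recall / (precision + recall)


def make_gap_sentence_example(document, num_gaps=1):
    """Mask the sentences most similar to the rest of the document.

    Returns (masked_document, target), where target holds the removed sentences.
    """
    # Naive sentence split on ., ! or ? followed by whitespace (an assumption).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]
    tokenized = [s.lower().split() for s in sentences]

    scores = []
    for i, tokens in enumerate(tokenized):
        rest = [t for j, other in enumerate(tokenized) if j != i for t in other]
        scores.append(unigram_f1(tokens, rest))

    # Indices of the top-scoring sentences, restored to document order.
    ranked = sorted(range(len(sentences)), key=lambda i: -scores[i])
    chosen = sorted(ranked[:num_gaps])

    masked = " ".join(MASK if i in chosen else s for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in chosen)
    return masked, target


document = (
    "Stocks rose on Monday. "
    "The rally in stocks was led by tech stocks on Monday. "
    "Oil was flat."
)
masked, target = make_gap_sentence_example(document)
print("Input: ", masked)
print("Target:", target)
```

During pre-training the model sees the masked document and learns to produce the removed sentence, which resembles writing a summary of the surrounding text.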

Training

The model is fine-tuned on a novel financial news dataset consisting of 2,000 Bloomberg articles. Evaluation after fine-tuning shows improved ROUGE scores: ROUGE-1 23.55, ROUGE-2 6.99, ROUGE-L 18.14, and ROUGE-LSUM 21.36.
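For readers unfamiliar with the metric, the sketch below shows roughly what ROUGE-1 and ROUGE-2 measure: the n-gram overlap between a generated summary and a reference summary, expressed as an F1 score. It is a simplified approximation (lowercased whitespace tokens, no stemming), not the scorer used for the numbers above, and the two example sentences are made up. Multiplying by 100 assumes the reported scores are on a 0-100 scale.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(reference, candidate, n=1):
    """Simplified ROUGE-N F1: lowercase, whitespace tokens, no stemming."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and model output, for illustration only.
reference = "bitcoin falls as regulators tighten crypto rules"
candidate = "bitcoin falls after regulators tighten rules"

rouge1 = rouge_n(reference, candidate, n=1)
rouge2 = rouge_n(reference, candidate, n=2)
print(f"ROUGE-1 F1: {100 * rouge1:.2f}")
print(f"ROUGE-2 F1: {100 * rouge2:.2f}")
```

ROUGE-L and ROUGE-LSUM are based on the longest common subsequence rather than fixed-size n-grams, so they are not covered by this sketch.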

Guide: Running Locally

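  1. Setup Environment: Install the transformers library from Hugging Face, along with PyTorch and sentencepiece, which the Pegasus tokenizer and the snippet below rely on.

    pip install transformers sentencepiece torch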
    
  2. Load the Model: Use the following code snippet to load and use the model for summarization.

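    from transformers import PegasusTokenizer, PegasusForConditionalGeneration
    
    # Download (or load from the local cache) the fine-tuned tokenizer and model
    model_name = "human-centered-summarization/financial-summarization-pegasus"
    tokenizer = PegasusTokenizer.from_pretrained(model_name)
    model = PegasusForConditionalGeneration.from_pretrained(model_name)
    
    text_to_summarize = "Your financial news text here."
    # Tokenize to PyTorch tensors; truncation guards against over-long articles
    input_ids = tokenizer(text_to_summarize, return_tensors="pt", truncation=True).input_ids
    
    # Beam search with 5 beams, capped at 32 generated tokens
    output = model.generate(input_ids, max_length=32, num_beams=5, early_stopping=True)
    # Convert the best beam back to text, dropping special tokens
    print(tokenizer.decode(output[0], skip_special_tokens=True))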
    
  3. Cloud GPUs: For efficient computation, consider using cloud GPU services like AWS EC2, Google Cloud, or Azure.
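
Whether the code runs on a cloud GPU or a laptop, it helps to pick the device at runtime instead of hard-coding it. The sketch below is one possible way to do that; `pick_device` is a hypothetical helper name, and it falls back to the CPU when PyTorch is not installed or no CUDA GPU is visible.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so there is nothing to move to a GPU
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

With the model and inputs from step 2, calling `model.to(device)` and `input_ids.to(device)` before `model.generate(...)` should place the computation on the selected device.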

License

The model card does not specify a license. For licensing details, please refer to the Hugging Face model page or contact the developers.
