led-base-book-summary

pszemraj

Introduction

The led-base-book-summary model by pszemraj is a summarization model designed to condense long, technical documents efficiently. It is based on the Longformer Encoder-Decoder (LED) architecture and fine-tuned to handle extensive technical, academic, and narrative content.

Architecture

The model is built on allenai/led-base-16384 and can process input sequences of up to 16,384 tokens. This makes it especially suitable for summarizing long narratives, articles, papers, textbooks, and other lengthy documents, producing concise and insightful summaries.
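Because the encoder window is finite, inputs longer than 16,384 tokens must be split before summarization. A minimal chunking sketch, using whitespace-separated words as a rough stand-in for real tokens (an exact count would come from the model's tokenizer; the budget value here simply mirrors the LED window):

```python
def chunk_by_token_budget(text, budget=16384):
    """Split text into chunks of at most `budget` whitespace-separated words.

    Word count is only a proxy: real usage should count tokens with the
    LED tokenizer, since one word can map to several subword tokens.
    """
    words = text.split()
    return [
        " ".join(words[i:i + budget])
        for i in range(0, len(words), budget)
    ]


# 40,000 words split into three chunks of at most 16,384 words each
chunks = chunk_by_token_budget("lorem " * 40000)
```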

Training

The model was fine-tuned for 16 epochs at a low learning rate on the BookSum dataset from Salesforce, which is distributed under the BSD-3-Clause license. The resulting checkpoint is available at pszemraj/led-base-16384-finetuned-booksum.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies:

    • Ensure transformers and torch are installed:
      pip install transformers torch
    • Optionally, install textsum for simplified usage:
      pip install textsum
      
  2. Create the Pipeline:

    import torch
    from transformers import pipeline
    
    hf_name = "pszemraj/led-base-book-summary"
    
    summarizer = pipeline(
        "summarization",
        model=hf_name,
        device=0 if torch.cuda.is_available() else -1,  # GPU if available
    )
    
  3. Summarize Text:

    wall_of_text = "your words here"
    
    result = summarizer(
        wall_of_text,
        min_length=8,
        max_length=256,
        no_repeat_ngram_size=3,
        encoder_no_repeat_ngram_size=3,
        repetition_penalty=3.5,
        num_beams=4,
        do_sample=False,
        early_stopping=True,
    )
    print(result[0]["summary_text"])
    
  4. Using TextSum:

    from textsum.summarize import Summarizer
    
    model_name = "pszemraj/led-base-book-summary"
    summarizer = Summarizer(
        model_name_or_path=model_name,
        token_batch_length=4096,
    )
    long_string = "This is a long string of text that will be summarized."
    out_str = summarizer.summarize_string(long_string)
    print(f"summary: {out_str}")
    
  5. Cloud GPUs: For better performance, consider running the model on cloud platforms providing GPU support, such as Google Colab or AWS.
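The chunk-and-join approach that textsum applies via token_batch_length can be sketched in plain Python. This is an illustrative pattern, not textsum's actual implementation; summarize stands in for any summarization callable, for example the pipeline from step 3:

```python
def summarize_long(text, summarize, max_words=16384):
    """Chunk-and-join summarization for inputs beyond the model window.

    `summarize` is any callable mapping a text chunk to a summary string,
    e.g. lambda t: summarizer(t, max_length=256)[0]["summary_text"].
    Word-based splitting is a rough proxy; counting tokens with the LED
    tokenizer would be exact.
    """
    words = text.split()
    chunks = [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]
    # Summarize each chunk independently, then join the partial summaries.
    return " ".join(summarize(chunk) for chunk in chunks)
```

A second pass over the joined output can tighten the result further when the concatenated summaries are themselves long.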

License

The model is dual-licensed under the Apache-2.0 and BSD-3-Clause licenses, offering flexibility for both personal and commercial use.
