LED-Base Book Summary
by pszemraj

Introduction
The LED-Base-Book-Summary model by pszemraj is a summarization tool designed to condense long, technical documents efficiently. It is based on the Longformer Encoder-Decoder (LED) architecture and is fine-tuned to handle extensive technical, academic, and narrative content.
Architecture
The model is built upon allenai/led-base-16384 and can process up to 16,384 tokens per input. It is especially suitable for summarizing long narratives, articles, papers, textbooks, and other documents, providing concise and insightful summaries.
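Since the 16,384-token window is a hard limit, longer documents need to be truncated or split before a single summarization pass. The sketch below illustrates the idea; whitespace splitting stands in for the real LED tokenizer, so counts are approximate rather than exact token counts:

```python
# Sketch: documents longer than LED's 16,384-token window must be
# truncated or split before summarization. Whitespace splitting is a
# rough stand-in for the real tokenizer, so counts are approximate.
MAX_TOKENS = 16_384

def fits_in_window(text: str, max_tokens: int = MAX_TOKENS) -> bool:
    """Rough check of whether `text` fits in a single LED input."""
    return len(text.split()) <= max_tokens

def split_into_windows(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Split a long document into word-count-bounded chunks."""
    words = text.split()
    return [
        " ".join(words[i : i + max_tokens])
        for i in range(0, len(words), max_tokens)
    ]

doc = "word " * 20_000  # roughly 20k "tokens", too long for one pass
print(fits_in_window(doc))           # False
print(len(split_into_windows(doc)))  # 2
```

In practice you would count tokens with the model's own tokenizer, but the chunking logic is the same.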
Training
This model was trained on the BookSum dataset provided by Salesforce, which is licensed under the BSD-3-Clause license. Training ran for 16 epochs at a low learning rate to fine-tune the base model gently. The model checkpoint is available at pszemraj/led-base-16384-finetuned-booksum.
Guide: Running Locally
To run the model locally, follow these steps:
- Install Dependencies:
  - Ensure you have transformers and torch installed: pip install transformers torch
  - Optionally, install textsum for simplified usage: pip install textsum
- Create the Pipeline:

```python
import torch
from transformers import pipeline

hf_name = "pszemraj/led-base-book-summary"

summarizer = pipeline(
    "summarization",
    hf_name,
    device=0 if torch.cuda.is_available() else -1,
)
```
- Summarize Text:

```python
wall_of_text = "your words here"

result = summarizer(
    wall_of_text,
    min_length=8,
    max_length=256,
    no_repeat_ngram_size=3,
    encoder_no_repeat_ngram_size=3,
    repetition_penalty=3.5,
    num_beams=4,
    do_sample=False,
    early_stopping=True,
)
print(result[0]["summary_text"])
```
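The no_repeat_ngram_size=3 setting blocks the decoder from emitting any 3-token sequence it has already produced, which curbs repetitive output. The minimal sketch below illustrates the constraint itself; it is a simplified stand-in, not the transformers implementation:

```python
# Illustration of what no_repeat_ngram_size=3 enforces: a generated
# sequence never contains the same 3-token n-gram twice. Simplified
# stand-in, not the transformers implementation.
def has_repeated_ngram(tokens: list[str], n: int = 3) -> bool:
    seen = set()
    for i in range(len(tokens) - n + 1):
        ngram = tuple(tokens[i : i + n])
        if ngram in seen:
            return True
        seen.add(ngram)
    return False

ok = "the story follows a young hero on a long journey".split()
bad = "the end of the end of the end".split()
print(has_repeated_ngram(ok))   # False
print(has_repeated_ngram(bad))  # True
```

During beam search, transformers applies this check per candidate token rather than after the fact, masking out tokens that would complete a repeated n-gram.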
- Using TextSum:

```python
from textsum.summarize import Summarizer

model_name = "pszemraj/led-base-book-summary"
summarizer = Summarizer(
    model_name_or_path=model_name,
    token_batch_length=4096,
)

long_string = "This is a long string of text that will be summarized."
out_str = summarizer.summarize_string(long_string)
print(f"summary: {out_str}")
```
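textsum's token_batch_length=4096 implies a chunk-and-join strategy: split the input into token batches, summarize each batch, and concatenate the partial summaries. A minimal sketch under that assumption, with a placeholder standing in for the real model call and word counts standing in for token counts:

```python
# Sketch of the chunk-and-join strategy behind token_batch_length:
# split long input into batches, summarize each, join the results.
# `fake_summarize` is a placeholder, not a real model call.
def fake_summarize(chunk: str) -> str:
    return chunk.split(".")[0] + "."  # keep the first sentence as a stub

def summarize_long(text: str, batch_words: int = 4096) -> str:
    words = text.split()
    chunks = [
        " ".join(words[i : i + batch_words])
        for i in range(0, len(words), batch_words)
    ]
    return " ".join(fake_summarize(c) for c in chunks)

print(summarize_long("First sentence. Second sentence."))
```

The real library handles tokenization, overlap, and generation parameters for you; this only shows the shape of the batching loop.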
- Cloud GPUs: For better performance, consider running the model on cloud platforms with GPU support, such as Google Colab or AWS.
License
The model is dual-licensed under the Apache-2.0 and BSD-3-Clause licenses, offering flexibility for both personal and commercial use.