t5 finetuned test

osanseviero

Introduction

The Wikihow T5-small model is a version of the T5-small architecture fine-tuned for text summarization on the Wikihow dataset. It is designed to condense long, instructional text into concise summaries.
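
For a quick sanity check, the model can be exercised through the Transformers pipeline API. This is a minimal sketch; the input string is a placeholder, not Wikihow data:

    from transformers import pipeline

    # Load the fine-tuned checkpoint through the high-level summarization pipeline.
    summarizer = pipeline("summarization", model="deep-learning-analytics/wikihow-t5-small")

    # Placeholder input; substitute any long instructional passage.
    article = "Fold the paper in half lengthwise. Crease it firmly, then unfold it again..."
    print(summarizer(article, max_length=150, num_beams=2)[0]["summary_text"])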

Architecture

This model uses the T5-small architecture, a transformer-based sequence-to-sequence (seq2seq) model, applied here to converting lengthy informational texts into shorter summaries. It is implemented in PyTorch and loaded through the Hugging Face transformers library.
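
The seq2seq dimensions can be inspected directly from the checkpoint's configuration. A short sketch; the values in the comment follow the standard T5-small configuration:

    from transformers import AutoConfig, AutoModelForSeq2SeqLM

    config = AutoConfig.from_pretrained("deep-learning-analytics/wikihow-t5-small")
    # T5-small: 6 encoder and 6 decoder layers, hidden size 512, 8 attention heads.
    print(config.num_layers, config.num_decoder_layers, config.d_model, config.num_heads)

    model = AutoModelForSeq2SeqLM.from_pretrained("deep-learning-analytics/wikihow-t5-small")
    print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # roughly 60M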

Training

The model was trained for 3 epochs on the Wikihow dataset, with a batch size of 16 and a learning rate of 3e-4. The maximum input length was set to 512 tokens and the maximum output length to 150 tokens. Training achieved a ROUGE-1 score of 31.2 and a ROUGE-L score of 24.5, indicating solid summarization quality for a model of this size. Further details on the training process are documented in an accompanying blog post.
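
The original training script is not included here. The sketch below shows one way the stated hyperparameters could map onto the Hugging Face Seq2SeqTrainer API; the Trainer-based setup is an assumption, and the dataset records are placeholders standing in for the full Wikihow text/headline pairs:

    from datasets import Dataset
    from transformers import (
        AutoTokenizer,
        AutoModelForSeq2SeqLM,
        DataCollatorForSeq2Seq,
        Seq2SeqTrainingArguments,
        Seq2SeqTrainer,
    )

    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

    # Placeholder records standing in for the Wikihow (text, headline) pairs.
    raw = Dataset.from_dict({
        "text": ["Fold the paper in half. Crease it firmly. Unfold it again."],
        "headline": ["Fold and crease the paper."],
    })

    def preprocess(batch):
        # Max lengths as stated above: 512 input tokens, 150 output tokens.
        model_inputs = tokenizer(batch["text"], max_length=512, truncation=True)
        labels = tokenizer(text_target=batch["headline"], max_length=150, truncation=True)
        model_inputs["labels"] = labels["input_ids"]
        return model_inputs

    train_ds = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

    args = Seq2SeqTrainingArguments(
        output_dir="wikihow-t5-small",
        num_train_epochs=3,              # hyperparameters as stated above
        per_device_train_batch_size=16,
        learning_rate=3e-4,
    )

    Seq2SeqTrainer(
        model=model,
        args=args,
        train_dataset=train_ds,
        data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    ).train()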

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install Dependencies: Ensure Python is installed, then install PyTorch and the transformers library.

    pip install torch transformers
    
  2. Load Model and Tokenizer:

    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
    import torch

    # AutoModelForSeq2SeqLM replaces the deprecated AutoModelWithLMHead
    # for encoder-decoder models such as T5.
    tokenizer = AutoTokenizer.from_pretrained("deep-learning-analytics/wikihow-t5-small")
    model = AutoModelForSeq2SeqLM.from_pretrained("deep-learning-analytics/wikihow-t5-small")
    
  3. Prepare for GPU: If a GPU is available, configure the model to use it.

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    model = model.to(device)
    
  4. Generate Summary:

    text = """Your input text here..."""
    preprocess_text = text.strip()
    # Truncate to the model's 512-token maximum input length.
    tokenized_text = tokenizer.encode(
        preprocess_text, return_tensors="pt", max_length=512, truncation=True
    ).to(device)
    
    summary_ids = model.generate(
        tokenized_text,
        max_length=150,          # matches the training-time output limit
        num_beams=2,             # light beam search
        repetition_penalty=2.5,  # discourage repeated phrases
        length_penalty=1.0,
        early_stopping=True
    )
    
    output = tokenizer.decode(summary_ids[0], skip_special_tokens=True)
    print("Summarized text:", output)
    
  5. Suggested Cloud GPUs: If no local GPU is available, services such as AWS, Google Cloud, or Paperspace provide GPU instances for faster inference.
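
To score generated summaries against references, ROUGE can be computed with the evaluate library. A minimal sketch with placeholder strings; the exact setup behind the published ROUGE numbers is not documented here:

    import evaluate

    rouge = evaluate.load("rouge")
    # Placeholder strings; in practice, compare generated summaries against
    # the Wikihow reference headlines.
    scores = rouge.compute(
        predictions=["fold the paper and crease it firmly"],
        references=["fold and crease the paper"],
    )
    print(scores["rouge1"], scores["rougeL"])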

License

The model and accompanying code are available under the Apache License 2.0, which permits both personal and commercial use provided the original authors are credited and modifications are noted.
