T5-Base-Finetuned-Summarize-News

mrm8488

Introduction

The T5-Base-Finetuned-Summarize-News model is a fine-tuned version of Google's T5 model, adapted specifically for summarizing news articles. T5 uses a text-to-text framework that casts a wide range of natural language processing tasks as string-to-string transformations; this variant has been fine-tuned on the "News Summary" dataset to generate concise summaries of news articles.

Architecture

The T5 model, introduced in the paper "Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer," explores transfer learning techniques in NLP. This model transforms every language problem into a text-to-text format, allowing a unified approach to tasks like summarization, question answering, and text classification. By leveraging the "Colossal Clean Crawled Corpus" for pre-training, the T5 model achieves state-of-the-art results across various benchmarks.
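
To make the text-to-text idea concrete, the sketch below shows how a few tasks are framed as plain input/output strings in the original T5 setup. The task prefixes follow the T5 paper; the article snippet is a made-up placeholder.

    # Every task becomes "input text -> output text"; a short prefix tells
    # T5 which task to perform (prefixes as used in the T5 paper)
    examples = {
        "summarization": "summarize: The government on Tuesday announced a new policy ...",
        "translation": "translate English to German: The house is wonderful.",
        "classification": "cola sentence: The course is jumping well.",
    }
    # A model fine-tuned for summarization then learns the mapping
    # "summarize: <article>" -> "<summary>" with no task-specific heads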

Training

The T5-Base-Finetuned-Summarize-News model was trained on a dataset compiled from various news sources, including Inshorts, Hindu, Indian Times, and Guardian. The dataset contains 4515 examples with metadata such as author name, headlines, article URLs, short texts, and complete articles. The model was trained for six epochs using a modified version of a training script provided by Abhishek Kumar Mishra.
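
The exact training script is not reproduced here, but a minimal sketch of this style of sequence-to-sequence fine-tuning loop is shown below. The article/summary pair, learning rate, and sequence lengths are illustrative assumptions, not the values used for the released checkpoint.

    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    # Start from the pre-trained T5 base checkpoint
    tokenizer = AutoTokenizer.from_pretrained("t5-base")
    model = AutoModelForSeq2SeqLM.from_pretrained("t5-base")
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # illustrative value

    # One toy article/summary pair; the real dataset has ~4,515 such pairs
    article = "summarize: The stock market rallied on Tuesday after ..."
    reference = "Markets rallied on Tuesday."

    inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
    labels = tokenizer(reference, return_tensors="pt", truncation=True, max_length=150).input_ids

    model.train()
    for epoch in range(6):  # six epochs, as reported above
        loss = model(**inputs, labels=labels).loss  # teacher-forced cross-entropy
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()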

Guide: Running Locally

To run the T5-Base-Finetuned-Summarize-News model locally, follow these steps:

  1. Install Dependencies: Ensure you have Python and the necessary libraries installed. The code below needs the transformers library together with PyTorch, and the T5 tokenizer typically also requires sentencepiece:

    pip install transformers torch sentencepiece
    
  2. Load the Model and Tokenizer: Use the transformers library to load the model and tokenizer. (AutoModelWithLMHead has been deprecated; AutoModelForSeq2SeqLM is the current class for T5-style sequence-to-sequence models.)

    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")
    model = AutoModelForSeq2SeqLM.from_pretrained("mrm8488/t5-base-finetuned-summarize-news")
    
  3. Summarize Text: Define a function to generate summaries (a usage example follows this list):

    def summarize(text, max_length=150):
        # Tokenize the article, truncating to T5's 512-token input limit,
        # and place the tensors on the same device as the model
        input_ids = tokenizer.encode(text, return_tensors="pt", add_special_tokens=True, truncation=True, max_length=512).to(model.device)
        # Beam search with a repetition penalty to discourage repeated phrases
        generated_ids = model.generate(input_ids=input_ids, num_beams=2, max_length=max_length, repetition_penalty=2.5, length_penalty=1.0, early_stopping=True)
        preds = [tokenizer.decode(g, skip_special_tokens=True, clean_up_tokenization_spaces=True) for g in generated_ids]
        return preds[0]
    
  4. Use a Cloud GPU: For efficient processing, especially with large datasets or longer texts, consider using a cloud GPU service such as Google Colab or AWS.
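
To put the pieces together, here is a minimal end-to-end sketch. The article text is a made-up placeholder, and the optional GPU move implements step 4; the summarize function above already sends its inputs to the model's device.

    import torch

    # Use a GPU if one is available (step 4); otherwise fall back to CPU
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    article = "The city council on Monday approved a new transit plan that ..."  # placeholder
    print(summarize(article, max_length=80))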

License

The model and associated code are distributed under a license that permits use and redistribution with attribution to the original creator, Manuel Romero (mrm8488), and the contributors. Always review the specific licensing terms provided in the repository or documentation.
