rut5_base_sum_gazeta

IlyaGusev

Introduction

The RUT5_BASE_SUM_GAZETA model is designed for abstractive summarization of Russian text. It is based on the rut5-base model, tailored to generate concise summaries of input documents. The model is built using the Transformers library and is compatible with PyTorch.

Architecture

The model architecture is based on the T5 (Text-to-Text Transfer Transformer) architecture, which is a versatile model suitable for various text generation tasks, including summarization. The model specifically uses the rut5-base configuration, optimized for Russian language input.

Training

The model was trained on the Gazeta dataset, a corpus of Russian news articles paired with summaries for training and evaluating summarization systems. Training used the train.py script together with the t5_training_config.json configuration file. The model was evaluated with standard summarization metrics such as ROUGE, BLEU, and METEOR, with results indicating competitive performance on Russian abstractive summarization.
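
As a rough illustration of how ROUGE-style evaluation works, the sketch below computes a simplified ROUGE-1 (unigram overlap) precision, recall, and F1 between a reference and a candidate summary. This is a stand-in for intuition only; real evaluations use a dedicated metrics library with stemming and multi-reference support:

    from collections import Counter

    def rouge1(reference: str, candidate: str) -> dict:
        """Simplified ROUGE-1: unigram overlap between reference and candidate."""
        ref_counts = Counter(reference.lower().split())
        cand_counts = Counter(candidate.lower().split())
        # Each shared token counts up to its minimum frequency in the two texts
        overlap = sum((ref_counts & cand_counts).values())
        precision = overlap / max(sum(cand_counts.values()), 1)
        recall = overlap / max(sum(ref_counts.values()), 1)
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        return {"precision": precision, "recall": recall, "f1": f1}

    scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
    print(scores)  # 5 of 6 unigrams overlap: precision = recall = f1 ≈ 0.833
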

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install the necessary libraries:

    pip install transformers torch datasets
    
  2. Load the model and tokenizer:

    from transformers import AutoTokenizer, T5ForConditionalGeneration
    
    # Download the tokenizer and model weights from the Hugging Face Hub
    model_name = "IlyaGusev/rut5_base_sum_gazeta"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name)
    
  3. Prepare the input text and generate a summary:

    article_text = "..."  # Your text here
    
    # Tokenize the article; inputs longer than 600 tokens are truncated
    input_ids = tokenizer(
        [article_text],
        max_length=600,
        add_special_tokens=True,
        padding="max_length",
        truncation=True,
        return_tensors="pt"
    )["input_ids"]
    
    # Generate the summary; no_repeat_ngram_size=4 blocks repeated 4-grams
    output_ids = model.generate(
        input_ids=input_ids,
        no_repeat_ngram_size=4
    )[0]
    
    # Decode the generated token IDs back into text
    summary = tokenizer.decode(output_ids, skip_special_tokens=True)
    print(summary)
    
  4. For large-scale summarization tasks, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure for improved performance.
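
Because the tokenizer truncates inputs beyond 600 tokens, very long articles lose content past the cutoff. One simple workaround (a sketch, not part of the model card; it uses whitespace word counts as a rough proxy for subword tokens) is to split the article into overlapping chunks, summarize each chunk separately, and join the results:

    def chunk_words(text: str, max_words: int = 400, overlap: int = 50) -> list:
        """Split text into chunks of at most max_words words, overlapping by
        `overlap` words so content cut at a boundary still appears whole in
        one chunk. Word counts only approximate the model's subword tokens."""
        words = text.split()
        if len(words) <= max_words:
            return [text]
        chunks = []
        step = max_words - overlap
        for start in range(0, len(words), step):
            chunks.append(" ".join(words[start:start + max_words]))
            if start + max_words >= len(words):
                break
        return chunks

Each chunk can then be passed through the tokenizer-and-generate loop from step 3, and the per-chunk summaries concatenated (or summarized again) to cover the full document.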

License

This model is released under the Apache 2.0 License, allowing for both personal and commercial use, subject to the terms of the license.
