rut5-base-paraphraser

cointegrated

Introduction

rut5-base-paraphraser is a model for paraphrasing Russian sentences. Built on the T5 architecture and published by cointegrated, it is suited to text-to-text generation tasks in Russian.

Architecture

The model is built on the T5 (Text-to-Text Transfer Transformer) architecture and implemented in PyTorch. It is fine-tuned for paraphrasing: given an input sentence, it generates a variation that preserves the original meaning.
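
As a quick sanity check, the backbone's hyperparameters can be read from the published configuration without downloading the weights. This is a minimal sketch using the standard Transformers configuration API; the printed values depend on the checkpoint.

    from transformers import T5Config

    # Fetch only the configuration (a small JSON file), not the weights.
    config = T5Config.from_pretrained('cointegrated/rut5-base-paraphraser')
    print(config.model_type)  # 't5'
    print(config.num_layers, config.num_heads, config.d_model)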

Training

The model was trained on the cointegrated/ru-paraphrase-NMT-Leipzig dataset. At inference time, the recommended setup passes the encoder_no_repeat_ngram_size argument to generate(); this forbids n-grams of the input from reappearing in the output, pushing the model to genuinely rephrase rather than copy the source sentence, and noticeably improves paraphrase quality.
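
To illustrate the effect, the sketch below contrasts unconstrained beam search with the constrained setup. It assumes the model and tokenizer are loaded as in the guide that follows; exact outputs will vary.

    # Assumes `model` and `tokenizer` are loaded as in the guide below.
    text = 'Каждый охотник желает знать, где сидит фазан.'
    x = tokenizer(text, return_tensors='pt').to(model.device)

    # Plain beam search: the top beam often stays close to the input.
    plain = model.generate(**x, num_beams=5, max_length=64)

    # Forbid any 4-gram of the input from reappearing in the output,
    # forcing the model to actually rephrase.
    constrained = model.generate(**x, num_beams=5, max_length=64,
                                 encoder_no_repeat_ngram_size=4)

    print(tokenizer.decode(plain[0], skip_special_tokens=True))
    print(tokenizer.decode(constrained[0], skip_special_tokens=True))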

Guide: Running Locally

To run the rut5-base-paraphraser locally, follow these steps:

  1. Install the Dependencies: Ensure the Hugging Face Transformers library is installed; T5Tokenizer additionally requires the sentencepiece package, and the model itself needs PyTorch.

    pip install transformers sentencepiece torch
    
  2. Load the Model and Tokenizer:

    from transformers import T5ForConditionalGeneration, T5Tokenizer
    import torch

    MODEL_NAME = 'cointegrated/rut5-base-paraphraser'
    model = T5ForConditionalGeneration.from_pretrained(MODEL_NAME)
    tokenizer = T5Tokenizer.from_pretrained(MODEL_NAME)

    # Use a GPU when one is available; the model also runs (more slowly) on CPU.
    if torch.cuda.is_available():
        model.cuda()
    model.eval()  # disable dropout for deterministic inference
    
  3. Paraphrase Function:

    def paraphrase(text, beams=5, grams=4, do_sample=False):
        # Tokenize and move the input to the same device as the model.
        x = tokenizer(text, return_tensors='pt', padding=True).to(model.device)
        # Allow the output to be somewhat longer than the input.
        max_size = int(x.input_ids.shape[1] * 1.5 + 10)
        # encoder_no_repeat_ngram_size=grams forbids copying input n-grams,
        # so the model must actually rephrase.
        out = model.generate(**x, encoder_no_repeat_ngram_size=grams,
                             num_beams=beams, max_length=max_size,
                             do_sample=do_sample)
        return tokenizer.decode(out[0], skip_special_tokens=True)

    # Example usage; the input is a Russian mnemonic for the rainbow colours:
    # "Every hunter wants to know where the pheasant sits."
    print(paraphrase('Каждый охотник желает знать, где сидит фазан.'))
    
  4. Cloud GPU Recommendation: For optimal performance, especially when batch-processing large datasets, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure; a batched variant of the paraphrase function is sketched after this list.
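
For batch processing, the single-sentence function above can be generalized to paraphrase several sentences in one forward pass. The sketch below assumes the model and tokenizer loaded in step 2; paraphrase_batch is a hypothetical helper, not part of the original model card.

    import torch

    def paraphrase_batch(texts, beams=5, grams=4):
        # Pad all inputs to a common length so they form a single batch.
        x = tokenizer(texts, return_tensors='pt', padding=True).to(model.device)
        max_size = int(x.input_ids.shape[1] * 1.5 + 10)
        with torch.no_grad():  # inference only, so skip gradient tracking
            out = model.generate(**x, encoder_no_repeat_ngram_size=grams,
                                 num_beams=beams, max_length=max_size)
        return tokenizer.batch_decode(out, skip_special_tokens=True)

    print(paraphrase_batch([
        'Каждый охотник желает знать, где сидит фазан.',
        'Сегодня хорошая погода.',  # "The weather is nice today."
    ]))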

License

The model is licensed under the MIT License, allowing for broad usage and modification.
