parrot_paraphraser_on_T5

prithivida

Introduction

Parrot is a paraphrase-based utterance augmentation framework built to support the training of Natural Language Understanding (NLU) models. It is more than a plain paraphrasing model: it aims to fill gaps in existing paraphrasing tools by producing paraphrases that are adequate, fluent, and diverse.

Architecture

Parrot uses a T5 model, run with PyTorch, to generate paraphrases. It exposes controls over three quality metrics, Adequacy, Fluency, and Diversity, so users can adjust the balance between them to suit their needs. Generated paraphrases are meant to preserve the original intent and the slots/entities of the utterance, which is essential when the output is used to train NLU models.
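The interplay of the three metrics can be illustrated with a small, self-contained sketch. It is not Parrot's implementation: the Candidate class, the select_paraphrases function, and the hand-written adequacy and fluency scores are all hypothetical, and diversity is approximated here with a character-level similarity from Python's difflib rather than whatever ranker Parrot uses internally. The idea shown is only the general pattern: drop candidates below the adequacy and fluency thresholds, then rank the survivors by how much they differ from the source.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Candidate:
    """A paraphrase candidate with hypothetical quality scores in [0, 1]."""
    text: str
    adequacy: float  # assumed: how well the meaning is preserved
    fluency: float   # assumed: how grammatical the text is


def diversity(source: str, candidate: str) -> float:
    """Stand-in diversity score: 1 minus character-level similarity."""
    ratio = SequenceMatcher(None, source.lower(), candidate.lower()).ratio()
    return 1.0 - ratio


def select_paraphrases(source, candidates,
                       adequacy_threshold=0.9,
                       fluency_threshold=0.9,
                       max_return=10):
    """Filter by adequacy and fluency, then rank by diversity (highest first)."""
    kept = [
        c for c in candidates
        if c.adequacy >= adequacy_threshold
        and c.fluency >= fluency_threshold
        and c.text.lower() != source.lower()  # an exact copy adds no variety
    ]
    ranked = sorted(kept, key=lambda c: diversity(source, c.text), reverse=True)
    return [(c.text, diversity(source, c.text)) for c in ranked[:max_return]]


source = "book a flight to paris"
candidates = [
    Candidate("book a flight to paris please", adequacy=0.95, fluency=0.95),
    Candidate("i want to fly to paris", adequacy=0.93, fluency=0.96),
    Candidate("paris is nice", adequacy=0.30, fluency=0.90),  # meaning lost
]
for text, score in select_paraphrases(source, candidates):
    print(f"{score:.2f}  {text}")
```

Raising a threshold trades quantity for quality: fewer candidates survive, but those that do stay closer to the original meaning or read more naturally. Ranking by diversity afterwards favours survivors that vary the wording most.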

Training

Parrot is designed with conversational interfaces in mind and focuses on utterances typically typed or spoken to such systems. The pre-trained model was trained on text samples with a maximum length of 32 tokens, which makes it suitable for short, interactive exchanges rather than long passages. Training emphasizes preserving meaning (adequacy) and grammatical correctness (fluency) while introducing lexical variation (diversity).

Guide: Running Locally

To run Parrot locally, follow these steps:

  1. Installation:

    pip install git+https://github.com/PrithivirajDamodaran/Parrot_Paraphraser.git
    
  2. Quickstart:

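    from parrot import Parrot
    import warnings

    # Silence library warnings to keep the demo output readable.
    warnings.filterwarnings("ignore")

    # Load the pre-trained paraphraser on CPU; set use_gpu=True if a GPU is available.
    parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5", use_gpu=False)

    phrases = ["Your phrase here"]
    for phrase in phrases:
        # augment() may return None when no candidate passes the quality filters.
        para_phrases = parrot.augment(input_phrase=phrase) or []
        for para_phrase in para_phrases:
            print(para_phrase)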
    
  3. Environment:

    • Ensure you have Python and PyTorch installed.
    • For large batches or larger models, consider a cloud GPU service such as AWS or Google Cloud, and pass use_gpu=True when constructing Parrot.
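The steps above can be combined into a small helper, sketched here under a few assumptions. The names detect_gpu and paraphrase_all are hypothetical and not part of Parrot. The helper only assumes that the object it receives has an augment(input_phrase=...) method, as in the quickstart, and that the result may be None or may contain either plain strings or (text, score) pairs. Any extra keyword arguments are forwarded unchanged, so tuning knobs should be checked against the installed Parrot version before use.

```python
def detect_gpu():
    """Return True if PyTorch is importable and reports a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


def paraphrase_all(paraphraser, phrases, **augment_kwargs):
    """Map each input phrase to a list of paraphrase strings (possibly empty)."""
    results = {}
    for phrase in phrases:
        # Treat a None result as "no paraphrases" rather than an error.
        candidates = paraphraser.augment(input_phrase=phrase, **augment_kwargs) or []
        # Candidates may be plain strings or (text, score) pairs; keep the text.
        results[phrase] = [
            c[0] if isinstance(c, (tuple, list)) else c for c in candidates
        ]
    return results


# Intended use with the real library (requires the installation step above):
#
#   from parrot import Parrot
#   parrot = Parrot(model_tag="prithivida/parrot_paraphraser_on_T5",
#                   use_gpu=detect_gpu())
#   print(paraphrase_all(parrot, ["Your phrase here"]))
```

Keeping the model object as a parameter means the helper can be exercised with a stub object in unit tests, without downloading the model.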

License

Parrot is available under the MIT License, allowing for wide use and distribution with minimal restrictions. For more information, refer to the project's GitHub repository.
