Grammar-Synthesis-Small

pszemraj

Introduction

The Grammar-Synthesis-Small model is a fine-tuned version of Google's T5-small for grammar correction. It aims to correct grammatical errors without altering the semantics of the original text. This model is particularly suited for applications like correcting audio transcriptions or generated text.

Architecture

The model is based on the T5 architecture, a text-to-text framework that casts every task, including grammar correction, as transforming an input string into an output string. It was fine-tuned on the JFLEG dataset, a corpus of ungrammatical sentences paired with fluent human rewrites that was designed for training and evaluating grammar correction systems.
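Because correction is framed as plain sequence-to-sequence generation, the model can be driven directly with the Transformers tokenizer/model classes. A minimal sketch (the checkpoint name comes from this card; `max_length=64` is an illustrative setting, not a documented default):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the fine-tuned T5-small checkpoint described in this card
model_name = "pszemraj/grammar-synthesis-small"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Text-to-text framing: the ungrammatical sentence goes in,
# the corrected sentence comes out of generate()
inputs = tokenizer("i can has cheezburger", return_tensors="pt")
output_ids = model.generate(**inputs, max_length=64)
corrected = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(corrected)
```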

Training

The model was trained using the following hyperparameters:

  • Learning rate: 0.0004
  • Train batch size: 16
  • Evaluation batch size: 16
  • Seed: 42
  • Distributed type: Multi-GPU
  • Gradient accumulation steps: 32
  • Total train batch size: 512
  • Optimizer: Adam with betas (0.9, 0.999) and epsilon 1e-08
  • LR scheduler type: Cosine
  • LR scheduler warmup ratio: 0.03
  • Number of epochs: 4

The training was conducted using Transformers 4.20.1, PyTorch 1.11.0+cu113, Datasets 2.3.2, and Tokenizers 0.12.1.

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install dependencies: Ensure you have Python installed, then install the required packages:

    pip install transformers torch
    
  2. Load the model: Use the Transformers library to load and use the model:

    from transformers import pipeline
    corrector = pipeline('text2text-generation', 'pszemraj/grammar-synthesis-small')
    
  3. Correct text: Provide text input for correction:

    raw_text = 'i can has cheezburger'
    results = corrector(raw_text)
    print(results[0]['generated_text'])
    
  4. Cloud GPUs: For enhanced performance, consider using cloud GPUs such as those provided by Google Colab, AWS, or Azure to handle larger datasets or more intensive computation.
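Putting the steps above together, a complete script might look like the following. The second example sentence and the generation settings (`num_beams`, `max_length`) are illustrative choices, not the author's recommendations:

```python
from transformers import pipeline

# Load the correction pipeline (downloads the checkpoint on first run)
corrector = pipeline("text2text-generation", "pszemraj/grammar-synthesis-small")

# Transcription-style inputs with grammatical errors
sentences = [
    "i can has cheezburger",
    "their going to the store tomorow",
]

corrections = []
for sent in sentences:
    # Beam search and a length cap are illustrative generation settings
    out = corrector(sent, max_length=64, num_beams=4)
    corrections.append(out[0]["generated_text"])
    print(f"{sent!r} -> {corrections[-1]!r}")
```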

License

Two separate licenses apply to this project:

  • Dataset: CC BY-NC-SA 4.0
  • Model: Apache 2.0

The model weights are permissively licensed under Apache 2.0, while the dataset's CC BY-NC-SA 4.0 license restricts its use to non-commercial contexts.
