t5 base uk to us english

EnglishVoice

Introduction

The T5-BASE-UK-TO-US-ENGLISH model by English Voice AI Labs is designed to convert UK English sentences into US English. It modifies both spelling and vocabulary to align with American English conventions.

Architecture

This model uses the T5 architecture, an encoder-decoder Transformer that treats every task as a text-to-text transformation. It is listed under the paraphrase-generation and text-generation-inference tags, and is compatible with frameworks like PyTorch, TensorFlow, and JAX.

Training

The model was trained using a dataset of 264,519 sentences with UK English spelling and their corresponding US English translations. This dataset was created by English Voice AI Labs and is available for download from their website.
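
To make the data format concrete, here is a minimal sketch of how one UK/US sentence pair might be turned into a text-to-text training example. It assumes the "UK to US: " prefix from the inference example was also used during training, which the source does not state. The helper `make_example` and the sample pair are hypothetical and are not taken from the actual dataset or preprocessing code.

```python
# Hypothetical helper: the source does not document the actual preprocessing.
PREFIX = "UK to US: "  # assumed to match the prefix used at inference time


def make_example(uk_sentence: str, us_sentence: str) -> dict:
    """Format one sentence pair as a T5-style (input, target) record."""
    return {
        "input_text": PREFIX + uk_sentence.strip(),
        "target_text": us_sentence.strip(),
    }


# Illustrative pair, not taken from the real dataset.
example = make_example("My favourite colour is yellow.", "My favorite color is yellow.")
print(example["input_text"])
print(example["target_text"])
```

Keeping the prefix in a single constant means training and inference inputs cannot drift apart, which matters because T5 models are sensitive to the exact task prefix they were trained with.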

Guide: Running Locally

  1. Environment Setup:

    • Ensure Python and PyTorch are installed.
    • Install the transformers library, along with sentencepiece, which T5Tokenizer requires:
      pip install transformers sentencepiece
      
  2. Device Configuration:

    • Check for GPU availability and set the device:
      import torch
      device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
      
  3. Model Loading:

    • Load the model and tokenizer:
      from transformers import T5ForConditionalGeneration, T5Tokenizer
      model = T5ForConditionalGeneration.from_pretrained("EnglishVoice/t5-base-uk-to-us-english").to(device)
      tokenizer = T5Tokenizer.from_pretrained("EnglishVoice/t5-base-uk-to-us-english")
      
  4. Inference Example:

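    • Encode input and generate output:
      input_text = "My favourite colour is yellow."
      # Prepend the task prefix so the model knows which conversion to perform.
      text = "UK to US: " + input_text
      # Calling the tokenizer directly replaces the older encode_plus method.
      encoding = tokenizer(text, return_tensors="pt")
      input_ids = encoding["input_ids"].to(device)
      attention_masks = encoding["attention_mask"].to(device)
      # early_stopping only takes effect when beam search is used (num_beams > 1).
      beam_outputs = model.generate(input_ids=input_ids, attention_mask=attention_masks, early_stopping=True)
      # Decode the first generated sequence, dropping padding and end-of-sequence tokens.
      result = tokenizer.decode(beam_outputs[0], skip_special_tokens=True)
      print(result)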
      
  5. Cloud GPUs:

    • For faster inference, especially on larger batches of sentences, consider running the model on a cloud GPU from a provider such as AWS, GCP, or Azure.
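
As a follow-up to the inference step in the guide above, the sketch below wraps it in a function that converts several sentences in one generate() call, which is where a GPU tends to pay off. The name `convert_batch` and the `max_new_tokens` default of 128 are assumptions for illustration, not part of the original model card. The function expects the `model`, `tokenizer`, and `device` objects created in steps 2 and 3.

```python
from typing import List

PREFIX = "UK to US: "  # task prefix from the inference example above


def convert_batch(sentences: List[str], model, tokenizer, device,
                  max_new_tokens: int = 128) -> List[str]:
    """Convert several UK English sentences in a single generate() call."""
    if not sentences:
        return []
    prompts = [PREFIX + s for s in sentences]
    # padding=True pads shorter prompts so the batch forms a rectangular tensor.
    encoding = tokenizer(prompts, return_tensors="pt", padding=True).to(device)
    outputs = model.generate(
        input_ids=encoding["input_ids"],
        attention_mask=encoding["attention_mask"],
        max_new_tokens=max_new_tokens,  # assumed cap, not a value from the source
    )
    # One decoded string per input sentence, in the same order.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

With the objects from the guide loaded, a call such as convert_batch(["My favourite colour is yellow."], model, tokenizer, device) returns a list with one converted string per input sentence.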

License

The T5-BASE-UK-TO-US-ENGLISH model is licensed under the Apache 2.0 License, allowing for both personal and commercial use.
