Introduction

The Open Pre-trained Transformers (OPT) are a series of language models developed by Meta AI, designed to facilitate research on large language models by providing open and accessible models. These models range from 125 million to 175 billion parameters and aim to match the performance of the GPT-3 models. The goal is to enable reproducible and responsible research in the field of large language models (LLMs), addressing issues such as robustness, bias, and toxicity.

Architecture

OPT models are decoder-only architectures similar to GPT-3, primarily trained on English text with some non-English data from CommonCrawl. They are designed for tasks like text generation and use a causal language modeling objective. The models are pretrained using a self-supervised approach and evaluated using prompts and setups similar to GPT-3.
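
To make the causal language modeling objective concrete, here is a minimal pure-Python sketch, not OPT's actual training code. It shows the two ingredients: a mask that lets each position see only itself and earlier positions, and a loss that scores each token given its prefix. The function names and the toy uniform "model" are assumptions made for illustration.

```python
import math

def causal_mask(seq_len):
    """Row i marks which positions token i may attend to (True = visible)."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]

def causal_lm_loss(token_ids, next_token_probs):
    """Average negative log-likelihood of each token given its prefix.

    next_token_probs[i] maps token id -> probability that a (hypothetical)
    model assigns to the token following position i.
    """
    # Targets are the input sequence shifted left by one position.
    targets = token_ids[1:]
    losses = [-math.log(next_token_probs[i][t]) for i, t in enumerate(targets)]
    return sum(losses) / len(losses)

mask = causal_mask(3)
# Toy stand-in for a model: uniform over a 4-token vocabulary at every position.
uniform = {tok: 0.25 for tok in range(4)}
loss = causal_lm_loss([2, 0, 3], [uniform, uniform])
```

With a uniform distribution over four tokens, the loss equals ln 4, the value an uninformed model would score; training pushes it lower.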

Training

The training data for OPT is drawn from five sources: BookCorpus, CC-Stories, The Pile, the Pushshift.io Reddit dataset, and CCNewsV2. The final corpus comprises 180 billion tokens, roughly 800GB of data, with a 200MB validation split. Text was tokenized with the GPT-2 byte-level BPE tokenizer. The largest model (175B) was trained on 992 80GB A100 GPUs over approximately 33 days.
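
As a rough sanity check on these figures, the arithmetic below derives two back-of-the-envelope quantities: average bytes of data per token and total GPU-hours. It assumes 1 GB = 10**9 bytes and that all 992 GPUs were busy for the full 33 days; both are simplifications, so treat the results as order-of-magnitude estimates.

```python
# Figures quoted in the text; the derived numbers are estimates only.
TOKENS = 180e9        # training tokens
DATA_BYTES = 800e9    # ~800GB, assuming 1 GB = 10**9 bytes
NUM_GPUS = 992
TRAIN_DAYS = 33       # approximate duration

bytes_per_token = DATA_BYTES / TOKENS
gpu_hours = NUM_GPUS * TRAIN_DAYS * 24  # assumes full utilization throughout

print(f"{bytes_per_token:.2f} bytes per token")
print(f"{gpu_hours:,} GPU-hours")
```

This prints about 4.44 bytes per token and 785,664 GPU-hours.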

Guide: Running Locally

  1. Installation: Install the Hugging Face Transformers library.

    pip install transformers
    
  2. Model Loading: Use the Transformers library to load the OPT model for text generation.

    from transformers import pipeline
    generator = pipeline('text-generation', model="facebook/opt-2.7b")
    
  3. Text Generation: Generate text using the model.

    generator("What are we having for dinner?")
    
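  4. Sampling: Generation is deterministic by default. Pass do_sample=True to enable top-k sampling, then call the generator as before.

    generator = pipeline('text-generation', model="facebook/opt-2.7b", do_sample=True)
    generator("What are we having for dinner?")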
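
To illustrate what top-k sampling does at each generation step, here is a small pure-Python sketch. It is not the Transformers implementation, which operates on tensors; the function name `top_k_sample` and the logit values are made up for the example. The idea is to keep the k highest-scoring tokens, renormalize their probabilities, and draw one at random.

```python
import math
import random

def top_k_sample(logits, k, rng):
    """Sample a token index from the k highest-scoring entries of `logits`."""
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the survivors (subtract the max for numerical stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    # Draw one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]

rng = random.Random(0)          # fixed seed so the run is repeatable
logits = [2.0, 0.5, 1.5, -1.0]  # made-up scores for a 4-token vocabulary
samples = [top_k_sample(logits, k=2, rng=rng) for _ in range(200)]
```

With k=2, only indices 0 and 2 can ever be drawn; with k=1 the procedure reduces to always picking the single highest-scoring token.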
    

The pipeline runs on a CPU by default, but generation with a model of this size is much faster on a GPU. If no local GPU is available, cloud providers such as AWS, Google Cloud, and Azure offer GPU instances.
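
When choosing hardware, a quick estimate of the memory needed for the weights alone can help. The sketch below multiplies a nominal parameter count by the bytes per parameter; it ignores activations, the generation cache, and framework overhead, so real usage will be higher. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Nominal parameter counts for three OPT sizes (assumed round figures).
for name, params in [("125M", 125e6), ("2.7B", 2.7e9), ("175B", 175e9)]:
    fp32 = weight_memory_gb(params, 4)  # 32-bit floats
    fp16 = weight_memory_gb(params, 2)  # 16-bit floats
    print(f"OPT-{name}: {fp32:.1f} GB in fp32, {fp16:.1f} GB in fp16")
```

For the 2.7B model used in this guide, that works out to roughly 10.8 GB in 32-bit precision and 5.4 GB in 16-bit precision.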

License

The OPT models are released under a license categorized as "other," indicating specific terms and conditions defined by the creators, Meta AI. These may include restrictions on commercial use and obligations to follow ethical guidelines for research and deployment.
