Introduction

GPT-NeoX-20B is a 20 billion parameter autoregressive language model developed by EleutherAI. It was trained with the GPT-NeoX library on the Pile, a dataset spanning a wide range of English-language texts. The model is intended for research purposes and can be fine-tuned for downstream tasks, but it should not be deployed as-is for human-facing interactions.

Architecture

GPT-NeoX-20B is a transformer-based language model with an architecture similar to GPT-3. It features 44 layers, a model dimension of 6144, 64 attention heads, and a vocabulary of 50257 tokens. The model uses Rotary Position Embedding (RoPE) for positional encoding.
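
These hyperparameters can be checked directly against the published checkpoint's configuration. The sketch below is one way to do so; it assumes the Hugging Face transformers library and its standard GPTNeoXConfig attribute names.

    from transformers import AutoConfig

    # Fetch only the configuration file; no model weights are downloaded.
    config = AutoConfig.from_pretrained("EleutherAI/gpt-neox-20b")
    print(config.num_hidden_layers)    # 44 transformer layers
    print(config.hidden_size)          # model dimension of 6144
    print(config.num_attention_heads)  # 64 attention heads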

Training

The model was trained on the Pile, an 825 GiB dataset drawn from 22 diverse sources. Training used a batch size of approximately 3.15 million tokens for 150,000 steps, with tensor and pipeline parallelism across GPUs. The training process and dataset are described in detail in the associated papers.
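
As a rough back-of-the-envelope check on these figures (an illustrative calculation, not a quote from the training logs), multiplying the batch size by the number of steps gives the approximate total token count seen during training:

    # Approximate total training tokens, using the figures quoted above.
    tokens_per_step = 3.15e6   # ~3.15M tokens per batch
    steps = 150_000
    total_tokens = tokens_per_step * steps
    print(f"{total_tokens:.3g}")  # ~4.7e11, i.e. roughly 470 billion tokens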

Guide: Running Locally

To run GPT-NeoX-20B locally, follow these steps:

  1. Install the Transformers Library and a PyTorch backend:

    pip install transformers torch
    
  2. Load the Model and Tokenizer (a short generation sketch follows these steps):

    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neox-20b")
    
  3. Use a Cloud GPU: The 20 billion parameters occupy roughly 40 GB of GPU memory in half precision, so a single large accelerator or a multi-GPU instance from a cloud provider such as AWS, Google Cloud, or Azure is typically required.
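
The following minimal generation sketch assumes a GPU setup with sufficient memory and, beyond transformers, the torch and accelerate packages (accelerate is needed for the automatic device placement used here); it loads the weights in half precision to keep the footprint near 40 GB.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
    model = AutoModelForCausalLM.from_pretrained(
        "EleutherAI/gpt-neox-20b",
        torch_dtype=torch.float16,  # half precision: ~40 GB of weights instead of ~80 GB
        device_map="auto",          # place layers on available GPUs (requires accelerate)
    )

    prompt = "GPT-NeoX-20B is a 20 billion parameter language model"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.8)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))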

License

GPT-NeoX-20B is released under the Apache 2.0 license, allowing for modification and distribution. Users are advised to conduct a risk and bias assessment before deploying any fine-tuned models based on GPT-NeoX-20B.
