OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Introduction
OpenELM is a family of efficient language models that uses a layer-wise scaling strategy to allocate parameters within each layer of the transformer, improving accuracy. Developed by a team at Apple including Sachin Mehta, Mohammad Hossein Sekhavat, and others, the models are pretrained with the CoreNet library and come in four sizes: 270M, 450M, 1.1B, and 3B parameters. Both pretrained and instruction-tuned variants are available for each size.
Architecture
OpenELM employs a layer-wise scaling strategy to efficiently allocate parameters within transformer layers, contributing to improved model accuracy. The models are pretrained on a dataset comprising RefinedWeb, deduplicated PILE, a subset of RedPajama, and Dolma v1.6, totaling approximately 1.8 trillion tokens.
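The layer-wise scaling idea can be illustrated with a short sketch: instead of giving every transformer layer the same width, the number of attention heads and the feed-forward dimension grow from the first layer to the last. The scaling bounds, head dimension, and helper function below are illustrative assumptions, not the published OpenELM configuration.

```python
# Illustrative sketch of layer-wise scaling (hypothetical values, not the
# published OpenELM configuration): attention heads and FFN width are
# interpolated linearly across layers instead of kept uniform.

def layer_wise_scaling(num_layers, model_dim, head_dim=64,
                       alpha_min=0.5, alpha_max=1.0,   # attention scaling bounds (assumed)
                       beta_min=0.5, beta_max=4.0):    # FFN scaling bounds (assumed)
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)                 # 0.0 at the first layer, 1.0 at the last
        alpha = alpha_min + t * (alpha_max - alpha_min)
        beta = beta_min + t * (beta_max - beta_min)
        num_heads = max(1, int(alpha * model_dim / head_dim))  # fewer heads in early layers
        ffn_dim = int(beta * model_dim)                         # narrower FFN in early layers
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

# Print per-layer widths for a small hypothetical model.
for cfg in layer_wise_scaling(num_layers=4, model_dim=1024):
    print(cfg)
```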
Training
Training used this diverse mix of datasets, and both pretrained and instruction-tuned models are made available. Users are advised to review the license agreements and terms of these datasets before use.
Guide: Running Locally
- Install Dependencies:
  - Clone the evaluation harness repository and install it:

        git clone https://github.com/EleutherAI/lm-evaluation-harness public-lm-eval-harness
        cd public-lm-eval-harness
        git checkout dc90fec
        pip install -e .
        cd ..

  - Install additional packages:

        pip install datasets@git+https://github.com/huggingface/datasets.git@66d6242
        pip install "tokenizers>=0.15.2" "transformers>=4.38.2" "sentencepiece>=0.2.0"

- Run Inference:
  - Use the generate_openelm.py script to generate text (a minimal Transformers-only alternative is sketched after this list):

        python generate_openelm.py --model [MODEL_NAME] --hf_access_token [HF_ACCESS_TOKEN] --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2

- Cloud GPUs:
  - For enhanced performance, consider using cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure.
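Beyond the bundled script, here is a minimal sketch of running an OpenELM checkpoint directly through Hugging Face Transformers. The repository id apple/OpenELM-270M, the use of the Llama-2 tokenizer (which requires a Hugging Face access token), and the trust_remote_code flag are assumptions about how the checkpoints are published rather than details stated in this guide.

```python
# Minimal sketch: loading an OpenELM checkpoint with Hugging Face Transformers.
# The repo id, tokenizer, and trust_remote_code=True below are assumptions about
# the published checkpoints, not details taken from this guide.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M"            # assumed repo id; substitute the size you need
tokenizer_id = "meta-llama/Llama-2-7b-hf"  # assumed tokenizer; gated, needs an HF access token

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern should apply to the instruction-tuned variants; swap in the corresponding repo id and adjust the generation arguments as needed.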
License
OpenELM is released under the Apple Sample Code License. Users should review the license details to understand the terms and conditions of use.