OpenELM: An Efficient Language Model Family with Open Training and Inference Framework
Introduction
OpenELM is a family of efficient language models that uses a layer-wise scaling strategy to allocate parameters within each layer of the transformer, improving accuracy. Developed by a team at Apple including Sachin Mehta, Mohammad Hossein Sekhavat, and others, the models are pretrained with the CoreNet library and come in four sizes: 270M, 450M, 1.1B, and 3B parameters. Both pretrained and instruction-tuned variants are available for each size.
Architecture
OpenELM employs a layer-wise scaling strategy to efficiently allocate parameters within transformer layers, contributing to improved model accuracy. The models are pretrained on a dataset comprising RefinedWeb, deduplicated PILE, a subset of RedPajama, and Dolma v1.6, totaling approximately 1.8 trillion tokens.
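The layer-wise scaling idea can be illustrated with a short sketch: instead of giving every transformer layer the same width, the number of attention heads and the feed-forward dimension grow from the first layer to the last. The scaling bounds, head dimension, and helper function below are illustrative assumptions, not the published OpenELM configuration.

```python
# Illustrative sketch of layer-wise scaling (hypothetical values, not the
# published OpenELM configuration): attention heads and FFN width are
# interpolated linearly across layers instead of kept uniform.

def layer_wise_scaling(num_layers, model_dim, head_dim=64,
                       alpha_min=0.5, alpha_max=1.0,   # attention scaling bounds (assumed)
                       beta_min=0.5, beta_max=4.0):    # FFN scaling bounds (assumed)
    configs = []
    for i in range(num_layers):
        t = i / max(num_layers - 1, 1)                 # 0.0 at the first layer, 1.0 at the last
        alpha = alpha_min + t * (alpha_max - alpha_min)
        beta = beta_min + t * (beta_max - beta_min)
        num_heads = max(1, int(alpha * model_dim / head_dim))  # fewer heads in early layers
        ffn_dim = int(beta * model_dim)                         # narrower FFN in early layers
        configs.append({"layer": i, "num_heads": num_heads, "ffn_dim": ffn_dim})
    return configs

# Print per-layer widths for a small hypothetical model.
for cfg in layer_wise_scaling(num_layers=4, model_dim=1024):
    print(cfg)
```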
Training
Training used this diverse mix of datasets, and both pretrained and instruction-tuned models are made available. Users are advised to review the license agreements and terms of these datasets before use.
Guide: Running Locally
- Install Dependencies:
  - Clone the evaluation harness repository and install it:

        git clone https://github.com/EleutherAI/lm-evaluation-harness public-lm-eval-harness
        cd public-lm-eval-harness
        git checkout dc90fec
        pip install -e .
        cd ..

  - Install additional packages:

        pip install datasets@git+https://github.com/huggingface/datasets.git@66d6242
        pip install "tokenizers>=0.15.2" "transformers>=4.38.2" "sentencepiece>=0.2.0"

- Run Inference:
  - Use the generate_openelm.py script to generate text (a minimal Transformers-only alternative is sketched after this list):

        python generate_openelm.py --model [MODEL_NAME] --hf_access_token [HF_ACCESS_TOKEN] --prompt 'Once upon a time there was' --generate_kwargs repetition_penalty=1.2

- Cloud GPUs:
  - For enhanced performance, consider using cloud GPU services such as AWS EC2, Google Cloud Platform, or Azure.
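Beyond the bundled script, here is a minimal sketch of running an OpenELM checkpoint directly through Hugging Face Transformers. The repository id apple/OpenELM-270M, the use of the Llama-2 tokenizer (which requires a Hugging Face access token), and the trust_remote_code flag are assumptions about how the checkpoints are published rather than details stated in this guide.

```python
# Minimal sketch: loading an OpenELM checkpoint with Hugging Face Transformers.
# The repo id, tokenizer, and trust_remote_code=True below are assumptions about
# the published checkpoints, not details taken from this guide.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/OpenELM-270M"            # assumed repo id; substitute the size you need
tokenizer_id = "meta-llama/Llama-2-7b-hf"  # assumed tokenizer; gated, needs an HF access token

tokenizer = AutoTokenizer.from_pretrained(tokenizer_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Once upon a time there was", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, repetition_penalty=1.2)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same pattern should apply to the instruction-tuned variants; swap in the corresponding repo id and adjust the generation arguments as needed.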
License
OpenELM is released under the Apple Sample Code License. Users should review the license details to understand the terms and conditions of use.