OLMo-2-1124-7B-Instruct
allenai
Introduction
OLMo-2-1124-7B-Instruct is a post-trained variant of the OLMo-2 7B model, developed by the Allen Institute for AI. It targets chat as well as benchmarks such as MATH, GSM8K, and IFEval, and was post-trained with a combination of supervised fine-tuning, preference tuning (DPO), and reinforcement learning with verifiable rewards (RLVR).
Architecture
OLMo-2-1124-7B-Instruct belongs to the OLMo series of Open Language Models. It is trained primarily on English data, using a mix of publicly available, synthetic, and human-created datasets. The OLMo models are trained on the Dolma dataset, and this model can be loaded through the Hugging Face Transformers library.
Training
The model underwent multiple stages of training:
- Supervised Fine-Tuning (SFT): Training on a variant of the Tülu 3 dataset.
- DPO Training: Further refinement with Direct Preference Optimization on preference data.
- RLVR Training: Reinforcement Learning with Verifiable Rewards, using RLVR-GSM data.
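To make the RLVR stage more concrete, here is a minimal sketch of what a programmatically checkable reward for GSM-style math problems could look like. It is not the authors' training code: the function names, the "last number in the text" heuristic, and the binary 1.0/0.0 reward are all assumptions chosen for illustration.

```python
import re
from typing import Optional

# Matches integers and decimals, with an optional sign and thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(text: str) -> Optional[float]:
    """Return the last number that appears in `text`, or None if there is none."""
    matches = _NUMBER.findall(text)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def verifiable_reward(completion: str, reference_answer: str) -> float:
    """Hypothetical binary reward: 1.0 if the final numbers agree, else 0.0."""
    predicted = extract_final_number(completion)
    expected = extract_final_number(reference_answer)
    if predicted is None or expected is None:
        return 0.0
    return 1.0 if abs(predicted - expected) < 1e-6 else 0.0


if __name__ == "__main__":
    print(verifiable_reward("So the total is 3 + 4 = 7", "7"))
    print(verifiable_reward("I think it is 8", "7"))
```

The point of the sketch is that the reward comes from a deterministic check against a known answer rather than from a learned reward model, which is what distinguishes this stage from the preference-based DPO stage.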
Guide: Running Locally
Basic Steps
- Install Transformers Library:
  pip install --upgrade git+https://github.com/huggingface/transformers.git
- Load the Model:
  from transformers import AutoModelForCausalLM
  olmo_model = AutoModelForCausalLM.from_pretrained("allenai/OLMo-2-1124-7B-Instruct")
- Use the Chat Template:
  <|endoftext|><|user|>
  How are you doing?
  <|assistant|>
  I'm just a computer program, so I don't have feelings, but I'm functioning as expected. How can I assist you today?<|endoftext|>
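The chat layout in the last step can be reproduced with a few lines of plain Python, which may help when inspecting or debugging prompts. This is a sketch under assumptions: `format_chat` is a hypothetical helper, only the user and assistant roles from the example are handled, and the newline placement after each role tag is assumed rather than confirmed by the text above. The template bundled with the model's tokenizer remains authoritative.

```python
from typing import Dict, List

BOS = "<|endoftext|>"  # the example conversation opens with this token
EOS = "<|endoftext|>"  # and closes the assistant turn with the same token


def format_chat(messages: List[Dict[str, str]], add_generation_prompt: bool = True) -> str:
    """Render messages in the <|user|> / <|assistant|> layout from the example.

    Newline placement is an assumption made for this sketch.
    """
    parts = [BOS]
    for i, message in enumerate(messages):
        role, content = message["role"], message["content"]
        is_last = i == len(messages) - 1
        if role == "user":
            parts.append(f"<|user|>\n{content}\n")
        elif role == "assistant":
            # A completed assistant turn ends with the end-of-text token.
            parts.append(f"<|assistant|>\n{content}{EOS}")
            if not is_last:
                parts.append("\n")
        else:
            raise ValueError(f"unsupported role: {role}")
    # Leave an open assistant tag so the model writes the next reply.
    if add_generation_prompt and messages and messages[-1]["role"] != "assistant":
        parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    print(format_chat([{"role": "user", "content": "How are you doing?"}]))
```

In practice it is usually safer to let the tokenizer build the prompt, for example with `apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` on a tokenizer loaded via `AutoTokenizer.from_pretrained`, and to use a hand-written formatter like this one only to check what the rendered string looks like.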
Cloud GPUs
Running a 7B-parameter model is most practical on a GPU. If you do not have suitable local hardware, consider a cloud GPU service such as AWS EC2, Google Cloud Platform, or Azure.
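When choosing a GPU instance, a rough estimate of the memory needed for the weights is a useful starting point. The sketch below assumes a nominal 7 billion parameters (taken from the "7B" in the model name, not an exact count) and counts only the weights; activations, the KV cache, and framework overhead come on top. `weight_memory_gib` is a hypothetical helper.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: nominal size implied by "7B", not the exact parameter count.
NOMINAL_PARAMS = 7e9

if __name__ == "__main__":
    for label, width in (("float32", 4), ("bfloat16", 2), ("int8", 1)):
        print(f"{label}: ~{weight_memory_gib(NOMINAL_PARAMS, width):.1f} GiB")
```

Under these assumptions, 16-bit weights come to roughly 13 GiB, so the GPU needs noticeably more memory than that to leave headroom for inference.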
License
OLMo-2-1124-7B-Instruct is licensed under the Apache 2.0 license and is intended for research and educational use. Because the model was fine-tuned on a dataset mix that includes outputs from third-party models, additional terms apply. For more details, refer to the Responsible Use Guidelines.