Eurus 2 7 B S F T

PRIME-RL

Introduction

Eurus-2-7B-SFT is a fine-tuned model derived from Qwen2.5-Math-7B-Base, known for its mathematical proficiency. It is trained using the Eurus-2-SFT-Data dataset, which emphasizes action-centric chain-of-thought reasoning. The model employs imitation learning (supervised finetuning) as an initial stage to teach reasoning patterns. It acts as a preliminary model for Eurus-2-7B-PRIME.

Architecture

Eurus-2-7B-SFT utilizes a specialized prompting system for coding and mathematical tasks. Prompts guide the model through various actions such as ASSESS, ADVANCE, VERIFY, SIMPLIFY, SYNTHESIZE, PIVOT, and OUTPUT to help structure the reasoning process. For coding tasks, the model outputs solutions in Python, and for math tasks, it presents answers in LaTex format.

Training

The model is trained using the Eurus-2-SFT-Data dataset with a focus on action-centric reasoning. Imitation learning is employed as a warmup phase to instill reasoning capabilities into the model. The training process fine-tunes the model's ability to respond to tailored prompts for complex reasoning tasks.

Guide: Running Locally

  1. Setup Environment: Ensure Python and necessary libraries (e.g., transformers, torch) are installed.
  2. Download Model: Access Eurus-2-7B-SFT from the Hugging Face model hub.
  3. Load Model: Use the Hugging Face transformers library to load the model into your environment.
  4. Run Inference: Prepare your data and prompts for coding or math tasks and run the model to get predictions.

For optimal performance, consider using cloud-based GPUs such as those provided by AWS, Google Cloud, or Azure to handle the computational demands.

License

The Eurus-2-7B-SFT model is licensed under the Apache-2.0 License. This allows for both personal and commercial use with proper attribution.

More Related APIs