Llama-3-8B-Magpie-Align-SFT-v0.3

Magpie-Align

Introduction

Llama-3-8B-Magpie-Align-SFT-v0.3 is a fine-tuned version of Meta-Llama-3-8B designed to strengthen multilingual capability, in particular adding support for Chinese-language instructions. The model is part of the Magpie-Align project, which focuses on alignment data synthesis.

Architecture

This model is based on Meta-Llama-3-8B and was fine-tuned with the Axolotl framework. It was trained on datasets including Magpie-Align/Magpie-Reasoning-150K, Magpie-Align/Magpie-Pro-MT-300K-v0.1, and Magpie-Align/Magpie-Qwen2-Pro-200K-Chinese. The model handles English and Chinese and is intended for conversational (chat) applications.
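
For reference, these SFT datasets are published on the Hugging Face Hub under the names above and can be inspected with the datasets library. A minimal sketch (the "train" split name is an assumption; check each dataset card for the exact splits):

    from datasets import load_dataset

    # Pull one of the Magpie SFT datasets from the Hub and look at a sample
    ds = load_dataset("Magpie-Align/Magpie-Pro-MT-300K-v0.1", split="train")
    print(ds[0])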

Training

Training Hyperparameters

  • Learning Rate: 2e-05
  • Train Batch Size: 1
  • Eval Batch Size: 1
  • Seed: 42
  • Distributed Type: Multi-GPU
  • Number of Devices: 4
  • Gradient Accumulation Steps: 32
  • Total Train Batch Size: 128
  • Total Eval Batch Size: 4
  • Optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • LR Scheduler Type: Cosine
  • LR Scheduler Warmup Steps: 98
  • Number of Epochs: 2
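
The run itself used the Axolotl framework, so the snippet below is only an illustrative sketch of how the hyperparameters above map onto standard Hugging Face TrainingArguments; it is not the original Axolotl configuration, and the bf16 setting is an assumption:

    from transformers import TrainingArguments

    # Illustrative mapping of the listed hyperparameters (not the actual Axolotl config)
    training_args = TrainingArguments(
        output_dir="llama3-8b-magpie-align-sft-v0.3",  # hypothetical output path
        learning_rate=2e-5,
        per_device_train_batch_size=1,
        per_device_eval_batch_size=1,
        gradient_accumulation_steps=32,  # 1 per device x 4 GPUs x 32 = 128 effective batch
        num_train_epochs=2,
        lr_scheduler_type="cosine",
        warmup_steps=98,
        seed=42,
        bf16=True,  # assumption: bf16 mixed precision
    )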

Training Results

Training included periodic evaluation on a validation set; the validation loss was tracked across both epochs to monitor the model's learning progress.

Framework Versions

  • Transformers: 4.42.3
  • PyTorch: 2.3.1+cu121
  • Datasets: 2.19.1
  • Tokenizers: 0.19.1

Guide: Running Locally

Basic Steps

  1. Clone the Repository:

    git clone https://github.com/magpie-align/magpie
    cd magpie
    
  2. Install Dependencies:
    Ensure you have the compatible versions of PyTorch and Transformers installed, as specified above.
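
    For example, the Python packages can be pinned with pip (PyTorch builds are CUDA-specific, so install torch 2.3.1 per the official PyTorch instructions):

    pip install "transformers==4.42.3" "datasets==2.19.1" "tokenizers==0.19.1"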

  3. Load the Model:
    Use the Hugging Face Transformers library to load the model:

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Download the fine-tuned weights and tokenizer from the Hugging Face Hub
    model = AutoModelForCausalLM.from_pretrained('Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.3')
    tokenizer = AutoTokenizer.from_pretrained('Magpie-Align/Llama-3-8B-Magpie-Align-SFT-v0.3')
    
  4. Run Inference:
    Tokenize your input and generate text (a chat-template variant is sketched right after this step):

    # Tokenize the prompt and generate a continuation
    inputs = tokenizer("Your input text here", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
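
Because this checkpoint is supervised fine-tuned for conversations, prompts are normally wrapped in the Llama 3 chat template rather than passed as raw text. A minimal sketch, assuming the tokenizer ships with a chat template:

    # Build a chat-formatted prompt and generate only the assistant's reply
    messages = [{"role": "user", "content": "Give me a short introduction to large language models."}]
    input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
    outputs = model.generate(input_ids, max_new_tokens=256)
    print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))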
    

Cloud GPUs

For efficient training and inference, consider using cloud-based GPU services like AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure Machine Learning.

License

This model is distributed under the Meta Llama 3 Community License; refer to that license for the full terms.
