L3.1-Pneuma-8B
Replete-AI
Introduction
The L3.1-Pneuma-8B model is a fine-tuned version of meta-llama/Llama-3.1-8B-Instruct, trained on the Sandevistan dataset. It is designed to challenge traditional paradigms in large language model training, focusing on user experience over profitability.
Architecture
The model is built using Axolotl version 0.5.0, with meta-llama/Llama-3.1-8B-Instruct as the base model. It is trained without 8-bit or 4-bit loading and uses sample packing and sequence padding. The training setup also supports plugins such as the Liger kernel for improved throughput.
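As a rough illustration of this setup, the sketch below loads the base model in bf16 (i.e., without 8-bit or 4-bit quantization) and applies the Liger kernels before training. The exact Axolotl configuration is not reproduced here, so treat the dtype choice and the plugin call as assumptions.

  import torch
  from transformers import AutoModelForCausalLM
  # Assumes the liger-kernel package is installed.
  from liger_kernel.transformers import apply_liger_kernel_to_llama

  # Patch the Llama modules with Liger's fused kernels before instantiating the model
  # (roughly what the Axolotl Liger plugin does at training time).
  apply_liger_kernel_to_llama()

  # Non-quantized load: bf16 weights, no load_in_8bit / load_in_4bit options.
  base_model = AutoModelForCausalLM.from_pretrained(
      "meta-llama/Llama-3.1-8B-Instruct",
      torch_dtype=torch.bfloat16,
      device_map="auto",
  )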
Training
Training Procedure
Training uses a learning rate of 7.8e-06, a per-device train batch size of 8, and a total (effective) train batch size of 128. The optimizer is paged AdamW in 8-bit (PAGED_ADAMW_8BIT) with a cosine learning rate scheduler, and training runs for 2 epochs.
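For reference, a roughly equivalent Hugging Face TrainingArguments configuration is sketched below. The gradient accumulation value and bf16 flag are assumptions, chosen so that 8 samples per device reaches the reported total batch size of 128 on a single GPU; with more GPUs the accumulation steps would shrink accordingly.

  from transformers import TrainingArguments

  training_args = TrainingArguments(
      output_dir="pneuma-8b-finetune",   # hypothetical output path
      learning_rate=7.8e-6,
      per_device_train_batch_size=8,
      gradient_accumulation_steps=16,    # 8 x 16 = 128 effective batch (assumes 1 GPU)
      num_train_epochs=2,
      optim="paged_adamw_8bit",          # paged 8-bit AdamW (requires bitsandbytes)
      lr_scheduler_type="cosine",
      bf16=True,                         # assumption, matching the non-quantized bf16 setup
  )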
Training Results
The model reached a loss of approximately 2.4357 on the evaluation set, with loss values decreasing steadily across the training epochs.
Guide: Running Locally
- Environment Setup: Ensure you have Python installed along with dependencies such as PyTorch and the Transformers library.
- Model Download: Clone the model from its repository or download it directly from Hugging Face.
- Install Required Libraries:
  pip install torch transformers datasets
- Model Loading: Use the Transformers library to load the pre-trained model and tokenizer (see the inference sketch after this list).
  from transformers import AutoModelForCausalLM, AutoTokenizer
  model = AutoModelForCausalLM.from_pretrained("Replete-AI/L3.1-Pneuma-8B")
  tokenizer = AutoTokenizer.from_pretrained("Replete-AI/L3.1-Pneuma-8B")
- Cloud GPUs: For efficient processing, consider using cloud services like AWS, GCP, or Azure, which provide access to GPUs.
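Putting the steps above together, the following sketch runs a short chat-style generation locally. The prompt, sampling parameters, and use of the tokenizer's chat template are illustrative assumptions rather than settings documented for this model.

  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "Replete-AI/L3.1-Pneuma-8B"
  tokenizer = AutoTokenizer.from_pretrained(model_id)
  model = AutoModelForCausalLM.from_pretrained(
      model_id,
      torch_dtype=torch.bfloat16,   # assumption: bf16 to fit an 8B model on a single modern GPU
      device_map="auto",
  )

  # Build a chat prompt with the tokenizer's built-in Llama 3.1 chat template.
  messages = [{"role": "user", "content": "Explain what makes a good conversation partner."}]
  inputs = tokenizer.apply_chat_template(
      messages, add_generation_prompt=True, return_tensors="pt"
  ).to(model.device)

  # Generate a response; sampling parameters here are illustrative defaults.
  outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
  print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))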
License
The model is distributed under the Llama 3.1 license, which governs its usage and distribution. Please review the license terms for compliance.