L3-Pneuma-8B
Replete-AI
Introduction
L3-PNEUMA-8B is a fine-tuned version of the Meta-Llama-3-8B model, optimized for conversational use and assistance in small-scale projects. The model prioritizes user experience and is trained on the Sandevistan dataset.
Architecture
The model is built using the Axolotl framework and is based on the Meta-Llama-3-8B architecture. It incorporates several advanced features such as gradient checkpointing and flash attention to enhance performance. The model uses specialized tokens for the beginning and end of text sequences.
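For example, a Llama-3-style tokenizer exposes dedicated begin- and end-of-text special tokens, which can be inspected with the Transformers library. A minimal sketch follows; the repository id is assumed, so substitute the model's actual Hugging Face path.

```python
from transformers import AutoTokenizer

# Assumed repository id; replace with the model's actual Hugging Face path.
tokenizer = AutoTokenizer.from_pretrained("Replete-AI/L3-Pneuma-8B")

# Llama-3-based tokenizers define explicit begin/end-of-text special tokens.
print(tokenizer.bos_token)  # typically <|begin_of_text|>
print(tokenizer.eos_token)  # typically <|end_of_text|>, or <|eot_id|> for chat-tuned variants
```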
Training
Training Hyperparameters
- Learning Rate: 1e-05
- Train Batch Size: 8
- Eval Batch Size: 8
- Seed: 42
- Gradient Accumulation Steps: 16
- Total Train Batch Size: 128
- Optimizer: Adam (betas=(0.9,0.999), epsilon=1e-08)
- LR Scheduler Type: Cosine
- LR Scheduler Warmup Steps: 10
- Training Steps: 743
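The hyperparameters listed above map roughly onto Hugging Face TrainingArguments. The sketch below is only an illustrative mapping, not the actual configuration: the model was trained with Axolotl, whose YAML config differs, and the output directory and bf16 setting are assumptions.

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above onto TrainingArguments.
args = TrainingArguments(
    output_dir="./l3-pneuma-8b",        # hypothetical output path
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=16,     # 8 x 16 = 128 total train batch size
    max_steps=743,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
    gradient_checkpointing=True,        # mentioned under Architecture
    bf16=True,                          # assumption: common for Llama-3 fine-tunes
)
```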
Training Results
The model achieved a loss of 2.7381 on the evaluation dataset.
Framework Versions
- Transformers: 4.45.1
- PyTorch: 2.3.1+cu121
- Datasets: 2.21.0
- Tokenizers: 0.20.1
Guide: Running Locally
To run L3-PNEUMA-8B locally, follow these steps:
- Clone the Repository: Clone the model repository from Hugging Face to your local machine.
- Install Dependencies: Ensure you have Python and the necessary libraries installed (transformers, torch, etc.).
- Load the Model: Use the Hugging Face Transformers library to load the model.
- Run Inference: Implement a script to input text and receive generated text output, as sketched below.
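A minimal inference script might look like the following. The repository id is assumed, and the dtype and device settings should be adjusted to your hardware (device_map="auto" additionally requires the accelerate package).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Replete-AI/L3-Pneuma-8B"  # assumed repo id; replace with the actual path

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits an 8B model on a single modern GPU
    device_map="auto",
)

# Encode a prompt, generate a continuation, and decode only the new tokens.
prompt = "Explain what makes a good conversational assistant."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```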
For optimal performance, use cloud GPU services such as AWS, GCP, or Azure to handle the model's computational requirements.
License
The L3-PNEUMA-8B model is released under the "llama3" license, which sets out the terms and conditions for its use.