Magnum V4 9B
anthracite-org

Introduction
The Magnum V4 9B model is designed for text generation, aiming to replicate the prose quality of the Claude 3 models, specifically Sonnet and Opus. It is a fine-tuned version of the Gemma 2 9B model, optimized for conversational tasks.
Architecture
The model is built on the Gemma 2 9B architecture and runs with the transformers library. It uses the ChatML format for handling conversational data, and its training setup integrates plugins such as LigerPlugin for improved efficiency.
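ChatML wraps each turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of that layout is below; the `to_chatml` helper is illustrative, and in practice the tokenizer's built-in chat template is authoritative:

```python
# Sketch of the ChatML turn format (the helper is illustrative; use the
# tokenizer's apply_chat_template in real code).
def to_chatml(messages):
    parts = []
    for m in messages:
        # Each turn: <|im_start|>{role}\n{content}<|im_end|>
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Open an assistant turn so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful writing assistant."},
    {"role": "user", "content": "Describe a quiet harbor at dawn."},
])
print(prompt)
```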
Training
Magnum V4 9B was trained over 2 epochs on an 8xH100 GPU setup provided by Recursal AI and Featherless AI. The training involved datasets including Anthracite's logs and various instruct datasets, with a focus on conversational fine-tuning. The training framework used was Axolotl.
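Axolotl training runs are driven by a YAML config. The fragment below is an illustrative sketch only; the field values and the dataset id are assumptions, not the published recipe for this model:

```yaml
# Illustrative Axolotl config sketch (values are assumptions,
# not the actual training recipe).
base_model: google/gemma-2-9b
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

chat_template: chatml
datasets:
  - path: anthracite-org/example-instruct-dataset  # hypothetical dataset id
    type: chat_template

num_epochs: 2
gradient_checkpointing: true
plugins:
  - axolotl.integrations.liger.LigerPlugin
```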
Guide: Running Locally
- Setup Environment: Install the necessary libraries, primarily transformers and safetensors.
- Download Model: Retrieve the model files from the Hugging Face repository.
- Configure Model: Use the Axolotl configuration to set up the model, including the tokenizer and model type.
- Run Inference: Use a Python script to load the model and generate text based on input prompts.
- Hardware Recommendations: For optimal performance, consider using a cloud GPU service, such as AWS G4 instances or Google Cloud's GPU offerings.
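The inference step above can be sketched with the transformers library. The repository id below is an assumption (check the model page for the exact id), and loading the 9B weights requires a GPU with sufficient memory:

```python
# Minimal inference sketch with transformers; the model id is an
# assumption — verify it against the Hugging Face repository.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "anthracite-org/magnum-v4-9b"  # assumed repo id

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # place layers on available GPUs
    )
    messages = [{"role": "user", "content": prompt}]
    # The tokenizer's chat template applies the ChatML formatting.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(
        inputs, max_new_tokens=max_new_tokens, do_sample=True, temperature=0.8
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example usage (downloads the weights on first run):
# print(generate("Write a short scene set in a lighthouse."))
```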
License
The model is licensed under the Gemma license, which should be reviewed for restrictions and permissions regarding usage and distribution.