Lumimaid-Magnum-v4-12B-GGUF
bartowski
Introduction
Lumimaid-Magnum-v4-12B-GGUF is a quantized version of the Lumimaid-Magnum model, designed for efficient text generation. The project uses llama.cpp for quantization and provides multiple quantization levels so users can balance performance and quality for their hardware.
Architecture
The model is based on the original Lumimaid-Magnum-v4-12B from Undi95's Hugging Face repository. It employs llama.cpp for quantization, with multiple quantization levels available to suit different performance and quality requirements.
Training
Quantization is performed using llama.cpp's imatrix option, with a calibration dataset provided by the community. The process reduces the precision of the model's weights, shrinking the model size and computational requirements while aiming to retain as much of the original model's quality as possible.
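For readers who want to reproduce a similar quantization themselves, the llama.cpp toolchain exposes this workflow through its llama-imatrix and llama-quantize binaries. The sketch below is illustrative only; the calibration file name, file paths, and the chosen quantization level (Q4_K_M) are placeholders, not the exact settings used for this repository.

# Assumed workflow, not the exact commands used for this repo.
# 1) Compute an importance matrix from a calibration text file.
llama-imatrix -m Lumimaid-Magnum-v4-12B-f16.gguf -f calibration.txt -o imatrix.dat
# 2) Quantize the full-precision GGUF using that importance matrix.
llama-quantize --imatrix imatrix.dat Lumimaid-Magnum-v4-12B-f16.gguf Lumimaid-Magnum-v4-12B-Q4_K_M.gguf Q4_K_M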
Guide: Running Locally
- Install Dependencies: Ensure you have huggingface-cli installed by running:
  pip install -U "huggingface_hub[cli]"
- Download Model Files: Use the CLI to download specific model files:
  huggingface-cli download bartowski/Lumimaid-Magnum-v4-12B-GGUF --include "Lumimaid-Magnum-v4-12B-Q4_K_M.gguf" --local-dir ./
- Select an Appropriate File: Choose the model file that fits your hardware, considering both RAM and VRAM. For maximum speed, pick a quantization that fits entirely in your GPU's VRAM; a minimal run command is sketched after this list.
- Cloud GPU Options: Consider using cloud GPU services like AWS, Google Cloud, or Azure for enhanced performance, especially if local resources are limited.
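As a rough illustration of the "fit it in VRAM" advice from the file-selection step, the downloaded GGUF can be run directly with llama.cpp's llama-cli binary. The prompt, context size, and file path below are placeholder values; -ngl 99 simply asks llama.cpp to offload as many layers as possible to the GPU.

# Minimal sketch: run the downloaded quant with llama.cpp (paths and values are examples).
llama-cli -m ./Lumimaid-Magnum-v4-12B-Q4_K_M.gguf -ngl 99 -c 4096 -p "Hello"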
License
The model and its associated files are available under terms that allow for both academic and commercial use. For more detailed licensing information, refer to the original repository or contact the creator.