gemma-2-9b-it-abliterated-GGUF
by bartowski

Introduction
This documentation provides details on the gemma-2-9b-it-abliterated model, designed for text generation. The model is distributed in a range of quantized GGUF files and is suitable for conversational AI tasks. It is built on the abliterated base model by IlyaGusev.
Architecture
The quantized files were produced with the llama.cpp framework, using its imatrix (importance matrix) option. This yields a range of quantization levels, such as Q8_0 and Q6_K, which trade file size against output quality.
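As a rough sketch of this workflow (not the exact commands used for this repository; file names are placeholders), an imatrix-aware quantization with llama.cpp's llama-quantize tool looks like this:

    # Quantize a full-precision GGUF to Q4_K_M, guided by a precomputed importance matrix
    llama-quantize --imatrix imatrix.dat \
        gemma-2-9b-it-abliterated-f16.gguf \
        gemma-2-9b-it-abliterated-Q4_K_M.gguf \
        Q4_K_M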
Training
No additional training was performed for this release. Instead, the imatrix quantization relies on a calibration dataset, and community members contributed to curating that calibration data.
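The importance matrix itself is computed by running the full-precision model over the calibration text. A minimal sketch with llama.cpp's llama-imatrix tool, again assuming placeholder file names:

    # Compute an importance matrix from a calibration dataset
    llama-imatrix -m gemma-2-9b-it-abliterated-f16.gguf \
        -f calibration_data.txt \
        -o imatrix.dat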
Guide: Running Locally
To run the model locally, follow these steps:
1. Install dependencies: Ensure you have huggingface_hub[cli] installed:

    pip install -U "huggingface_hub[cli]"
2. Download model files: Use huggingface-cli to download a specific quantization file (an alternative download pattern is shown after these steps):

    huggingface-cli download bartowski/gemma-2-9b-it-abliterated-GGUF --include "gemma-2-9b-it-abliterated-Q4_K_M.gguf" --local-dir ./
3. Choose the right quantization: Depending on your hardware capabilities (RAM/VRAM), select a quantization file that fits your needs. For speed, prioritize a file that fits entirely in your GPU's VRAM; a common rule of thumb is to pick a file 1-2 GB smaller than the total VRAM (a quick way to check VRAM is shown below).
4. Run the model: Load the downloaded GGUF file in your preferred llama.cpp-based environment (see the sketch after this list). If you are using an ARM chip, consider the Q4_0_X_X quants for better performance.
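To fetch a different quantization, adjust the --include pattern. For example, assuming the repository follows the same naming scheme for its Q6_K file:

    huggingface-cli download bartowski/gemma-2-9b-it-abliterated-GGUF \
        --include "gemma-2-9b-it-abliterated-Q6_K.gguf" --local-dir ./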
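To check how much VRAM is available on an NVIDIA GPU, you can query the driver directly (nvidia-smi ships with the NVIDIA driver):

    # Report the GPU name and total VRAM
    nvidia-smi --query-gpu=name,memory.total --format=csv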
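As a minimal sketch of an interactive session using llama.cpp's llama-cli (the flags shown are standard llama-cli options; adjust -ngl to however many layers fit on your GPU):

    # Start an interactive chat, offloading all layers to the GPU
    llama-cli -m ./gemma-2-9b-it-abliterated-Q4_K_M.gguf \
        -ngl 99 \
        -cnv \
        -p "You are a helpful assistant."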
Cloud GPUs
If your local hardware is insufficient, cloud-based GPU services such as Google Cloud, AWS, or Azure can run the model with better performance.
License
The model is released under the Gemma license. Please review the license terms to ensure compliance with usage guidelines.