bartowski/Q2.5-MS-Mistoria-72b-v2-GGUF

Introduction
The Q2.5-MS-Mistoria-72b-v2-GGUF repository provides llama.cpp imatrix quantizations of a text generation model. The quantizations come in a range of formats that trade file size against output quality, so the model can run on a variety of hardware configurations.
Architecture
The quantizations are derived from the original Steelskull/Q2.5-MS-Mistoria-72b-v2 model. They were produced with llama.cpp using its imatrix (importance matrix) option, which weights quantization error by activation statistics gathered from a calibration dataset.
Training
The quantizations were created using the imatrix option with a calibration dataset (linked from the model repository). The resulting files span a range of quantization levels, from extremely high quality to very low quality, to suit different use cases and hardware capabilities.
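For context, the following is a minimal sketch of how imatrix quants are typically produced with llama.cpp's stock tools. The file names and calibration text are placeholders, and this is not necessarily the exact pipeline used for this repository:

```bash
# Convert the original Hugging Face model to a full-precision GGUF
python convert_hf_to_gguf.py ./Q2.5-MS-Mistoria-72b-v2 --outfile model-f16.gguf

# Collect activation statistics (the importance matrix) from a calibration text
./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

# Quantize, weighting per-tensor error by the importance matrix
./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```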
Guide: Running Locally
- Installation: Ensure you have huggingface-cli installed:

  ```bash
  pip install -U "huggingface_hub[cli]"
  ```
- Downloading Models: Use huggingface-cli to download the desired model file, for example:

  ```bash
  huggingface-cli download bartowski/Q2.5-MS-Mistoria-72b-v2-GGUF --include "Q2.5-MS-Mistoria-72b-v2-Q4_K_M.gguf" --local-dir ./
  ```
- Hardware Requirements: Check your available RAM and VRAM to select the appropriate quant file. As a rule of thumb, pick a file 1-2 GB smaller than your total available memory, leaving headroom for the context cache and runtime overhead (see the sketch after this list).
- Suggested Environment: Running the model on cloud GPUs, such as those from AWS, GCP, or Azure, can greatly improve performance, especially for the larger quant files.
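One quick way to compare a quant's file size against the memory actually available, assuming a Linux system; the nvidia-smi line applies only to NVIDIA GPUs, and the file name is a placeholder:

```bash
# Size of the downloaded quant file
ls -lh Q2.5-MS-Mistoria-72b-v2-Q4_K_M.gguf

# Available system RAM
free -h

# Total and used VRAM on NVIDIA GPUs
nvidia-smi --query-gpu=memory.total,memory.used --format=csv
```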
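Once downloaded, a quant can be run with llama.cpp's command-line client. A minimal sketch, assuming llama.cpp is built locally; the -ngl value (number of layers offloaded to the GPU) should be tuned to your VRAM:

```bash
./llama-cli -m ./Q2.5-MS-Mistoria-72b-v2-Q4_K_M.gguf \
  -p "Write a short story about a lighthouse keeper." \
  -ngl 40 -c 4096
```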
License
This model and its quantizations are shared under the terms set by the original licensors. Users should refer to the model repository on Hugging Face for specific licensing details.