Q2.5-MS-Mistoria-72b-v2-GGUF

bartowski

Introduction

Q2.5-MS-Mistoria-72b-v2-GGUF is a collection of llama.cpp imatrix quantizations of a text generation model. It offers a range of quantization formats so the model can run with high quality across different hardware configurations.

Architecture

The quantizations are derived from the base model Steelskull/Q2.5-MS-Mistoria-72b-v2. Quantization is performed with llama.cpp using its imatrix (importance matrix) option, which uses activation statistics gathered from a calibration dataset to guide the quantization.

Training

The quantizations were created using llama.cpp's imatrix option with a calibration dataset (linked from the original model card). This yields a range of quantization levels, from extremely high quality to very low quality, to suit different use cases and hardware capabilities.
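For reference, imatrix quants are typically produced in two steps with llama.cpp's tools: first an importance matrix is computed from calibration text, then the model is quantized using that matrix. A minimal sketch follows; the file names are placeholders, not actual artifacts from this repository (older llama.cpp builds name these binaries imatrix and quantize):

    # 1) Compute an importance matrix from a calibration text file
    ./llama-imatrix -m model-f16.gguf -f calibration.txt -o imatrix.dat

    # 2) Quantize using that matrix (here to Q4_K_M)
    ./llama-quantize --imatrix imatrix.dat model-f16.gguf model-Q4_K_M.gguf Q4_K_M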

Guide: Running Locally

  1. Installation: Ensure you have huggingface-cli installed:
    pip install -U "huggingface_hub[cli]"
    
  2. Downloading Models: Use huggingface-cli to download your desired quant file (see the note on split files after this list), for example:
    huggingface-cli download bartowski/Q2.5-MS-Mistoria-72b-v2-GGUF --include "Q2.5-MS-Mistoria-72b-v2-Q4_K_M.gguf" --local-dir ./
    
  3. Hardware Requirements: Determine your available RAM and VRAM to select an appropriate quant file; aim for a file 1-2 GB smaller than your total available memory (a quick way to check is shown after this list).
  4. Suggested Environment: Running the model with cloud GPUs like those from AWS, GCP, or Azure can greatly enhance performance, especially for larger quant files.
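Note on split files (step 2): quants larger than Hugging Face's 50 GB single-file limit are typically uploaded as a directory of split files rather than one .gguf. In that case, point --include at the folder pattern and download everything into a local directory; the directory name below is illustrative:

    huggingface-cli download bartowski/Q2.5-MS-Mistoria-72b-v2-GGUF --include "Q2.5-MS-Mistoria-72b-v2-Q8_0/*" --local-dir ./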
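For step 3, a quick way to check available memory on a Linux machine (assuming an NVIDIA GPU with drivers installed):

    # System RAM
    free -h
    # GPU VRAM (NVIDIA only)
    nvidia-smi --query-gpu=memory.total,memory.free --format=csv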
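Once a quant is downloaded, it can be run directly with llama.cpp. A minimal sketch, where the prompt and the -ngl value (number of layers offloaded to the GPU) are placeholders to adjust for your hardware (older llama.cpp builds name this binary main):

    ./llama-cli -m ./Q2.5-MS-Mistoria-72b-v2-Q4_K_M.gguf -p "Hello, how are you?" -ngl 40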

License

The quantizations inherit the license of the original Steelskull/Q2.5-MS-Mistoria-72b-v2 model. Users should refer to the model repository for specific licensing details.
