DeepSeek-Coder-V2-Lite-Instruct-GGUF

bartowski

Introduction

DeepSeek-Coder-V2-Lite-Instruct-GGUF is a text generation model quantized with llama.cpp, reducing model size and memory footprint while preserving as much output quality as possible. It ships in a range of quantization configurations to suit different hardware.

Architecture

The model is based on the DeepSeek-Coder-V2-Lite-Instruct architecture. Quantizations were performed using llama.cpp, specifically the release b3166. The model offers multiple quantization formats ranging from Q8_0 to IQ2_XS, each providing different balances of quality, size, and performance.

Training

Quantizations utilize llama.cpp's imatrix option with a calibration dataset (linked from the original model card). The importance matrix guides which weights retain higher precision during quantization, so the smaller files lose less quality than naive quantization would, and the resulting files can run on a wider range of hardware.

Guide: Running Locally

  1. Installation: Ensure you have huggingface-cli installed:

    pip install -U "huggingface_hub[cli]"
    
  2. Download a Model File: Use the following command to download a specific quantized file:

    huggingface-cli download bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF --include "DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf" --local-dir ./
    

    Files larger than 50 GB are split into multiple parts; download the whole folder:

    huggingface-cli download bartowski/DeepSeek-Coder-V2-Lite-Instruct-GGUF --include "DeepSeek-Coder-V2-Lite-Instruct-Q8_0.gguf/*" --local-dir DeepSeek-Coder-V2-Lite-Instruct-Q8_0
    
  3. Choose the Right File: Determine the quantization file based on your hardware's RAM and VRAM. For maximum speed, choose a file size slightly smaller than your GPU's VRAM. For maximum quality, combine system RAM and VRAM, and select a file slightly smaller than this total.
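The size comparison above can be sketched as a small helper that picks the largest quantized file fitting a memory budget, minus a little headroom for the KV cache and context. The file sizes below are illustrative placeholders, not the repository's actual sizes.

```python
# Hedged sketch: choose the largest quant file that fits a memory budget.
def pick_quant(files, budget_gb, headroom_gb=1.0):
    """Return the name of the largest file within budget_gb minus headroom, or None."""
    fitting = [(size, name) for name, size in files.items()
               if size <= budget_gb - headroom_gb]
    return max(fitting)[1] if fitting else None

# Hypothetical sizes in GB for a few of the offered quantizations.
files = {"Q8_0": 16.7, "Q6_K": 14.1, "Q4_K_M": 10.4, "IQ2_XS": 6.0}

print(pick_quant(files, budget_gb=12))  # a 12 GB GPU -> "Q4_K_M"
```

For maximum quality, pass your combined system RAM plus VRAM as the budget instead of VRAM alone, as described in the step above.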

  4. Quantization Types:

    • K-quants: Recommended for general use.
    • I-quants: Better for performance with specific configurations (e.g., sub-Q4 models using cuBLAS or rocBLAS).

    For cloud GPU options, consider platforms like AWS, Google Cloud, or Azure with appropriate GPU instances.
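The K-quant vs. I-quant distinction can be read off the file suffix. A minimal sketch, assuming the naming convention used in this repository (IQ-prefixed names are I-quants, Q*_K* names are K-quants, and plain Q*_0 names are legacy quants):

```python
import re

def quant_family(suffix):
    """Classify a GGUF quantization suffix by its naming convention."""
    if suffix.startswith("IQ"):
        return "I-quant"          # e.g. IQ2_XS
    if re.match(r"Q\d+_K", suffix):
        return "K-quant"          # e.g. Q4_K_M
    return "legacy"               # e.g. Q8_0

for q in ["Q4_K_M", "IQ2_XS", "Q8_0"]:
    print(q, "->", quant_family(q))
```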

License

The model is released under the DeepSeek license. For more details, refer to the LICENSE file associated with the model.