DeepSeek-V2.5-1210-int4-sym-inc

OPEA

Introduction

DeepSeek-V2.5-1210-INT4-SYM-INC is an int4 quantized model using symmetric quantization with a group size of 128, produced with Intel's auto-round algorithm. It targets scenarios where computational efficiency is crucial, and the checkpoint can be loaded in the AutoGPTQ format for better performance.

Architecture

The model is based on deepseek-ai's DeepSeek-V2.5-1210 architecture; Intel's auto-round algorithm is used to optimize the weight-rounding values during quantization. Int4 inference is supported on both CPU and CUDA, so the model runs efficiently across a range of hardware configurations.

Training

DeepSeek-V2.5-1210-INT4-SYM-INC was not trained from scratch; the quantization was calibrated on the NeelNanda/pile-10k dataset. The process focused on preserving accuracy under int4 precision, allowing the model to perform well under resource constraints.
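The quantization recipe described above can be sketched with the auto-round API. The argument names (`bits`, `group_size`, `sym`, `dataset`) follow current versions of the `auto_round` package and mirror this card's settings, but treat this as an illustrative sketch rather than the authors' exact command; verify the API against your installed version:

```python
# Sketch of int4 symmetric quantization with auto-round, using the settings
# described above (4-bit, group size 128, symmetric, pile-10k calibration).

def autoround_kwargs() -> dict:
    """Quantization settings matching this model card."""
    return {
        "bits": 4,                        # int4 weights
        "group_size": 128,                # per-group quantization scales
        "sym": True,                      # symmetric quantization
        "dataset": "NeelNanda/pile-10k",  # calibration data
    }

RUN_QUANTIZATION = False  # set True to actually run (downloads the full model)

if RUN_QUANTIZATION:
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from auto_round import AutoRound

    model_name = "deepseek-ai/DeepSeek-V2.5-1210"
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto")
    tokenizer = AutoTokenizer.from_pretrained(model_name)

    autoround = AutoRound(model, tokenizer, **autoround_kwargs())
    autoround.quantize()
    # Export in AutoGPTQ-compatible format, as noted in the introduction.
    autoround.save_quantized("./DeepSeek-V2.5-1210-int4-sym-inc", format="auto_gptq")
```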

Guide: Running Locally

  1. Clone the Repository:

    git clone https://huggingface.co/OPEA/DeepSeek-V2.5-1210-int4-sym-inc
    cd DeepSeek-V2.5-1210-int4-sym-inc
    git checkout 6d3d2cf
    
  2. Install Required Packages: Ensure your Python environment has torch, transformers, and auto_round installed:

    pip install torch transformers auto-round
    

  3. Run Inference: Use the provided Python script to load the model and run inference tasks. Adjust max_memory settings based on your hardware.

  4. Cloud GPU Suggestions: For improved performance, consider using cloud-based GPU services such as AWS EC2, Google Cloud Platform, or Azure.
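The inference step above can be sketched as follows. The `max_memory` values are placeholders to adjust for your hardware, and the chat-template prompt is a hypothetical example; the load keyword arguments follow the standard `transformers` API:

```python
# Sketch of loading the quantized checkpoint for inference.
# The memory caps below are illustrative -- tune them to your hardware,
# since this model is large even at int4 precision.

def load_kwargs(gpu_mem: str = "75GiB", cpu_mem: str = "200GiB") -> dict:
    """Build from_pretrained kwargs; the max_memory values are placeholders."""
    return {
        "torch_dtype": "auto",
        "device_map": "auto",                      # split across GPU(s) and CPU
        "max_memory": {0: gpu_mem, "cpu": cpu_mem},
        "trust_remote_code": True,                 # DeepSeek-V2 ships custom code
    }

RUN_INFERENCE = False  # set True once the model has been downloaded

if RUN_INFERENCE:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "OPEA/DeepSeek-V2.5-1210-int4-sym-inc"
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs())

    messages = [{"role": "user",
                 "content": "Explain int4 quantization in one sentence."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```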

License

Use of this model is subject to the license terms of the original model. Users must ensure compliance with all applicable license agreements and should seek legal advice where necessary for commercial use. The developers are not liable for third-party use.
