MiniCPM-V-2_6-gguf

openbmb

Introduction

MiniCPM-V 2.6 is a multimodal (vision-language) model from OpenBMB that takes image and text inputs. This page describes how to download the original PyTorch checkpoint, convert it to the GGUF format used by llama.cpp, optionally quantize it, and run inference locally.

Architecture

GGUF is a model file format used by llama.cpp, not a neural architecture. MiniCPM-V 2.6 itself pairs a SigLIP-based vision encoder with a Qwen2-7B language model, and the conversion below produces separate GGUF files for the image projector and the language model. The converted model can run in f16 precision or quantized to int4 (Q4_K_M), trading some accuracy for lower memory use depending on computational resources and performance requirements.

Training

Training details are not covered in this documentation. Note that the conversion and quantization steps below only change how the weights are stored (file format and numeric precision); they do not retrain or fine-tune the model.

Guide: Running Locally

  1. Prepare Models and Code:

    • Download the MiniCPM-V-2_6 PyTorch model from Hugging Face (one way to do this is sketched after this step).
    • Clone the OpenBMB fork of llama.cpp and check out the minicpmv-main branch:
      git clone git@github.com:OpenBMB/llama.cpp.git
      cd llama.cpp
      git checkout minicpmv-main
      
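    • One way to fetch the checkpoint is with the huggingface_hub CLI (an assumption: this tool is not part of llama.cpp and must be installed separately):
      pip install -U "huggingface_hub[cli]"
      huggingface-cli download openbmb/MiniCPM-V-2_6 --local-dir ../MiniCPM-V-2_6
      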
  2. Convert PyTorch Model to GGUF:

    • Use provided scripts to convert the model:
      python ./examples/llava/minicpmv-surgery.py -m ../MiniCPM-V-2_6
      python ./examples/llava/minicpmv-convert-image-encoder-to-gguf.py -m ../MiniCPM-V-2_6 --minicpmv-projector ../MiniCPM-V-2_6/minicpmv.projector --output-dir ../MiniCPM-V-2_6/ --image-mean 0.5 0.5 0.5 --image-std 0.5 0.5 0.5 --minicpmv_version 3
      python ./convert_hf_to_gguf.py ../MiniCPM-V-2_6/model
      
    • Quantize to an int4 (Q4_K_M) version if needed; a quick check of the resulting files is sketched after this step:
      ./llama-quantize ../MiniCPM-V-2_6/model/ggml-model-f16.gguf ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf Q4_K_M
      
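    • Optionally sanity-check the conversion outputs; the paths are the ones used by the commands above, and the roughly 4x size reduction of Q4_K_M relative to f16 is an expectation, not a guarantee:
      ls -lh ../MiniCPM-V-2_6/mmproj-model-f16.gguf
      ls -lh ../MiniCPM-V-2_6/model/ggml-model-f16.gguf ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf
      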
  3. Build and Run:

    • On Linux or macOS, build the CLI tool:
      make
      make llama-minicpmv-cli
      
    • Run inference with the f16 model as shown here; an int4 variant is sketched below:
      ./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-f16.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"
      
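    • A minimal sketch of the int4 run, assuming the Q4_K_M file produced by llama-quantize above (all other flags are kept from the f16 command):
      ./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf -c 4096 --temp 0.7 --top-p 0.8 --top-k 100 --repeat-penalty 1.05 --image xx.jpg -p "What is in the image?"
      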
  4. Cloud GPUs:

    • Consider using cloud GPU services such as AWS, Google Cloud, or Azure for running the model if local resources are insufficient. These platforms provide powerful GPUs that can accelerate model inference.
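    • On a machine with an NVIDIA GPU, a hedged sketch of a CUDA build with layer offload (the Makefile flag name varies across llama.cpp revisions, LLAMA_CUDA=1 on some branches and GGML_CUDA=1 on others, so check the branch's README; -ngl sets how many layers to offload to the GPU):
      make LLAMA_CUDA=1 llama-minicpmv-cli
      ./llama-minicpmv-cli -ngl 99 -m ../MiniCPM-V-2_6/model/ggml-model-Q4_K_M.gguf --mmproj ../MiniCPM-V-2_6/mmproj-model-f16.gguf --image xx.jpg -p "What is in the image?"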

License

License information is not included in this summary. Verify the licensing terms on the official Hugging Face model page or in the repository before use.
