QVQ-72B-Preview-4bit

mlx-community

Introduction

The QVQ-72B-Preview-4bit model is a 4-bit quantized variant of the Qwen/QVQ-72B-Preview model, converted to the MLX format using mlx-vlm version 0.1.6. It is designed for image-text-to-text processing and supports conversational and chat use. The model is compatible with the transformers ecosystem and is optimized for on-device inference with Apple's MLX framework.
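To illustrate why the 4-bit conversion matters, here is a back-of-the-envelope estimate of the weight memory for a 72B-parameter model at different precisions (this ignores the small per-group scale/zero-point overhead that quantization formats typically add):

```python
# Rough weight-memory estimate for a 72B-parameter model.
# 4-bit quantization stores ~0.5 bytes per parameter versus
# 2 bytes for fp16 (overheads like quantization scales ignored).
PARAMS = 72e9

def weight_gb(bits_per_param):
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16_gb = weight_gb(16)  # ~144 GB
q4_gb = weight_gb(4)     # ~36 GB
print(f"fp16: {fp16_gb:.0f} GB, 4-bit: {q4_gb:.0f} GB")
```

The roughly 4x reduction is what makes running a 72B model feasible on high-memory Apple-silicon machines.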

Architecture

The QVQ-72B-Preview-4bit model is based on the Qwen/Qwen2-VL-72B architecture, a vision-language design that accepts combined image and text input and generates text output. This conversion preserves that architecture while quantizing the weights to 4 bits for MLX inference.

Training

Training details are not included in this card. Refer to the original Qwen/QVQ-72B-Preview model card for in-depth training information.

Guide: Running Locally

  1. Install MLX-VLM: Ensure you have the latest version of the mlx-vlm package installed.

    pip install -U mlx-vlm
    
  2. Generate Text: Use the following command to generate text using the model:

    python -m mlx_vlm.generate --model mlx-community/QVQ-72B-Preview-4bit --max-tokens 100 --temp 0.0
    
  3. Cloud GPUs: For optimal performance with large models, consider cloud GPU services such as AWS EC2 with NVIDIA GPUs, Google Cloud Compute Engine, or Azure GPU instances.
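The CLI invocation in step 2 can also be assembled programmatically, which is convenient for batch jobs. Note that the `--prompt` and `--image` flags below are assumptions based on mlx-vlm's CLI conventions; run `python -m mlx_vlm.generate --help` to confirm the flags for your installed version:

```python
# Sketch: building the mlx_vlm.generate command line for subprocess use.
# --prompt and --image are assumed flags; verify against your mlx-vlm version.
import shlex

def build_generate_cmd(model, prompt, image=None, max_tokens=100, temp=0.0):
    """Return the argument list for invoking mlx_vlm.generate."""
    cmd = [
        "python", "-m", "mlx_vlm.generate",
        "--model", model,
        "--max-tokens", str(max_tokens),
        "--temp", str(temp),
        "--prompt", prompt,
    ]
    if image:
        cmd += ["--image", image]
    return cmd

cmd = build_generate_cmd(
    "mlx-community/QVQ-72B-Preview-4bit",
    "Describe this image.",
    image="photo.jpg",
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run(cmd)`.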

License

The QVQ-72B-Preview-4bit model is distributed under the Qwen license. For detailed licensing terms, refer to the license document.
