mlx-community/QVQ-72B-Preview-3bit
Introduction
The QVQ-72B-Preview-3bit model, hosted by the MLX Community, is a 3-bit quantized variant of the original Qwen/QVQ-72B-Preview model, converted to the MLX format. It is designed for image-text-to-text tasks, generating textual responses conditioned on both images and text prompts.
Architecture
The model is based on the Qwen2-VL-72B architecture, utilizing the MLX library and the Transformers library for efficient text generation. It supports image-text-to-text pipelines and is optimized for conversational applications.
Training
The model was converted using mlx-vlm version 0.1.6. Specific training details are not provided in the model card; the conversion targets high-performance inference rather than further training.
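A back-of-envelope calculation shows why 3-bit quantization matters for a model of this size: the raw weight storage drops from roughly 144 GB at 16-bit precision to about 27 GB at 3 bits (ignoring quantization metadata such as scales and biases, which add some overhead).

```python
# Rough weight-memory estimate for a 72B-parameter model at different
# precisions. This ignores quantization metadata (group scales/zeros),
# so real on-disk sizes will be somewhat larger.
PARAMS = 72e9

def weight_gb(bits_per_param: float) -> float:
    """Gigabytes needed to store PARAMS weights at the given bit width."""
    return PARAMS * bits_per_param / 8 / 1e9

fp16_gb = weight_gb(16)  # 16-bit baseline
q3_gb = weight_gb(3)     # 3-bit quantized

print(f"fp16: {fp16_gb:.0f} GB, 3-bit: {q3_gb:.0f} GB")
```

This is why the 3-bit variant can fit on a single high-memory Apple silicon machine, whereas the unquantized weights cannot.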
Guide: Running Locally
To run the QVQ-72B-Preview-3bit model locally, follow these steps:
- Install MLX-VLM:

  pip install -U mlx-vlm

- Generate Text:

  Run the model using the following command:

  python -m mlx_vlm.generate --model mlx-community/QVQ-72B-Preview-3bit --max-tokens 100 --temp 0.0
Cloud GPU Recommendation
For optimal performance, especially when processing large datasets, consider using cloud-based GPUs. Platforms such as AWS, Google Cloud Platform, or Azure provide scalable GPU resources suitable for intensive machine learning tasks.
License
The model is distributed under the "qwen" license. For full licensing details, refer to the license link on the model page.