PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF

Pinkstack

Introduction

PARM-V2-QwQ-Qwen-2.5-o1-3B-GGUF is a high-quality text generation model released by Pinkstack. It excels at reasoning, math, and coding, and is designed to run efficiently on a variety of devices. The model supports both English and Chinese and is distributed in the GGUF file format.

Architecture

The model is based on the Qwen 2.5 3B architecture with additional reasoning-focused training. It is distributed in two quantizations:

  • Q4: Suitable for edge devices such as high-end phones or laptops, offering a compact file size with decent output quality.
  • Q8: Ideal for machines with a modern GPU such as an RTX 3080, providing very high-quality responses at a slightly slower speed than Q4.
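The Q4-vs-Q8 trade-off above can be sketched as a simple memory heuristic. This is a minimal illustration, not an official sizing rule: the threshold and the approximate file sizes in the comments are assumptions, and the suffixes `Q8_0` / `Q4_K_M` are common GGUF quant names that may differ from the actual filenames in the repository.

```python
# Heuristic sketch: choose a GGUF quantization from available memory.
# Thresholds and sizes are illustrative assumptions, not official numbers.
def pick_quant(free_memory_gb: float) -> str:
    """Prefer Q8 when memory is plentiful, otherwise fall back to Q4.

    A 3B-parameter model is very roughly ~3.3 GB at Q8 and ~2 GB at Q4,
    plus headroom for the KV cache and the runtime itself.
    """
    return "Q8_0" if free_memory_gb >= 6 else "Q4_K_M"

print(pick_quant(16.0))  # high-end desktop
print(pick_quant(4.0))   # phone / thin laptop
```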

Training

The model was finetuned from Pinkstack's PARM-V1.5-QwQ-Qwen-2.5-o1-3B-VLLM and trained on the datasets available here. Training used Unsloth and Hugging Face's TRL library.

Guide: Running Locally

  1. Setup Environment: Install a runtime that supports the GGUF file format (for example, llama.cpp), along with any other dependencies such as the Hugging Face transformers library.
  2. Download Model: Obtain the model files from Pinkstack's repository.
  3. Select Quantization: Choose between Q4 and Q8 based on your device's capabilities.
  4. Run Model: Load the model in your local environment. For the best performance, use a machine with a dedicated GPU such as an NVIDIA RTX 3080.
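The steps above can be tied together with llama.cpp's `llama-cli` binary. The sketch below only builds the command line rather than executing it, so the exact model filename and the `-ngl` layer count are assumptions you should adjust for your downloaded file and hardware.

```python
# Sketch: assemble a llama.cpp CLI invocation for a local GGUF model.
# The filename below is hypothetical -- use the actual file you downloaded.
import shlex

def build_llama_cpp_command(model_path: str, prompt: str,
                            ctx_size: int = 4096,
                            n_gpu_layers: int = 0) -> str:
    """Return a llama-cli command string for a local GGUF model."""
    args = [
        "llama-cli",
        "-m", model_path,          # path to the GGUF file
        "-p", prompt,              # prompt text
        "-c", str(ctx_size),       # context window size
        "-ngl", str(n_gpu_layers), # layers to offload to the GPU, if any
    ]
    return " ".join(shlex.quote(a) for a in args)

cmd = build_llama_cpp_command(
    "PARM-V2-QwQ-Qwen-2.5-o1-3B.Q4_K_M.gguf",  # assumed filename
    "Solve step by step: what is 17 * 24?",
    n_gpu_layers=32,
)
print(cmd)
```

Run the printed command in a shell where llama.cpp is installed; set `n_gpu_layers=0` for CPU-only machines.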

License

The model is released under the Apache-2.0 license, allowing for wide use and distribution.
