QuantFactory

SmallThinker-3B-Preview-GGUF

Introduction

SmallThinker-3B-Preview-GGUF is a quantized version of the SmallThinker-3B-Preview model, which is itself fine-tuned from Qwen2.5-3B-Instruct. It is optimized for edge deployment and can serve as a draft model for larger models such as QwQ-32B-Preview in speculative decoding, offering significant speed improvements.
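To make the draft-model role concrete, here is a minimal toy sketch of the speculative-decoding loop: a small draft model proposes several tokens cheaply, and the large target model verifies them, keeping the accepted prefix. The two "models" below are stand-in functions invented for illustration, not real LLM calls.

```python
# Toy sketch of speculative decoding. draft_model and target_model_accepts
# are hypothetical stand-ins, not real model APIs.

def draft_model(prefix, k=4):
    # Hypothetical cheap draft: proposes k candidate tokens.
    return [f"tok{len(prefix) + i}" for i in range(k)]

def target_model_accepts(prefix, token):
    # Hypothetical verifier: the large model checks each drafted token.
    # Here every token is accepted except an arbitrary mismatch, for illustration.
    return token != "tok6"

def speculative_step(prefix):
    """Draft k tokens, then let the target model verify them.

    Accepted tokens are appended; generation stops extending the draft
    at the first rejection and falls back to the target model there.
    """
    drafted = draft_model(prefix)
    accepted = []
    for tok in drafted:
        if target_model_accepts(prefix + accepted, tok):
            accepted.append(tok)
        else:
            break
    return prefix + accepted

out = speculative_step(["tok0", "tok1"])
print(out)  # ['tok0', 'tok1', 'tok2', 'tok3', 'tok4', 'tok5']
```

The speedup comes from the target model validating several drafted tokens per forward pass instead of generating one token at a time.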

Architecture

The model is fine-tuned from the base model Qwen2.5-3B-Instruct. It is designed to operate efficiently on resource-constrained devices due to its compact size.
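The edge-friendliness mostly comes from quantization. As a rough, back-of-the-envelope sketch (the bits-per-weight figures and the exact parameter count below are approximate assumptions, not values from the model card), here is why a ~3B-parameter GGUF fits comfortably on constrained hardware:

```python
# Back-of-the-envelope GGUF file sizes for a ~3B-parameter model.
# Bits-per-weight figures are approximate assumptions for common quant types.
PARAMS = 3.1e9  # rough parameter count for a "3B" model (assumption)

approx_bits_per_weight = {
    "F16": 16.0,      # unquantized half precision
    "Q8_0": 8.5,      # 8-bit quantization plus scale overhead
    "Q4_K_M": 4.8,    # ~4-bit effective, incl. scales/metadata
}

for name, bpw in approx_bits_per_weight.items():
    gb = PARAMS * bpw / 8 / 1e9
    print(f"{name}: ~{gb:.1f} GB")
# Prints roughly: F16 ~6.2 GB, Q8_0 ~3.3 GB, Q4_K_M ~1.9 GB
```

A 4-bit quant therefore cuts the on-disk and in-memory footprint to roughly a third of half precision, which is what makes edge deployment practical.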

Training

The model was trained using 8 H100 GPUs and a global batch size of 16. Training was conducted in two phases:

  1. First Phase: Trained on the PowerInfer/QWQ-LONGCOT-500K dataset for 1.5 epochs.
  2. Second Phase: Continued training on both the PowerInfer/QWQ-LONGCOT-500K and PowerInfer/LONGCOT-Refine datasets for an additional 2 epochs.

Key training parameters include a learning rate of 1.0e-5, cosine learning rate scheduling, a warmup ratio of 0.02, and bf16 precision.
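The stated schedule (base learning rate 1.0e-5, warmup ratio 0.02, cosine decay) can be sketched as a small function. The total step count below is a hypothetical placeholder, and the exact warmup/decay formulas used in training are assumptions based on common practice:

```python
import math

# Sketch of the stated schedule: lr 1.0e-5, cosine decay, warmup ratio 0.02.
# Exact formulas are assumptions; total_steps is a hypothetical placeholder.
BASE_LR = 1.0e-5
WARMUP_RATIO = 0.02

def lr_at(step, total_steps):
    """Linear warmup over the first 2% of steps, then cosine decay to 0."""
    warmup_steps = max(1, int(total_steps * WARMUP_RATIO))
    if step < warmup_steps:
        return BASE_LR * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 1000  # hypothetical total optimizer steps
print(lr_at(0, total))       # small warmup value
print(lr_at(19, total))      # peak: 1.0e-5 at the end of warmup
print(lr_at(total - 1, total))  # near zero at the end of training
```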

Guide: Running Locally

Basic Steps

  1. Setup Environment: Ensure you have Python and necessary dependencies installed.
  2. Clone Repository: Download the model repository from Hugging Face.
  3. Install Dependencies: Use pip install -r requirements.txt to install required packages.
  4. Run Model: Execute the model using a suitable script or notebook.
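The steps above might look like the following on the command line. These commands are a hedged sketch: the GGUF filename is an assumption (check the repository's file list for the exact name), and llama.cpp must be built or installed separately.

```shell
# Hypothetical commands; verify the exact GGUF filename on the model page.

# 1-3. Set up environment and fetch one quantized file from Hugging Face
pip install -U "huggingface_hub[cli]"
huggingface-cli download QuantFactory/SmallThinker-3B-Preview-GGUF \
  SmallThinker-3B-Preview.Q4_K_M.gguf --local-dir ./models

# 4. Run the model with llama.cpp (built separately)
./llama-cli -m ./models/SmallThinker-3B-Preview.Q4_K_M.gguf \
  -p "Explain speculative decoding in one paragraph." -n 256
```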

Suggested Cloud GPUs

Consider using cloud-based GPUs like NVIDIA A100 or H100 for optimal performance.

License

SmallThinker-3B-Preview-GGUF is distributed under the [applicable license]; users must adhere to the terms and conditions of that license agreement.