SmallThinker-3B-Preview-GGUF
QuantFactory/SmallThinker-3B-Preview-GGUF
Introduction
SmallThinker-3B-Preview-GGUF is a quantized version of the SmallThinker-3B-Preview model, which is derived from Qwen2.5-3B-Instruct. It is optimized for edge deployment and can also serve as a draft model for larger models such as QwQ-32B-Preview, offering significant decoding speedups.
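The draft-model role mentioned above refers to speculative decoding: the small model proposes tokens cheaply and the large target model verifies them. A toy sketch of the standard accept/reject rule (the distributions below are made up for illustration and are not from the model card):

```python
import random

def accept_draft_token(p_draft, p_target, token, rng):
    """Standard speculative-decoding acceptance test:
    keep the draft token with probability min(1, p_target / p_draft)."""
    ratio = p_target.get(token, 0.0) / p_draft[token]
    return rng.random() < min(1.0, ratio)

rng = random.Random(0)
draft = {"the": 0.6, "a": 0.4}   # draft model's next-token distribution (toy)
target = {"the": 0.6, "a": 0.4}  # target model happens to agree exactly here
# When the target assigns at least as much probability as the draft,
# the proposed token is always accepted.
print(accept_draft_token(draft, target, "the", rng))  # → True
```

Every accepted draft token saves a full forward pass of the large model, which is where the speedup comes from.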
Architecture
The model is fine-tuned from the base model Qwen2.5-3B-Instruct. It is designed to operate efficiently on resource-constrained devices due to its compact size.
Training
The model was trained using 8 H100 GPUs and a global batch size of 16. Training was conducted in two phases:
- First Phase: Trained on the PowerInfer/QWQ-LONGCOT-500K dataset for 1.5 epochs.
- Second Phase: Continued training with both PowerInfer/QWQ-LONGCOT-500K and PowerInfer/LONGCOT-Refine datasets for an additional 2 epochs.
Key training parameters include a learning rate of 1.0e-5, cosine learning rate scheduling, a warmup ratio of 0.02, and bf16 precision.
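Those scheduler settings trace a warmup-then-cosine learning-rate curve, which can be sketched as follows (the total step count is illustrative; only the peak learning rate, warmup ratio, and schedule shape come from the card):

```python
import math

def lr_at(step, total_steps, base_lr=1.0e-5, warmup_ratio=0.02):
    """Linear warmup for the first warmup_ratio of steps,
    then cosine decay from base_lr down to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

# With 1000 total steps, warmup covers the first 20 steps,
# the peak of 1.0e-5 is hit at step 20, and the rate decays to ~0 by the end.
print(lr_at(0, 1000))             # → 0.0
print(f"{lr_at(20, 1000):.1e}")   # → 1.0e-05
```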
Guide: Running Locally
Basic Steps
- Setup Environment: Ensure you have Python and necessary dependencies installed.
- Clone Repository: Download the model repository from Hugging Face.
- Install Dependencies: Run `pip install -r requirements.txt` to install the required packages.
- Run Model: Execute the model using a suitable script or notebook.
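For the GGUF files specifically, a common route is llama-cpp-python rather than a requirements.txt-based setup. The snippet below is a sketch under that assumption; the helper function and file names are illustrative, not taken from the repository:

```python
def pick_gguf(filenames, preferred=("Q4_K_M", "Q5_K_M", "Q8_0")):
    """Return the first file matching the preferred quantization order,
    trading generation quality against memory footprint."""
    for tag in preferred:
        for name in filenames:
            if tag in name:
                return name
    return None

# Hypothetical directory listing of the cloned GGUF repo:
files = ["SmallThinker-3B-Preview.Q8_0.gguf",
         "SmallThinker-3B-Preview.Q4_K_M.gguf"]
print(pick_gguf(files))  # → SmallThinker-3B-Preview.Q4_K_M.gguf

# Loading and chatting (requires `pip install llama-cpp-python`):
# from llama_cpp import Llama
# llm = Llama(model_path=pick_gguf(files), n_ctx=4096)
# out = llm.create_chat_completion(
#     messages=[{"role": "user", "content": "Hello!"}]
# )
# print(out["choices"][0]["message"]["content"])
```

Lower-bit quants such as Q4_K_M are usually the pragmatic choice on edge devices, which is why the helper prefers them here.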
Suggested Cloud GPUs
Consider using cloud-based GPUs like NVIDIA A100 or H100 for optimal performance.
License
SmallThinker-3B-Preview-GGUF is distributed under the [applicable license]. Users must adhere to the terms and conditions outlined within the license agreement.