Qwen2.5 3B Instruct GGUF
Introduction
Qwen2.5 is the latest series of large language models in the Qwen family, offering significant advancements over its predecessor, Qwen2. Improvements include enhanced knowledge, coding, and mathematical capabilities, improved instruction-following, and support for generating long texts and structured data. The series supports multilingual communication across 29 languages and long contexts of up to 128K tokens, with generation of up to 8K tokens; the limits that apply to this particular model are listed under Architecture.
Architecture
The Qwen2.5-3B-Instruct-GGUF model is a causal language model with the following features:
- Type: Transformers with RoPE, SwiGLU, RMSNorm, Attention QKV bias, and tied word embeddings.
- Parameters: 3.09 billion total, 2.77 billion non-embedding.
- Layers: 36
- Attention Heads (GQA): 16 for Query, 2 for Key and Value.
- Context Length: 32,768 tokens in total, with generation of up to 8,192 tokens.
- Quantization: Supports multiple formats (e.g., q2_K, q5_K_M).
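To make the figures in the list above concrete, here is a rough back-of-the-envelope sketch. It derives the embedding parameter count and the GQA group size from the listed numbers, then estimates the key/value cache size at full context. The per-head dimension (128) and the 16-bit cache entries are assumptions that this document does not state, so treat the memory figures as illustrative only.

```python
# Figures taken from the architecture list above.
TOTAL_PARAMS = 3_090_000_000
NON_EMBEDDING_PARAMS = 2_770_000_000
NUM_LAYERS = 36
NUM_Q_HEADS = 16
NUM_KV_HEADS = 2
CONTEXT_LENGTH = 32_768

# Assumptions, NOT stated in this document.
HEAD_DIM = 128        # assumed size of each attention head
BYTES_PER_VALUE = 2   # assumed 16-bit entries in the KV cache


def query_heads_per_kv_head(num_q_heads: int, num_kv_heads: int) -> int:
    """Return how many query heads share one key/value head under GQA."""
    if num_q_heads % num_kv_heads != 0:
        raise ValueError("query heads must be a multiple of key/value heads")
    return num_q_heads // num_kv_heads


def kv_cache_bytes(num_tokens: int, num_layers: int, num_kv_heads: int,
                   head_dim: int, bytes_per_value: int) -> int:
    """Estimate KV cache size: one key and one value vector per layer,
    per key/value head, per token."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * num_tokens


embedding_params = TOTAL_PARAMS - NON_EMBEDDING_PARAMS
group_size = query_heads_per_kv_head(NUM_Q_HEADS, NUM_KV_HEADS)
per_token = kv_cache_bytes(1, NUM_LAYERS, NUM_KV_HEADS, HEAD_DIM, BYTES_PER_VALUE)
gqa_cache = kv_cache_bytes(CONTEXT_LENGTH, NUM_LAYERS, NUM_KV_HEADS,
                           HEAD_DIM, BYTES_PER_VALUE)
# Hypothetical comparison: the same model with one KV head per query head.
mha_cache = kv_cache_bytes(CONTEXT_LENGTH, NUM_LAYERS, NUM_Q_HEADS,
                           HEAD_DIM, BYTES_PER_VALUE)

print(f"embedding parameters: {embedding_params / 1e9:.2f} billion")
print(f"query heads per KV head: {group_size}")
print(f"KV cache per token: {per_token} bytes")
print(f"KV cache at full context (GQA): {gqa_cache / 2**30:.3f} GiB")
print(f"KV cache at full context (no sharing): {mha_cache / 2**30:.3f} GiB")
```

Under these assumptions, sharing 2 key/value heads among 16 query heads shrinks the cache by the group size (a factor of 8) compared with giving every query head its own key/value head.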
Training
The model has undergone both pretraining and post-training processes. It includes instruction tuning to enhance its capabilities in generating structured outputs and following diverse system prompts.
Guide: Running Locally
To run the Qwen2.5-3B-Instruct-GGUF model locally, follow these steps:
- Clone llama.cpp Repository: Clone the llama.cpp repository and follow its installation guide.
- Install Hugging Face Hub:
    pip install -U huggingface_hub
- Download the Model: Use the Hugging Face CLI to download:
    huggingface-cli download Qwen/Qwen2.5-3B-Instruct-GGUF qwen2.5-3b-instruct-q5_k_m.gguf --local-dir . --local-dir-use-symlinks False
- Run the Model: Execute the model in conversation mode:
    ./llama-cli -m <gguf-file-path> \
        -co -cnv -p "You are Qwen, created by Alibaba Cloud. You are a helpful assistant." \
        -fa -ngl 80 -n 512
For optimal performance, consider using cloud GPU services like AWS EC2 with GPU instances or Google Cloud Platform with GPU support.
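The download and run steps above can also be scripted. The following is a minimal sketch, assuming `huggingface_hub` is installed and a `llama-cli` binary has already been built in the current directory. The function names `download_model`, `build_llama_cli_command`, and `main` are hypothetical helpers introduced for illustration. The repository ID, file name, system prompt, and flags are copied from the commands in this guide.

```python
import shlex
import subprocess
from pathlib import Path

REPO_ID = "Qwen/Qwen2.5-3B-Instruct-GGUF"
FILENAME = "qwen2.5-3b-instruct-q5_k_m.gguf"
SYSTEM_PROMPT = "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."


def download_model(local_dir: str = ".") -> Path:
    """Fetch the GGUF file into local_dir and return its path."""
    # Imported here so the command builder works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return Path(hf_hub_download(repo_id=REPO_ID, filename=FILENAME,
                                local_dir=local_dir))


def build_llama_cli_command(model_path, llama_cli: str = "./llama-cli",
                            gpu_layers: int = 80,
                            max_new_tokens: int = 512) -> list:
    """Build the argument list for the conversation-mode command in this guide.

    Flags are taken verbatim from the guide; run `llama-cli --help` to
    confirm that your llama.cpp build accepts them in this form.
    """
    return [
        llama_cli,
        "-m", str(model_path),
        "-co", "-cnv",                  # colored output, conversation mode
        "-p", SYSTEM_PROMPT,            # system prompt
        "-fa",                          # flash attention
        "-ngl", str(gpu_layers),        # layers to offload to the GPU
        "-n", str(max_new_tokens),      # maximum tokens to generate
    ]


def main() -> None:
    """Download the model, print the command, then start llama-cli."""
    model_path = download_model()
    command = build_llama_cli_command(model_path)
    print(shlex.join(command))
    subprocess.run(command, check=True)
```

Calling `main()` performs the download and starts an interactive session. Passing the arguments as a list avoids shell quoting problems with the system prompt, which contains spaces and punctuation.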
License
The Qwen2.5-3B-Instruct-GGUF model is licensed under the qwen-research license. For more details, refer to the license file.