Dolphin3.0-Qwen2.5-3b-GGUF
by bartowski
Introduction
Dolphin3.0-Qwen2.5-3B-GGUF is a text generation model available on Hugging Face, focused on English-language tasks. It is distributed in the GGUF format and has been fine-tuned on a variety of datasets to enhance its conversational capabilities.
Architecture
The model is based on the cognitivecomputations/Dolphin3.0-Qwen2.5-3B architecture. It has been quantized with the llama.cpp framework, specifically the b4418 release, into a range of quantization formats that trade file size against output quality for different hardware configurations.
Training
The model was trained on datasets from sources such as OpenCoder-LLM, Microsoft Orca, NousResearch, and AI-MO, with the aim of improving its performance on code generation, math problems, and general conversational tasks. The quantizations were produced with llama.cpp's imatrix (importance matrix) option, which uses a calibration dataset to reduce quantization error and better preserve output quality at smaller file sizes.
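The card does not spell out the exact quantization pipeline, but a rough sketch of an imatrix-based quantization with llama.cpp's bundled tools looks like the following; the file and calibration-set names here are illustrative, not the ones actually used:

  # Build an importance matrix from a calibration text file (illustrative names).
  llama-imatrix -m Dolphin3.0-Qwen2.5-3b-F16.gguf -f calibration.txt -o imatrix.dat
  # Quantize to Q4_K_M, using the importance matrix to weight quantization error.
  llama-quantize --imatrix imatrix.dat Dolphin3.0-Qwen2.5-3b-F16.gguf Dolphin3.0-Qwen2.5-3b-Q4_K_M.gguf Q4_K_M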
Guide: Running Locally
- Installation: Ensure you have huggingface_hub installed:
  pip install -U "huggingface_hub[cli]"
- Download Model: Use the huggingface-cli to download the desired quantization file:
  huggingface-cli download bartowski/Dolphin3.0-Qwen2.5-3b-GGUF --include "Dolphin3.0-Qwen2.5-3b-Q4_K_M.gguf" --local-dir ./
- Model Execution: Select a quantization file that fits within your system's RAM or VRAM. For the fastest performance, choose a file small enough to fit entirely in your GPU's VRAM; a sketch of a local run follows this list.
- Hardware Recommendations: Cloud GPUs from providers such as AWS or Google Cloud can significantly improve performance, especially for quantizations with higher resource demands.
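Once downloaded, the file can be run with any GGUF-compatible runtime. As a minimal sketch (one option among several), an interactive chat with llama.cpp's llama-cli looks like this, assuming a build from around the b4418 release used for these quants:

  # Start an interactive chat; -ngl 99 offloads all layers to the GPU.
  # Lower the -ngl value for partial offload if the file does not fit in VRAM.
  llama-cli -m ./Dolphin3.0-Qwen2.5-3b-Q4_K_M.gguf -ngl 99 -c 4096 -cnv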
License
The model is released under the Qwen-research license. For more details, refer to the license documentation.