Dolphin3.0 Qwen2.5 3b
cognitivecomputationsIntroduction
Dolphin 3.0-QWEN 2.5-3B is an advanced instruct-tuned AI model, part of the Dolphin 3.0 series. It is crafted for diverse applications, including coding, math, and general-purpose agent tasks. The model emphasizes user control over system prompts and data, allowing customization and alignment with specific use cases.
Architecture
The model is based on Qwen/Qwen2.5-3B, integrating various datasets to enhance its instruct-tuning capabilities. It uses a ChatML-based chat template for user interaction, allowing customization through system prompts to define behavior and tone.
Training
Dolphin 3.0-QWEN 2.5-3B was trained using resources from sponsors like Crusoe Cloud and Akash, which provided high-performance GPUs. Several open-source datasets from entities such as OpenCoder-LLM and Microsoft were instrumental in training the model, alongside advanced data augmentation and filtering techniques.
Guide: Running Locally
- Prerequisites: Ensure you have Python and the Hugging Face Transformers library installed.
- Download Model: Clone the model repository using Git or download it directly from Hugging Face.
- Setup Environment: Install necessary dependencies using
pip install -r requirements.txt
. - Run the Model: Use the Hugging Face Transformers library to load and interact with the model.
- Cloud GPUs: Consider using cloud GPU services like those from Amazon AWS, Google Cloud, or Azure for efficient computation.
License
The model is released under the Qwen-research license. For detailed terms and conditions, visit the license link.