Dolphin3.0 Qwen2.5 3B

cognitivecomputations

Introduction

Dolphin3.0-Qwen2.5-3B is an instruct-tuned model in the Dolphin 3.0 series. It is designed for a broad range of applications, including coding, math, and general-purpose agent tasks, and it emphasizes user control over system prompts and data, allowing customization and alignment with specific use cases.

Architecture

The model is built on the Qwen/Qwen2.5-3B base model and fine-tuned on a curated mix of datasets to strengthen its instruction-following capabilities. It uses a ChatML-based chat template for user interaction, so behavior and tone can be customized through the system prompt.
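The following is a minimal sketch of how a ChatML-style conversation with a custom system prompt can be rendered using the Transformers tokenizer. The repository id is assumed to follow the usual cognitivecomputations naming convention; adjust it to the actual model page if it differs.

```python
from transformers import AutoTokenizer

# Assumed repository id; verify against the model page on Hugging Face.
MODEL_ID = "cognitivecomputations/Dolphin3.0-Qwen2.5-3b"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# A custom system prompt defines the assistant's behavior and tone.
messages = [
    {"role": "system", "content": "You are Dolphin, a concise and helpful coding assistant."},
    {"role": "user", "content": "Write a one-line Python function that reverses a string."},
]

# apply_chat_template renders the messages with the model's ChatML template
# (<|im_start|> ... <|im_end|> blocks), ready to be passed to generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```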

Training

Dolphin3.0-Qwen2.5-3B was trained using resources from sponsors such as Crusoe Cloud and Akash, which provided high-performance GPUs. Several open-source datasets from organizations such as OpenCoder-LLM and Microsoft were instrumental in training the model, alongside data augmentation and filtering techniques.

Guide: Running Locally

  1. Prerequisites: Ensure you have Python and the Hugging Face Transformers library installed.
  2. Download Model: Clone the model repository using Git or download it directly from Hugging Face.
  3. Setup Environment: Install necessary dependencies using pip install -r requirements.txt.
  4. Run the Model: Use the Hugging Face Transformers library to load and interact with the model (see the sketch after this list).
  5. Cloud GPUs: Consider using cloud GPU services such as AWS, Google Cloud, or Azure for efficient computation.
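Below is a minimal end-to-end sketch of loading the model and generating a response with Transformers. It assumes torch, transformers, and accelerate are installed, and the repository id is again an assumption based on the usual naming; a GPU is optional but recommended for the 3B model.

```python
# pip install torch transformers accelerate
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "cognitivecomputations/Dolphin3.0-Qwen2.5-3b"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,  # use float32 if running on CPU
    device_map="auto",           # places weights on GPU if available (requires accelerate)
)

messages = [
    {"role": "system", "content": "You are Dolphin, a helpful assistant."},
    {"role": "user", "content": "Explain what instruct tuning is in two sentences."},
]

# Render the ChatML prompt and move the token ids to the model's device.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)

# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```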

License

The model is released under the Qwen Research License. For detailed terms and conditions, refer to the license linked from the model page.
