Dolphin 2.9.2 Qwen2 72B

cognitivecomputations

Introduction

Dolphin-2.9.2-Qwen2-72B is a text generation model developed by Cognitive Computations, built on the Qwen2-72B base model. It is fine-tuned for a variety of conversational and instruction-following tasks and supports function calling. The model is trained to be highly compliant with requests, which makes it suitable for diverse applications but also means users are advised to implement their own alignment layer before exposing it as a service.

Architecture

Dolphin-2.9.2 is based on the Qwen2-72B architecture, a decoder-only transformer. The base model provides a 128k context length, and the fine-tune was full-weight at an 8k sequence length. The model uses the ChatML prompt template format, and training targeted parameters selected by the Laser Scanner tool.
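
For reference, ChatML wraps each turn in <|im_start|> and <|im_end|> markers. Below is a minimal sketch of rendering such a prompt with the Hugging Face tokenizer's built-in chat template; the repository id is assumed from the model name, so verify it on Hugging Face before use:

```python
from transformers import AutoTokenizer

# Repository id assumed from the model name; verify on Hugging Face.
tokenizer = AutoTokenizer.from_pretrained("cognitivecomputations/dolphin-2.9.2-qwen2-72b")

messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Explain ChatML in one sentence."},
]

# Renders the ChatML markup, roughly:
# <|im_start|>system\nYou are Dolphin, a helpful AI assistant.<|im_end|>\n<|im_start|>user\n...
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```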

Training

Training involved diverse datasets and techniques:

  • Utilized the Axolotl framework for model configuration and training.
  • Datasets included cognitivecomputations/Dolphin-2.9, m-a-p/CodeFeedback-Filtered-Instruction, and others.
  • Training used a cosine learning-rate scheduler with the paged AdamW optimizer (see the sketch after this list).
  • Additional settings in the Axolotl configuration, such as special tokens, the attention implementation, and experiment tracking, were tuned to improve throughput.
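
For orientation, the cosine schedule and paged AdamW optimizer map directly onto Hugging Face TrainingArguments. The sketch below is illustrative only: the learning rate, batch size, and epoch count are placeholders, not the values from the actual run, which was configured through Axolotl:

```python
from transformers import TrainingArguments

# Illustrative values only; the real run was configured through Axolotl.
args = TrainingArguments(
    output_dir="dolphin-ft",
    lr_scheduler_type="cosine",      # cosine learning-rate decay
    optim="paged_adamw_8bit",        # paged AdamW (memory-efficient 8-bit variant)
    learning_rate=1e-5,              # placeholder
    per_device_train_batch_size=1,   # placeholder
    num_train_epochs=3,              # placeholder
    bf16=True,                       # bfloat16 mixed precision
)
```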

Guide: Running Locally

To run Dolphin-2.9.2 locally:

  1. Set up Environment: Install Python and create a virtual environment.
  2. Install Libraries: Use pip to install transformers, torch, and accelerate.
  3. Download Model: Fetch the model files from the Hugging Face repository (the first from_pretrained call does this automatically).
  4. Load Model: Load the model and tokenizer with the AutoModelForCausalLM and AutoTokenizer classes.
  5. Run Inference: Generate text from prompts, as shown in the sketch after this list.
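
A minimal end-to-end sketch covering steps 2 through 5 follows. The repository id is assumed from the model name; note that a 72B-parameter model in bfloat16 needs well over 140 GB of GPU memory, so device_map="auto" is used to shard it across available devices:

```python
# pip install transformers torch accelerate   (step 2)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cognitivecomputations/dolphin-2.9.2-qwen2-72b"  # assumed repo id

# Steps 3-4: the first call downloads the weights from Hugging Face, then loads them.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # shard across available GPUs; requires accelerate
)

# Step 5: build a ChatML prompt and generate.
messages = [
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Write a haiku about dolphins."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```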

For usable performance with a model of this size, consider cloud GPU services such as AWS, Google Cloud, or Azure, which offer high-memory accelerators like NVIDIA's H100.

License

Dolphin-2.9.2-Qwen2-72B is released under the tongyi-qianwen license, inherited from the Qwen2-72B base model. Commercial use is permitted subject to the license terms; for details, refer to the license document.
