Dolphin-2.9.3-Mistral-7B-32K

cognitivecomputations

Introduction

Dolphin-2.9.3-Mistral-7B-32K is a text generation model developed by Eric Hartford and Cognitive Computations. It is built on the mistralai/Mistral-7B-v0.3 foundation and features a 32k context window. The model is designed for instruction following, conversational interactions, and coding tasks, with additional abilities for function calling.

Architecture

The model is based on the Mistral-7B architecture and focuses on text generation. It uses the ChatML prompt template and was fine-tuned at a sequence length of 8192, while supporting the 32k context window of its base model. Dolphin-2.9.3 is uncensored: its training data was filtered to reduce alignment and bias, making it highly compliant with user requests.
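The ChatML format wraps each conversational turn in `<|im_start|>role` / `<|im_end|>` markers. A minimal sketch of assembling such a prompt by hand (in practice the tokenizer's chat template does this automatically):

```python
# Minimal sketch of the ChatML prompt format the model expects.
# Each turn is framed by <|im_start|>role ... <|im_end|> markers.
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "What is the capital of France?"},
])
print(prompt)
```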

Training

Dolphin-2.9.3 was trained with the Axolotl framework on datasets from various sources, such as ShareGPT, to enhance its conversational abilities. Training used a sequence length of 8192 with sample packing and gradient checkpointing, a cosine learning-rate scheduler, and the adamw_8bit optimizer over three epochs.
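The hyperparameters above can be sketched as an Axolotl-style config. Field names follow Axolotl's usual conventions; this is an illustrative fragment, not the actual file used to train the model:

```yaml
# Sketch of an Axolotl config matching the settings described above
# (assumed field names; not the model's real training config).
base_model: mistralai/Mistral-7B-v0.3
chat_template: chatml

sequence_len: 8192
sample_packing: true
gradient_checkpointing: true

optimizer: adamw_8bit
lr_scheduler: cosine
num_epochs: 3
```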

Guide: Running Locally

To run Dolphin-2.9.3 locally, follow these steps:

  1. Install the dependencies: Python 3 plus the transformers, torch, and accelerate libraries.
  2. Download the model: pull the weights from the Hugging Face Hub; they download automatically on first load, or ahead of time with huggingface-cli download.
  3. Set up the environment: configure cache paths (e.g. HF_HOME) and any other required environment variables.
  4. Run the model: load it from a Python script or Jupyter notebook and generate text.
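The steps above can be sketched with the transformers library. This assumes transformers, torch, and accelerate are installed, and uses the repo id as it appears on Hugging Face (verify the exact name and casing before use):

```python
# Sketch: loading and querying Dolphin-2.9.3 with Hugging Face transformers.
# Repo id taken from the Hugging Face listing -- verify before use.
MODEL_ID = "cognitivecomputations/dolphin-2.9.3-mistral-7B-32k"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and return the assistant's reply to a single prompt."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy deps, imported lazily

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    # The tokenizer ships a ChatML chat template, so apply_chat_template
    # produces the <|im_start|>/<|im_end|> framing automatically.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example (downloads the model weights on first call):
# print(generate("Explain the difference between a list and a tuple in Python."))
```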

For optimal performance, use a cloud GPU such as an NVIDIA H100, available from providers like Crusoe Cloud.

License

Dolphin-2.9.3 is released under the Apache 2.0 license, allowing for both personal and commercial use. Users are responsible for ensuring compliance with any applicable laws and regulations when deploying the model.
