32B-Qwen2.5-Kunou-v1

Sao10K

Introduction

32B-Qwen2.5-Kunou-v1 is a text-generation model developed by Sao10K as a versatile, generalist roleplay model. It is part of a series that spans lightweight and heavyweight use cases, including 14B and 72B variants. The model is trained on a refined dataset intended to improve performance over previous iterations.

Architecture

The model uses Qwen/Qwen2.5-32B-Instruct as its base and is loaded for text generation through the transformers AutoModelForCausalLM and AutoTokenizer classes (a loading sketch follows the list below). Key features include:

  • Sequence length of 16,384 tokens.
  • Supports both 4-bit and 8-bit loading for memory efficiency.
  • Incorporates flash attention and a QLoRA adapter for improved speed and memory efficiency.
  • Uses Liger kernel plugins for additional optimizations such as RMS normalization and fused linear cross-entropy.
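The snippet below is a minimal loading sketch under these settings. It assumes the Hugging Face repository id Sao10K/32B-Qwen2.5-Kunou-v1 and that transformers, torch, bitsandbytes, and flash-attn are installed; the quantization and attention choices shown are illustrative, not the author's prescribed configuration.

```python
# Minimal loading sketch (illustrative settings, not the author's exact setup).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Sao10K/32B-Qwen2.5-Kunou-v1"  # assumed Hugging Face repo id

# Optional 4-bit quantization for memory efficiency; 8-bit loading works
# similarly with load_in_8bit=True.
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    attn_implementation="flash_attention_2",  # requires the flash-attn package
    device_map="auto",
)
```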

Training

Training was conducted using the Axolotl framework (version 0.5.2), focusing on the points below; a simplified sketch of the corresponding loop follows the list:

  • Utilizing a variety of datasets, including custom chat and roleplay data.
  • A single training epoch, with gradient accumulation steps set to 4 and a micro-batch size of 1.
  • Optimizations such as the paged_ademamix_8bit optimizer and a cosine learning-rate scheduler.
  • DeepSpeed configurations for efficient parallel training.
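The actual run is driven by an Axolotl YAML configuration rather than hand-written code. The sketch below only illustrates the described hyperparameters (one epoch, gradient accumulation of 4, micro-batch size 1, cosine schedule) in plain PyTorch, with AdamW standing in for the paged_ademamix_8bit optimizer and with `model`, `train_loader`, and the learning rate assumed.

```python
# Illustrative PyTorch sketch of the described schedule; not the Axolotl pipeline.
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

grad_accum_steps = 4   # gradient accumulation steps: 4
num_epochs = 1         # single epoch; micro-batch size of 1 is set in the DataLoader

optimizer = AdamW(model.parameters(), lr=2e-5)  # stand-in for paged_ademamix_8bit; lr assumed
scheduler = CosineAnnealingLR(
    optimizer, T_max=num_epochs * len(train_loader) // grad_accum_steps
)

model.train()
for epoch in range(num_epochs):
    for step, batch in enumerate(train_loader):
        loss = model(**batch).loss / grad_accum_steps  # scale loss for accumulation
        loss.backward()
        if (step + 1) % grad_accum_steps == 0:
            optimizer.step()
            scheduler.step()
            optimizer.zero_grad()
```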

Guide: Running Locally

To run the 32B-Qwen2.5-Kunou-v1 model locally, follow these steps:

  1. Setup Environment: Ensure you have Python and necessary libraries such as transformers and torch installed.
  2. Download Model: Use the Hugging Face model hub to download the model files.
  3. Load the Model: Use the transformers library to load the model and tokenizer.
  4. Run Inference: Create a script to generate text using the model with your desired prompts (see the example script after these steps).
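The following is a minimal end-to-end example covering steps 3 and 4, again assuming the repository id Sao10K/32B-Qwen2.5-Kunou-v1; the prompt and sampling settings are placeholders rather than the author's recommended values.

```python
# Minimal inference sketch; prompt and sampling values are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/32B-Qwen2.5-Kunou-v1"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful roleplay assistant."},
    {"role": "user", "content": "Introduce yourself in character."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.8)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```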

For performance optimization, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure: a 32B-parameter model needs on the order of 64 GB of GPU memory in 16-bit precision, or roughly 20 GB with 4-bit quantization.

License

The model is distributed under the Qwen license. For more detailed information, refer to the license document.
