Zelensky-78B

nisten

Introduction

Zelensky-78B is an experimental commander model designed for advanced AI tasks. The name is intended humorously, referencing a comparison with another AI model, Grok-2.

Architecture

The model is built upon multiple base models, including Qwen/Qwen2.5-72B-Instruct and ChatWaifu_72B_v2.2. It draws on datasets including EvolKit-75K and WildChat-1M-Full to enhance its reasoning and interactive capabilities.
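For a quick look at those datasets, the sketch below loads them with the `datasets` library. The Hub repository IDs `arcee-ai/EvolKit-75K` and `allenai/WildChat-1M-Full` are assumptions and may differ from the exact sources used for this model; WildChat in particular may require accepting a license agreement on the Hub.

```python
# Hedged sketch: inspect the datasets referenced above.
# The repository IDs below are assumptions, not confirmed sources.
from datasets import load_dataset

evolkit = load_dataset("arcee-ai/EvolKit-75K", split="train")               # assumed repo id
wildchat = load_dataset("allenai/WildChat-1M-Full", split="train",          # assumed repo id
                        streaming=True)                                     # stream to avoid a full download

print(evolkit)                       # feature schema and row count
print(next(iter(wildchat)).keys())   # peek at one WildChat record
```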

Training

The training process used a low learning rate over a single epoch and leveraged evolutionary merging with three other models. This process was repeated on eight AMD MI300 GPUs with 192 GB of memory each, and the LM_Eval harness's gpqa_diamond_zeroshot task was also employed during training. Compute resources were provided by Vultr.
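For reference, running that same gpqa_diamond_zeroshot task against a checkpoint can be sketched with the lm-evaluation-harness Python API. The model repository ID `nisten/zelensky-78b`, dtype, and batch size here are assumptions for illustration, not the authors' exact evaluation setup.

```python
# Hedged sketch: evaluate a checkpoint on gpqa_diamond_zeroshot with lm-evaluation-harness.
# Model repo id and settings are assumptions for illustration only.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                               # Hugging Face transformers backend
    model_args="pretrained=nisten/zelensky-78b,dtype=bfloat16",
    tasks=["gpqa_diamond_zeroshot"],
    batch_size=1,
)
print(results["results"]["gpqa_diamond_zeroshot"])
```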

Guide: Running Locally

  1. Clone the Repository: Download the Zelensky-78B model from the Hugging Face repository.
  2. Install Dependencies: Ensure you have the necessary Python libraries and frameworks installed.
  3. Download Model Files: Use the provided links to acquire the model weights and configurations.
  4. Set Up Environment: Configure your environment to match the model's requirements, such as ensuring adequate GPU resources.
  5. Run Inference: Execute scripts to perform inference tasks with the model; a minimal sketch follows this list.
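
The following is a minimal inference sketch using the transformers library. The repository ID `nisten/zelensky-78b` and the chat-template usage (assumed to follow the Qwen2.5 base) are assumptions rather than a script shipped with the model.

```python
# Minimal inference sketch.
# Assumptions: repo id "nisten/zelensky-78b", Qwen2.5-style chat template,
# and enough GPU memory to shard a ~78B-parameter model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nisten/zelensky-78b"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # half precision to reduce memory use
    device_map="auto",            # shard across available GPUs
)

messages = [
    {"role": "user", "content": "Summarize the key ideas behind evolutionary model merging."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

A model of this size will not fit on a single consumer GPU, so multi-GPU sharding via `device_map="auto"` or a quantized build is typically needed, which is consistent with the cloud-GPU recommendation below.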

For optimal performance, it is recommended to use cloud GPUs, such as those provided by AWS or Google Cloud.

License

The model is distributed under the MIT License. Note that the Qwen License applies by default to certain components of the model.
