Qwexit 2.5 14 B 2024

CultriX

Introduction

QWEXIT-2.5-14B-2024 is a merged language model developed by CultriX using multiple pre-trained models to enhance performance in text generation tasks. The model leverages the mergekit library to combine different model architectures, resulting in a more robust and versatile language model.

Architecture

QWEXIT-2.5-14B-2024 is built using the following base models:

  • CultriX/SeQwence-14Bv1
  • CultriX/Qwen2.5-14B-Broca
  • CultriX/Qwen2.5-14B-Wernickev3
  • CultriX/Qwen2.5-14B-FinalMerge
  • sthenno-com/miscii-14b-1225
  • djuna/Q2.5-Veltha-14B

The merging process uses the della_linear method from mergekit, focusing on fine-grained parameter scaling and normalization. The model is configured with specific weights and densities for each contributing model to optimize functionality across various tasks such as conversation, logic, and general coverage.

Training

The merging process involves specific parameters like epsilon, lambda, and normalization settings to ensure stable merges. Key configurations include:

  • Merging method: della_linear
  • Data type: bfloat16
  • Parameter weights and densities are tailored for each model contribution.
  • Adaptive merge parameters emphasize sub-benchmarks with specific task weights and a smoothing factor to balance contributions.
  • Gradient clipping is set to 1.0 to prevent over-contribution from any single model.

Guide: Running Locally

To run QWEXIT-2.5-14B-2024 locally:

  1. Setup Environment: Ensure Python and the transformers library are installed.
  2. Download Model: Use huggingface_hub to clone the repository containing the model files.
  3. Load Model: Utilize the transformers library to load the model into your application.
  4. Run Inference: Implement the model in your text generation pipeline to start generating outputs.

For optimal performance, especially for large models like this, it is recommended to use cloud GPUs such as those provided by AWS, Google Cloud, or Azure.

License

The model and its components are released under licenses that ensure free use and distribution, subject to the terms specified in the individual model cards. Users are encouraged to review these terms to ensure compliance with any restrictions or obligations.

More Related APIs in Text Generation