Q2.5-Veltha-14B-GGUF

QuantFactory

Introduction

Q2.5-Veltha-14B-GGUF is a quantized, GGUF-format version of the Q2.5-Veltha-14B model, intended for text-generation tasks. The quantization was produced with llama.cpp; the underlying model is itself a merge of several pre-trained language models.

Architecture

Q2.5-Veltha-14B was built by merging multiple models with the della_linear merge method (as implemented in mergekit). The base model is Qwen/Qwen2.5-14B. The merged models include:

  • huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
  • EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
  • v000000/Qwen2.5-Lumen-14B
  • arcee-ai/SuperNova-Medius
  • allura-org/TQ2.5-14B-Aletheia-v1

The merge is configured in YAML, with parameters such as epsilon: 0.04, lambda: 1.05, and normalize: true.
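
As a rough illustration, a mergekit configuration for a della_linear merge of this kind typically looks like the sketch below. The epsilon, lambda, and normalize values come from the card; the per-model weight and density values and the dtype are placeholders, not the actual recipe:

```yaml
merge_method: della_linear
base_model: Qwen/Qwen2.5-14B
models:
  - model: arcee-ai/SuperNova-Medius
    parameters:
      weight: 0.3    # placeholder value, not from the card
      density: 0.5   # placeholder value, not from the card
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
    parameters:
      weight: 0.2    # placeholder value, not from the card
      density: 0.5   # placeholder value, not from the card
  # ...the remaining merged models follow the same pattern
parameters:
  epsilon: 0.04      # from the card
  lambda: 1.05       # from the card
  normalize: true    # from the card
dtype: bfloat16      # placeholder
```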

Evaluation

The model has been evaluated on standard benchmarks under varying few-shot configurations. Key results include:

  • IFEval (0-shot): Strict accuracy of 82.92
  • BBH (3-shot): Normalized accuracy of 49.75
  • MATH Lvl 5 (4-shot): Exact match of 28.02
  • GPQA (0-shot): Normalized accuracy of 14.54
  • MuSR (0-shot): Normalized accuracy of 12.26
  • MMLU-PRO (5-shot): Accuracy of 47.76

Guide: Running Locally

To run the Q2.5-Veltha-14B-GGUF model locally:

  1. Environment Setup: Install a runtime that can execute GGUF files, such as llama.cpp or its Python bindings, llama-cpp-python. (The transformers library can also load GGUF checkpoints, dequantizing them in the process.)
  2. Download the Model: Fetch one of the quantized .gguf files from the Hugging Face repository, either by cloning it or via the huggingface_hub library.
  3. Inference: Use the model for text generation, configuring the tokenizer and chat template according to arcee-ai/SuperNova-Medius, the merge's tokenizer source; see the sketch after this list.
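
A minimal sketch using llama-cpp-python. The repository id QuantFactory/Q2.5-Veltha-14B-GGUF and the Q4_K_M filename pattern are assumptions; check the repository's file listing for the quantization levels actually published:

```python
# pip install llama-cpp-python huggingface_hub
from llama_cpp import Llama

# Download a quantized file from the Hub and load it.
# The repo id and the Q4_K_M filename pattern below are assumptions;
# adjust them to match the files actually in the repository.
llm = Llama.from_pretrained(
    repo_id="QuantFactory/Q2.5-Veltha-14B-GGUF",
    filename="*Q4_K_M.gguf",  # glob-matched against the repo's files
    n_ctx=4096,               # context window size
    n_gpu_layers=-1,          # offload all layers to the GPU if available
)

# Chat-style generation; the chat template is read from the GGUF metadata
# (which should reflect the SuperNova-Medius tokenizer used in the merge).
result = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain the GGUF format in two sentences."}],
    max_tokens=256,
    temperature=0.7,
)
print(result["choices"][0]["message"]["content"])
```

Llama.from_pretrained caches the downloaded file via huggingface_hub, so subsequent runs load directly from disk.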

For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.

License

Use of the Q2.5-Veltha-14B-GGUF model is subject to the licensing terms of the individual models merged within it. Users should refer to the licenses of each constituent model, including the Qwen2.5-14B base, for details.