Q2.5-Veltha-14B-0.5-GGUF

QuantFactory

Introduction

The Q2.5-Veltha-14B-0.5-GGUF model is a quantized (GGUF-format) version of djuna/Q2.5-Veltha-14B-0.5, produced with the llama.cpp framework. The underlying model is a merge of several pre-trained language models, created with the mergekit tool.

Architecture

The model is developed using the della_linear merge method and incorporates the following models:

  • arcee-ai/SuperNova-Medius
  • huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
  • allura-org/TQ2.5-14B-Aletheia-v1
  • EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
  • v000000/Qwen2.5-Lumen-14B

The YAML configuration specifies the merge method, data types, and parameters such as epsilon, lambda, and normalization settings. Each model's contribution is controlled by weight and density parameters.
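As an illustrative sketch only (the numeric values and per-model settings below are placeholders, not the published recipe for this merge), a della_linear mergekit configuration has roughly this shape:

```yaml
# Illustrative mergekit config for a della_linear merge.
# All parameter values here are placeholders, not the actual
# configuration used for Q2.5-Veltha-14B-0.5.
merge_method: della_linear
base_model: arcee-ai/SuperNova-Medius
dtype: bfloat16
parameters:
  epsilon: 0.04    # range of the adaptive drop probabilities
  lambda: 1.05     # scaling factor applied to the merged deltas
  normalize: true  # normalize model weights so contributions sum to 1
models:
  # One entry like these per merged model, each with its own
  # weight (contribution) and density (fraction of deltas kept).
  - model: arcee-ai/SuperNova-Medius
    parameters:
      weight: 0.3
      density: 0.5
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
    parameters:
      weight: 0.2
      density: 0.5
  - model: v000000/Qwen2.5-Lumen-14B
    parameters:
      weight: 0.2
      density: 0.5
```

Refer to the model page for the actual configuration used in this merge.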

Evaluation

The model was evaluated on the Open LLM Leaderboard, achieving the following results across datasets:

  • IFEval (0-shot): 77.96 strict accuracy
  • BBH (3-shot): 50.32 normalized accuracy
  • MATH Lvl 5 (4-shot): 33.84 exact match
  • GPQA (0-shot): 15.77 normalized accuracy
  • MuSR (0-shot): 14.17 normalized accuracy
  • MMLU-PRO (5-shot): 47.72 accuracy

Guide: Running Locally

To run the Q2.5-Veltha-14B-0.5-GGUF model locally, follow these steps:

  1. Download a GGUF file from the Hugging Face repository, choosing a quantization level that fits your hardware.
  2. Install a GGUF-compatible runtime such as llama.cpp or the llama-cpp-python bindings (recent versions of the transformers library can also load GGUF files).
  3. Load the model and run inference (a minimal sketch follows this list).
  4. For a 14B model, consider a cloud GPU service like AWS, GCP, or Azure, or offload layers to a local GPU, for acceptable throughput.
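For example, a minimal sketch using the llama-cpp-python bindings; the repository ID and filename pattern below are assumptions, so match them to the files actually published in the repo:

```python
# Minimal inference sketch with llama-cpp-python
# (pip install llama-cpp-python huggingface-hub).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="QuantFactory/Q2.5-Veltha-14B-0.5-GGUF",  # assumed repo ID
    filename="*Q4_K_M.gguf",  # assumed quant level; pick one that exists in the repo
    n_ctx=4096,               # context window
    n_gpu_layers=-1,          # offload all layers to GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize the GGUF format in one sentence."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```

Lower quantization levels (e.g., Q4) reduce memory use at some cost in output quality, so the right file depends on your available RAM or VRAM.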

License

The model and its components are provided under licenses specified by each individual model included in the merge. Always refer to the respective model pages for specific licensing information.
