Q2.5-Veltha-14B-0.5-GGUF
Introduction
The Q2.5-Veltha-14B-0.5-GGUF model is QuantFactory's quantized version of the djuna/Q2.5-Veltha-14B-0.5 model, created using the llama.cpp framework. The underlying model merges several pre-trained language models using the mergekit tool.
Architecture
The model is developed using the della_linear merge method and incorporates the following models:
- arcee-ai/SuperNova-Medius
- huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
- allura-org/TQ2.5-14B-Aletheia-v1
- EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
- v000000/Qwen2.5-Lumen-14B
The YAML configuration specifies the merge method, data types, and parameters such as epsilon, lambda, and normalization settings. Each model's contribution is controlled by weight and density parameters.
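As an illustration only, a della_linear mergekit configuration generally follows the shape sketched below. The numeric values (epsilon, lambda, weights, densities) and the choice of base model are placeholders, not the actual settings of this merge; refer to the original djuna/Q2.5-Veltha-14B-0.5 model page for the real configuration.

```yaml
# Illustrative sketch of a della_linear mergekit config.
# All numeric values below are placeholders, not this merge's settings.
merge_method: della_linear
base_model: arcee-ai/SuperNova-Medius   # placeholder choice of base
dtype: bfloat16
parameters:
  epsilon: 0.04    # placeholder
  lambda: 1.0      # placeholder
  normalize: true
models:
  - model: arcee-ai/SuperNova-Medius
    parameters:
      weight: 0.3   # placeholder
      density: 0.5  # placeholder
  - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
    parameters:
      weight: 0.2
      density: 0.5
  - model: allura-org/TQ2.5-14B-Aletheia-v1
    parameters:
      weight: 0.2
      density: 0.5
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
    parameters:
      weight: 0.2
      density: 0.5
  - model: v000000/Qwen2.5-Lumen-14B
    parameters:
      weight: 0.1
      density: 0.5
```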
Evaluation
The model achieves the following results across benchmark datasets, as evaluated on the Open LLM Leaderboard:
- IFEval (0-shot): 77.96 strict accuracy
- BBH (3-shot): 50.32 normalized accuracy
- MATH Lvl 5 (4-shot): 33.84 exact match
- GPQA (0-shot): 15.77 normalized accuracy
- MuSR (0-shot): 14.17 normalized accuracy
- MMLU-PRO (5-shot): 47.72 accuracy
Guide: Running Locally
To run the Q2.5-Veltha-14B-0.5-GGUF model locally, follow these steps:
- Clone the model repository from Hugging Face.
- Install the necessary dependencies, including the `transformers` library.
- Load the model using the Hugging Face `transformers` pipeline.
- Use a cloud GPU service such as AWS, GCP, or Azure for optimal performance when handling large models.
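As a rough sketch, loading a GGUF quant through `transformers` might look like the following. The repository id and GGUF filename here are assumptions inferred from the model name; check the repository's file list for the actual quant files. A recent `transformers` release with GGUF support and the `gguf` package are required.

```python
# Minimal sketch of loading a GGUF quant via transformers.
# The repo id and filename are assumptions; substitute the quant
# you actually want from the repository's file list.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "QuantFactory/Q2.5-Veltha-14B-0.5-GGUF"       # assumed repo id
gguf_file = "Q2.5-Veltha-14B-0.5.Q4_K_M.gguf"           # placeholder filename

# from_pretrained downloads the files on first use, so a separate
# clone step is optional.
tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)

prompt = "Explain model merging in one sentence."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```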
License
The model and its components are provided under the licenses specified by each individual model included in the merge. Always refer to the respective model pages for specific licensing information.