Q2.5-Veltha-14B-GGUF
Introduction
Q2.5-Veltha-14B-GGUF is a quantized version of the Q2.5-Veltha-14B model, designed for text generation tasks. The quantized files were produced with llama.cpp, and the underlying model was built with a merge strategy that combines several pre-trained language models.
Architecture
The architecture of Q2.5-Veltha-14B-GGUF is based on merging multiple models using the della_linear merge method. The base model is qwen/Qwen2.5-14b, and the merged models include:
- huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
- EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
- v000000/Qwen2.5-Lumen-14B
- arcee-ai/SuperNova-Medius
- allura-org/TQ2.5-14B-Aletheia-v1
The configuration utilizes a YAML setup with specific parameters such as epsilon: 0.04, lambda: 1.05, and normalize: true; a hypothetical reconstruction is sketched below.
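For reference, these parameters correspond to a mergekit-style configuration file. The YAML below is a hypothetical reconstruction from the details named above; per-model weights and densities are not stated here and are omitted, so the original config may differ in layout.

```yaml
# Hypothetical mergekit-style della_linear config, reconstructed from the
# parameters named above. Per-model weights/densities are not given in the
# source and are intentionally omitted.
merge_method: della_linear
base_model: qwen/Qwen2.5-14b
models:
  - model: huihui-ai/Qwen2.5-14B-Instruct-abliterated-v2
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.2
  - model: v000000/Qwen2.5-Lumen-14B
  - model: arcee-ai/SuperNova-Medius
  - model: allura-org/TQ2.5-14B-Aletheia-v1
parameters:
  epsilon: 0.04
  lambda: 1.05
  normalize: true
```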
Evaluation
The model has been evaluated on several benchmarks under varying few-shot settings (a reproduction sketch follows the list below). Key performance metrics include:
- IFEval (0-shot): Strict accuracy of 82.92
- BBH (3-shot): Normalized accuracy of 49.75
- MATH Lvl 5 (4-shot): Exact match of 28.02
- GPQA (0-shot): Normalized accuracy of 14.54
- MuSR (0-shot): Normalized accuracy of 12.26
- MMLU-PRO (5-shot): Accuracy of 47.76
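These six benchmarks correspond to the Open LLM Leaderboard v2 suite. Below is a minimal sketch of how such scores are typically reproduced with EleutherAI's lm-evaluation-harness; the simple_evaluate call is the harness's documented Python API, but the leaderboard task names and the unquantized source repo id are assumptions, and the harness would normally be pointed at the full-precision model rather than the GGUF files.

```python
# Minimal sketch using EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Task names follow the Open LLM Leaderboard v2 naming convention and the
# repo id below is an assumption -- neither is taken from this model card.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=djuna/Q2.5-Veltha-14B,dtype=bfloat16",  # assumed repo id
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_mmlu_pro"],
    batch_size="auto",
)
for task, metrics in results["results"].items():
    print(task, metrics)
```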
Guide: Running Locally
To run the Q2.5-Veltha-14B-GGUF model locally:
- Environment Setup: Install the necessary libraries, such as transformers and datasets.
- Download the Model: Clone the repository from Hugging Face or use the transformers library to load the model.
- Inference: Use the model for text generation tasks, making sure the tokenizer is configured correctly based on arcee-ai/SuperNova-Medius. A minimal loading sketch follows this list.
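The sketch below combines the download and inference steps, assuming a recent transformers release with GGUF support (plus the gguf package); the quantized filename is hypothetical and should be checked against the files actually published in the repository.

```python
# Minimal sketch: load a GGUF quantization through transformers.
# Assumes `pip install transformers gguf`; the exact .gguf filename below
# is hypothetical -- check the files listed in the Hugging Face repo.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "QuantFactory/Q2.5-Veltha-14B-GGUF"
gguf_file = "Q2.5-Veltha-14B.Q4_K_M.gguf"  # hypothetical quant level/filename

tokenizer = AutoTokenizer.from_pretrained(repo_id, gguf_file=gguf_file)
model = AutoModelForCausalLM.from_pretrained(repo_id, gguf_file=gguf_file)

inputs = tokenizer("Write a short haiku about merging models.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that transformers dequantizes GGUF weights at load time, so memory use matches the full-precision model; for memory-efficient inference, the files can also be run directly with llama.cpp, which, as noted in the introduction, produced them.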
For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
License
The use of the Q2.5-Veltha-14B-GGUF model is subject to the licensing terms specified by the individual models merged within it. Users should refer to the licenses of the base models for detailed information.