L3.1-Purosani-2-8B-GGUF

QuantFactory

Introduction

The L3.1-Purosani-2-8B-GGUF model is a quantized version of djuna/L3.1-Purosani-2-8B, produced with llama.cpp. The underlying model was itself created by merging several pre-trained language models with the mergekit tool, using a specific merge method.

Architecture

The model combines several pre-trained models, merged with the della_linear method using unsloth/Meta-Llama-3.1-8B as the base. Each contributing model is assigned its own weight and density in the merge. The configuration uses the bfloat16 data type together with parameters such as epsilon, lambda, and an int8_mask for normalization.
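A della_linear merge of this kind is typically expressed as a mergekit YAML config. The sketch below is illustrative only: the base model and parameter names come from the description above, while the contributing model names, weights, and densities are placeholders, not the actual values used.

```yaml
# Illustrative mergekit config (placeholder models/weights/densities)
merge_method: della_linear
base_model: unsloth/Meta-Llama-3.1-8B
models:
  - model: example-org/model-a     # placeholder contributor
    parameters:
      weight: 0.5                  # placeholder weight
      density: 0.5                 # placeholder density
parameters:
  epsilon: 0.05                    # placeholder value
  lambda: 1.0                      # placeholder value
  int8_mask: true
dtype: bfloat16
```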

Training

The merged model was evaluated on several text-generation benchmarks. It reports varying performance metrics, such as strict accuracy, normalized accuracy, and exact-match scores, on datasets including IFEval, BBH, MATH, GPQA, MuSR, and MMLU-PRO.

Guide: Running Locally

  1. Set Up the Environment: Ensure Python and the required libraries are installed.
  2. Download the Model: Obtain the GGUF model files from the Hugging Face model hub.
  3. Load the Model: Load the weights with a GGUF-compatible runtime such as llama.cpp or its Python bindings.
  4. Run Inference: Execute a script to perform text generation.
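The steps above can be sketched with the llama-cpp-python bindings, which can download a GGUF file from the Hub and run it locally. This is a minimal sketch, not an official script: the quantization filename pattern is an assumption, so check the repository's file list for the variant you want.

```python
"""Sketch: download and run the GGUF model with llama-cpp-python."""


def load_model(repo_id="QuantFactory/L3.1-Purosani-2-8B-GGUF",
               filename="*Q4_K_M.gguf"):
    # Assumed quant variant; adjust the pattern to match an actual
    # file in the repository. Requires: pip install llama-cpp-python
    from llama_cpp import Llama
    return Llama.from_pretrained(repo_id=repo_id,
                                 filename=filename,
                                 n_ctx=4096)


def generate(llm, prompt, max_tokens=128):
    # Completion-style call; returns the generated text only.
    out = llm(prompt, max_tokens=max_tokens)
    return out["choices"][0]["text"]


if __name__ == "__main__":
    llm = load_model()
    print(generate(llm, "Explain model merging in one sentence."))
```

Downloading the model pulls several gigabytes, so a machine with sufficient disk space and RAM (or a GPU build of llama.cpp) is advisable.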

Suggested Resources:

  • For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure.

License

Usage and distribution of the L3.1-Purosani-2-8B-GGUF model are subject to the model's licensing terms, which should be reviewed before use to ensure compliance.
