phi-4-abliterated-Q8_0-GGUF

KnutJaegersberg

Introduction

phi-4-abliterated-Q8_0-GGUF is a text generation model converted to the GGUF format for use with llama.cpp. It is a Q8_0 (8-bit) quantization of the original model Orion-zhen/phi-4-abliterated, enabling efficient local deployment for applications that require text generation.

Architecture

The model uses the GGUF format, produced by converting the base model Orion-zhen/phi-4-abliterated with llama.cpp via ggml.ai's GGUF-my-repo space. It targets English-language text generation and is compatible with inference endpoints and conversational applications.
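Beyond the llama.cpp CLI, GGUF files in this layout can also be consumed from Python. As a minimal sketch (not part of the original card), assuming the community `llama-cpp-python` bindings are installed via `pip install llama-cpp-python`:

```python
# Sketch: loading this GGUF with llama-cpp-python (an assumption; not the
# only way to run the model). from_pretrained fetches the file from the Hub.
LOAD_PARAMS = {
    "repo_id": "KnutJaegersberg/phi-4-abliterated-Q8_0-GGUF",
    "filename": "phi-4-abliterated-q8_0.gguf",
    "n_ctx": 2048,  # context window size
}

def load_model():
    """Download (if needed) and load the quantized model."""
    from llama_cpp import Llama  # imported lazily so the sketch parses without the package
    return Llama.from_pretrained(**LOAD_PARAMS)

# Example usage (downloads the full GGUF file on first use):
# llm = load_model()
# out = llm("The meaning to life and the universe is", max_tokens=64)
# print(out["choices"][0]["text"])
```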

Training

Training details for the base model can be found in the original model card from Orion-zhen. Conversion to GGUF does not alter the training methodology; it repackages (and quantizes) the model for efficient deployment with llama.cpp.

Guide: Running Locally

To run phi-4-abliterated-Q8_0-GGUF locally, follow these steps:

  1. Install llama.cpp:
    Use Homebrew to install llama.cpp on macOS or Linux:

    brew install llama.cpp
    
  2. Invoke Using CLI or Server:

    • CLI:

      llama-cli --hf-repo KnutJaegersberg/phi-4-abliterated-Q8_0-GGUF --hf-file phi-4-abliterated-q8_0.gguf -p "The meaning to life and the universe is"
      
    • Server:

      llama-server --hf-repo KnutJaegersberg/phi-4-abliterated-Q8_0-GGUF --hf-file phi-4-abliterated-q8_0.gguf -c 2048
      
  3. Alternative Setup:

    • Clone the llama.cpp repository:
      git clone https://github.com/ggerganov/llama.cpp
      
    • Navigate into the folder and build with the LLAMA_CURL=1 flag (which enables downloading models directly from Hugging Face), along with any hardware-specific flags (e.g., LLAMA_CUDA=1 for Nvidia GPUs):
      cd llama.cpp && LLAMA_CURL=1 make
      
    • Run inference with:
      ./llama-cli --hf-repo KnutJaegersberg/phi-4-abliterated-Q8_0-GGUF --hf-file phi-4-abliterated-q8_0.gguf -p "The meaning to life and the universe is"
      
      or
      ./llama-server --hf-repo KnutJaegersberg/phi-4-abliterated-Q8_0-GGUF --hf-file phi-4-abliterated-q8_0.gguf -c 2048
      
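Once llama-server is running, it exposes an OpenAI-compatible HTTP API, by default on port 8080. As a minimal sketch of querying it, assuming a server started with the command above on localhost (the helper names and prompt here are illustrative, not from the original card):

```python
import json
import urllib.request

# llama-server's default listen address; adjust if you pass --host/--port.
SERVER_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-style chat completion payload for llama-server."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def ask(prompt: str) -> str:
    """Send the prompt to a running llama-server and return the reply text."""
    payload = build_chat_request(prompt)
    req = urllib.request.Request(
        SERVER_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example usage (requires the server to be running):
# print(ask("The meaning to life and the universe is"))
```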

For optimal performance, consider using cloud GPUs such as those offered by AWS, Google Cloud, or Azure.

License

The phi-4-abliterated-Q8_0-GGUF model is distributed under an "other" license. Review the specific licensing terms in the original model card to ensure compliance before use.
