Falcon3-7B-Instruct-abliterated-GGUF

bartowski

Introduction

The Falcon3-7B-Instruct-abliterated-GGUF model is designed for text generation and supports four languages: English, French, Spanish, and Portuguese. It is distributed in the GGUF file format and offers uncensored, conversational capabilities. The model is provided under the Falcon LLM license.

Architecture

This model is a quantized variant of the original Falcon3-7B-Instruct-abliterated model, produced with llama.cpp release b4381. The repository offers multiple quantization formats to balance output quality against file size, including variants optimized for specific hardware such as ARM and AVX CPUs.
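Because the quantizations were made with a pinned release, you can build the same llama.cpp version to run them. A minimal sketch, assuming a standard CMake toolchain (the repository URL and build commands are general llama.cpp usage, not taken from this model card):

    git clone https://github.com/ggerganov/llama.cpp
    cd llama.cpp
    git checkout b4381    # the release used to produce these quants
    cmake -B build
    cmake --build build --config Release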

Training

No additional training was performed for this release; the changes are in quantization. The files were quantized using llama.cpp's imatrix option with a calibration dataset, producing a range of formats to suit different quality and performance requirements, from high-quality options like Q8_0 down to more compact formats such as IQ3_XS.
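For context, the sketch below shows the generic imatrix workflow in llama.cpp; the calibration file, source precision, and quant level are illustrative placeholders rather than the exact inputs used for this repository:

    # compute an importance matrix from a calibration corpus
    ./build/bin/llama-imatrix -m Falcon3-7B-Instruct-abliterated-f16.gguf \
        -f calibration.txt -o imatrix.dat

    # quantize the full-precision GGUF using that matrix
    ./build/bin/llama-quantize --imatrix imatrix.dat \
        Falcon3-7B-Instruct-abliterated-f16.gguf \
        Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf Q4_K_M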

Guide: Running Locally

To run the model locally, follow these steps:

  1. Install dependencies: Ensure you have the huggingface_hub CLI installed:

    pip install -U "huggingface_hub[cli]"
    
  2. Download the model: Use the CLI to download the desired quantized file. For example:

    huggingface-cli download bartowski/Falcon3-7B-Instruct-abliterated-GGUF --include "Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf" --local-dir ./
    
  3. Select the appropriate file: Choose a quantization file that fits your system's RAM and VRAM. As a rule of thumb, pick a file 1-2GB smaller than your GPU's VRAM so that both the weights and the KV cache fit on the GPU; on an NVIDIA system you can check VRAM as sketched below.
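     A minimal sketch for checking VRAM (nvidia-smi ships with the NVIDIA driver; the query flags below are standard, though output formatting varies by driver version):

    nvidia-smi --query-gpu=memory.total,memory.free --format=csv

     For scale, a Q4_K_M file for a 7B model is roughly 4-5GB, so it fits comfortably within 8GB of VRAM.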

  4. Run the model: Load the downloaded GGUF file in LM Studio or any llama.cpp-compatible inference engine; a command-line sketch follows this list.
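As one concrete option, the llama-cli binary built in the Architecture section above can run the file directly. A minimal sketch, assuming the Q4_K_M file from step 2 and an illustrative prompt:

    ./build/bin/llama-cli -m ./Falcon3-7B-Instruct-abliterated-Q4_K_M.gguf \
        -p "Explain what GGUF quantization does in one paragraph." -n 256

For an interactive chat session that uses the model's built-in chat template, add the -cnv flag.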

Cloud GPUs

For enhanced performance, consider GPU instances from cloud providers such as AWS, Google Cloud, or Azure, which can accommodate larger quantizations and provide faster inference.

License

The Falcon3-7B-Instruct-abliterated-GGUF model is licensed under the Falcon LLM license. For detailed terms and conditions, visit the license page.
