llama2_7b_chat_uncensored

georgesung

Introduction

The llama2_7b_chat_uncensored model is a fine-tuned version of Llama-2 7B, trained on an uncensored/unfiltered Wizard-Vicuna conversation dataset. Fine-tuning used QLoRA and ran for one epoch on a single 24 GB NVIDIA A10G GPU, taking approximately 19 hours. The resulting fp16 weights are available on Hugging Face.
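
As a minimal, hedged sketch of loading the fp16 weights with the Hugging Face transformers library (the ### HUMAN:/### RESPONSE: prompt style is reportedly the format used during fine-tuning):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "georgesung/llama2_7b_chat_uncensored"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    # Prompt format reportedly used during fine-tuning.
    prompt = "### HUMAN:\nWhat is the capital of France?\n\n### RESPONSE:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))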

Architecture

The model builds upon the Llama-2 7B architecture, fine-tuned on a conversation dataset to improve its dialogue and text generation capabilities. Community quantizations in GGML and GPTQ formats are also available, contributed by TheBloke.
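
As a hedged sketch of running one of the quantized files locally with the llama-cpp-python bindings (the file name below is hypothetical; download an actual quantized file from the corresponding TheBloke repository first):

    from llama_cpp import Llama

    # Hypothetical file name; substitute a real quantized file.
    llm = Llama(model_path="./llama2_7b_chat_uncensored.q4_K_M.gguf", n_ctx=2048)

    output = llm(
        "### HUMAN:\nWrite a haiku about GPUs.\n\n### RESPONSE:\n",
        max_tokens=128,
        stop=["### HUMAN:"],  # stop before the model starts a new turn
    )
    print(output["choices"][0]["text"])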

Training

Training used QLoRA on the uncensored Wizard-Vicuna dataset originally published by ehartford, and ran for one epoch on an NVIDIA A10G GPU in approximately 19 hours.
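
The llm_qlora repository drives training from a YAML config; as a rough, hedged sketch of what a QLoRA setup looks like in general using peft and bitsandbytes (the hyperparameters and base-model id here are illustrative, not the repository's actual values):

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    # 4-bit NF4 quantization (the "Q" in QLoRA), computing in fp16.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    # Gated base model; requires access on Hugging Face.
    base = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Llama-2-7b-hf", quantization_config=bnb_config, device_map="auto"
    )
    base = prepare_model_for_kbit_training(base)

    # Illustrative LoRA hyperparameters; the repo's YAML config is authoritative.
    lora_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "v_proj"],
        bias="none", task_type="CAUSAL_LM",
    )
    model = get_peft_model(base, lora_config)
    model.print_trainable_parameters()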

Guide: Running Locally

To reproduce the fine-tuning locally, follow these steps (see the inference sketches above for simply running the finished model):

  1. Clone the Repository:

    git clone https://github.com/georgesung/llm_qlora
    cd llm_qlora
    
  2. Install Requirements:

    pip install -r requirements.txt
    
  3. Run the Training Script:

    python train.py configs/llama2_7b_chat_uncensored.yaml
    

For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure to access high-performance GPUs like the NVIDIA A10G.
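
Before launching a run on a cloud instance, a quick sanity check that a suitable CUDA GPU is visible (a minimal sketch; the 24 GB figure comes from the training setup described above):

    import torch

    # Confirm a CUDA device is present and report its memory budget.
    assert torch.cuda.is_available(), "No CUDA GPU visible"
    props = torch.cuda.get_device_properties(0)
    print(f"{props.name}: {props.total_memory / 1e9:.1f} GB")
    # The fine-tuning described above fit in 24 GB (NVIDIA A10G).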

License

The model and associated files are distributed under an unspecified license, indicated as "other." Please review the license terms provided in the repository or on the Hugging Face model page for specific usage rights and restrictions.
