llama2_7b_chat_uncensored
georgesung
Introduction
The llama2_7b_chat_uncensored model is a fine-tuned version of Llama-2 7B, trained on an uncensored/unfiltered Wizard-Vicuna conversation dataset. Fine-tuning used QLoRA and ran for one epoch on a single 24 GB NVIDIA A10G GPU, taking approximately 19 hours. The model is available in fp16 format on Hugging Face.
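Chat models fine-tuned this way expect prompts in a fixed template. Models in this Wizard-Vicuna-style family commonly use "### HUMAN:" / "### RESPONSE:" markers; the sketch below assumes that template, so verify it against the model card before relying on it:

```python
# Sketch: wrapping a user message in the chat template this model family
# is commonly reported to use. The exact markers are an assumption --
# confirm against the Hugging Face model card.

def format_prompt(user_message: str) -> str:
    """Wrap a user message in the assumed '### HUMAN:' / '### RESPONSE:' template."""
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"

print(format_prompt("What is QLoRA?"))
```

The trailing "### RESPONSE:" line cues the model to generate the assistant turn; text generated after it is the reply.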
Architecture
The model builds upon the Llama-2 7B architecture and has been fine-tuned using a conversation dataset to enhance its text generation capabilities. It is compatible with various model formats, including GGML and GPTQ, thanks to contributions from TheBloke.
Training
The training utilized QLoRA and was performed using the uncensored Wizard-Vicuna dataset, originally sourced from ehartford. The training took place over one epoch on an NVIDIA A10G GPU, which required around 19 hours of processing time.
Guide: Running Locally
To run the model locally, follow these steps:
- Clone the repository:
  git clone https://github.com/georgesung/llm_qlora
  cd llm_qlora
- Install the requirements:
  pip install -r requirements.txt
- Run the training script:
  python train.py configs/llama2_7b_chat_uncensored.yaml
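The YAML file passed to train.py bundles the base model, dataset, and QLoRA hyperparameters into one reproducible config. The sketch below is illustrative only: the field names and values are assumptions, not the repo's actual schema, so consult configs/llama2_7b_chat_uncensored.yaml for the real settings.

```yaml
# Hypothetical config sketch -- field names and values are assumptions,
# not the actual schema used by llm_qlora.
base_model: meta-llama/Llama-2-7b-hf      # Llama-2 7B base (per the model card)
dataset: wizard_vicuna_unfiltered          # uncensored Wizard-Vicuna conversations
lora:
  r: 8                                     # LoRA rank (illustrative value)
  alpha: 32
  dropout: 0.05
training:
  epochs: 1                                # the card reports one epoch
  precision: fp16                          # released weights are fp16
```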
For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure to access high-performance GPUs like the NVIDIA A10G.
License
The model and associated files are distributed under an unspecified license, indicated as "other." Please review the license terms provided in the repository or on the Hugging Face model page for specific usage rights and restrictions.