Qwen2.5-32B-Instruct-abliterated
huihui-ai
Introduction
Qwen2.5-32B-Instruct-Abliterated is an uncensored version of the Qwen2.5-32B-Instruct model, utilizing a technique called "abliteration." It is designed for conversational and text generation tasks.
Architecture
The model is loaded with the Hugging Face Transformers library and is compatible with the Safetensors format. It supports text generation and inference endpoints, and operates primarily in English.
Training
The model utilizes "abliteration," a special technique created by @FailSpy. This method modifies the base Qwen2.5-32B-Instruct model to produce uncensored outputs. The original model was developed by Alibaba Cloud.
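The source does not describe how abliteration works internally. As publicly described, the technique amounts to directional ablation: a "refusal direction" is estimated in the model's activation space, and weight matrices that write into that space are orthogonalized against it. The toy sketch below illustrates only that projection step on a small matrix. The function names and the example values are hypothetical, and this is not the code used to produce this model.

```python
import math


def normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def ablate_direction(weight, direction):
    """Return W' = W - r (r^T W), where r is the unit-length direction.

    `weight` is a list of rows with shape (d_model, d_in), so the layer
    output is W @ x. After the update, the output has no component along r.
    This is an illustrative assumption about the method, not FailSpy's code.
    """
    r = normalize(direction)
    n_rows, n_cols = len(weight), len(weight[0])
    # r^T W: how strongly each input column writes along the direction r
    projection = [
        sum(r[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)
    ]
    # Subtract that component from every row
    return [
        [weight[i][j] - r[i] * projection[j] for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical 3x2 weight matrix and a direction along the third axis
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
refusal_direction = [0.0, 0.0, 2.0]
W_ablated = ablate_direction(W, refusal_direction)
print(W_ablated)
```

In this toy case the direction lies along the third axis, so the third row of the matrix is zeroed and the other rows are unchanged. In a real model the direction would be estimated from activations, which this sketch does not cover.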
Guide: Running Locally
To run the model locally:
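- Install Dependencies: Ensure Python is installed and set up your environment. Install the Transformers library together with PyTorch and Accelerate (Accelerate is required for the device_map="auto" option used when loading the model):
    pip install transformers torch accelerate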
- Load the Model and Tokenizer: Use the following Python script to initialize the model:
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "huihui-ai/Qwen2.5-32B-Instruct-abliterated"

    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto",
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
- Run a Conversation: Implement a loop to interact with the model. Type /exit to quit or /clean to reset the conversation history:
    initial_messages = [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
    ]
    messages = initial_messages.copy()

    while True:
        user_input = input("User: ").strip()
        if user_input.lower() == "/exit":
            break
        if user_input.lower() == "/clean":
            # Reset the history to the system prompt only
            messages = initial_messages.copy()
            continue
        if not user_input:
            continue

        messages.append({"role": "user", "content": user_input})
        # Render the chat history into the model's prompt format
        text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

        generated_ids = model.generate(**model_inputs, max_new_tokens=8192)
        # generate() returns prompt plus completion; keep only the new tokens
        generated_ids = [
            output_ids[len(input_ids):]
            for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
        ]
        response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

        messages.append({"role": "assistant", "content": response})
        print(f"Qwen: {response}")
- Consider Cloud GPUs: A 32B-parameter model needs substantial GPU memory. If local hardware is insufficient, consider using cloud GPUs available from providers such as AWS, Google Cloud, or Azure.
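To judge whether local hardware is enough before renting cloud GPUs, a rough estimate of the memory needed for the weights alone can help. The sketch below assumes about 32 billion parameters (taken from the model name, not an exact count) and typical bytes-per-parameter values for common precisions. It ignores the KV cache, activations, and framework overhead, so real usage will be higher. The function name is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 1024**3


# Assumption: roughly 32 billion parameters, inferred from the model name
NUM_PARAMS = 32e9

# Typical storage cost per parameter for common precisions
PRECISIONS = [
    ("float32", 4),
    ("bfloat16/float16", 2),
    ("int8", 1),
    ("4-bit", 0.5),
]

for label, nbytes in PRECISIONS:
    gib = weight_memory_gib(NUM_PARAMS, nbytes)
    print(f"{label:>16}: ~{gib:.0f} GiB for weights")
```

Under these assumptions, half-precision weights alone come to roughly 60 GiB, which is more than a single consumer GPU typically offers. This is why multi-GPU setups, quantization, or cloud instances are commonly used for models of this size.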
License
Qwen2.5-32B-Instruct-Abliterated is licensed under the Apache-2.0 License. For more details, refer to the license file.