Qwen2.5-32B-Instruct-abliterated

huihui-ai

Introduction

Qwen2.5-32B-Instruct-abliterated is an uncensored version of the Qwen2.5-32B-Instruct model, produced with a technique called "abliteration." It is intended for conversational and general text-generation tasks.

Architecture

The model is built with the Transformers library and distributed in the Safetensors format. It supports text generation, including hosted inference endpoints, and operates primarily in English.

Training

The model applies "abliteration," a technique created by @FailSpy. In essence, abliteration identifies the internal activation direction associated with refusals and removes it, so the model retains its capabilities while no longer declining requests. The base Qwen2.5-32B-Instruct model was developed by Alibaba Cloud.
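
To give a sense of the mechanics, below is a minimal conceptual sketch of directional ablation, not FailSpy's actual implementation; the helper names and the choice of layer and prompt sets are hypothetical:

    import torch

    def estimate_refusal_direction(harmful_acts: torch.Tensor,
                                   harmless_acts: torch.Tensor) -> torch.Tensor:
        # Each input is (num_prompts, hidden_size): hidden states collected at
        # some layer on refusal-triggering vs. harmless prompts. The refusal
        # direction is the normalized difference of the two means.
        direction = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
        return direction / direction.norm()

    def ablate_direction(W: torch.Tensor, d: torch.Tensor) -> torch.Tensor:
        # Project the refusal direction out of a weight matrix that writes
        # into the residual stream: W' = (I - d d^T) W.
        return W - torch.outer(d, d @ W)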

Guide: Running Locally

To run the model locally:

  1. Install Dependencies: Ensure Python is installed and set up your environment, then install the required libraries (accelerate is needed for the device_map="auto" option used below):

    pip install transformers torch accelerate
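
  Optionally, verify that PyTorch can see a GPU before attempting to load a 32B model:

    import torch
    print(torch.__version__, torch.cuda.is_available())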
    
  2. Load the Model and Tokenizer: Use the following Python script to initialize the model:

    from transformers import AutoModelForCausalLM, AutoTokenizer
    
    model_name = "huihui-ai/Qwen2.5-32B-Instruct-abliterated"
    # torch_dtype="auto" uses the dtype stored in the checkpoint;
    # device_map="auto" places the model across available GPUs (needs accelerate).
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
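
  If the full-precision model does not fit on your GPU, one common option (an assumption here, not part of the original guide) is 4-bit quantized loading via bitsandbytes (pip install bitsandbytes):

    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # 4-bit NF4 quantization cuts weight memory from ~65 GB (bf16) to
    # roughly 20 GB, at some cost in output quality.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=bnb_config,
        device_map="auto",
    )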
    
  3. Run a Conversation: Implement a simple loop to interact with the model (type /exit to quit or /clean to reset the conversation history):

    initial_messages = [
        {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."}
    ]
    messages = initial_messages.copy()
    
    while True:
        user_input = input("User: ").strip()
        if user_input.lower() == "/exit":
            break
        if user_input.lower() == "/clean":
            messages = initial_messages.copy()
            continue
        if not user_input:
            continue
        messages.append({"role": "user", "content": user_input})
        text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
        model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
        generated_ids = model.generate(**model_inputs, max_new_tokens=8192)
        # Strip the prompt tokens so only the newly generated reply is decoded.
        generated_ids = [
            output_ids[len(input_ids):]
            for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
        ]
        response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
        messages.append({"role": "assistant", "content": response})
        print(f"Qwen: {response}")
    
  4. Consider Cloud GPUs: A 32B-parameter model needs roughly 65 GB of memory for the bf16 weights alone, so for full-precision inference consider cloud GPUs (e.g., 80 GB-class accelerators) from providers such as AWS, Google Cloud, or Azure.

License

Qwen2.5-32B-Instruct-abliterated is licensed under the Apache-2.0 License. For more details, refer to the license file.
