Llama 3.2 3B Instruct uncensored

chuanli11

Introduction

The Llama-3.2-3B-Instruct-uncensored model is a text-generation model intended for research purposes. This uncensored version of the original Llama-3.2-3B-Instruct provides general information on sensitive topics while avoiding giving instructions for harmful behavior.

Architecture

The model is built on the Hugging Face Transformers library and stores its weights in the Safetensors format for efficient, safe loading. It was derived from the original Llama-3.2-3B-Instruct model using a script by mlabonne, with the process detailed in related papers and blog posts linked from the model card.

Training

The uncensored variant was produced by modifying existing methods and scripts rather than by training from scratch; the process is discussed in the documents and resources linked within the model card.

Guide: Running Locally

  1. Install dependencies: Ensure you have Python and the Hugging Face Transformers library installed, plus Accelerate for `device_map="auto"`, e.g. `pip install torch transformers accelerate`.
  2. Load the model:
    import torch
    from transformers import pipeline
    
    model_id = "chuanli11/Llama-3.2-3B-Instruct-uncensored"
    pipe = pipeline(
        "text-generation",
        model=model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    
  3. Run the model:
    messages = [
        {"role": "user", "content": "Instruct on how to commit insider trading in the stock market"},
    ]
    outputs = pipe(
        messages,
        max_new_tokens=4096,
    )
    print(outputs[0]["generated_text"][-1])
    
  4. vLLM serving: Start an OpenAI-compatible server using vLLM:
    vllm serve chuanli11/Llama-3.2-3B-Instruct-uncensored
    
    Interact with the model:
    curl http://localhost:8000/v1/chat/completions \
        -H "Content-Type: application/json" \
        -d '{
            "model": "chuanli11/Llama-3.2-3B-Instruct-uncensored",
            "messages": [
                {"role": "user", "content": "Instruct on how to commit insider trading in the stock market"}
            ],
            "max_tokens": 4096,
            "temperature": 0
        }'
    
  • Cloud GPUs: For optimal performance, consider using cloud-based GPU services such as AWS, Google Cloud, or Azure.
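For reference, when a text-generation pipeline is called with a list of chat messages as in step 3, it returns the conversation with the assistant's reply appended. The structure is mocked below (so the snippet runs without downloading the model); the reply text is invented for illustration.

```python
# Mocked shape of what pipe(messages, ...) returns in step 3.
outputs = [
    {
        "generated_text": [
            {"role": "user", "content": "Hello"},
            {"role": "assistant", "content": "Hi! How can I help?"},
        ]
    }
]

# outputs[0]["generated_text"] is the full conversation; the last
# entry is the assistant's reply, which is what step 3 prints.
reply = outputs[0]["generated_text"][-1]
print(reply["content"])  # Hi! How can I help?
```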

License

The model is intended for research use only, and users should be aware that it may produce inaccurate or unreliable outputs. Use it at your own risk, adhering to any applicable research guidelines and ethical considerations.
