deberta v3 base prompt injection

protectai

Introduction

The DeBERTa-v3-base-prompt-injection model is a fine-tuned version of Microsoft's DeBERTa-v3-base, designed to identify prompt injections. It classifies each input into one of two categories: 0 for no injection and 1 for injection detected. The model card reports near-perfect accuracy, recall, precision, and F1 scores on its evaluation set.

Architecture

The model uses the DeBERTa-v3 architecture with a sequence-classification head. It was fine-tuned on a combination of several datasets to detect malicious prompt injections in natural-language inputs.

Training

The model was fine-tuned using a mixed dataset comprising approximately 30% prompt injection examples and 70% normal prompts. Key training hyperparameters include:

  • Learning Rate: 2e-05
  • Train Batch Size: 8
  • Eval Batch Size: 8
  • Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
  • Number of Epochs: 3

The training process utilized a linear learning rate scheduler with 500 warm-up steps. Framework versions used include Transformers 4.35.2, PyTorch 2.1.1+cu121, Datasets 2.15.0, and Tokenizers 0.15.0.
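To make the schedule concrete, the following is a minimal sketch of a linear warm-up followed by linear decay, using the learning rate (2e-05) and warm-up length (500 steps) listed above. The function name `linear_warmup_lr` and the total step count of 3000 are hypothetical: the actual number of training steps depends on the dataset size, which is not stated here.

```python
def linear_warmup_lr(step, total_steps, base_lr=2e-5, warmup_steps=500):
    """Learning rate at an optimizer step: linear warm-up, then linear decay to zero."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warm-up steps.
        return base_lr * step / warmup_steps
    # Decay linearly from base_lr at the end of warm-up to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# total_steps=3000 is an illustrative assumption, not a value from the model card.
for step in (0, 250, 500, 1750, 3000):
    print(step, linear_warmup_lr(step, total_steps=3000))
```

Under these assumptions the rate peaks at 2e-05 once warm-up ends at step 500 and reaches zero at the final step.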

Guide: Running Locally

Basic Steps

  1. Install Dependencies: Ensure that Python and pip are installed, then install the required libraries:

    pip install transformers torch optimum
    
  2. Load Model: Use the following code to load the model and tokenizer:

    from transformers import AutoTokenizer, AutoModelForSequenceClassification, pipeline
    import torch
    
    tokenizer = AutoTokenizer.from_pretrained("ProtectAI/deberta-v3-base-prompt-injection")
    model = AutoModelForSequenceClassification.from_pretrained("ProtectAI/deberta-v3-base-prompt-injection")
    
    classifier = pipeline(
      "text-classification",
      model=model,
      tokenizer=tokenizer,
      truncation=True,
      max_length=512,
      device=torch.device("cuda" if torch.cuda.is_available() else "cpu"),
    )
    
  3. Run Inference: Test the model with a sample input:

    print(classifier("Your prompt injection is here"))
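The text-classification pipeline returns a list of dictionaries, each with a `label` and a `score`. The sketch below shows one way to turn that output into a yes/no decision. The helper name `flag_injections`, the 0.5 threshold, and the label string "INJECTION" are assumptions for illustration; check `model.config.id2label` for the label names the model actually emits.

```python
def flag_injections(results, threshold=0.5, injection_label="INJECTION"):
    """Return one boolean per pipeline result: True if flagged as an injection.

    `injection_label` and `threshold` are assumed values; adjust them to match
    the model's id2label mapping and your tolerance for false positives.
    """
    return [
        r["label"] == injection_label and r["score"] >= threshold
        for r in results
    ]


# Hand-written sample shaped like pipeline output, not real model predictions.
sample = [
    {"label": "INJECTION", "score": 0.99},
    {"label": "SAFE", "score": 0.98},
]
print(flag_injections(sample))  # prints [True, False]
```

With the classifier from step 2, the same helper could be applied as `flag_injections(classifier(["first prompt", "second prompt"]))`.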
    

Cloud GPUs

For higher inference throughput, consider running the model on a cloud GPU from a provider such as AWS, Google Cloud, or Azure. The loading code above selects CUDA automatically when a GPU is available, so no code changes are needed to benefit from it.

License

The model is licensed under the Apache License 2.0, which allows for both personal and commercial use, modifications, and distribution. Ensure compliance with the license terms when using the model.
