Stable Diffusion Safety Checker
Introduction
The Stable Diffusion Safety Checker is an image classification model developed by CompVis to detect NSFW (Not Safe For Work) content. It is based on the CLIP model architecture and is primarily intended for AI researchers to assess the robustness, generalization, and biases of computer vision models.
Architecture
The model uses the ViT-L/14 Transformer architecture for the image encoder and a masked self-attention Transformer for the text encoder. These components are trained to maximize the similarity of matched image-text pairs using a contrastive loss.
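For intuition, the following is a minimal sketch of such a CLIP-style contrastive objective in PyTorch. It is illustrative only: the function name and the temperature value are assumptions, not details taken from the model card.

import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    # L2-normalize so the dot product equals cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix; matched pairs sit on the diagonal.
    logits = image_emb @ text_emb.t() / temperature
    targets = torch.arange(logits.size(0), device=logits.device)

    # Symmetric cross-entropy: image-to-text and text-to-image.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

Each embedding is pulled toward its paired counterpart and pushed away from every other item in the batch, which is what lets the checker compare image features against concept embeddings.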
Training
Training Data
Details about the specific training data used are not provided.
Training Procedure
No specific information is available about the preprocessing, speeds, sizes, or times associated with the training.
Evaluation
The model's evaluation details, including testing data, factors, and metrics, are not disclosed. Carbon emissions from training can be estimated with the Machine Learning Impact calculator, although the specific hardware, training hours, and cloud provider are likewise not reported.
Guide: Running Locally
To use the Stable Diffusion Safety Checker, follow these steps:
- Install the necessary libraries (the safety checker class itself ships with diffusers):

  pip install transformers diffusers
- Load the model in your Python environment. Note that the checker class is StableDiffusionSafetyChecker from diffusers, paired with a feature extractor from transformers:

  from transformers import AutoFeatureExtractor
  from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker

  processor = AutoFeatureExtractor.from_pretrained("CompVis/stable-diffusion-safety-checker")
  safety_checker = StableDiffusionSafetyChecker.from_pretrained("CompVis/stable-diffusion-safety-checker")
- Follow the Hugging Face documentation for further guidance on processing images; a usage sketch follows this list.
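As a concrete illustration, the sketch below runs the loaded checker on a local image. It follows the StableDiffusionSafetyChecker forward signature in diffusers, which takes the raw images alongside the CLIP-preprocessed pixel values; the file name image.png is a placeholder for any RGB image.

  import numpy as np
  from PIL import Image
  from transformers import AutoFeatureExtractor
  from diffusers.pipelines.stable_diffusion.safety_checker import StableDiffusionSafetyChecker

  model_id = "CompVis/stable-diffusion-safety-checker"
  feature_extractor = AutoFeatureExtractor.from_pretrained(model_id)
  safety_checker = StableDiffusionSafetyChecker.from_pretrained(model_id)

  # "image.png" is a placeholder path; substitute your own file.
  image = Image.open("image.png").convert("RGB")
  safety_input = feature_extractor([image], return_tensors="pt")

  # The checker returns the (possibly blacked-out) images together with
  # a per-image NSFW flag.
  images, has_nsfw = safety_checker(
      images=[np.array(image)],
      clip_input=safety_input.pixel_values,
  )
  print(has_nsfw)  # e.g. [False]

Images flagged as NSFW are replaced with black images in the returned list, mirroring how the checker is used inside the Stable Diffusion pipeline.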
For improved performance, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.
License
The license for the Stable Diffusion Safety Checker is not specified in the provided documentation. Please refer to the official repository or contact the authors for detailed licensing information.