sdxl turbo
stabilityai
Introduction
SDXL-Turbo is a fast generative text-to-image model capable of synthesizing photorealistic images from a text prompt in a single network evaluation. It is designed for real-time synthesis using Adversarial Diffusion Distillation (ADD), which enables high-quality image generation in just 1 to 4 steps.
Architecture
SDXL-Turbo is a distilled version of SDXL 1.0, utilizing a novel training method called ADD. This approach combines score distillation and adversarial loss to maintain high image fidelity even with low-step sampling. It is a generative text-to-image model developed and funded by Stability AI.
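As a rough sketch, the combination of the two losses described above can be written as a weighted sum. The symbols are assumptions chosen for illustration, loosely following the notation of the ADD paper rather than anything stated on this page: $\hat{x}_\theta(x_s, s)$ is the student's prediction from a noised input $x_s$ at timestep $s$, $\phi$ denotes the discriminator parameters, $\psi$ the frozen teacher diffusion model, and $\lambda$ a weighting factor.

```latex
\mathcal{L}_{\mathrm{ADD}}
  = \mathcal{L}_{\mathrm{adv}}^{G}\big(\hat{x}_\theta(x_s, s),\, \phi\big)
  + \lambda \, \mathcal{L}_{\mathrm{distill}}\big(\hat{x}_\theta(x_s, s),\, \psi\big)
```

The adversarial term pushes single-step outputs toward the real-image distribution, while the distillation term keeps the student close to what the teacher would predict.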
Training
The model is fine-tuned from SDXL 1.0 Base and uses large-scale off-the-shelf image diffusion models as a teacher signal. Combined with the adversarial loss, this teacher signal is what keeps image quality high at only 1 to 4 sampling steps.
Guide: Running Locally
- Installation: Install the required packages using pip.
      pip install diffusers transformers accelerate --upgrade
- Setup: Use the diffusers library to load and run the model.
  - For text-to-image generation:
        from diffusers import AutoPipelineForText2Image
        import torch

        # Load the fp16 weights and move the pipeline to the GPU.
        pipe = AutoPipelineForText2Image.from_pretrained(
            "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
        )
        pipe.to("cuda")

        prompt = "A cinematic shot of a baby raccoon wearing an intricate Italian priest robe."
        # SDXL-Turbo does not use guidance, so guidance_scale is set to 0.0.
        image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0.0).images[0]
  - For image-to-image generation:
        from diffusers import AutoPipelineForImage2Image
        from diffusers.utils import load_image
        import torch

        pipe = AutoPipelineForImage2Image.from_pretrained(
            "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
        )
        pipe.to("cuda")

        # Download the example image and resize it to 512x512.
        init_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png").resize((512, 512))
        prompt = "cat wizard, gandalf, lord of the rings, detailed, fantasy, cute, adorable, Pixar, Disney, 8k"
        # The pipeline runs int(num_inference_steps * strength) steps, here int(2 * 0.5) = 1.
        image = pipe(prompt, image=init_image, num_inference_steps=2, strength=0.5, guidance_scale=0.0).images[0]
- Hardware Recommendations: The examples above load fp16 weights onto a CUDA GPU via pipe.to("cuda"). If no suitable local GPU is available, cloud GPUs from providers such as AWS, Google Cloud, or Azure are an option.
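The sampling settings used in the Setup step can be collected in small helpers that check them before the pipeline is called. The following is a minimal sketch, not part of diffusers: the function names `turbo_text2img_kwargs` and `turbo_img2img_kwargs` are hypothetical, and treating the 1 to 4 step range as a hard limit is an assumption made here for illustration. The image-to-image check reflects that the pipeline runs `int(num_inference_steps * strength)` denoising steps, so that product needs to be at least 1.

```python
def turbo_text2img_kwargs(prompt, num_inference_steps=1):
    """Build keyword arguments for a text-to-image call (hypothetical helper)."""
    # Assumption: enforce the 1 to 4 step range the model is designed for.
    if not 1 <= num_inference_steps <= 4:
        raise ValueError("SDXL-Turbo is intended for 1 to 4 sampling steps")
    return {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": 0.0,  # guidance is disabled for SDXL-Turbo
    }


def turbo_img2img_kwargs(prompt, image, num_inference_steps=2, strength=0.5):
    """Build keyword arguments for an image-to-image call (hypothetical helper)."""
    if not 0.0 < strength <= 1.0:
        raise ValueError("strength must be in the interval (0, 1]")
    # The image-to-image pipeline runs int(num_inference_steps * strength) steps.
    effective_steps = int(num_inference_steps * strength)
    if effective_steps < 1:
        raise ValueError(
            "num_inference_steps * strength must be at least 1, "
            f"got {num_inference_steps} * {strength}"
        )
    return {
        "prompt": prompt,
        "image": image,
        "num_inference_steps": num_inference_steps,
        "strength": strength,
        "guidance_scale": 0.0,
    }
```

With a pipeline loaded as in the Setup step, a call would then look like `pipe(**turbo_text2img_kwargs(prompt)).images[0]`, and the resulting PIL image can be written to disk with `image.save("output.png")`.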
License
SDXL-Turbo is available under the sai-nc-community license. For commercial use, consult the Stability AI license at https://stability.ai/license.