SDXL-VAE-FP16-Fix

madebyollin

Introduction

SDXL-VAE-FP16-Fix is a modified version of the SDXL VAE designed to operate in fp16 precision without producing NaNs (Not a Number). The original SDXL-VAE generates NaNs in fp16 because some internal activations exceed the fp16 range; this fix avoids the problem by adjusting the network's internal parameters.
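
As a quick sanity check, the fixed VAE can be loaded on its own in fp16 and used to decode a latent without producing NaNs. A minimal sketch, assuming the diffusers library is installed and a CUDA GPU is available:

  import torch
  from diffusers import AutoencoderKL

  # Load the fixed VAE directly in fp16.
  vae = AutoencoderKL.from_pretrained(
      "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
  ).to("cuda")

  # Decode a random SDXL-shaped latent (4 channels, 128x128 -> a 1024x1024 image).
  latent = torch.randn(1, 4, 128, 128, dtype=torch.float16, device="cuda")
  with torch.no_grad():
      image = vae.decode(latent).sample

  print(torch.isnan(image).any())  # expected: tensor(False, device='cuda:0')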

Architecture

The fix scales down weights and biases inside the network so that internal activation values stay within the representable range of fp16, reducing the likelihood of generating NaNs in fp16 mode. The decoded output remains very close to (though not bit-identical with) the original SDXL-VAE's output, so the fixed VAE is a drop-in replacement for most use cases.
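
The underlying issue is the limited range of fp16: its largest finite value is roughly 65504, so any intermediate activation beyond that overflows to infinity, and later operations on infinities produce NaNs. A toy illustration of the failure mode (not part of the VAE itself):

  import torch

  x = torch.tensor(70000.0).to(torch.float16)  # exceeds the fp16 maximum (~65504)
  print(x)      # tensor(inf, dtype=torch.float16)
  print(x - x)  # tensor(nan, dtype=torch.float16) -- inf minus inf is NaN

Keeping the activations small, as the fix does, sidesteps this overflow entirely.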

Training

SDXL-VAE-FP16-Fix was finetuned from the original SDXL-VAE with two goals: keep the decoded output essentially unchanged, and shrink the magnitude of the internal activations. The finetuning adjusts the network's weights and biases, yielding stable fp16 execution with only minor differences in output.
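
The exact training recipe is not detailed here; the following is only a conceptual sketch of the kind of objective this description implies (match the original model's output while penalizing large internal activations), demonstrated on a toy network rather than the real VAE:

  import torch
  import torch.nn as nn

  # Toy stand-ins for the original decoder and the finetuned "fixed" decoder.
  original = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8)).eval()
  fixed = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))
  fixed.load_state_dict(original.state_dict())  # start from the original weights

  # Record activation magnitudes inside the finetuned network via forward hooks.
  activations = []
  for layer in fixed:
      layer.register_forward_hook(lambda m, inp, out: activations.append(out.abs().max()))

  optimizer = torch.optim.Adam(fixed.parameters(), lr=1e-4)
  for _ in range(100):
      activations.clear()
      x = torch.randn(16, 8)
      with torch.no_grad():
          target = original(x)  # goal 1: keep the output the same...
      # goal 2: ...while keeping internal activations small
      loss = nn.functional.mse_loss(fixed(x), target) + 1e-3 * torch.stack(activations).mean()
      optimizer.zero_grad()
      loss.backward()
      optimizer.step()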

Guide: Running Locally

To run the SDXL-VAE-FP16-Fix model locally, you can follow these steps:

  1. Diffusers Usage:

    • Load the model using the AutoencoderKL class from the Diffusers library.
    • Use the following Python code:
      import torch
      from diffusers import DiffusionPipeline, AutoencoderKL
      
      # Load the fixed VAE in fp16 and pass it to both the base and refiner pipelines.
      vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
      pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0", vae=vae, torch_dtype=torch.float16, variant="fp16", use_safetensors=True)
      pipe.to("cuda")
      
      refiner = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-refiner-1.0", vae=vae, torch_dtype=torch.float16, use_safetensors=True, variant="fp16")
      refiner.to("cuda")
      
      n_steps = 40
      high_noise_frac = 0.7  # fraction of the denoising handled by the base model
      
      prompt = "A majestic lion jumping from a big stone at night"
      
      # The base model runs the first 70% of denoising and outputs latents;
      # the refiner finishes denoising and decodes the final image.
      image = pipe(prompt=prompt, num_inference_steps=n_steps, denoising_end=high_noise_frac, output_type="latent").images
      image = refiner(prompt=prompt, num_inference_steps=n_steps, denoising_start=high_noise_frac, image=image).images[0]
      
    • Ensure CUDA is available for GPU acceleration (notes on saving the output and reducing VRAM usage follow after this list).
  2. Automatic1111 Usage:

    • Download the fixed sdxl.vae.safetensors file.
    • Place the file in the stable-diffusion-webui/models/VAE directory.
    • Select the fixed VAE as the SD VAE in the webui settings.
    • Remove the --no-half-vae command line argument if previously used.
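
For the Diffusers path above, the final image is a regular PIL image and can be saved directly; on GPUs with limited VRAM, Diffusers' model CPU offloading (which requires the accelerate package) can be used instead of moving the whole pipeline to "cuda". A brief sketch:

  # Save the refined output from the Diffusers example above.
  image.save("lion.png")

  # On low-VRAM GPUs, offload weights to the CPU between steps instead of calling .to("cuda").
  pipe.enable_model_cpu_offload()
  refiner.enable_model_cpu_offload()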

Cloud GPUs: Consider using cloud-based GPUs such as AWS EC2, Google Cloud, or Azure for improved performance when running these models.

License

The SDXL-VAE-FP16-Fix is licensed under the MIT License, allowing for open-source usage and contributions.