Introduction

Hyper-SD is a diffusion model acceleration technique that reduces the number of inference steps needed for image synthesis. This repository includes models distilled from FLUX.1-dev, SD3-Medium, SDXL Base 1.0, and Stable-Diffusion v1-5.

Architecture

Hyper-SD distributes its distilled weights as LoRA (Low-Rank Adaptation) checkpoints, with separate checkpoints for different base models and inference step counts. At inference time, a checkpoint is loaded on top of its base model and paired with a scheduler such as DDIM or TCD.
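
To make the LoRA idea concrete, here is a minimal sketch of how a low-rank update is merged into a base weight matrix, using the standard formulation W' = W + scale * (B @ A). It is a toy illustration in pure Python, not the Hyper-SD or diffusers implementation; the function names and the tiny matrices are hypothetical.

```python
# Illustrative LoRA merge: W' = W + scale * (B @ A).
# All names and values here are hypothetical, chosen only for the example.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def merge_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without mutating the input."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# 2x2 base weight with a rank-1 update: B is 2x1, A is 1x2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]

merged = merge_lora(W, B, A, scale=0.5)
print(merged)
```

Because the update is stored as two thin matrices rather than a full-size one, a LoRA checkpoint is much smaller than the base model. Fusing it, as the example code in this guide does with fuse_lora(), folds the update into the base weights once so that inference pays no extra cost.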

Training

Each model variant is distilled for a specific step count and is meant to be run with a matching LoRA scale and guidance scale. The repository provides instructions and examples for integrating these models into various pipelines.
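
One way to keep a variant's settings together is a small configuration object. The sketch below is an assumption about how a user might organize these values, not something the repository provides: the class name, field names, and the default LoRA scale of 1.0 are hypothetical, and the only concrete values come from the 1-step SDXL example in this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantConfig:
    """Per-variant settings (hypothetical helper, not part of Hyper-SD)."""

    ckpt_name: str
    num_inference_steps: int
    guidance_scale: float
    eta: float
    lora_scale: float = 1.0  # assumed default; check the repository per variant

    def call_kwargs(self):
        """Keyword arguments to pass when calling the pipeline."""
        return {
            "num_inference_steps": self.num_inference_steps,
            "guidance_scale": self.guidance_scale,
            "eta": self.eta,
        }


# Values taken from the 1-step SDXL example in this guide.
sdxl_1step = VariantConfig(
    ckpt_name="Hyper-SDXL-1step-lora.safetensors",
    num_inference_steps=1,
    guidance_scale=0.0,
    eta=1.0,
)

print(sdxl_1step.call_kwargs())
```

With a loaded pipeline, this could be used as pipe.fuse_lora(lora_scale=sdxl_1step.lora_scale) followed by pipe(prompt=prompt, **sdxl_1step.call_kwargs()). Settings for other variants should be taken from the repository rather than guessed.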

Guide: Running Locally

  1. Setup Environment:

    • Ensure your environment has access to a CUDA-enabled GPU.
    • Install the dependencies with a package manager such as pip. The example code in step 4 imports torch, diffusers, and huggingface_hub.
  2. Download Models:

    • Download the required model checkpoints from the Hugging Face Hub. Gated models require a Hugging Face access token.
  3. Run Text-to-Image Inference:

    • Choose the pipeline that matches the base model and load it with the diffusers DiffusionPipeline or StableDiffusionPipeline class.
    • Load the LoRA weights and fuse them with the model.
    • Set the scheduler and inference parameters, such as guidance scale and number of steps.
  4. Example Code:

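    import torch
    from diffusers import DiffusionPipeline, TCDScheduler
    from huggingface_hub import hf_hub_download

    # Base model and the Hyper-SD LoRA checkpoint to apply on top of it
    base_model_id = "stabilityai/stable-diffusion-xl-base-1.0"
    repo_name = "ByteDance/Hyper-SD"
    ckpt_name = "Hyper-SDXL-1step-lora.safetensors"

    # Load the base pipeline in half precision and move it to the GPU
    pipe = DiffusionPipeline.from_pretrained(base_model_id, torch_dtype=torch.float16, variant="fp16").to("cuda")
    # Download the LoRA checkpoint, load it, and fuse it into the base weights
    pipe.load_lora_weights(hf_hub_download(repo_name, ckpt_name))
    pipe.fuse_lora()
    # Swap in the TCD scheduler, reusing the current scheduler's config
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

    # Single-step generation with guidance disabled (guidance_scale=0)
    prompt = "a photo of a cat"
    image = pipe(prompt=prompt, num_inference_steps=1, guidance_scale=0, eta=1.0).images[0]
    image.save("output.png")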
    
  5. Cloud GPUs:

    • If no local CUDA GPU is available, consider GPU instances from a cloud provider such as AWS, GCP, or Azure.
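
Before running the steps above, it can help to confirm that a CUDA GPU is actually visible to PyTorch. The following is a minimal sketch, assuming PyTorch may or may not be installed; the helper name pick_device and the CPU fallback are illustrative choices, not part of the Hyper-SD repository.

```python
def pick_device():
    """Return ("cuda", "float16") if a CUDA GPU is visible, else ("cpu", "float32").

    Hypothetical helper: falls back to CPU when torch is missing or no GPU is found.
    """
    try:
        import torch
    except ImportError:
        return "cpu", "float32"
    if torch.cuda.is_available():
        return "cuda", "float16"
    return "cpu", "float32"


device, dtype_name = pick_device()
print(f"device={device}, dtype={dtype_name}")
```

If this reports cpu, the example in step 4 will not run as written, since it moves the pipeline to "cuda" and loads the fp16 variant. When torch is installed, getattr(torch, dtype_name) gives the matching dtype object.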

License

The Hyper-SD project is licensed under the terms specified in the repository. Ensure compliance with the license when using or distributing the models and related code. For detailed license terms, refer to the repository's official documentation.
