Introduction

The Latent Consistency Model (LCM) LoRA accelerates Stable Diffusion models by reducing the number of inference steps required for text-to-image synthesis. This particular adapter is a distilled consistency adapter for Stable Diffusion XL (SDXL).

Architecture

LCM-LoRA works with the stabilityai/stable-diffusion-xl-base-1.0 model and is supported in the Hugging Face Diffusers library. It is a LoRA (Low-Rank Adaptation) adapter that lets the base model generate images in 2 to 8 inference steps.
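
To illustrate the low-rank idea behind LoRA in general, the following minimal sketch adds a rank-r update B·A to a frozen weight matrix W. It is not the LCM-LoRA implementation: the function names `matmul` and `lora_merge`, the `alpha / r` scaling convention, and the toy matrices are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(w, b, a, alpha=1.0):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: d_out x d_in base weights (left unchanged)
    b: d_out x r adapter matrix
    a: r x d_in adapter matrix
    """
    r = len(a)
    if any(len(row) != r for row in b):
        raise ValueError("B must have one column per row of A")
    delta = matmul(b, a)   # low-rank update, d_out x d_in
    scale = alpha / r      # assumed scaling convention
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy example: a 3x3 identity weight with a rank-1 adapter.
# The adapter stores 3 + 3 = 6 numbers instead of the 9 in a full update.
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
b = [[1.0], [2.0], [3.0]]   # 3 x 1
a = [[0.5, 0.0, -0.5]]      # 1 x 3
merged = lora_merge(w, b, a, alpha=1.0)
```

Merging the update into the base weights in this way is, conceptually, what fusing an adapter does; the adapter itself only needs to store the two small matrices.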

Training

Details of the training process have not been provided. The source marks this section as "TODO".

Guide: Running Locally

To use LCM-LoRA locally:

  1. Install Required Libraries:

    pip install --upgrade pip
    pip install --upgrade diffusers transformers accelerate peft
    
  2. Load the Model and Adapter:

    import torch
    from diffusers import LCMScheduler, AutoPipelineForText2Image
    
    model_id = "stabilityai/stable-diffusion-xl-base-1.0"
    adapter_id = "latent-consistency/lcm-lora-sdxl"
    
    # Load the SDXL base model in half precision
    pipe = AutoPipelineForText2Image.from_pretrained(model_id, torch_dtype=torch.float16, variant="fp16")
    # Swap in the LCM scheduler, keeping the original scheduler's configuration
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.to("cuda")
    
    # Load and fuse LCM-LoRA
    pipe.load_lora_weights(adapter_id)
    pipe.fuse_lora()
    
    prompt = "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k"
    # Generate in 4 steps; guidance_scale=0 turns off classifier-free guidance
    image = pipe(prompt=prompt, num_inference_steps=4, guidance_scale=0).images[0]
    
  3. Consider Using Cloud GPUs:

    • The code in step 2 runs the model in float16 on a CUDA GPU. If no suitable local GPU is available, cloud GPU services such as AWS, GCP, or Azure are an option.
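
The step count and guidance setting used in this guide can be gathered into a small helper that rejects step counts outside the 2 to 8 range named in the Architecture section. This is only a sketch: `lcm_call_kwargs` is a hypothetical name, not part of Diffusers, and treating the range as a hard limit is an assumption made for illustration.

```python
def lcm_call_kwargs(prompt, num_inference_steps=4, guidance_scale=0.0):
    """Build keyword arguments for a pipeline call (hypothetical helper).

    Raises ValueError when the step count is outside the 2 to 8 range
    that this document gives for LCM-LoRA.
    """
    if not 2 <= num_inference_steps <= 8:
        raise ValueError(
            f"num_inference_steps should be between 2 and 8, got {num_inference_steps}"
        )
    return {
        "prompt": prompt,
        "num_inference_steps": num_inference_steps,
        "guidance_scale": guidance_scale,
    }


kwargs = lcm_call_kwargs(
    "Self-portrait oil painting, a beautiful cyborg with golden hair, 8k"
)
# With the pipeline from step 2, the call would then be:
#     image = pipe(**kwargs).images[0]
```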

License

LCM-LoRA is distributed under the OpenRAIL++ license, a permissive license that includes use-based restrictions.
