TCD-SD15-LoRA

h1t

Introduction

The TCD-SD15-LoRA model is the official LoRA (Low-Rank Adaptation) release of Trajectory Consistency Distillation (TCD) for Stable Diffusion v1.5, as described in the associated paper. It targets few-step text-to-image generation: the example in this guide samples an image in 4 inference steps.

Architecture

The model builds on the base model runwayml/stable-diffusion-v1-5 and is used through the diffusers library. The LoRA weights are loaded on top of the base pipeline, fused into it, and paired with TCDScheduler for sampling.
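To make the idea of fusing LoRA weights concrete, the following is a minimal sketch of a low-rank update merged into a base weight matrix, W' = W + scale * (up @ down). It is an illustration only, not the TCD training code or the diffusers implementation; the function names, the 3x3 matrix, and the rank-1 factors are all hypothetical.

```python
# Illustrative sketch of merging a low-rank (LoRA) update into a base weight.
# All names and sizes are hypothetical; this is not the diffusers implementation.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_down, lora_up, scale=1.0):
    """Return weight + scale * (lora_up @ lora_down)."""
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# A 3x3 base weight adapted with rank-1 factors: up is 3x1, down is 1x3.
base = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
up = [[1.0], [2.0], [3.0]]
down = [[0.5, 0.0, -0.5]]
merged = apply_lora(base, down, up, scale=1.0)
print(merged)
```

The point of the low-rank form is that only the small factors (here 3 + 3 numbers instead of 9) need to be stored and distributed, while the merged matrix has the same shape as the base weight.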

Training

The model is trained with a distillation process that enforces trajectory consistency, with the aim of producing detailed, high-quality images in few sampling steps. At inference, the eta parameter controls the stochasticity of sampling; the example in this guide uses eta=0.3.
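One way to picture the role of eta is sketched below. It assumes, based on the TCD paper's description rather than anything stated in this card, that each sampling step first moves deterministically to an intermediate timestep s = floor((1 - eta) * t_prev) and then re-injects noise to reach t_prev. Under that assumption, eta = 0 gives s = t_prev (no noise re-injected) and eta = 1 gives s = 0 (the most stochastic case). The function name is hypothetical and this is not the scheduler's actual code.

```python
import math


def intermediate_timestep(prev_timestep, eta):
    """Intermediate timestep s, assuming s = floor((1 - eta) * prev_timestep).

    This formula is an assumption used for illustration only.
    """
    if not 0.0 <= eta <= 1.0:
        raise ValueError("eta must be in [0, 1]")
    return math.floor((1.0 - eta) * prev_timestep)


# Show how the intermediate timestep moves toward 0 as eta grows.
for eta in (0.0, 0.3, 0.5, 1.0):
    print(f"eta={eta}: s={intermediate_timestep(500, eta)}")
```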

Guide: Running Locally

  1. Environment Setup: Ensure you have Python installed with necessary libraries such as torch and diffusers.
  2. Hardware: Use a CUDA-compatible GPU for optimal performance. Cloud GPUs like those from AWS or Google Cloud can be used.
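  3. Installation: Clone the repository and install dependencies. Cloning is optional, because the script in step 4 downloads the LoRA weights from the Hugging Face Hub by ID.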
    git clone https://huggingface.co/h1t/TCD-SD15-LoRA
    cd TCD-SD15-LoRA
    pip install -r requirements.txt
    
  4. Execution: Run the provided example script to generate images.
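    import torch
    from diffusers import StableDiffusionPipeline, TCDScheduler

    device = "cuda"
    base_model_id = "runwayml/stable-diffusion-v1-5"
    tcd_lora_id = "h1t/TCD-SD15-LoRA"

    # Load the base Stable Diffusion v1.5 pipeline in half precision.
    pipe = StableDiffusionPipeline.from_pretrained(
        base_model_id, torch_dtype=torch.float16, variant="fp16"
    ).to(device)

    # Swap in the TCD scheduler, then load the TCD LoRA and fuse it into the base weights.
    pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights(tcd_lora_id)
    pipe.fuse_lora()

    prompt = "Beautiful woman, bubblegum pink, lemon yellow, minty blue, futuristic, high-detail, epic composition, watercolor."

    # Sample in 4 steps with classifier-free guidance disabled;
    # eta sets the stochasticity and the fixed seed makes the run repeatable.
    image = pipe(
        prompt=prompt,
        num_inference_steps=4,
        guidance_scale=0,
        eta=0.3,
        generator=torch.Generator(device=device).manual_seed(42),
    ).images[0]

    # Save the result; the filename is an arbitrary choice for this example.
    image.save("tcd_sd15.png")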
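Before running the script, the environment described in steps 1 and 2 can be checked with a short helper such as the one below. This is a minimal sketch using only the standard library plus an optional torch import; the function name check_environment and the report layout are hypothetical, not part of the model repository.

```python
import importlib.util


def check_environment(packages=("torch", "diffusers")):
    """Report which packages are importable and whether a CUDA device is visible."""
    report = {name: importlib.util.find_spec(name) is not None for name in packages}
    report["cuda"] = False
    # Only query CUDA when torch was requested and is actually installed.
    if report.get("torch"):
        import torch

        report["cuda"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'ok' if ok else 'missing'}")
```

If cuda is reported as missing, the script in step 4 will fail at the .to(device) call, since it sets device = "cuda".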
    

License

The TCD-SD15-LoRA model is released under the MIT License, allowing for wide use and modification.
