Introduction

InstantIR is a single-image restoration model that recovers degraded images with high-quality, realistic detail. Optional text prompts can guide the restoration for customized editing.

Architecture

InstantIR is an image-to-image pipeline built on the diffusers library. On top of a pretrained base model it loads three additional components: an adapter (which uses the facebook/dinov2-large image encoder), an aggregator, and a previewer LoRA.

Training

InstantIR uses stabilityai/stable-diffusion-xl-base-1.0 as its pretrained base model. The InstantIR-specific weights (adapter, aggregator, and previewer LoRA) are loaded on top of it. The pipeline can run on a GPU in half precision (fp16), which reduces memory use and typically speeds up inference.
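As a rough illustration of the precision choice, the sketch below picks a device and dtype depending on whether CUDA is available. The helper name `choose_precision` and the float32 fallback on CPU are assumptions made for illustration, not part of InstantIR; the guide in this document only shows the CUDA fp16 path.

```python
from typing import Tuple


def choose_precision(cuda_available: bool) -> Tuple[str, str]:
    """Return (device, dtype_name) for loading the pipeline.

    Hypothetical helper: fp16 on a CUDA GPU, float32 otherwise,
    since half precision is poorly supported on most CPUs.
    """
    if cuda_available:
        return "cuda", "float16"
    return "cpu", "float32"


# Intended use with PyTorch (not executed here):
#   import torch
#   device, dtype_name = choose_precision(torch.cuda.is_available())
#   dtype = getattr(torch, dtype_name)   # torch.float16 or torch.float32
#   pipe.to(device=device, dtype=dtype)
```

Keeping the decision in a plain function makes it easy to check without a GPU present.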

Guide: Running Locally

  1. Clone the GitHub Repository

    git clone https://github.com/JY-Joy/InstantIR.git
    cd InstantIR
    
  2. Download Model Weights

    Use the following Python script (it requires the huggingface_hub package) to download the necessary weights into ./models:

    from huggingface_hub import hf_hub_download
    
    hf_hub_download(repo_id="InstantX/InstantIR", filename="models/adapter.pt", local_dir=".")
    hf_hub_download(repo_id="InstantX/InstantIR", filename="models/aggregator.pt", local_dir=".")
    hf_hub_download(repo_id="InstantX/InstantIR", filename="models/previewer_lora_weights.bin", local_dir=".")
    
  3. Load InstantIR with Diffusers

    # Install required packages
    # !pip install diffusers opencv-python transformers accelerate
    
    import torch
    from PIL import Image
    from diffusers import DDPMScheduler
    from schedulers.lcm_single_step_scheduler import LCMSingleStepScheduler
    from module.ip_adapter.utils import load_adapter_to_pipe
    from pipelines.sdxl_instantir import InstantIRPipeline
    
    # Set model path and load pretrained models
    instantir_path = './models'
    pipe = InstantIRPipeline.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', torch_dtype=torch.float16)
    
    # Load the adapter, which uses the DINOv2 image encoder
    load_adapter_to_pipe(pipe, f"{instantir_path}/adapter.pt", image_encoder_or_path='facebook/dinov2-large')
    
    # Prepare the previewer LoRA
    pipe.prepare_previewers(instantir_path)
    
    # Set the main DDPM scheduler and derive the previewer's single-step LCM scheduler from it
    pipe.scheduler = DDPMScheduler.from_pretrained('stabilityai/stable-diffusion-xl-base-1.0', subfolder="scheduler")
    lcm_scheduler = LCMSingleStepScheduler.from_config(pipe.scheduler.config)
    
    # Load the aggregator weights onto the CPU first; they are moved to the GPU below
    pretrained_state_dict = torch.load(f"{instantir_path}/aggregator.pt", map_location="cpu")
    pipe.aggregator.load_state_dict(pretrained_state_dict)
    
    # Move the pipeline and the aggregator to the GPU in fp16
    pipe.to(device='cuda', dtype=torch.float16)
    pipe.aggregator.to(device='cuda', dtype=torch.float16)
    
  4. Restore Images

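    # Load the degraded input image and run restoration
    low_quality_image = Image.open('path/to/your-image').convert("RGB")
    image = pipe(image=low_quality_image, previewer_scheduler=lcm_scheduler).images[0]
    
    # Save the result (the output filename is only an example)
    image.save('restored.png')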
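SDXL pipelines in diffusers expect image sides divisible by 8, and the SDXL base model was trained at around 1024x1024. This document does not say whether InstantIRPipeline resizes inputs itself, so the following is only a sketch of one way to pick a safe input size beforehand. The helper name `restoration_size` and its defaults are assumptions.

```python
from typing import Tuple


def restoration_size(width: int, height: int,
                     long_side: int = 1024, multiple: int = 8) -> Tuple[int, int]:
    """Scale (width, height) so the longer side is about `long_side`,
    keeping the aspect ratio, then snap each side to a multiple of `multiple`."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)

    def snap(side: int) -> int:
        # Round the scaled side to the nearest multiple, never below one multiple.
        return max(multiple, round(side * long_side / longest / multiple) * multiple)

    return snap(width), snap(height)


# Intended use with Pillow (not executed here):
#   size = restoration_size(*low_quality_image.size)   # PIL size is (width, height)
#   low_quality_image = low_quality_image.resize(size, Image.LANCZOS)
```

For example, a 1920x1080 input maps to 1024x576 with the default settings.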
    

The code above moves the pipeline to 'cuda', so it needs a CUDA-capable GPU to run as written. If you do not have one locally, cloud GPUs from providers such as AWS, GCP, or Azure are an option.
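Before loading the pipeline in step 3, it can help to confirm that the three files from step 2 actually landed in ./models. The check below is a minimal sketch using only the standard library; the helper name `missing_weights` is an assumption, and the file names are taken from the download script above.

```python
from pathlib import Path
from typing import List

# File names as used in the download step of this guide.
EXPECTED_WEIGHTS = ("adapter.pt", "aggregator.pt", "previewer_lora_weights.bin")


def missing_weights(model_dir: str = "./models") -> List[str]:
    """Return the expected weight files that are not present in `model_dir`."""
    root = Path(model_dir)
    return [name for name in EXPECTED_WEIGHTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_weights("./models")
    if missing:
        print("Missing weight files:", ", ".join(missing))
    else:
        print("All InstantIR weight files found.")
```

Running this from the repository root gives a clear message if a download failed or was saved elsewhere, instead of a later error from the loading code.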

License

InstantIR is released under the Apache License 2.0. Users must comply with local laws and use the tool responsibly; the developers disclaim responsibility for misuse.
