SVDQuant Models

mit-han-lab

Introduction

SVDQuant is a post-training quantization technique from MIT's HAN Lab for image generation models: it absorbs weight outliers into a 16-bit low-rank branch so that the remaining weights and activations can be quantized to INT4. This model repository provides INT4-quantized FLUX.1-dev checkpoints that work seamlessly with pre-existing LoRAs while maintaining image quality close to the 16-bit original.
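The core idea can be illustrated with a minimal sketch. This is a simplification for intuition only: it omits the smoothing step that migrates activation outliers into the weights before the SVD, and the fake quantizer below stands in for the real INT4 kernels.

    import torch

    def svd_lowrank_decompose(W: torch.Tensor, rank: int = 32):
        # Keep the dominant singular directions (where outliers
        # concentrate) in a 16-bit low-rank branch L1 @ L2; only the
        # residual R = W - L1 @ L2 gets quantized to INT4.
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        L1 = U[:, :rank] * S[:rank]   # (out, rank)
        L2 = Vh[:rank, :]             # (rank, in)
        return L1, L2, W - L1 @ L2

    def fake_quant_int4(R: torch.Tensor) -> torch.Tensor:
        # Symmetric per-tensor INT4 fake quantization, illustration only.
        scale = R.abs().max() / 7.0
        return (R / scale).round().clamp(-8, 7) * scale

    W = torch.randn(512, 512)
    L1, L2, R = svd_lowrank_decompose(W)
    W_hat = L1 @ L2 + fake_quant_int4(R)
    print((W - W_hat).norm() / W.norm())  # relative reconstruction error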

Architecture

The stack pairs a quantization library, DeepCompressor, with a dedicated inference engine, Nunchaku. The repository covers several styles, including Realism, Ghibsky Illustration, Anime, Children Sketch, and Yarn Art, all built on the FLUX.1-dev base model.

Training

The repository provides a LoRA collection converted for the SVDQuant INT4 FLUX.1-dev model, so different styles can be applied without re-quantizing the base weights. Output quality matches that of the original 16-bit FLUX.1-dev.
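Why no re-quantization is needed follows from the decomposition above: a LoRA update A @ B is itself low-rank, so it can be folded into the 16-bit low-rank branch while the INT4 residual stays frozen. The following is a conceptual sketch with illustrative shapes and names, not the actual DeepCompressor conversion code:

    import torch

    out_f, in_f, rank, lora_rank = 512, 512, 32, 16
    L1 = torch.randn(out_f, rank)      # 16-bit low-rank branch
    L2 = torch.randn(rank, in_f)
    R_q = torch.randn(out_f, in_f)     # stand-in for the frozen INT4 residual

    A = torch.randn(out_f, lora_rank)  # a pre-existing LoRA update A @ B
    B = torch.randn(lora_rank, in_f)

    # Fusing the LoRA just widens the low-rank branch; the quantized
    # residual R_q is untouched, so no re-quantization is required.
    L1_fused = torch.cat([L1, A], dim=1)
    L2_fused = torch.cat([L2, B], dim=0)

    W_plus_lora = L1 @ L2 + A @ B + R_q
    W_fused = L1_fused @ L2_fused + R_q
    print(torch.allclose(W_plus_lora, W_fused, atol=1e-4))  # True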

Guide: Running Locally

  1. Setup Environment: Follow instructions in the Nunchaku GitHub repository to set up the environment.
  2. Import and Initialize: Use the nunchaku library to load the pre-trained model and set parameters.
    import torch
    from nunchaku.pipelines import flux as nunchaku_flux

    # Load the FLUX.1-dev pipeline, replacing its transformer with the
    # SVDQuant INT4 checkpoint served by the Nunchaku engine.
    pipeline = nunchaku_flux.from_pretrained(
        "black-forest-labs/FLUX.1-dev",
        torch_dtype=torch.bfloat16,
        qmodel_path="mit-han-lab/svdq-int4-flux.1-dev",  # INT4 weights
    ).to("cuda")
    
  3. Run Model: Execute the pipeline with desired parameters to generate images (a seeded variant follows below).
    # 28 denoising steps with guidance scale 3.5, then save the result.
    image = pipeline("a dog wearing a wizard hat", num_inference_steps=28, guidance_scale=3.5).images[0]
    image.save("example.png")
    

Suggested Cloud GPUs

  • NVIDIA GPUs with compute capability sm_80 (A100), sm_86 (Ampere: RTX 3090, A6000), or sm_89 (Ada: RTX 4090) are recommended; a quick capability check is sketched below.
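
A small check, assuming a CUDA-enabled PyTorch install, confirms whether the local GPU falls in this set:

    import torch

    # Map the device's compute capability to an sm_XX tag and compare
    # against the architectures listed above.
    major, minor = torch.cuda.get_device_capability()
    sm = f"sm_{major}{minor}"
    if sm in {"sm_80", "sm_86", "sm_89"}:
        print(f"{torch.cuda.get_device_name()} ({sm}) is supported")
    else:
        print(f"{sm} is outside the tested set; kernels may not load")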

License

The model and associated files are licensed under the flux-1-dev-non-commercial-license, which permits non-commercial use only.
