SVDQuant Models (mit-han-lab)

Introduction
SVDQuant is a model repository developed by MIT's HAN Lab, focused on quantization for image generation tasks using low-rank components. It is designed to work seamlessly with pre-existing LoRAs, maintaining high image quality while utilizing INT4 quantization.
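The core idea behind pairing low-rank components with INT4 quantization can be sketched in a few lines: decompose a weight matrix into a high-precision low-rank part (via SVD) plus a residual that is quantized to 4-bit integers. The sketch below is a simplified illustration, not the library's actual implementation; the rank, the symmetric per-tensor scaling, and the INT8 storage of 4-bit values are all assumptions made for clarity.

```python
import numpy as np

def svd_lowrank_plus_int4(W, rank=8):
    """Split W into a high-precision low-rank part plus an INT4 residual."""
    # Truncated SVD absorbs the dominant structure into the low-rank part.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * S[:rank]) @ Vt[:rank]      # kept in high precision
    R = W - L                                      # residual to quantize
    # Symmetric per-tensor INT4 quantization: values land in [-8, 7].
    scale = np.abs(R).max() / 7.0
    q = np.clip(np.round(R / scale), -8, 7).astype(np.int8)
    return L, q, scale

def dequantize(L, q, scale):
    return L + q.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)
L, q, scale = svd_lowrank_plus_int4(W)
W_hat = dequantize(L, q, scale)
err = np.abs(W - W_hat).max()   # bounded by half a quantization step
```

Because the residual after removing the low-rank part has a smaller dynamic range than the full weight, the INT4 step size is smaller and the reconstruction error shrinks accordingly.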
Architecture
The architecture combines a quantization library, DeepCompressor, with an inference engine, Nunchaku. The repository includes LoRAs in various styles, including Realism, Ghibsky Illustration, Anime, Children Sketch, and Yarn Art, all built on the FLUX.1-dev base model.
Training
The repository provides a LoRA collection converted for SVDQuant INT4 FLUX.1-dev, allowing different styles to be applied without re-quantizing the base model. Image quality matches the original 16-bit FLUX.1-dev.
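Why styles can be swapped without re-quantization follows from plain linear algebra: a LoRA update B @ A stays in high precision and is added alongside the frozen base weight, so the quantized base never changes. A minimal sketch (the shapes, scales, and stand-in weights here are illustrative assumptions, not Nunchaku's code):

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 32, 4
W_q = rng.standard_normal((d, d)).astype(np.float32)  # stands in for the dequantized INT4 base
A = (rng.standard_normal((r, d)) * 0.01).astype(np.float32)  # LoRA down-projection
B = (rng.standard_normal((d, r)) * 0.01).astype(np.float32)  # LoRA up-projection

x = rng.standard_normal(d).astype(np.float32)

# Base path and LoRA path are computed separately, then summed ...
y = W_q @ x + B @ (A @ x)
# ... which is mathematically identical to merging the weights first.
y_merged = (W_q + B @ A) @ x
```

Since swapping a style only replaces A and B, the INT4 base weights (and their quantization scales) are reused as-is.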
Guide: Running Locally
- Setup Environment: Follow instructions in the Nunchaku GitHub repository to set up the environment.
- Import and Initialize: Use the nunchaku library to load the pre-trained model and set parameters.

```python
import torch
from nunchaku.pipelines import flux as nunchaku_flux

pipeline = nunchaku_flux.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.bfloat16,
    qmodel_path="mit-han-lab/svdq-int4-flux.1-dev",
).to("cuda")
```
- Run Model: Execute the pipeline with the desired parameters to generate images.

```python
image = pipeline(
    "a dog wearing a wizard hat",
    num_inference_steps=28,
    guidance_scale=3.5,
).images[0]
image.save("example.png")
```
Suggested Cloud GPUs
- NVIDIA GPUs with architectures sm_86 (Ampere: RTX 3090, A6000), sm_89 (Ada: RTX 4090), and sm_80 (A100) are recommended.
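One way to check compatibility is to compare the local GPU's CUDA compute capability against the sm_80/sm_86/sm_89 set above. A minimal sketch (the helper name is hypothetical; `torch.cuda.get_device_capability()` is a standard PyTorch call but requires a CUDA build to run):

```python
# Supported compute capabilities from the list above: sm_80, sm_86, sm_89.
SUPPORTED = {(8, 0), (8, 6), (8, 9)}

def is_supported(capability):
    """Return True if a (major, minor) compute capability is in the supported set."""
    return tuple(capability) in SUPPORTED

# On a machine with PyTorch and a CUDA GPU, you could check:
# import torch
# print(is_supported(torch.cuda.get_device_capability()))
```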
License
The model and associated files are licensed under the flux-1-dev-non-commercial-license, which permits non-commercial use only.