FLUX.1 Canny [dev] LoRA

black-forest-labs

Introduction

FLUX.1 Canny [dev] LoRA is a model developed by Black Forest Labs that generates images from text descriptions while preserving the structure of an input image, which it reads from that image's Canny edge map. It is a LoRA derived from FLUX.1 Canny [dev], a 12 billion parameter rectified flow transformer, and is loaded on top of the FLUX.1 [dev] base model. Generated outputs can be used for personal, scientific, and commercial purposes as described in the FLUX.1 [dev] Non-Commercial License.

Architecture

The model uses a rectified flow transformer architecture to generate high-quality images that adhere to the input prompt. Structure preservation comes from conditioning on the Canny edge map of the source image, while guidance distillation during training makes inference more efficient. The weights are openly available to support new scientific research and creative workflows.
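
To make "rectified flow" concrete, here is a toy sketch of the sampling loop, not the model's actual code. It assumes the common convention that t = 1 is pure noise and t = 0 is data, with straight paths x_t = (1 - t) * data + t * noise. The `oracle_velocity` function is a hypothetical stand-in for the transformer: it returns the exact velocity for a single known target, which a real model can only approximate.

```python
import random


def oracle_velocity(x, t, target):
    """Exact velocity (noise - target) for the straight path toward one known target."""
    return [(xi - ti) / t for xi, ti in zip(x, target)]


def euler_sample(noise, target, num_steps):
    """Integrate dx/dt = v(x, t) with Euler steps from t = 1 (noise) to t = 0 (data)."""
    x = list(noise)
    dt = 1.0 / num_steps
    for k in range(num_steps, 0, -1):
        t = k / num_steps
        v = oracle_velocity(x, t, target)
        # Step backwards in time along the predicted velocity.
        x = [xi - dt * vi for xi, vi in zip(x, v)]
    return x


random.seed(0)
target = [0.5, -1.0, 2.0]  # toy "image" with three values
noise = [random.gauss(0.0, 1.0) for _ in target]
sample = euler_sample(noise, target, num_steps=4)
print(sample)
```

With the oracle velocity the path is perfectly straight, so even a few steps land on the target. A learned velocity field is only approximately straight, which is one reason the guide below uses more steps (`num_inference_steps=50`).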

Training

FLUX.1 Canny [dev] LoRA is trained using guidance distillation, which improves inference efficiency. Training aims to balance prompt adherence against preserving the structure of the input image. The open weights are intended to support further research and development, letting artists and developers explore new applications.
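
The training details are not published here, so the following is only a rough illustration of what guidance distillation is generally understood to mean. Classifier-free guidance normally needs two model evaluations per step (conditional and unconditional) and extrapolates between them. A guidance-distilled model is trained to reproduce that combined output in a single pass, taking the guidance scale as an input. The helpers `cfg_combine` and `distillation_loss` and the toy numbers are hypothetical, not Black Forest Labs' code.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: move from the unconditional toward the conditional prediction."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def distillation_loss(student_out, uncond, cond, scale):
    """Mean squared error between a one-pass student output and the two-pass guided target."""
    target = cfg_combine(uncond, cond, scale)
    return sum((s - t) ** 2 for s, t in zip(student_out, target)) / len(target)


uncond = [0.0, 1.0]  # toy teacher prediction without the prompt
cond = [1.0, 3.0]    # toy teacher prediction with the prompt

print(cfg_combine(uncond, cond, 1.0))  # scale 1 gives back the conditional prediction
print(cfg_combine(uncond, cond, 3.0))  # larger scales push further past it
print(distillation_loss([3.0, 7.0], uncond, cond, 3.0))
```

At inference time the distilled model needs only one evaluation per step, which is where the efficiency gain comes from.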

Guide: Running Locally

To run FLUX.1 Canny [dev] LoRA locally:

  1. Install the necessary libraries:

    pip install -U git+https://github.com/huggingface/diffusers
    pip install -U controlnet-aux
    pip install -U peft
    
  2. Set up the model:

    import torch
    from controlnet_aux import CannyDetector
    from diffusers import FluxControlPipeline
    from diffusers.utils import load_image
    
    # Load the FLUX.1 [dev] base pipeline in bfloat16 and move it to the GPU
    pipe = FluxControlPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16).to("cuda")
    # Attach the Canny LoRA and set its adapter weight to 0.85
    pipe.load_lora_weights("black-forest-labs/FLUX.1-Canny-dev-lora", adapter_name="canny")
    pipe.set_adapters("canny", 0.85)
    
    prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
    control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")
    
    # Convert the reference image into a Canny edge map to use as the control signal
    processor = CannyDetector()
    control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)
    
    # Generate a 1024x1024 image that follows both the prompt and the edge map
    image = pipe(
        prompt=prompt,
        control_image=control_image,
        height=1024,
        width=1024,
        num_inference_steps=50,
        guidance_scale=30.0,
    ).images[0]
    image.save("output.png")
    
  3. Consider using cloud GPUs: the 12 billion parameter transformer alone takes roughly 24 GB in bfloat16 (2 bytes per parameter). If your local GPU cannot hold that, consider cloud GPU services such as AWS EC2, Google Cloud, or Azure.
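
The `low_threshold` and `high_threshold` arguments in step 2 control which edges end up in the control image. The sketch below shows only the final two-threshold (hysteresis) step of the classic Canny algorithm on a toy grid of gradient magnitudes, assuming `CannyDetector` follows that classic formulation. A real detector also smooths the image, computes gradients, and thins edges first. The `hysteresis` helper and the grid values are hypothetical.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Keep pixels at or above `high`, plus pixels at or above `low` connected to them."""
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed with strong pixels.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))
    # Grow into weak pixels that touch an accepted edge (8-connectivity).
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                inside = 0 <= nr < rows and 0 <= nc < cols
                if inside and not edges[nr][nc] and magnitude[nr][nc] >= low:
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


magnitude = [
    [0, 60, 220, 60, 0],  # weak pixels next to a strong one: kept
    [0, 0, 0, 0, 0],
    [0, 60, 60, 0, 0],    # weak pixels with no strong neighbour: dropped
]
edges = hysteresis(magnitude, low=50, high=200)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

Lowering `high_threshold` admits more seed edges and so a busier control image; raising `low_threshold` trims faint edges that hang off strong ones.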

License

The model is distributed under the FLUX.1 [dev] Non-Commercial License. Users must agree to the FluxDev Non-Commercial License Agreement and adhere to the Acceptable Use Policy.
