FLUX.1 Canny [dev]

black-forest-labs

Introduction

FLUX.1 Canny [dev] is a 12-billion-parameter rectified flow transformer that generates images from text descriptions while following the structure of an input image, which is supplied as a Canny edge map. It was trained with guidance distillation, which makes it more efficient at inference time. The weights are released under the FLUX.1 [dev] Non-Commercial License; generated outputs can be used for personal, scientific, and commercial purposes as described in that license.

Architecture

FLUX.1 Canny [dev] combines strong prompt adherence with the ability to preserve the structure of a source image, which it receives as a Canny edge map extracted from that image. The weights are openly available to support research and artistic work.
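To make the idea of an edge map more concrete, the following is a minimal sketch of the double-threshold step that Canny edge detection is built around. It is not the full Canny algorithm (there is no Gaussian blur or non-maximum suppression, and weak pixels are checked against their direct neighbours in a single pass), and it is not the code that `controlnet_aux` runs. The function names `gradient_magnitude` and `edge_map` and the toy image are hypothetical; the default thresholds simply mirror the `low_threshold=50` and `high_threshold=200` values used in the guide.

```python
# Simplified illustration of the double-threshold idea behind Canny edges.
# Assumption: this is a teaching sketch, not the detector used by controlnet_aux.

def gradient_magnitude(img):
    """Central-difference gradient magnitude for a 2D list of grayscale values."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def edge_map(img, low_threshold=50, high_threshold=200):
    """Return a 0/255 edge map using a strong/weak double threshold."""
    mag = gradient_magnitude(img)
    h, w = len(img), len(img[0])
    # Strong pixels are always kept as edges.
    strong = [[mag[y][x] >= high_threshold for x in range(w)] for y in range(h)]
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if strong[y][x]:
                edges[y][x] = 255
            elif mag[y][x] >= low_threshold:
                # Weak pixels are kept only if one of their 8 neighbours is strong.
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and strong[ny][nx]:
                            edges[y][x] = 255
    return edges


# Toy 6x8 image: dark left half, bright right half.
toy = [[0] * 4 + [255] * 4 for _ in range(6)]
edges = edge_map(toy)
for row in edges:
    print(row)
```

The only edge pixels in the toy result sit along the vertical boundary between the dark and bright halves, which is the kind of structural outline the model is conditioned on.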

Training

The model is trained with guidance distillation, which makes inference more efficient. The goal is high-fidelity image generation that adheres closely to the prompt.
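The efficiency argument can be illustrated with a rough sketch. It assumes the usual formulation of classifier-free guidance, in which each sampling step evaluates the model twice (with and without the prompt) and blends the two predictions, whereas a guidance-distilled model takes the guidance scale as an input and is evaluated once per step. The source does not describe the distillation procedure itself, so the helper names `cfg_combine` and `forward_passes` below are hypothetical and the numbers only count model evaluations.

```python
def cfg_combine(cond, uncond, guidance_scale):
    """Blend conditional and unconditional predictions (classifier-free guidance)."""
    return [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]


def forward_passes(num_inference_steps, distilled):
    """Count model evaluations for one image.

    Assumption: undistilled guidance needs two evaluations per step,
    a guidance-distilled model needs one.
    """
    return num_inference_steps * (1 if distilled else 2)


# Toy predictions, just to show the blending formula.
blended = cfg_combine(cond=[1.0, 2.0], uncond=[0.0, 1.0], guidance_scale=3.0)

# With the 50 steps used in the guide:
undistilled = forward_passes(50, distilled=False)
distilled = forward_passes(50, distilled=True)
print(blended, undistilled, distilled)
```

Under these assumptions, the distilled model needs half as many model evaluations for the same number of sampling steps.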

Guide: Running Locally

  1. Install Required Libraries: Ensure you have the latest versions of diffusers and controlnet_aux.

    pip install -U diffusers controlnet_aux
    
  2. Load and Run the Model: Use the following Python code to generate images with FLUX.1-Canny-dev:

    import torch
    from controlnet_aux import CannyDetector
    from diffusers import FluxControlPipeline
    from diffusers.utils import load_image
    
    # Load the pipeline in bfloat16 and move it to the GPU.
    pipe = FluxControlPipeline.from_pretrained("black-forest-labs/FLUX.1-Canny-dev", torch_dtype=torch.bfloat16).to("cuda")
    
    prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
    control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")
    
    # Convert the source image into a Canny edge map that guides the structure of the output.
    processor = CannyDetector()
    control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)
    
    # Generate a 1024x1024 image that follows both the prompt and the edge map.
    image = pipe(
        prompt=prompt,
        control_image=control_image,
        height=1024,
        width=1024,
        num_inference_steps=50,
        guidance_scale=30.0,
    ).images[0]
    image.save("output.png")
    
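  3. Hardware Recommendations: The example loads the pipeline in bfloat16 and moves it to a CUDA device, so a CUDA-capable GPU is required. If you do not have a suitable GPU locally, consider cloud GPUs such as those available on AWS, GCP, or Azure.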
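As a rough guide to how much GPU memory to plan for, the sketch below estimates the size of the transformer weights alone from the 12 billion parameter figure given in the introduction. It is back-of-the-envelope arithmetic, not a measurement: it ignores the text encoders, the VAE, activations, and framework overhead, so real usage will be higher. The helper name `weight_memory_gib` is hypothetical.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gib(num_params, dtype="bfloat16"):
    """Approximate memory for the weights only, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


transformer_params = 12_000_000_000  # "12 billion parameter" figure from the model description

bf16_gib = round(weight_memory_gib(transformer_params, "bfloat16"), 1)
fp32_gib = round(weight_memory_gib(transformer_params, "float32"), 1)
print(bf16_gib, fp32_gib)  # 22.4 44.7
```

If the available GPU has less memory than this estimate suggests, diffusers pipelines provide `enable_model_cpu_offload()`, which can be called on `pipe` instead of `.to("cuda")` to trade speed for lower peak GPU memory; whether that is sufficient on a given card is something to verify experimentally.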

License

The model is distributed under the FLUX.1 [dev] Non-Commercial License. For detailed terms, refer to the license document.
