FLUX.1 Canny [dev] Model

Introduction

FLUX.1 Canny [dev] is a 12-billion-parameter rectified flow transformer that generates images from text descriptions while following the structure of an input image, supplied as Canny edges. It aims for high output quality and is efficient because it was trained with guidance distillation. The weights are released under the FLUX.1 [dev] Non-Commercial License; generated outputs may be used for personal, scientific, and commercial purposes as described in that license.

Architecture

FLUX.1 Canny [dev] combines strong prompt adherence with the ability to preserve the structure of a source image. That structure is extracted with Canny edge detection and passed to the model as a control image. The weights are open, to support research and artistic work.
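To give a feel for what a Canny edge map captures, here is a minimal pure-Python sketch of two of its ingredients: Sobel gradient magnitude and double-threshold hysteresis. It is a simplification, not the full algorithm, because Gaussian smoothing and non-maximum suppression are omitted. The function names `sobel_magnitude` and `double_threshold` are made up for illustration and are not part of `controlnet_aux`.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2D list of grayscale values (border pixels stay 0)."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            mag[y][x] = (gx * gx + gy * gy) ** 0.5
    return mag


def double_threshold(mag, low, high):
    """Keep strong pixels (>= high) and weak pixels (>= low) connected to a strong one."""
    h, w = len(mag), len(mag[0])
    edges = [[False] * w for _ in range(h)]
    stack = [(y, x) for y in range(h) for x in range(w) if mag[y][x] >= high]
    for y, x in stack:
        edges[y][x] = True
    # Grow edges from strong pixels into neighbouring weak pixels
    while stack:
        y, x = stack.pop()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and not edges[ny][nx] and mag[ny][nx] >= low):
                    edges[ny][nx] = True
                    stack.append((ny, nx))
    return edges


# Toy image: dark left half, bright right half
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = double_threshold(sobel_magnitude(image), low=50, high=200)
for row in edges:
    print("".join("#" if e else "." for e in row))
```

The two thresholds play the same role as the `low_threshold` and `high_threshold` arguments in the usage guide. Pixels above the high threshold always count as edges, and pixels between the two count only when they touch an existing edge.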

Training

The model is trained using guidance distillation, which improves its efficiency while maintaining output quality. It aims for high-fidelity image generation that adheres closely to the provided prompt.
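Guidance distillation is not detailed here. The rough sketch below assumes the term has its usual meaning: a model distilled to reproduce classifier-free guidance in a single forward pass, taking the guidance scale as an input. Under that assumption, the efficiency gain can be illustrated by counting model evaluations. The function names are hypothetical and the lists are stand-ins for model predictions.

```python
def cfg_combine(uncond, cond, scale):
    """Classifier-free guidance: move from the unconditional prediction
    towards the conditional one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def evaluations_per_image(num_steps, distilled):
    """Ordinary guidance needs two model evaluations per step (conditional and
    unconditional); a guidance-distilled model is assumed to need one."""
    return num_steps * (1 if distilled else 2)


# Stand-in predictions for a single denoising step
guided = cfg_combine(uncond=[0.0, 1.0], cond=[1.0, 3.0], scale=2.0)

print(evaluations_per_image(50, distilled=False))
print(evaluations_per_image(50, distilled=True))
```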

Guide: Running Locally

Install Required Libraries: Ensure you have the latest versions of diffusers and controlnet_aux.
```
pip install -U diffusers controlnet_aux
```

Load and Run the Model: Use the following Python code to generate images with FLUX.1-Canny-dev:

```
import torch
from controlnet_aux import CannyDetector
from diffusers import FluxControlPipeline
from diffusers.utils import load_image

# Load the pipeline in bfloat16 and move it to the GPU
pipe = FluxControlPipeline.from_pretrained("black-forest-labs/FLUX.1-Canny-dev", torch_dtype=torch.bfloat16).to("cuda")

prompt = "A robot made of exotic candies and chocolates of different kinds. The background is filled with confetti and celebratory gifts."
control_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png")

# Convert the source image into a Canny edge map used as the control image
processor = CannyDetector()
control_image = processor(control_image, low_threshold=50, high_threshold=200, detect_resolution=1024, image_resolution=1024)

# Generate an image that follows the prompt and the edge structure
image = pipe(
    prompt=prompt,
    control_image=control_image,
    height=1024,
    width=1024,
    num_inference_steps=50,
    guidance_scale=30.0,
).images[0]
image.save("output.png")
```

Hardware Recommendations: The example loads the model onto a CUDA GPU in bfloat16, so a GPU with ample memory is required. If suitable local hardware is not available, cloud GPUs from AWS, GCP, or Azure are an option.
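As a back-of-the-envelope check on GPU memory, the sketch below estimates the space taken by the weights alone from the parameter count and the data type. It assumes the 12 billion parameters stated in the introduction and ignores the text encoders, the VAE, and activations, so real usage will be higher. The helper name `weight_memory_gb` is hypothetical.

```python
# Bytes needed to store one parameter in each data type
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params, dtype):
    """Approximate size of the weights in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


print(weight_memory_gb(12_000_000_000, "bfloat16"))
print(weight_memory_gb(12_000_000_000, "float32"))
```

Loading in bfloat16, as the example does, halves the weight footprint compared with float32.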

License

The model is distributed under the FLUX.1 [dev] Non-Commercial License. For detailed terms, refer to the license document.
