FLUX.1-DEV-CONTROLNET-UNION

Introduction

FLUX.1-DEV-CONTROLNET-UNION is a ControlNet model for text-to-image generation with FLUX.1-dev, usable through Hugging Face's Diffusers library. A single set of weights supports multiple control modes, each conditioning the output on a different kind of control image: canny, tile, depth, blur, pose, gray, and lq (low quality).
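
The pipeline call in the guide later in this document selects a control mode by integer (control_mode=0 with a canny image). A minimal sketch of a name-to-index lookup follows. It assumes the indices follow the order in which the modes are listed here; only canny as 0 is confirmed by the example in this document. CONTROL_MODES and control_mode_index are hypothetical names, not part of Diffusers.

```python
# Assumed mapping: indices follow the order the modes are listed in.
# Only canny = 0 is confirmed by the inference example in this document.
CONTROL_MODES = ["canny", "tile", "depth", "blur", "pose", "gray", "lq"]


def control_mode_index(name: str) -> int:
    """Return the integer to pass as control_mode for a mode name."""
    key = name.strip().lower()
    if key not in CONTROL_MODES:
        raise ValueError(
            f"unknown control mode {name!r}; expected one of {CONTROL_MODES}"
        )
    return CONTROL_MODES.index(key)
```

For example, control_mode_index("canny") gives the value used in the inference example, and an unrecognized name raises ValueError instead of silently selecting the wrong mode.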

Architecture

The model is a ControlNet that attaches to the base model, black-forest-labs/FLUX.1-dev. In Diffusers, it is loaded as a FluxControlNetModel and passed to FluxControlNetPipeline, which manages the inference process. The control mode is selected on each pipeline call.

Training

Training the union ControlNet requires substantial computational resources. The current release is a beta version intended to stimulate community engagement and development within the Flux ecosystem. The beta version is not fully trained; ongoing training is expected to improve model performance, potentially matching specialized models in the future.

Guide: Running Locally

To run the model locally, follow these steps:

Set Up Environment: Ensure that PyTorch and Hugging Face's Diffusers library are installed.

Load Models:

import torch
from diffusers import FluxControlNetPipeline, FluxControlNetModel

base_model = 'black-forest-labs/FLUX.1-dev'
controlnet_model = 'InstantX/FLUX.1-dev-Controlnet-Union'

# Load the ControlNet weights, then attach them to the base FLUX.1-dev pipeline
controlnet = FluxControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
pipe.to("cuda")  # requires a CUDA-capable GPU

Run Inference:

from diffusers.utils import load_image
control_image = load_image("https://huggingface.co/InstantX/FLUX.1-dev-Controlnet-Union-alpha/resolve/main/images/canny.jpg")
prompt = 'A bohemian-style female travel blogger with sun-kissed skin and messy beach waves.'
# control_mode=0 is used here together with the canny control image
image = pipe(
    prompt, control_image=control_image, control_mode=0, width=512, height=512,
    controlnet_conditioning_scale=0.5, num_inference_steps=24, guidance_scale=3.5,
).images[0]
image.save("image.jpg")
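
To try other settings without retyping the whole call, the keyword arguments can be collected in one place. The sketch below is only an illustration: build_inference_kwargs is a hypothetical helper, not part of Diffusers, and its defaults simply mirror the values used in the example above.

```python
def build_inference_kwargs(control_mode=0, scale=0.5, size=512, steps=24, guidance=3.5):
    """Collect keyword arguments for a FluxControlNetPipeline call (hypothetical helper)."""
    return {
        "control_mode": control_mode,
        "width": size,
        "height": size,
        "controlnet_conditioning_scale": scale,
        "num_inference_steps": steps,
        "guidance_scale": guidance,
    }


# Same settings as the example, then a variant with a higher conditioning scale.
default_kwargs = build_inference_kwargs()
stronger_kwargs = build_inference_kwargs(scale=0.8)

# With `pipe`, `prompt`, and `control_image` defined as in the steps above:
# image = pipe(prompt, control_image=control_image, **stronger_kwargs).images[0]
```

Keeping the settings in a dictionary makes it easy to save them alongside each output image when comparing results.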

The example moves the pipeline to CUDA, so it needs an NVIDIA GPU. If none is available locally, a cloud GPU such as AWS EC2 with NVIDIA GPUs or Google Cloud's GPU offerings is recommended.

License

This model is released under the flux-1-dev-non-commercial-license. For detailed licensing information, visit the license page.