FLUX.1 Depth [dev]
black-forest-labs
Introduction
FLUX.1 Depth [dev] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions while maintaining the structure of a given input image using depth maps. It is designed for both personal and scientific use, promoting new research and innovative artistic workflows.
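To make the role of the depth map concrete, here is a toy sketch of turning per-pixel depth values into the kind of three-channel control image a depth-conditioned pipeline consumes. The `depth_to_rgb` helper and the sample values are hypothetical illustrations, not part of the model or its tooling. In practice a learned depth estimator produces the map, and whether near surfaces come out bright or dark depends on that estimator.

```python
def depth_to_rgb(depth):
    """Min-max normalize a 2D depth grid to 0..255 and copy it into 3 channels.

    `depth` is a list of rows of floats (hypothetical raw depth values).
    Returns a grid of (r, g, b) tuples with r == g == b.
    """
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map
    rgb = []
    for row in depth:
        gray_row = [round(255 * (d - lo) / span) for d in row]
        rgb.append([(g, g, g) for g in gray_row])
    return rgb


# Made-up 2x2 depth map, purely for illustration.
toy_depth = [[1.0, 2.0], [3.0, 5.0]]
toy_rgb = depth_to_rgb(toy_depth)
print(toy_rgb[0][0], toy_rgb[1][1])
```

The smallest depth maps to black and the largest to white; only the relative layout of the scene survives, which is the structural signal the model is conditioned on.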
Architecture
The model uses a rectified flow transformer, which allows for high-quality output and strong adherence to text prompts while preserving the structure of source images based on depth maps. It incorporates guidance distillation for efficiency and offers open weights for research and creative purposes.
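As background on the term, rectified flow models generate samples by integrating a velocity field along near-straight paths between a noise sample and a data sample. The sketch below is a toy illustration of that idea with plain Euler steps, not the model's actual sampler. The names `euler_sample` and `toy_velocity`, the fixed `target` point, and the time convention (t runs from 0 at noise to 1 at data) are all assumptions made for the example, and a closed-form velocity stands in for the learned transformer.

```python
def euler_sample(velocity, x_start, num_steps):
    """Integrate dx/dt = velocity(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x_start)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Hypothetical "data" point; a real model learns the velocity from many samples.
target = [2.0, -1.0]


def toy_velocity(x, t):
    """Velocity of the straight path that reaches `target` at t=1."""
    return [(ti - xi) / (1.0 - t) for ti, xi in zip(target, x)]


noise = [0.5, 0.5]  # stands in for a random starting point
result = euler_sample(toy_velocity, noise, num_steps=4)
print(result)
```

Because the path in this toy case is exactly straight, a handful of Euler steps already lands on the target. Learned velocity fields are only approximately straight, which is one reason real samplers still use tens of steps.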
Training
FLUX.1 Depth [dev] is trained with guidance distillation, which makes image generation more efficient while retaining strong prompt adherence and output quality.
Guide: Running Locally
To run FLUX.1 Depth [dev] locally, use the 🧨 diffusers library in Python. Follow these steps:
- Install Dependencies:

    pip install -U diffusers
    pip install git+https://github.com/asomoza/image_gen_aux.git
- Run the Model:

    import torch
    from diffusers import FluxControlPipeline
    from diffusers.utils import load_image
    from image_gen_aux import DepthPreprocessor

    # Load the pipeline in bfloat16 and move it to the GPU.
    pipe = FluxControlPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-Depth-dev",
        torch_dtype=torch.bfloat16,
    ).to("cuda")

    prompt = (
        "A robot made of exotic candies and chocolates of different kinds. "
        "The background is filled with confetti and celebratory gifts."
    )

    # Source image whose structure should be preserved.
    control_image = load_image(
        "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/robot.png"
    )

    # Estimate a depth map from the source image and convert it to RGB.
    processor = DepthPreprocessor.from_pretrained("LiheYoung/depth-anything-large-hf")
    control_image = processor(control_image)[0].convert("RGB")

    # Generate an image conditioned on the prompt and the depth map.
    image = pipe(
        prompt=prompt,
        control_image=control_image,
        height=1024,
        width=1024,
        num_inference_steps=30,
        guidance_scale=10.0,
        generator=torch.Generator().manual_seed(42),  # fixed seed for reproducibility
    ).images[0]
    image.save("output.png")
- Cloud GPUs: The example runs on a CUDA GPU. If you do not have a suitable one locally, consider cloud GPU services such as AWS, Google Cloud, or Azure.
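When choosing a GPU, a rough back-of-the-envelope estimate of weight memory can help. The sketch below multiplies the 12 billion parameter count stated above by the bytes used per parameter. The helper name `weight_memory_gib` is hypothetical, and the figures cover the transformer weights only: text encoders, the VAE, the depth estimator, and activations add to the total.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 12e9  # parameter count stated in the introduction

for dtype_name, nbytes in [("float32", 4), ("bfloat16", 2)]:
    gib = weight_memory_gib(NUM_PARAMS, nbytes)
    print(f"{dtype_name}: about {gib:.1f} GiB for transformer weights")
```

In bfloat16, as used in the example, the transformer weights alone come to roughly 22 GiB. If that exceeds the available GPU memory, diffusers pipelines also provide `enable_model_cpu_offload()`, which may help at some cost in speed.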
License
FLUX.1 Depth [dev] is distributed under the FLUX.1 [dev] Non-Commercial License. Usage is restricted to non-commercial purposes as outlined in the license.