Introduction

FLUX.1 [dev] is a 12-billion-parameter rectified flow transformer that generates images from text descriptions. It aims to provide high-quality outputs with competitive prompt following, and it is trained using guidance distillation for efficiency. The model's open weights are intended to support scientific research and creative development.

Architecture

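FLUX.1 [dev] is built on a rectified flow transformer: a transformer trained with a rectified flow objective, in which the network learns a velocity field that carries noise samples toward images along approximately straight paths. The text prompt conditions this process, so the generated image follows the description.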
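As a rough illustration of the rectified flow idea, the sketch below moves a noise sample to a data sample by integrating a velocity field with Euler steps. It is a toy in plain Python, not the FLUX.1 implementation: `oracle_velocity` is a hypothetical stand-in for the trained transformer, the three-value "image" is made up, and the convention that t=0 is noise and t=1 is data is an assumption chosen for readability.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path between noise x0 (t=0) and data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def oracle_velocity(x0, x1):
    """Regression target along a straight path: the constant velocity x1 - x0."""
    return [b - a for a, b in zip(x0, x1)]

def euler_sample(x0, velocity_fn, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 to t=1 with fixed-size Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

random.seed(0)
data = [0.5, -1.0, 2.0]                          # hypothetical "image" with 3 values
noise = [random.gauss(0.0, 1.0) for _ in data]   # starting point of the flow
target_v = oracle_velocity(noise, data)

# A real model would predict v from (x, t, prompt); a perfect predictor is assumed here.
sample = euler_sample(noise, lambda x, t: target_v, num_steps=4)
```

With a perfectly straight path, even a handful of Euler steps lands on the data point. A learned velocity field is only approximately straight, which is one reason practical pipelines run more steps.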

Training

The model was trained using guidance distillation, a technique that makes inference more efficient. The aim is for FLUX.1 [dev] to be competitive with closed-source alternatives in output quality and prompt following while keeping its weights open for research and artistic use.
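The source does not describe the training recipe, so the following is only a sketch of the general idea behind guidance distillation. A classifier-free-guidance teacher evaluates the model twice per step (with and without the prompt), and a student that receives the guidance scale as an extra input is trained to reproduce the combined result in one evaluation. `toy_teacher` and `toy_student` are hypothetical stand-ins, not FLUX.1 components.

```python
calls = {"teacher": 0}

def toy_teacher(x, prompt):
    """Hypothetical denoiser; counts how many times it is evaluated."""
    calls["teacher"] += 1
    shift = 0.0 if prompt is None else 1.0
    return [xi + shift for xi in x]

def guided_teacher(x, prompt, scale):
    """Classifier-free guidance: combine an unconditional and a conditional pass."""
    uncond = toy_teacher(x, None)
    cond = toy_teacher(x, prompt)
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

def toy_student(x, prompt, scale):
    """Hypothetical distilled model: the scale is an input, one pass is enough."""
    return [xi + scale for xi in x]

def distillation_loss(student_out, teacher_out):
    """Mean squared error between student and guided-teacher predictions."""
    return sum((s - t) ** 2 for s, t in zip(student_out, teacher_out)) / len(teacher_out)

x = [0.0, 1.0, -2.0]
teacher_out = guided_teacher(x, "a cat", scale=3.5)   # two teacher evaluations
student_out = toy_student(x, "a cat", scale=3.5)      # one student evaluation
loss = distillation_loss(student_out, teacher_out)
```

In this toy the student already matches the teacher, so the loss is zero. In real training the loss would be minimized over many samples, prompts, and guidance scales.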

Model Stats

  • Parameters: 12 billion
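To put the parameter count in perspective, the short calculation below estimates the memory needed just to hold 12 billion weights at two common precisions. It is back-of-the-envelope arithmetic under the assumption that only these 12 billion parameters are counted. Other pipeline components and activation memory are not included, so real usage is higher.

```python
PARAMS = 12_000_000_000  # parameter count from the model stats above

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

def weight_memory_gib(num_params, dtype):
    """Memory for the weights alone, in GiB (1 GiB = 2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gib(PARAMS, dtype):.1f} GiB")
# float32: 44.7 GiB
# bfloat16: 22.4 GiB
```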

Guide: Running Locally

To run FLUX.1 [dev] locally, you can use the Python diffusers library. Follow the steps below:

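  1. Install or Upgrade Diffusers, along with the libraries the pipeline relies on (transformers for the text encoders, accelerate for CPU offloading, sentencepiece and protobuf for the T5 tokenizer; PyTorch is assumed to be installed already):

    pip install -U diffusers transformers accelerate sentencepiece protobuf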
    
  2. Load and Run the Model:

    import torch
    from diffusers import FluxPipeline
    
    pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
    pipe.enable_model_cpu_offload()  # Saves VRAM by keeping idle sub-models on the CPU; with enough VRAM, use pipe.to("cuda") instead
    
    prompt = "A cat holding a sign that says hello world"
    image = pipe(
        prompt,
        height=1024,
        width=1024,
        guidance_scale=3.5,
        num_inference_steps=50,
        max_sequence_length=512,
        generator=torch.Generator("cpu").manual_seed(0)
    ).images[0]
    image.save("flux-dev.png")
    
  3. Cloud GPUs: If local hardware lacks sufficient GPU memory or is too slow, the same code can be run on a cloud GPU instance from a provider such as AWS, Google Cloud, or Azure.
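The offloading call in step 2 trades speed for memory, and the right choice depends on how much VRAM is free. The sketch below shows one way to make that choice explicit. The helper names and all thresholds (`transformer_gib`, `other_components_gib`, `activation_gib`) are illustrative assumptions to be tuned per machine, not measured requirements. The pipeline methods it calls (`to`, `enable_model_cpu_offload`, `enable_sequential_cpu_offload`) are existing diffusers methods.

```python
def choose_memory_strategy(free_vram_gib,
                           transformer_gib=22.4,       # assumed: 12B weights in bfloat16
                           other_components_gib=10.0,  # assumed: other pipeline components
                           activation_gib=2.0):        # assumed: working memory
    """Pick where the pipeline should live. Thresholds are rough placeholders."""
    if free_vram_gib >= transformer_gib + other_components_gib + activation_gib:
        return "cuda"                    # everything fits on the GPU at once
    if free_vram_gib >= transformer_gib + activation_gib:
        return "model_cpu_offload"       # one whole sub-model on the GPU at a time
    return "sequential_cpu_offload"      # slowest option, smallest VRAM footprint

def apply_memory_strategy(pipe, strategy):
    """Apply the chosen strategy to an already loaded diffusers pipeline."""
    if strategy == "cuda":
        pipe.to("cuda")
    elif strategy == "model_cpu_offload":
        pipe.enable_model_cpu_offload()
    else:
        pipe.enable_sequential_cpu_offload()
    return pipe

# Example decisions for three hypothetical GPUs.
decisions = {gib: choose_memory_strategy(gib) for gib in (80.0, 24.5, 12.0)}
```

Free VRAM can be read with `torch.cuda.mem_get_info()`, which returns the free and total bytes on the current device. Dividing the first value by 2**30 gives the GiB figure this helper expects.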

License

FLUX.1 [dev] is distributed under the FLUX.1 [dev] Non-Commercial License. For detailed terms, refer to the license document.
