FLUX.1-Turbo-Alpha Model

Introduction

FLUX.1-Turbo-Alpha is an 8-step distilled LoRA for text-to-image generation, developed by Alimama-Creative on top of the FLUX.1-dev model. It is trained with a multi-head discriminator to improve the quality of the distilled output and can be used for text-to-image generation as well as inpainting ControlNet tasks.

Architecture

The model is a distilled LoRA that is applied on top of the original FLUX.1-dev model and generates images in 8 inference steps. A multi-head discriminator is used during training to improve distillation quality. The LoRA is designed for text-to-image generation and for use with inpainting ControlNets.
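To make the multi-head discriminator idea concrete, here is a minimal PyTorch sketch. It is not the authors' implementation: the tiny `nn.TransformerEncoderLayer` stack merely stands in for the FLUX.1-dev transformer backbone, and the class name, head design, and dimensions are all assumptions chosen for illustration. The only point it demonstrates is attaching a small real/fake head to the output of each transformer layer.

```python
import torch
import torch.nn as nn


class MultiHeadDiscriminator(nn.Module):
    """Toy discriminator with one small head per backbone layer (illustrative only)."""

    def __init__(self, dim: int = 64, num_layers: int = 4, num_attn_heads: int = 4):
        super().__init__()
        # Stand-in backbone. In the described setup this role is played by the
        # FLUX.1-dev transformer; a generic encoder stack is used here instead.
        self.layers = nn.ModuleList([
            nn.TransformerEncoderLayer(
                dim, num_attn_heads, dim_feedforward=4 * dim, batch_first=True
            )
            for _ in range(num_layers)
        ])
        # One lightweight discriminator head per layer (hypothetical design).
        self.heads = nn.ModuleList([
            nn.Sequential(
                nn.LayerNorm(dim), nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, 1)
            )
            for _ in range(num_layers)
        ])

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        logits = []
        hidden = tokens
        for layer, head in zip(self.layers, self.heads):
            hidden = layer(hidden)
            # Average the per-token scores into one real/fake logit per sample.
            logits.append(head(hidden).mean(dim=1))
        # Shape: (batch, num_layers), one logit per layer-level head.
        return torch.cat(logits, dim=1)
```

Each column of the returned tensor is the verdict of one layer-level head, so a training loss could average over them. How the heads are actually combined in FLUX.1-Turbo-Alpha is not stated in this document.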

Training

FLUX.1-Turbo-Alpha is trained on a dataset of 1 million images drawn from both open-source and internal sources. Images are selected for an aesthetic score of 6.3 or higher and a resolution above 800x800 pixels. Training is adversarial: the original FLUX.1-dev transformer serves as the discriminator backbone, and multiple discriminator heads are added to each transformer layer. Key training parameters include:

Mixed precision: bf16
Learning rate: 2e-5
Batch size: 64
Image size: 1024x1024
Guidance scale: 3.5
Time shift: 3
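The data-selection rule and the time-shift parameter above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ImageRecord` type is hypothetical, "above 800x800" is read as both sides being strictly greater than 800 pixels, and the shift formula is the static form commonly used with flow-matching models. The document does not confirm that training used exactly this formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    """Hypothetical metadata record for one candidate training image."""
    path: str
    width: int
    height: int
    aesthetic_score: float


MIN_AESTHETIC_SCORE = 6.3  # from the text: aesthetic scores of 6.3+
MIN_SIDE = 800             # from the text: resolutions above 800x800


def keep_for_training(rec: ImageRecord) -> bool:
    # Assumption: both sides must be strictly greater than 800 pixels.
    return (
        rec.aesthetic_score >= MIN_AESTHETIC_SCORE
        and rec.width > MIN_SIDE
        and rec.height > MIN_SIDE
    )


def shift_timestep(t: float, shift: float = 3.0) -> float:
    """Static time shift for t in [0, 1]; assumed form, not confirmed by the source."""
    return shift * t / (1.0 + (shift - 1.0) * t)
```

With a shift of 3, the midpoint t = 0.5 maps to 0.75, so the schedule spends more of its range at higher noise levels, while the endpoints 0 and 1 are unchanged.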

Guide: Running Locally

To run FLUX.1-Turbo-Alpha locally, follow these steps:

Install Dependencies: Ensure you have the diffusers library installed.
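Before loading the model, it can help to confirm that the required packages are importable. The check below is a small sketch. The package list is an assumption about what a typical FLUX plus LoRA setup needs (`torch`, `diffusers`, `transformers`, `peft`, `sentencepiece`), and the exact set may differ with your diffusers version.

```python
import importlib.util

# Assumed package list; adjust it to match your environment.
ASSUMED_PACKAGES = ["torch", "diffusers", "transformers", "peft", "sentencepiece"]


def missing_packages(names):
    """Return the names from `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(ASSUMED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All assumed dependencies are importable.")
```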

Load Model:

import torch
from diffusers import FluxPipeline

model_id = "black-forest-labs/FLUX.1-dev"  # base model
adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"  # 8-step LoRA

# Load the base pipeline in bf16 and move it to the GPU.
pipe = FluxPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

# Load the Turbo LoRA and fuse it into the base weights.
pipe.load_lora_weights(adapter_id)
pipe.fuse_lora()

Generate Image: Use a prompt to create images.

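prompt = "A DSLR photo of a shiny VW van with a cityscape painted on it..."
# 8 inference steps matches the 8-step distillation of the LoRA.
image = pipe(
    prompt=prompt,
    guidance_scale=3.5,
    height=1024,
    width=1024,
    num_inference_steps=8,
).images[0]
image.save("output.png")  # file name is an arbitrary choice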
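For repeatable results, the call arguments above can be bundled with a seeded random generator. The helper below is a sketch: `build_generation_kwargs` is a hypothetical name, and the seed value is arbitrary. It relies only on the `generator` argument that diffusers pipelines accept and on `torch.Generator.manual_seed`.

```python
import torch


def build_generation_kwargs(prompt: str, seed: int = 0) -> dict:
    """Collect the generation settings shown above, plus a seeded generator."""
    return {
        "prompt": prompt,
        "guidance_scale": 3.5,
        "height": 1024,
        "width": 1024,
        "num_inference_steps": 8,
        # A CPU generator with a fixed seed makes the initial noise repeatable.
        "generator": torch.Generator(device="cpu").manual_seed(seed),
    }


# Usage, assuming `pipe` was created as in the loading step:
# image = pipe(**build_generation_kwargs("your prompt here", seed=42)).images[0]
```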

Suggested Environment: For optimal performance, consider a cloud GPU instance, for example on AWS EC2, Google Cloud, or Azure (NV-series).

License

FLUX.1-Turbo-Alpha is distributed under the flux-1-dev-non-commercial-license. For details, refer to the license text.
