FLUX.1-Turbo-Alpha

alimama-creative

Introduction

FLUX.1-Turbo-Alpha is a text-to-image model developed by Alimama-Creative and built on FLUX.1-dev. It is trained with a multi-head discriminator to improve image quality and can be used for text-to-image generation, inpainting, and ControlNet tasks.

Architecture

The model is an 8-step distilled LoRA for the original FLUX.1-dev model. A multi-head discriminator is used during training to improve distillation quality. The LoRA is intended for text-to-image generation and for use with inpainting ControlNets.
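Since the model ships as a LoRA, applying it amounts to adding a low-rank update to existing FLUX.1-dev weights. The following is a minimal sketch of that general idea in plain Python, not the diffusers implementation; the helper names `matmul` and `fuse_lora` and the toy matrices are hypothetical and chosen only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora(weight, lora_up, lora_down, scale=1.0):
    """Return the merged weight W + scale * (up @ down).

    This is the generic LoRA merge; names and shapes are illustrative only.
    """
    delta = matmul(lora_up, lora_down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 base weight with a rank-1 update
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_up = [[1.0], [2.0]]      # shape (2, 1)
lora_down = [[0.5, -0.5]]     # shape (1, 2)

fused = fuse_lora(weight, lora_up, lora_down)

# The fused weight gives the same result as running the base layer
# and the low-rank adapter side by side and summing their outputs.
x = [[1.0], [2.0]]
base_out = matmul(weight, x)
adapter_out = matmul(lora_up, matmul(lora_down, x))
side_by_side = [[b[0] + a[0]] for b, a in zip(base_out, adapter_out)]
fused_out = matmul(fused, x)
```

Merging the update into the weights ahead of time means inference runs with no extra adapter computation per step.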

Training

FLUX.1-Turbo-Alpha is trained on a dataset of 1 million images drawn from both open-source and internal sources. Images are selected for an aesthetic score of 6.3 or higher and a resolution above 800x800 pixels. Training is adversarial: the original FLUX.1-dev transformer serves as the discriminator backbone, with multiple discriminator heads added to each transformer layer. Key training parameters include:

  • Mixed precision: bf16
  • Learning rate: 2e-5
  • Batch size: 64
  • Image size: 1024x1024
  • Guidance scale: 3.5
  • Time shift: 3
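The selection criteria and parameters above can be written down as a small sketch. Everything here is illustrative: `TrainingConfig`, `keep_image`, and `shift_timestep` are hypothetical names, not part of any released training code. The filter reads "6.3+" as "at least 6.3" and "above 800x800" as both sides exceeding 800 pixels. The `shift_timestep` function assumes that "time shift" refers to the timestep shift commonly used in flow-matching samplers, which the source does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Training parameters as listed in the model description."""
    mixed_precision: str = "bf16"
    learning_rate: float = 2e-5
    batch_size: int = 64
    image_size: tuple = (1024, 1024)
    guidance_scale: float = 3.5
    time_shift: float = 3.0


def keep_image(aesthetic_score, width, height, min_score=6.3, min_side=800):
    """Return True if an image passes the stated selection criteria.

    Assumption: both width and height must exceed min_side.
    """
    return aesthetic_score >= min_score and width > min_side and height > min_side


def shift_timestep(t, shift):
    """Map a timestep t in [0, 1] to a shifted timestep.

    Assumption: the common flow-matching shift, shift * t / (1 + (shift - 1) * t).
    A shift above 1 moves intermediate timesteps toward the noisy end.
    """
    return shift * t / (1.0 + (shift - 1.0) * t)


config = TrainingConfig()
midpoint = shift_timestep(0.5, config.time_shift)
```

Under this assumed formula the endpoints 0 and 1 stay fixed and only the spacing between them changes.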

Guide: Running Locally

To run FLUX.1-Turbo-Alpha locally, follow these steps:

  1. Install Dependencies: Ensure you have the diffusers library installed.
  2. Load Model:
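    import torch
    from diffusers.pipelines import FluxPipeline

    # Base model and the Turbo-Alpha LoRA adapter
    model_id = "black-forest-labs/FLUX.1-dev"
    adapter_id = "alimama-creative/FLUX.1-Turbo-Alpha"

    # Load the base pipeline in bfloat16 and move it to the GPU
    pipe = FluxPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
    )
    pipe.to("cuda")

    # Load the LoRA weights and merge them into the base model
    pipe.load_lora_weights(adapter_id)
    pipe.fuse_lora()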
    
  3. Generate Image: Call the pipeline with a prompt, using 8 inference steps.
    prompt = "A DSLR photo of a shiny VW van with a cityscape painted on it..."
    # 8 steps matches the 8-step distillation; 3.5 is the training guidance scale
    image = pipe(
        prompt=prompt,
        guidance_scale=3.5,
        height=1024,
        width=1024,
        num_inference_steps=8,
    ).images[0]
    
  4. Suggested Environment: The pipeline needs a CUDA-capable GPU. If none is available locally, consider a cloud GPU instance from AWS EC2, Google Cloud, or Azure (NV-series).
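The steps above can be collected into one reusable function. This is a sketch under stated assumptions rather than an official script: `generation_kwargs`, `generate`, `output_path`, `seed`, and `offload` are hypothetical names introduced here. The optional `enable_model_cpu_offload()` call is a diffusers feature for GPUs with limited memory and is assumed to need the accelerate package.

```python
def generation_kwargs(prompt, seed=None):
    """Collect the pipeline call arguments used in the guide."""
    kwargs = {
        "prompt": prompt,
        "guidance_scale": 3.5,
        "height": 1024,
        "width": 1024,
        "num_inference_steps": 8,
    }
    if seed is not None:
        import torch  # imported lazily so the helper works without torch

        # A seeded generator makes the output reproducible
        kwargs["generator"] = torch.Generator("cpu").manual_seed(seed)
    return kwargs


def generate(prompt, output_path="output.png", seed=0, offload=False):
    """Load FLUX.1-dev with the Turbo-Alpha LoRA, generate one image, save it."""
    import torch
    from diffusers import FluxPipeline

    pipe = FluxPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev",
        torch_dtype=torch.bfloat16,
    )
    pipe.load_lora_weights("alimama-creative/FLUX.1-Turbo-Alpha")
    pipe.fuse_lora()

    if offload:
        # Assumption: trades speed for lower GPU memory use
        pipe.enable_model_cpu_offload()
    else:
        pipe.to("cuda")

    image = pipe(**generation_kwargs(prompt, seed)).images[0]
    image.save(output_path)
    return output_path
```

Calling `generate("A DSLR photo of a shiny VW van", seed=42)` would then write the result to `output.png`, provided the FLUX.1-dev weights are accessible and a GPU is present.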

License

FLUX.1-Turbo-Alpha is distributed under the flux-1-dev-non-commercial-license. For details, refer to the full license text.
