FLUX.1-MERGED Model

Introduction

FLUX.1-MERGED is a text-to-image model that combines two base models: FLUX.1-dev and FLUX.1-schnell. It generates images from textual descriptions and is run through the Diffusers library. The repository merges the parameters of both base models, with the aim of efficient image generation.

Architecture

The model uses the FluxTransformer2DModel architecture from the Diffusers library. Its weights are obtained by merging the state dictionaries of FLUX.1-dev and FLUX.1-schnell, which is intended to balance the characteristics of the two models. The model is implemented in PyTorch and stores its parameters in the safetensors format.

Training

The model is produced by merging checkpoints from FLUX.1-dev and FLUX.1-schnell. The non-guidance parameters of the two models are averaged, while the guidance parameters are taken directly from FLUX.1-dev. The merging process is designed to keep memory usage low.
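
The merge rule described above can be sketched in a few lines. This is a minimal illustration, not the repository's actual script. The function name merge_state_dicts, the "guidance" substring used to recognize guidance parameters, and the toy key names are all assumptions made for the example. The function works on any values that support addition and division, so the same logic applies to PyTorch tensors or to plain floats.

```python
def merge_state_dicts(dev_sd, schnell_sd, guidance_marker="guidance"):
    """Average shared parameters; copy guidance parameters from the dev model.

    Assumption: guidance parameters are identified by a substring in the key
    name. Keys missing from the schnell state dict are also copied from dev.
    """
    merged = {}
    for key, dev_value in dev_sd.items():
        if guidance_marker in key or key not in schnell_sd:
            # Guidance parameters are taken directly from FLUX.1-dev.
            merged[key] = dev_value
        else:
            # Non-guidance parameters are the element-wise mean of both models.
            merged[key] = (dev_value + schnell_sd[key]) / 2
    return merged


# Toy example with floats standing in for tensors (hypothetical key names).
toy_dev = {"block.weight": 2.0, "guidance_embedder.weight": 5.0}
toy_schnell = {"block.weight": 4.0}
toy_merged = merge_state_dicts(toy_dev, toy_schnell)
print(toy_merged)
```

With real checkpoints, memory can be kept low by applying the same rule one checkpoint shard at a time instead of loading both full state dictionaries at once.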

Guide: Running Locally

1. Install Required Libraries:
   - Ensure the diffusers, torch, and safetensors packages are installed.
2. Download Model Checkpoints:
   - Use snapshot_download (from the huggingface_hub package) to fetch the checkpoints for both FLUX.1-dev and FLUX.1-schnell.
3. Merge the Models:
   - Use the provided Python script to merge the state dictionaries of the two models.
4. Run Inference:
   - Load the merged model with FluxPipeline and run inference to generate images from text prompts.

Hardware Suggestions:
- For optimal performance, consider using cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
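
The download and inference steps above might look roughly like the following sketch. It is an illustration under stated assumptions rather than the repository's own code: the helper names, the merged_dir argument (a hypothetical local folder containing the merged transformer), the "transformer/*" file pattern, and the sampling settings are all chosen for the example. Heavy imports are placed inside the functions so the file can be loaded without a GPU. Access to FLUX.1-dev may require accepting its license and logging in to the Hugging Face Hub first.

```python
DEV_REPO = "black-forest-labs/FLUX.1-dev"
SCHNELL_REPO = "black-forest-labs/FLUX.1-schnell"


def download_transformer_checkpoints():
    """Fetch only the transformer weights of both base models."""
    from huggingface_hub import snapshot_download

    # Assumption: the transformer weights live under a "transformer/" folder.
    dev_dir = snapshot_download(repo_id=DEV_REPO, allow_patterns="transformer/*")
    schnell_dir = snapshot_download(repo_id=SCHNELL_REPO, allow_patterns="transformer/*")
    return dev_dir, schnell_dir


def generate_image(prompt, merged_dir, steps=4, guidance_scale=3.5):
    """Load a merged transformer into FluxPipeline and generate one image.

    merged_dir is a hypothetical path to the merged transformer weights.
    The steps and guidance_scale defaults are illustrative, not prescribed.
    """
    import torch
    from diffusers import FluxPipeline, FluxTransformer2DModel

    transformer = FluxTransformer2DModel.from_pretrained(
        merged_dir, torch_dtype=torch.bfloat16
    )
    # Reuse the remaining pipeline components (text encoders, VAE) from FLUX.1-dev.
    pipe = FluxPipeline.from_pretrained(
        DEV_REPO, transformer=transformer, torch_dtype=torch.bfloat16
    )
    pipe.to("cuda")
    result = pipe(
        prompt=prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
    )
    return result.images[0]
```

A call such as generate_image("a tiny astronaut hatching from an egg on the moon", "path/to/merged").save("out.png") would then write the generated image to disk, assuming a GPU with enough memory is available.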

License

The model is released under the flux-1-dev-non-commercial-license, which permits usage for non-commercial purposes only. Please review the LICENSE.md file for detailed terms and conditions.
