Flux-Schnell-Realism

hugovntr

Introduction

Flux-Schnell-Realism is a text-to-image LoRA (Low-Rank Adaptation) for FLUX.1-schnell, designed to make generated images more photorealistic. It is loaded on top of the base model rather than used as a standalone checkpoint.

Architecture

  • Base Model: The model is based on black-forest-labs/FLUX.1-schnell.
  • Pipeline: Text-to-image.
  • Tags: Includes flux, lora, image-generation, and diffusers.
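
As a rough illustration of why a LoRA is small compared with its base model, the sketch below compares the parameter count of a dense weight matrix W with that of a low-rank update B·A added to it. The layer size and rank are hypothetical values chosen for the arithmetic; they are not taken from this model's configuration.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in a LoRA update B @ A, with B (d_out, rank) and A (rank, d_in)."""
    return rank * (d_out + d_in)


# Hypothetical layer size and rank, for illustration only.
d_out, d_in, rank = 1024, 1024, 16
full = full_param_count(d_out, d_in)
lora = lora_param_count(d_out, d_in, rank)
print(f"dense: {full} params, LoRA: {lora} params, {full // lora}x fewer")
```

Only the small B and A matrices are trained and shipped, which is consistent with the short training time reported for this model.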

Training

  • Dataset: Trained on 1000+ synthetic images that were manually captioned. Images were generated and upscaled using Runtime44.
  • Duration and Environment: Training used the kohya-ss scripts and took approximately 1 hour on a local machine with 16GB of VRAM.

Guide: Running Locally

Basic Steps

  1. Install Dependencies: Ensure you have the diffusers library and PyTorch installed.
  2. Load Model: Use AutoPipelineForText2Image from the diffusers library to load the base model (black-forest-labs/FLUX.1-schnell), then attach the LoRA weights with load_lora_weights.
  3. Inference: Run the pipeline with your text prompt to generate images.
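
Before running the steps above, it can help to confirm that the environment is ready. The following is a minimal sketch, not part of the original model card. It checks whether torch and diffusers are importable and whether a CUDA device is visible. The peft entry is an assumption: recent diffusers releases rely on that package for LoRA loading, but this may not apply to your installed version.

```python
import importlib.util


def check_environment() -> dict:
    """Report which packages are importable and whether CUDA is usable."""
    status = {
        "torch": importlib.util.find_spec("torch") is not None,
        "diffusers": importlib.util.find_spec("diffusers") is not None,
        # Assumption: needed by recent diffusers versions for load_lora_weights.
        "peft": importlib.util.find_spec("peft") is not None,
        "cuda": False,
    }
    if status["torch"]:
        # Imported lazily so a missing torch install does not raise here.
        import torch
        status["cuda"] = torch.cuda.is_available()
    return status


print(check_environment())
```

Any entry reported as False points to a package to install, or to a missing GPU driver, before loading the model.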

Suggested Cloud GPUs

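For faster generation, consider a cloud GPU service such as AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure with NVIDIA V100 or A100 instances. The example code on this page moves the pipeline to 'cuda', so it runs as written only on NVIDIA GPU instances; Google Cloud TPUs would need a different device setup.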

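Example usage with diffusers:

import torch
from diffusers import AutoPipelineForText2Image

# Load the FLUX.1-schnell base model in bfloat16 and move it to the GPU.
pipeline = AutoPipelineForText2Image.from_pretrained(
    'black-forest-labs/FLUX.1-schnell', torch_dtype=torch.bfloat16
).to('cuda')
# Attach the realism LoRA weights on top of the base model.
pipeline.load_lora_weights('hugovntr/flux-schnell-realism', weight_name='schnell-realism_v1')
# Generate one image from the text prompt and save it to disk.
image = pipeline('Moody kitchen at dusk, warm golden ...').images[0]
image.save("output.png")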
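
The example above relies on the pipeline's default call settings. As a hedged variation, the sketch below passes the few-step settings published with the FLUX.1-schnell base model (4 inference steps, guidance scale 0.0), fixes a seed for reproducibility, and optionally uses CPU offloading for machines with limited VRAM. The helper names, image size, and seed are hypothetical. enable_model_cpu_offload assumes the accelerate package is installed. None of these settings are confirmed as the author's recommendations for this LoRA.

```python
def schnell_settings(steps: int = 4, width: int = 1024, height: int = 1024) -> dict:
    """Keyword arguments for the pipeline call; the values are assumptions."""
    return {
        "num_inference_steps": steps,  # schnell is a few-step model
        "guidance_scale": 0.0,         # value used in the base model's example
        "width": width,
        "height": height,
    }


def generate(prompt: str, seed: int = 0, low_vram: bool = True):
    """Load base model + LoRA and return one PIL image (downloads the weights)."""
    # Heavy imports are kept inside the function so the helpers above stay lightweight.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipeline = AutoPipelineForText2Image.from_pretrained(
        "black-forest-labs/FLUX.1-schnell", torch_dtype=torch.bfloat16
    )
    pipeline.load_lora_weights(
        "hugovntr/flux-schnell-realism", weight_name="schnell-realism_v1"
    )
    if low_vram:
        # Keeps submodules on the CPU and moves each to the GPU only while it runs.
        pipeline.enable_model_cpu_offload()
    else:
        pipeline.to("cuda")

    generator = torch.Generator("cpu").manual_seed(seed)
    return pipeline(prompt, generator=generator, **schnell_settings()).images[0]


# Usage (not run here, since it downloads the model and needs a GPU):
# generate('Moody kitchen at dusk, warm golden ...').save("output.png")
```

CPU offloading trades generation speed for lower peak VRAM use, so set low_vram=False on a GPU large enough to hold the whole pipeline.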

License

Flux-Schnell-Realism is released under the Apache-2.0 license, which permits use, modification, and distribution subject to the license terms.
