flux schnell realism
hugovntr
Introduction
Flux-Schnell-Realism is a LoRA (Low-Rank Adaptation) for the FLUX.1-schnell text-to-image model, designed to make its output more photorealistic. The LoRA weights are loaded on top of the unchanged base model at inference time.
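To make the idea of a low-rank adaptation concrete, here is a minimal pure-Python sketch of the general LoRA update, in which a frozen weight matrix W is adjusted by the product of two small matrices B and A. The function names, shapes, rank, and scaling factor are illustrative assumptions, not the configuration used to train this model.

```python
# Illustrative LoRA update: W_adapted = W + (alpha / rank) * (B @ A).
# All shapes and values are made up for this example; they are NOT the
# settings used for Flux-Schnell-Realism.

def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def apply_lora(weight, lora_b, lora_a, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

def trainable_params(d_out, d_in, rank):
    """Compare parameter counts: full matrix vs. the two low-rank factors."""
    return {"full": d_out * d_in, "lora": rank * (d_out + d_in)}

# A 2x3 frozen weight with a rank-1 adapter (B is 2x1, A is 1x3).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.0, 1.0]]
adapted = apply_lora(W, B, A, alpha=1.0, rank=1)

# Hypothetical layer size, chosen only to show the difference in scale.
counts = trainable_params(d_out=1024, d_in=1024, rank=16)
```

Because only B and A are trained, the adapter file stays small and can be applied to or removed from the base model without modifying it.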
Architecture
- Base Model: The model is based on black-forest-labs/FLUX.1-schnell.
- Pipeline: It functions within the text-to-image domain.
- Tags: Includes flux, lora, image-generation, and diffusers.
Training
- Dataset: Trained on 1000+ synthetic images that were manually captioned. Images were generated and upscaled using Runtime44.
- Duration and Environment: The training process, executed using kohya-ss scripts, was completed in approximately 1 hour on a local machine with 16GB of VRAM.
Guide: Running Locally
Basic Steps
- Install Dependencies: Ensure you have the diffusers library and PyTorch installed.
- Load Model: Use AutoPipelineForText2Image from the diffusers library to load the base model, then load the LoRA weights into the pipeline.
- Inference: Run the pipeline with your text prompt to generate images.
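Before loading anything, it can help to confirm that the two packages named in the steps above are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical and not part of diffusers or PyTorch.

```python
from importlib.util import find_spec

def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]

# The two dependencies named in the steps above.
REQUIRED = ["diffusers", "torch"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

This only checks that the packages can be located. It does not verify versions or GPU support.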
Suggested Cloud GPUs
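For optimal performance, consider a cloud instance with an NVIDIA GPU, for example AWS EC2 GPU instances, Google Cloud Platform GPU instances, or Azure instances with NVIDIA V100 or A100 GPUs. The example in this guide moves the pipeline to 'cuda', so it expects a CUDA-capable GPU rather than a TPU.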
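On a rented or local machine, a quick check like the one below shows whether PyTorch can see a CUDA GPU before you attempt to load the model. It is a small sketch: pick_device is a hypothetical helper, and falling back to "cpu" is an assumption made for illustration, not a recommendation from the model author.

```python
def pick_device():
    """Return "cuda" when PyTorch sees a CUDA GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no GPU backend to query
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"Selected device: {device}")
```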
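# Load the FLUX.1-schnell base pipeline in bfloat16 and move it to the GPU
from diffusers import AutoPipelineForText2Image
import torch

pipeline = AutoPipelineForText2Image.from_pretrained('black-forest-labs/FLUX.1-schnell', torch_dtype=torch.bfloat16).to('cuda')
# Apply the realism LoRA weights on top of the base model
pipeline.load_lora_weights('hugovntr/flux-schnell-realism', weight_name='schnell-realism_v1')
# Generate one image from the text prompt and save it to disk
image = pipeline('Moody kitchen at dusk, warm golden ...').images[0]
image.save("output.png")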
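The call above relies on the pipeline's default generation settings. The sketch below shows one way to pass them explicitly. The helper names are hypothetical, and the defaults of 4 inference steps and a guidance scale of 0.0 follow the usage commonly shown for the FLUX.1-schnell base model. Whether they are the best choice for this LoRA is an assumption, since the source does not say.

```python
def generation_kwargs(prompt, steps=4, guidance_scale=0.0, width=1024, height=1024):
    """Build keyword arguments for a diffusers-style text-to-image call.

    The defaults are assumptions based on typical FLUX.1-schnell usage,
    not values documented for Flux-Schnell-Realism.
    """
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
        "width": width,
        "height": height,
    }

def generate(pipeline, prompt, **overrides):
    """Call the pipeline with explicit settings and return the first image."""
    kwargs = generation_kwargs(prompt, **overrides)
    return pipeline(**kwargs).images[0]
```

With the pipeline loaded as in the example above, generate(pipeline, 'your prompt') returns a single image that can be saved with image.save(...). Pass steps, width, or height to override the defaults.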
License
Flux-Schnell-Realism is released under the Apache-2.0 license, which permits use, modification, and distribution under the terms of the license.