Flux.1 Dev Indo Realism LoRA Model

Introduction

The FLUX.1-DEV-INDO-REALISM-LORA model, developed by prithivMLmods, is a text-to-image model designed to generate images with an emphasis on Indo-Realism and Super-Realism styles. It is a LoRA (Low-Rank Adaptation) adapter: a small set of low-rank weight updates applied on top of a base model, so the style can be trained and distributed without retraining or copying the full model.

Architecture

The LoRA is applied on top of the "black-forest-labs/FLUX.1-dev" diffusion model. Key training parameters include a network dimension of 64, a network alpha of 32, and the AdamW optimizer. The model is configured to work best at image dimensions of 768 x 1024 or 1024 x 1024 pixels.
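
To make the two LoRA numbers above concrete, here is a minimal sketch. It assumes that "network dimension" refers to the LoRA rank and that the update is scaled by alpha / rank, as in the common LoRA formulation. The layer size is hypothetical and chosen only for illustration; it is not taken from the model.

```python
def lora_extra_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * d_in + d_out * rank


def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update B @ A (assumed alpha / rank)."""
    return alpha / rank


RANK = 64   # "network dimension" from the text, read here as the LoRA rank
ALPHA = 32  # "network alpha" from the text

# Hypothetical layer size, used only to show the order of magnitude
D_IN, D_OUT = 3072, 3072

full = D_IN * D_OUT
extra = lora_extra_params(D_IN, D_OUT, RANK)
print(f"update scale: {lora_scale(ALPHA, RANK)}")
print(f"full layer: {full} params, LoRA pair: {extra} params ({extra / full:.1%})")
```

Under these assumptions the learned update is applied at half strength, and each adapted layer stores only a small fraction of the parameters of the layer it modifies.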

Training

The model is still in development and was trained on a dataset of 26 images. Training uses a constant learning rate scheduler, a noise offset of 0.03, and multires noise settings for discount and iterations. It runs for 20 epochs, with a checkpoint saved every epoch.
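
As a rough illustration of what these numbers imply, the sketch below counts optimizer steps and checkpoint positions. The image count and epoch count come from the text; the per-image repeat count and batch size are not stated, so the defaults of 1 are assumptions.

```python
import math


def training_schedule(num_images: int, epochs: int, repeats: int = 1, batch_size: int = 1):
    """Return (steps_per_epoch, total_steps, checkpoint_steps).

    repeats and batch_size are hypothetical defaults, not values from the source.
    """
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    total_steps = steps_per_epoch * epochs
    # One checkpoint at the end of every epoch
    checkpoint_steps = [steps_per_epoch * e for e in range(1, epochs + 1)]
    return steps_per_epoch, total_steps, checkpoint_steps


per_epoch, total, checkpoints = training_schedule(num_images=26, epochs=20)
print(f"{per_epoch} steps per epoch, {total} steps total, {len(checkpoints)} checkpoints")
```

With different repeat or batch settings the step counts change, but the number of checkpoints stays at one per epoch.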

Guide: Running Locally

Setup Environment:
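- Ensure you have Python and PyTorch installed, along with a CUDA-capable GPU.
- Install the required packages using pip: pip install torch diffusers transformers accelerate peft sentencepiece. The FLUX pipeline needs transformers and sentencepiece for its text encoders, and loading LoRA weights in diffusers typically requires peft.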
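
Before loading the model, it can help to confirm that the packages are importable. The following is a small sketch using only the standard library. The split into required and optional packages is an assumption: torch and diffusers come from the install step, while the others are packages a FLUX plus LoRA setup commonly needs.

```python
from importlib.util import find_spec

REQUIRED = ("torch", "diffusers")
# Assumed extras: commonly needed for FLUX text encoders and LoRA loading
OPTIONAL = ("transformers", "accelerate", "peft", "sentencepiece")


def missing_packages(names):
    """Return the names of top-level packages that cannot be found on this system."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_packages(REQUIRED))
    print("missing optional:", missing_packages(OPTIONAL))
```

An empty list for the required packages means the import step of the loading code should succeed; it does not check versions or GPU availability.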

Load the Model:

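import torch
from diffusers import DiffusionPipeline

# Load the FLUX.1-dev base model in bfloat16 to reduce memory use
base_model = "black-forest-labs/FLUX.1-dev"
pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)

# Apply the Indo-Realism LoRA weights on top of the base model
lora_repo = "prithivMLmods/Flux.1-Dev-Indo-Realism-LoRA"
trigger_word = "indo-realism"  # include this word in prompts to activate the style
pipe.load_lora_weights(lora_repo)

# Move the pipeline to the GPU
device = torch.device("cuda")
pipe.to(device)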

Generate Images:
- Include the trigger word "indo-realism" in the prompt to generate images in the desired style.
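
The generation step can be sketched as follows. This is a sketch under stated assumptions, not the author's reference script: the example prompt, the step count, the guidance scale, and the output filename are illustrative values, and 768 x 1024 is read here as width x height. The helper build_prompt is a hypothetical convenience for adding the trigger word.

```python
TRIGGER_WORD = "indo-realism"


def build_prompt(description: str, trigger: str = TRIGGER_WORD) -> str:
    """Prefix the trigger word unless the description already contains it."""
    description = description.strip()
    if trigger.lower() in description.lower():
        return description
    return f"{trigger}, {description}"


def generate(description: str, out_path: str = "indo_realism.png") -> str:
    """Load the base model plus LoRA and save one image (needs a CUDA GPU)."""
    # Heavy imports are kept local so build_prompt works without a GPU stack
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.load_lora_weights("prithivMLmods/Flux.1-Dev-Indo-Realism-LoRA")
    pipe.to("cuda")

    image = pipe(
        build_prompt(description),
        width=768,                # one of the recommended sizes from the text
        height=1024,
        num_inference_steps=30,   # assumed value, tune as needed
        guidance_scale=3.5,       # assumed value, tune as needed
    ).images[0]
    image.save(out_path)
    return out_path


print(build_prompt("portrait of a woman in a sari at golden hour"))
# generate("portrait of a woman in a sari at golden hour")  # run on a GPU machine
```

Keeping the prompt helper separate from the pipeline call makes it easy to check prompts on any machine and run the actual generation only where a suitable GPU is available.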

Cloud GPU Recommendation:
- FLUX.1-dev is a large model and needs a GPU with ample memory. If no suitable GPU is available locally, consider running the model on a cloud GPU service such as AWS, Google Cloud, or Azure.

License

The model is released under the CreativeML Open RAIL-M license, which allows for use with some restrictions. Please refer to the full license text for detailed terms and conditions.
