Flux Product Ad Backdrop
prithivMLmods
Introduction
Flux-Product-Ad-Backdrop is a text-to-image model for generating high-quality product advertisement images. It is a diffusion model fine-tuned with LoRA (Low-Rank Adaptation) for the specific use case of product advertising.
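To make the LoRA idea concrete, here is a minimal pure-Python sketch of how a low-rank update can be merged into a frozen weight matrix, W' = W + (alpha / r) * B A. The function names, the tiny matrices, and the `alpha` value are illustrative assumptions, not values taken from this model.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(W, A, B, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank.

    W: frozen base weight, shape (d_out, d_in)
    A: trained down-projection, shape (r, d_in)
    B: trained up-projection, shape (d_out, r)
    alpha: scaling hyperparameter (hypothetical value in the example below)
    """
    r = len(A)  # rank = number of rows of A
    scale = alpha / r
    delta = matmul(B, A)  # low-rank update with the same shape as W
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy example: a 2x2 base weight with a rank-1 adapter.
W = [[1, 0], [0, 1]]
B = [[1], [2]]
A = [[3, 4]]
print(lora_merge(W, A, B, alpha=2))  # -> [[7.0, 8.0], [12.0, 17.0]]
```

Only `A` and `B` are trained, which is why a LoRA adapter is much smaller than the base model it modifies.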
Architecture
- Base Model: Uses "black-forest-labs/FLUX.1-dev" as its core model.
- LoRA Fine-Tuning: Enhances the model's capabilities to generate targeted advertisement images.
- Optimizer: AdamW is employed for optimization.
- Learning Rate Scheduler: Uses a constant scheduler.
- Noise Parameters: Noise offset of 0.03 with multires noise discount of 0.1.
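The noise parameters above can be illustrated with a small pure-Python sketch. It assumes the common formulation in which noise offset adds one shared Gaussian shift per channel, scaled by the offset, and in which each coarser level of multi-resolution noise is weighted by the discount raised to the level index. The card does not state which trainer or exact formula was used, so the function names and structure here are assumptions.

```python
import random


def offset_noise(channels, height, width, noise_offset=0.03, seed=None):
    """Gaussian noise plus one shared random shift per channel (noise offset)."""
    rng = random.Random(seed)
    noise = []
    for _ in range(channels):
        # One draw per channel, scaled by noise_offset, added to every pixel.
        shift = noise_offset * rng.gauss(0.0, 1.0)
        plane = [[rng.gauss(0.0, 1.0) + shift for _ in range(width)]
                 for _ in range(height)]
        noise.append(plane)
    return noise


def multires_weights(levels, discount=0.1):
    """Relative weight of each successively coarser noise level."""
    return [discount ** i for i in range(levels)]


noise = offset_noise(channels=4, height=8, width=8, noise_offset=0.03, seed=0)
print(len(noise), len(noise[0]), len(noise[0][0]))  # -> 4 8 8
```

With a discount of 0.1, coarse levels contribute little, so the multi-resolution component acts as a mild perturbation on top of the standard noise.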
Training
- Training Images: 19 images were used to train the model.
- Image Dimensions: Best results are obtained with dimensions of 768x1024 and 1024x1024.
- Epochs and Iterations: The model was trained over 15 epochs, with specific iterations for noise adjustments.
- Labeling: Images were captioned in natural language using florence2-en.
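As a rough sense of scale, the training figures above translate into optimizer steps as follows. The `repeats` and `batch_size` parameters are hypothetical, since the card does not report them; only the 19 images and 15 epochs come from the source.

```python
import math


def total_steps(num_images, epochs, repeats=1, batch_size=1):
    """Estimate optimizer steps for a small fine-tuning run.

    repeats and batch_size are assumed values, not reported by the model card.
    """
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# 19 images, 15 epochs, one pass per image per epoch, batch size 1 (assumed).
print(total_steps(19, 15))  # -> 285
```

Different repeat counts or batch sizes scale this number accordingly, so it should be read as an order-of-magnitude estimate only.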
Guide: Running Locally
- Setup Environment: Install torch and the Hugging Face diffusers library, then import what the pipeline needs.

      import torch
      from diffusers import DiffusionPipeline
- Load Base Model:

      base_model = "black-forest-labs/FLUX.1-dev"
      pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
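- Load LoRA Weights: Load the adapter on top of the base pipeline. The trigger word is meant to appear in the prompt.

      lora_repo = "prithivMLmods/Flux-Product-Ad-Backdrop"
      trigger_word = "Product Ad"  # include this in the prompt
      pipe.load_lora_weights(lora_repo)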
- Use GPU: Transfer the pipeline to a CUDA-enabled device for better performance.

      device = torch.device("cuda")
      pipe.to(device)
- Cloud GPU Suggestion: Consider using cloud-based solutions such as AWS EC2 instances with NVIDIA GPUs for efficient processing.
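The steps above stop before an image is actually generated. The following sketch puts them together and adds a generation call. The helper names (`build_prompt`, `generate`), the layout labels, the width-by-height reading of the recommended dimensions, and the step count, guidance scale, and seed are all assumptions for illustration. The base model repository may also require accepting its license on Hugging Face before it can be downloaded.

```python
# Sizes the card reports as giving the best results, read here as (width, height).
RECOMMENDED_SIZES = {"portrait": (768, 1024), "square": (1024, 1024)}
TRIGGER_WORD = "Product Ad"


def build_prompt(description, trigger_word=TRIGGER_WORD):
    """Prefix the description with the trigger word unless it is already present."""
    description = description.strip()
    if trigger_word.lower() in description.lower():
        return description
    return f"{trigger_word}, {description}"


def generate(description, layout="portrait", steps=28, guidance=3.5, seed=0):
    """Generate one image; steps, guidance, and seed are assumed values."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from diffusers import DiffusionPipeline

    width, height = RECOMMENDED_SIZES[layout]
    pipe = DiffusionPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.load_lora_weights("prithivMLmods/Flux-Product-Ad-Backdrop")
    pipe.to(torch.device("cuda"))

    # A seeded generator makes the result reproducible on the same setup.
    generator = torch.Generator("cuda").manual_seed(seed)
    result = pipe(
        prompt=build_prompt(description),
        width=width,
        height=height,
        num_inference_steps=steps,
        guidance_scale=guidance,
        generator=generator,
    )
    return result.images[0]


# Example (needs a CUDA GPU and access to the base model):
# image = generate("a perfume bottle on a marble table, soft studio light")
# image.save("product_ad.png")
```

If GPU memory is tight, replacing the `pipe.to(...)` call with `pipe.enable_model_cpu_offload()` is one option diffusers provides, at the cost of slower generation.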
License
The model is released under the CreativeML OpenRAIL-M license, which permits commercial use subject to use-based restrictions that also apply to redistributed copies and derivative works.