Flux-3DXL-Garment-Mannequin

strangerzonehf

Introduction

FLUX-3DXL-Garment-Mannequin is a text-to-image model for generating 3D-style rendered images of mannequins in various attire. It focuses on detailed, realistic depictions of mannequins dressed in diverse styles.

Architecture

The model is a LoRA (Low-Rank Adaptation) adapter built on the black-forest-labs/FLUX.1-dev base model and is loaded through a diffusers pipeline. LoRA keeps the base weights frozen and trains only small low-rank matrices, which makes fine-tuning efficient and keeps the adapter file small. The model is tagged text-to-image, lora, diffusers, and 3D.

Training

The adapter was trained on 14 images with the AdamW optimizer and a constant learning-rate scheduler. The network dimension (the LoRA rank) is 64 and the network alpha is 32. Training ran for 15 epochs, with a checkpoint saved after every epoch.
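To make the network dimension and alpha values concrete, here is a minimal sketch of what they imply. It assumes the common LoRA convention in which the low-rank update is scaled by alpha / rank; the source does not state which trainer was used, so treat that as an assumption. The 3072 x 3072 layer size is a hypothetical example for illustration, not a figure from the model card.

```python
# Sketch: what "network dim 64, network alpha 32" implies under the common
# LoRA convention W' = W + (alpha / rank) * B @ A.
# Assumption: the trainer scales the update by alpha / rank.

def lora_scale(alpha: float, rank: int) -> float:
    """Multiplier applied to the low-rank update B @ A."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK, ALPHA = 64, 32   # values from the Training section
D_IN = D_OUT = 3072    # hypothetical layer size, for illustration only

print(lora_scale(ALPHA, RANK))               # 0.5
print(lora_param_count(D_IN, D_OUT, RANK))   # 393216
print(D_IN * D_OUT)                          # 9437184 weights in the full layer
```

Under these assumptions the adapter trains roughly 4% as many parameters as the full layer would need, and its update is applied at half strength.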

Guide: Running Locally

To run the model locally, set up a Python environment with PyTorch and the Hugging Face diffusers library.

  1. Install Dependencies:

    pip install torch transformers diffusers accelerate peft sentencepiece
    
  2. Set Up the Model:

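    import torch
    from diffusers import DiffusionPipeline

    # Load the FLUX.1-dev base model in bfloat16 to reduce memory use
    base_model = "black-forest-labs/FLUX.1-dev"
    pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)

    # Attach the LoRA adapter; include the trigger word in every prompt
    lora_repo = "strangerzonehf/Flux-3DXL-Garment-Mannequin"
    trigger_word = "3DXL Mannequin"
    pipe.load_lora_weights(lora_repo)

    # Move the pipeline to the GPU (requires a CUDA-capable device)
    device = torch.device("cuda")
    pipe.to(device)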
    
  3. Generate Images: Use the trigger word 3DXL Mannequin in your prompt to generate images.
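The generation step can be sketched as follows. This is a hedged example rather than the author's own script: `build_prompt` and `generate` are hypothetical helper names, and the step count, guidance scale, and 1024 x 1024 resolution are assumed starting values, not settings taken from the model card. It expects the `pipe` object created in step 2.

```python
# Sketch of step 3: build a prompt containing the trigger word and call the
# pipeline. Helper names and default settings are illustrative assumptions.

TRIGGER_WORD = "3DXL Mannequin"  # trigger word from the model card


def build_prompt(description: str) -> str:
    """Prefix the trigger word unless the description already contains it."""
    description = description.strip()
    if TRIGGER_WORD.lower() in description.lower():
        return description
    return f"{TRIGGER_WORD}, {description}"


def generate(pipe, description, steps=28, guidance_scale=3.5,
             width=1024, height=1024, generator=None):
    """Run the pipeline from step 2 and return the first generated image.

    `generator` may be a seeded torch.Generator for reproducible output.
    """
    result = pipe(
        prompt=build_prompt(description),
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        width=width,
        height=height,
        generator=generator,
    )
    return result.images[0]
```

With the pipeline from step 2 loaded, `generate(pipe, "wearing a tailored navy suit, studio lighting").save("mannequin.png")` would write the result to disk. Passing `generator=torch.Generator("cpu").manual_seed(0)` should make runs repeatable.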

Cloud GPUs: Consider using cloud services like AWS, GCP, or Azure with GPU support to expedite the image generation process.

License

The model is distributed under the creativeml-openrail-m license, which permits open use of the model subject to the terms and use restrictions set out in the license agreement.
