3D-Render-Flux-LoRA

prithivMLmods

Introduction

The 3D-Render-Flux-LoRA model is a text-to-image model specialized for 3D-style portraits and renders. It is loaded through the Diffusers library and uses LoRA (Low-Rank Adaptation) weights to generate detailed, stylized images from text prompts.

Architecture

The model builds on the base model black-forest-labs/FLUX.1-dev and adds LoRA weights that specialize it for 3D-style rendering. Given a descriptive text prompt, it generates a corresponding high-quality image in a 3D-rendered style.
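
For readers new to LoRA, the general idea can be written as a single equation. This is the standard LoRA formulation, not something stated in this model card: a pretrained weight matrix W_0 stays frozen and only a low-rank update is trained. The symbols (W_0, A, B, r, alpha) are the usual notation and are assumptions as far as this specific model is concerned; in particular, the scaling factor alpha is not given in the source.

```latex
W = W_0 + \frac{\alpha}{r}\, B A,
\qquad B \in \mathbb{R}^{d \times r},\quad
A \in \mathbb{R}^{r \times k},\quad
r \ll \min(d, k)
```

Because only A and B are trained, the adapter is much smaller than the base model and can be loaded on top of it at inference time.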

Training

The model is still in training, on a dataset of 19 high-resolution images. Key training parameters include:

  • LR Scheduler: constant
  • Optimizer: AdamW
  • Noise Offset: 0.03
  • Network Dimensions: 64
  • Epochs: 15, with a checkpoint saved every epoch
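
To give a feel for what these numbers imply, here is a small arithmetic sketch. It assumes that "Network Dimensions" refers to the LoRA rank, that training uses a batch size of 1 with one pass over each image per epoch, and that the adapted layer is a square 3072 x 3072 matrix. None of these assumptions is stated in the source; the layer size in particular is a hypothetical value chosen only for illustration.

```python
import math


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters of one LoRA pair: B is d_out x rank, A is rank x d_in."""
    return rank * (d_out + d_in)


def total_steps(num_images: int, epochs: int, batch_size: int = 1, repeats: int = 1) -> int:
    """Optimizer steps, assuming each epoch sees every image `repeats` times."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs


# Values taken from the parameter list: rank 64, 19 images, 15 epochs.
RANK, NUM_IMAGES, EPOCHS = 64, 19, 15
# Hypothetical layer width, used only to illustrate the size comparison.
D = 3072

full = D * D
lora = lora_param_count(D, D, RANK)
print(f"full layer: {full:,} params, LoRA pair: {lora:,} params ({lora / full:.1%})")
print(f"optimizer steps (batch size 1, 1 repeat): {total_steps(NUM_IMAGES, EPOCHS)}")
print(f"checkpoints saved (one per epoch): {EPOCHS}")
```

Under these assumptions the run amounts to 285 optimizer steps and 15 saved checkpoints; a different batch size or repeat count would change the step count accordingly.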

The model is tuned to respond to specific trigger words in the prompt: "3D Portrait" and "3d render".
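
One way to make sure the trigger words always reach the model is to add them to the prompt programmatically. The helper below is a minimal sketch; the function name build_prompt and the choice to prepend the trigger words are assumptions made for illustration, not part of the model card.

```python
TRIGGER_WORDS = ("3D Portrait", "3d render")


def build_prompt(description: str, triggers=TRIGGER_WORDS) -> str:
    """Prefix a description with any trigger words it does not already contain.

    The containment check is case-insensitive, so "3D render" and
    "3d render" are treated as the same trigger.
    """
    lowered = description.lower()
    missing = [t for t in triggers if t.lower() not in lowered]
    return ", ".join(missing + [description.strip()])


print(build_prompt("a smiling astronaut, studio lighting"))
print(build_prompt("3D render of a ceramic fox"))
```

The second call only adds "3D Portrait", since the description already mentions a 3D render.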

Guide: Running Locally

  1. Setup:

    • Ensure you have PyTorch installed with CUDA support for GPU acceleration.
    • Install the necessary libraries such as diffusers.
  2. Code Snippet:

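    import torch
    from diffusers import DiffusionPipeline
    
    # Load the FLUX.1-dev base model in bfloat16 to reduce memory use.
    base_model = "black-forest-labs/FLUX.1-dev"
    pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
    
    # Apply the LoRA weights on top of the base model.
    lora_repo = "prithivMLmods/3D-Render-Flux-LoRA"
    trigger_word = "3D Portrait, 3d render"  # include these words in your prompt
    pipe.load_lora_weights(lora_repo)
    
    # Move the pipeline to the GPU.
    device = torch.device("cuda")
    pipe.to(device)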
    
  3. Execution:

    • Include the trigger words "3D Portrait, 3d render" in your prompt when generating images.
  4. Recommended Cloud GPUs:

    • Consider using cloud services like AWS EC2 with NVIDIA GPUs, Google Cloud Platform, or Azure for optimal performance.
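
The execution step (step 3) can be sketched end to end as follows. The function name generate, the example prompt, the output file name, and the sampling settings (28 steps, guidance scale 3.5, fixed seed) are illustrative assumptions, not values from the model card. The sketch also assumes a CUDA GPU with enough memory and access to the base model weights, which may require accepting the base model's license on Hugging Face.

```python
def generate(prompt: str, seed: int = 0, steps: int = 28, guidance_scale: float = 3.5):
    """Load the base model plus LoRA and return one generated PIL image."""
    # Heavy dependencies are imported here so that merely defining the
    # function does not require a GPU environment.
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.load_lora_weights("prithivMLmods/3D-Render-Flux-LoRA")
    pipe.to("cuda")

    # A seeded generator makes the result reproducible for a given prompt.
    generator = torch.Generator("cpu").manual_seed(seed)
    result = pipe(
        prompt=prompt,
        num_inference_steps=steps,      # assumed setting, tune as needed
        guidance_scale=guidance_scale,  # assumed setting, tune as needed
        generator=generator,
    )
    return result.images[0]


# Example call (downloads the weights and needs a CUDA GPU):
# image = generate("3D Portrait, 3d render, a ceramic fox figurine, soft studio lighting")
# image.save("render.png")
```

The pipeline is rebuilt on every call to keep the sketch short; for repeated generation it would make more sense to load it once and reuse it.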

License

The model is distributed under the CreativeML Open RAIL-M license, which permits open use in compliance with the license terms.
