Flux-3D-Emojies-LoRA

strangerzonehf

Introduction

Flux-3D-Emojies-LoRA generates 3D emoji-style images through a text-to-image pipeline. It is a LoRA (Low-Rank Adaptation) adapter: a small set of additional weights loaded on top of a base diffusion model to steer its output toward the 3D emoji style, which makes it suitable for creative applications involving 3D visuals.

Architecture

This model is based on the FLUX.1-dev architecture from Black Forest Labs. At inference time, the LoRA weights are loaded into the base diffusion pipeline to generate 3D emoji images. The LoRA weights were trained with the AdamW optimizer and a constant learning-rate schedule.

Training

  • Total Images Used: 24 in Flat 4K resolution.
  • Optimizer: AdamW
  • Epochs: 18 with saving at every epoch.
  • Noise Parameters: Multires Noise Discount is set at 0.1, with 10 iterations.
  • Network Parameters: Network Dimension is 64, and Network Alpha is 32.
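To make the network parameters concrete, the sketch below assumes the common LoRA convention in which the low-rank update B·A is scaled by alpha / rank before being added to the frozen base weight. With Network Dimension 64 and Network Alpha 32, that scale is 0.5. The helper names, the toy 2x2 matrices, and the 3072-wide layer are illustrative assumptions, not values taken from the model's training code.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Scaling applied to the low-rank update (assumed convention: alpha / rank)."""
    return network_alpha / network_dim


def lora_param_count(in_features: int, out_features: int, rank: int) -> int:
    """Trainable parameters for one adapted linear layer: A is rank x in, B is out x rank."""
    return rank * in_features + out_features * rank


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_lora(weight, lora_b, lora_a, scale):
    """Return weight + scale * (B @ A), leaving the base weight untouched."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Values from the training summary above.
scale = lora_scale(network_alpha=32, network_dim=64)  # 0.5

# Toy rank-1 example (hypothetical numbers, only to show the arithmetic).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # out_features x rank
lora_a = [[3.0, 4.0]]     # rank x in_features
merged = apply_lora(base, lora_b, lora_a, scale)  # [[2.5, 2.0], [3.0, 5.0]]

# Hypothetical 3072 x 3072 layer adapted at rank 64.
params = lora_param_count(3072, 3072, 64)  # 393216
```

Under this convention, an alpha of half the rank halves the contribution of the learned update compared with alpha equal to rank.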

Guide: Running Locally

  1. Environment Setup:

    • Ensure you have torch and diffusers (the library that provides DiffusionPipeline) installed.
    • Run on a CUDA-capable GPU; a cloud GPU, such as those provided by AWS or Google Cloud, is one option for optimal performance.
  2. Model Initialization:

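    import torch
    from diffusers import DiffusionPipeline

    # Load the base FLUX.1-dev pipeline in bfloat16 to reduce memory use.
    # Note: FLUX.1-dev is a gated repository on Hugging Face.
    base_model = "black-forest-labs/FLUX.1-dev"
    pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)

    # Attach the LoRA weights on top of the base model.
    lora_repo = "strangerzonehf/Flux-3D-Emojies-LoRA"
    trigger_word = "3D Emojies"  # include this in prompts to activate the style
    pipe.load_lora_weights(lora_repo)

    # Move the pipeline to the GPU.
    device = torch.device("cuda")
    pipe.to(device)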
    
  3. Running Inference:

    • Use the trigger word "3D Emojies" to generate images.
    • Recommended inference steps range from 30 to 35 for optimal results.
    • The model supports various output dimensions, with 1024x1024 being the default.
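The inference settings above can be collected into a small helper. This is a minimal sketch, not the author's own script: the function names, the comma-separated prompt format, and the example description are assumptions for illustration. It expects a pipeline object initialized as in step 2 and uses the standard prompt, num_inference_steps, width, and height arguments of the diffusers pipeline call.

```python
TRIGGER_WORD = "3D Emojies"


def build_prompt(description: str, trigger_word: str = TRIGGER_WORD) -> str:
    """Prefix the trigger word unless the description already contains it."""
    description = description.strip()
    if trigger_word.lower() in description.lower():
        return description
    return f"{trigger_word}, {description}"


def generation_kwargs(description: str, steps: int = 30,
                      width: int = 1024, height: int = 1024) -> dict:
    """Build call arguments; 30 to 35 steps is the recommended range."""
    if steps <= 0:
        raise ValueError("steps must be a positive integer")
    return {
        "prompt": build_prompt(description),
        "num_inference_steps": steps,
        "width": width,
        "height": height,
    }


def generate(pipe, description: str, **overrides):
    """Call an already-initialized pipeline and return the first image."""
    return pipe(**generation_kwargs(description, **overrides)).images[0]
```

With the pipe object from step 2, a call such as generate(pipe, "a smiling sun wearing sunglasses", steps=35) would return a PIL image that can be written to disk with its save method; the description here is only a placeholder prompt.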

License

The model is licensed under CreativeML Open RAIL-M, allowing for open and collaborative use while ensuring responsible AI deployment.
