Coloring Book Flux LoRA

prithivMLmods

Introduction

Coloring-Book-Flux-LoRA is a text-to-image model by prithivMLmods that generates black-and-white illustrations from text prompts. It is a LoRA (Low-Rank Adaptation) adapter applied on top of the FLUX model architecture, not a standalone model.

Architecture

This model is built upon the base model "black-forest-labs/FLUX.1-dev" and is fine-tuned with LoRA to specialize its text-to-image generation for black-and-white illustrations. Inference runs through a diffusion pipeline that generates images at a default resolution of 1024x1024 pixels.
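To make the idea of a low-rank adapter concrete, the sketch below compares how many parameters a full fine-tune of one weight matrix would train against a LoRA pair of the same rank as the network dimension listed under Training (64). The layer size of 3072 x 3072 and the function names are assumptions chosen for illustration only; they are not taken from the model card.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Trainable parameters when the whole d_out x d_in weight matrix is updated."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a LoRA pair: A is rank x d_in, B is d_out x rank."""
    return rank * d_in + d_out * rank


# Hypothetical layer size, used only to illustrate the ratio
d_in = d_out = 3072
rank = 64  # matches "Network Dimension: 64" in the Training section

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, rank)
ratio = lora / full

print(f"full fine-tune: {full} parameters")
print(f"LoRA (rank {rank}): {lora} parameters")
print(f"LoRA trains about {ratio:.1%} of the layer")
```

Under these assumed dimensions, the adapter trains roughly 4% of the parameters in that layer, which is why the LoRA weights can be shipped separately from the base model.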

Training

The training process utilized 10 high-resolution images. Important parameters include:

  • Learning Rate Scheduler: constant
  • Optimizer: AdamW
  • Network Dimension: 64
  • Network Alpha: 32
  • Epochs: 10

The model is still in training, so future versions may bring improvements and refinements.
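The Network Dimension and Network Alpha values listed above interact: in the common LoRA convention, the low-rank update is multiplied by alpha divided by the dimension (rank) before being added to the base weight. The sketch below assumes that convention (the model card does not state it explicitly) and uses tiny hand-written matrices with rank 1 so the arithmetic is easy to follow. The helper names are hypothetical.

```python
def lora_scale(network_alpha: float, network_dim: int) -> float:
    """Scaling factor applied to the low-rank update (assumed alpha / rank convention)."""
    return network_alpha / network_dim


def merge_lora(W, B, A, scale):
    """Return W + scale * (B @ A) using plain nested lists."""
    rows, cols, rank = len(W), len(W[0]), len(A)
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Values from the Training section: alpha 32, dimension 64
scale = lora_scale(32, 64)

# Toy 2x2 base weight with a rank-1 update (illustrative numbers only)
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[2.0], [4.0]]   # shape 2 x 1
A = [[1.0, 3.0]]     # shape 1 x 2

merged = merge_lora(W, B, A, scale)
print(scale)
print(merged)
```

With alpha set to half the dimension, the learned update is applied at half strength under this convention.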

Guide: Running Locally

To run the Coloring-Book-Flux-LoRA model locally, follow these steps:

  1. Set Up Environment: Ensure you have Python and PyTorch installed. Set up a virtual environment for managing dependencies.
  2. Import Libraries: Use torch and the DiffusionPipeline class from the diffusers library.
  3. Load Base Model:
    import torch
    from diffusers import DiffusionPipeline
    
    # Load the FLUX.1-dev base pipeline in bfloat16 to reduce memory use
    base_model = "black-forest-labs/FLUX.1-dev"
    pipe = DiffusionPipeline.from_pretrained(base_model, torch_dtype=torch.bfloat16)
    
  4. Load LoRA Weights:
    lora_repo = "prithivMLmods/Coloring-Book-Flux-LoRA"
    trigger_word = "Coloring Book"
    pipe.load_lora_weights(lora_repo)
    
  5. Configure Device:
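    # Use the GPU when available; fall back to CPU otherwise (much slower)
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    pipe.to(device)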
    
  6. Run Inference: Include the trigger word "Coloring Book" in your prompt so that the LoRA style is applied to the generated image.
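The steps above can be combined into one script, sketched here. The base model, LoRA repository, trigger word, and 1024x1024 resolution come from this guide; the helper names, the step count, the guidance scale, the seed, and the output file name are assumptions picked for illustration and may need tuning. The heavy imports sit inside `generate` so the prompt helper can be used without a GPU.

```python
TRIGGER_WORD = "Coloring Book"


def build_prompt(subject: str, trigger_word: str = TRIGGER_WORD) -> str:
    """Prefix the trigger word unless the prompt already contains it."""
    subject = subject.strip()
    if trigger_word.lower() in subject.lower():
        return subject
    return f"{trigger_word}, {subject}"


def generate(subject: str, output_path: str = "coloring_page.png",
             steps: int = 28, guidance_scale: float = 3.5, seed: int = 0) -> str:
    """Generate one image and save it; requires torch, diffusers, and a CUDA GPU."""
    import torch
    from diffusers import DiffusionPipeline

    pipe = DiffusionPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
    )
    pipe.load_lora_weights("prithivMLmods/Coloring-Book-Flux-LoRA")
    pipe.to("cuda")

    # Fixed seed so repeated runs with the same prompt are comparable
    generator = torch.Generator("cpu").manual_seed(seed)
    image = pipe(
        build_prompt(subject),
        width=1024,
        height=1024,
        num_inference_steps=steps,      # assumed value, not from the model card
        guidance_scale=guidance_scale,  # assumed value, not from the model card
        generator=generator,
    ).images[0]
    image.save(output_path)
    return output_path


print(build_prompt("a fox sitting in a forest, simple line art"))
```

On a machine with a suitable GPU, calling `generate("a fox sitting in a forest, simple line art")` would write the result to `coloring_page.png`.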

For optimal performance, consider using cloud GPUs from providers like AWS, Google Cloud, or Paperspace.

License

The model is released under the CreativeML OpenRAIL-M license, allowing usage with certain restrictions and guidelines.
