FLUX.1-controlnet-lineart-promeai

promeai

Introduction

FLUX.1-controlnet-lineart-promeai, published by promeai, is a ControlNet model for lineart-conditioned text-to-image generation. Its ControlNet weights were trained against the black-forest-labs/FLUX.1-dev base model.

Architecture

The model is used through the diffusers library. A FLUX.1-dev base pipeline is paired with this ControlNet: the text prompt drives the content of the image, while the ControlNet steers generation toward the structure of a lineart control image.
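As a rough mental model of how a conditioning scale (such as the `controlnet_conditioning_scale` argument used in the script further down) acts, the toy sketch below adds scaled "control" residuals to base features. It is a conceptual illustration only: the function name and the plain-list features are hypothetical, and this is not the diffusers implementation.

```python
def apply_control_residuals(hidden, residuals, conditioning_scale=0.6):
    """Toy version: add scaled control residuals to base-model features.

    Hypothetical helper for illustration; real models operate on tensors
    inside the network, not on Python lists.
    """
    if len(hidden) != len(residuals):
        raise ValueError("hidden and residuals must have the same length")
    return [h + conditioning_scale * r for h, r in zip(hidden, residuals)]


base_features = [1.0, 2.0, 3.0]
control_residuals = [0.5, -1.0, 0.0]

# A scale of 0.0 leaves the base features untouched.
unconditioned = apply_control_residuals(base_features, control_residuals, 0.0)

# A larger scale lets the control signal shift the features further.
conditioned = apply_control_residuals(base_features, control_residuals, 0.6)
```

In this picture, lowering the scale gives the prompt more freedom and raising it makes the output follow the lineart more closely.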

Training

Training ran on a single A100-80G GPU with a dataset of proprietary, real-world images. The first phase used an image size of 512 with a batch size of 3; the second phase used an image size of 1024 with a batch size of 1. Training used approximately 70GB of GPU memory and took about three days to reach the 14,000-step checkpoint.
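For a sense of scale, the figures above imply an average wall-clock time per step. The short calculation below is only a back-of-the-envelope estimate: it assumes "about three days" means exactly three days, and it averages over both phases because the source does not say how the 14,000 steps were split between them.

```python
SECONDS_PER_DAY = 24 * 60 * 60

training_days = 3        # "about three days" in the text, taken as exact here
total_steps = 14_000     # checkpoint step count from the text

total_seconds = training_days * SECONDS_PER_DAY
seconds_per_step = total_seconds / total_steps

print(f"{seconds_per_step:.1f} s/step on average")
```

That works out to roughly 18.5 seconds per step averaged across both phases.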

Guide: Running Locally

Basic Steps

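  1. Install Dependencies: Ensure you have Python installed. Use pip to install the diffusers library and PyTorch. The FLUX pipeline also loads its text encoders through transformers, and the T5 tokenizer typically needs sentencepiece and protobuf, so install those as well:

    pip install diffusers torch transformers accelerate sentencepiece protobuf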
    
  2. Load Models: Use the following script to load and run the model:

    import torch
    from diffusers import FluxControlNetModel, FluxControlNetPipeline
    from diffusers.utils import load_image

    # Base model and the lineart ControlNet trained against it
    base_model = 'black-forest-labs/FLUX.1-dev'
    controlnet_model = 'promeai/FLUX.1-controlnet-lineart-promeai'

    # Load both in bfloat16 to reduce memory use, then move the pipeline to the GPU
    controlnet = FluxControlNetModel.from_pretrained(controlnet_model, torch_dtype=torch.bfloat16)
    pipe = FluxControlNetPipeline.from_pretrained(base_model, controlnet=controlnet, torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    # Lineart image that guides the composition of the output
    control_image = load_image("./images/example-control.jpg")
    prompt = "your prompt here"
    image = pipe(
        prompt,
        control_image=control_image,
        controlnet_conditioning_scale=0.6,  # how strongly the lineart constrains the result
        num_inference_steps=28,
        guidance_scale=3.5,
    ).images[0]
    image.save("./image.jpg")
    
  3. Run on GPU: The script moves the pipeline to "cuda", so it needs a CUDA-capable NVIDIA GPU. If you do not have one locally, cloud GPU instances such as AWS EC2 with NVIDIA GPUs or Google Cloud's GPU instances are a practical alternative.
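When using your own lineart instead of the example image, it can help to pick an output size explicitly. The helper below is a minimal sketch, not part of the model card: `snap_size` is a hypothetical name, the multiple of 16 reflects an assumption that FLUX pipelines in diffusers work best with dimensions divisible by 16, and the 1024 cap simply mirrors the largest image size mentioned in the Training section.

```python
def snap_size(width, height, multiple=16, max_side=1024):
    """Scale a size down to fit max_side, then round each side down to a multiple.

    Hypothetical helper; the defaults are assumptions, not values
    prescribed by the model authors.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")

    # Only shrink; never upscale small images.
    scale = min(1.0, max_side / max(width, height))

    def snap(value):
        snapped = int(value * scale) // multiple * multiple
        return max(multiple, snapped)  # never return a zero-sized side

    return snap(width), snap(height)


landscape = snap_size(1000, 750)
large = snap_size(2048, 1536)
tiny = snap_size(10, 10)
```

The resulting pair can be applied with `control_image.resize((w, h))` and passed to the pipeline call as `width=w, height=h`, so the control image and the generated image share the same dimensions.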

ComfyUI

  • An example ComfyUI workflow is linked from the model card for additional guidance and customization.

License

The usage of this model is subject to the terms provided by the Hugging Face repository and any specific licensing terms set by the model creator or contributing parties. Please refer to the model repository for detailed licensing information.
