Noob_ipadapter

kataragi

Introduction

This ControlNet IP-Adapter is designed to transfer attributes such as style and atmosphere from reference images, which are difficult to specify through prompts. It is intended for checkpoints in the NoobAI series; the recommended configuration is the "IP-adapter" control type in ControlNet, the "CLIP-ViT-H (IPAdapter)" preprocessor, and the model ip_adapter_Noobtest_800000.bin.

Architecture

The IP-Adapter integrates with ControlNet, providing improved flexibility in generating images that mimic complex stylistic attributes of reference images. It also remains compatible with the animagineXL3.1 series, while integrating most seamlessly with NoobAI checkpoints.

Training

Common Training Details

  • Hardware: RTX A6000 (48GB)
  • Training Time: 408 hours (~17 days)
  • Batch Size: 2
  • Resolution: 1024
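The per-stage figures imply the following effective epoch counts, a simple consistency check (this ignores gradient accumulation, which the card does not mention):

```python
batch_size = 2
steps_per_stage = 400_000

samples_seen = batch_size * steps_per_stage   # images processed per stage
epochs_50k = samples_seen / 50_000            # passes over the 50,000-image set
epochs_100k = samples_seen / 100_000          # passes over the augmented 100,000-image set

print(samples_seen, epochs_50k, epochs_100k)  # 800000 16.0 8.0
```

So each stage sees roughly 16 passes over the 50,000-image dataset, or 8 over the flipped-augmented 100,000-image set.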

Training Stages

  1. First Stage

    • Base Model: ip-adapter_sdxl.bin
    • Checkpoint: animagineXL3.1
    • Images: 50,000
    • Learning Rate: 1e-4
    • Steps: 400,000
  2. Second Stage

    • Base Model: ip-adapter_animegineXL-400000.bin
    • Checkpoint: NoobAI 1.1
    • Images: 50,000 (later augmented to 100,000 with flipped images)
    • Learning Rate: 1e-7
    • Steps: 400,000
  3. Third Stage

    • Base Model: ip-adapter_noobAI_XL-400000.bin (unreleased)
    • Checkpoint: NoobAI 1.1
    • Images: 100,000
    • Learning Rate: 6e-5 (cosine annealed with 1% warm-up)
    • Steps: 400,000
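The third stage's schedule (cosine annealing from 6e-5 with a 1% warm-up over 400,000 steps) can be sketched as below. The function name and the linear shape of the warm-up are assumptions; the card only states the peak learning rate, the cosine anneal, and the warm-up fraction.

```python
import math

def third_stage_lr(step: int,
                   base_lr: float = 6e-5,
                   total_steps: int = 400_000,
                   warmup_frac: float = 0.01) -> float:
    """Cosine-annealed learning rate with linear warm-up (shape assumed;
    the card specifies only 6e-5, cosine annealing, and 1% warm-up)."""
    warmup_steps = int(total_steps * warmup_frac)  # 4,000 steps
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps  # linear ramp to base_lr
    # cosine decay from base_lr toward 0 over the remaining steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the rate ramps to 6e-5 by step 4,000, sits at 3e-5 halfway through the decay, and approaches zero at step 400,000.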

Remarks

The model trained from the SDXL base IP-Adapter on animagineXL3.1 tolerated a high learning rate without diverging. Switching to NoobAI in the second stage, however, caused early divergence at that rate, so the learning rate was reduced sharply to let the adapter adapt to the NoobAI checkpoint gradually before being raised again in the third stage.

Guide: Running Locally

  1. Clone the Repository: Obtain the code from the Hugging Face model repository.
  2. Set Up Environment: Install the required dependencies, ideally inside a virtual environment.
  3. Model Configuration: Select the IP-adapter control type, preprocessor, and model file as described above.
  4. Inference: Run the model with your reference image and the configuration settings of your choice.
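For programmatic inference outside the WebUI, the adapter could in principle be loaded through diffusers' IP-Adapter support. The following is a hedged sketch, not a confirmed workflow: the checkpoint repo id `Laxhar/noobai-XL-1.1`, the adapter repo id, and compatibility of the `.bin` weights with `load_ip_adapter` are all assumptions; the CLIP-ViT-H image encoder is taken from the h94/IP-Adapter repository to match the recommended preprocessor.

```python
# Hedged sketch: assumes the repo ids below exist and that the .bin weights
# load through diffusers' load_ip_adapter(); only the weight file name is
# taken from this card.
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.utils import load_image
from transformers import CLIPVisionModelWithProjection

# CLIP-ViT-H image encoder, matching the "CLIP-ViT-H (IPAdapter)" preprocessor
image_encoder = CLIPVisionModelWithProjection.from_pretrained(
    "h94/IP-Adapter", subfolder="models/image_encoder", torch_dtype=torch.float16
)

# NoobAI checkpoint (repo id assumed)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "Laxhar/noobai-XL-1.1",
    image_encoder=image_encoder,
    torch_dtype=torch.float16,
).to("cuda")

# Adapter weights (repo id assumed; file name from this card)
pipe.load_ip_adapter(
    "kataragi/Noob_ipadapter",
    subfolder="",
    weight_name="ip_adapter_Noobtest_800000.bin",
    image_encoder_folder=None,  # encoder supplied explicitly above
)
pipe.set_ip_adapter_scale(0.7)  # style strength; tune per reference image

reference = load_image("reference.png")  # style/atmosphere reference
result = pipe(
    prompt="1girl, masterpiece, best quality",
    ip_adapter_image=reference,
    num_inference_steps=28,
).images[0]
result.save("output.png")
```

This requires a CUDA GPU and downloads several gigabytes of weights on first run.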

For optimal performance, a cloud GPU is recommended, such as an AWS EC2 instance with NVIDIA GPUs or a Google Cloud GPU instance.

License

The model is licensed under the CreativeML Open RAIL-M, permitting various uses while ensuring the responsible deployment of AI technology.
