Noob_ipadapter
kataragi
Introduction
The controlnet_IP-adapter is designed to emulate aspects of a reference image, such as style and atmosphere, that are difficult to specify through prompts. It is recommended for use with models from the NoobAI series. To use it, select "IP-adapter" in ControlNet, choose the "CLIP-ViT-H (IPAdapter)" preprocessor, and select the model ip_adapter_Noobtest_800000.bin.
Architecture
The IP-adapter integrates with ControlNet, providing improved flexibility in generating images that mimic complex stylistic attributes from reference images. It is compatible with the animagineXL3.1 series, allowing seamless integration when using NoobAI.
Training
Common Training Details
- Hardware: RTX A6000 (48GB)
- Training Time: 408 hours (~17 days)
- Batch Size: 2
- Resolution: 1024
Training Stages
- First Stage
  - Base Model: ip-adapter_sdxl.bin
  - Checkpoint: animagineXL3.1
  - Images: 50,000
  - Learning Rate: 1e-4
  - Steps: 400,000
- Second Stage
  - Base Model: ip-adapter_animegineXL-400000.bin
  - Checkpoint: NoobAI 1.1
  - Images: 50,000 (later augmented to 100,000 with flipped images)
  - Learning Rate: 1e-7
  - Steps: 400,000
- Third Stage
  - Base Model: ip-adapter_noobAI_XL-400000.bin (unreleased)
  - Checkpoint: NoobAI 1.1
  - Images: 100,000
  - Learning Rate: 6e-5 (cosine annealed with 1% warm-up)
  - Steps: 400,000
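As a rough sanity check on the figures above, the number of passes over the data in each stage can be estimated from the step count, batch size, and image count. This is only a sketch: it assumes one optimizer step consumes exactly one batch of 2 images (no gradient accumulation, a single GPU), which the card does not state. The `Stage` record and the `epochs` helper are illustrative names.

```python
from dataclasses import dataclass

BATCH_SIZE = 2  # from "Common Training Details"


@dataclass(frozen=True)
class Stage:
    name: str
    images: int
    steps: int


def epochs(stage: Stage, batch_size: int = BATCH_SIZE) -> float:
    """Approximate passes over the dataset, assuming one batch per step."""
    return stage.steps * batch_size / stage.images


# Image and step counts are taken from the stage list; the second stage is
# listed twice because its dataset grew from 50,000 to 100,000 images.
STAGES = [
    Stage("first", images=50_000, steps=400_000),
    Stage("second (before flips)", images=50_000, steps=400_000),
    Stage("second (after flips)", images=100_000, steps=400_000),
    Stage("third", images=100_000, steps=400_000),
]

for stage in STAGES:
    print(f"{stage.name}: ~{epochs(stage):.0f} epochs")
```

Under that assumption, the first stage corresponds to about 16 passes over 50,000 images and the third to about 8 passes over 100,000. The second stage falls somewhere between the two, depending on when the flipped images were added.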
Remarks
In the first stage, which started from the SDXL base IP-adapter and trained against animagineXL3.1, training tolerated a high learning rate (1e-4) without diverging. Switching to NoobAI in the second stage required a much lower learning rate (1e-7) to prevent early divergence while the adapter gradually adapted to the NoobAI model.
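The third-stage schedule (cosine annealing with 1% warm-up) can be sketched as a function of the training step. The card does not give the warm-up shape or the final learning rate, so this sketch assumes a linear warm-up from zero and a decay to zero. `lr_at` is an illustrative name, not the training script's.

```python
import math

PEAK_LR = 6e-5          # third-stage learning rate from the card
TOTAL_STEPS = 400_000   # third-stage step count from the card
WARMUP_FRACTION = 0.01  # "1% warm-up" from the card


def lr_at(step: int,
          peak_lr: float = PEAK_LR,
          total_steps: int = TOTAL_STEPS,
          warmup_fraction: float = WARMUP_FRACTION,
          min_lr: float = 0.0) -> float:
    """Learning rate at `step`: linear warm-up, then cosine decay (assumed)."""
    warmup_steps = round(total_steps * warmup_fraction)
    if warmup_steps > 0 and step < warmup_steps:
        # Assumed linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Fraction of the post-warm-up phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(max(progress, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 2_000, 4_000, 202_000, 400_000):
    print(step, lr_at(step))
```

With these values the warm-up lasts 4,000 steps, the peak of 6e-5 is reached at the end of the warm-up, and the rate falls to half the peak midway through the remaining 396,000 steps.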
Guide: Running Locally
- Clone the Repository: Obtain the code from the Hugging Face model repository.
- Set Up Environment: Ensure dependencies are installed, possibly using a virtual environment.
- Model Configuration: Select the appropriate IP-adapter and model settings as described.
- Inference: Run the model using your desired input image and configuration settings.
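The configuration step can be sketched as a small helper that locates the adapter weights in a local clone and collects the settings named in the introduction. The `repo_dir` argument, the `find_adapter_weights` and `controlnet_settings` helpers, and the dictionary keys are all illustrative assumptions; they are not the schema of any particular UI or API.

```python
from pathlib import Path

ADAPTER_FILENAME = "ip_adapter_Noobtest_800000.bin"  # file name given in this card


def find_adapter_weights(repo_dir, filename: str = ADAPTER_FILENAME) -> Path:
    """Return the path to the adapter weights inside a local clone, or raise."""
    matches = sorted(Path(repo_dir).rglob(filename))
    if not matches:
        raise FileNotFoundError(f"{filename} not found under {repo_dir}")
    return matches[0]


def controlnet_settings(weights: Path) -> dict:
    """Collect the settings described in this card (key names are hypothetical)."""
    return {
        "control_type": "IP-adapter",
        "preprocessor": "CLIP-ViT-H (IPAdapter)",
        "model": weights.name,
    }
```

A typical use would be `controlnet_settings(find_adapter_weights("path/to/clone"))`, where the path is wherever the repository was cloned. The resulting values are then entered in the ControlNet panel by hand or passed to whatever front end is in use.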
For best performance, a machine with an NVIDIA GPU is recommended, for example a cloud GPU instance on AWS EC2 or Google Cloud.
License
The model is released under the CreativeML Open RAIL-M license, which permits a broad range of uses subject to restrictions intended to ensure responsible deployment of AI technology.