Waifu-Diffusion-XL

hakurei

Introduction

Waifu-Diffusion-XL is a latent text-to-image diffusion model fine-tuned on high-quality anime images. It is based on Stability AI's SDXL 0.9 and is intended as a generative art tool, particularly for anime enthusiasts.

Architecture

The model builds on Stability AI's SDXL 0.9 and uses a collection of aesthetic labels gathered by volunteers to improve its anime-style output. Its core component, wdxl-aesthetic-0.9, is a fine-tuned checkpoint aimed at improving visual quality.

Training

Waifu-Diffusion-XL was fine-tuned from Stability AI's SDXL 0.9 on a custom dataset with 15,000 aesthetic labels, with the goal of generating detailed, aesthetically pleasing anime images.

Guide: Running Locally

To run Waifu-Diffusion-XL locally:

  1. Prerequisites: Ensure that Python and the necessary libraries are installed.
  2. Clone Repository: Clone the Waifu-Diffusion-XL repository from Hugging Face.
  3. Install Dependencies: Install any required dependencies listed in the repository.
  4. Download Model Checkpoints: Obtain the model checkpoints from the repository.
  5. Run the Model: Use scripts provided in the repository to generate images.
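
The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the repository's own script: it assumes the `torch`, `diffusers`, and `huggingface_hub` packages are installed, that the checkpoint is published as a single `.safetensors` file, and that the repository id and filename shown below are correct. Both names are guesses based on the author and checkpoint names mentioned on this page, so verify them on the model page. The sampling values (30 steps, guidance scale 7.5, 1024x1024) are illustrative defaults, not recommendations from the authors. Access may also require accepting the license on Hugging Face and logging in first.

```python
from pathlib import Path

# Assumptions (not confirmed by this page): the repository id and the
# checkpoint filename. Check the model page for the actual names.
REPO_ID = "hakurei/waifu-diffusion-xl"
CHECKPOINT_FILE = "wdxl-aesthetic-0.9.safetensors"


def generation_settings(prompt, negative_prompt="", steps=30,
                        guidance_scale=7.5, width=1024, height=1024):
    """Collect the keyword arguments passed to the pipeline call."""
    if width % 8 or height % 8:
        raise ValueError("width and height must be multiples of 8")
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "num_inference_steps": steps,
        "guidance_scale": guidance_scale,
        "width": width,
        "height": height,
    }


def generate(prompt, out_path="output.png", **overrides):
    """Download the checkpoint, run one generation, and save the image."""
    # Heavy imports stay inside the function so this module can be imported
    # on a machine without the GPU stack installed.
    import torch
    from diffusers import StableDiffusionXLPipeline
    from huggingface_hub import hf_hub_download

    # Step 4: fetch the checkpoint file into the local Hugging Face cache.
    checkpoint = hf_hub_download(repo_id=REPO_ID, filename=CHECKPOINT_FILE)

    # Step 5: load the single-file checkpoint as an SDXL pipeline.
    use_cuda = torch.cuda.is_available()
    pipe = StableDiffusionXLPipeline.from_single_file(
        checkpoint,
        torch_dtype=torch.float16 if use_cuda else torch.float32,
    )
    pipe = pipe.to("cuda" if use_cuda else "cpu")

    image = pipe(**generation_settings(prompt, **overrides)).images[0]
    image.save(out_path)
    return Path(out_path)
```

Calling `generate("1girl, solo, looking at viewer", steps=25)` would then write `output.png` to the current directory. Keeping the settings in a separate function makes it easy to inspect or log the exact parameters before committing to a slow generation run.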

If you do not have a capable local GPU, consider a cloud GPU service such as Google Colab, AWS, or Azure. Running the model on a GPU significantly reduces inference time compared with a CPU.
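
Before starting a long generation run on a local or cloud machine, it can help to confirm that a GPU is actually visible to PyTorch. The helper below is a small sketch, assuming PyTorch as the backend; the function name `describe_accelerator` is hypothetical and not part of any library. It falls back to reporting the CPU when PyTorch is missing or no CUDA device is found.

```python
def describe_accelerator():
    """Report the device that inference would run on, with basic GPU details."""
    cpu_info = {"device": "cpu", "name": None, "memory_gb": None}
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU can be reported.
        return cpu_info
    if not torch.cuda.is_available():
        return cpu_info
    # Query the first CUDA device for its name and total memory.
    props = torch.cuda.get_device_properties(0)
    return {
        "device": "cuda",
        "name": props.name,
        "memory_gb": round(props.total_memory / 1024**3, 1),
    }


if __name__ == "__main__":
    print(describe_accelerator())
```

If the reported device is `cpu` on a cloud instance, the runtime was most likely started without a GPU attached.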

License

The Waifu-Diffusion-XL model is released under the SDXL 0.9 Research License Agreement because it incorporates SDXL 0.9 weights ahead of their official release, an arrangement that Stability AI has authorized.
