Waifu Diffusion V1.3

hakurei

Introduction

Waifu Diffusion V1.3 is a latent text-to-image diffusion model specifically fine-tuned on high-quality anime images. It is based on the Stable Diffusion model and is designed to generate anime-style artwork.

Architecture

Waifu Diffusion V1.3 builds on Stable Diffusion 1.4, a latent diffusion model originally trained on the LAION2B-en dataset. It was fine-tuned with a learning rate of 5.0e-6 for 10 epochs on 680,000 anime-styled images.

Training

Waifu Diffusion V1.3 was fine-tuned from the original Stable Diffusion 1.4 model. Different versions of the model weights are available, including Float 16 EMA Pruned, Float 32 EMA Pruned, and Float 32 Full Weights. The model was optimized to enhance its capability in generating anime-style images.
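The fine-tuning setup described above can be summarized as a small configuration sketch. This is illustrative only: the key names and the base-checkpoint identifier are assumptions, not the actual training script's parameters.

```python
# Hypothetical fine-tuning configuration reflecting the hyperparameters
# stated in the model card; key names are illustrative assumptions.
finetune_config = {
    "base_model": "CompVis/stable-diffusion-v1-4",  # Stable Diffusion 1.4 starting checkpoint
    "learning_rate": 5.0e-6,                        # per the model card
    "epochs": 10,
    "dataset_size": 680_000,                        # anime-styled training images
}
```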

Guide: Running Locally

To run Waifu Diffusion V1.3 locally, follow these steps:

  1. Environment Setup: Ensure you have a Python environment ready, and install the necessary libraries such as PyTorch, Diffusers, and Transformers.
  2. Download Model Weights: Select and download the desired weights from the available variants (Float 16 EMA Pruned, Float 32 EMA Pruned, or Float 32 Full Weights).
  3. Load Model: Use the Hugging Face Diffusers library to load the model into your environment.
  4. Generate Images: Input text prompts to generate anime-style images.

For optimal performance, consider using cloud GPU services such as AWS, Google Cloud, or Azure.

License

Waifu Diffusion V1.3 is distributed under the CreativeML OpenRAIL-M license. This license allows for open access and usage with specific provisions:

  1. The model cannot be used to produce or share illegal or harmful content.
  2. Users have rights to the outputs they generate and are responsible for their use, adhering to the license terms.
  3. Redistribution and commercial use of the model are allowed, provided the same use restrictions are applied, and the CreativeML OpenRAIL-M license is shared with users.

Refer to the full text of the CreativeML OpenRAIL-M license for complete details.
