Waifu Diffusion

hakurei

Introduction

Waifu Diffusion is a latent text-to-image diffusion model fine-tuned on high-quality anime images. It allows users to generate anime-style images based on text descriptions.

Architecture

Waifu Diffusion builds on Stable Diffusion, a latent diffusion model, and is used through the diffusers StableDiffusionPipeline, a powerful tool for generating images from text prompts. Because its weights were fine-tuned on anime-style images, the model specializes in creating high-quality visual content with an anime aesthetic.
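Under the hood, the pipeline steers each denoising step toward the text prompt via classifier-free guidance, which is what the pipeline's guidance_scale parameter controls. A minimal sketch of that blending step, using plain Python lists as illustrative stand-ins for the model's real noise-prediction tensors:

```python
def guide(uncond, cond, scale):
    """Classifier-free guidance: push the unconditional prediction
    toward the text-conditioned one by a factor of `scale`."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]

# Illustrative values, not real model outputs.
uncond_pred = [0.1, 0.2, 0.3]  # prediction with an empty prompt
cond_pred = [0.3, 0.1, 0.5]    # prediction with the user's prompt
guided = guide(uncond_pred, cond_pred, scale=6.0)
```

Higher scale values follow the prompt more strictly at the cost of image diversity, which is why the guide below passes guidance_scale=6.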

Training

The model was trained using high-quality anime datasets. This fine-tuning process ensures that the model can produce detailed and stylistically accurate anime images based on user input.

Guide: Running Locally

To run Waifu Diffusion locally, follow these steps:

  1. Install Dependencies: Ensure you have Python installed, then install PyTorch along with the diffusers library and its transformers dependency:

    pip install diffusers transformers torch
    
  2. Load the Model: Use the following Python code to load and generate images with the model:

    import torch
    from diffusers import StableDiffusionPipeline
    
    # Load the fine-tuned weights onto the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        'hakurei/waifu-diffusion',
        torch_dtype=torch.float32
    ).to('cuda')
    
    prompt = "1girl, aqua eyes, baseball cap, blonde hair"
    # Recent diffusers versions return generated images via the
    # `images` attribute of the pipeline output.
    image = pipe(prompt, guidance_scale=6).images[0]
    
    image.save("test.png")
    
  3. GPU Recommendation: For optimal performance, consider using cloud-based GPUs such as those offered by Google Colab or AWS for intensive computation.
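The prompt above is a Danbooru-style comma-separated tag list, the format the model was fine-tuned on. A small hypothetical helper (not part of diffusers) for assembling such prompts from individual tags:

```python
def build_prompt(tags):
    """Join Danbooru-style tags into a comma-separated prompt,
    dropping empty entries and surrounding whitespace."""
    return ", ".join(t.strip() for t in tags if t.strip())

prompt = build_prompt(["1girl", "aqua eyes", "baseball cap", "blonde hair"])
# -> "1girl, aqua eyes, baseball cap, blonde hair"
```

Keeping tags in a list like this makes it easy to add or remove attributes programmatically before passing the prompt to the pipeline.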

License

Waifu Diffusion is released under the CreativeML OpenRAIL-M license, which permits open access and usage with certain conditions:

  1. Do not use the model for illegal or harmful content.
  2. You are free to use generated outputs but are responsible for their use.
  3. Redistribution and commercial use are allowed but must include the same license terms.

Please refer to the full license text for complete details.
