pony diffusion
AstraliteHeart
Introduction
Pony Diffusion is a latent text-to-image diffusion model fine-tuned on high-quality pony images. It leverages a specialized adaptation of the Stable Diffusion model, aimed at generating safe-for-work pony-themed artworks. The model is built with contributions from Waifu-Diffusion and Novel AI, which provided expertise and computational resources.
Architecture
The model is based on a fine-tuned early checkpoint of Waifu-Diffusion, which itself is a fine-tune of Stable Diffusion V1-4. Stable Diffusion is a latent image diffusion model trained on the LAION2B-en dataset. Pony Diffusion was fine-tuned with a learning rate of 5.0e-6 for four epochs on approximately 80,000 text-image pairs from Derpibooru, restricted to images with a score higher than 500 and a rating of safe or suggestive.
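The data selection rule described above (score higher than 500, rating of safe or suggestive) can be sketched as a simple filter. This is only an illustration: the record layout (`score`, `rating`, `tags` keys) and the function name `select_training_pairs` are hypothetical, and the actual preprocessing used by the authors is not documented here.

```python
from typing import Dict, Iterable, List

# Thresholds taken from the description above.
MIN_SCORE = 500
ALLOWED_RATINGS = {"safe", "suggestive"}


def select_training_pairs(records: Iterable[Dict]) -> List[Dict]:
    """Keep records scoring above MIN_SCORE with an allowed rating.

    Each record is a hypothetical dict with 'score', 'rating' and 'tags'
    keys; the real metadata schema may differ.
    """
    return [
        r for r in records
        if r["score"] > MIN_SCORE and r["rating"] in ALLOWED_RATINGS
    ]


# Made-up sample metadata, for illustration only.
sample = [
    {"score": 812, "rating": "safe", "tags": "pinkie pie, smiling"},
    {"score": 640, "rating": "suggestive", "tags": "rarity, dress"},
    {"score": 500, "rating": "safe", "tags": "twilight sparkle, book"},
    {"score": 930, "rating": "explicit", "tags": "excluded by rating"},
]
kept = select_training_pairs(sample)
```

Note that a record with a score of exactly 500 is dropped, since the stated threshold is "higher than 500".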
Training
The fine-tuning process involved adjusting an early checkpoint of Waifu-Diffusion using specific pony-related text-image pairs to enhance its ability to generate pony-themed images. The model's training focused on maintaining a high-quality output while adhering to content safety standards.
Guide: Running Locally
To run the Pony Diffusion model locally, follow these steps:
- Prerequisites: Ensure you have Python and PyTorch installed with GPU support.
- Install Libraries:
    pip install torch diffusers transformers
- Download and Set Up the Model:
    import torch
    from torch import autocast
    from diffusers import StableDiffusionPipeline, DDIMScheduler

    model_id = "AstraliteHeart/pony-diffusion"
    device = "cuda"

    # Load the half-precision weights together with the DDIM scheduler settings.
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        revision="fp16",
        scheduler=DDIMScheduler(
            beta_start=0.00085,
            beta_end=0.012,
            beta_schedule="scaled_linear",
            clip_sample=False,
            set_alpha_to_one=False,
        ),
    )
    pipe = pipe.to(device)

    prompt = "pinkie pie anthro portrait wedding dress veil intricate highly detailed digital painting artstation concept art smooth sharp focus illustration Unreal Engine 5 8K"

    with autocast("cuda"):
        # Recent diffusers releases expose the results as `.images`;
        # very early releases returned them under the "sample" key instead.
        image = pipe(prompt, guidance_scale=7.5).images[0]

    # Write the generated image to disk.
    image.save("cute_poner.png")
- Consider Using Cloud GPUs: For better performance, especially with large models, consider using cloud services like Google Colab or AWS that offer GPU support.
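The scheduler settings in the setup step (`beta_start=0.00085`, `beta_end=0.012`, `beta_schedule="scaled_linear"`) define the noise schedule. As a minimal sketch of what they mean, assuming "scaled_linear" interpolates linearly in the square root of beta and then squares the result, and assuming 1000 training timesteps (the diffusers default, not stated in this document), the schedule can be reproduced in plain Python. The helper names `scaled_linear_betas` and `cumulative_alphas` are illustrative and not part of any library.

```python
import math
from typing import List


def scaled_linear_betas(
    beta_start: float = 0.00085,
    beta_end: float = 0.012,
    num_train_timesteps: int = 1000,  # assumed default, not from this document
) -> List[float]:
    """Interpolate linearly between sqrt(beta_start) and sqrt(beta_end), then square."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    step = (hi - lo) / (num_train_timesteps - 1)
    return [(lo + i * step) ** 2 for i in range(num_train_timesteps)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """Running product of (1 - beta): the fraction of signal kept at each timestep."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


betas = scaled_linear_betas()
alphas_cumprod = cumulative_alphas(betas)
```

The cumulative product starts close to 1 (almost no noise) and decays toward 0 at the last timestep (almost pure noise). With `set_alpha_to_one=False`, the final DDIM step uses the first entry of this product rather than exactly 1.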
License
The model is distributed under the CreativeML OpenRAIL-M license, which allows open access and commercial use under the following conditions:
- Outputs must not be used for illegal or harmful content.
- The authors do not claim rights on the outputs; users are responsible for their use.
- Redistribution and commercial use require adherence to the same license terms, including providing the license to users.
For full details, please refer to the license documentation.