pokemon stable diffusion
justinpinkney
Introduction
The Pokémon-Stable-Diffusion model is a fine-tuned version of Stable Diffusion, adapted to generate Pokémon-like images based on text prompts. This model was developed by Lambda Labs and allows users to create unique Pokémon characters without needing advanced prompt engineering.
Architecture
This model builds on the Stable Diffusion architecture, a widely used text-to-image generation framework. It can be run with the Diffusers library for inference and was fine-tuned on a dataset of Pokémon images with BLIP captions, so that plain text prompts produce Pokémon-themed content.
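As a minimal sketch of inference through Diffusers, the snippet below loads the model and writes a few samples to disk. It assumes the weights are published on the Hugging Face Hub under the ID lambdalabs/sd-pokemon-diffusers (not stated above; substitute the ID you actually use). The helper names generate_pokemon and save_images are hypothetical, and the heavy imports are deferred so the file can be imported on a machine without a GPU.

```python
from pathlib import Path

# Assumed Hub ID for the Diffusers-format weights; replace if yours differs.
MODEL_ID = "lambdalabs/sd-pokemon-diffusers"


def generate_pokemon(prompt, n_images=4, size=512, model_id=MODEL_ID):
    """Return a list of PIL images generated from a text prompt."""
    # Imported lazily: torch and diffusers are only needed at generation time.
    import torch
    from diffusers import StableDiffusionPipeline

    device = "cuda" if torch.cuda.is_available() else "cpu"
    # Half precision saves GPU memory; CPU inference stays in float32.
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=dtype)
    pipe = pipe.to(device)

    result = pipe(
        prompt,
        height=size,
        width=size,
        num_images_per_prompt=n_images,
    )
    return result.images


def save_images(images, outdir):
    """Save images as sample_00.png, sample_01.png, ... and return the paths."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, image in enumerate(images):
        path = outdir / f"sample_{index:02d}.png"
        image.save(path)
        paths.append(path)
    return paths
```

Calling save_images(generate_pokemon("robotic cat with wings"), "outputs/generated_pokemon") would download the weights on first use and write four 512x512 images, provided torch and diffusers are installed.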
Training
The model was trained on 2xA6000 GPUs on the Lambda GPU Cloud, which provides scalable cloud-based GPU resources. Training ran for approximately 15,000 steps over about 6 hours, at a cost of around $10. The training dataset consisted of Pokémon images with BLIP-generated captions and is available as a Hugging Face dataset.
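To put the training figures in perspective, the short calculation below derives throughput and an implied hourly rate from the numbers quoted above. All inputs are the approximate values from the text, so the results are rough estimates rather than measured or billed figures.

```python
# Approximate figures quoted in the training description.
steps = 15_000
hours = 6
num_gpus = 2
total_cost_usd = 10

gpu_hours = num_gpus * hours                    # total GPU time consumed
seconds_per_step = hours * 3600 / steps         # wall-clock time per step
cost_per_gpu_hour = total_cost_usd / gpu_hours  # implied hourly rate per GPU

print(f"GPU-hours:          {gpu_hours}")
print(f"Seconds per step:   {seconds_per_step:.2f}")
print(f"USD per GPU-hour:   {cost_per_gpu_hour:.2f}")
```

Under these assumptions the run used about 12 GPU-hours, at roughly 1.4 seconds per step and a little under $1 per GPU-hour.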
Guide: Running Locally
To use the Pokémon-Stable-Diffusion model locally, follow these steps:
- Setup: Clone the Stable Diffusion repository and ensure you have the necessary dependencies installed.
- Download Checkpoint: Obtain the ema-only-epoch=000142.ckpt checkpoint file.
- Run Script: Execute the following command:
  python scripts/txt2img.py \
    --prompt 'robotic cat with wings' \
    --outdir 'outputs/generated_pokemon' \
    --H 512 --W 512 \
    --n_samples 4 \
    --config 'configs/stable-diffusion/pokemon.yaml' \
    --ckpt ema-only-epoch=000142.ckpt
- Configuration: If you prefer, you can pass the standard Stable Diffusion inference configuration to --config instead of configs/stable-diffusion/pokemon.yaml.
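The steps above can also be driven from Python, which avoids shell quoting issues when prompts contain quotes or spaces. This is a sketch only: build_txt2img_command and run_txt2img are hypothetical helpers, the flags are the ones shown in the command above, and repo_dir is assumed to point at your clone of the Stable Diffusion repository with the checkpoint placed inside it.

```python
import shlex
import subprocess
import sys


def build_txt2img_command(
    prompt,
    ckpt="ema-only-epoch=000142.ckpt",
    config="configs/stable-diffusion/pokemon.yaml",
    outdir="outputs/generated_pokemon",
    height=512,
    width=512,
    n_samples=4,
):
    """Return the txt2img invocation as an argument list (no shell involved)."""
    return [
        sys.executable, "scripts/txt2img.py",
        "--prompt", prompt,
        "--outdir", outdir,
        "--H", str(height),
        "--W", str(width),
        "--n_samples", str(n_samples),
        "--config", config,
        "--ckpt", ckpt,
    ]


def run_txt2img(prompt, repo_dir=".", **options):
    """Run the script from the repository root and raise if it fails."""
    command = build_txt2img_command(prompt, **options)
    print("Running:", shlex.join(command))
    subprocess.run(command, cwd=repo_dir, check=True)
```

For example, run_txt2img("robotic cat with wings", repo_dir="stable-diffusion") would reproduce the command above, where "stable-diffusion" is a placeholder for wherever the repository was cloned.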
For best performance, run inference on a GPU, for example a cloud GPU such as those provided by Lambda Labs.
License
The Pokémon-Stable-Diffusion model and associated resources are available under licenses specified by the creators and hosting platforms. Users should review these licenses to ensure compliance with usage terms.