Zero Diffusion
drhead
Introduction
ZeroDiffusion is a text-to-image model developed to serve as a robust training base for other models and to facilitate research utilizing zero terminal Signal-to-Noise Ratio (SNR). It includes both a base model and an inpainting variant.
Architecture
- ZeroDiffusion-Base v0.9: A foundational model trained on approximately 20 million samples with zero terminal SNR.
- ZeroDiffusion-Inpaint v0.9: An experimental fine-tuned version of the stable-diffusion-inpainting model, derived from a merge of ZeroDiffusion 0.9.
Training
The models were trained through Google's TPU Research Cloud program with a zero terminal SNR noise schedule. The goal is to give researchers a clean base model for further exploration and adaptation.
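A noise schedule has zero terminal SNR when the last timestep is pure noise, that is, when the cumulative signal coefficient reaches exactly zero at the final step. The sketch below shows one well-known way to obtain such a schedule: shift and scale the square root of the cumulative alphas so the last value becomes zero while the first is preserved. This is an illustration only. The source does not state which procedure ZeroDiffusion used, and the starting schedule (1000 steps, betas from 0.00085 to 0.012, linear in sqrt(beta)) and all function names are assumptions chosen for the example.

```python
import math

def scaled_linear_betas(num_steps=1000, beta_start=0.00085, beta_end=0.012):
    """Assumed starting schedule: betas spaced linearly in sqrt(beta)."""
    lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
    return [(lo + (hi - lo) * i / (num_steps - 1)) ** 2 for i in range(num_steps)]

def sqrt_alphas_cumprod(betas):
    """Return sqrt of the running product of (1 - beta) for each timestep."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(math.sqrt(running))
    return out

def rescale_zero_terminal_snr(betas):
    """Rescale betas so the final timestep carries no signal (SNR = 0)."""
    s = sqrt_alphas_cumprod(betas)
    first, last = s[0], s[-1]
    # Shift so the last value is zero, then scale so the first value is unchanged.
    shifted = [(v - last) * first / (first - last) for v in s]
    abar = [v * v for v in shifted]
    # Recover per-step alphas from the cumulative product, then convert to betas.
    alphas = [abar[0]] + [abar[i] / abar[i - 1] for i in range(1, len(abar))]
    return [1.0 - a for a in alphas]

def terminal_snr(betas):
    """SNR at the last timestep: abar / (1 - abar)."""
    abar = sqrt_alphas_cumprod(betas)[-1] ** 2
    return abar / (1.0 - abar)

original = scaled_linear_betas()
rescaled = rescale_zero_terminal_snr(original)
print(terminal_snr(original))  # small but nonzero
print(terminal_snr(rescaled))  # 0.0
```

After rescaling, the final beta equals 1, so the last timestep contains no trace of the image. Predicting the noise at that step tells the model nothing about the image, which is why zero terminal SNR models are typically paired with v-prediction.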
Guide: Running Locally
- Model Setup:
  - Download the ZeroDiffusion model files.
  - Obtain the corresponding YAML configuration file and place it in the same directory as the model. (A1111's webui picks up a config that shares the checkpoint's base filename.)
- Environment Requirements:
  - Install the necessary dependencies and configure the environment, for example with A1111's webui or a similar interface.
  - Enable CFG rescale using the plugin available at CFG_Rescale_webui.
- Execution:
  - Make sure the web UI runs the model in v-prediction mode for optimal performance.
- Hardware Suggestion:
  - Use cloud GPUs, such as those offered by Google Cloud or AWS, for efficient model execution and training.
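For context on the CFG rescale step mentioned above, the following is a minimal sketch of the idea as it is usually described: apply ordinary classifier-free guidance, scale the result so its standard deviation matches the conditional prediction, then blend the scaled and unscaled results. It is not the plugin's code. The function name `cfg_rescale`, the flat-list representation of one sample's prediction, and the default values are assumptions made for illustration.

```python
import statistics

def cfg_rescale(cond, uncond, guidance_scale=7.5, rescale=0.7):
    """Classifier-free guidance with std rescaling on one flattened sample.

    cond / uncond: model predictions with and without the prompt (flat lists).
    rescale: blend factor; 0.0 gives plain CFG, 1.0 gives fully rescaled output.
    Defaults are illustrative values, not settings taken from the model card.
    """
    # Ordinary classifier-free guidance.
    guided = [u + guidance_scale * (c - u) for c, u in zip(cond, uncond)]
    # Scale the guided prediction so its std matches the conditional prediction.
    factor = statistics.pstdev(cond) / statistics.pstdev(guided)
    rescaled = [g * factor for g in guided]
    # Blend rescaled and plain guided predictions.
    return [rescale * r + (1.0 - rescale) * g for r, g in zip(rescaled, guided)]

cond = [0.2, -0.5, 1.0, 0.3]
uncond = [0.1, -0.2, 0.4, 0.0]
print(cfg_rescale(cond, uncond))
```

In a real pipeline the standard deviation would be computed per sample over the channel and spatial dimensions of a tensor; the flat list here stands in for a single sample.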
License
The ZeroDiffusion models are released under the CreativeML OpenRAIL-M license.