Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1
IDEA-CCNL
Introduction
Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1 is the first open-source bilingual Chinese and English Stable Diffusion model. It was trained on 20 million filtered Chinese image-text pairs and generates images from Chinese and English text prompts.
Architecture
The model is trained on the Noah-Wukong and Zero datasets in two stages. In the first stage, only the text encoder is trained while the rest of the model stays frozen, which preserves the existing generative capability while aligning Chinese concepts. In the second stage, the entire model is unfrozen, and the text encoder and diffusion model are trained together so that generation follows Chinese text guidance more closely.
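The freeze-then-unfreeze schedule described above can be sketched in PyTorch as follows. This is only an illustration of the pattern, not the authors' training code: the two `nn.Linear` modules are tiny hypothetical stand-ins for the real text encoder and diffusion model, and the optimizer choice and learning rate are assumptions.

```python
import torch
from torch import nn


def set_trainable(module: nn.Module, trainable: bool) -> None:
    """Enable or disable gradient updates for every parameter in a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def trainable_parameters(*modules: nn.Module) -> list:
    """Collect the parameters that an optimizer should update."""
    return [p for m in modules for p in m.parameters() if p.requires_grad]


# Hypothetical stand-ins: only the freezing pattern matters here.
text_encoder = nn.Linear(8, 8)
diffusion_model = nn.Linear(8, 8)

# Stage 1: train the text encoder only; the diffusion model stays frozen.
set_trainable(text_encoder, True)
set_trainable(diffusion_model, False)
stage1_params = trainable_parameters(text_encoder, diffusion_model)
stage1_optimizer = torch.optim.AdamW(stage1_params, lr=1e-4)  # assumed settings

# Stage 2: unfreeze everything and train both components jointly.
set_trainable(diffusion_model, True)
stage2_params = trainable_parameters(text_encoder, diffusion_model)
stage2_optimizer = torch.optim.AdamW(stage2_params, lr=1e-4)  # assumed settings
```

Building a fresh optimizer for stage 2 is one simple way to make sure the newly unfrozen parameters are actually updated.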
Training
Training was conducted over two stages:
- Stage 1: Only the text encoder was trained for 80 hours using 8 x A100 GPUs.
- Stage 2: Both the text encoder and diffusion model were trained for 100 hours, also on 8 x A100 GPUs.

The training data consisted of image-text pairs from the Noah-Wukong and Zero datasets with a CLIP Score greater than 0.2.
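The CLIP Score filter mentioned above amounts to a simple threshold on a per-pair similarity score. The sketch below assumes the scores have already been computed by some CLIP model; the `ImageTextPair` type, the file names, and the sample scores are hypothetical and only illustrate the filtering rule.

```python
from typing import Iterable, List, NamedTuple


class ImageTextPair(NamedTuple):
    image_path: str
    caption: str
    clip_score: float  # assumed to be precomputed by a CLIP model


def filter_by_clip_score(
    pairs: Iterable[ImageTextPair], threshold: float = 0.2
) -> List[ImageTextPair]:
    """Keep only pairs whose CLIP score is strictly greater than the threshold."""
    return [pair for pair in pairs if pair.clip_score > threshold]


# Made-up samples for illustration.
samples = [
    ImageTextPair("a.jpg", "小桥流水人家", 0.31),
    ImageTextPair("b.jpg", "unrelated caption", 0.12),
    ImageTextPair("c.jpg", "borderline caption", 0.20),
]

kept = filter_by_clip_score(samples)
print([pair.image_path for pair in kept])  # prints ['a.jpg']
```

The comparison is strict, so a pair scoring exactly 0.2 is dropped, matching the "greater than 0.2" wording.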
Guide: Running Locally
Basic Steps
- Install Dependencies:
    pip install torch diffusers transformers
- Full Precision Usage:
    from diffusers import StableDiffusionPipeline

    # Load the pipeline in full precision and move it to the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1"
    ).to("cuda")

    # Prompts may mix Chinese and English.
    prompt = '小桥流水人家,Van Gogh style'
    image = pipe(prompt, guidance_scale=10).images[0]
    image.save("小桥.png")
- Half Precision Usage (FP16):
    import torch
    from diffusers import StableDiffusionPipeline

    # Let cuDNN pick the fastest convolution algorithms for fixed input sizes.
    torch.backends.cudnn.benchmark = True

    # Load the weights in FP16 and move the pipeline to the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "IDEA-CCNL/Taiyi-Stable-Diffusion-1B-Chinese-EN-v0.1",
        torch_dtype=torch.float16,
    )
    pipe.to("cuda")

    prompt = '小桥流水人家,Van Gogh style'
    image = pipe(prompt, guidance_scale=10.0).images[0]
    image.save("小桥.png")
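To give a rough sense of why half precision helps, the sketch below estimates the memory taken by the weights alone. It assumes about one billion parameters, taken from the "1B" in the model name rather than from a measured count, and it ignores activations, the CUDA context, and other runtime overhead, so actual GPU usage will be higher.

```python
def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Approximate memory used by model weights, in GiB."""
    return num_params * bytes_per_param / 1024**3


NUM_PARAMS = 1_000_000_000  # assumption based on "1B" in the model name

fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter
fp16_gib = weight_memory_gib(NUM_PARAMS, 2)  # float16: 2 bytes per parameter

print(f"FP32 weights: {fp32_gib:.2f} GiB")  # prints FP32 weights: 3.73 GiB
print(f"FP16 weights: {fp16_gib:.2f} GiB")  # prints FP16 weights: 1.86 GiB
```

Under these assumptions, loading with `torch_dtype=torch.float16` roughly halves the weight footprint.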
Cloud GPUs
For best performance, use a cloud GPU instance such as an NVIDIA A100, available on major cloud platforms like AWS, Google Cloud, or Azure.
License
The model is licensed under the CreativeML OpenRAIL-M license. Users may use and distribute the model, provided they do not produce or distribute illegal or harmful content. Users are responsible for complying with the license terms, which require that any redistribution carry the same use restrictions and include a copy of the license. See the full CreativeML OpenRAIL-M license text for details.