Mo Di Diffusion

nitrosocke

Introduction

Mo Di Diffusion is a fine-tuned version of the Stable Diffusion 1.5 model, trained on screenshots from a popular animation studio. To generate images in this style, include the tokens "modern disney style" in the prompt. The model is particularly suited to characters, animals, and landscapes.
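
Because the style depends on the tokens being present in the prompt, a small helper can make sure they are never forgotten. The following is a minimal sketch; `STYLE_TOKEN` and `with_style` are hypothetical names chosen for illustration and are not part of the model or of any library.

```python
# Hypothetical helper: appends the style tokens to a prompt if they are missing.
STYLE_TOKEN = "modern disney style"


def with_style(prompt: str, token: str = STYLE_TOKEN) -> str:
    """Return the prompt with the style tokens appended exactly once."""
    # Normalize surrounding whitespace and a trailing comma.
    cleaned = prompt.strip().rstrip(",").strip()
    if not cleaned:
        return token
    # Leave the prompt alone if the tokens are already there (case-insensitive).
    if token.lower() in cleaned.lower():
        return cleaned
    return f"{cleaned}, {token}"


print(with_style("a magical princess with golden hair"))
# prints: a magical princess with golden hair, modern disney style
```

The returned string can be passed to the pipeline in place of a hand-written prompt.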

Architecture

The model is based on the Stable Diffusion framework and was trained with the diffusers-based DreamBooth method. Training used prior-preservation loss together with the train-text-encoder flag and ran for 9,000 steps.
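
To illustrate what prior-preservation loss means, the sketch below combines a loss on the style (instance) images with a weighted loss on generic class images, which discourages the model from forgetting what it knew before fine-tuning. This is a simplified pure-Python illustration, not the author's training code: the function names are hypothetical, the inputs are flat lists rather than tensors, and the weight of 1.0 is an assumed default, since the value used for this model is not stated.

```python
from typing import Sequence


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between two equally sized sequences."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def dreambooth_loss(
    instance_pred: Sequence[float],
    instance_target: Sequence[float],
    prior_pred: Sequence[float],
    prior_target: Sequence[float],
    prior_loss_weight: float = 1.0,  # assumed value, not taken from the source
) -> float:
    """Instance loss plus weighted prior-preservation loss."""
    # Loss on the new style images the model is being taught.
    instance_loss = mse(instance_pred, instance_target)
    # Loss on generic class images, which preserves the model's prior.
    prior_loss = mse(prior_pred, prior_target)
    return instance_loss + prior_loss_weight * prior_loss


print(dreambooth_loss([1.0, 2.0], [1.0, 4.0], [0.0], [1.0], prior_loss_weight=0.5))
# prints: 2.5
```

In a real training run the predictions and targets would be noise tensors produced by the diffusion model, and the loss would be computed per batch.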

Training

Mo Di Diffusion was trained using the diffusers library and can be used like any other Stable Diffusion model. It can be exported to ONNX or FLAX/JAX, and it can run on Apple's MPS backend. The model has been fine-tuned to produce images in the style of its training data.

Guide: Running Locally

To use the model locally:

  1. Install the necessary libraries: Ensure you have Python, PyTorch, and the diffusers and transformers libraries installed (for example, with pip install diffusers transformers torch).

  2. Load the model:

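    from diffusers import StableDiffusionPipeline
    import torch
    
    # Download the weights from the Hugging Face Hub on first use
    model_id = "nitrosocke/mo-di-diffusion"
    # Half precision reduces GPU memory use
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Move the pipeline to the GPU (requires a CUDA-capable device)
    pipe = pipe.to("cuda")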
    
  3. Generate an image:

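    # Include the style tokens at the end of the prompt
    prompt = "a magical princess with golden hair, modern disney style"
    # The pipeline returns a list of PIL images; take the first one
    image = pipe(prompt).images[0]
    image.save("./magical_princess.png")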
    
  4. Consider using cloud GPUs: The example above requires a CUDA-capable GPU. If you do not have one locally, GPU instances from cloud providers such as AWS, GCP, or Azure are an option.
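
The loading step above hard-codes "cuda" and half precision, which fails on machines without an NVIDIA GPU. The sketch below shows one way to choose a device and precision at runtime. `choose_device` is a hypothetical helper, and falling back to float32 on MPS and CPU is a conservative assumption rather than a requirement of the model.

```python
from typing import Tuple


def choose_device(cuda_available: bool, mps_available: bool) -> Tuple[str, str]:
    """Return a (device, dtype name) pair for loading the pipeline."""
    if cuda_available:
        # Half precision saves memory on NVIDIA GPUs.
        return "cuda", "float16"
    if mps_available:
        # Apple Silicon; float32 is the conservative choice assumed here.
        return "mps", "float32"
    # CPU inference works but is slow; half precision is poorly supported there.
    return "cpu", "float32"


try:
    import torch

    cuda = torch.cuda.is_available()
    # torch.backends.mps is absent in older PyTorch releases, so guard the lookup.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = bool(mps_backend is not None and mps_backend.is_available())
except ImportError:
    # Without PyTorch installed, report the CPU fallback only.
    cuda = mps = False

device, dtype_name = choose_device(cuda, mps)
print(f"device={device}, dtype={dtype_name}")
```

With PyTorch installed, the result can be applied as torch_dtype=getattr(torch, dtype_name) in from_pretrained and as the argument to pipe.to(device).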

License

The model is available under the CreativeML OpenRAIL-M license, which specifies:

  1. The model cannot be used to produce or disseminate illegal or harmful content.
  2. The authors do not claim rights to the generated outputs; however, users are responsible for adhering to the license terms.
  3. Redistribution and commercial use of the model weights are permitted, provided that the same usage restrictions are applied, and a copy of the CreativeML OpenRAIL-M license is shared with users. For more information, read the full license.
