Spider-Verse Diffusion

nitrosocke

Introduction

Spider-Verse Diffusion is a fine-tuned Stable Diffusion model trained on movie stills from Sony's Into the Spider-Verse. Including the token "spiderverse style" in a prompt applies the film's look to text-to-image generations.

Architecture

The model is built on the Stable Diffusion framework and distributed in the diffusers format. It can be exported to ONNX and FLAX/JAX, and it also runs on Apple's MPS backend, allowing for flexible integration into various workflows.
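For example, the checkpoint can run on Apple Silicon by moving the standard pipeline to the MPS device. A minimal sketch (this assumes a PyTorch build with MPS support; the example prompt is illustrative):

    from diffusers import StableDiffusionPipeline

    # Load the standard PyTorch weights, then move the pipeline to
    # Apple's Metal Performance Shaders (MPS) backend instead of CUDA.
    pipe = StableDiffusionPipeline.from_pretrained("nitrosocke/spider-verse-diffusion")
    pipe = pipe.to("mps")

    image = pipe("a city street at night, spiderverse style").images[0]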

Training

The model was fine-tuned with diffusers-based DreamBooth training and prior-preservation loss for 3,000 steps, using movie stills from the film as the instance images; a representative training invocation is sketched below.
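The exact training configuration is not published beyond the step count and the use of prior preservation. As a hedged sketch, a comparable run with the diffusers DreamBooth example script (train_dreambooth.py) might look like the following; the base checkpoint, prompts, data paths, and remaining hyperparameters are illustrative assumptions, not the author's settings:

    # Illustrative only: base model, prompts, paths, and most
    # hyperparameters are assumptions, not the published configuration.
    accelerate launch train_dreambooth.py \
      --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
      --instance_data_dir="./spiderverse-stills" \
      --class_data_dir="./class-images" \
      --output_dir="./spider-verse-diffusion" \
      --with_prior_preservation --prior_loss_weight=1.0 \
      --instance_prompt="artwork in spiderverse style" \
      --class_prompt="artwork" \
      --resolution=512 \
      --train_batch_size=1 \
      --learning_rate=5e-6 \
      --lr_scheduler="constant" \
      --num_class_images=200 \
      --max_train_steps=3000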

Guide: Running Locally

  1. Install Required Libraries:

    pip install diffusers transformers scipy torch
    
  2. Load and Run the Model:

    from diffusers import StableDiffusionPipeline
    import torch

    # Load the fine-tuned checkpoint in half precision to reduce GPU memory use
    model_id = "nitrosocke/spider-verse-diffusion"
    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    pipe = pipe.to("cuda")

    # Include the trained token "spiderverse style" in the prompt to invoke the style
    prompt = "a magical princess with golden hair, spiderverse style"
    image = pipe(prompt).images[0]

    image.save("./magical_princess.png")
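
  The pipeline call also accepts the standard Stable Diffusion parameters. For instance, a seeded generator makes results reproducible, while num_inference_steps and guidance_scale trade speed against prompt adherence (the seed and values below are arbitrary examples):

    # Optional: fix the seed and tune sampling settings
    generator = torch.Generator("cuda").manual_seed(42)
    image = pipe(
        prompt,
        num_inference_steps=50,
        guidance_scale=7.5,
        generator=generator,
    ).images[0]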
    
  3. Hardware Recommendations: A GPU is recommended for practical generation speeds. If local hardware is insufficient, GPU instances from cloud providers such as AWS, GCP, or Azure are an option; a memory-saving setting for smaller GPUs is sketched below.
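
If GPU memory is limited, diffusers can reduce peak VRAM usage through attention slicing, at a small cost in speed. Enable it once after loading the pipeline:

    # Trade some speed for a lower peak memory footprint
    pipe.enable_attention_slicing()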

License

The model is available under the CreativeML OpenRAIL-M license. Key points include:

  • The model may not be used to deliberately generate or share illegal or harmful content.
  • The authors claim no rights over the outputs; users are responsible for how they use them.
  • Redistribution and commercial use are permitted, provided the same license and usage restrictions are passed on to downstream users. Full details are set out in the CreativeML OpenRAIL-M license text.
