Juggernaut XL v9

RunDiffusion

Introduction

Juggernaut-XL-V9 is a text-to-image model developed by RunDiffusion. It builds on Stable Diffusion XL to produce high-quality photorealistic images and specializes in photography styles such as cinematic, wildlife, and architectural photography.

Architecture

The model is based on stable-diffusion-xl-base-1.0 and can be used through the diffusers library. It improves skin detail, lighting, and contrast, building on the advancements made in the RunDiffusion Photo Model V2.

Training

Version 9 of Juggernaut-XL was trained with a focus on refining skin details, lighting, and overall image contrast. The model's photographic output has been enhanced significantly, thanks to updates in the RunDiffusion Photo Model. A complete retraining of the base set is planned for Version 10, with the aim of improving captioning quality using GPT-4 Vision.

Guide: Running Locally

  1. Requirements: Ensure you have the diffusers library installed.
  2. Setup: Clone the model from the Hugging Face repository.
  3. Execution:
    • Set resolution to 832x1216.
    • Use the DPM++ 2M Karras sampler with 30-40 steps.
    • Set the CFG scale between 3 and 7 for realism.
    • The VAE is already baked in, so no separate VAE is needed. For a high-resolution pass, use 15 steps with a denoise value of 0.3.
  4. Cloud GPUs: Consider using cloud services like AWS EC2 or Google Cloud for access to GPUs, which can expedite model training and inference.
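
The settings above can be sketched with the diffusers library roughly as follows. This is a minimal sketch, not the authors' reference code: the repository id `RunDiffusion/Juggernaut-XL-v9`, the helper names `generation_kwargs` and `generate`, the default of 35 steps and CFG 5.0, and the output path are all assumptions made for illustration. In diffusers, the usual counterpart to the DPM++ 2M Karras sampler is `DPMSolverMultistepScheduler` with `use_karras_sigmas=True`.

```python
# Sketch only: repo id, helper names, and defaults are assumptions for illustration.
MODEL_ID = "RunDiffusion/Juggernaut-XL-v9"  # assumed Hugging Face repo id


def generation_kwargs(prompt, steps=35, cfg=5.0):
    """Collect the settings recommended in the guide as pipeline keyword arguments."""
    if not 30 <= steps <= 40:
        raise ValueError("the guide recommends 30-40 steps")
    if not 3 <= cfg <= 7:
        raise ValueError("the guide recommends a CFG scale of 3-7")
    return {
        "prompt": prompt,
        "width": 832,    # portrait resolution from the guide
        "height": 1216,
        "num_inference_steps": steps,
        "guidance_scale": cfg,
    }


def generate(prompt, out_path="juggernaut.png"):
    """Load the model, apply the sampler settings, and save one image."""
    # Heavy imports live here so generation_kwargs works without torch installed.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16
    )
    # Multistep DPM-Solver with Karras sigmas stands in for "DPM++ 2M Karras".
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(
        pipe.scheduler.config, use_karras_sigmas=True
    )
    pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available
    image = pipe(**generation_kwargs(prompt)).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("cinematic photo of a lighthouse at dusk")` (a hypothetical prompt) would download the weights on first use and write a single 832x1216 image. The high-resolution pass is not covered here; it would need a separate image-to-image step.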

License

The model is distributed under the creativeml-openrail-m license, which allows for non-commercial use with certain conditions. For commercial licensing, contact juggernaut@rundiffusion.com.
