Introduction

CosXL is a text-to-image model developed by Stability AI and distributed on Hugging Face. It comes in two variants: a base model for image generation and an Edit model for prompt-driven image editing. It is intended primarily for non-commercial research.

Architecture

Cos Stable Diffusion XL 1.0 Base is tuned to use a Cosine-Continuous EDM VPred schedule. This schedule lets the model produce a wide color range, from black to white, and improves how the image changes from one sampling step to the next. The Edit version of the model takes a source image together with a prompt and edits the image according to that prompt.
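
To make the terms "cosine schedule" and "VPred" (v-prediction) more concrete, here is a minimal sketch of the generic textbook formulation. It is not the exact CosXL parameterization, which this summary does not specify, and all function names are illustrative. With a cosine schedule the signal weight reaches zero at the final timestep, a property commonly associated with being able to generate very dark or very bright images.

```python
import math

def cosine_alpha_sigma(t):
    """Signal weight (alpha) and noise weight (sigma) at time t in [0, 1].

    Generic cosine schedule: alpha^2 + sigma^2 == 1 for every t.
    Assumption: this is the textbook form, not CosXL's exact schedule.
    """
    angle = 0.5 * math.pi * t
    return math.cos(angle), math.sin(angle)

def add_noise(x0, eps, t):
    """Mix a clean value x0 with noise eps at time t."""
    alpha, sigma = cosine_alpha_sigma(t)
    return alpha * x0 + sigma * eps

def v_target(x0, eps, t):
    """v-prediction target: v = alpha * eps - sigma * x0."""
    alpha, sigma = cosine_alpha_sigma(t)
    return alpha * eps - sigma * x0

def x0_from_v(x_t, v, t):
    """Recover the clean value from the noisy value and v."""
    alpha, sigma = cosine_alpha_sigma(t)
    return alpha * x_t - sigma * v

# Round trip on a single scalar "pixel" (illustrative values).
x0, eps, t = 0.8, -0.3, 0.4
x_t = add_noise(x0, eps, t)
v = v_target(x0, eps, t)
recovered = x0_from_v(x_t, v, t)  # matches x0 up to floating-point error
```

At t = 1 the signal weight alpha is zero, so the fully noised sample carries no trace of the original image, including its overall brightness.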

Training

The model is fine-tuned to use the Cosine-Continuous EDM VPred schedule, which yields more refined step-to-step image transitions and better color accuracy. The same tuning underlies both the base model and the Edit model.

Guide: Running Locally

  1. Requirements: Install Python and the dependencies required by the interface you choose in step 2.
  2. Installation:
    • Clone StableSwarmUI for inference: https://github.com/Stability-AI/StableSwarmUI.
    • Alternatively, use ComfyUI for running checkpoints: https://github.com/comfyanonymous/ComfyUI.
  3. Execution:
    • Use StableSwarmUI or ComfyUI to load the CosXL checkpoint and run inference or edits.
    • Refer to ComfyUI examples for edit model usage: https://comfyanonymous.github.io/ComfyUI_examples/edit_models/.
  4. Hardware: For best performance, run the model on a GPU, for example a cloud GPU from a provider such as AWS, Google Cloud, or Azure.
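
Before launching ComfyUI, it can help to confirm that the checkpoint files are where the loader looks for them. The sketch below assumes a standard ComfyUI checkout, which reads checkpoints from models/checkpoints under its root folder. The two file names are assumptions and should be checked against the model page; the helper names are hypothetical.

```python
from pathlib import Path

# Assumed checkpoint file names; verify them against the model page.
ASSUMED_CHECKPOINTS = ("cosxl.safetensors", "cosxl_edit.safetensors")

def checkpoint_dir(comfy_root):
    """Folder a standard ComfyUI checkout scans for checkpoints."""
    return Path(comfy_root) / "models" / "checkpoints"

def missing_checkpoints(comfy_root, names=ASSUMED_CHECKPOINTS):
    """Return the checkpoint file names not yet present in the folder."""
    folder = checkpoint_dir(comfy_root)
    return [name for name in names if not (folder / name).is_file()]

# "ComfyUI" is a placeholder for the path of your local clone.
for name in missing_checkpoints("ComfyUI"):
    print(f"missing: {checkpoint_dir('ComfyUI') / name}")
```

Any file reported as missing needs to be downloaded and placed in that folder before the checkpoint loader can see it.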

License

The model is licensed under the Stability AI Non-Commercial Research Community License. It permits use for non-commercial research purposes only and restricts direct consumer or production use. Redistribution and derivative works are allowed under specific conditions outlined in the license agreement. The license also disclaims warranties and limits liability. For commercial inquiries, contact Stability AI.
