pixel art xl

nerijs

Introduction

Pixel Art XL is a text-to-image model for generating pixel art with Stable Diffusion XL. It is distributed as a LoRA weights file and can be stacked with other adapters, such as LCM-LoRA for few-step sampling, as the local guide in this document shows.

Architecture

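Pixel Art XL is built on the stabilityai/stable-diffusion-xl-base-1.0 model and is applied to it as a LoRA (Low-Rank Adaptation) adapter rather than as a full set of replacement weights. The example in this document loads the base model through DiffusionPipeline from the diffusers library, replaces the default scheduler with LCMScheduler, and combines the pixel art adapter with the LCM-LoRA adapter so that images can be sampled in a small number of steps.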
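
To make the idea of a low-rank adapter concrete, here is a minimal sketch in plain Python. It assumes the simplified form W' = W + scale * (B x A), where B and A are thin matrices; real implementations also fold in a rank-dependent factor, and the helper names matmul and lora_merge are hypothetical, not part of diffusers.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(W, B, A, scale=1.0):
    """Return W + scale * (B @ A), a simplified view of applying a LoRA adapter."""
    delta = matmul(B, A)  # low-rank update with the same shape as W
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


# Toy 3x3 base weight and a rank-1 adapter (illustrative values only).
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
B = [[1.0], [0.0], [2.0]]  # shape 3x1
A = [[0.5, 0.0, 1.0]]      # shape 1x3

# scale plays the role of the adapter weight (1.2 for the "pixel" adapter in the guide).
merged = lora_merge(W, B, A, scale=1.2)
for row in merged:
    print(row)
```

A scale of 0 leaves the base weights unchanged, which is why lowering the adapter weight weakens the pixel art style.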

Training

Two practices help produce clean results. Use a fixed Variational Autoencoder (VAE) to avoid image artifacts, and downscale the generated images by a factor of 8 with nearest-neighbor resampling to obtain pixel-perfect output. Output quality also depends on the LoRA adapter strength and the guidance scale; the local guide in this document uses an adapter weight of 1.2 for the pixel art LoRA and a guidance scale of 1.5.
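
The downscaling step can be sketched with Pillow as follows. This is an illustration under assumptions: the helper name downscale_pixel_art is hypothetical, and a solid-color 1024x1024 image stands in for a generated one.

```python
from PIL import Image


def downscale_pixel_art(img, factor=8):
    """Shrink an image by an integer factor using nearest-neighbor sampling."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    new_size = (max(1, img.width // factor), max(1, img.height // factor))
    # NEAREST copies source pixels without blending, which keeps edges hard.
    return img.resize(new_size, resample=Image.NEAREST)


# Stand-in for a generated 1024x1024 image.
generated = Image.new("RGB", (1024, 1024), color=(200, 80, 40))
small = downscale_pixel_art(generated)
print(small.size)  # (128, 128)
```

Nearest-neighbor resampling is used because smoother filters such as bilinear or bicubic would average neighboring pixels and blur the blocky look.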

Guide: Running Locally

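  1. Install Required Libraries: In a Python environment, install PyTorch and diffusers, along with transformers (used by the SDXL text encoders) and peft (needed for named LoRA adapters and set_adapters).

    pip install torch diffusers transformers peft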
    
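  2. Set Up the Model: Place pixel-art-xl.safetensors in the working directory, then use the following script to load the base model, attach both LoRA adapters, and generate images: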

    from diffusers import DiffusionPipeline, LCMScheduler
    import torch
    
    model_id = "stabilityai/stable-diffusion-xl-base-1.0"
    lcm_lora_id = "latent-consistency/lcm-lora-sdxl"
    
    # Load the SDXL base model (fp16 weight files) and switch to the LCM scheduler.
    pipe = DiffusionPipeline.from_pretrained(model_id, variant="fp16")
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    
    # Load both adapters under separate names so they can be weighted independently.
    pipe.load_lora_weights(lcm_lora_id, adapter_name="lora")
    pipe.load_lora_weights("./pixel-art-xl.safetensors", adapter_name="pixel")
    
    # LCM-LoRA at strength 1.0, pixel art LoRA at strength 1.2.
    pipe.set_adapters(["lora", "pixel"], adapter_weights=[1.0, 1.2])
    pipe.to(device="cuda", dtype=torch.float16)
    
    prompt = "pixel, a cute corgi"
    negative_prompt = "3d render, realistic"
    
    num_images = 9
    
    # LCM sampling uses few steps and a low guidance scale.
    for i in range(num_images):
        img = pipe(
            prompt=prompt,
            negative_prompt=negative_prompt,
            num_inference_steps=8,
            guidance_scale=1.5,
        ).images[0]
        
        img.save(f"lcm_lora_{i}.png")
    
  3. Cloud GPU Recommendation: The script above moves the pipeline to a CUDA device in float16, so it needs an NVIDIA GPU. If you do not have one locally, consider a GPU instance from a cloud provider such as AWS, Google Cloud, or Azure.
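
Before running the script, it can help to check which device PyTorch can actually see. The following is a minimal sketch; the function name pick_device is hypothetical, and falling back to "mps" or "cpu" is an assumption for illustration, since the guide itself only targets "cuda".

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so nothing can run on an accelerator.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # Older PyTorch builds have no MPS backend, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(f"Selected device: {device}")
```

If the result is not "cuda", the pipe.to(device="cuda", ...) call in the guide will fail, and running SDXL on a CPU is likely to be very slow.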

License

Pixel Art XL is released under the CreativeML Open RAIL-M license, which permits open access and use subject to conditions on ethical use and redistribution.
