MistoLine

Introduction

MistoLine is a versatile SDXL-ControlNet model designed for adaptable line art conditioning. It can process various line art inputs, such as hand-drawn sketches and model-generated outlines, and generate high-quality images with improved detail restoration and stability. By leveraging the Anyline preprocessing algorithm and retraining with ControlNet's architecture, MistoLine offers robust performance across diverse line art conditions without needing different ControlNet models for different preprocessors.

Architecture

MistoLine is built on the ControlNet architecture, which adds conditional control to text-to-image diffusion models. Keeping the design consistent with earlier ControlNet models lets it process many types of line art accurately.
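
The idea of adding conditional control can be illustrated with a toy, framework-free sketch. It is not MistoLine's implementation: every name below (base_block, control_branch, controlled_block) is hypothetical, and plain lists of floats stand in for feature maps. It only shows the general ControlNet pattern, in which a trainable branch sees the line-art hint and its output is added to the frozen base model's output as a scaled residual. A zero-initialised connection leaves the base model unchanged at the start of training.

```python
# Toy illustration of ControlNet-style conditioning (all names are hypothetical).

def base_block(x):
    """Stand-in for a frozen block of the base diffusion model."""
    return [2.0 * v + 1.0 for v in x]

def control_branch(x, hint, weight):
    """Stand-in for the trainable branch, which also sees the line-art hint."""
    return [weight * (v + h) for v, h in zip(x, hint)]

def controlled_block(x, hint, weight, scale=1.0):
    """Base output plus the control residual, multiplied by a conditioning scale."""
    base = base_block(x)
    residual = control_branch(x, hint, weight)
    return [b + scale * r for b, r in zip(base, residual)]

x = [0.5, -1.0, 2.0]
hint = [1.0, 0.0, 1.0]

# With a zero-initialised weight the control branch contributes nothing,
# so the output equals the base model's output.
assert controlled_block(x, hint, weight=0.0) == base_block(x)

# With a non-zero weight the hint shifts the output; scale plays the role
# of a conditioning-strength setting.
print(controlled_block(x, hint, weight=0.1, scale=0.5))
```

The scale argument mirrors the role of controlnet_conditioning_scale in the diffusers pipeline call: lower values let the base model deviate more from the line-art hint.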

Training

MistoLine was trained by combining a new line preprocessing algorithm, Anyline, with retraining of the ControlNet model's UNet component. According to its authors, the process also drew on engineering improvements for large-model training, which they credit for better results in complex scenes.

Guide: Running Locally

To run MistoLine locally, follow these steps:

  1. Install Required Libraries (the script below also imports PyTorch, which is assumed to be installed already):

    pip install accelerate transformers safetensors opencv-python diffusers
    
  2. Run the Model: Use the following script to load the pipeline and generate an image:

    import cv2
    import numpy as np
    import torch
    from diffusers import AutoencoderKL, ControlNetModel, StableDiffusionXLControlNetPipeline
    from diffusers.utils import load_image
    from PIL import Image

    prompt = "aerial view, a futuristic research complex in a bright foggy jungle, hard lighting"
    negative_prompt = "low quality, bad quality, sketches"

    # Source image from which the line-art conditioning is derived
    image = load_image("https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/sd_controlnet/hf-logo.png")

    # How strongly the ControlNet output steers generation
    controlnet_conditioning_scale = 0.5

    # Load the MistoLine ControlNet weights in half precision
    controlnet = ControlNetModel.from_pretrained(
        "TheMistoAI/MistoLine",
        torch_dtype=torch.float16,
        variant="fp16",
    )
    # SDXL VAE variant that is safe to run in fp16
    vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
    pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        controlnet=controlnet,
        vae=vae,
        torch_dtype=torch.float16,
    )
    # Keep idle submodules on the CPU to reduce GPU memory use
    pipe.enable_model_cpu_offload()

    # Build the conditioning image: Canny edges replicated to three channels
    image = np.array(image)
    image = cv2.Canny(image, 100, 200)
    image = image[:, :, None]
    image = np.concatenate([image, image, image], axis=2)
    image = Image.fromarray(image)

    images = pipe(
        prompt,
        negative_prompt=negative_prompt,
        image=image,
        controlnet_conditioning_scale=controlnet_conditioning_scale,
    ).images

    images[0].save("hug_lab.png")
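
Before running the script, it can help to confirm that the packages from step 1 are importable. The following is a minimal sketch using only the standard library. The helper name missing_modules and the module list are assumptions made for illustration. The list uses import names rather than pip names (opencv-python is imported as cv2, Pillow as PIL) and includes torch, numpy, and PIL because the script imports them.

```python
from importlib.util import find_spec

# Import names used by the script above (assumed list; adjust to your setup).
REQUIRED_MODULES = [
    "torch", "diffusers", "transformers", "accelerate",
    "safetensors", "cv2", "numpy", "PIL",
]

def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only confirms that each module can be located. It does not verify versions or that a GPU is available.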
    

Cloud GPUs

Consider using cloud-based GPU services for enhanced performance and faster processing times when running MistoLine.

License

MistoLine is released under the OpenRAIL++ license. The license prohibits use for unlawful activities, harm to minors, spreading misinformation, privacy infringement, discrimination, and unauthorized medical advice. Commercial use requires proper attribution to TheMisto.ai and must not imply endorsement. For specific attribution guidance, contact info@themisto.ai.
