Introduction

Emi 3 (Ethereal Master of Illustration 3) is an image generation model developed by AI Picasso and based on Stable Diffusion 3.5 Large. It is designed for generating AI art, and its training data does not include unauthorized images from sources such as Danbooru.

Architecture

Emi 3 is a flow-based text-to-image generation model that uses components such as a Rectified Flow Transformer and OpenCLIP-ViT/G. It supports natural-language prompts of approximately 200 words.
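
To make the prompt-length figure concrete, here is a minimal sketch that checks a prompt against the roughly 200-word guideline before generation. The helper names and the `SOFT_WORD_LIMIT` constant are hypothetical and not part of any Emi 3 or Diffusers API. Whitespace word counting is only a rough proxy, since the text encoders work on tokens rather than words.

```python
# Hypothetical helpers: the names and the 200-word threshold are illustrative,
# taken from the "approximately 200 words" figure, not from any Emi 3 API.
SOFT_WORD_LIMIT = 200


def count_prompt_words(prompt: str) -> int:
    """Count whitespace-separated words in a prompt."""
    return len(prompt.split())


def is_within_soft_limit(prompt: str, limit: int = SOFT_WORD_LIMIT) -> bool:
    """Return True when the prompt has at most `limit` words."""
    return count_prompt_words(prompt) <= limit


if __name__ == "__main__":
    prompt = "anime style, 1girl, looking at viewer, serene expression"
    print(count_prompt_words(prompt), is_within_soft_limit(prompt))
```

A check like this is best treated as a warning rather than a hard error, because the real limit depends on how the prompt tokenizes.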

Training

The model was trained on a dataset similar to the one used for Stable Diffusion, excluding unauthorized images from Danbooru. Approximately 3,000 images were collected manually and around 400,000 were gathered automatically. Training took about 500 hours on hardware such as the A6000 and was carried out in Japan.

Guide: Running Locally

To run Emi 3 locally, you can use either ComfyUI or the Diffusers library. The steps below use Diffusers:

  1. Install the Diffusers library:
    pip install -U diffusers
    
  2. Generate images using the following script:
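    import torch
    from diffusers import StableDiffusion3Pipeline
    
    # Load the Emi 3 weights (fetched from the Hugging Face Hub on first use) in bfloat16.
    pipe = StableDiffusion3Pipeline.from_pretrained("aipicasso/emi-3", torch_dtype=torch.bfloat16)
    # Move the pipeline to the GPU; this requires a CUDA-capable device.
    pipe = pipe.to("cuda")
    
    # Run 40 inference steps with a guidance scale of 4.5 and keep the first image.
    image = pipe(
        "anime style, 1girl, looking at viewer, serene expression...",
        num_inference_steps=40,
        guidance_scale=4.5,
    ).images[0]
    # Write the result to disk as a PNG file.
    image.save("emi3.png")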
    
  3. Cloud GPUs: If your local GPU is not powerful enough, consider a cloud GPU provider such as AWS, GCP, or Azure.
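
When a local GPU is short on memory, one option to try before moving to the cloud is CPU offloading. The sketch below is a variant of the script in step 2 under stated assumptions: `load_emi3` and its `low_vram` flag are hypothetical names introduced here, while `enable_model_cpu_offload()` is the Diffusers pipeline method, which also requires the `accelerate` package. Whether this is enough for a given GPU is not something the source states.

```python
def load_emi3(low_vram: bool = False):
    """Load the Emi 3 pipeline (hypothetical helper, not part of Diffusers).

    With low_vram=True, idle submodules are kept on the CPU and moved to the
    GPU only while they run, trading generation speed for lower peak VRAM.
    """
    # Imported lazily so this module can be imported without the GPU stack.
    import torch
    from diffusers import StableDiffusion3Pipeline

    pipe = StableDiffusion3Pipeline.from_pretrained(
        "aipicasso/emi-3", torch_dtype=torch.bfloat16
    )
    if low_vram:
        # Needs `accelerate`; do not also call pipe.to("cuda") in this mode.
        pipe.enable_model_cpu_offload()
    else:
        pipe = pipe.to("cuda")
    return pipe
```

Calling `load_emi3(low_vram=True)` returns a pipeline that can be used exactly like `pipe` in step 2.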

License

The model is released under the Stability AI Community License. Users must also comply with applicable legal requirements, in particular Japanese law, including copyright law and the penal code.
