sdxl emoji

SvenN

Introduction

SDXL-EMOJI is a text-to-image model: a fine-tune of Stable Diffusion XL trained on Apple's emojis. It was trained with LoRA (Low-Rank Adaptation) and Pivotal Tuning, so it ships as a small set of adapter weights and token embeddings that are loaded on top of the base model.

Architecture

  • Base Model: StabilityAI's Stable Diffusion XL Base 1.0
  • Pivotal Tuning: Combines Dreambooth LoRA training with Textual Inversion.
  • Trigger Tokens: <s0><s1>
  • Textual Embeddings: Stored in embeddings.pti
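
To make the role of the trigger tokens more concrete, here is a toy sketch of the idea behind Textual Inversion: new tokens get their own embedding rows, and only those rows are meant to be trained. The vocabulary, the embedding size, and the `add_tokens` helper are all hypothetical and purely illustrative; they are not the format of embeddings.pti or the API of any library.

```python
import random

def add_tokens(vocab, embeddings, new_tokens, dim, seed=0):
    """Append one new embedding row per new token; existing rows are untouched.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    new_ids = []
    for token in new_tokens:
        if token in vocab:
            raise ValueError(f"token already in vocabulary: {token}")
        vocab[token] = len(embeddings)  # next free row index
        embeddings.append([rng.gauss(0.0, 0.02) for _ in range(dim)])
        new_ids.append(vocab[token])
    return new_ids

# Tiny made-up vocabulary with 4-dimensional embeddings.
vocab = {"a": 0, "emoji": 1, "of": 2, "man": 3}
embeddings = [[0.1 * i] * 4 for i in range(len(vocab))]

# Only the rows for <s0> and <s1> would be updated during training.
trainable_ids = add_tokens(vocab, embeddings, ["<s0>", "<s1>"], dim=4)
```

In the real model each of the two text encoders has its own embedding table, which is why the inference script passes both encoders and both tokenizers to the embeddings handler.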

Training

The model was trained with Pivotal Tuning, which teaches a pre-trained Stable Diffusion model a new concept by combining two techniques. Textual Inversion learns embeddings for new tokens (here <s0><s1>), and Dreambooth LoRA trains low-rank adapter weights on top of the frozen base model.
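
The low-rank part of LoRA can be illustrated with a small, dependency-free sketch. It assumes the simplified form W_eff = W + scale * (B @ A) and leaves out details such as the alpha/rank factor used in real implementations; the matrices and the `apply_lora` helper are made up for illustration and are not taken from the model.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]

def apply_lora(weight, lora_b, lora_a, scale=1.0):
    """Return weight + scale * (lora_b @ lora_a) without modifying the inputs."""
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]

# Frozen 3x3 base weight (identity, for readability).
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Rank-1 factors: B is 3x1 and A is 1x3, so B @ A is 3x3 but needs only 6 parameters.
B = [[1.0], [2.0], [3.0]]
A = [[0.5, 0.0, -0.5]]

# scale plays the same role as the "scale" value passed at inference time.
W_eff = apply_lora(W, B, A, scale=0.8)
```

Setting scale to 0 recovers the base weights, while larger values push the output further toward the fine-tuned style.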

Guide: Running Locally

To run the SDXL-EMOJI model locally, follow these steps:

  1. Install Required Packages:

    pip install diffusers transformers accelerate safetensors huggingface_hub
    
  2. Clone the Repository (into a directory named cog_sdxl, which the script in step 3 imports from):

    git clone https://github.com/replicate/cog-sdxl cog_sdxl
    
  3. Set Up and Run Inference:

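    import torch
    from huggingface_hub import hf_hub_download
    from diffusers import DiffusionPipeline
    # TokenEmbeddingsHandler comes from the cog_sdxl repository cloned in step 2.
    from cog_sdxl.dataset_and_utils import TokenEmbeddingsHandler

    # Load the SDXL base model in half precision and move it to the GPU.
    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    # Attach the emoji LoRA weights to the pipeline.
    pipe.load_lora_weights("SvenN/sdxl-emoji", weight_name="lora.safetensors")

    # SDXL has two text encoders and two tokenizers; both receive the new tokens.
    text_encoders = [pipe.text_encoder, pipe.text_encoder_2]
    tokenizers = [pipe.tokenizer, pipe.tokenizer_2]

    # Download the Pivotal Tuning embeddings and load them into the text encoders.
    embedding_path = hf_hub_download(repo_id="SvenN/sdxl-emoji", filename="embeddings.pti", repo_type="model")
    embhandler = TokenEmbeddingsHandler(text_encoders, tokenizers)
    embhandler.load_embeddings(embedding_path)

    # The prompt must contain the trigger tokens <s0><s1>.
    prompt = "A <s0><s1> emoji of a man"
    images = pipe(
        prompt,
        cross_attention_kwargs={"scale": 0.8},  # strength of the LoRA weights
    ).images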
    
  4. Output: images is a list of generated images; the first one is images[0] and can be written to disk with images[0].save("emoji.png").
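
Since every prompt needs the trigger tokens, a small helper can keep them consistent across several generations. The sketch below is a convenience wrapper and not part of the model or of diffusers; the name `emoji_prompt` and the extra subjects are hypothetical, and the prompt shape simply mirrors the example in step 3.

```python
TRIGGER = "<s0><s1>"  # trigger tokens listed in the Architecture section

def emoji_prompt(subject, trigger=TRIGGER):
    """Build a prompt shaped like the example: 'A <s0><s1> emoji of ...'."""
    subject = subject.strip()
    if not subject:
        raise ValueError("subject must not be empty")
    return f"A {trigger} emoji of {subject}"

# Illustrative subjects; each string can be passed to pipe(...) as the prompt.
prompts = [emoji_prompt(s) for s in ("a man", "a cat", "a red car")]
```

Each entry in prompts can then be passed to the pipeline in place of the hard-coded prompt string.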

Cloud GPUs

The inference script above moves the pipeline to "cuda", so it needs a CUDA-capable GPU. If no suitable GPU is available locally, cloud GPU instances such as those offered by AWS, Google Cloud, or Azure can run the model.

License

The SDXL-EMOJI model is licensed under the CreativeML OpenRAIL-M license, which permits use and modification of the model subject to the terms of the license.
