Text to Image

ZB-Tech

Introduction

The ZB-Tech/Text-to-Image project provides LoRA adaptation weights for the Stable Diffusion XL model. The weights are used with the diffusers library to generate images from textual descriptions.

Architecture

The weights build on the stabilityai/stable-diffusion-xl-base-1.0 base model. LoRA (Low-Rank Adaptation) fine-tunes a model by training small low-rank matrices alongside the frozen base weights; in this case, LoRA was not applied to the text encoder. A special VAE, madebyollin/sdxl-vae-fp16-fix (a variant of the SDXL VAE adjusted to run in fp16 precision), was used during training.
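
As a rough illustration of how these pieces fit together, the sketch below loads the base model, the fp16-fix VAE, and the LoRA weights with diffusers. It assumes torch and diffusers are installed, that a CUDA GPU is available, and that the repository stores its weights in the layout load_lora_weights expects. The helper name load_pipeline is hypothetical, and reusing the training VAE at inference time is an assumption rather than something the project states.

```python
def load_pipeline(device="cuda"):
    """Build an SDXL pipeline with the LoRA weights attached (a sketch)."""
    # Heavy imports are deferred so that defining this helper stays cheap.
    import torch
    from diffusers import AutoencoderKL, DiffusionPipeline

    # VAE named in this section; using it for inference is an assumption.
    vae = AutoencoderKL.from_pretrained(
        "madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16
    )
    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",
        vae=vae,
        torch_dtype=torch.float16,
    )
    # Assumes the repository's LoRA file follows the standard diffusers layout.
    pipe.load_lora_weights("ZB-Tech/Text-to-Image")
    return pipe.to(device)


# Example use (downloads several GB of weights and needs a GPU):
# pipe = load_pipeline()
# image = pipe("Astronaut riding a horse").images[0]
# image.save("astronaut.png")
```

Nothing is downloaded until load_pipeline is called, so the helper can be defined on a machine without a GPU and invoked later on one that has it.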

Training

The weights were trained on the ZB-Tech/DreamXL dataset using the diffusers library. LoRA adaptation was employed, but adaptation of the text encoder was not enabled.
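
For readers unfamiliar with LoRA, the update it trains can be sketched in a few lines. The example below uses the standard formulation, in which a frozen weight matrix W is combined with a scaled product of two small matrices B and A. The rank and alpha used for this model are not stated, so the values here are purely illustrative, and the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_merge(weight, lora_b, lora_a, alpha, rank):
    """Return W + (alpha / rank) * B @ A, the weight after merging a LoRA update."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 frozen weight with a rank-1 update (B is 2x1, A is 1x2).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]
merged = lora_merge(W, B, A, alpha=1.0, rank=1)
print(merged)
```

Only B and A are trained, which is why a LoRA checkpoint is much smaller than the base model it adapts.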

Guide: Running Locally

To run the model locally, follow these steps:

  1. Set Up Environment: Ensure Python is installed, and set up a virtual environment.
  2. Install Required Libraries:
    pip install requests pillow
    
  3. API Access: Obtain a Hugging Face API key and replace HF_API_KEY in the code.
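  4. Download the Model (optional): The model weights are available in Safetensors format from the Files & versions tab. The example in the next step calls the hosted Inference API and does not need a local copy.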
  5. Run the Example:
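    import io
    
    import requests
    from PIL import Image
    
    API_URL = "https://api-inference.huggingface.co/models/ZB-Tech/Text-to-Image"
    # Replace HF_API_KEY with your Hugging Face API key.
    headers = {"Authorization": "Bearer HF_API_KEY"}
    
    def query(payload):
        """Send the prompt to the Inference API and return the raw response bytes."""
        response = requests.post(API_URL, headers=headers, json=payload)
        # Fail on HTTP errors instead of handing an error body to PIL.
        response.raise_for_status()
        return response.content
    
    image_bytes = query({
        "inputs": "Astronaut riding a horse",
    })
    
    # Decode the returned bytes and open the image in the default viewer.
    image = Image.open(io.BytesIO(image_bytes))
    image.show()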
    
  6. Use Cloud GPUs: If you run the model yourself rather than through the hosted API, consider cloud GPUs from providers such as AWS, GCP, or Azure to handle the intensive computation.
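
Two practical gaps in the steps above are hard-coding the API key and assuming that every response is an image. The helpers below are a minimal sketch of one way to handle both. They assume the key is exported in an environment variable named HF_API_KEY (reusing the placeholder from step 3) and that failed requests return a JSON body with an "error" field. All three function names are hypothetical.

```python
import json
import os


def build_headers(env_var="HF_API_KEY"):
    """Read the API key from the environment instead of hard-coding it."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return {"Authorization": f"Bearer {token}"}


def looks_like_image(data):
    """Return True if the bytes start with a PNG or JPEG signature."""
    return data.startswith(b"\x89PNG\r\n\x1a\n") or data.startswith(b"\xff\xd8\xff")


def describe_failure(data):
    """Extract the 'error' field from a JSON error body, if there is one."""
    try:
        body = json.loads(data.decode("utf-8"))
    except (UnicodeDecodeError, ValueError):
        return "response was neither an image nor JSON"
    if isinstance(body, dict) and "error" in body:
        return str(body["error"])
    return "unexpected JSON response"
```

In the example from step 5, build_headers() could replace the literal headers dictionary, and looks_like_image(image_bytes) could be checked before calling Image.open, with describe_failure(image_bytes) reported when the check fails.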

License

The project is licensed under the openrail++ license, which permits open use subject to conditions on the ethical use of the model.
