Text to Image
ZB-Tech
Introduction
The ZB-Tech/Text-to-Image project provides LoRA adaptation weights for the Stable Diffusion XL model. The weights are used with the diffusers library to generate images from textual descriptions.
Architecture
The model is based on the stabilityai/stable-diffusion-xl-base-1.0 architecture. LoRA (Low-Rank Adaptation) is applied to adapt the base model, although not to the text encoder in this case. A special VAE, madebyollin/sdxl-vae-fp16-fix, is used during training to improve image quality; it is a variant of the SDXL VAE modified to run in fp16 precision.
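To make the low-rank idea concrete, here is a minimal sketch of a LoRA update on a single weight matrix, written in plain Python. It assumes the common formulation W + (alpha / r) * B * A. The matrix sizes, rank, and alpha are hypothetical illustration values, not settings from the ZB-Tech/Text-to-Image training run, which the source does not state.

```python
# Illustrative only: the rank, alpha, and layer sizes are hypothetical,
# not values taken from the ZB-Tech/Text-to-Image training run.

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_b, lora_a, alpha):
    """Return W + (alpha / r) * (B @ A), where r is the LoRA rank."""
    rank = len(lora_a)  # A has shape (r, in_features)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape (out_features, in_features)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def lora_param_count(out_features, in_features, rank):
    """Trainable parameters in the two low-rank factors B and A."""
    return rank * (out_features + in_features)


# Frozen 2x3 base weight with a rank-1 update.
w = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
lora_b = [[1.0], [2.0]]      # shape (2, 1)
lora_a = [[0.5, 0.0, 1.0]]   # shape (1, 3)
w_eff = lora_effective_weight(w, lora_b, lora_a, alpha=1.0)
```

The base matrix w stays frozen; only the two small factors are trained. For a hypothetical 1280 x 1280 layer at rank 4, lora_param_count gives 10,240 trainable values against 1,638,400 in the full matrix, which is why LoRA weights are distributed as a small add-on to the base model.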
Training
The model is trained on the ZB-Tech/DreamXL dataset. LoRA adaptation is employed, but text encoder adaptation is not enabled. Training uses the diffusers library, which provides pipelines and training utilities for diffusion models.
Guide: Running Locally
To run the model locally, follow these steps:
- Set Up Environment: Ensure Python is installed, and set up a virtual environment.
- Install Required Libraries:
pip install requests pillow
- API Access: Obtain a Hugging Face API key and replace HF_API_KEY in the code.
- Download the Model: Access the model weights from the Files & versions tab in Safetensors format.
- Run the Example:
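import io

import requests
from PIL import Image

API_URL = "https://api-inference.huggingface.co/models/ZB-Tech/Text-to-Image"
# Replace HF_API_KEY with your Hugging Face API key.
headers = {"Authorization": "Bearer HF_API_KEY"}

def query(payload):
    """Send the prompt to the inference API and return the raw image bytes."""
    response = requests.post(API_URL, headers=headers, json=payload)
    # Fail early on HTTP errors instead of decoding an error body as an image.
    response.raise_for_status()
    return response.content

image_bytes = query({"inputs": "Astronaut riding a horse"})

# Decode the returned bytes and display the image.
image = Image.open(io.BytesIO(image_bytes))
image.show()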
- Use Cloud GPUs: For enhanced performance, consider using cloud GPUs from providers like AWS, GCP, or Azure to handle intensive computations.
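For running the weights on your own or a cloud GPU rather than through the hosted API, the following is a minimal sketch using the diffusers library. It rests on several assumptions that the source does not confirm: that torch, diffusers, transformers, and accelerate are installed; that a CUDA GPU is available; that the ZB-Tech/Text-to-Image repository stores its LoRA weights in a layout that load_lora_weights can read; and that the fp16-fix VAE, which the source mentions only for training, is also a sensible choice at inference. The function names build_pipeline and generate are hypothetical helpers for illustration.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
VAE_MODEL = "madebyollin/sdxl-vae-fp16-fix"
LORA_REPO = "ZB-Tech/Text-to-Image"


def build_pipeline(device="cuda"):
    """Load SDXL base, swap in the fp16-fix VAE, and attach the LoRA weights."""
    # Imported lazily so the heavy dependencies are only needed when called.
    import torch
    from diffusers import AutoencoderKL, DiffusionPipeline

    vae = AutoencoderKL.from_pretrained(VAE_MODEL, torch_dtype=torch.float16)
    pipe = DiffusionPipeline.from_pretrained(
        BASE_MODEL, vae=vae, torch_dtype=torch.float16
    )
    # Assumption: the repository layout is compatible with load_lora_weights.
    pipe.load_lora_weights(LORA_REPO)
    return pipe.to(device)


def generate(prompt, output_path="output.png", device="cuda"):
    """Generate one image for the prompt and save it to output_path."""
    pipe = build_pipeline(device)
    image = pipe(prompt).images[0]
    image.save(output_path)
    return output_path
```

Calling generate("Astronaut riding a horse") would download the base model, VAE, and LoRA weights on first use, which takes several gigabytes of disk space, and then write the result to output.png.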
License
The project is licensed under the openrail++ license, which allows open use subject to conditions on the ethical use of the model.