Tencent-Hunyuan/HYDiT-LoRA
Introduction
HYDiT-LoRA is a Tencent-Hunyuan project that provides two sets of trained LoRA weights for testing. The LoRA weights are fine-tuned on top of the base model, HunyuanDiT, to adapt its output style. This document covers the architecture, training, and how to run the model locally.
Architecture
HYDiT-LoRA works by integrating LoRA weights into the base model, HunyuanDiT. It supports multiple resolutions and various enhancement settings. The architecture combines the base model's transformer layers and attention mechanisms with the LoRA weights, which lets the model perform text-to-image generation with style adaptations.
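As a rough illustration of what "integrating LoRA weights into the base model" means in general, the sketch below merges a low-rank update into a frozen weight matrix using the standard LoRA formulation, W' = W + (alpha / rank) * B A. It is a toy example in pure Python, not the HunyuanDiT implementation; the function names, the `alpha` scaling convention, and the matrix shapes are assumptions chosen for illustration.

```python
# Minimal LoRA merge sketch in pure Python (no external dependencies).
# All names here are illustrative; none are taken from the HunyuanDiT code base.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_down, lora_up, alpha):
    """Return weight + (alpha / rank) * lora_up @ lora_down.

    weight:    d_out x d_in frozen base matrix
    lora_down: rank  x d_in  low-rank factor (often called A)
    lora_up:   d_out x rank  low-rank factor (often called B)
    """
    rank = len(lora_down)
    scale = alpha / rank
    delta = matmul(lora_up, lora_down)  # d_out x d_in update
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy example: a 2x3 base weight with a rank-1 update.
base = [[1.0, 0.0, 0.0],
        [0.0, 1.0, 0.0]]
down = [[1.0, 2.0, 3.0]]   # 1 x 3
up = [[0.5], [0.0]]        # 2 x 1
merged = merge_lora(base, down, up, alpha=1.0)
print(merged)
```

Only the two small factors are trained, so a style-specific checkpoint stays small compared with the base model and can be swapped without retraining it.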
Training
Three types of weights are available for fine-tuning: ema, module, and distill. The default setting uses ema weights. Training loads the distill weights into the main model and then fine-tunes it with the specified settings. Parameters such as batch size, image resolution, and learning rate can be adjusted to fit GPU capacity and the size of the training data. The training process uses DeepSpeed for optimization and supports FP16 mixed-precision training and Flash Attention.
Guide: Running Locally
- Dependencies and Installation: The dependencies and installation steps are the same as for the base model. Install the required libraries and tools before continuing.
- Model Download:
cd HunyuanDiT
huggingface-cli download Tencent-Hunyuan/HYDiT-LoRA --local-dir ./ckpts/t2i/lora
- Inference Using Gradio:
- Ensure the conda environment is activated.
- Launch the app with a style-specific LoRA checkpoint, for example jade:
python app/hydit_app.py --infer-mode fa --load-key ema --lora-ckpt ./ckpts/t2i/lora/jade
- Inference Using Command Line:
- Example command for jade style (the prompt translates to "Jade painting style, a cat chasing a butterfly"):
python sample_t2i.py --infer-mode fa --prompt "玉石绘画风格,一只猫在追蝴蝶" --load-key ema --lora-ckpt ./ckpts/t2i/lora/jade
- Cloud GPUs: Cloud GPUs can speed up inference and training, especially with large models and datasets.
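The command-line invocation above can also be assembled programmatically, which is convenient when trying several prompts or checkpoints. The sketch below is a minimal example that uses only the flags shown in this guide (`--infer-mode`, `--prompt`, `--load-key`, `--lora-ckpt`). The helper name `build_sample_command` is hypothetical, and the assumption that each style lives in a subfolder of `./ckpts/t2i/lora` named after the style is inferred from the jade example rather than confirmed.

```python
import shlex


def build_sample_command(style, prompt, lora_root="./ckpts/t2i/lora",
                         infer_mode="fa", load_key="ema"):
    """Assemble the argv list for sample_t2i.py with a style-specific LoRA checkpoint.

    Hypothetical helper: it only mirrors the flags used in the guide above.
    """
    return [
        "python", "sample_t2i.py",
        "--infer-mode", infer_mode,
        "--prompt", prompt,
        "--load-key", load_key,
        "--lora-ckpt", f"{lora_root}/{style}",  # assumed layout: <lora_root>/<style>
    ]


cmd = build_sample_command("jade", "玉石绘画风格,一只猫在追蝴蝶")
# Print a shell-quoted version of the command for inspection.
print(shlex.join(cmd))
```

To execute the command instead of printing it, pass the list to `subprocess.run(cmd, check=True)` from inside the HunyuanDiT directory with the conda environment activated.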
License
This project is licensed under the Tencent-Hunyuan Community License. For more details, refer to the license file.