Hunyuan Video
hunyuanvideo-community/HunyuanVideo
Introduction
hunyuanvideo-community/HunyuanVideo is an unofficial community fork that adapts Tencent's HunyuanVideo model for use with the Diffusers library and is hosted on Hugging Face. It generates video from text prompts and supports memory-saving options such as VAE tiling and model CPU offloading.
Architecture
The model runs through the Diffusers library, specifically the HunyuanVideoPipeline and HunyuanVideoTransformer3DModel classes. The pipeline generates video from a text prompt, with settings such as resolution, frame count, and number of inference steps passed as arguments.
Training
The weights are pre-trained and can be used directly through the Diffusers library; the README does not provide training details. The memory-saving techniques it mentions, model CPU offloading and VAE tiling, apply at inference time rather than during training.
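To illustrate the idea behind tiling, the sketch below splits a frame into overlapping tiles so that each tile can be processed separately and peak memory stays bounded. It is a conceptual illustration only, not the Diffusers implementation behind `pipe.vae.enable_tiling()`; the function names `tile_starts` and `tile_boxes`, and the tile size and overlap values, are hypothetical.

```python
def tile_starts(length, tile, overlap):
    """Start offsets of overlapping tiles that cover the range [0, length)."""
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    if length <= tile:
        return [0]  # a single tile already covers the whole axis
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # last tile ends exactly at the border
    return starts


def tile_boxes(height, width, tile=256, overlap=32):
    """Return (top, left, bottom, right) boxes for overlapping tiles.

    ASSUMPTION: tile=256 and overlap=32 are illustrative values only.
    """
    boxes = []
    for top in tile_starts(height, tile, overlap):
        for left in tile_starts(width, tile, overlap):
            bottom = min(top + tile, height)
            right = min(left + tile, width)
            boxes.append((top, left, bottom, right))
    return boxes


if __name__ == "__main__":
    # Tiles for a 320x512 frame, the resolution used in the guide below.
    for box in tile_boxes(320, 512):
        print(box)
```

Overlapping neighbouring tiles gives a tiled decoder a region in which to blend results, which helps hide seams at tile borders.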
Guide: Running Locally
To run HunyuanVideo locally, follow these steps:
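- Install Dependencies: Install or upgrade to a recent version of the Diffusers library. The pipeline also relies on PyTorch and Transformers, and model CPU offloading requires Accelerate.
    pip install -U diffusers transformers accelerate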
- Import Necessary Modules:
    import torch
    from diffusers import HunyuanVideoPipeline, HunyuanVideoTransformer3DModel
    from diffusers.utils import export_to_video
- Load the Model:
    model_id = "hunyuanvideo-community/HunyuanVideo"
    # The transformer is loaded in bfloat16; the remaining components use float16.
    transformer = HunyuanVideoTransformer3DModel.from_pretrained(
        model_id, subfolder="transformer", torch_dtype=torch.bfloat16
    )
    pipe = HunyuanVideoPipeline.from_pretrained(
        model_id, transformer=transformer, torch_dtype=torch.float16
    )
- Enable Memory Savings: Turn on VAE tiling and model CPU offloading to lower peak GPU memory use.
    pipe.vae.enable_tiling()
    pipe.enable_model_cpu_offload()
- Generate Video:
    output = pipe(
        prompt="A cat walks on the grass, realistic",
        height=320,
        width=512,
        num_frames=61,
        num_inference_steps=30,
    ).frames[0]
    export_to_video(output, "output.mp4", fps=15)
- Cloud GPUs: If no suitable local GPU is available, consider a GPU instance from a cloud provider such as AWS, Google Cloud, or Azure.
For more details, refer to the HunyuanVideo pipeline page in the Hugging Face Diffusers documentation.
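The generation settings above determine how long the exported clip plays and roughly how large the latent grid is. The sketch below works through that arithmetic for the values used in the guide. The helper names are hypothetical, and the 4x temporal and 8x spatial compression factors are assumptions about the HunyuanVideo VAE, not values stated in the README; check the model's VAE configuration before relying on them.

```python
def clip_duration_seconds(num_frames, fps):
    """Playback length of a clip exported with the given frame rate."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return num_frames / fps


def latent_shape(num_frames, height, width, temporal_factor=4, spatial_factor=8):
    """Approximate (frames, height, width) of the latent grid.

    ASSUMPTION: 4x temporal and 8x spatial compression. These factors are
    not given in the README; verify them against the VAE configuration.
    """
    latent_frames = (num_frames - 1) // temporal_factor + 1
    return latent_frames, height // spatial_factor, width // spatial_factor


if __name__ == "__main__":
    # Values from the guide: 61 frames at 320x512, exported at 15 fps.
    print(round(clip_duration_seconds(61, 15), 2))  # 4.07
    print(latent_shape(61, 320, 512))               # (16, 40, 64)
```

With the guide's settings, the result is a clip of roughly four seconds. Raising `num_frames` lengthens the clip at the cost of more memory and compute, while changing `fps` in `export_to_video` only changes playback speed.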
License
The licensing for HunyuanVideo is not specified in the provided documentation. Users should check the source repository for license information.