Introduction

Hyvid is a text-to-video model that generates anime-style videos from text prompts. It applies a LoRA (Low-Rank Adaptation) adapter to a base video model and is intended to run with GGUF-quantized or FP8-scaled model files, which reduce memory use.

Architecture

Hyvid is based on Tencent's HunyuanVideo model and adds a LoRA adapter for anime-specific output. GGUF quantization shrinks the base model's weights (the 4-bit q4_0 file listed in the setup guide is 7.74GB), which lowers the memory needed for video generation.

Training

The model is trained on datasets such as "trojblue/test-HunyuanVideo-anime-images" and "calcuis/anime-descriptor". These datasets cover a diverse range of anime scenes, which helps the model produce detailed, contextually accurate video.

Guide: Running Locally

Setup (Once)

  1. Model Files: Download and place the following files in their respective directories within ComfyUI:
    • hyvid_lora_adapter.safetensors (323MB) → ./ComfyUI/models/loras
    • hunyuan-video-t2v-720p-q4_0.gguf (7.74GB) → ./ComfyUI/models/diffusion_models
    • clip_l.safetensors (246MB) → ./ComfyUI/models/text_encoders
    • llava_llama3_fp8_scaled.safetensors (9.09GB) → ./ComfyUI/models/text_encoders
    • hunyuan_video_vae_bf16.safetensors (493MB) → ./ComfyUI/models/vae
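
Before launching, it can help to confirm that every file landed in the right folder. The following is a minimal sketch, not part of the Hyvid distribution: it assumes ComfyUI lives at ./ComfyUI, and the helper names (missing_files, total_download_gb) are hypothetical. File names, directories, and sizes are copied from the list above, with 1GB taken as 1000MB.

```python
from pathlib import Path

# Expected layout, copied from the list above: filename -> (subdirectory, size in GB).
# Sizes are the approximate figures quoted in the guide (assumption: 1GB = 1000MB).
EXPECTED_FILES = {
    "hyvid_lora_adapter.safetensors": ("models/loras", 0.323),
    "hunyuan-video-t2v-720p-q4_0.gguf": ("models/diffusion_models", 7.74),
    "clip_l.safetensors": ("models/text_encoders", 0.246),
    "llava_llama3_fp8_scaled.safetensors": ("models/text_encoders", 9.09),
    "hunyuan_video_vae_bf16.safetensors": ("models/vae", 0.493),
}


def missing_files(comfyui_root):
    """Return the expected paths that do not exist under comfyui_root."""
    root = Path(comfyui_root)
    missing = []
    for name, (subdir, _size_gb) in EXPECTED_FILES.items():
        path = root / subdir / name
        if not path.is_file():
            missing.append(path)
    return missing


def total_download_gb():
    """Sum the quoted file sizes to estimate the total download."""
    return sum(size_gb for _subdir, size_gb in EXPECTED_FILES.values())


if __name__ == "__main__":
    print(f"Approximate total download: {total_download_gb():.2f} GB")
    for path in missing_files("./ComfyUI"):
        print(f"missing: {path}")
```

Run the script from the directory that contains the ComfyUI folder; it prints one line per missing file and nothing further if the layout is complete.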

Running

  1. No Installation Needed: Run the .bat file in the main directory to start ComfyUI.
  2. Demo Clip: Drag the demo clip or workflow JSON file into the ComfyUI browser window to load the workflow, then run it.

Workflows

  • GGUF Workflow: Use the example GGUF workflow; switch between quantized files to fit the available memory.
  • Safetensors Workflow: The FP8-scaled version (13.2GB) is recommended for enhanced performance when enough memory is available.
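
The choice between the two workflows can be sketched as a simple size check. This is an illustration under stated assumptions, not a rule from the Hyvid authors: the variant labels are descriptive (the guide does not name the FP8 file), pick_variant is a hypothetical helper, and the 2GB headroom is a placeholder, since actual memory use depends on resolution, clip length, and how ComfyUI offloads the text encoders and VAE.

```python
# File sizes in GB as quoted in this guide. The labels are descriptive,
# not filenames: the guide does not name the FP8 safetensors file.
VARIANT_SIZES_GB = {
    "gguf_q4_0": 7.74,
    "fp8_scaled": 13.2,
}


def pick_variant(memory_budget_gb, headroom_gb=2.0):
    """Return the largest variant whose weights fit the budget, or None.

    headroom_gb is a hypothetical allowance for activations and other
    components; tune it for your own setup.
    """
    usable = memory_budget_gb - headroom_gb
    fitting = {
        name: size for name, size in VARIANT_SIZES_GB.items() if size <= usable
    }
    if not fitting:
        return None
    # Prefer the larger (less aggressively compressed) file when it fits.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    for budget in (8, 12, 24):
        print(budget, pick_variant(budget))
```

Treat the result as a starting point only: file size is a lower bound on memory use, so a variant that passes this check may still need a smaller resolution or shorter clip to run.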

Cloud GPUs

Hyvid is a heavy model: the files listed in the setup guide total roughly 18GB. If local hardware is insufficient, consider cloud GPU instances from major providers such as AWS, GCP, or Azure.

License

Hyvid is distributed under the MIT License. The license details are available in the LICENSE file linked in the model's repository.
