Introduction

The LTXV-GGUF model is a quantized version of the LTX-Video model designed for text-to-video generation. It works with multiple video generation pipelines and produces high-quality video from varied text prompts.

Architecture

The model is based on the Lightricks/LTX-Video architecture, quantized to the GGUF format for efficiency. In a ComfyUI setup it combines three components: the GGUF-quantized denoising backbone (placed in the unet folder), the T5-XXL text encoder (the clip folder), and a VAE (the vae folder), with the text encoder and VAE kept as safetensors files.
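
If you want to verify how the backbone is quantized, the GGUF file can be inspected with the `gguf` Python package from the llama.cpp project. This is an illustrative sketch, not part of the official workflow; the file path simply mirrors the setup guide below.

```python
# Illustrative sketch: inspect the quantized LTX-Video backbone.
# Assumes `pip install gguf` and the file location from the setup guide below.
from gguf import GGUFReader

reader = GGUFReader("./ComfyUI/models/unet/ltx-video-2b-v0.9.1-q4_0.gguf")

# List the first few tensors with their quantization type (e.g. Q4_0),
# shape, and on-disk size, to see the checkpoint's mixed-precision layout.
for tensor in reader.tensors[:10]:
    size_mb = tensor.n_bytes / 1e6
    print(f"{tensor.name}  {tensor.tensor_type.name}  {list(tensor.shape)}  {size_mb:.1f} MB")
```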

Training

The LTXV-GGUF model inherits its training methodology from the Lightricks/LTX-Video base model, which is fine-tuned for a range of video generation tasks. GGUF quantization does not change the training itself; it shrinks the checkpoint and its memory footprint, making the model more practical for testing and deployment.
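
To make the quantization concrete: the q4_0 in the filename refers to llama.cpp-style block quantization, where each block of 32 weights shares one scale and each weight is stored as a 4-bit integer. The numpy sketch below mimics that arithmetic; it is an illustration, not the actual packing code (the real format packs two 4-bit values per byte and stores the scale as fp16).

```python
# Minimal numpy sketch of the block-wise 4-bit scheme behind GGUF's q4_0
# tensor type. Illustrative only; real q4_0 packs pairs of 4-bit values
# into bytes and stores the per-block scale as fp16.
import numpy as np

BLOCK = 32  # weights per block in q4_0

def q4_0_quantize(block: np.ndarray):
    # The scale is derived from the largest-magnitude weight so it maps to -8.
    amax = block[np.argmax(np.abs(block))]
    d = amax / -8.0 if amax != 0 else 1.0
    q = np.clip(np.round(block / d + 8), 0, 15).astype(np.uint8)
    return d, q

def q4_0_dequantize(d: float, q: np.ndarray) -> np.ndarray:
    # Each stored integer maps back to (q - 8) * scale.
    return (q.astype(np.float32) - 8) * d

weights = np.random.randn(BLOCK).astype(np.float32)
d, q = q4_0_quantize(weights)
restored = q4_0_dequantize(d, q)
print("max abs error:", np.max(np.abs(weights - restored)))
```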

Guide: Running Locally

  1. Setup (a scripted download sketch follows this guide):

    • Download ltx-video-2b-v0.9.1-q4_0.gguf (1.09 GB) into ./ComfyUI/models/unet.
    • Download t5xxl_fp8_e4m3fn.safetensors (4.89 GB) into ./ComfyUI/models/clip.
    • Download ltx-video-vae.safetensors (838 MB) into ./ComfyUI/models/vae.
  2. Execution:

    • Launch ComfyUI by running the .bat file in its main directory.
    • Drag and drop the workflow JSON file into the ComfyUI web interface, then queue the prompt.
  3. Workflows:

    • Use the example workflow for the GGUF model, or the workflow for the original safetensors model; both are available in the repository.
  4. Recommendations:

    • For optimal performance, consider using cloud GPUs such as those provided by AWS, GCP, or Azure.
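
As an alternative to the manual downloads in step 1, the setup can be scripted with the huggingface_hub client. The repo IDs below are placeholders, since the hosting repositories are not named above; the filenames and target folders match the setup list.

```python
# Hedged download sketch using huggingface_hub (pip install huggingface_hub).
# The repo IDs are placeholders; substitute the repositories hosting each file.
from huggingface_hub import hf_hub_download

FILES = [
    ("<gguf-repo-id>",    "ltx-video-2b-v0.9.1-q4_0.gguf", "./ComfyUI/models/unet"),
    ("<encoder-repo-id>", "t5xxl_fp8_e4m3fn.safetensors",  "./ComfyUI/models/clip"),
    ("<vae-repo-id>",     "ltx-video-vae.safetensors",     "./ComfyUI/models/vae"),
]

for repo_id, filename, target_dir in FILES:
    # local_dir places each file directly where ComfyUI expects to find it.
    path = hf_hub_download(repo_id=repo_id, filename=filename, local_dir=target_dir)
    print("downloaded:", path)
```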

License

The model is distributed under a custom license. For detailed licensing information, refer to the LICENSE file in the repository.
