Introduction

LTX-Video is a state-of-the-art video generation model developed by Lightricks. Built on a DiT-based (Diffusion Transformer) architecture, it produces high-quality videos in real time from text prompts or from image-plus-text inputs. The model generates 24 FPS video at 768x512 resolution and was trained on a large, diverse dataset to produce realistic content.

Architecture

The LTX-Video model is diffusion-based and generates videos from a text prompt or from an image plus a text prompt. It supports high-resolution output under two constraints: width and height must be divisible by 32, and the frame count must be of the form 8n + 1 (e.g., 121 or 257). The model works best at resolutions under 720x1280 and frame counts below 257.
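
Since requested sizes must satisfy these rules, a small helper that snaps an arbitrary request to the nearest valid values can save trial and error. The sketch below is illustrative and not part of the official codebase:

    def snap_to_valid(width: int, height: int, num_frames: int) -> tuple[int, int, int]:
        """Round a request to values LTX-Video accepts: width and height
        divisible by 32, frame count of the form 8n + 1."""
        width = max(32, round(width / 32) * 32)
        height = max(32, round(height / 32) * 32)
        num_frames = max(1, round((num_frames - 1) / 8) * 8 + 1)
        return width, height, num_frames

    print(snap_to_valid(1280, 720, 120))  # -> (1280, 704, 121)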

Training

LTX-Video was trained on a large-scale, diverse dataset, enabling it to produce realistic video content across a wide range of scenes and scenarios and making it versatile across use cases.

Guide: Running Locally

Installation

  1. Clone the Repository and Set Up Environment:

    git clone https://github.com/Lightricks/LTX-Video.git
    cd LTX-Video
    python -m venv env
    source env/bin/activate
    python -m pip install -e ".[inference-script]"
    
  2. Download the Model:

    from huggingface_hub import snapshot_download
    model_path = 'PATH'  # local directory where the weights will be stored
    snapshot_download("Lightricks/LTX-Video", local_dir=model_path, local_dir_use_symlinks=False, repo_type='model')
    
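  3. Verify the Download (optional): snapshot_download returns the local directory it populated, so a quick check that the weights landed on disk can save a failed inference run later. This sketch assumes the repository ships .safetensors checkpoints and uses ./weights as a stand-in for the PATH placeholder above:

    from pathlib import Path
    from huggingface_hub import snapshot_download

    # "./weights" stands in for the PATH placeholder above.
    model_path = snapshot_download("Lightricks/LTX-Video", local_dir="./weights", repo_type="model")

    # Confirm checkpoint files actually landed on disk.
    files = list(Path(model_path).rglob("*.safetensors"))
    assert files, f"no .safetensors files found under {model_path}"
    print(f"found {len(files)} checkpoint file(s) in {model_path}")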

Inference

  • Text-to-Video:

    python inference.py --ckpt_dir 'PATH' --prompt "PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
    
  • Image-to-Video:

    python inference.py --ckpt_dir 'PATH' --prompt "PROMPT" --input_image_path IMAGE_PATH --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
    
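  • Worked Example: A filled-in text-to-video call that respects the dimension rules from the Architecture section (704 and 480 are divisible by 32; 121 frames = 8 x 15 + 1). The prompt, seed, and ./weights checkpoint path are illustrative, not canonical values:

    python inference.py --ckpt_dir ./weights --prompt "A sailboat glides across a calm lake at sunset" --height 480 --width 704 --num_frames 121 --seed 42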

Diffusers Library

  • Install Diffusers:

    pip install -U git+https://github.com/huggingface/diffusers
    
  • Example Usage:

    import torch
    from diffusers import LTXPipeline
    from diffusers.utils import export_to_video
    
    pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
    pipe.to("cuda")
    
    video = pipe(
        prompt="PROMPT",
        negative_prompt="NEGATIVE_PROMPT",
        width=704,
        height=480,
        num_frames=161,
        num_inference_steps=50,
    ).frames[0]
    export_to_video(video, "output.mp4", fps=24)
    
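  • Image-to-Video: Recent Diffusers releases also expose LTXImageToVideoPipeline for image-conditioned generation. The sketch below follows the same pattern as the example above; the image path and prompts are placeholders:

    import torch
    from diffusers import LTXImageToVideoPipeline
    from diffusers.utils import export_to_video, load_image

    pipe = LTXImageToVideoPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
    pipe.to("cuda")  # or pipe.enable_model_cpu_offload() to reduce VRAM usage

    image = load_image("IMAGE_PATH")  # placeholder, as in the CLI example
    video = pipe(
        image=image,
        prompt="PROMPT",
        negative_prompt="NEGATIVE_PROMPT",
        width=704,
        height=480,
        num_frames=161,
        num_inference_steps=50,
    ).frames[0]
    export_to_video(video, "output_i2v.mp4", fps=24)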

For optimal performance, run inference on a cloud GPU instance, such as an NVIDIA GPU-backed AWS EC2 instance or a Google Cloud GPU VM.

License

The LTX-Video model is available under a custom license. For detailed information, refer to the LICENSE file.
