LTX-Video
Lightricks
Introduction
LTX-Video is a state-of-the-art video generation model developed by Lightricks. It uses a DiT-based (Diffusion Transformer) architecture to produce high-quality videos in real time, from either text or image+text inputs. The model generates 24 FPS video at a 768x512 resolution and was trained on a diverse dataset to produce realistic video content.
Architecture
The LTX-Video model is diffusion-based and generates videos from a text prompt or from an image plus a text prompt. It expects a height and width that are each divisible by 32 and a frame count of the form 8k + 1 (for example, 161). The model is primarily optimized for resolutions under 720x1280 and frame counts below 257.
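The size rules above can be checked before a run is launched. The following is a minimal sketch and not part of the LTX-Video codebase: the helper names `snap_size`, `snap_num_frames`, and `is_valid_request` are hypothetical, and rounding down (rather than to the nearest valid value) is an assumption made so the result never exceeds what was requested.

```python
def snap_size(value: int, multiple: int = 32) -> int:
    """Round a height or width down to a multiple of `multiple` (at least one multiple)."""
    if value < multiple:
        return multiple
    return (value // multiple) * multiple


def snap_num_frames(num_frames: int, stride: int = 8) -> int:
    """Round a frame count down to the form stride * k + 1 (at least 1)."""
    if num_frames < 1:
        return 1
    return ((num_frames - 1) // stride) * stride + 1


def is_valid_request(height: int, width: int, num_frames: int) -> bool:
    """True when all three values already satisfy the divisibility rules."""
    if height <= 0 or width <= 0 or num_frames < 1:
        return False
    return height % 32 == 0 and width % 32 == 0 and num_frames % 8 == 1


if __name__ == "__main__":
    # 720 is not a multiple of 32, so it snaps down to 704.
    print(snap_size(720), snap_size(1280), snap_num_frames(160))
    print(is_valid_request(480, 704, 161))
```

Note that 720 itself is not divisible by 32, so a requested height of 720 snaps to 704 under this rounding choice.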
Training
Trained on a large-scale dataset, LTX-Video is capable of producing diverse and realistic video content. The training process ensures the model can handle a wide variety of scenes and scenarios, making it versatile for different use cases.
Guide: Running Locally
Installation
- Clone the Repository and Set Up Environment:
    git clone https://github.com/Lightricks/LTX-Video.git
    cd LTX-Video
    python -m venv env
    source env/bin/activate
    python -m pip install -e .\[inference-script\]
- Download the Model:
    from huggingface_hub import snapshot_download

    model_path = 'PATH'
    snapshot_download("Lightricks/LTX-Video", local_dir=model_path, local_dir_use_symlinks=False, repo_type='model')
Inference
- Text-to-Video:
    python inference.py --ckpt_dir 'PATH' --prompt "PROMPT" --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
- Image-to-Video:
    python inference.py --ckpt_dir 'PATH' --prompt "PROMPT" --input_image_path IMAGE_PATH --height HEIGHT --width WIDTH --num_frames NUM_FRAMES --seed SEED
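The two commands above differ only in the `--input_image_path` flag, so a script that launches many runs can build both from one function. This is a minimal sketch under that assumption: `build_inference_command` is a hypothetical helper, it uses only the flags shown above, and the sample values in the usage block are placeholders rather than recommended settings.

```python
import shlex
from typing import List, Optional


def build_inference_command(
    ckpt_dir: str,
    prompt: str,
    height: int,
    width: int,
    num_frames: int,
    seed: int,
    input_image_path: Optional[str] = None,
) -> List[str]:
    """Return an argv list for inference.py.

    The image flag is added only when an image path is given,
    which switches the call from text-to-video to image-to-video.
    """
    cmd = ["python", "inference.py", "--ckpt_dir", ckpt_dir, "--prompt", prompt]
    if input_image_path is not None:
        cmd += ["--input_image_path", input_image_path]
    cmd += [
        "--height", str(height),
        "--width", str(width),
        "--num_frames", str(num_frames),
        "--seed", str(seed),
    ]
    return cmd


if __name__ == "__main__":
    # Placeholder values; replace PATH and PROMPT with real ones.
    cmd = build_inference_command("PATH", "PROMPT", 480, 704, 161, seed=0)
    print(shlex.join(cmd))  # shell-quoted form, useful for logging
    # To execute from the repository root:
    # import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when the prompt contains spaces or quotes.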
Diffusers Library
- Install Diffusers:
    pip install -U git+https://github.com/huggingface/diffusers
- Example Usage:
    import torch
    from diffusers import LTXPipeline
    from diffusers.utils import export_to_video

    pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    video = pipe(
        prompt="PROMPT",
        negative_prompt="NEGATIVE_PROMPT",
        width=704,
        height=480,
        num_frames=161,
        num_inference_steps=50,
    ).frames[0]
    export_to_video(video, "output.mp4", fps=24)
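The example's `num_frames=161` and `fps=24` together determine the clip length. As a rough aid for picking `num_frames` from a target duration, here is a minimal sketch; `frames_for_duration` and `clip_duration` are hypothetical helpers, and rounding to the nearest count of the form 8k + 1 is an assumption based on the frame-count rule in the Architecture section.

```python
def frames_for_duration(seconds: float, fps: int = 24, stride: int = 8) -> int:
    """Nearest frame count of the form stride * k + 1 for the requested duration."""
    target = max(1, round(seconds * fps))
    k = (target - 1 + stride // 2) // stride  # integer round-to-nearest
    return stride * k + 1


def clip_duration(num_frames: int, fps: int = 24) -> float:
    """Length in seconds of a clip with `num_frames` frames played at `fps`."""
    return num_frames / fps


if __name__ == "__main__":
    print(frames_for_duration(5))
    print(round(clip_duration(161), 2))
```

With the values from the example, 161 frames at 24 FPS is about 6.7 seconds of video.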
For optimal performance, run the model on an NVIDIA GPU. Cloud instances such as AWS EC2 with NVIDIA GPUs or Google Cloud GPUs are recommended.
License
The LTX-Video model is available under a custom license. For detailed information, refer to the LICENSE file.