Open-Sora Plan v1.3.0
LanguageBind
Introduction
Open-Sora Plan is an open-source project that aims to reproduce and improve on the Sora video generation model. It supports training and inference on Huawei Ascend AI systems, and community contributions are welcome.
Architecture
Open-Sora Plan uses a 3D attention architecture in place of the traditional 2+1D design, in which spatial and temporal attention run as separate steps. Attending over space and time jointly captures spatiotemporal features in videos better. The architecture is scalable and supports various video resolutions and formats.
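To make the difference between the two designs concrete, the following sketch counts the query-key pairs each one scores and checks which token pairs are compared directly. It is illustrative arithmetic only, not code from the project: the function names and the 8x4x4 token grid are hypothetical, and the 2+1D variant is assumed to be one spatial pass followed by one temporal pass.

```python
def full_3d_pairs(frames, height, width):
    """Query-key pairs when every token attends to every token in the clip."""
    tokens = frames * height * width
    return tokens * tokens


def factorized_2p1d_pairs(frames, height, width):
    """Query-key pairs for one spatial pass plus one temporal pass."""
    per_frame = height * width
    spatial = frames * per_frame ** 2   # each frame attends within itself
    temporal = per_frame * frames ** 2  # each location attends across time
    return spatial + temporal


def directly_attends_2p1d(a, b):
    """True if a 2+1D layer scores tokens a and b, given as (t, y, x)."""
    same_frame = a[0] == b[0]
    same_location = a[1:] == b[1:]
    return same_frame or same_location


if __name__ == "__main__":
    t, h, w = 8, 4, 4  # hypothetical token grid, chosen only for illustration
    print("3D pairs:  ", full_3d_pairs(t, h, w))
    print("2+1D pairs:", factorized_2p1d_pairs(t, h, w))
    # A token in another frame AND at another location is never compared directly.
    print(directly_attends_2p1d((0, 1, 1), (5, 2, 3)))
```

Full 3D attention costs more pairs, but it compares tokens that differ in both frame and position in a single step, which the factorized form cannot do.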
Training
Training combines several strategies. CausalVideoVAE compresses video at a high ratio, which keeps inference efficient. Large models can be trained on Huawei Ascend systems, and other parallel processing strategies are supported for scalability.
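As a rough illustration of what a high compression ratio means for a causal video VAE, the sketch below computes a latent grid size from an input clip size. Everything specific here is an assumption rather than something stated above: the helper names are hypothetical, the default strides of 4 (time) and 8 (space) are example values, and the convention that the first frame is encoded on its own while later frames are grouped by the temporal stride is the usual causal layout, not a confirmed detail of this release.

```python
def causal_latent_shape(frames, height, width, t_stride=4, s_stride=8):
    """Latent (frames, height, width) under the assumed causal layout.

    Assumption: the first frame maps to one latent frame and the remaining
    frames are grouped in blocks of t_stride.
    """
    if frames < 1 or (frames - 1) % t_stride != 0:
        raise ValueError("frames must equal 1 + k * t_stride")
    if height % s_stride != 0 or width % s_stride != 0:
        raise ValueError("height and width must be multiples of s_stride")
    return (1 + (frames - 1) // t_stride, height // s_stride, width // s_stride)


def position_ratio(frames, height, width, t_stride=4, s_stride=8):
    """Input positions divided by latent positions (channels ignored)."""
    lt, lh, lw = causal_latent_shape(frames, height, width, t_stride, s_stride)
    return (frames * height * width) / (lt * lh * lw)


if __name__ == "__main__":
    # Hypothetical clip size used only to exercise the helpers.
    print(causal_latent_shape(93, 480, 640))
    print(position_ratio(93, 480, 640))
```

Because the diffusion model attends over latent positions rather than pixels, a smaller latent grid directly reduces the attention cost at both training and inference time.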
Guide: Running Locally
- Clone the Repository:
    git clone https://github.com/PKU-YuanGroup/Open-Sora-Plan
    cd Open-Sora-Plan
- Install Requirements: Ensure Python 3.8 or higher and PyTorch 2.1.0 or higher with CUDA version 11.7 or above are installed.
    conda create -n opensora python=3.8 -y
    conda activate opensora
    pip install -e .
- Optional Installations: For static type checking and development tools:
    pip install -e '.[dev]'
- Inference Setup: Follow the inference instructions from the Text-to-Video documentation. Use the --save_memory flag for efficient memory usage.
- Cloud GPUs: Consider using cloud-based GPU services like AWS or Google Cloud for more computational power, especially when handling large models or datasets.
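Before moving on to inference, it can help to confirm that the environment meets the version requirements from the installation step. The script below is a minimal sketch, not part of the repository. The helper names are hypothetical, and the CUDA check applies only to NVIDIA GPU setups (on other backends `torch.version.cuda` is `None`).

```python
import re
import sys

# Minimum versions taken from the installation step above.
MIN_PYTHON = (3, 8)
MIN_TORCH = (2, 1, 0)
MIN_CUDA = (11, 7)


def parse_version(text):
    """Turn a string such as '2.1.0+cu118' into a tuple like (2, 1, 0)."""
    numeric_prefix = re.match(r"\d+(?:\.\d+)*", text)
    if numeric_prefix is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in numeric_prefix.group().split("."))


def meets_minimum(found, minimum):
    """Compare two version tuples, padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    padded_found = tuple(found) + (0,) * (width - len(found))
    padded_minimum = tuple(minimum) + (0,) * (width - len(minimum))
    return padded_found >= padded_minimum


def check_environment():
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not meets_minimum(sys.version_info[:3], MIN_PYTHON):
        problems.append("Python 3.8 or higher is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if not meets_minimum(parse_version(torch.__version__), MIN_TORCH):
        problems.append("PyTorch 2.1.0 or higher is required")
    cuda = torch.version.cuda  # None when the build has no CUDA support
    if cuda is None:
        problems.append("this PyTorch build has no CUDA support")
    elif not meets_minimum(parse_version(cuda), MIN_CUDA):
        problems.append("CUDA 11.7 or above is required")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    print("Environment looks OK" if not issues else "\n".join(issues))
```

Run it inside the activated opensora environment. Each printed line names one requirement that is not met.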
License
Open-Sora Plan is licensed under the MIT License. For more details, refer to the LICENSE file.