CogVideoX-Fun-V1.1-Reward-LoRAs
alibaba-pai
Introduction
CogVideoX-Fun-V1.1-Reward-LoRAs uses Reward Backpropagation to fine-tune video generation so that outputs align more closely with human preferences. Pre-trained reward LoRAs and training scripts are provided, so you can either apply the LoRAs to the base model directly or train custom reward LoRAs. Further details are available in the GitHub repository.
Architecture
The models use CogVideoX-Fun-V1.1 as the base and are optimized against reward models such as HPS v2.1 and MPS. Each LoRA is trained with a rank of 128 and a network alpha of 64; training steps and batch sizes vary per model, as listed below.
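The rank and network-alpha values determine the capacity and scaling of the LoRA update. The following is a minimal toy sketch of how these two hyperparameters interact inside a LoRA-adapted linear layer (plain NumPy, not the repository's code; `LoRALinear` is a hypothetical name for illustration):

```python
import numpy as np

class LoRALinear:
    """Toy LoRA layer: y = x @ (W + (alpha / rank) * A @ B)."""
    def __init__(self, in_dim, out_dim, rank=128, alpha=64, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((in_dim, out_dim)) * 0.02  # frozen base weight
        self.A = rng.standard_normal((in_dim, rank)) * 0.02     # trainable down-projection
        self.B = np.zeros((rank, out_dim))                      # trainable up-projection, zero-init
        self.scale = alpha / rank                               # 64 / 128 = 0.5

    def __call__(self, x):
        return x @ self.W + self.scale * (x @ self.A @ self.B)

layer = LoRALinear(in_dim=16, out_dim=16)
x = np.ones((1, 16))
# With B zero-initialized, the LoRA branch contributes nothing before training:
assert np.allclose(layer(x), x @ layer.W)
```

Zero-initializing `B` is the standard LoRA trick that makes the adapter a no-op at the start of training, so fine-tuning begins from the unmodified base model.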
Training
- CogVideoX-Fun-V1.1-5b-InP-HPS2.1.safetensors: Trained with a batch size of 8 for 1,500 steps.
- CogVideoX-Fun-V1.1-2b-InP-HPS2.1.safetensors: Trained with a batch size of 8 for 3,000 steps.
- CogVideoX-Fun-V1.1-5b-InP-MPS.safetensors: Trained with a batch size of 8 for 5,500 steps.
- CogVideoX-Fun-V1.1-2b-InP-MPS.safetensors: Trained with a batch size of 8 for 16,000 steps.
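Reward Backpropagation treats the reward model's score as a differentiable objective and backpropagates it through the generated sample into the generator's weights. A deliberately tiny, framework-free illustration of the idea (toy one-parameter generator and hand-written reward, not the actual HPS/MPS training loop):

```python
# Toy reward backpropagation: the "generator" is a single parameter w mapping
# noise z to a sample x = w * z; the "reward model" prefers samples near 3.0.
def reward(x):
    return -(x - 3.0) ** 2           # stand-in for a learned reward model

def d_reward_dx(x):
    return -2.0 * (x - 3.0)          # its gradient w.r.t. the sample

w = 0.5                              # generator parameter
z = 1.0                              # fixed "noise" input
lr = 0.1
for _ in range(200):
    x = w * z                        # generate a sample
    # chain rule: dR/dw = dR/dx * dx/dw, with dx/dw = z
    w += lr * d_reward_dx(x) * z     # gradient ascent on the reward

assert abs(w * z - 3.0) < 1e-3       # generator now produces high-reward samples
```

In the real setup, `x` is a sampled video, the reward is HPS v2.1 or MPS, and only the LoRA parameters (not the full base model) receive the gradient.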
Guide: Running Locally
- Install Required Libraries: Ensure you have PyTorch and other dependencies installed.
- Load Models: Use the provided script to load the CogVideoX models and LoRAs.
- Set Parameters: Define prompts, sample sizes, and other parameters for video generation.
- Run Inference: Execute the inference script to generate videos.
- Save Output: Store the generated video in the desired format.
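The steps above can be sketched as a single script. The function names, model paths, and parameters below (`load_pipeline`, `generate_video`, the `models/...` paths) are placeholders for illustration, not the repository's actual API; in practice, use the inference script provided in the GitHub repository:

```python
# Workflow outline only: each stub marks where the real CogVideoX-Fun and
# LoRA-loading calls would go. All names and paths here are hypothetical.
def load_pipeline(base_model, lora_path, lora_weight=1.0):
    # real code would load CogVideoX-Fun-V1.1 weights, then fuse the reward LoRA
    return {"base": base_model, "lora": lora_path, "weight": lora_weight}

def generate_video(pipe, prompt, sample_size=(384, 672), num_frames=49,
                   guidance_scale=6.0, seed=42):
    # real code would run the diffusion sampling loop here
    return {"prompt": prompt, "frames": num_frames, "size": sample_size}

pipe = load_pipeline(
    base_model="models/CogVideoX-Fun-V1.1-5b-InP",  # hypothetical local path
    lora_path="models/CogVideoX-Fun-V1.1-5b-InP-HPS2.1.safetensors",
    lora_weight=0.7,
)
video = generate_video(pipe, prompt="A panda playing guitar in a bamboo forest")
# real code would then save the frames, e.g. with imageio or diffusers' export_to_video
```

The LoRA weight (0.7 here) is a typical knob worth exposing: it scales how strongly the reward LoRA steers generation relative to the base model.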
Cloud GPU: For optimal performance, consider using cloud GPUs such as AWS EC2, Google Cloud, or Azure.
License
This project is released under an open-source license. Please refer to the GitHub repository for specific licensing details.