Introduction

This repository contains the pretrained models and resources for the paper "Flow-Guided Transformer for Video Inpainting" by Kaidong Zhang, Jingjing Fu, and Dong Liu, presented at the European Conference on Computer Vision (ECCV) 2022. The project focuses on video inpainting using a novel approach that leverages a flow-guided transformer.

Architecture

The repository includes three models:

  • lafc.pth.tar: A pretrained model of the "Local Aggregation Flow Completion Network" that completes corrupted optical flows.
  • lafc_single.pth.tar: A version of the above model designed for single flow completion, used primarily in training the FGT model.
  • fgt.pth.tar: The "Flow Guided Transformer" model that completes frames using a sequence of corrupted frames and completed optical flows.
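As a minimal sketch, the checkpoints can be loaded with torch.load. The helper below is illustrative only: the internal key layout of the .pth.tar files is not documented here, so inspect the returned object before wiring it into a model.

```python
import os

# The three checkpoint files listed above.
MODEL_FILES = {
    "lafc": "lafc.pth.tar",
    "lafc_single": "lafc_single.pth.tar",
    "fgt": "fgt.pth.tar",
}

def load_checkpoint(name, device="cpu"):
    """Load one of the pretrained checkpoints onto the given device.

    The structure of the loaded object (e.g. whether it wraps a state
    dict under a key) is an assumption; check its keys() before use.
    """
    import torch  # imported lazily; only needed when actually loading

    path = MODEL_FILES[name]
    if not os.path.exists(path):
        raise FileNotFoundError(f"{path} not found; run deploy.sh first")
    return torch.load(path, map_location=device)
```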

Configuration files for each model are also provided:

  • LAFC_config.yaml: Configuration for lafc.pth.tar.
  • LAFC_single_config.yaml: Configuration for lafc_single.pth.tar.
  • FGT_config.yaml: Configuration for fgt.pth.tar.
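Each checkpoint pairs with exactly one configuration file. A small sketch of that pairing, with a YAML loader (PyYAML is assumed; the config keys themselves are not shown here because they vary per model):

```python
# Checkpoint-to-configuration pairing, as listed above.
CONFIG_FOR = {
    "lafc.pth.tar": "LAFC_config.yaml",
    "lafc_single.pth.tar": "LAFC_single_config.yaml",
    "fgt.pth.tar": "FGT_config.yaml",
}

def load_config(path):
    """Parse one of the YAML configuration files into a dict."""
    import yaml  # PyYAML; imported lazily so the table works without it

    with open(path) as f:
        return yaml.safe_load(f)
```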

Training

This repository focuses on deployment with pretrained models and does not include training instructions. The models were trained for video inpainting, improving results by first completing corrupted optical flows and then using them to complete the frames.

Guide: Running Locally

  1. Clone the Repository: Download the repository from its GitHub page.
  2. Deployment: Run bash deploy.sh in your base directory to set up the models and configuration files.
  3. Run Demos: After deployment, you can run the object removal demo directly.
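After step 2, a quick sanity check is to confirm that deploy.sh placed the expected artifacts. The sketch below assumes the files land in the repository root; adjust the paths if the script uses subdirectories.

```python
import os

# Artifacts deploy.sh is expected to set up (locations are an assumption).
EXPECTED = [
    "lafc.pth.tar",
    "lafc_single.pth.tar",
    "fgt.pth.tar",
    "LAFC_config.yaml",
    "LAFC_single_config.yaml",
    "FGT_config.yaml",
]

def missing_artifacts(root="."):
    """Return the expected files that are not found under root."""
    return [f for f in EXPECTED if not os.path.exists(os.path.join(root, f))]
```

If missing_artifacts() returns an empty list, deployment succeeded and the object removal demo can be run.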

Suggested Cloud GPUs

For running these models efficiently, especially for video inpainting tasks, consider using cloud GPUs like those from AWS EC2, Google Cloud Platform, or Azure.

License

This project is licensed under the MIT License, allowing for open usage and modification within the terms provided.
