catvton flux alpha

xiaozaa

Introduction

CATVTON-Flux is a virtual try-on model that combines CATVTON ("Concatenation Is All You Need for Virtual Try-On") with FLUX fill inpainting to transfer garments realistically onto images of people. Its reported results on the VITON-HD dataset are state-of-the-art for virtual garment fitting.

Architecture

The model combines the CATVTON framework with the FLUX fill inpainting model. It uses a transformer-based image-to-image architecture designed for virtual try-on, is built on the diffusers library, and reuses pre-trained components from "black-forest-labs/FLUX.1-Fill-dev".

Training

Training Data

The model is trained on the VITON-HD dataset, a collection of high-resolution images for virtual try-on.

Training Procedure

Training fine-tunes the FLUX.1-Fill-dev model to improve its performance on virtual try-on.

Evaluation

The model achieves an FID of about 5.59 (reported as 5.593255043029785; lower is better), which is reported as state-of-the-art in its category.
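For context on the metric, FID (Fréchet Inception Distance) fits a Gaussian to image features from real images and another to features from generated images, then measures the Fréchet distance between the two. The sketch below shows only that final distance computation in NumPy. It is an illustration, not the evaluation script used for this model: the function names `gaussian_stats` and `frechet_distance` are hypothetical, and the Inception feature extraction that normally precedes this step is omitted.

```python
import numpy as np


def gaussian_stats(features):
    """Mean vector and covariance matrix of an (n_samples, n_dims) feature array."""
    features = np.asarray(features, dtype=np.float64)
    return features.mean(axis=0), np.cov(features, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between the Gaussians N(mu1, sigma1) and N(mu2, sigma2)."""
    mu1 = np.asarray(mu1, dtype=np.float64)
    mu2 = np.asarray(mu2, dtype=np.float64)
    sigma1 = np.asarray(sigma1, dtype=np.float64)
    sigma2 = np.asarray(sigma2, dtype=np.float64)

    diff = mu1 - mu2
    # Tr((sigma1 @ sigma2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of sigma1 @ sigma2. For positive semi-definite covariances
    # these eigenvalues are real and non-negative up to numerical noise,
    # so the real part is taken and clipped at zero.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    return float(diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * trace_sqrt)
```

Given two feature arrays, the score would be `frechet_distance(*gaussian_stats(real_feats), *gaussian_stats(fake_feats))`, where both array names are placeholders.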

Guide: Running Locally

To use the CATVTON-Flux model locally, follow these steps:

  1. Install the required libraries: PyTorch and diffusers.
  2. Load the pre-trained transformer model:
    import torch
    from diffusers import FluxTransformer2DModel

    # Fine-tuned try-on transformer weights
    transformer = FluxTransformer2DModel.from_pretrained(
        "xiaozaa/catvton-flux-alpha",
        torch_dtype=torch.bfloat16
    )

  3. Initialize the pipeline:
    from diffusers import FluxFillPipeline

    # Load the remaining components from the base repo and swap in the fine-tuned transformer
    pipe = FluxFillPipeline.from_pretrained(
        "black-forest-labs/FLUX.1-dev",
        transformer=transformer,
        torch_dtype=torch.bfloat16
    ).to("cuda")

  4. Run the model with a person image, a person mask, a garment image, and an optional random seed.
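
Step 4 can be sketched as follows. This is an illustration under stated assumptions, not the author's inference script. It assumes a CatVTON-style layout in which the garment and person images sit side by side on one canvas and only the masked region of the person is repainted. The helper names (`build_tryon_inputs`, `run_tryon`), the 768x1024 working size, the prompt text, and the step and guidance values are all hypothetical choices; `pipe` is the pipeline created in step 3.

```python
from PIL import Image


def build_tryon_inputs(person, mask, garment, size=(768, 1024)):
    """Place garment (left) and person (right) on one canvas, with a matching mask.

    Assumed layout, not confirmed by the model card. White mask pixels mark
    the region to repaint; the garment half stays black so it is preserved.
    """
    width, height = size
    person = person.convert("RGB").resize(size)
    garment = garment.convert("RGB").resize(size)
    mask = mask.convert("L").resize(size)

    canvas = Image.new("RGB", (2 * width, height))
    canvas.paste(garment, (0, 0))
    canvas.paste(person, (width, 0))

    canvas_mask = Image.new("L", (2 * width, height), 0)
    canvas_mask.paste(mask, (width, 0))
    return canvas, canvas_mask


def run_tryon(pipe, person, mask, garment, seed=None, size=(768, 1024)):
    """Run the fill pipeline and return the right (try-on) half of the output."""
    import torch  # imported lazily so build_tryon_inputs works without PyTorch

    canvas, canvas_mask = build_tryon_inputs(person, mask, garment, size)
    generator = None
    if seed is not None:
        generator = torch.Generator(device="cuda").manual_seed(seed)

    result = pipe(
        prompt="A garment product photo next to a model wearing the same garment",  # hypothetical prompt
        image=canvas,
        mask_image=canvas_mask,
        height=size[1],
        width=2 * size[0],
        num_inference_steps=50,  # hypothetical value
        guidance_scale=30.0,     # hypothetical value
        generator=generator,
    ).images[0]

    # Keep only the person half of the side-by-side output.
    return result.crop((size[0], 0, 2 * size[0], size[1]))
```

A call might look like `run_tryon(pipe, Image.open("person.png"), Image.open("mask.png"), Image.open("garment.png"), seed=42)`, where the file names are placeholders.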

The pipeline above runs on a CUDA GPU. If no suitable GPU is available locally, a cloud GPU instance (for example on AWS EC2, Google Cloud, or Azure) can be used instead.

License

The model is licensed under CC BY-NC 2.0, which permits non-commercial use with appropriate credit.
