IDM-VTON

yisol

Introduction

IDM-VTON is an implementation of improved diffusion models for authentic virtual try-on in diverse, real-world settings. The project is based on the research paper "Improving Diffusion Models for Authentic Virtual Try-on in the Wild" and builds on the Stable Diffusion XL inpainting pipeline.

Architecture

The model uses stable-diffusion-xl-1.0-inpainting-0.1 as its base and applies inpainting techniques to the virtual try-on task, an image-to-image translation problem. The architecture leverages libraries such as Diffusers and ONNX, and it supports Safetensors for efficient model weight handling.
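
As an illustration of the base model named above, the following is a minimal sketch that loads stable-diffusion-xl-1.0-inpainting-0.1 through the Diffusers library. It is not the IDM-VTON inference code itself, which builds additional try-on conditioning on top of this base pipeline.

    # Minimal sketch: loading the SDXL inpainting base model with Diffusers.
    # This loads only the base pipeline, not the full IDM-VTON try-on model.
    import torch
    from diffusers import AutoPipelineForInpainting

    pipe = AutoPipelineForInpainting.from_pretrained(
        "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
        torch_dtype=torch.float16,  # Safetensors weights are used when available
    ).to("cuda")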

Training

Training code has not yet been released and remains part of ongoing development. The project acknowledges existing works such as OOTDiffusion, DCI-VTON, and IP-Adapter, which inform its training and inference pipelines.

Guide: Running Locally

  1. Clone the Repository:

    git clone https://github.com/yisol/IDM-VTON.git
    cd IDM-VTON
    
  2. Install Dependencies:
    Ensure Python is installed and, ideally, create a virtual environment for isolated package management, then install the requirements:

    pip install -r requirements.txt
    
  3. Run the Demo:
    Follow the instructions in the repository to run the demo, which typically requires GPU support; a hedged sketch of the underlying inpainting call (not the demo script itself) is shown after this list.

  4. GPU Recommendation:
    For optimal performance, especially for training and large-scale inference, it is recommended to use cloud GPUs such as those provided by AWS, Google Cloud, or Azure.
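
For reference, the sketch below checks for a GPU and runs a plain inpainting call on the SDXL base pipeline, illustrating the image-plus-mask interface that the try-on demo builds on. The file names are placeholders, and this is not the repository's demo script.

    # Illustrative sketch only: a plain SDXL inpainting call, not the IDM-VTON demo.
    # "person.jpg" and "garment_mask.png" are placeholder file names.
    import torch
    from diffusers import AutoPipelineForInpainting
    from diffusers.utils import load_image

    device = "cuda" if torch.cuda.is_available() else "cpu"  # a GPU is strongly recommended
    dtype = torch.float16 if device == "cuda" else torch.float32

    pipe = AutoPipelineForInpainting.from_pretrained(
        "diffusers/stable-diffusion-xl-1.0-inpainting-0.1",
        torch_dtype=dtype,
    ).to(device)

    person = load_image("person.jpg").resize((1024, 1024))      # photo of the person
    mask = load_image("garment_mask.png").resize((1024, 1024))  # white = region to repaint

    result = pipe(
        prompt="a person wearing a plain white t-shirt",
        image=person,
        mask_image=mask,
        num_inference_steps=30,
        strength=0.99,
    ).images[0]
    result.save("tryon_preview.png")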

License

The IDM-VTON code and models are licensed under the Creative Commons BY-NC-SA 4.0 license, which allows sharing and adaptation with attribution but prohibits commercial use. Full license terms are available at https://creativecommons.org/licenses/by-nc-sa/4.0/.
