genview_pretrained_models
Xiaojie0903
Introduction
The GenView pretrained models provide visual representations for tasks such as image classification, multimodal learning, and feature extraction. They use pretrained generative models to improve the quality and diversity of the views used in self-supervised learning, as introduced in the ECCV 2024 paper "GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning."
Architecture
The GenView models include both convolutional architectures such as ResNet-50 and transformer-based architectures such as ViT-B. They pair self-supervised learning methods, including SimSiam, MoCo, and BYOL, with adaptive view generation from a generative model to improve the learned feature representations.
- Developed by: Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang
- Institutions: Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory; KAUST; NTU
- Model Type: Self-supervised learning for vision tasks
- License: Apache 2.0
Training
The models were evaluated with linear probing on the ImageNet-1K dataset, using Top-1 accuracy as the metric. Results vary by method and backbone:
- MoCo v2 + GenView ResNet-50 achieved 70.0%
- BYOL + GenView ResNet-50 reached 73.2%
- MoCo v3 + GenView ViT-B obtained 77.8%
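For readers unfamiliar with the metric, the following is a minimal sketch of how Top-1 accuracy is computed: the share of samples whose highest-scoring class matches the ground-truth label. The function name `top1_accuracy` and the toy scores are illustrative assumptions, not part of the GenView code base, and the numbers are unrelated to the results listed above.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose top-scoring class equals the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy data (hypothetical): 4 samples, 3 classes.
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.5, 0.3],
]
toy_labels = [1, 0, 1, 1]
print(top1_accuracy(toy_scores, toy_labels))
```

In a linear probe setting, the scores would typically come from a linear classifier trained on top of the frozen pretrained backbone.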
Guide: Running Locally
To run the GenView pretrained models locally, follow these steps:
- Install Requirements: Ensure you have Python and the necessary libraries installed. Use pip to install the Hugging Face Hub client (huggingface_hub) if needed.
- Download the Model:
  - Using wget:
        wget https://huggingface.co/Xiaojie0903/genview_pretrained_models/resolve/main/{MODEL_FILE}
  - Using the Hugging Face Python API:
        from huggingface_hub import hf_hub_download
        file_path = hf_hub_download(
            repo_id="Xiaojie0903/genview_pretrained_models",
            filename="mocov3_resnet50_8xb512-amp-coslr-100e_in1k_genview.pth",
        )
        print(f"Model downloaded to {file_path}")
- Run the Model: Load the model using your preferred deep learning framework and run it on your data.
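As one possible way to approach the "Run the Model" step with PyTorch, the sketch below unwraps a downloaded checkpoint and keeps only the backbone weights. It rests on assumptions that are not confirmed by the source: that the `.pth` file is a dictionary which may nest its weights under a `state_dict` key, and that backbone parameters carry a `backbone.` prefix. The helper names `extract_weights` and `load_backbone_weights` are hypothetical. Inspect the keys of your own checkpoint before relying on either assumption.

```python
from typing import Any, Dict, Mapping


def extract_weights(checkpoint: Mapping[str, Any], prefix: str = "backbone.") -> Dict[str, Any]:
    """Keep only entries whose key starts with `prefix`, with the prefix removed."""
    # Assumption: weights may be nested under "state_dict"; if that key is
    # absent, the checkpoint itself is treated as the weight mapping.
    state = checkpoint.get("state_dict", checkpoint)
    return {
        key[len(prefix):]: value
        for key, value in state.items()
        if key.startswith(prefix)
    }


def load_backbone_weights(file_path: str, prefix: str = "backbone.") -> Dict[str, Any]:
    """Load a .pth file on CPU and return the prefix-stripped backbone weights."""
    # Imported lazily so extract_weights can be used without PyTorch installed.
    import torch

    # Depending on the PyTorch version and the checkpoint contents,
    # torch.load may need additional arguments.
    checkpoint = torch.load(file_path, map_location="cpu")
    return extract_weights(checkpoint, prefix)


# Toy illustration with plain values standing in for tensors (hypothetical keys).
toy_checkpoint = {
    "state_dict": {
        "backbone.conv1.weight": 1,
        "neck.fc.weight": 2,
    }
}
print(extract_weights(toy_checkpoint))
```

The returned dictionary can then be passed to the `load_state_dict` method of a matching backbone, for example a ResNet-50 defined in your framework of choice. Passing `strict=False` on a first attempt helps reveal any key mismatches.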
For efficient training and inference, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.
License
The GenView pretrained models are licensed under the Apache 2.0 License, which allows for wide use and distribution with proper attribution.