genview_pretrained_models

Xiaojie0903

Introduction

The GenView pretrained models are visual encoders trained with self-supervised learning, in which a pretrained generative model is used to improve the quality and diversity of the views the learner sees. The resulting representations are intended for visual tasks such as image classification, multimodal learning, and feature extraction. The approach was introduced in the ECCV 2024 paper "GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning."

Architecture

The GenView models cover both convolutional backbones such as ResNet-50 and transformer-based backbones such as ViT-B. They are trained with established self-supervised learning methods, including SimSiam, MoCo, and BYOL, with a generative model supplying adaptively generated views to improve the learned feature representations.

  • Developed by: Xiaojie Li, Yibo Yang, Xiangtai Li, Jianlong Wu, Yue Yu, Bernard Ghanem, Min Zhang
  • Institutions: Harbin Institute of Technology, Shenzhen; Peng Cheng Laboratory; KAUST; NTU
  • Model Type: Self-supervised learning for vision tasks
  • License: Apache 2.0

Training

The models were evaluated by linear probing on the ImageNet-1K dataset, with Top-1 accuracy as the metric. Reported results for different methods and backbones include:

  • MoCo v2 + GenView ResNet-50 achieved 70.0%
  • BYOL + GenView ResNet-50 reached 73.2%
  • MoCo v3 + GenView ViT-B obtained 77.8%
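To make the metric concrete: Top-1 accuracy is the percentage of samples for which the highest-scoring class equals the ground-truth label. The following is a minimal sketch in plain Python, not the evaluation code used for the numbers above. The function name `top1_accuracy` and the toy scores are illustrative assumptions.

```python
from typing import Sequence


def top1_accuracy(scores: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Return the percentage of samples whose top-scoring class matches the label."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("at least one sample is required")
    correct = 0
    for row, label in zip(scores, labels):
        # Index of the largest score in this row is the predicted class.
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted == label:
            correct += 1
    return 100.0 * correct / len(labels)


# Toy example: 4 samples, 3 classes (made-up scores, not GenView outputs).
toy_scores = [
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.6, 0.3, 0.1],
]
toy_labels = [1, 0, 2, 1]
print(top1_accuracy(toy_scores, toy_labels))  # 3 of 4 correct -> 75.0
```

In a linear probe, the scores would come from a linear classifier trained on top of the frozen pretrained backbone.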

Guide: Running Locally

To run the GenView pretrained models locally, follow these steps:

  1. Install Requirements: Ensure Python and the necessary libraries are installed. To download through the Python API, install the Hugging Face Hub client with pip (pip install huggingface_hub).
  2. Download the Model:
    • Using wget (replace {MODEL_FILE} with the name of the checkpoint file you want):
      wget https://huggingface.co/Xiaojie0903/genview_pretrained_models/resolve/main/{MODEL_FILE}
      
    • Using Hugging Face Python API:
      from huggingface_hub import hf_hub_download
      
      file_path = hf_hub_download(
          repo_id="Xiaojie0903/genview_pretrained_models",
          filename="mocov3_resnet50_8xb512-amp-coslr-100e_in1k_genview.pth"
      )
      print(f"Model downloaded to {file_path}")
      
  3. Run the Model: Load the downloaded checkpoint in your deep learning framework (the .pth extension conventionally denotes a PyTorch checkpoint) and run it on your data.
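Step 3 depends on how the checkpoint is organized, which the model card does not document. As a hedged sketch, the helper below assumes a common layout in which the weights sit under a "state_dict" key and backbone parameters carry a "backbone." prefix. Both the layout and the helper name `extract_backbone_weights` are assumptions, so inspect the keys of the real file before relying on it. The example uses a stand-in dictionary so that it runs without any deep learning framework.

```python
from typing import Any, Dict, Mapping


def extract_backbone_weights(
    checkpoint: Mapping[str, Any], prefix: str = "backbone."
) -> Dict[str, Any]:
    """Return backbone weights with `prefix` stripped from their keys.

    Assumption: the checkpoint is either a flat name -> tensor mapping or
    nests one under a "state_dict" key, and backbone entries start with
    `prefix`. Entries without the prefix (e.g. projection heads) are dropped.
    """
    state_dict = checkpoint.get("state_dict", checkpoint)
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Stand-in checkpoint with placeholder strings instead of tensors.
fake_checkpoint = {
    "meta": {"epoch": 100},
    "state_dict": {
        "backbone.conv1.weight": "w0",
        "backbone.layer1.0.conv1.weight": "w1",
        "neck.fc.weight": "w2",
    },
}
weights = extract_backbone_weights(fake_checkpoint)
print(sorted(weights))  # ['conv1.weight', 'layer1.0.conv1.weight']
```

With PyTorch installed, the real file could be read with `torch.load(file_path, map_location="cpu")` and passed to this helper. The result could then be given to a matching backbone's `load_state_dict(..., strict=False)`, and the missing and unexpected keys it returns show whether the parameter names actually line up.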

For efficient training and inference, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure.

License

The GenView pretrained models are licensed under the Apache 2.0 License, which allows for wide use and distribution with proper attribution.
