hushell/pmf_metadataset_dino
PMF Meta-Dataset DINO
Introduction
The PMF Meta-Dataset DINO project provides model checkpoints for meta-testing DINO-based models on the Meta-Dataset benchmark. It includes ready-made setups for evaluating the checkpoints across the domains within the Meta-Dataset.
Architecture
The models use the DINO architecture; the commands in this guide use the dino_small_patch16 variant. Peak VRAM usage is approximately 32GB for DINO-small and 42GB for DINO-base.
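As a rough pre-flight check, the VRAM figures above can be compared against the GPUs on a machine. The sketch below is illustrative only: it assumes the quoted peaks apply per GPU, which the source does not state, and the helper names are hypothetical. Reported device memory is usually slightly below the marketed size, so treat a borderline result with caution.

```python
# Peak VRAM figures quoted in this document, in GB.
PEAK_VRAM_GB = {"DINO-small": 32, "DINO-base": 42}


def gpus_with_enough_memory(variant, gpu_memory_gb):
    """Return indices of GPUs whose total memory meets the quoted peak.

    Assumption: the quoted peak is a per-GPU requirement.
    """
    needed = PEAK_VRAM_GB[variant]
    return [i for i, total in enumerate(gpu_memory_gb) if total >= needed]


def local_gpu_memory_gb():
    """Total memory of each visible CUDA device, or [] without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_properties(i).total_memory / 1024**3
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    memory = local_gpu_memory_gb()
    for variant in PEAK_VRAM_GB:
        print(variant, "fits on GPUs:", gpus_with_enough_memory(variant, memory))
```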
Training
Meta-testing runs in a distributed setup with 8 processes per node. Two checkpoints are highlighted: one trained on the full Meta-Dataset and one trained only on the ImageNet domain of the Meta-Dataset. At test time the model is fine-tuned with 100 adaptation steps at a learning rate of 0.0001 (--ada_steps 100 --ada_lr 0.0001) and evaluated across multiple domains.
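The test-time fine-tuning controlled by --ada_steps and --ada_lr can be pictured as a fixed number of gradient steps on a task's labelled examples, starting from the stored checkpoint. The toy sketch below is not the repository's implementation: a single scalar weight stands in for the DINO backbone, and restarting every task from an unmodified copy of the checkpoint is an assumption made for illustration.

```python
from copy import deepcopy


def finetune_on_task(weights, support, steps=100, lr=1e-4):
    """Return a task-adapted copy of `weights`; the input is left unchanged."""
    adapted = deepcopy(weights)  # assumption: every task starts from the checkpoint
    for _ in range(steps):
        # Gradient of the mean squared error for the toy model y = scale * x.
        grad = sum(
            2 * (adapted["scale"] * x - y) * x for x, y in support
        ) / len(support)
        adapted["scale"] -= lr * grad
    return adapted


def mse(weights, data):
    """Mean squared error of the toy model on (x, y) pairs."""
    return sum((weights["scale"] * x - y) ** 2 for x, y in data) / len(data)


checkpoint = {"scale": 1.0}                      # stands in for best.pth
support = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]   # labelled examples of one task
query = [(4.0, 8.0)]                             # held-out examples of that task

adapted = finetune_on_task(checkpoint, support, steps=100, lr=1e-4)
print("query error before:", mse(checkpoint, query))
print("query error after: ", mse(adapted, query))
```

With a small learning rate the weights move only slightly in 100 steps, which is why the step count and learning rate are tuned together.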
Guide: Running Locally
- Prerequisites: A Python environment with PyTorch installed. The commands in this guide launch 8 processes per node through torch.distributed.launch, so a multi-GPU machine is expected.
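- Setup:
  - Clone the repository containing the code and datasets.
  - Make sure the dataset directory passed to --data-path (../../datasets/meta_dataset in this guide) and the checkpoint passed to --resume exist and are readable.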
- Run Meta-Testing:
  - For full Meta-Dataset:
    python -m torch.distributed.launch --nproc_per_node=8 --use_env test_meta_dataset.py --data-path ../../datasets/meta_dataset --dataset meta_dataset --arch dino_small_patch16 --deploy finetune --output outputs/md_full_dinosmall --resume md_full_128x128_dinosmall_fp16_lr5e-5/best.pth --dist-eval --ada_steps 100 --ada_lr 0.0001
  - For ImageNet domain:
    python -m torch.distributed.launch --nproc_per_node=8 --use_env test_meta_dataset.py --data-path ../../datasets/meta_dataset --dataset meta_dataset --arch dino_small_patch16 --deploy finetune --output outputs/md_inet_dinosmall_6gpus --resume pmf_metadataset_dino/md_inet_128x128_dinosmall_fp16_lr5e-5/best.pth --dist-eval --ada_steps 100 --ada_lr 0.0001
- Consider Cloud GPUs: Given the high VRAM requirements (roughly 32GB for DINO-small and 42GB for DINO-base), cloud GPUs may be more practical than local hardware.
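The two meta-testing commands in this guide differ only in their --output and --resume values, so they can be generated from one template. The following is a minimal sketch using only flags that appear in those commands; the helper names build_meta_test_command and missing_inputs are hypothetical and not part of the repository.

```python
import shlex
from pathlib import Path


def build_meta_test_command(output_dir, checkpoint, nproc_per_node=8,
                            data_path="../../datasets/meta_dataset",
                            arch="dino_small_patch16",
                            ada_steps=100, ada_lr=0.0001):
    """Assemble the launch command as an argument list."""
    return [
        "python", "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}", "--use_env",
        "test_meta_dataset.py",
        "--data-path", data_path,
        "--dataset", "meta_dataset",
        "--arch", arch,
        "--deploy", "finetune",
        "--output", output_dir,
        "--resume", checkpoint,
        "--dist-eval",
        "--ada_steps", str(ada_steps),
        "--ada_lr", str(ada_lr),
    ]


def missing_inputs(*paths):
    """Return the given paths that do not exist, for a check before launching."""
    return [p for p in paths if not Path(p).exists()]


full_cmd = build_meta_test_command(
    "outputs/md_full_dinosmall",
    "md_full_128x128_dinosmall_fp16_lr5e-5/best.pth",
)
inet_cmd = build_meta_test_command(
    "outputs/md_inet_dinosmall_6gpus",
    "pmf_metadataset_dino/md_inet_128x128_dinosmall_fp16_lr5e-5/best.pth",
)

# Print shell-ready strings; pass the lists to subprocess.run to execute them.
print(shlex.join(full_cmd))
print(shlex.join(inet_cmd))
```

Keeping the command as a list avoids shell-quoting mistakes when a path contains spaces, and calling missing_inputs on the dataset and checkpoint paths first gives an early error instead of a failed distributed launch.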
License
The project files and models are distributed under a license specified in the repository. Users should review the license to ensure compliance with usage terms.