Introduction

ResNet-152 is a deep residual network pre-trained for image classification on the ImageNet-1k dataset at a resolution of 224x224. It is a ResNet v1.5 variant, which makes a small architectural change to improve accuracy over the original v1 model.

Architecture

ResNet (Residual Network) is built on residual learning and skip connections, which make it practical to train much deeper models. ResNet v1.5 differs from the original v1 in where the bottleneck blocks downsample: in blocks that reduce spatial resolution, v1 applies the stride of 2 in the first 1x1 convolution, whereas v1.5 applies it in the 3x3 convolution. According to Nvidia, this makes v1.5 slightly more accurate than v1 at the cost of roughly 5% lower throughput (images per second).
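To make the stride placement concrete, the following is a minimal sketch that tracks only the spatial size of the feature map through one downsampling bottleneck block. It is not the library's implementation: the helper names `conv_out` and `bottleneck_sizes` and the 56x56 input size are assumptions chosen for illustration, and the padding values follow the usual convention (0 for 1x1, 1 for 3x3).

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution on a square input (no dilation)."""
    return (size + 2 * padding - kernel) // stride + 1


def bottleneck_sizes(size, stride_in_3x3):
    """Feature-map size after the 1x1, 3x3, and final 1x1 convolutions.

    stride_in_3x3=False mimics v1 (stride 2 in the first 1x1 convolution).
    stride_in_3x3=True mimics v1.5 (stride 2 in the 3x3 convolution).
    """
    first_stride, middle_stride = (1, 2) if stride_in_3x3 else (2, 1)
    after_first = conv_out(size, kernel=1, stride=first_stride, padding=0)
    after_middle = conv_out(after_first, kernel=3, stride=middle_stride, padding=1)
    after_last = conv_out(after_middle, kernel=1, stride=1, padding=0)
    return [after_first, after_middle, after_last]


# Hypothetical 56x56 input feature map, for illustration only
v1_sizes = bottleneck_sizes(56, stride_in_3x3=False)
v15_sizes = bottleneck_sizes(56, stride_in_3x3=True)
print("v1  :", v1_sizes)
print("v1.5:", v15_sizes)
```

Both variants end at the same output size, but in the v1.5 arrangement the first 1x1 convolution runs at full resolution and the 3x3 convolution sees every input position, whereas a strided 1x1 convolution skips three out of four positions. That extra work is consistent with the small throughput cost noted above.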

Training

The model is pre-trained on the ImageNet-1k dataset and classifies each image into one of the 1,000 ImageNet classes. Fine-tuned versions for specific tasks can be found on the model hub.
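A classifier of this kind produces one score (logit) per class, and the predicted class is the one with the highest score. The sketch below shows how logits can be turned into probabilities and a ranked top-k list. The helper names `softmax` and `top_k`, the four-element logit vector, and the toy label mapping are all hypothetical stand-ins; the real model emits 1,000 logits and ships its own index-to-label mapping.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most probable (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Toy 4-class example; values and labels are made up for illustration
toy_logits = [0.2, 2.5, -1.0, 1.1]
toy_labels = {0: "class_a", 1: "class_b", 2: "class_c", 3: "class_d"}

predictions = top_k(toy_logits, toy_labels, k=2)
for label, prob in predictions:
    print(f"{label}: {prob:.3f}")
```

Reporting the top few classes rather than only the single best one is often useful with ImageNet labels, since several classes can be visually close.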

Guide: Running Locally

To use the ResNet-152 model for image classification:

  1. Install Dependencies: Ensure you have the transformers, torch, and datasets libraries installed.
  2. Load a Dataset: Use the datasets library to load an image dataset.
  3. Feature Extraction: Use AutoFeatureExtractor to process the image.
  4. Load the Model: Initialize the ResNetForImageClassification model with pre-trained weights.
  5. Run Inference: Perform inference and obtain the predicted label.

Example code snippet:

from transformers import AutoFeatureExtractor, ResNetForImageClassification
import torch
from datasets import load_dataset

# Load a sample image from a small test dataset
dataset = load_dataset("huggingface/cats-image")
image = dataset["test"]["image"][0]

# Load the preprocessor and the pre-trained classification model
feature_extractor = AutoFeatureExtractor.from_pretrained("microsoft/resnet-152")
model = ResNetForImageClassification.from_pretrained("microsoft/resnet-152")

# Preprocess the image into PyTorch tensors
inputs = feature_extractor(image, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits

# Map the highest-scoring class index to its ImageNet label
predicted_label = logits.argmax(-1).item()
print(model.config.id2label[predicted_label])

For faster performance, consider using cloud GPU services such as AWS EC2, Google Cloud, or Azure.
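When a GPU is available, the model and its inputs both need to be on the same device. The following is a minimal sketch of that pattern, assuming PyTorch is installed. The helper names `pick_device` and `to_device` are hypothetical, and a zero tensor of shape (1, 3, 224, 224) stands in for a real preprocessed image.

```python
import torch


def pick_device():
    """Use the GPU when PyTorch can see one, otherwise fall back to the CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def to_device(batch, device):
    """Move every tensor in a dict of model inputs to the given device."""
    return {name: tensor.to(device) for name, tensor in batch.items()}


device = pick_device()

# Placeholder input standing in for the preprocessor's output
batch = to_device({"pixel_values": torch.zeros(1, 3, 224, 224)}, device)
print(device, batch["pixel_values"].shape)
```

With the example above, the same idea would apply by calling `model.to(device)` once after loading and passing the moved inputs to the model.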

License

ResNet-152 is released under the Apache 2.0 License, which permits use, modification, and distribution subject to the license's terms and conditions.
