maskformer swin large ade

facebook

Introduction

This MaskFormer checkpoint performs semantic segmentation and uses a large Swin Transformer backbone. MaskFormer was introduced in the research paper "Per-Pixel Classification is Not All You Need for Semantic Segmentation." The checkpoint is trained on the ADE20k dataset. MaskFormer takes a unified approach to instance, semantic, and panoptic segmentation: it predicts a set of masks and a label for each mask. This model card was prepared by the Hugging Face team.

Architecture

MaskFormer treats all three segmentation tasks (instance, semantic, and panoptic) as the same problem: it predicts a set of masks together with a class label for each mask. This checkpoint is built on a Swin Transformer backbone, which produces hierarchical feature maps at several resolutions.
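To make the "masks plus labels" idea concrete, the sketch below shows one way a set of per-query predictions can be turned into a semantic map: each query's class probabilities (with the extra "no object" class dropped) are weighted by its per-pixel mask probability, and every pixel takes the class with the highest summed score. This is a dependency-free toy under stated assumptions, not the library's implementation: the function name `semantic_map`, the nested-list layout, and the toy logits are illustrative, and a real pipeline works on batched tensors and resizes the masks to the image size.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(value):
    return 1.0 / (1.0 + math.exp(-value))


def semantic_map(class_logits, mask_logits):
    """Combine per-query class and mask logits into a map of class ids.

    class_logits: [num_queries][num_classes + 1], last entry = "no object"
    mask_logits:  [num_queries][height][width]
    """
    num_queries = len(class_logits)
    num_classes = len(class_logits[0]) - 1
    # Class probabilities per query, with the "no object" column removed.
    class_probs = [softmax(row)[:-1] for row in class_logits]
    height, width = len(mask_logits[0]), len(mask_logits[0][0])

    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Score of class c at this pixel: sum over queries of
            # P(class c | query) * P(pixel belongs to query's mask).
            scores = [
                sum(class_probs[q][c] * sigmoid(mask_logits[q][y][x])
                    for q in range(num_queries))
                for c in range(num_classes)
            ]
            row.append(max(range(num_classes), key=scores.__getitem__))
        result.append(row)
    return result


# Toy input: query 0 predicts class 0 on the left column,
# query 1 predicts class 1 on the right column.
toy_class_logits = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
toy_mask_logits = [
    [[5.0, -5.0], [5.0, -5.0]],
    [[-5.0, 5.0], [-5.0, 5.0]],
]
print(semantic_map(toy_class_logits, toy_mask_logits))
```

Instance and panoptic outputs can be derived from the same two sets of predictions, which is why one model can serve all three tasks.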

Training

This checkpoint is trained on the ADE20k dataset for semantic segmentation. Because every task is framed as predicting masks and their labels, the same training paradigm applies to instance, semantic, and panoptic segmentation.

Guide: Running Locally

To run the model locally, install the transformers, torch, Pillow, and requests packages, then run the following script:

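from transformers import MaskFormerImageProcessor, MaskFormerForInstanceSegmentation
from PIL import Image
import requests

# Download a sample ADE20k image
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image into PyTorch tensors
processor = MaskFormerImageProcessor.from_pretrained("facebook/maskformer-swin-large-ade")
inputs = processor(images=image, return_tensors="pt")

# Load the checkpoint and run a forward pass
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-large-ade")
outputs = model(**inputs)

# class_queries_logits has shape (batch_size, num_queries, num_labels + 1)
# masks_queries_logits has shape (batch_size, num_queries, height, width)
class_queries_logits = outputs.class_queries_logits
masks_queries_logits = outputs.masks_queries_logits

# Combine both outputs into a single (height, width) map of class ids.
# PIL's image.size is (width, height), so it is reversed for target_sizes.
predicted_semantic_map = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]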
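
The resulting map holds one class id per pixel, which is hard to read directly. As a small follow-up, the sketch below summarizes such a map as the share of pixels per label. The helper name `label_pixel_shares` is hypothetical, and the toy map and label names are made up for illustration; they are not the checkpoint's actual label ids.

```python
from collections import Counter


def label_pixel_shares(semantic_rows, id2label):
    """Return (label, fraction of pixels) pairs, largest share first.

    semantic_rows: nested lists of integer class ids, one list per row
    id2label:      mapping from class id to a readable label
    """
    counts = Counter(class_id for row in semantic_rows for class_id in row)
    total = sum(counts.values())
    return [
        (id2label.get(class_id, str(class_id)), count / total)
        for class_id, count in counts.most_common()
    ]


# Toy 3x3 map and made-up labels, for illustration only.
toy_map = [[0, 0, 1], [0, 1, 1], [0, 0, 2]]
toy_labels = {0: "wall", 1: "sky", 2: "tree"}

for label, share in label_pixel_shares(toy_map, toy_labels):
    print(f"{label}: {share:.0%}")
```

With the real output, a call along the lines of label_pixel_shares(predicted_semantic_map.tolist(), model.config.id2label) should give the same kind of summary, assuming the map is a 2D tensor of class ids as returned above.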

Cloud GPUs

A GPU speeds up inference considerably for a model of this size. If no local GPU is available, cloud services such as AWS, Google Cloud, or Azure offer GPU instances on demand.
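
Whether the script runs on a cloud GPU or a local machine, it needs to choose a device at startup. The sketch below is one hedged way to do that: the helper name `pick_device` is hypothetical, and it falls back to the CPU when PyTorch is missing or cannot see a CUDA GPU.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so there is no GPU to use.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

In the inference script, the model and the processed inputs would then both be moved to that device (for example with model.to(device) and inputs.to(device)) before the forward pass.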

License

The model is released under an "other" license, indicating specific usage terms that may differ from standard licenses. Please refer to the model repository for detailed license information.
