maskformer swin large ade
facebook
Introduction
MaskFormer is a segmentation model, paired here with a Swin Transformer backbone. It was introduced in the paper "Per-Pixel Classification is Not All You Need for Semantic Segmentation." This checkpoint is trained on the ADE20k dataset for semantic segmentation. MaskFormer addresses instance, semantic, and panoptic segmentation with a single paradigm: it predicts a set of masks and a class label for each mask. This model card was prepared by the Hugging Face team.
Architecture
MaskFormer treats all three segmentation tasks (instance, semantic, and panoptic) as the same problem: predict a set of masks, then assign one class label to each mask. This checkpoint builds on a Swin Transformer backbone, which supplies hierarchical feature maps to the rest of the model.
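The step from a set of masks and labels to a per-pixel semantic map can be sketched in a few lines. The snippet below is an illustrative NumPy sketch, not the library's implementation: it assumes per-query class logits whose last column is a "no object" class and per-query mask logits, and the function names and toy values are made up for the example.

```python
import numpy as np


def softmax(x: np.ndarray, axis: int = -1) -> np.ndarray:
    """Numerically stable softmax along `axis`."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def sigmoid(x: np.ndarray) -> np.ndarray:
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def semantic_map_from_queries(class_logits: np.ndarray, mask_logits: np.ndarray) -> np.ndarray:
    """Combine per-query class and mask predictions into a per-pixel label map.

    class_logits: (num_queries, num_classes + 1); the last column is assumed
                  to be the "no object" class.
    mask_logits:  (num_queries, height, width).
    Returns an integer array of shape (height, width).
    """
    class_probs = softmax(class_logits, axis=-1)[:, :-1]  # drop "no object"
    mask_probs = sigmoid(mask_logits)
    # For each class, sum every query's mask weighted by that query's class probability.
    per_class_scores = np.einsum("qc,qhw->chw", class_probs, mask_probs)
    return per_class_scores.argmax(axis=0)


# Toy example (hypothetical values): 2 queries, 2 real classes, a 2x2 image.
class_logits = np.array([[5.0, 0.0, 0.0],   # query 0 is confident about class 0
                         [0.0, 5.0, 0.0]])  # query 1 is confident about class 1
mask_logits = np.array([[[4.0, 4.0], [-4.0, -4.0]],   # query 0 covers the top row
                        [[-4.0, -4.0], [4.0, 4.0]]])  # query 1 covers the bottom row
semantic_map = semantic_map_from_queries(class_logits, mask_logits)
print(semantic_map)
```

In this toy case the top row is assigned class 0 and the bottom row class 1, because each query's mask contributes mostly to the class that query is confident about.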
Training
This checkpoint is trained on the ADE20k dataset for semantic segmentation. Because the model learns to predict masks and labels instead of classifying each pixel directly, the same training paradigm applies uniformly to the other segmentation tasks.
Guide: Running Locally
To run MaskFormer locally, install the packages the script imports (transformers with PyTorch, Pillow, and requests), then run the following:
from transformers import MaskFormerImageProcessor, MaskFormerForInstanceSegmentation
from PIL import Image
import requests

# Download a sample image from the ADE20k validation set
url = "https://huggingface.co/datasets/hf-internal-testing/fixtures_ade20k/resolve/main/ADE_val_00000001.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Preprocess the image into model-ready tensors
processor = MaskFormerImageProcessor.from_pretrained("facebook/maskformer-swin-large-ade")
inputs = processor(images=image, return_tensors="pt")

# Load the model and run a forward pass
model = MaskFormerForInstanceSegmentation.from_pretrained("facebook/maskformer-swin-large-ade")
outputs = model(**inputs)

# class_queries_logits has shape (batch_size, num_queries, num_labels + 1)
# masks_queries_logits has shape (batch_size, num_queries, height, width)
class_queries_logits = outputs.class_queries_logits
masks_queries_logits = outputs.masks_queries_logits

# Build a (height, width) semantic map; PIL's size is (width, height), so reverse it
predicted_semantic_map = processor.post_process_semantic_segmentation(outputs, target_sizes=[image.size[::-1]])[0]
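Once a semantic map is available, a quick way to check it is to count how many pixels fall into each class. The helper below is a minimal sketch using toy data so it runs on its own; `summarize_semantic_map` and the toy label names are hypothetical and not part of the transformers API.

```python
import numpy as np


def summarize_semantic_map(semantic_map, id2label):
    """Return {label_name: pixel_count} for every class id present in the map."""
    class_ids, counts = np.unique(np.asarray(semantic_map), return_counts=True)
    return {
        id2label.get(int(class_id), f"class_{int(class_id)}"): int(count)
        for class_id, count in zip(class_ids, counts)
    }


# Toy stand-ins (hypothetical values) so the example is self-contained.
toy_map = np.array([[0, 0, 2],
                    [0, 2, 2]])
toy_id2label = {0: "wall", 2: "sky"}
summary = summarize_semantic_map(toy_map, toy_id2label)
print(summary)
```

With the real model, the same helper should accept `predicted_semantic_map.cpu().numpy()` together with `model.config.id2label`, assuming the map is returned as a PyTorch tensor as in the script above.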
Cloud GPUs
Inference is typically much faster on a GPU. If no local GPU is available, cloud providers such as AWS, Google Cloud, or Azure offer scalable GPU instances suitable for running large models like MaskFormer.
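Whichever provider is used, the code change is usually small: pick a device and move the model and its inputs onto it. The sketch below assumes PyTorch is installed and uses a dummy tensor in place of real processor output; `pick_device` and `to_device` are hypothetical helper names.

```python
import torch


def pick_device() -> torch.device:
    """Prefer a CUDA GPU when PyTorch can see one, otherwise fall back to CPU."""
    return torch.device("cuda" if torch.cuda.is_available() else "cpu")


def to_device(batch: dict, device: torch.device) -> dict:
    """Move every tensor in a dict of model inputs to `device`."""
    return {
        key: value.to(device) if torch.is_tensor(value) else value
        for key, value in batch.items()
    }


device = pick_device()
# Dummy stand-in for the tensors a processor would return.
dummy_inputs = {"pixel_values": torch.zeros(1, 3, 8, 8)}
dummy_inputs = to_device(dummy_inputs, device)
print(device, dummy_inputs["pixel_values"].device)
```

For the real model, the equivalent steps would be `model.to(device)` and moving the processor output to the same device before the forward pass, ideally inside `torch.no_grad()` to avoid tracking gradients during inference.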
License
The model's license is listed as "other", meaning its usage terms may differ from those of standard licenses. Refer to the model repository for the full license information.