YOLOS (Tiny-Sized) Model (hustvl/yolos-tiny)
Introduction
YOLOS is a Vision Transformer (ViT) model fine-tuned on the COCO 2017 object detection dataset. It was introduced in the paper "You Only Look at One Sequence: Rethinking Transformer in Vision through Object Detection" by Fang et al. Despite its simplicity, a base-sized YOLOS model achieves 42 AP on COCO 2017 validation, comparable to DETR and to more complex frameworks such as Faster R-CNN.
Architecture
The model is trained with a "bipartite matching loss": the predicted classes and bounding boxes are compared against the ground truth annotations, and the Hungarian matching algorithm is used to find an optimal one-to-one mapping between predictions and annotations. Once matched, the model is optimized with standard cross-entropy loss for the classes and a linear combination of L1 and generalized IoU loss for the bounding boxes.
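The matching step can be sketched in a few lines of plain Python. This is a minimal illustration, not the YOLOS training code: it assumes boxes in (x0, y0, x1, y1) corner format, the helper names (`generalized_iou`, `pair_cost`, `match`) are hypothetical, and the cost weights are illustrative assumptions. It also swaps the Hungarian algorithm for a brute-force search over assignments, which finds the same optimal matching but is only practical for a handful of boxes.

```python
from itertools import permutations


def box_area(box):
    """Area of an (x0, y0, x1, y1) box; degenerate boxes have area 0."""
    x0, y0, x1, y1 = box
    return max(0.0, x1 - x0) * max(0.0, y1 - y0)


def generalized_iou(a, b):
    """Generalized IoU of two (x0, y0, x1, y1) boxes, in the range (-1, 1]."""
    inter = box_area((max(a[0], b[0]), max(a[1], b[1]),
                      min(a[2], b[2]), min(a[3], b[3])))
    union = box_area(a) + box_area(b) - inter
    iou = inter / union if union > 0 else 0.0
    # Smallest box enclosing both inputs
    enclosing = box_area((min(a[0], b[0]), min(a[1], b[1]),
                          max(a[2], b[2]), max(a[3], b[3])))
    if enclosing == 0:
        return iou
    return iou - (enclosing - union) / enclosing


def pair_cost(pred, target, w_class=1.0, w_l1=5.0, w_giou=2.0):
    """Cost of assigning one prediction to one annotation (weights are assumptions)."""
    class_cost = -pred["probs"][target["label"]]
    l1_cost = sum(abs(p - t) for p, t in zip(pred["box"], target["box"]))
    giou_cost = -generalized_iou(pred["box"], target["box"])
    return w_class * class_cost + w_l1 * l1_cost + w_giou * giou_cost


def match(preds, targets):
    """Brute-force stand-in for Hungarian matching.

    Tries every way of assigning each target to a distinct prediction and
    returns the cheapest one as {prediction_index: target_index}.
    """
    best_cost, best_perm = None, ()
    for perm in permutations(range(len(preds)), len(targets)):
        cost = sum(pair_cost(preds[p], targets[t]) for t, p in enumerate(perm))
        if best_cost is None or cost < best_cost:
            best_cost, best_perm = cost, perm
    return {p: t for t, p in enumerate(best_perm)}


# Toy example: three predictions, two annotations, two classes
preds = [
    {"probs": [0.1, 0.9], "box": (0.5, 0.5, 0.9, 0.9)},
    {"probs": [0.8, 0.2], "box": (0.1, 0.1, 0.4, 0.4)},
    {"probs": [0.5, 0.5], "box": (0.0, 0.0, 1.0, 1.0)},
]
targets = [
    {"label": 0, "box": (0.1, 0.1, 0.4, 0.4)},
    {"label": 1, "box": (0.5, 0.5, 0.9, 0.9)},
]
matching = match(preds, targets)
print(matching)
```

In this toy case prediction 1 is matched to annotation 0 and prediction 0 to annotation 1. Prediction 2 stays unmatched, which in training would correspond to the "no object" class.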
Training
YOLOS was pre-trained for 300 epochs on ImageNet-1k and fine-tuned for 300 epochs on the COCO dataset, which includes 118k annotated images for training and 5k for validation.
Guide: Running Locally
- Install Required Libraries: Ensure transformers, torch, Pillow (imported as PIL), and requests are installed.
- Load the Model:
    from transformers import YolosImageProcessor, YolosForObjectDetection
    from PIL import Image
    import torch
    import requests

    # Download a sample image from the COCO 2017 validation set
    url = "http://images.cocodataset.org/val2017/000000039769.jpg"
    image = Image.open(requests.get(url, stream=True).raw)

    # Load the pretrained model and its image processor
    model = YolosForObjectDetection.from_pretrained('hustvl/yolos-tiny')
    image_processor = YolosImageProcessor.from_pretrained("hustvl/yolos-tiny")

    # Preprocess the image and run a forward pass
    inputs = image_processor(images=image, return_tensors="pt")
    outputs = model(**inputs)

    # Process and display results
    # PIL reports size as (width, height); the processor expects (height, width)
    target_sizes = torch.tensor([image.size[::-1]])
    results = image_processor.post_process_object_detection(outputs, threshold=0.9, target_sizes=target_sizes)[0]
    for score, label, box in zip(results["scores"], results["labels"], results["boxes"]):
        box = [round(i, 2) for i in box.tolist()]
        print(f"Detected {model.config.id2label[label.item()]} with confidence {round(score.item(), 3)} at location {box}")
- Cloud GPUs: For faster inference, consider running the model on a GPU instance from a cloud provider such as AWS EC2, GCP, or Azure.
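To make the `post_process_object_detection` call in the inference snippet less opaque, here is a rough pure-Python sketch of what that step does conceptually. It is not the library's implementation. It assumes each query yields class logits whose last entry is the "no object" class and a box in normalized (center_x, center_y, width, height) format. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def center_to_corners(box, image_height, image_width):
    """Convert a normalized (cx, cy, w, h) box to absolute (x0, y0, x1, y1) pixels."""
    cx, cy, w, h = box
    return [
        (cx - w / 2) * image_width,
        (cy - h / 2) * image_height,
        (cx + w / 2) * image_width,
        (cy + h / 2) * image_height,
    ]


def post_process(logits, boxes, target_size, threshold=0.9):
    """Keep confident detections and rescale their boxes to the image size.

    target_size is (height, width), matching image.size[::-1] for a PIL image.
    """
    height, width = target_size
    detections = []
    for query_logits, box in zip(logits, boxes):
        # Drop the trailing "no object" probability before picking the best class
        probs = softmax(query_logits)[:-1]
        score = max(probs)
        if score > threshold:
            detections.append({
                "score": score,
                "label": probs.index(score),
                "box": center_to_corners(box, height, width),
            })
    return detections


# Two made-up queries over two classes plus "no object"
logits = [
    [8.0, 0.0, 0.0],  # confident in class 0
    [0.0, 0.0, 5.0],  # mostly "no object", so it is filtered out
]
boxes = [
    (0.5, 0.5, 0.2, 0.4),
    (0.3, 0.3, 0.1, 0.1),
]
detections = post_process(logits, boxes, target_size=(480, 640))
for det in detections:
    print(det["label"], round(det["score"], 3), [round(v, 1) for v in det["box"]])
```

The `(480, 640)` target size corresponds to a 640x480 image, so boxes come back in pixel coordinates of the original image rather than in the normalized range the model predicts.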
License
The YOLOS model is released under the Apache 2.0 License.