yolov8m table extraction

keremberke

Introduction

YOLOV8M-TABLE-EXTRACTION is an object detection model for locating tables in images so that they can be extracted. It is built with the Ultralytics library and belongs to the YOLO (You Only Look Once) family of models, which is known for efficient real-time object detection.

Architecture

The model uses the YOLOv8 architecture as implemented in the Ultralytics library, version 8.0.21. It performs object detection with two table classes: 'bordered' and 'borderless'. It was trained on the keremberke/table-extraction dataset and reaches a box mAP@0.5 (mean average precision at an IoU threshold of 0.5) of 0.95194.
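To make the mAP@0.5 figure more concrete, the sketch below shows the matching criterion behind the "@0.5": a predicted box counts as a hit only if its intersection over union (IoU) with a ground-truth box is at least 0.5. This is a minimal illustration, not the Ultralytics evaluation code. The function names and the sample coordinates are hypothetical, and a full mAP computation additionally matches classes, ranks predictions by confidence, and averages precision over both classes.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred, truth, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count."""
    return box_iou(pred, truth) >= threshold


# Hypothetical ground-truth table region and a slightly shifted prediction.
truth = (100, 100, 500, 400)
pred = (120, 110, 510, 400)
print(round(box_iou(pred, truth), 3), is_match(pred, truth))
```

A prediction that covers only a small corner of the table would fall below the 0.5 threshold and be counted as a false positive, even though it touches the right region.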

Training

The model is trained on images from the keremberke/table-extraction dataset, using PyTorch as the backend framework. The training process and results are documented, with performance reported on the validation set.

Guide: Running Locally

To run the YOLOV8M-TABLE-EXTRACTION model locally, follow these steps:

  1. Install Dependencies:

    pip install ultralyticsplus==0.0.23 ultralytics==8.0.21
    
  2. Load the Model and Perform Predictions:

    from ultralyticsplus import YOLO, render_result
    
    # load model
    model = YOLO('keremberke/yolov8m-table-extraction')
    
    # configure model parameters
    model.overrides['conf'] = 0.25  # NMS confidence threshold
    model.overrides['iou'] = 0.45  # NMS IoU threshold
    model.overrides['agnostic_nms'] = False  # NMS class-agnostic
    model.overrides['max_det'] = 1000  # maximum number of detections per image
    
    # set image (this sample URL is a generic test photo; substitute an image that contains a table)
    image = 'https://github.com/ultralytics/yolov5/raw/master/data/images/zidane.jpg'
    
    # perform inference
    results = model.predict(image)
    
    # observe results
    print(results[0].boxes)
    render = render_result(model=model, image=image, result=results[0])
    render.show()
    
  3. Using Cloud GPUs: For faster inference and training, especially with large datasets, consider using cloud GPUs from providers such as AWS, Google Cloud, or Azure.
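The `conf`, `iou`, `agnostic_nms`, and `max_det` overrides set in step 2 all control non-maximum suppression (NMS), the post-processing step that removes duplicate detections of the same table. The pure-Python sketch below illustrates roughly what each setting does. It is a simplified stand-in, not the Ultralytics implementation; the function names, the tuple layout `(x1, y1, x2, y2, score, class_id)`, and the sample detections are assumptions made for illustration.

```python
def _iou(a, b):
    """IoU of two boxes; only the first four fields (x1, y1, x2, y2) are used."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(detections, conf=0.25, iou=0.45, agnostic=False, max_det=1000):
    """Greedy NMS over (x1, y1, x2, y2, score, class_id) tuples."""
    # conf: discard low-confidence detections, then visit the rest best-first.
    candidates = sorted(
        (d for d in detections if d[4] > conf),
        key=lambda d: d[4],
        reverse=True,
    )
    kept = []
    for det in candidates:
        # iou: a box is suppressed if it overlaps an already kept box too much.
        # agnostic: if False, only boxes of the same class suppress each other.
        suppressed = any(
            (agnostic or k[5] == det[5]) and _iou(det, k) > iou for k in kept
        )
        if not suppressed:
            kept.append(det)
        # max_det: stop once the per-image limit is reached.
        if len(kept) == max_det:
            break
    return kept


# Hypothetical raw detections; class 0 and class 1 stand in for the two table classes.
raw = [
    (100, 100, 500, 400, 0.90, 0),
    (110, 105, 505, 395, 0.60, 0),  # near-duplicate of the first box, same class
    (110, 105, 505, 395, 0.70, 1),  # same region, other class
    (600, 100, 900, 300, 0.20, 0),  # below the confidence threshold
]
print(len(non_max_suppression(raw)))                 # class-aware NMS
print(len(non_max_suppression(raw, agnostic=True)))  # class-agnostic NMS
```

With `agnostic_nms` left at `False`, one region can be reported once per class. Setting it to `True` keeps only the highest-scoring box per region, which may be preferable when each table should be reported exactly once.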

License

The YOLOV8M-TABLE-EXTRACTION model is licensed under AGPL-3.0. This license requires that any distributed modified version of the model also be released as open source under the same license.
