Web Form UI Field Detection Model

Introduction

The web-form-Detect model is a YOLOv8 object detection model that detects and localizes UI form fields in images. It is built with the ultralytics library and was fine-tuned on a dataset of annotated UI form images. The model suits applications that need to automatically detect form fields such as names, numbers, emails, passwords, and buttons in images.

Architecture

The model is built on the YOLOv8 architecture and uses the ultralytics library for object detection. It is configured specifically for UI form field detection.

Training

Training Data: Composed of 600 images of web UI forms from diverse sources, annotated with bounding box coordinates.
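The card does not state how the bounding boxes were stored. As a minimal sketch, assuming the annotations follow the normalized text label format that ultralytics training commonly uses (class id, then box center and size as fractions of the image dimensions), a pixel-space box could be converted like this. The function name and the sample values are hypothetical.

```python
def to_yolo_label(class_id, box_xyxy, img_w, img_h):
    """Convert a pixel-space (x1, y1, x2, y2) box to a YOLO-style label line."""
    x1, y1, x2, y2 = box_xyxy
    # Box center and size, normalized to the 0..1 range by the image dimensions.
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical example: a 200x40 px text field in an 800x600 px screenshot.
print(to_yolo_label(0, (100, 50, 300, 90), img_w=800, img_h=600))
# -> 0 0.250000 0.116667 0.250000 0.066667
```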
Fine-Tuning Process:
- Pretrained backbone: Initialized with a pretrained YOLO object detection model.
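- Loss Function: Listed as Mean Average Precision (mAP). mAP is an evaluation metric rather than a differentiable loss, so training presumably used the standard YOLOv8 detection loss, with mAP tracked for evaluation.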
- Optimizer: Adam optimizer with a learning rate of 1e-4.
- Training Duration: 1 hour on an NVIDIA GeForce RTX 3090 GPU.

The model achieved an Average Precision (AP) of 0.51, precision of 0.80, recall of 0.70, and an F1 Score of 0.71 during evaluation.
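For reference, the F1 score is the harmonic mean of precision and recall. The sketch below applies that definition to the precision and recall reported above. It yields about 0.75 rather than the reported 0.71. The card does not say how its F1 was computed, so the difference may come from per-class averaging or a different confidence threshold. That explanation is an assumption, not something the source states.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the evaluation summary above.
print(round(f1_score(0.80, 0.70), 3))
# -> 0.747
```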

Guide: Running Locally

Installation:

pip install ultralyticsplus==0.0.28 ultralytics==8.0.43

Load Model and Perform Prediction:

from ultralyticsplus import YOLO, render_result

# load model
model = YOLO('foduucom/web-form-ui-field-detection')

# set model parameters
model.overrides['conf'] = 0.25  # NMS confidence threshold
model.overrides['iou'] = 0.45  # NMS IoU threshold
model.overrides['agnostic_nms'] = False  # NMS class-agnostic
model.overrides['max_det'] = 1000  # maximum number of detections per image

# set image (path or URL of a single image; render_result needs one image, not a directory)
image = '/path/to/your/form/image.jpg'

# perform inference
results = model.predict(image)

# observe results
print(results[0].boxes)
render = render_result(model=model, image=image, result=results[0])
render.show()
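Beyond printing and rendering, the detections are often needed as plain data. The following is a minimal sketch of a helper that turns box coordinates, confidences, and class ids into a list of dictionaries. The helper name, the sample values, and the class map are hypothetical; the helper works on plain Python lists so that it runs without a GPU or the model weights.

```python
def summarize_detections(xyxy, conf, cls, names, min_conf=0.25):
    """Return one dict per detection whose confidence is at least min_conf."""
    detections = []
    for box, score, class_id in zip(xyxy, conf, cls):
        if score < min_conf:
            continue  # drop low-confidence detections
        detections.append({
            "label": names[int(class_id)],
            "confidence": round(float(score), 3),
            "box": [round(float(v), 1) for v in box],  # x1, y1, x2, y2 in pixels
        })
    return detections


# Hypothetical values standing in for real model output.
example = summarize_detections(
    xyxy=[[10.0, 20.0, 210.0, 60.0], [10.0, 80.0, 210.0, 120.0]],
    conf=[0.91, 0.18],
    cls=[0.0, 1.0],
    names={0: "name", 1: "email"},  # hypothetical class map
)
print(example)
# -> [{'label': 'name', 'confidence': 0.91, 'box': [10.0, 20.0, 210.0, 60.0]}]
```

With real output, the inputs would presumably come from results[0].boxes.xyxy.tolist(), results[0].boxes.conf.tolist(), and results[0].boxes.cls.tolist(), with the class-name mapping taken from the loaded model. How that mapping is exposed can differ between ultralytics releases, so check it against the pinned version.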

Hardware Recommendation: For optimal performance, a cloud GPU such as the NVIDIA Tesla V100 is recommended.

License

For questions and contributions, reach out via email at info@foduu.com.
