doctr dummy torch crnn mobilenet v3 small
Felix92
Introduction
The DOCTR-DUMMY-TORCH-CRNN-MOBILENET-V3-SMALL model is designed for optical character recognition (OCR): it converts images of text into character strings. It is built for docTR, Mindee's OCR library, which supports TensorFlow 2 and PyTorch; as the name indicates, this checkpoint targets the PyTorch backend.
Architecture
This model uses a CRNN (Convolutional Recurrent Neural Network) architecture built on MobileNetV3-Small. The convolutional backbone extracts visual features from a text image, and the recurrent layers read those features as a sequence to predict characters. MobileNetV3-Small is a lightweight backbone, so the design favors fast inference and a small memory footprint.
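To make the last step of a CRNN concrete, here is a minimal sketch of greedy CTC decoding, the usual way per-frame predictions from a CRNN are turned into a string. It assumes a CTC-style output with a blank symbol stored at the last vocabulary index; the vocabulary, the frame values, and the function name `ctc_greedy_decode` are illustrative and are not taken from this model's configuration.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, vocab, blank_id):
    """Collapse consecutive duplicate ids, drop blanks, map ids to characters."""
    collapsed = [idx for idx, _ in groupby(frame_ids)]
    return "".join(vocab[idx] for idx in collapsed if idx != blank_id)


# Hypothetical vocabulary; the blank index is assumed to be len(vocab).
vocab = "abcdefghijklmnopqrstuvwxyz"
blank = len(vocab)

# Per-frame argmax ids for: h h _ e l l _ l o o
frames = [7, 7, blank, 4, 11, 11, blank, 11, 14, 14]
decoded = ctc_greedy_decode(frames, vocab, blank)
print(decoded)
```

The blank between the two runs of `l` is what lets the decoder keep a genuine double letter while still collapsing repeated frames of the same character.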
Training
Training details for this specific model are not provided. In typical OCR pipelines, however, the detection and recognition models are each trained or fine-tuned on large datasets of labeled images, which helps them handle a wide variety of text styles and layouts.
Guide: Running Locally
To run this model locally, follow these steps:
-
Install the Required Libraries: Ensure that Python is installed, and set up an environment with docTR (the python-doctr package) and PyTorch, the backend this model targets.
-
Load the Model:
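from doctr.io import DocumentFile
from doctr.models import ocr_predictor, from_hub

img = DocumentFile.from_images(['<image_path>'])
# Load the recognition model from the Hub ('mindee/my-model' is a placeholder repository ID)
model = from_hub('mindee/my-model')
# Pass the loaded model to the predictor as its recognition architecture
predictor = ocr_predictor(det_arch='db_mobilenet_v3_large', reco_arch=model, pretrained=True)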
-
Run OCR:
res = predictor(img)
Replace '<image_path>' with the path to your image file.
-
Cloud GPU Suggestion: For intensive OCR tasks, consider using cloud GPUs from providers like AWS, Google Cloud, or Azure, which can significantly improve processing speed.
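Once the Run OCR step has produced `res`, the recognized text usually needs to be pulled out of a nested result. The sketch below flattens a pages/blocks/lines/words dictionary into plain text lines and can drop low-confidence words. It assumes the dictionary follows the layout that `res.export()` is expected to return; the helper name `export_to_lines`, the `min_confidence` parameter, and the hand-written `sample` data are hypothetical and are not real model output.

```python
def export_to_lines(export, min_confidence=0.0):
    """Flatten a pages/blocks/lines/words dict into a list of text lines."""
    lines_out = []
    for page in export.get("pages", []):
        for block in page.get("blocks", []):
            for line in block.get("lines", []):
                # Keep only words at or above the confidence threshold.
                words = [
                    word["value"]
                    for word in line.get("words", [])
                    if word.get("confidence", 1.0) >= min_confidence
                ]
                if words:
                    lines_out.append(" ".join(words))
    return lines_out


# Hand-written stand-in for an exported result (assumed structure).
sample = {
    "pages": [
        {
            "blocks": [
                {
                    "lines": [
                        {
                            "words": [
                                {"value": "Hello", "confidence": 0.98},
                                {"value": "wor1d", "confidence": 0.41},
                            ]
                        },
                        {"words": [{"value": "docTR", "confidence": 0.93}]},
                    ]
                }
            ]
        }
    ]
}

all_lines = export_to_lines(sample)
confident_lines = export_to_lines(sample, min_confidence=0.5)
print(all_lines)
print(confident_lines)
```

With a real result, the same helper would be called as `export_to_lines(res.export())`, provided the exported structure matches the assumption above.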
License
The model is available under the license specified by Mindee's docTR framework. For detailed licensing information, refer to Mindee's GitHub repository.