Nutrition Extractor (openfoodfacts)
Introduction
The Nutrition Extractor model is a fine-tuned version of Microsoft's LayoutLMv3-Large, designed to automatically extract nutritional values from images of nutrition tables. It was developed as part of the Nutrisight project and achieves high precision, recall, F1 score, and accuracy on the evaluation dataset.
Architecture
The model is based on the LayoutLM architecture, which requires input in the form of images, tokens, and the 2D coordinates of each token. The tokens and their positions are produced by an external OCR step; in this project, Google Cloud Vision OCR results are used.
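The snippet below is a minimal sketch of what this input format looks like in practice, assuming tokens and pixel-space bounding boxes have already been produced by an OCR step. The image path, words, and boxes are placeholder values, and the base microsoft/layoutlmv3-large processor is used with its built-in OCR disabled.

```python
from PIL import Image
from transformers import AutoProcessor

# apply_ocr=False because tokens and boxes come from an external OCR step.
processor = AutoProcessor.from_pretrained(
    "microsoft/layoutlmv3-large", apply_ocr=False
)

image = Image.open("nutrition_table.jpg").convert("RGB")  # placeholder path
width, height = image.size

# Placeholder OCR output: one entry per token, boxes as (x0, y0, x1, y1) in pixels.
words = ["Energy", "250", "kcal"]
pixel_boxes = [(10, 20, 80, 40), (90, 20, 130, 40), (135, 20, 180, 40)]

# LayoutLMv3 expects bounding boxes normalized to a 0-1000 coordinate space.
boxes = [
    [int(1000 * x0 / width), int(1000 * y0 / height),
     int(1000 * x1 / width), int(1000 * y1 / height)]
    for (x0, y0, x1, y1) in pixel_boxes
]

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
# encoding contains input_ids, attention_mask, bbox, and pixel_values tensors.
```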
Training
The model was trained on the openfoodfacts/nutrient-detection-layout dataset. Key training hyperparameters included:
- Learning rate: 1e-05
- Train batch size: 4
- Evaluation batch size: 4
- Total training steps: 3000
- Optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- Mixed precision training: Native AMP
The model achieved a final loss of 0.0534, with a precision of 0.9545, recall of 0.9647, F1 score of 0.9596, and accuracy of 0.9917 on the evaluation set.
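As an illustration only (not the project's actual training script), the hyperparameters above roughly correspond to the following Hugging Face TrainingArguments; the output directory is a placeholder, and mixed-precision training assumes a CUDA device.

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="nutrition-extractor",   # placeholder output directory
    learning_rate=1e-5,                 # learning rate
    per_device_train_batch_size=4,      # train batch size
    per_device_eval_batch_size=4,       # evaluation batch size
    max_steps=3000,                     # total training steps
    fp16=True,                          # native AMP mixed-precision training
)
# The Trainer's default optimizer (AdamW) uses betas=(0.9, 0.999) and
# epsilon=1e-08, matching the values reported above.
```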
Guide: Running Locally
- Install Dependencies: Ensure you have Python installed, then install the necessary packages:
  pip install transformers torch datasets tokenizers
- Download the Model: Use the Hugging Face model repository to download the nutrition-extractor model.
- Prepare Input Data: Collect images, tokens, and their 2D coordinates using an OCR service such as Google Cloud Vision.
- Run the Model: Load the model in a Python script and feed it the prepared data to extract nutritional information (see the inference sketch after this list).
- Suggested Cloud GPUs: For faster processing, consider cloud services such as AWS EC2 GPU instances, Google Cloud with NVIDIA GPUs, or Azure GPU VMs.
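Building on the input preparation shown in the Architecture section, the following is a hedged inference sketch. It assumes the fine-tuned model is available on the Hugging Face Hub under openfoodfacts/nutrition-extractor (the exact repository id may differ) and that OCR words and boxes normalized to the 0-1000 range are already available; the image path and OCR values are placeholders.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, LayoutLMv3ForTokenClassification

# Assumed repository id; adjust to the actual Hub location of the model.
model_id = "openfoodfacts/nutrition-extractor"
processor = AutoProcessor.from_pretrained(model_id, apply_ocr=False)
model = LayoutLMv3ForTokenClassification.from_pretrained(model_id)
model.eval()

image = Image.open("nutrition_table.jpg").convert("RGB")  # placeholder path

# Placeholder OCR output: words plus boxes already normalized to 0-1000,
# as shown in the Architecture sketch above.
words = ["Energy", "250", "kcal", "Fat", "3.1", "g"]
boxes = [
    [20, 40, 160, 80], [180, 40, 260, 80], [270, 40, 360, 80],
    [20, 100, 90, 140], [180, 100, 240, 140], [250, 100, 280, 140],
]

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    logits = model(**encoding).logits  # shape: (1, sequence_length, num_labels)

predicted_ids = logits.argmax(-1).squeeze(0).tolist()
labels = [model.config.id2label[i] for i in predicted_ids]

# Predictions are per subword token; encoding.word_ids() can map them back to
# the original OCR words. Label names depend on the model's configuration.
tokens = processor.tokenizer.convert_ids_to_tokens(
    encoding["input_ids"].squeeze(0).tolist()
)
for token, label in zip(tokens, labels):
    print(token, label)
```

In practice, predictions for special tokens would be filtered out and subword predictions aggregated per OCR word before mapping the labeled entities back to nutriment fields.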
License
The Nutrition Extractor model is licensed under the CC BY-NC-SA 4.0 (Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International). This license allows for sharing and adaptation for non-commercial purposes, provided appropriate credit is given and adaptations are shared under the same terms.