• This repository has been archived on 26/May/2023
  • Stars
    star
    204
  • Rank 186,544 (Top 4 %)
  • Language
    Jupyter Notebook
  • License
    GNU Affero Genera...
  • Created almost 6 years ago
  • Updated about 1 year ago

Repository Details

Important: Please have a look at the higher-level issue in Robotoff: openfoodfacts/robotoff#372. This is an old model, and we have made progress since then.

off-nutrition-table-extractor

This repository collects all the work done during Google Summer of Code 2018.

Technical Details

The pipeline is made up of three major parts: table detection, text detection, and OCR with post-processing.

Table Detection

To detect tables in an image, we use the Single Shot Detector (SSD) object detection model, trained with TensorFlow's Object Detection API. The provided Jupyter Notebook shows how we use the pre-trained graph to detect tables in product images. Before running the notebook, install the Object Detection API from TensorFlow's GitHub repository.
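
The step from a detection to a cropped table can be sketched as follows. This is a minimal illustration, not the notebook's code: it assumes the detector returns boxes in the Object Detection API's normalized [ymin, xmin, ymax, xmax] layout together with one score per box, and the function name `best_table_box` and the 0.5 score threshold are hypothetical.

```python
def best_table_box(boxes, scores, image_width, image_height, min_score=0.5):
    """Return the highest-scoring box in pixels as (left, top, right, bottom).

    boxes  : list of [ymin, xmin, ymax, xmax], each value in the range 0..1
             (assumed layout of the detector output).
    scores : list of confidence scores, one per box.
    Returns None when no box reaches min_score (a hypothetical threshold).
    """
    best = None
    best_score = min_score
    for box, score in zip(boxes, scores):
        if score >= best_score:
            best, best_score = box, score
    if best is None:
        return None
    ymin, xmin, ymax, xmax = best
    # Scale the normalized coordinates to pixel coordinates.
    return (
        int(xmin * image_width),
        int(ymin * image_height),
        int(xmax * image_width),
        int(ymax * image_height),
    )
```

The returned tuple uses the same (left, upper, right, lower) order that Pillow's `Image.crop` expects, so it can be passed to that method directly.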

Text Detection and extraction

Text detection is done using text-detection-ctpn, which uses fast-rcnn to extract textual regions in the image. In the future, we plan to replace it with a faster and more accurate text detection model.

OCR and post-processing

For text recognition, we use Tesseract OCR. Every text box found in the text detection step is passed through the OCR, which returns a raw string. That string then goes through several post-processing steps that clean it (using regular expressions) and correct any spelling mistakes (using the symspell spelling correction algorithm).
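
A rough sketch of this kind of post-processing is shown below. It is not the repository's implementation: the regular expression, the label vocabulary `KNOWN_LABELS`, and the function names are assumptions made for illustration, and the standard library's `difflib` stands in for symspell so that the example stays self-contained.

```python
import difflib
import re

# Hypothetical vocabulary of labels we expect to see in a nutrition table.
KNOWN_LABELS = ["Dietary Fiber", "Sugars", "Soluble Fiber", "Trans Fat"]

# A label made of letters and spaces, then a number, then a unit.
LINE_PATTERN = re.compile(
    r"^\s*([A-Za-z ]+?)\s*(\d+(?:\.\d+)?)\s*(mg|g|kcal)\b", re.IGNORECASE
)


def correct_label(label):
    """Snap a possibly misspelled label to the closest known label.

    difflib is used here as a stand-in for symspell.
    """
    candidate = label.title()
    matches = difflib.get_close_matches(candidate, KNOWN_LABELS, n=1, cutoff=0.8)
    return matches[0] if matches else candidate


def parse_line(raw):
    """Turn one raw OCR string into (label, (value, unit)), or None."""
    # Replace punctuation and OCR noise with spaces before matching.
    cleaned = re.sub(r"[^A-Za-z0-9. ]", " ", raw)
    match = LINE_PATTERN.match(cleaned)
    if match is None:
        return None
    label, value, unit = match.groups()
    return correct_label(label.strip()), (float(value), unit.lower())
```

For instance, `parse_line("Sugars: 9 g")` yields `("Sugars", (9.0, "g"))`, and strings without a numeric value yield `None`.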

Final Results

The full pipeline output for the above image is given below:

Nutritional content = {
    'Dietary Fiber': (2.0, 'g'), 
    'Sugars': (9.0, 'g'),
    'Soluble Fiber': (1.0, 'g'), 
    'Monounsaturated Fat': (0.5, 'g'), 
    'Polyunsaturated Fat': (0.5, 'g'), 
    'Trans Fat': (0.0, 'g'), 
    'Other Carbohydrate': (11.0, 'g')
}

Requirements

The code is compatible with Python 3.0+. If you find that any other dependency is required at run time, please raise an issue to let us know.

1. Tensorflow
2. OpenCV
3. Pillow
4. Numpy
5. Tesseract v4.0
6. Pytesseract
7. Django-2.0.5 (Only for API)

How to test your image

  • Download the frozen model for ctpn from here.
  • Save the model in the ./nutrition_extractor/data directory.
  • Make a directory named test_images and put the images in that folder.
  • Run python detection.py -i [IMAGE-PATH] from inside the nutrition_extractor folder.
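
To run the detector over every image in a folder, the command above can be assembled once per file. The sketch below only builds the command lines and does not execute them; the helper name `build_commands` and the set of accepted file extensions are assumptions, not part of the repository.

```python
import sys
from pathlib import Path

# Assumption: which file extensions count as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_commands(image_dir):
    """Return one `python detection.py -i <image>` argument list per image."""
    commands = []
    for path in sorted(Path(image_dir).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            commands.append([sys.executable, "detection.py", "-i", str(path)])
    return commands
```

Each returned list can be handed to `subprocess.run` with the nutrition_extractor folder as the working directory.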

Planned functionality

  • Develop a table detection model to extract the region of interest (nutritional facts table) from images.
  • Crop the RoI from images and apply text detection pipeline to the region.
  • Pass every text blob through Tesseract OCR to extract the text.
  • Develop a post-processing method to clean the text and extract the nutritional label and its value from it.
  • Create a spatial mapping algorithm to map the text blobs according to their location in the image. (Done, but the accuracy is not yet up to standard.)
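
One simple way to approach the spatial mapping in the last item is to group text boxes into rows by their vertical centers and then read each row from left to right. The sketch below illustrates that idea only; it is not the algorithm used in this repository, and the box layout `(text, left, top, right, bottom)` and the pixel tolerance are assumptions.

```python
def group_into_rows(boxes, tolerance=10):
    """Group text boxes into rows and order each row left to right.

    boxes     : list of (text, left, top, right, bottom) tuples in pixels
                (an assumed layout).
    tolerance : maximum distance in pixels between a box's vertical center
                and the row's first box for them to share a row
                (a hypothetical default).
    Returns a list of rows, each a list of text strings.
    """
    def center_y(box):
        return (box[2] + box[4]) / 2

    rows = []
    for box in sorted(boxes, key=center_y):
        if rows and abs(center_y(box) - rows[-1]["center"]) <= tolerance:
            rows[-1]["boxes"].append(box)
        else:
            rows.append({"center": center_y(box), "boxes": [box]})
    # Within each row, sort by the left edge so labels precede values.
    return [
        [box[0] for box in sorted(row["boxes"], key=lambda b: b[1])]
        for row in rows
    ]
```

With this grouping, a label on the left and its value on the right end up in the same row, for example `["Sugars", "9g"]`, provided the two boxes are roughly aligned vertically.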

Future Work

GSoC 2018 kick-started this project, and we are just getting started. There is a lot left that we plan to do:

  • Improving the spatial mapping algorithm.
  • Training and using a faster and more accurate text detection model than the currently used fast-rcnn model.
  • Creating a bigger nutritional table dataset and training that on a recent and bleeding edge object detection model to improve the accuracy.
  • Developing a better image preprocessing algorithm to detect bold text.
  • Implementing a method to unify the two models into one since the same calculations are being done twice in initial layers of the two models.

More Repositories

1

openfoodfacts-androidapp

Native version of Open Food Facts on Android - Coders & Decoders welcome
Kotlin
747
star
2

smooth-app

The new Open Food Facts mobile application for Android and iOS, crafted with Flutter and Dart
Dart
651
star
3

openfoodfacts-server

Open Food Facts database, API server and web interface - Perl, CSS and JS coders welcome. For helping in Python, see Robotoff or taxonomy-editor
HTML
604
star
4

openfoodfacts-ios

Native (Swift) version of Open Food Facts for iOS. Coders & Decoders welcome
Swift
357
star
5

openfoodfacts-python

Python package for Open Food Facts
Python
264
star
6

openfoodfacts-ai

This is a tracking repo for all our AI projects.
Python
208
star
7

openfoodfacts-dart

Open Food Facts API Wrapper
Dart
150
star
8

openfoodfacts-laravel

Open Food Facts API wrapper for Laravel
PHP
141
star
9

openfoodfacts-nodejs

Official Node package for Open Food Facts
TypeScript
125
star
10

openfoodfacts-cordova-app-old-with-blob

Open Food Facts App in Cordova (Android)
JavaScript
83
star
11

robotoff

Real-time and batch prediction service for Open Food Facts
Python
70
star
12

openfoodfacts-php

PHP wrapper for Open Food Facts
PHP
54
star
13

openfoodfacts-apirestpython

Python API for Open Food Facts (using a DB dump)
Python
53
star
14

openfoodfacts-go

Go Wrapper for Open Food Facts
Go
51
star
15

openfoodfacts-react-native

Code to send product data and photos to Open Food Facts
JavaScript
38
star
16

openbeautyfacts

Meta project for Open Beauty Facts
36
star
17

openfoodfacts-ruby

Open Food Facts API Wrapper
Ruby
33
star
18

openfoodfacts-cordova-app

Open Food Facts mobile app, developed with Cordova, for iOS, Android, Windows Phone, FirefoxOS etc.
JavaScript
28
star
19

open-prices

An open database of food prices
Python
22
star
20

hunger-games

One click Mini-Games for Open Food Facts
TypeScript
21
star
21

api-documentation

Version 2 of the documentation of the V1 API. The code behind the API is at https://github.com/openfoodfacts/openfoodfacts-server. An effort is made there to create a V3 of the documentation based on OpenAPI
18
star
22

openfoodfacts-web

Content pages (and translations) for the web version
HTML
16
star
23

taxonomy-editor

Taxonomies are at the heart of Open Food Facts data structure - this project provides an editor
TypeScript
15
star
24

open-prices-frontend

A vue.js front-end for Open Prices
Vue
14
star
25

openfoodfacts-java

Java Wrapper for OpenFoodFacts
Java
12
star
26

power-user-script

User script for your browser, to empower Open Food Facts contribution
JavaScript
11
star
27

folksonomy_api

A light REST API designed for Open Food Facts folksonomy engine
Python
11
star
28

openfoodfacts-hungergames

One click Mini-Games for Open Food Facts for: categories, labels, weight, brands, logos… We'd need to port and improve nutrition and ingredients from the old version.
Vue
9
star
29

eu-food-data

This repository aggregates food packaging codes available about European countries, and foreign countries trading with the EU.
HTML
9
star
30

openbeautyfacts-ruby

Open Beauty Facts API Wrapper
Ruby
8
star
31

rate-my-recipe

A project allowing you to get the Nutri-Score, Eco-Score… on your own food recipe
TypeScript
8
star
32

off-category-classification

Jupyter Notebook
8
star
33

openfoodfacts-ubuntu

Open Food Facts project for Ubuntu Touch
QML
7
star
34

openfoodfacts-events

Events repository and API for product scans, photo uploads, robotoff annotations etc.
Python
7
star
35

community-portal

A community portal for Open Food Facts contributors
Python
6
star
36

openfoodfacts-resources

Resources (images, SVGs, presentations etc.) for the Open Food Facts project
CSS
6
star
37

search-a-licious

A pluggable search service for large collections of objects (like Open Food Facts)
Python
5
star
38

openfoodfacts-hungergames-react

One click categorizer for Open Food Facts
JavaScript
5
star
39

facets-knowledge-panels

Providing knowledge panels for a particular open food fact facet (category, brand, etc...)
Python
5
star
40

openfoodfacts-design

5
star
41

impactestimator

Service providing product level Eco-Score for OFF products.
Python
4
star
42

openfoodfacts-kotlin

Official Kotlin package for Open Food Facts
Kotlin
4
star
43

ruby-games

Games to complete data on Open Food Facts
Ruby
4
star
44

openfoodfacts-rust

Rust SDK package
Rust
4
star
45

openfoodfacts-elixir

Elixir
4
star
46

robotoff-models

Models for Robotoff, the Open Food Facts AI
4
star
47

openfoodfacts-translations

Translations for the Open Food Facts blog. Most of the other folders are being moved elsewhere.
HTML
4
star
48

hungergames-old

Gamification of Open Food Facts using Python and Django
Python
4
star
49

openfoodfacts-explorer

An alternative frontend for OpenFoodFacts, made with SvelteKit
Svelte
3
star
50

openfoodfacts-java-demo

Demo application using Java Wrapper for OpenFoodFacts
Java
3
star
51

off-product-environmental-impact

A fork of https://framagit.org/GustaveCoste/off-product-environmental-impact
Jupyter Notebook
3
star
52

recipe-estimator

A recipe estimator for Open Food Facts products
Python
3
star
53

www

Repository for phonegapbuild
JavaScript
3
star
54

openfoodfacts-moodstocks

Open Food Facts app with Moodstocks scanner
Java
2
star
55

openbeautyfacts-cordova-app

JavaScript
2
star
56

fastlane-descriptions-smoothie

Automation of the Play Store and App Store listings for Smoothie with Fastlane
Ruby
2
star
57

egg-codes

Repository for Egg Codes
2
star
58

folksonomy_engine

2
star
59

nutripatrol

A moderation tool for Open Food Facts
Python
2
star
60

contributor-quality-issues

Report data quality issues due to contributing apps/users
1
star
61

openfoodfacts-metrics

1
star
62

openfoodfacts-connect

1
star
63

openfoodfacts-infrastructure

Where we collaboratively plan and maintain the infrastructure of Open Food Facts
Shell
1
star
64

openfoodfacts-marketing

1
star
65

openfoodfacts-csharp

C#
1
star
66

r-dashboard

R
1
star
67

openfoodfacts-upptime

Uptime monitor and status page for Upptime, powered by @upptime
Markdown
1
star
68

msc-codes

List of MSC Codes for Open Food Facts
1
star
69

fastlane-descriptions

JavaScript
1
star
70

folksonomy_frontend

Folksonomy Engine front end
JavaScript
1
star
71

brand-data

1
star
72

openfoodfacts-monitoring

Makefile
1
star
73

openfoodfacts-build-cache

A repo to store some build caches (when github cache is not the right option)
1
star
74

openfoodfacts-corrector

Ruby script to correct and enhance data on OpenFoodFacts
Ruby
1
star
75

openfoodfacts_flutter_lints

Lints for OpenFoodFacts Flutter apps & packages
Dart
1
star
76

openproductsfacts

1
star
77

openfoodfacts-ffos

Repo for the Firefox OS port of Open Food Facts
JavaScript
1
star
78

.github

A repository for default files such as style guides, issue templates, etc.
1
star
79

openfoodfacts-swift

Swift
1
star
80

nutripatrol-frontend

The front-end (React) of nutripatrol moderation tool
TypeScript
1
star
81

recipe-estimator-metrics

Metrics framework for recipe estimation (estimating percentage of each ingredient)
Python
1
star