Deep Learning for Polyp Detection and Classification in Colonoscopy
This repository was created from the following review paper: A. Nogueira-Rodríguez; R. Domínguez-Carbajales; H. López-Fernández; Á. Iglesias; J. Cubiella; F. Fdez-Riverola; M. Reboiro-Jato; D. Glez-Peña (2020) Deep Neural Networks approaches for detecting and classifying colorectal polyps. Neurocomputing.
Please cite it if you find it useful for your research.
This repository collects the most relevant studies applying Deep Learning for Polyp Detection and Classification in Colonoscopy from a technical point of view, focusing on the low-level details of the implementation of the DL models. First, each study is assigned to one of three categories: (i) polyp detection and localization (via bounding boxes or binary masks, i.e. segmentation), (ii) polyp classification, and (iii) simultaneous polyp detection and classification (i.e. studies that use a single model, such as YOLO or SSD, to perform detection and classification simultaneously). Second, a summary of the available public datasets, as well as the private datasets used in the studies, is provided. The third section covers technical aspects such as the Deep Learning architectures, the data augmentation techniques, and the libraries and frameworks used. Finally, the fourth section summarizes the performance metrics reported by each study.
Suggestions are welcome; please check the contribution guidelines before submitting a pull request.
Table of Contents:
- Research
- Datasets
- Deep Learning Models and Architectures
- Performance
- List of Acronyms and Abbreviations
- References and Further Reading
Research
Polyp Detection and Localization
Study | Date | Endoscopy type | Imaging technology | Localization type | Multiple polyp | Real time |
---|---|---|---|---|---|---|
Tajbakhsh et al. 2014, Tajbakhsh et al. 2015 | Sept. 2014 / Apr. 2015 | Conventional | N/A | Bounding box | No | Yes |
Zhu R. et al. 2015 | Oct. 2015 | Conventional | N/A | Bounding box (16x16 patches) | Yes | No |
Park and Sargent 2016 | March 2016 | Conventional | NBI, WL | Bounding box | No | No |
Yu et al. 2017 | Jan. 2017 | Conventional | NBI, WL | Bounding box | No | No |
Zhang R. et al. 2017 | Jan. 2017 | Conventional | NBI, WL | No | No | No |
Yuan and Meng 2017 | Feb. 2017 | WCE | N/A | No | No | No |
Brandao et al. 2018 | Feb. 2018 | Conventional/WCE | N/A | Binary mask | Yes | No |
Zhang R. et al. 2018 | May 2018 | Conventional | WL | Bounding box | No | No |
Misawa et al. 2018 | June 2018 | Conventional | WL | No | Yes | No |
Zheng Y. et al. 2018 | July 2018 | Conventional | NBI, WL | Bounding box | Yes | Yes |
Shin Y. et al. 2018 | July 2018 | Conventional | WL | Bounding box | Yes | No |
Urban et al. 2018 | Sep. 2018 | Conventional | NBI, WL | Bounding box | No | Yes |
Mohammed et al. 2018, GitHub | Sep. 2018 | Conventional | WL | Binary mask | Yes | Yes |
Wang et al. 2018, Wang et al. 2018 | Oct. 2018 | Conventional | N/A | Binary mask | Yes | Yes |
Qadir et al. 2019 | Apr. 2019 | Conventional | NBI, WL | Bounding box | Yes | No |
Blanes-Vidal et al. 2019 | March 2019 | WCE | N/A | Bounding box | Yes | No |
Zhang X. et al. 2019 | March 2019 | Conventional | N/A | Bounding box | Yes | Yes |
Misawa et al. 2019 | June 2019 | Conventional | N/A | No | Yes | No |
Zhu X. et al. 2019 | June 2019 | Conventional | N/A | No | No | Yes |
Ahmad et al. 2019 | June 2019 | Conventional | WL | Bounding box | Yes | Yes |
Sornapudi et al. 2019 | June 2019 | Conventional/WCE | N/A | Binary mask | Yes | No |
Wittenberg et al. 2019 | Sept. 2019 | Conventional | WL | Binary mask | Yes | No |
Yuan Y. et al. 2019 | Sept. 2019 | WCE | N/A | No | No | No |
Ma Y. et al. 2019 | Oct. 2019 | Conventional | N/A | Bounding box | Yes | No |
Tashk et al. 2019 | Dec. 2019 | Conventional | N/A | Binary mask | No | No |
Jia X. et al. 2020 | Jan. 2020 | Conventional | N/A | Binary mask | Yes | No |
Ma Y. et al. 2020 | May 2020 | Conventional | N/A | Bounding box | Yes | No |
Young Lee J. et al. 2020 | May 2020 | Conventional | N/A | Bounding box | Yes | Yes |
Wang W. et al. 2020 | July 2020 | Conventional | WL | No | No | No |
Li T. et al. 2020 | Oct. 2020 | Conventional | N/A | No | No | No |
Sánchez-Peralta et al. 2020 | Nov. 2020 | Conventional | NBI, WL | Binary mask | No | No |
Podlasek J. et al. 2020 | Dec. 2020 | Conventional | N/A | Bounding box | No | Yes |
Qadir et al. 2021 | Feb. 2021 | Conventional | WL | Bounding box | Yes | Yes |
Xu J. et al. 2021 | Feb. 2021 | Conventional | WL | Bounding box | Yes | Yes |
Misawa et al. 2021 | Apr. 2021 | Conventional | WL | No | Yes | Yes |
Livovsky et al. 2021 | June 2021 | Conventional | N/A | Bounding box | Yes | Yes |
Pacal et al. 2021 | July 2021 | Conventional | WL | Bounding box | Yes | Yes |
Liu et al. 2021 | July 2021 | Conventional | N/A | Bounding box | Yes | Yes |
Nogueira-Rodríguez et al. 2021 | Aug. 2021 | Conventional | NBI, WL | Bounding box | Yes | Yes |
Yoshida et al. 2021 | Aug. 2021 | Conventional | WL, LCI | Bounding box | Yes | Yes |
Ma Y. et al. 2021 | Sep. 2021 | Conventional | WL | Bounding box | Yes | No |
Pacal et al. 2022 | Nov. 2021 | Conventional | WL | Bounding box | Yes | Yes |
Nogueira-Rodríguez et al. 2022 | April 2022 | Conventional | NBI, WL | Bounding box | Yes | Yes |
Nogueira-Rodríguez et al. 2023 | March 2023 | Conventional | NBI, WL | Bounding box | Yes | Yes |
Polyp Classification
Study | Date | Endoscopy type | Imaging technology | Classes | Real time |
---|---|---|---|---|---|
Ribeiro et al. 2016 | Oct. 2016 | Conventional | WL | Neoplastic vs. Non-neoplastic | No |
Zhang R. et al. 2017 | Jan. 2017 | Conventional | NBI, WL | Adenoma vs. hyperplastic<br>Resectable vs. non-resectable<br>Adenoma vs. hyperplastic vs. serrated | No |
Byrne et al. 2017 | Oct. 2017 | Conventional | NBI | Adenoma vs. hyperplastic | Yes |
Komeda et al. 2017 | Dec. 2017 | Conventional | NBI, WL, Chromoendoscopy | Adenoma vs. non-adenoma | No |
Chen et al. 2018 | Feb. 2018 | Conventional | NBI | Neoplastic vs. hyperplastic | No |
Lui et al. 2019 | Apr. 2019 | Conventional | NBI, WL | Endoscopically curable lesions vs. endoscopically incurable lesion | No |
Kandel et al. 2019 | June 2019 | Conventional | N/A | Adenoma vs. hyperplastic vs. serrated (sessile serrated adenoma/traditional serrated adenoma) | No |
Zachariah et al. 2019 | Oct. 2019 | Conventional | NBI, WL | Adenoma vs. serrated | Yes |
Bour et al. 2019 | Dec. 2019 | Conventional | N/A | Paris classification: not dangerous (types Ip, Is, IIa, and IIb) vs. dangerous (type IIc) vs. cancer (type III) | No |
Patino-Barrientos et al. 2020 | Jan. 2020 | Conventional | WL | Kudo's classification: non-malignant (types I, II, III, and IV) vs. malignant (type V) | No |
Cheng Tao Pu et al. 2020 | Feb. 2020 | Conventional | NBI, BLI | Modified Sano's (MS) classification: MS I (Hyperplastic) vs. MS II (Low-grade tubular adenomas) vs. MS IIo (Nondysplastic or low-grade sessile serrated adenoma/polyp [SSA/P]) vs. MS IIIa (Tubulovillous adenomas or villous adenomas or any high-grade colorectal lesion) vs. MS IIIb (Invasive colorectal cancers) | Yes |
Young Joo Yang et al. 2020 | May 2020 | Conventional | WL | 7-class: CRC T1 vs. CRC T2 vs. CRC T3 vs. CRC T4 vs. high-grade dysplasia (HGD) vs. tubular adenoma with or without low-grade dysplasia (TA) vs. non-neoplastic lesions<br>4-class: advanced CRC (T2, T3, and T4) vs. early CRC/HGD (CRC T1 and HGD) vs. TA vs. non-neoplastic lesions<br>Advanced colorectal lesions (HGD and T1, T2, T3, and T4 lesions) vs. non-advanced colorectal lesions (TA and non-neoplastic lesions)<br>Neoplastic lesions (TA, HGD, and stages T1, T2, T3, and T4) vs. non-neoplastic lesions | No |
Yoshida et al. 2021 | Aug. 2021 | Conventional | WL, LCI | Neoplastic vs. hyperplastic | Yes |
Simultaneous Polyp Detection and Classification
Study | Date | Endoscopy type | Imaging technology | Localization type | Multiple polyp | Classes | Real time |
---|---|---|---|---|---|---|---|
Tian Y. et al. 2019 ¹ | Apr. 2019 | Conventional | N/A | Bounding box | Yes | Modified Sano's (MS) classification: MS I (Hyperplastic) vs. MS II (Low-grade tubular adenomas) vs. MS IIo (Nondysplastic or low-grade sessile serrated adenoma/polyp [SSA/P]) vs. MS IIIa (Tubulovillous adenomas or villous adenomas or any high-grade colorectal lesion) vs. MS IIIb (Invasive colorectal cancers) | No |
Liu X. et al. 2019 | Oct. 2019 | Conventional | WL | Bounding box | Yes | Polyp vs. adenoma | No |
Ozawa et al. 2020 ² | Feb. 2020 | Conventional | NBI, WL | Bounding box | Yes | Adenoma vs. hyperplastic vs. sessile serrated adenoma/polyp (SSAP) vs. cancer vs. other types (Peutz-Jeghers, juvenile, or inflammatory polyps) | Yes |
Li K. et al. 2021 ³ | Aug. 2021 | Conventional | N/A | Bounding box | Yes | Adenoma vs. hyperplastic | Yes |
- ¹ The Tian Y. et al. 2019 work uses a single model (RetinaNet) that performs simultaneous polyp detection and classification. However, the paper only reports detection results, obtained on the ETIS-Larib dataset, and therefore these results are included in the Polyp Detection and Localization section.
- ² The Ozawa et al. 2020 work uses a single model (Single Shot MultiBox Detector, SSD) that performs simultaneous polyp detection and classification. Nevertheless, since the detection and classification results are reported independently, they are included in the Polyp Detection and Localization and Polyp Classification sections, respectively.
- ³ The Li K. et al. 2021 work uses several single models that perform simultaneous polyp detection and classification. As they report different types of results (frame-based polyp localization, polyp-based classification, and simultaneous frame-based polyp detection and classification), it is included in all three results sections.
Datasets
Public Datasets
Dataset | References | Description | Format | Resolution (w x h) | Ground truth | Used in |
---|---|---|---|---|---|---|
CVC-ClinicDB | Bernal et al. 2015<br>https://polyp.grand-challenge.org/CVCClinicDB/ | 612 sequential WL images with polyps extracted from 31 sequences (23 patients) with 31 different polyps. | Image | 384 × 288 | Polyp locations (binary mask) | Brandao et al. 2018, Zheng Y. et al. 2018, Shin Y. et al. 2018, Wang et al. 2018, Qadir et al. 2019, Sornapudi et al. 2019, Wittenberg et al. 2019, Jia X. et al. 2020, Ma Y. et al. 2020, Young Lee J. et al. 2020, Podlasek J. et al. 2020, Qadir et al. 2021, Xu J. et al. 2021, Pacal et al. 2021, Liu et al. 2021, Nogueira-Rodríguez et al. 2022 |
CVC-ColonDB | Bernal et al. 2012<br>Vázquez et al. 2017 | 300 sequential WL images with polyps extracted from 13 sequences (13 patients). | Image | 574 × 500 | Polyp locations (binary mask) | Tajbakhsh et al. 2015, Brandao et al. 2018, Zheng Y. et al. 2018, Sornapudi et al. 2019, Jia X. et al. 2020, Podlasek J. et al. 2020, Qadir et al. 2021, Xu J. et al. 2021, Pacal et al. 2021, Li K. et al. 2021, Nogueira-Rodríguez et al. 2022 |
CVC-EndoSceneStill | Vázquez et al. 2017 | 912 WL images with polyps extracted from 44 videos (CVC-ClinicDB + CVC-ColonDB). | Image | 574 × 500, 384 × 288 | Locations for polyp, background, lumen and specular lights (binary mask) | Sánchez-Peralta et al. 2020 |
CVC-PolypHD | Bernal et al. 2012<br>Vázquez et al. 2017<br>Bernal et al. 2021<br>https://giana.grand-challenge.org | 56 WL images. | Image | 1920 × 1080 | Polyp locations (binary mask) | Sornapudi et al. 2019, Nogueira-Rodríguez et al. 2022 |
ETIS-Larib | Silva et al. 2014<br>https://polyp.grand-challenge.org/ETISLarib/ | 196 WL images with polyps extracted from 34 sequences with 44 different polyps. | Image | 1225 × 966 | Polyp locations (binary mask) | Brandao et al. 2018, Zheng Y. et al. 2018, Shin Y. et al. 2018, Tian Y. et al. 2019, Ahmad et al. 2019, Sornapudi et al. 2019, Wittenberg et al. 2019, Jia X. et al. 2020, Podlasek J. et al. 2020, Qadir et al. 2021, Xu J. et al. 2021, Pacal et al. 2021, Liu et al. 2021, Pacal et al. 2022, Nogueira-Rodríguez et al. 2022 |
Kvasir-SEG / HyperKvasir | Pogorelov et al. 2017<br>Jha et al. 2020<br>Borgli et al. 2020<br>https://datasets.simula.no/kvasir-seg<br>https://datasets.simula.no/hyper-kvasir/ | 1 000 polyp images | Image | Various resolutions | Polyp locations (binary mask and bounding box) | Sánchez-Peralta et al. 2020, Podlasek J. et al. 2020, Nogueira-Rodríguez et al. 2022 |
ASU-Mayo Clinic Colonoscopy Video | Tajbakhsh et al. 2016<br>https://polyp.grand-challenge.org/AsuMayo/ | 38 small SD and HD video sequences: 20 training videos annotated with ground truth and 18 testing videos without ground truth annotations. WL and NBI. | Video | 688 × 550 | Polyp locations (binary mask) | Yu et al. 2017, Brandao et al. 2018, Zhang R. et al. 2018, Ahmad et al. 2019, Sornapudi et al. 2019, Wittenberg et al. 2019, Mohammed et al. 2018, Li K. et al. 2021 |
CVC-ClinicVideoDB | Angermann et al. 2017<br>Bernal et al. 2018<br>Bernal et al. 2021<br>https://giana.grand-challenge.org | 38 short and long sequences: 18 SD videos for training. | Video | 768 × 576 | Polyp locations (binary mask) | Shin Y. et al. 2018, Qadir et al. 2019, Ma Y. et al. 2020, Xu J. et al. 2021, Nogueira-Rodríguez et al. 2022 |
Colonoscopic Dataset | Mesejo et al. 2016<br>http://www.depeca.uah.es/colonoscopy_dataset/ | 76 short videos (both NBI and WL). | Video | 768 × 576 | Polyp classification (hyperplastic vs. adenoma vs. serrated) | Zhang R. et al. 2017, Li K. et al. 2021 |
PICCOLO | Sánchez-Peralta et al. 2020<br>https://www.biobancovasco.org/en/Sample-and-data-catalog/Databases/PD178-PICCOLO-EN.html | 3 433 images (2 131 WL and 1 302 NBI) from 76 lesions from 40 patients. | Image | 854 × 480, 1920 × 1080 | Polyp locations (binary mask)<br>Polyp classification, including: Paris and NICE classifications, Adenocarcinoma vs. Adenoma vs. Hyperplastic, and histological stratification | Sánchez-Peralta et al. 2020, Pacal et al. 2022, Nogueira-Rodríguez et al. 2022 |
LDPolypVideo | Ma Y. et al. 2021<br>https://github.com/dashishi/LDPolypVideo-Benchmark | 160 videos (40 187 frames: 33 876 polyp images and 6 311 non-polyp images) with 200 labeled polyps.<br>103 videos (861 400 frames: 371 400 polyp images and 490 000 non-polyp images) without full annotations. | Video | 768 × 576 (videos), 560 × 480 (images) | Polyp locations (bounding box) | Ma Y. et al. 2021, Nogueira-Rodríguez et al. 2022 |
KUMC dataset | Li K. et al. 2021<br>https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FCBUOR | 80 colonoscopy video sequences. It also aggregates the CVC-ColonDB, ASU-Mayo Clinic Colonoscopy Video, and Colonoscopic Dataset datasets. | Image | Various resolutions | Polyp locations (bounding box)<br>Polyp classification: Adenoma vs. Hyperplastic | Li K. et al. 2021, Nogueira-Rodríguez et al. 2022 |
CP-CHILD-A, CP-CHILD-B | Wang W. et al. 2020<br>https://figshare.com/articles/dataset/CP-CHILD_zip/12554042 | CP-CHILD-A contains 1 000 polyp images and 7 000 non-polyp images.<br>CP-CHILD-B contains 400 polyp images and 1 100 normal or other pathological images. | Image | 256 × 256 | Polyp detection: polyp vs. non-polyp annotations | Wang W. et al. 2020 |
SUN | Misawa et al. 2021<br>http://amed8k.sundatabase.org/ | 49 136 images with polyps from 100 different polyps. 109 554 non-polyp images from 13 video sequences. | Image | N/A | Polyp locations (bounding box) | Misawa et al. 2021, Pacal et al. 2022, Nogueira-Rodríguez et al. 2022 |
Colorectal Polyp Image Cohort (PIBAdb) | https://www.iisgaliciasur.es/home/biobanco/colorectal-polyp-image-cohort-pibadb/?lang=en | ~31 400 polyp images (~22 600 WL and ~8 800 NBI) from 1 176 different polyps.<br>~17 300 non-polyp images (including ~2 800 normal-mucosa images and ~500 clean-mucosa images) | Video and image | 768 × 576 | Polyp locations (bounding box)<br>Polyp classification: Adenoma vs. Hyperplastic vs. Sessile Serrated Adenoma vs. Traditional Serrated Adenoma vs. Non Epithelial Neoplastic vs. Invasive | Nogueira-Rodríguez et al. 2022, Nogueira-Rodríguez et al. 2023 |
ENDOTEST | Fitting et al. 2022 | Validation dataset: 24 polyp video sequences and their corresponding non-polyp sequences (22 856 images: 12 161 with polyps and 10 695 without polyps)<br>Performance dataset: 10 full-length colonoscopy videos with 24 different polyps (230 898 images). | Video and image | N/A | Polyp locations (bounding box) | |
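Most of the image datasets above distribute their ground truth as binary masks stored alongside the frames, typically matched to each image by filename. The following sketch shows one way to pair image and mask files by filename stem; the directory layout and extensions are illustrative assumptions, not the actual structure of any dataset listed here.

```python
from pathlib import PurePath

def pair_images_with_masks(image_files, mask_files):
    """Match each image to its ground-truth mask by filename stem.

    Assumes the common convention that a mask shares the stem of its
    image (e.g. 'images/12.tif' <-> 'masks/12.tif'); images without a
    mask are skipped.
    """
    masks_by_stem = {PurePath(m).stem: m for m in mask_files}
    pairs = []
    for img in image_files:
        stem = PurePath(img).stem
        if stem in masks_by_stem:
            pairs.append((img, masks_by_stem[stem]))
    return pairs

# Hypothetical layout mimicking CVC-ClinicDB-style datasets
images = ["images/1.tif", "images/2.tif", "images/3.tif"]
masks = ["masks/1.tif", "masks/3.tif"]
print(pair_images_with_masks(images, masks))
```

In practice the returned pairs would feed a data loader that reads each image and its mask for training a segmentation model.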
Private Datasets
Study | Patients | No. Images | No. Videos | No. Unique Polyps | Purpose | Comments |
---|---|---|---|---|---|---|
Tajbakhsh et al. 2015 | N/A | 35 000<br>With polyps: 7 000<br>Without polyps: 28 000 | 40 short videos (20 positive and 20 negative) | N/A | Polyp localization | - |
Zhu R. et al. 2015 | N/A | 180 | - | N/A | Polyp localization | - |
Park and Sargent 2016 | N/A | 652<br>With polyps: 92 | 35 (20’ to 40’) | N/A | Polyp localization | - |
Ribeiro et al. 2016 | 66 to 86 | 85 to 126 | - | N/A | Polyp classification (neoplastic vs non-neoplastic) | 8 datasets by combining: (i) with or without staining mucosa, (ii) 4 acquisition modes (without CVC, i-Scan1, i-Scan2, i-Scan3). |
Zhang R. et al. 2017, Zheng Y. et al. 2018 | N/A | 1 930<br>Without polyps: 1 104<br>Hyperplastic: 263<br>Adenomatous: 563 | - | 215 polyps (65 hyperplastic and 150 adenomatous) | Polyp classification (hyperplastic vs. adenomatous) | PWH Database. Images taken under either WL or NBI endoscopy. |
Yuan and Meng 2017 | 35 | 4 000<br>Normal WCE images: 3 000 (1 000 bubbles, 1 000 turbid, and 1 000 clear)<br>Polyp images: 1 000 | - | N/A | Polyp detection | - |
Byrne et al. 2017 | N/A | N/A | 388 | N/A | Polyp classification (hyperplastic vs. adenomatous) | - |
Komeda et al. 2017 | N/A | 1 800<br>Adenomatous: 1 200<br>Non-adenomatous: 600 | - | N/A | Polyp classification (adenomatous vs. non-adenomatous) | - |
Chen et al. 2018 | N/A | 2 441<br>Training:<br>- Neoplastic: 1 476<br>- Hyperplastic: 681<br>Testing:<br>- Neoplastic: 188<br>- Hyperplastic: 96 | - | N/A | Polyp classification (hyperplastic vs. neoplastic) | - |
Misawa et al. 2018 | 73 | N/A | 546 (155 positive and 391 negative) | 155 | Polyp detection | - |
Urban et al. 2018 | > 2 000 | 8 641 | - | 4 088 | Polyp localization | Used as training dataset. |
Urban et al. 2018 | N/A | 1 330<br>With polyps: 672<br>Without polyps: 658 | - | 672 | Polyp localization | Used as independent dataset for testing. |
Urban et al. 2018 | 9 | 44 947<br>With polyps: 13 292<br>Without polyps: 31 655 | 9 | 45 | Polyp localization | Used as independent dataset for testing. |
Urban et al. 2018 | 11 | N/A | 11 | 73 | Polyp localization | Used as independent dataset for testing with “deliberately more challenging colonoscopy videos”. |
Wang et al. 2018 | 1 290 | 5 545<br>With polyps: 3 634<br>Without polyps: 1 911 | - | N/A | Polyp localization | Used as training dataset. |
Wang et al. 2018 | 1 138 | 27 113<br>With polyps: 5 541<br>Without polyps: 21 572 | - | 1 495 | Polyp localization | Used as testing dataset. |
Wang et al. 2018 | 110 | - | 138 | 138 | Polyp localization | Used as testing dataset. |
Wang et al. 2018 | 54 | - | 54 | 0 | Polyp localization | Used as testing dataset. |
Lui et al. 2019 | N/A | 8 000<br>Curable lesions: 4 000<br>Incurable lesions: 4 000 | - | Curable lesions: 159<br>Incurable lesions: 493 | Polyp classification (endoscopically curable vs. incurable lesions) | Used as training dataset. This study is focused on larger endoscopic lesions with risk of submucosal invasion and lymphovascular permeation. |
Lui et al. 2019 | N/A | 567 | - | Curable: 56<br>Incurable: 20 | Polyp classification (endoscopically curable vs. incurable lesions) | Used as testing dataset. This study is focused on larger endoscopic lesions with risk of submucosal invasion and lymphovascular permeation. |
Tian Y. et al. 2019 | 218 | 871<br>MS I: 102<br>MS II: 346<br>MS IIo: 281<br>MS IIIa: 79<br>MS IIIb: 63 | - | N/A | Polyp classification (5 classes) | - |
Blanes-Vidal et al. 2019 | 255 | 11 300<br>With polyps: 4 800<br>Without polyps: 6 500 | N/A | 331 polyps (OC) and 375 (CCE) | Polyp localization | CCE: colorectal capsule endoscopy. OC: conventional optical colonoscopy. |
Zhang X. et al. 2019 | 215 | 404 | - | N/A | Polyp localization | - |
Misawa et al. 2019 | N/A | 3 017 088 | - | 930 | Polyp detection | Used as training set. |
Misawa et al. 2019 | 64 (47 with polyps and 17 without polyps) | N/A | N/A | 87 | Polyp detection | Used as testing set. |
Kandel et al. 2019 | 552 | N/A | - | 963 | Polyp classification (hyperplastic, serrated adenomas (sessile/traditional), adenomas) | - |
Zachariah et al. 2019 | N/A | 5 278<br>Adenoma: 3 310<br>Serrated: 1 968 | - | 5 278 | Polyp classification (adenoma vs. serrated) | Used as training set. |
Zachariah et al. 2019 | N/A | 634 | - | N/A | Polyp classification (adenoma vs. serrated) | Used as testing set. |
Zhu X. et al. 2019 | 283 | 1 991 | - | N/A | Polyp detection | Adenomatous polyps. |
Ahmad et al. 2019 | N/A | 83 716<br>With polyps: 14 634<br>Without polyps: 69 082 | 17 | 83 | Polyp localization | White light images. |
Sornapudi et al. 2019 | N/A | 55 | N/A | 67 | Polyp localization | Wireless capsule endoscopy videos. Used as testing set. |
Sornapudi et al. 2019 | N/A | 1 800<br>With polyps: 530<br>Without polyps: 1 270 | 18 | N/A | Polyp localization | Wireless capsule endoscopy videos. Used as training set. |
Wittenberg et al. 2019 | N/A | 2 484 | - | 2 513 | Polyp localization | - |
Yuan Y. et al. 2019 | 80 | 7 200<br>Polyp images: 1 200<br>Normal images (mucosa, bubbles, and turbid): 6 000 | 80 | N/A | Polyp detection | - |
Ma Y. et al. 2019 | 1 661 | 3 428 | - | N/A | Polyp localization | - |
Liu X. et al. 2019 | 2 000 | 8 000<br>Polyp: 872<br>Adenoma: 1 210 | - | N/A | Polyp localization and classification (polyp vs. adenoma) | - |
Bour et al. 2019 | N/A | 785<br>Not dangerous: 699<br>Dangerous: 25<br>Cancer: 61 | - | N/A | Polyp classification (not dangerous vs. dangerous vs. cancer) | - |
Patino-Barrientos et al. 2020 | 142 | 600<br>Type I: 47<br>Type II: 90<br>Type III: 183<br>Type IV: 187<br>Type V: 93 | - | N/A | Polyp classification (malignant vs. non-malignant) | - |
Cheng Tao Pu et al. 2020 | N/A | 1 235<br>MS I: 103<br>MS II: 429<br>MS IIo: 293<br>MS IIIa: 295<br>MS IIIb: 115 | - | N/A | Polyp classification (5 classes) | Australian (AU) dataset (NBI). Used as training set. |
Cheng Tao Pu et al. 2020 | N/A | 20<br>MS I: 3<br>MS II: 5<br>MS IIo: 2<br>MS IIIa: 7<br>MS IIIb: 3 | - | N/A | Polyp classification (5 classes) | Japan (JP) dataset (NBI). Used as testing set. |
Cheng Tao Pu et al. 2020 | N/A | 49<br>MS I: 9<br>MS II: 10<br>MS IIo: 10<br>MS IIIa: 11<br>MS IIIb: 9 | - | N/A | Polyp classification (5 classes) | Japan (JP) dataset (BLI). Used as testing set. |
Ozawa et al. 2020 | 3 417 (3 021 with polyps and 396 without polyps) | 20 431<br>WL: 17 566<br>- Adenoma: 9 310<br>- Hyperplastic: 2 002<br>- SSAP: 116<br>- Cancer: 1 468<br>- Other types: 657<br>- Normal mucosa: 4 013<br>NBI: 2 865<br>- Adenoma: 2 085<br>- Hyperplastic: 519<br>- SSAP: 23<br>- Cancer: 131<br>- Other types: 107<br>- Normal mucosa: 0 | - | 4 752<br>Adenoma: 3 513<br>Hyperplastic: 1 058<br>SSAP: 22<br>Cancer: 68<br>Other types: 91 | Polyp localization and classification (adenoma vs. hyperplastic vs. SSAP vs. cancer vs. other types) | Used as training set. |
Ozawa et al. 2020 | 174 | 7 077<br>WL: 6 748<br>- Adenoma: 639<br>- Hyperplastic: 145<br>- SSAP: 33<br>- Cancer: 30<br>- Other types: 27<br>- Normal mucosa: 5 874<br>NBI: 329<br>- Adenoma: 208<br>- Hyperplastic: 69<br>- SSAP: 8<br>- Cancer: 3<br>- Other types: 10<br>- Normal mucosa: 31 | - | 309<br>Adenoma: 218<br>Hyperplastic: 63<br>SSAP: 7<br>Cancer: 4<br>Other types: 17 | Polyp localization and classification (adenoma vs. hyperplastic vs. SSAP vs. cancer vs. other types) | Used as testing set. |
Young Lee J. et al. 2020 | 103 | 8 075 | 181 | N/A | Polyp localization | Used as training set. |
Young Lee J. et al. 2020 | 203 | 420 | N/A | 322 hyperplastic or sessile serrated adenomas | Polyp localization | Used as training set. |
Young Lee J. et al. 2020 | 7 | 108 778<br>- With polyps: 7 022<br>- Without polyps: 101 756 | 7 | 26 | Polyp localization | Used as testing set. |
Young Joo Yang et al. 2020 | 1 339 | 3 828<br>- Tubular adenoma: 1 316<br>- Non-neoplastic: 896<br>- High-grade dysplasia: 621 | - | N/A | Polyp classification | Used as training/test set. |
Young Joo Yang et al. 2020 | 240 | 240<br>- Tubular adenoma: 116<br>- Non-neoplastic: 113<br>- Early CRC/High-grade dysplasia: 8<br>- Advanced CRC: 3 | - | N/A | Polyp classification | External validation dataset. |
Li T. et al. 2020 | - | 7 384<br>- With polyps: 509<br>- Without polyps: 6 875 | 23 | N/A | Polyp detection | Colonoscopy videos obtained from YouTube, VideoGIE, and Vimeo. |
Podlasek J. et al. 2020 | 123 | 79 284 | 157 | N/A | Polyp localization | Used as development (train/validation split) dataset. |
Podlasek J. et al. 2020 | - | 2 678 | - | N/A | Polyp localization | Used as development (train/validation split) dataset. |
Podlasek J. et al. 2020 | 34 | - | 42 | N/A | Polyp localization | Used as testing dataset. |
Xu J. et al. 2021 | 262 | 1 482 | - | 1 683 | Polyp localization | RenjiImageDB. Used as testing set. |
Xu J. et al. 2021 | 14 | 8 837<br>With polyps: 3 294<br>Without polyps: 5 543 | 14 | 15 | Polyp localization | RenjiVideoDB. Used as testing set. |
Misawa et al. 2021 | N/A | 56 668<br>With polyps: 55 644<br>Without polyps: 1 024 | N/A | N/A | Polyp localization | Used as development (train/validation split) dataset. |
Livovsky et al. 2021 | 2 487 | With polyps: 204 687 (189 994 video frames + 14 693 still images)<br>Without polyps: 80 M (80 M video frames + 158 646 still images) | 3 611 | 8 471 | Polyp localization | Used as training set. |
Livovsky et al. 2021 | 1 181 | 33 M video frames | 1 393 | 3 680 | Polyp localization | Used as testing set. |
Nogueira-Rodríguez et al. 2021 | 330 | 28 576<br>White light: 21 046<br>NBI: 7 530 | - | 941 | Polyp localization | - |
Yoshida et al. 2021 | 25 | N/A | N/A | 100:<br>LED endoscope: 53 (25 neoplastic and 28 hyperplastic)<br>LASER endoscope: 47 (30 neoplastic and 17 hyperplastic) | Polyp localization and classification (neoplastic vs. hyperplastic) | Testing set to evaluate the CAD EYE (Fujifilm) system. |
Deep Learning Models and Architectures
Deep Learning Architectures
Off-the-shelf Architectures
Study | Task | Models | Framework | TL | Layers fine-tuned | Layers replaced | Output layer |
---|---|---|---|---|---|---|---|
Ribeiro et al. 2016 | Classification | AlexNet, GoogLeNet, Fast CNN, Medium CNN, Slow CNN, VGG16, VGG19 | - | ImageNet | N/A | Layers after last CNN layer | SVM |
Zhang R. et al. 2017 | Detection and classification | CaffeNet | - | ImageNet and Places205 | N/A | Tested connecting classifier to each convolutional layer (5 convolutional layers) | SVM (Poly, Linear, RBF, and Tanh) |
Chen et al. 2018 | Classification | Inception v3 | - | ImageNet | N/A | Last layer | FCL |
Tian Y. et al. 2019 | Localization and Classification | RetinaNet (based on ResNet-50) | N/A | ImageNet | N/A | Last layer | N/A |
Misawa et al. 2018, Misawa et al. 2019 | Detection | C3D | - | N/A | N/A | N/A | N/A |
Zheng Y. et al. 2018 | Localization | - | YOLOv1 | PASCAL VOC 2007 and 2012 | All | - | - |
Shin Y. et al. 2018 | Localization | Inception ResNet-v2 | Faster R-CNN with post-learning schemes | COCO | All | - | RPN and detector layers |
Urban et al. 2018 | Localization | ResNet-50, VGG16, VGG19 | - | ImageNet<br>Also without TL | All | Last layer | FCL |
Wang et al. 2018 | Localization | VGG16 | SegNet | N/A | N/A | N/A | N/A |
Wittenberg et al. 2019 | Localization | ResNet101 | Mask R-CNN | COCO | All (incrementally) | Last layer | FCL |
Yuan Y. et al. 2019 | Detection | DenseNet | Tensorflow | - | All | - | FCL |
Ma Y. et al. 2019 | Localization | SSD Inception v2 | Tensorflow | N/A | N/A | - | - |
Liu X. et al. 2019 | Localization and classification | Faster R-CNN with Inception Resnet v2 | Tensorflow | COCO | All | - | - |
Zachariah et al. 2019 | Classification | Inception ResNet-v2 | Tensorflow | ImageNet | N/A | Last layer | Graded scale transformation with sigmoid activation |
Bour et al. 2019 | Classification | ResNet-50, ResNet-101, Xception, VGG19, Inception v3 | Keras (Tensorflow) | Yes | N/A | Last layer | N/A |
Patino-Barrientos et al. 2020 | Classification | VGG16 | Keras (Tensorflow) | ImageNet | None, Last three | Last layer | Dense with sigmoid activation |
Ozawa et al. 2020 | Localization and Classification | SSD (Single Shot MultiBox Detector) | Caffe | N/A | All | - | - |
Ma Y. et al. 2020 | Localization | YOLOv3, RetinaNet | N/A | ImageNet | N/A | N/A | N/A |
Young Lee J. et al. 2020 | Localization | YOLOv2 | N/A | N/A | N/A | N/A | N/A |
Young Joo Yang et al. 2020 | Classification | ResNet-152, Inception-ResNet-v2 | PyTorch | ImageNet | All | N/A | N/A |
Wang W. et al. 2020 | Detection | VGG16, VGG19, ResNet-101, ResNet-152 | PyTorch | - | All | Last layer | Fully Connected Layer or Global Average Pooling |
Li T. et al. 2020 | Detection | AlexNet | Caffe | ImageNet | N/A | N/A | N/A |
Sánchez-Peralta et al. 2020 | Localization | Backbone: VGG-16 or Densenet121, Encoder-decoder: U-Net or LinkNet | Keras (Tensorflow) | No | - | N/A | N/A |
Podlasek J. et al. 2020 | Localization | EfficientNet B4, RetinaNet | N/A | No | - | N/A | N/A |
Misawa et al. 2021 | Localization | YOLOv3 | N/A | Yes | N/A | - | FCL |
Livovsky et al. 2021 | Localization | LSTM-SSD | N/A | No | - | - | - |
Nogueira-Rodríguez et al. 2021, Nogueira-Rodríguez et al. 2022, Nogueira-Rodríguez et al. 2023 | Localization | YOLOv3 | MXNet | PASCAL VOC 2007 and 2012 | All | - | FCL |
Ma Y. et al. 2021 | Localization | RetinaNet, Faster R-CNN, YOLOv3, and CenterNet | N/A | ImageNet | - | - | N/A |
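The TL (transfer learning), "Layers fine-tuned" and "Layers replaced" columns describe variations of one common recipe: load weights pre-trained on a generic dataset (ImageNet, COCO, PASCAL VOC), freeze all or part of the backbone, and replace the output layer with a freshly initialized head sized for the target classes. A framework-agnostic toy sketch of that recipe follows; the `Layer` class and the layer names are illustrative assumptions, not taken from any of the studies above.

```python
class Layer:
    """Toy stand-in for a network layer with trainable weights."""
    def __init__(self, name):
        self.name = name
        self.trainable = True  # newly created layers start trainable

def apply_transfer_learning(layers, n_classes, fine_tune_last=0):
    """Freeze the pre-trained backbone, optionally unfreeze its last
    `fine_tune_last` layers, and swap the pre-trained output layer for a
    fresh classification head sized for the target task."""
    backbone = layers[:-1]  # drop the pre-trained output layer
    for layer in backbone:
        layer.trainable = False
    if fine_tune_last:
        for layer in backbone[-fine_tune_last:]:
            layer.trainable = True
    head = Layer(f"fc_{n_classes}")  # newly initialized output layer
    return backbone + [head]

# Replace only the last layer, as in e.g. a "Last layer" row above
pretrained = [Layer(n) for n in ["conv1", "conv2", "conv3", "fc_imagenet"]]
model = apply_transfer_learning(pretrained, n_classes=2)
print([(layer.name, layer.trainable) for layer in model])
```

Setting `fine_tune_last` to the backbone depth corresponds to the "All" entries in the "Layers fine-tuned" column, where every layer is updated during training on the target data.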
Custom Architectures
Study | Task | Based on | Highlights |
---|---|---|---|
Tajbakhsh et al. 2014, Tajbakhsh et al. 2015 | Localization | None | Combination of classic computer vision techniques (detection and location) with DL (refinement of predictions). A classic ML method proposes candidate polyps. Then, three sets of multi-scale patches around each candidate are generated (color, shape, and temporal). Each set of patches is fed to a corresponding CNN. Each CNN has 2 convolutional layers, 2 fully connected layers, and an output layer. The maximum score for each set of patches is computed, and the three maxima are averaged. |
Zhu R. et al. 2015 | Localization | LeNet-5 | CNN fed with 32x32 images taken from patches generated via a sliding window with a 16-pixel step over the original images. The CNN architecture is inspired by LeNet-5. ReLU is used as the activation function. The last two layers are replaced with a cost-sensitive SVM. Positively classified patches are combined to generate the final output. |
Park and Sargent 2016 | Localization | None | Based on a previous work with no DL techniques. An initial quality assessment and preprocessing step filters and cleans images, and proposes candidate regions of interest (RoI). CNN replaces previous feature extractor. Three convolutional layers with two interspersed subsampling layers followed by a fully connected layer. A final step uses a Conditional Random Field (CRF) for RoI classification. |
Yu et al. 2017 | Localization | None | Two 3D-FCNs are used: an offline network trained on a training dataset, and an online network initialized with the offline weights and updated every 60 frames with the video frames; only its last two layers are updated. The last 16 frames are used for predicting each frame. The architecture comprises two convolutional layers each followed by a pooling layer, then two groups of two convolutional layers each followed by a pooling layer, and ends with two convolutional layers converted from fully connected layers. The outputs of the two networks are combined to generate the final prediction. |
Yuan and Meng 2017 | Detection | Stacked Sparse AutoEncoder (SSAE) | A modification of a Sparse AutoEncoder to include an image manifold constraint, named Stacked Sparse AutoEncoder with Image Manifold Constraint (SSAEIM). SSAEIM is built by stacking three SAEIM layers followed by an output layer. Image manifold information is used on each layer. |
Byrne et al. 2017 | Classification | Inception v3 | Last layer replaced with a fully connected layer. A credibility score is calculated for each frame with the current frame prediction and the credibility score of the previous frame. |
Komeda et al. 2017 | Classification | None | Two convolutional layers followed by a pooling layer each, followed by a final fully connected output layer. |
Brandao et al. 2018, Ahmad et al. 2019 | Localization | AlexNet, GoogLeNet, ResNet-50, ResNet-101, ResNet-152, VGG | Networks pre-trained with the PASCAL VOC and ImageNet datasets were converted into fully convolutional networks by replacing the fully connected and scoring layers with a convolutional layer. A final deconvolution layer produces an output of the same size as the input. A regularization operation is added between every convolutional and activation layer. VGG, ResNet-101 and ResNet-152 were also tested using shape-from-shading features. |
Zhang R. et al. 2018 | Localization | YOLO | Custom architecture RYCO that consists of two networks: 1. A regression-based deep learning with residual learning (ResYOLO) detection model to locate polyps in a frame. 2. A Discriminative Correlation Filter (DCF) based method called Efficient Convolution Operators (ECO) to track the detected polyps. The ResYOLO network detects new polyps in a frame, starting the polyp tracking. During tracking, both ResYOLO and the ECO tracker are used to determine the polyp location. Tracking stops when a confidence score calculated using the last frames falls below a threshold value. |
Urban et al. 2018 | Detection | None | Two custom CNNs are proposed. The first CNN is built with only convolutional, max pooling and fully connected layers. The second CNN also includes batch normalization layers and inception modules. |
Urban et al. 2018 | Localization | YOLO | The 5 CNNs used for detection (two custom, VGG16, VGG19 and ResNet-50) are modified by replacing the fully connected layers with convolutional layers. The last layer has 5 filter maps whose outputs are spaced over a grid over the input image. Each grid cell predicts its confidence with a sigmoid unit, the position of the polyp relative to the grid cell center, and its size. The final output is the confidence-weighted sum of all the adjusted position and size predictions. |
Mohammed et al. 2018 | Detection | Y-Net | The framework consists of two fully convolutional encoder networks connected to a single decoder network that matches the encoder resolution at each down-sampling operation. The networks are trained with encoder-specific adaptive learning rates that update the parameters of the randomly initialized encoder with a larger step size than those of the encoder with pre-trained weights. The features of the two encoders are merged with the decoder network at each down-sampling path through sum-skip connections. |
Lui et al. 2019 | Classification | ResNet | A network with 5 convolutional layers and 2 fully connected layers, built on a pre-trained ResNet CNN backbone. |
Qadir et al. 2019 | Localization | None | Framework for false positive (FP) reduction is proposed. The framework adds a FP reduction unit to an RPN network. This unit exploits temporal dependencies between frames (forward and backward) to correct the output. Faster R-CNN and SSD RPNs were tested. |
Blanes-Vidal et al. 2019 | Localization | R-CNN with AlexNet | Several modifications were made to AlexNet: - Last fully connected layer replaced to output two classes. - 5 convolutional and 3 fully connected layers were fine-tuned. - Max-pooling kernels, the ReLU activation function and dropout used to avoid overfitting and build robustness to intra-class deformations. - Stochastic gradient descent with momentum used as the optimization algorithm. |
Zhang X. et al. 2019 | Localization | SSD | SSD was modified to add three new pooling layers (Second-Max Pooling, Second-Min Pooling and Min-Pooling) and a new deconvolution layer whose features are concatenated to those from the Max-Pooling layer that are fed into the detection layer. Model was pre-trained on the ILSVRC CLS-LOC dataset. |
Kandel et al. 2019 | Classification | CapsNet | A convolutional layer followed by 7 convolutional capsule layers and finalized with a global average pool by capsule type. |
Sornapudi et al. 2019 | Localization | Mask R-CNN | The region proposal network (RPN) uses a Feature Pyramid Network with a ResNet backbone. ResNet-50 and ResNet-101 were used, improved by extracting features from 5 different levels of layers. The ResNet networks were initialized with COCO and ImageNet weights. Additionally, 76 random balloon images from Flickr were used to fine-tune the networks initialized with COCO. The regions proposed by the RPN are filtered before the ROIAlign layer. The ROIAlign layer is followed by a pixel probability mask network, comprised of 4 convolutional layers, a transposed convolutional layer and a final convolutional layer with a sigmoid activation function that generates the final output. All convolutional layers except the final one use the ReLU activation function. |
Tashk et al. 2019 | Localization | U-Net | The U-Net architecture was modified to accept as input any image or video format associated with optical colonoscopy modalities. |
Patino-Barrientos et al. 2020 | Classification | None | The model is composed of four convolutional layers, each followed by a max pooling layer. After that, the model has a dropout layer to reduce overfitting and a final dense layer with sigmoid activation that outputs the probability of the current polyp being malignant. The model was trained using the RMSprop optimizer with a learning rate of 1×10⁻⁴. |
Jia X. et al. 2020 | Localization | ResNet-50, Feature Pyramid Network, and Faster R-CNN | Authors propose a two-stage framework, where the polyp proposal stage (stage I) is constructed as a region-level polyp detector that guides the pixel-level learning in the polyp segmentation stage (stage II), aiming to accurately segment the area the polyp occupies in the image. The framework has a backbone network composed of a ResNet-50 followed by a Feature Pyramid Network, producing a set of feature maps used by both stages. The polyp proposal stage was created as an extension of Faster R-CNN, which performs as a region-level polyp detector to recognize the lesion area as a whole, while the polyp segmentation stage is built in a fully convolutional fashion for pixel-wise segmentation. A feature sharing strategy transfers the learned semantics of polyp proposals from stage I to the segmentation task of stage II. |
Qadir et al. 2021 | Localization | ResNet-34 and MDeNet | Authors propose a modified version of MDeNet, originally presented in Qadir et al. 2019. See section 2.3 ("F-CNN models for polyp detection") of Qadir et al. 2021 for more details. |
Xu J. et al. 2021 | Localization | YOLOv3 | Authors present a framework based on YOLOv3 to improve detection. This framework adds: (i) a False Positive Relearning Module (FPRM) to make the detector network learn more about the features of FPs for higher precision; (ii) an Image Style Transfer Module (ISTM) to enhance the features of polyps for higher sensitivity; (iii) an Inter-frame Similarity Correlation Unit (ISCU) to integrate spatiotemporal information, which is combined with the image detector network to improve performance in video detection and reduce FPs. |
Pacal et al. 2021 | Localization | YOLOv4 | Authors propose several models based on YOLOv4. To create their "Proposed Model1 (Small)" they first replaced the whole structure with Cross Stage Partial Networks (CSPNet), then replaced the Leaky ReLU activation function with Mish and the Complete Intersection over Union (CIoU) loss with the Distance Intersection over Union (DIoU) loss. |
Liu et al. 2021 | Localization | ResNet-101 and Domain-adaptive Faster R-CNN | Authors propose a consolidated domain-adaptive framework with a training-free style transfer process, a hierarchical network, and a centre besiegement loss for accurate cross-domain polyp detection and localization. |
Pacal et al. 2022 | Localization | YOLOv3, YOLOv4 | Authors propose modified versions of YOLOv3 and YOLOv4 by integrating the Cross Stage Partial Network (CSPNet). With the aim of improving detection performance, they also use the Sigmoid-weighted Linear Unit (SiLU) activation function and the Complete Intersection over Union (CIoU) loss function. |
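Several of the detection studies above (e.g. Pacal et al. 2021, Pacal et al. 2022) swap between IoU-based losses such as DIoU and CIoU. Both extend the plain intersection-over-union between a predicted and a ground-truth bounding box, which can be sketched as follows (an illustrative helper, not code from any of the studies):

```python
def iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) corners."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])  # intersection top-left
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])  # intersection bottom-right
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

# Identical boxes overlap fully; partially overlapping boxes score in (0, 1).
iou((0, 0, 2, 2), (0, 0, 2, 2))  # 1.0
iou((0, 0, 2, 2), (1, 1, 3, 3))  # 1/7
```

DIoU adds a penalty on the distance between box centers, and CIoU further adds an aspect-ratio consistency term on top of this quantity.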
Data Augmentation Strategies
Study | Rotation | Flipping (Mirroring) | Shearing | Crop | Random brightness | Translation (Shifting) | Scale | Zooming | Gaussian smoothing | Blurring | Saturation adjustment | Gaussian distortion | Resize | Random contrast | Exposure adjustment | Color augmentations in HSV | Mosaic | Mix-up | Histogram equalization | Skew | Random erasing | Color distribution adjust | Clipping | Sharpening | Cutmix | Color jittering | Random image expansion |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Num. Studies | 28 | 26 | 12 | 9 | 9 | 8 | 8 | 6 | 4 | 4 | 3 | 3 | 3 | 3 | 2 | 2 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Tajbakhsh et al. 2015 | X | X | X | X | X | ||||||||||||||||||||||
Park and Sargent 2016 | X | X | |||||||||||||||||||||||||
Ribeiro et al. 2016 | X | X | |||||||||||||||||||||||||
Yu et al. 2017 | X | X | |||||||||||||||||||||||||
Byrne et al. 2017 | X | X | X | ||||||||||||||||||||||||
Brandao et al. 2018 | X | ||||||||||||||||||||||||||
Zhang R. et al. 2018 | X | X | X | X | X | ||||||||||||||||||||||
Zheng Y. et al. 2018 | X | ||||||||||||||||||||||||||
Shin Y. et al. 2018 | X | X | X | X | X | X | |||||||||||||||||||||
Urban et al. 2018 | X | X | X | ||||||||||||||||||||||||
Mohammed et al. 2018 | X | X | X | X | X | X | |||||||||||||||||||||
Qadir et al. 2019 | X | X | X | X | |||||||||||||||||||||||
Tian Y. et al. 2019 | X | X | X | X | X | ||||||||||||||||||||||
Blanes-Vidal et al. 2019 | X | X | X | ||||||||||||||||||||||||
Zhang X. et al. 2019 | X | ||||||||||||||||||||||||||
Zhu X. et al. 2019 | X | X | X | ||||||||||||||||||||||||
Sornapudi et al. 2019 | X | X | X | X | X | X | |||||||||||||||||||||
Wittenberg et al. 2019 | X | X | |||||||||||||||||||||||||
Yuan Y. et al. 2019 | X | X | X | X | X | ||||||||||||||||||||||
Ma Y. et al. 2019 | X | X | X | ||||||||||||||||||||||||
Bour et al. 2019 | X | X | X | X | X | X | X | X | X | ||||||||||||||||||
Patino-Barrientos et al. 2020 | X | X | X | X | X | ||||||||||||||||||||||
Cheng Tao Pu et al. 2020 | X | X | X | ||||||||||||||||||||||||
Ma Y. et al. 2020 | X | X | X | X | X | ||||||||||||||||||||||
Young Lee J. et al. 2020 | X | X | X | X | |||||||||||||||||||||||
Young Joo Yang et al. 2020 | X | ||||||||||||||||||||||||||
Wang W. et al. 2020 | X | X | X | ||||||||||||||||||||||||
Li T. et al. 2020 | X | X | X | X | |||||||||||||||||||||||
Podlasek J. et al. 2020 | X | X | |||||||||||||||||||||||||
Qadir et al. 2021 | X | X | X | X | |||||||||||||||||||||||
Xu J. et al. 2021 | X | X | X | ||||||||||||||||||||||||
Misawa et al. 2021 | X | X | X | ||||||||||||||||||||||||
Livovsky et al. 2021 | X | X | |||||||||||||||||||||||||
Pacal et al. 2021a | X | X | X | X | X | X | X | X | X | X | X | X | X | X | |||||||||||||
Liu et al. 2021 | X | X | X | X | X | X | |||||||||||||||||||||
Nogueira-Rodríguez et al. 2021 | X | X | X | X | |||||||||||||||||||||||
Ma Y. et al. 2021 | X | X | X | X | X | ||||||||||||||||||||||
Pacal et al. 2021b | X | X | X | X | X
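Most of the strategies in the table above are simple geometric or photometric transforms. As a minimal, library-free illustration (not taken from any study), horizontal flipping and 90° rotation of an image stored as a list of rows can be written as:

```python
def hflip(img):
    """Horizontal flip (mirroring): reverse each row."""
    return [row[::-1] for row in img]

def rot90(img):
    """Rotate 90 degrees clockwise: reverse the rows, then transpose."""
    return [list(col) for col in zip(*img[::-1])]

image = [[1, 2],
         [3, 4]]
mirrored = hflip(image)  # [[2, 1], [4, 3]]
rotated = rot90(image)   # [[3, 1], [4, 2]]
```

In practice the studies apply such operations through libraries (e.g. NumPy, Pillow, or framework-specific augmentation pipelines), typically composing several transforms at random per training image.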
Frameworks and Libraries
Performance
Note: Some performance metrics are not directly reported in the papers, but were derived using raw data or confusion matrices provided by them.
Polyp Detection and Localization
Performance metrics on public and private datasets of all polyp detection and localization studies.
- Between parentheses it is specified the type of performance metric: i = image-based, bb = bounding-box-based, p = polyp-based, pa = patch-based, and pi = pixel-based.
- Between curly brackets it is specified the training dataset, where "P" stands for private.
- Between square brackets it is specified the test dataset used for computing the performance metric, where "P" stands for private.
- For instance, [{P}] means that development and test splits of the same private dataset have been used for training and testing respectively.
- Performances marked with an * are reported on training datasets (e.g. k-fold cross-validation).
- AP stands for Average Precision.
Note: Since February 2022, the former frame-based (f) type has been split into image-based and bounding-box-based, which more accurately reflects the type of evaluation performed. Please note that our review paper uses frame-based and includes both.
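The F1 and F2 values in the tables follow from the reported precision and recall via the general Fβ formula (β = 1 weights both equally, β = 2 favours recall). For example, the Tajbakhsh et al. 2015 row (recall 70%, precision 63%) yields F1 ≈ 0.66 and F2 ≈ 0.68:

```python
def f_beta(precision, recall, beta=1.0):
    """General F-beta score; beta > 1 weights recall over precision."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

f1 = f_beta(0.63, 0.70)            # ~0.663 -> reported as 0.66
f2 = f_beta(0.63, 0.70, beta=2.0)  # ~0.685 -> reported as 0.68
```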
Study | Recall (sensitivity) | Precision (PPV) | Specificity | Others | Manually selected images? |
---|---|---|---|---|---|
Tajbakhsh et al. 2015 | 70% (bb) {[P]} | 63% (bb) {[P]} | 90% (bb) {[P]} | F1: 0.66, F2: 0.68 (bb) {[P]} | No |
Zhu R. et al. 2015 | 79.44% (pa) {[P]} | N/A | 79.54% (pa) {[P]} | Acc: 79.53% (pa) {[P]} | Yes |
Park and Sargent 2016 | 86% (bb) {P} * | - | 85% (bb) {P} * | AUC: 0.86 (bb) {P} * | Yes (on training) |
Yu et al. 2017 | 71% (bb) {[ASU-Mayo]} | 88.1% (bb) {[ASU-Mayo]} | N/A | F1: 0.786, F2: 0.739 (bb) {[ASU-Mayo]} | No |
Zhang R. et al. 2017 | 97.6% (i) {[P]} | 99.4% (i) {[P]} | N/A | F1: 0.98, F2: 0.98, AUC: 1.00 (i) {[P]} | Yes |
Yuan and Meng 2017 | 98% (i) {P} * | 97% (i) {P} * | 99% (i) {P} * | F1: 0.98, F2: 0.98 (i) [P] | Yes |
Brandao et al. 2018 | ~90% (bb) {CVC-ClinicDB + ASU-Mayo} [ETIS-Larib] ~90% (bb) {CVC-ClinicDB + ASU-Mayo} [CVC-ColonDB] |
~73% (bb) {CVC-ClinicDB + ASU-Mayo} [ETIS-Larib] ~80% (bb) {CVC-ClinicDB + ASU-Mayo} [CVC-ColonDB] |
N/A | F1: ~0.81, F2: ~0.86 (bb) {CVC-ClinicDB + ASU-Mayo} [ETIS-Larib] F1: ~0.85, F2: ~0.88 (bb) {CVC-ClinicDB + ASU-Mayo} [CVC-ColonDB] |
Yes |
Zhang R. et al. 2018 | 71.6% (bb) {[ASU-Mayo]} | 88.6% (bb) {[ASU-Mayo]} | 97% (bb) {[ASU-Mayo]} | F1: 0.792, F2: 0.744 (bb) {[ASU-Mayo]} | No |
Misawa et al. 2018 | 90% (i) {[P]} 94% (p) {[P]} |
55.1% (i) {[P]} 48% (p) {[P]} |
63.3% (i) {[P]} 40% (p) {[P]} |
F1: 0.68 (i) 0.63 (p), F2: 0.79 (i) 0.78 (p) {[P]} Acc: 76.5% (i) 60% (p) {[P]} |
No |
Zheng Y. et al. 2018 | 74% (bb) {CVC-ClinicDB + CVC-ColonDB} [ETIS-Larib] | 77.4% (bb) {CVC-ClinicDB + CVC-ColonDB} [ETIS-Larib] | N/A | F1: 0.757, F2: 0.747 (bb) {CVC-ClinicDB + CVC-ColonDB} [ETIS-Larib] | Yes |
Shin Y. et al. 2018 | 80.3% (bb) {CVC-ClinicDB} [ETIS-Larib] 84.2% (bb) {CVC-ClinicDB} [ASU-Mayo] 84.3% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
86.5% (bb) {CVC-ClinicDB} [ETIS-Larib] 82.7% (bb) {CVC-ClinicDB} [ASU-Mayo] 89.7% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
N/A | F1: 0.833, F2: 0.815 (bb) {CVC-ClinicDB} [ETIS-Larib] F1: 0.834, F2: 0.839 (bb) {CVC-ClinicDB} [ASU-Mayo] F1: 0.869, F2: 0.853 (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
Yes (ETIS-Larib) No (ASU-Mayo, CVC-ClinicVideoDB) |
Urban et al. 2018 | 93% (bb) {P1} [P2] 100% (p) {P1} [P2] 93% (p) {P1} [P3] |
74% (bb) {P1} [P2] 35% (p) {P1} [P2] 60% (p) {P1} [P3] |
93% (bb) {P1} [P2] | F1: 0.82, F2: 0.88 (bb) {P1} [P2] F1: 0.52, F2: 0.73 (p) {P1} [P2] F1: 0.73, F2: 0.84 (p) {P1} [P3] |
No |
Wang et al. 2018 | 88.24% (bb) {P} [CVC-ClinicDB] 94.38% (bb) {P} [P (dataset A)] 91.64% (bb), 100% (p) {P} [P (dataset C)] |
93.14% (bb) {P} [CVC-ClinicDB] 95.76% (bb) {P} [P (dataset A)] |
95.40% (bb) {P} [P (dataset D)] | F1: 0.91, F2: 0.89 (bb) {P} [CVC-ClinicDB] F1: 0.95, F2: 0.95, AUC: 0.984 (bb) {P} [P (dataset A)] |
Yes (dataset A, CVC-ClinicDB) No (dataset C/D) |
Mohammed et al. 2018 | 84.4% (bb) {[ASU-Mayo]} | 87.4 % (bb) {[ASU-Mayo]} | N/A | F1: 0.859, F2: 0.85 (bb) {[ASU-Mayo]} | No |
Qadir et al. 2019 | 81.51% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | 87.51% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | 84.26% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | F1: 0.844, F2: 0.83 (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | No |
Tian Y. et al. 2019 | 64.42% (bb) {P} [ETIS-Larib] | 73.6% (bb) {P} [ETIS-Larib] | - | F1: 0.687, F2: 0.66 (bb) {P} [ETIS-Larib] | Yes |
Blanes-Vidal et al. 2019 | 97.1% (bb) {[P]} | 91.4% (bb) {[P]} | 93.3% (bb) {[P]} | Acc: 96.4%, F1: 0.94, F2: 0.95 (bb) {[P]} | N/A (not clear in the paper) |
Zhang X. et al. 2019 | 76.37% (bb) {[P]} | 93.92% (bb) {[P]} | N/A | F1: 0.84, F2: 0.79 (bb) {[P]} | Yes |
Misawa et al. 2019 | 86% (p) {P1} [P2] | N/A | 74% (i) {P1} [P2] | - | No |
Zhu X. et al. 2019 | 88.5% (i) {P1} [P2] | N/A | 96.4% (i) {P1} [P2] | - | No |
Ahmad et al. 2019 | 91.6% (bb) {P} [ETIS-Larib] 84.5% (bb) {ETIS-Larib + P} [P] |
75.3% (bb) {P} [ETIS-Larib] | 92.5% (bb) {ETIS-Larib + P} [P] | F1: 0.83, F2: 0.88 (bb) {P} [ETIS-Larib] | Yes (ETIS-Larib) No (private) |
Sornapudi et al. 2019 | 91.64% (bb) {CVC-ClinicDB} [CVC-ColonDB] 78.12% (bb) {CVC-ClinicDB} [CVC-PolypHD] 80.29% (bb) {CVC-ClinicDB} [ETIS-Larib] 95.52% (bb) {[P]} |
89.94% (bb) {CVC-ClinicDB} [CVC-ColonDB] 83.33% (bb) {CVC-ClinicDB} [CVC-PolypHD] 72.93% (bb) {CVC-ClinicDB} [ETIS-Larib] 98.46% (bb) {[P]} |
N/A | F1: 0.9073, F2: 0.9127 (bb) {CVC-ClinicDB} [CVC-ColonDB] F1: 0.8065, F2: 0.7911 (bb) {CVC-ClinicDB} [CVC-PolypHD] F1: 0.7643, F2: 0.7870 (bb) {CVC-ClinicDB} [ETIS-Larib] F1: 0.966, F2: 0.961 (bb) {[P]} |
Yes (CVC-ClinicDB, CVC-ColonDB, ETIS-Larib) No (WCE video) |
Wittenberg et al. 2019 | 86% (bb) {P} [CVC-ClinicDB] 83% (bb) {P} [ETIS-Larib] 93% (bb) {[P]} |
80% (bb) {P} [CVC-ClinicDB] 74% (bb) {P} [ETIS-Larib] 86% (bb) {[P]} |
N/A | F1: 0.82, F2: 0.85 (bb) {P} [CVC-ClinicDB] F1: 0.79, F2: 0.81 (bb) {P} [ETIS-Larib] F1: 0.89, F2: 0.92 (bb) {[P]} |
Yes |
Yuan Y. et al. 2019 | 90.21% (i) {[P]} | 74.51% (i) {[P]} | 94.07% (i) {[P]} | Accuracy: 93.19%, F1: 0.81, F2: 0.86 (i) {[P]} | Yes |
Ma Y. et al. 2019 | 93.67% (bb) {[P]} | N/A | 98.36% (bb) {[P]} | Accuracy: 96.04%, AP: 94.92% (bb) {[P]} | Yes |
Tashk et al. 2019 | 82.7% (pi) {[CVC-ClinicDB]} 90.9% (pi) {[ETIS-Larib]} 82.4% (pi) {[CVC-ColonDB]} |
70.2% (pi) {[CVC-ClinicDB]} 70.2 (pi) {[ETIS-Larib]} 62% (pi) {[CVC-ColonDB]} |
- | Accuracy: 99.02%, F1: 0.76, F2: 0.798 (pi) {[CVC-ClinicDB]} Accuracy: 99.6%, F1: 0.7923, F2: 0.858 (pi) {[ETIS-Larib]} Accuracy: 98.2%, F1: 0.707, F2: 0.773 (pi) {[CVC-ColonDB]} |
Yes (CVC-ClinicDB, CVC-ColonDB, ETIS-Larib) |
Jia X. et al. 2020 | 92.1% (bb) {CVC-ColonDB} [CVC-ClinicDB] 59.4% (pi) {CVC-ColonDB} [CVC-ClinicDB] 81.7% (bb) {CVC-ClinicDB} [ETIS-Larib] |
84.8% (bb) {CVC-ColonDB} [CVC-ClinicDB] 85.9% (pi) {CVC-ColonDB} [CVC-ClinicDB] 63.9% (bb) {CVC-ClinicDB} [ETIS-Larib] |
- | F1: 0.883, F2: 0.905 (bb) {CVC-ColonDB} [CVC-ClinicDB] F1: 0.702, F2: 0.633, Jaccard: 74.7±20.5, Dice: 83.9±13.6 (pi) {CVC-ColonDB} [CVC-ClinicDB] F1: 0.717, F2: 0.774 (bb) {CVC-ClinicDB} [ETIS-Larib] |
Yes (CVC-ClinicDB, ETIS-Larib) |
Ozawa. et al. 2020 | 92% (bb) {P1} [P2] 90% (bb) {P1} [P2: WL] 97% (bb) {P1} [P2: NBI] 98% (p) {P1} [P2] |
86% (bb) {P1} [P2] 83% (bb) {P1} [P2: WL] 97% (bb) {P1} [P2: NBI] |
N/A | F1: 0.88, F2: 0.88 (bb) {P1} [P2] F1: 0.86, F2: 0.84 (bb) {P1} [P2: WL] F1: 0.97, F2: 0.97 (bb) {P1} [P2: NBI] |
Yes |
Ma Y. et al. 2020 | 92% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | 87.50% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | N/A | F1: 0.897, F2: 0.911 (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] | No |
Young Lee J. et al. 2020 | 96.7% (bb) {[P]} 90.2% (bb) {P} [CVC-ClinicDB] |
97.4% (bb) {[P]} 98.2% (bb) {P} [CVC-ClinicDB] |
N/A | F1: 0.97, F2: 0.97 (bb) {[P]} F1: 0.94, F2: 0.96 (bb) {P} [CVC-ClinicDB] |
Yes (CVC-ClinicDB, private) |
Wang W. et al. 2020 | 97.5% (i) {[CP-CHILD-A]} 98% (i) {CP-CHILD-A} [CP-CHILD-B] |
N/A | 99.85% (i) {[CP-CHILD-A]} 99.83% (i) {CP-CHILD-A} [CP-CHILD-B] |
Accuracy: 99.25% (i) {[CP-CHILD-A]} Accuracy: 99.34% (i) {CP-CHILD-A} [CP-CHILD-B] |
Yes |
Li T. et al. 2020 | 73% (i) {[P]} | 93% (i) {[P]} | 96% (i) {[P]} | NPV: 83%, Acc: 86%, AUC: 0.94 (i) {[P]} | Yes |
Sánchez-Peralta et al. 2020 | 74.73% (pi) {PICCOLO} [Kvasir-SEG] 71.88% (pi) {PICCOLO} [CVC-EndoSceneStill] 72.89% (pi) {[PICCOLO]} 69.77% (pi) {PICCOLO} [PICCOLO-WL] 63.31% (pi) {CVC-EndoSceneStill} [Kvasir-SEG] 79.22% (pi) {[CVC-EndoSceneStill]} 45.09% (pi) {CVC-EndoSceneStill} [PICCOLO] 57.06% (pi) {CVC-EndoSceneStill} [PICCOLO-WL] 88.98% (pi) {[Kvasir-SEG]} 83.46% (pi) {Kvasir-SEG} [CVC-EndoSceneStill] 58.11% (pi) {Kvasir-SEG} [PICCOLO] 54.63% (pi) {Kvasir-SEG} [PICCOLO-WL] |
81.31% (pi) {PICCOLO} [Kvasir-SEG] 84.35% (pi) {PICCOLO} [CVC-EndoSceneStill] 77.58% (pi) {[PICCOLO]} 71.33% (pi) {PICCOLO} [PICCOLO-WL] 77.80% (pi) {CVC-EndoSceneStill} [Kvasir-SEG] 87.88% (pi) {[CVC-EndoSceneStill]} 52.84% (pi) {CVC-EndoSceneStill} [PICCOLO] 60.93% (pi) {CVC-EndoSceneStill} [PICCOLO-WL] 81.68% (pi) {[Kvasir-SEG]} 83.54% (pi) {Kvasir-SEG} [CVC-EndoSceneStill] 59.54% (pi) {Kvasir-SEG} [PICCOLO] 63.61% (pi) {Kvasir-SEG} [PICCOLO-WL] |
97.41% (pi) {PICCOLO} [Kvasir-SEG] 98.85% (pi) {PICCOLO} [CVC-EndoSceneStill] 97.96% (pi) {[PICCOLO]} 97.37% (pi) {PICCOLO} [PICCOLO-WL] 98.15% (pi) {CVC-EndoSceneStill} [Kvasir-SEG] 99.00% (pi) {[CVC-EndoSceneStill]} 97.30% (pi) {CVC-EndoSceneStill} [PICCOLO] 91.12% (pi) {CVC-EndoSceneStill} [PICCOLO-WL] 96.49% (pi) {[Kvasir-SEG]} 97.65% (pi) {Kvasir-SEG} [CVC-EndoSceneStill] 93.29% (pi) {Kvasir-SEG} [PICCOLO] 98.06% (pi) {Kvasir-SEG} [PICCOLO-WL] |
F1: 0.779, F2: 0.760, Jaccard: 65.33±30.66, Dice: 73.54±30.15 (pi) {PICCOLO} [Kvasir-SEG] F1: 0.776, F2: 0.741, Jaccard: 64.18±33.04, Dice: 71.66±32.98 (pi) {PICCOLO} [CVC-EndoSceneStill] F1: 0.752, F2: 0.738, Jaccard: 64.01±36.23, Dice: 70.10±36.45 (pi) {[PICCOLO]} F1: 0.705, F2: 0.701, Jaccard: 58.70±38.90, Dice: 64.51±39.18 (pi) {PICCOLO} [PICCOLO-WL] F1: 0.698, F2: 0.658, Jaccard: 56.12±34.29, Dice: 64.26±35.35 (pi) {CVC-EndoSceneStill} [Kvasir-SEG] F1: 0.833, F2: 0.808, Jaccard: 72.16±30.93, Dice: 78.61±29.48 (pi) {[CVC-EndoSceneStill]} F1: 0.487, F2: 0.465, Jaccard: 39.52±37.9, Dice: 45.5±41.51 (pi) {CVC-EndoSceneStill} [PICCOLO] F1: 0.589, F2: 0.578, Jaccard: 45.00±35.60, Dice: 52.81±38.33 (pi) {CVC-EndoSceneStill} [PICCOLO-WL] F1: 0.852, F2: 0.874, Jaccard: 74.52±22.81, Dice: 82.68±21.28 (pi) {[Kvasir-SEG]} F1: 0.835, F2: 0.835, Jaccard: 71.82±29.87, Dice: 78.78±28.14 (pi) {Kvasir-SEG} [CVC-EndoSceneStill] F1: 0.588, F2: 0.584, Jaccard: 44.92±37.37, Dice: 51.87±39.79 (pi) {Kvasir-SEG} [PICCOLO] F1: 0.588, F2: 0.562, Jaccard: 47.74±39.55, Dice: 53.62±41.68 (pi) {Kvasir-SEG} [PICCOLO-WL] |
Yes |
Podlasek J. et al. 2020 | 91.2% (bb) {P} [CVC-ClinicDB] 88.2% (bb) {P} [Hyper-Kvasir] 74.1% (bb) {P} [CVC-ColonDB] 67.3% (bb) {P} [ETIS-Larib] |
97.4% (bb) {P} [CVC-ClinicDB] 97.5% (bb) {P} [Hyper-Kvasir] 92.4% (bb) {P} [CVC-ColonDB] 79% (bb) {P} [ETIS-Larib] |
N/A | F1: 0.942, F2: 0.924 (bb) {P} [CVC-ClinicDB] F1: 0.926, F2: 0.899 (bb) {P} [Hyper-Kvasir] F1: 0.823, F2: 0.771 (bb) {P} [CVC-ColonDB] F1: 0.727, F2: 0.693 (bb) {P} [ETIS-Larib] |
Yes |
Qadir et al. 2021 | 86.54% (bb) {CVC-ClinicDB} [ETIS-Larib] 91% (bb) {CVC-ClinicDB} [CVC-ColonDB] |
86.12% (bb) {CVC-ClinicDB} [ETIS-Larib] 88.35% (bb) {CVC-ClinicDB} [CVC-ColonDB] |
N/A | F1: 0.863, F2: 0.864 (bb) {CVC-ClinicDB} [ETIS-Larib] F1: 0.896, F2: 0.904 (bb) {CVC-ClinicDB} [CVC-ColonDB] |
Yes |
Xu J. et al. 2021 | 75.70% (bb) {CVC-ClinicDB + CVC-ColonDB + ETIS-Larib + CVC-ClinicVideoDB} [P] 71.63% (bb) {CVC-ClinicDB} [ETIS-Larib] 66.36% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
85.54% (bb) {CVC-ClinicDB + CVC-ColonDB + ETIS-Larib + CVC-ClinicVideoDB} [P] 83.24% (bb) {CVC-ClinicDB} [ETIS-Larib] 88.5% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
N/A | F1: 0.799, F2: 0.773 (bb) {CVC-ClinicDB + CVC-ColonDB + ETIS-Larib + CVC-ClinicVideoDB} [P] F1: 0.77, F2: 0.737 (bb) {CVC-ClinicDB} [ETIS-Larib] F1: 0.7586, F2: 0.698 (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] |
Yes (ETIS-Larib, Private) No (CVC-ClinicVideoDB) |
Misawa et al. 2021 | 98% (p) {P} [SUN] 90.5% (i) {P} [SUN] |
88.2% (i) {P} [SUN] | 93.7% (i) {P} [SUN] | F1: 0.893, F2: 0.900, NPV: 94.96% (i) {P} [SUN] | No |
Livovsky et al. 2021 | 97.1% (p) {P1} [P2] | N/A | N/A | N/A | No |
Pacal et al. 2021 | 82.55% (bb) {CVC-ClinicDB} [ETIS-Larib] 96.68% (bb) {CVC-ClinicDB} [CVC-ColonDB] |
91.62% (bb) {CVC-ClinicDB} [ETIS-Larib] 96.04% (bb) {CVC-ClinicDB} [CVC-ColonDB] |
N/A | F1: 0.868, F2: 0.842 (bb) {CVC-ClinicDB} [ETIS-Larib] F1: 0.964, F2: 0.965 (bb) {CVC-ClinicDB} [CVC-ColonDB] |
Yes |
Liu et al. 2021 | 87.5% (bb) {CVC-ClinicDB} [ETIS-Larib] | 77.8% (bb) {CVC-ClinicDB} [ETIS-Larib] | - | F1: 0.824, F2: 0.854 (bb) {CVC-ClinicDB} [ETIS-Larib] | Yes (ETIS-Larib) |
Li K. et al. 2021 | 86.2% (bb) {[KUMC]} | 91.2% (bb) {[KUMC]} | N/A | F1: 0.886, F2: 0.8715, AP: 88.5% (bb) {[KUMC]} | Yes |
Nogueira-Rodríguez et al. 2021 | 87% (bb) {[P]} 89.91% (p) {[P]} |
89% (bb) {[P]} | 54.97% (p) {[P]} | F1: 0.881, F2: 0.876 (bb) {[P]} | Yes |
Yoshida et al. 2021 | 83% (p) {CAD EYE} [P-LED WLI] 87.2% (p) {CAD EYE} [P-LASER WLI] 88.7% (p) {CAD EYE} [P-LED LCI] 89.4% (p) {CAD EYE} [P-LASER LCI] |
N/A | N/A | N/A | - |
Ma Y. et al. 2021 | 64% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] 47% (bb) {CVC-ClinicDB} [LDPolypVideo] |
85% (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] 65% (bb) {CVC-ClinicDB} [LDPolypVideo] |
N/A | F1: 0.73, F2: 0.67 (bb) {CVC-ClinicDB} [CVC-ClinicVideoDB] F1: 0.55, F2: 0.50 (bb) {CVC-ClinicDB} [LDPolypVideo] |
Yes (CVC-ClinicDB) No (LDPolypVideo, CVC-ClinicVideoDB) |
Pacal et al. 2022 | 91.04% (bb) {SUN + PICCOLO + CVC-ClinicDB} [ETIS-Larib] 90.57% (bb) {SUN + CVC-ClinicDB} [ETIS-Larib] 88.24% (bb) {SUN} [ETIS-Larib] 75.53% (bb) {PICCOLO} [ETIS-Larib] 79.85% (bb) {[PICCOLO]} 86.48% (bb) {[SUN]} |
90.61% (bb) {SUN + PICCOLO + CVC-ClinicDB} [ETIS-Larib] 90.14% (bb) {SUN + CVC-ClinicDB} [ETIS-Larib] 88.24% (bb) {SUN} [ETIS-Larib] 87.29% (bb) {PICCOLO} [ETIS-Larib] 92.60% (bb) {[PICCOLO]} 96.49% (bb) {[SUN]} |
N/A | F1: 0.908, F2: 0.909 (bb) {SUN + PICCOLO + CVC-ClinicDB} [ETIS-Larib] F1: 0.903, F2: 0.902 (bb) {SUN + CVC-ClinicDB} [ETIS-Larib] F1: 0.882, F2: 0.882 (bb) {SUN} [ETIS-Larib] F1: 0.809, F2: 0.846 (bb) {PICCOLO} [ETIS-Larib] F1: 0.857, F2: 0.821 (bb) {[PICCOLO]} F1: 0.912, F2: 0.883 (bb) {[SUN]} |
Yes (ETIS-Larib, PICCOLO) No (SUN) |
Nogueira-Rodríguez et al. 2022 | 82% (bb) {P} [CVC-ClinicDB] 84% (bb) {P} [CVC-ColonDB] 75% (bb) {P} [CVC-PolypHD] 72% (bb) {P} [ETIS-Larib] 78% (bb) {P} [Kvasir-SEG] 60% (bb) {P} [PICCOLO] 80% (bb) {P} [CVC-ClinicVideoDB] 81% (bb) {P} [KUMC dataset] 76% (bb) {P} [KUMC dataset–Test] 78% (bb) {P} [SUN] 49% (bb) {P} [LDPolypVideo] |
87% (bb) {P} [CVC-ClinicDB] 81% (bb) {P} [CVC-ColonDB] 86% (bb) {P} [CVC-PolypHD] 71% (bb) {P} [ETIS-Larib] 84% (bb) {P} [Kvasir-SEG] 76% (bb) {P} [PICCOLO] 75% (bb) {P} [CVC-ClinicVideoDB] 83% (bb) {P} [KUMC dataset] 81% (bb) {P} [KUMC dataset–Test] 83% (bb) {P} [SUN] 56% (bb) {P} [LDPolypVideo] |
- | F1: , F2: 0.83, AP: 0.82 (bb) {P} [CVC-ClinicDB] F1: 0.83, F2: 0.83, AP: 0.85 (bb) {P} [CVC-ColonDB] F1: 0.80, F2: 0.77, AP: 0.79 (bb) {P} [CVC-PolypHD] F1: 0.72, F2: 0.72, AP: 0.69 (bb) {P} [ETIS-Larib] F1: 0.81, F2: 0.82, AP: 0.79 (bb) {P} [Kvasir-SEG] F1: 0.67, F2: 0.62, AP: 0.63 (bb) {P} [PICCOLO] F1: 0.77, F2: 0.79, AP: 0.77 (bb) {P} [CVC-ClinicVideoDB] F1: 0.82, F2: 0.81, AP: 0.83 (bb) {P} [KUMC dataset] F1: 0.78, F2: 0.77, AP: 0.79 (bb) {P} [KUMC dataset–Test] F1: 0.81, F2: 0.79, AP: 0.81 (bb) {P} [SUN] F1: 0.52, F2: 0.50, AP: 0.44 (bb) {P} [LDPolypVideo] |
Yes (CVC-ClinicDB, CVC-ColonDB, CVC-PolypHD, ETIS-Larib, Kvasir-SEG, PICCOLO, KUMC) No (CVC-ClinicVideoDB, SUN, LDPolypVideo) |
Nogueira-Rodríguez et al. 2023 | 87.2% (bb) {P} [P] 86.7% (bb) {P2} [P] 87.5% (bb) {P5} [P] 85% (bb) {P10} [P] 88% (bb) {P15} [P] |
Intra-dataset evaluation:
Nogueira et al. 2021: 89% (bb) {P} [P], 88.2% (bb) {P} [P2], 87.1% (bb) {P} [P5], 85.2% (bb) {P} [P10], 83.6% (bb) {P} [P15]
Not-polyp images increment 2%: 89.4% (bb) {P2} [P], 89% (bb) {P2} [P2], 88.6% (bb) {P2} [P5], 87.9% (bb) {P2} [P10], 87.1% (bb) {P2} [P15]
Not-polyp images increment 5%: 90.2% (bb) {P5} [P], 89.9% (bb) {P5} [P2], 89.5% (bb) {P5} [P5], 88.8% (bb) {P5} [P10], 88.1% (bb) {P5} [P15]
Not-polyp images increment 10%: 90.4% (bb) {P10} [P], 90.2% (bb) {P10} [P2], 90.1% (bb) {P10} [P5], 89.7% (bb) {P10} [P10], 89.5% (bb) {P10} [P15]
Not-polyp images increment 15%: 91% (bb) {P15} [P], 90.9% (bb) {P15} [P2], 90.7% (bb) {P15} [P5], 90.4% (bb) {P15} [P10], 90.1% (bb) {P15} [P15] |
- | Intra-dataset evaluation:
Nogueira et al. 2021: F1: 0.881 (bb) {P} [P], F1: 0.882 (bb) {P} [P2], F1: 0.871 (bb) {P} [P5], F1: 0.852 (bb) {P} [P10], F1: 0.836 (bb) {P} [P15]
Not-polyp images increment 2%: F1: 0.880 (bb) {P2} [P], F1: 0.890 (bb) {P2} [P2], F1: 0.886 (bb) {P2} [P5], F1: 0.879 (bb) {P2} [P10], F1: 0.871 (bb) {P2} [P15]
Not-polyp images increment 5%: F1: 0.888 (bb) {P5} [P], F1: 0.899 (bb) {P5} [P2], F1: 0.895 (bb) {P5} [P5], F1: 0.888 (bb) {P5} [P10], F1: 0.881 (bb) {P5} [P15]
Not-polyp images increment 10%: F1: 0.876 (bb) {P10} [P], F1: 0.902 (bb) {P10} [P2], F1: 0.901 (bb) {P10} [P5], F1: 0.897 (bb) {P10} [P10], F1: 0.895 (bb) {P10} [P15]
Not-polyp images increment 15%: F1: 0.895 (bb) {P15} [P], F1: 0.909 (bb) {P15} [P2], F1: 0.907 (bb) {P15} [P5], F1: 0.904 (bb) {P15} [P10], F1: 0.901 (bb) {P15} [P15]
Inter-dataset evaluation:
LDPolypVideo: F1: 0.522 (bb) {P} [LDPolypVideo], F1: 0.563 (bb) {P2} [LDPolypVideo], F1: 0.516 (bb) {P5} [LDPolypVideo], F1: 0.491 (bb) {P10} [LDPolypVideo], F1: 0.564 (bb) {P15} [LDPolypVideo]
CVC-ClinicVideoDB: F1: 0.774 (bb) {P} [CVC-ClinicVideoDB], F1: 0.803 (bb) {P2} [CVC-ClinicVideoDB], F1: 0.813 (bb) {P5} [CVC-ClinicVideoDB], F1: 0.809 (bb) {P10} [CVC-ClinicVideoDB], F1: 0.800 (bb) {P15} [CVC-ClinicVideoDB]
KUMC dataset: F1: 0.818 (bb) {P} [KUMC dataset], F1: 0.811 (bb) {P2} [KUMC dataset], F1: 0.819 (bb) {P5} [KUMC dataset], F1: 0.762 (bb) {P10} [KUMC dataset], F1: 0.831 (bb) {P15} [KUMC dataset]
PICCOLO: F1: 0.667 (bb) {P} [PICCOLO], F1: 0.601 (bb) {P2} [PICCOLO], F1: 0.691 (bb) {P5} [PICCOLO], F1: 0.759 (bb) {P10} [PICCOLO], F1: 0.691 (bb) {P15} [PICCOLO]
CVC-ClinicDB: F1: 0.845 (bb) {P} [CVC-ClinicDB], F1: 0.843 (bb) {P2} [CVC-ClinicDB], F1: 0.867 (bb) {P5} [CVC-ClinicDB], F1: 0.786 (bb) {P10} [CVC-ClinicDB], F1: 0.824 (bb) {P15} [CVC-ClinicDB]
CVC-ColonDB: F1: 0.826 (bb) {P} [CVC-ColonDB], F1: 0.848 (bb) {P2} [CVC-ColonDB], F1: 0.883 (bb) {P5} [CVC-ColonDB], F1: 0.689 (bb) {P10} [CVC-ColonDB], F1: 0.797 (bb) {P15} [CVC-ColonDB]
SUN: F1: 0.805 (bb) {P} [SUN], F1: 0.764 (bb) {P2} [SUN], F1: 0.738 (bb) {P5} [SUN], F1: 0.765 (bb) {P10} [SUN], F1: 0.746 (bb) {P15} [SUN]
Kvasir-SEG: F1: 0.807 (bb) {P} [Kvasir-SEG], F1: 0.800 (bb) {P2} [Kvasir-SEG], F1: 0.797 (bb) {P5} [Kvasir-SEG], F1: 0.840 (bb) {P10} [Kvasir-SEG], F1: 0.830 (bb) {P15} [Kvasir-SEG]
ETIS-Larib: F1: 0.718 (bb) {P} [ETIS-Larib], F1: 0.732 (bb) {P2} [ETIS-Larib], F1: 0.679 (bb) {P5} [ETIS-Larib], F1: 0.594 (bb) {P10} [ETIS-Larib], F1: 0.685 (bb) {P15} [ETIS-Larib]
CVC-PolypHD: F1: 0.800 (bb) {P} [CVC-PolypHD], F1: 0.729 (bb) {P2} [CVC-PolypHD], F1: 0.826 (bb) {P5} [CVC-PolypHD], F1: 0.820 (bb) {P10} [CVC-PolypHD], F1: 0.820 (bb) {P15} [CVC-PolypHD] |
Yes (PIBAdb, CVC-ClinicDB, CVC-ColonDB, CVC-PolypHD, ETIS-Larib, Kvasir-SEG, PICCOLO, KUMC) No (CVC-ClinicVideoDB, SUN, LDPolypVideo) |
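As noted above the table, some metrics were derived from the raw data or confusion matrices provided in the papers. The standard definitions used for such derivations can be sketched generically (not tied to any particular study):

```python
def metrics(tp, fp, tn, fn):
    """Standard detection metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)             # sensitivity
    precision = tp / (tp + fp)          # positive predictive value (PPV)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return recall, precision, specificity, accuracy

# A perfectly balanced confusion matrix gives 0.5 for every metric.
metrics(1, 1, 1, 1)  # (0.5, 0.5, 0.5, 0.5)
```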
Polyp Classification
Performance metrics on public and private datasets of all polyp classification studies.
- Between curly brackets it is specified the training dataset, where "P" stands for private.
- Between square brackets it is specified the test dataset used for computing the performance metric, where "P" stands for private.
- For instance, [{P}] means that development and test splits of the same private dataset have been used for training and testing respectively.
- Performances marked with an * are reported on training datasets (e.g. k-fold cross-validation).
Study | Classes | Recall (sensitivity) | Specificity | PPV | NPV | Others | Polyp-level vs. frame-level | Dataset type |
---|---|---|---|---|---|---|---|---|
Zhang R. et al. 2017 | Adenoma vs. hyperplastic<br>Resectable vs. non-resectable<br>Adenoma vs. hyperplastic vs. serrated | 92% (resectable vs. non-resectable) {[Colonoscopic Dataset]}<br>87.6% (adenoma vs. hyperplastic) {[P]} | 89.9% (resectable vs. non-resectable) {[Colonoscopic Dataset]}<br>84.2% (adenoma vs. hyperplastic) {[P]} | 95.4% (resectable vs. non-resectable) {[Colonoscopic Dataset]}<br>87.30% (adenoma vs. hyperplastic) {[P]} | 84.9% (resectable vs. non-resectable) {[Colonoscopic Dataset]}<br>87.2% (adenoma vs. hyperplastic) {[P]} | Acc: 91.3% (resectable vs. non-resectable) {[Colonoscopic Dataset]}<br>Acc: 86.7% (adenoma vs. serrated adenoma vs. hyperplastic) {[Colonoscopic Dataset]}<br>Acc: 85.9% (adenoma vs. hyperplastic) {[P]} | frame | video (manually selected images) |
Byrne et al. 2017 | Adenoma vs. hyperplastic | 98% {P1} [P2] | 83% {P1} [P2] | 90% {P1} [P2] | 97% {P1} [P2] | - | polyp | unaltered video |
Chen et al. 2018 | Neoplastic vs. hyperplastic | 96.3% {P1} [P2] | 78.1% {P1} [P2] | 89.6% {P1} [P2] | 91.5% {P1} [P2] | N/A | frame | image dataset |
Lui et al. 2019 | Endoscopically curable lesions vs. endoscopically incurable lesions | 88.2% {P1} [P2] | 77.9% {P1} [P2] | 92.1% {P1} [P2] | 69.3% {P1} [P2] | Acc: 85.5% {P1} [P2] | frame | image dataset |
Kandel et al. 2019 | Hyperplastic vs. serrated adenoma (near focus)<br>Hyperplastic vs. adenoma (far focus) | 57.14% (hyperplastic vs. serrated) {P} *<br>75.63% (hyperplastic vs. adenoma) {P} * | 68.52% (hyperplastic vs. serrated) {P} *<br>63.79% (hyperplastic vs. adenoma) {P} * | N/A | N/A | Acc: 67.21% (hyperplastic vs. serrated) {P} *<br>Acc: 72.48% (hyperplastic vs. adenoma) {P} * | frame | image dataset |
Zachariah et al. 2019 | Adenoma vs. serrated | 95.7% {P} * | 89.9% {P} * | 94.1% {P} * | 92.6% {P} * | Acc: 93.6%, F1: 0.948, F2: 0.953 {P} * | polyp | image dataset |
Bour et al. 2019 | Not dangerous vs. dangerous vs. cancer | 88% (Cancer vs. others) [P]<br>84% (Not dangerous vs. others) [P]<br>90% (Dangerous vs. others) [P] | 94% (Cancer vs. others) [P]<br>93% (Not dangerous vs. others) [P]<br>93% (Dangerous vs. others) [P] | 88% (Cancer vs. others) [P]<br>87% (Not dangerous vs. others) [P]<br>86% (Dangerous vs. others) [P] | N/A | Acc: 87.1% [P]<br>F1: 0.88 (Cancer vs. others) [P]<br>F1: 0.86 (Not dangerous vs. others) [P]<br>F1: 0.88 (Dangerous vs. others) [P] | frame | image dataset |
Patino-Barrientos et al. 2020 | Malignant vs. non-malignant | 86% {[P]} | N/A | 81% {[P]} | N/A | Acc: 83% {[P]}<br>F1: 0.83 {[P]} | frame | image dataset |
Cheng Tao Pu et al. 2020 | 5-class (I, II, IIo, IIIa, IIIb)<br>Adenoma (classes II + IIo + IIIa) vs. hyperplastic (class I) | 97% (adenoma vs. hyperplastic) {P: AU} *<br>100% (adenoma vs. hyperplastic) {P: AU} [P: JP-NBI]<br>100% (adenoma vs. hyperplastic) {P: AU} [P: JP-BLI] | 51% (adenoma vs. hyperplastic) {P: AU} *<br>0% (adenoma vs. hyperplastic) {P: AU} [P: JP-NBI]<br>0% (adenoma vs. hyperplastic) {P: AU} [P: JP-BLI] | 95% (adenoma vs. hyperplastic) {P: AU} *<br>82.4% (adenoma vs. hyperplastic) {P: AU} [P: JP-NBI]<br>77.5% (adenoma vs. hyperplastic) {P: AU} [P: JP-BLI] | 63.5% (adenoma vs. hyperplastic) {P: AU} *<br>- (adenoma vs. hyperplastic) {P: AU} [P: JP-NBI]<br>- (adenoma vs. hyperplastic) {P: AU} [P: JP-BLI] | AUC (5-class): 94.3% {P: AU} *<br>AUC (5-class): 84.5% {P: AU} [P: JP-NBI]<br>AUC (5-class): 90.3% {P: AU} [P: JP-BLI]<br>Acc: 72.3% (5-class) {P: AU} *<br>Acc: 59.8% (5-class) {P: AU} [P: JP-NBI]<br>Acc: 53.1% (5-class) {P: AU} [P: JP-BLI]<br>Acc: 92.7% (adenoma vs. hyperplastic) {P: AU} *<br>Acc: 82.4% (adenoma vs. hyperplastic) {P: AU} [P: JP-NBI]<br>Acc: 77.5% (adenoma vs. hyperplastic) {P: AU} [P: JP-BLI] | frame | image dataset |
Ozawa et al. 2020 | Adenoma vs. hyperplastic vs. SSAP vs. cancer vs. other types | 97% (adenoma vs. other classes) {P1} [P2: WL]<br>90% (adenoma vs. hyperplastic) {P1} [P2: WL]<br>97% (adenoma vs. other classes) {P1} [P2: NBI]<br>86% (adenoma vs. hyperplastic) {P1} [P2: NBI] | 81% (adenoma vs. hyperplastic) {P1} [P2: WL]<br>88% (adenoma vs. hyperplastic) {P1} [P2: NBI] | 86% (adenoma vs. other classes) {P1} [P2: WL]<br>98% (adenoma vs. hyperplastic) {P1} [P2: WL]<br>83% (adenoma vs. other classes) {P1} [P2: NBI]<br>98% (adenoma vs. hyperplastic) {P1} [P2: NBI] | 85% (adenoma vs. other classes) {P1} [P2: WL]<br>48% (adenoma vs. hyperplastic) {P1} [P2: WL]<br>91% (adenoma vs. other classes) {P1} [P2: NBI]<br>54% (adenoma vs. hyperplastic) {P1} [P2: NBI] | Acc: 83% (5-class) {P1} [P2: WL]<br>F1: 0.91, F2: 0.88 (adenoma vs. other classes) {P1} [P2: WL]<br>F1: 0.94, F2: 0.96 (adenoma vs. hyperplastic) {P1} [P2: WL]<br>Acc: 81% (5-class) {P1} [P2: NBI]<br>F1: 0.89, F2: 0.85 (adenoma vs. other classes) {P1} [P2: NBI]<br>F1: 0.92, F2: 0.95 (adenoma vs. hyperplastic) {P1} [P2: NBI] | frame | image dataset |
Young Joo Yang et al. 2020 | 7-class (CRC T1 vs. CRC T2 vs. CRC T3 vs. CRC T4 vs. high-grade dysplasia (HGD) vs. tubular adenoma with or without low-grade dysplasia (TA) vs. non-neoplastic lesions)<br>4-class (advanced CRC (T2, T3, and T4) vs. early CRC/HGD (CRC T1 and HGD) vs. TA vs. non-neoplastic lesions)<br>Advanced colorectal lesions vs. non-advanced colorectal lesions<br>Neoplastic lesions vs. non-neoplastic lesions | 94.1% (Neoplastic vs. non-neoplastic) {[P1]}<br>83.2% (Advanced vs. non-advanced) {[P1]} | 34.1% (Neoplastic vs. non-neoplastic) {[P1]}<br>89.7% (Advanced vs. non-advanced) {[P1]} | 86.1% (Neoplastic vs. non-neoplastic) {[P1]}<br>84.5% (Advanced vs. non-advanced) {[P1]} | 65% (Neoplastic vs. non-neoplastic) {[P1]}<br>88.7% (Advanced vs. non-advanced) {[P1]} | Acc: 79.5%, F1: 0.899, F2: 0.923, AUC: 0.832 (Neoplastic vs. non-neoplastic) {[P1]}<br>Acc: 93.5%, F1: 0.838, F2: 0.934, AUC: 0.935 (Advanced vs. non-advanced) {[P1]}<br>Acc: 71.5%, AUC: 0.760 (Neoplastic vs. non-neoplastic) {P1} [P2]<br>Acc: 87.1%, AUC: 0.935 (Advanced vs. non-advanced) {P1} [P2]<br>Acc (7-class): 60.2% {[P1]}, 74.7% {P1} [P2]<br>Acc (4-class): 67.7% {[P1]}, 76% {P1} [P2] | frame | image dataset |
Li K. et al. 2021 | Adenoma vs. hyperplastic | 86.8% {[KUMC]} | N/A | 85.8% {[KUMC]} | N/A | F1: 0.863 {[KUMC]} | polyp | image dataset |
Yoshida et al. 2021 | Neoplastic vs. hyperplastic | 91.7% {CAD EYE} [P: non-magnified BLI]<br>90.9% {CAD EYE} [P: magnified BLI] | 86.8% {CAD EYE} [P: non-magnified BLI]<br>85.2% {CAD EYE} [P: magnified BLI] | 82.5% {CAD EYE} [P: non-magnified BLI]<br>83.3% {CAD EYE} [P: magnified BLI] | 93.9% {CAD EYE} [P: non-magnified BLI]<br>92.0% {CAD EYE} [P: magnified BLI] | Acc: 88.8% {CAD EYE} [P: non-magnified BLI]<br>Acc: 87.8% {CAD EYE} [P: magnified BLI] | polyp | live video |
Simultaneous Polyp Detection and Classification
Performance metrics on public and private datasets of all simultaneous polyp detection and classification studies.
- The training dataset is specified between curly brackets, where "P" stands for private.
- The test dataset used for computing the performance metric is specified between square brackets, where "P" stands for private.
- For instance, {[P]} means that the development and test splits of the same private dataset were used for training and testing, respectively.
- AP<sub>IoU</sub> stands for Average Precision and mAP<sub>IoU</sub> for mean Average Precision (i.e. the mean of the per-class AP values), calculated at the specified IoU (Intersection over Union) threshold.
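These detection metrics rest on matching predicted bounding boxes to ground-truth boxes via IoU, then integrating the precision-recall curve. The sketch below shows the idea with a simplified, non-interpolated AP; it is an illustration of the general technique, not the exact evaluation protocol of any study listed here.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

def average_precision(matches, n_gt):
    """Non-interpolated AP: area under the precision-recall curve.
    `matches` is one boolean per detection (True = matched a ground-truth
    box at IoU >= threshold), sorted by descending confidence;
    `n_gt` is the total number of ground-truth boxes."""
    tp = fp = 0
    ap, prev_recall = 0.0, 0.0
    for is_tp in matches:
        tp += is_tp
        fp += not is_tp
        recall, precision = tp / n_gt, tp / (tp + fp)
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap
```

For AP<sub>0.5</sub>, a detection counts as a true positive when its IoU with an unmatched ground-truth box is at least 0.5; mAP is then the mean of the per-class AP values.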
Study | Classes | AP | mAP | Recall (sensitivity) | Specificity | PPV | NPV | Others | Manually selected images? |
---|---|---|---|---|---|---|---|---|---|
Liu X. et al. 2019 | Polyp vs. adenoma | Polyp: AP<sub>0.5</sub> = 83.39% {[P]}<br>Adenoma: AP<sub>0.5</sub> = 97.90% {[P]} | mAP<sub>0.5</sub> = 90.645% {[P]} | N/A | N/A | N/A | N/A | N/A | Yes |
Li K. et al. 2021 | Adenoma vs. hyperplastic | Adenoma: AP = 81.1% {[KUMC]}<br>Hyperplastic: AP = 65.9% {[KUMC]} | mAP = 73.5% {[KUMC]} | 61.3% {[KUMC]} | 86.3% {[KUMC]} | 92.2% {[KUMC]} | 49.1% {[KUMC]} | F1: 0.736 {[KUMC]} | Yes |
List of Acronyms and Abbreviations
- AP: Average Precision.
- BLI: Blue Light Imaging.
- LCI: Linked-Color Imaging.
- mAP: Mean Average Precision.
- NBI: Narrow Band Imaging.
- SSAP: Sessile Serrated Adenoma/Polyp.
- WCE: Wireless Capsule Endoscopy.
- WL: White Light.
References and Further Reading
- Object Detection Metrics.
- Evaluation metrics for object detection and segmentation: mAP.
- A survey on Image Data Augmentation for Deep Learning.
Reviews
- Jun Ki Min, Min Seob Kwak, and Jae Myung Cha. Overview of Deep Learning in Gastrointestinal Endoscopy. Gut Liver. 2019 Jul; 13(4): 388–393.
- Samy A Azer. Challenges Facing the Detection of Colonic Polyps: What Can Deep Learning Do?. Medicina (Kaunas). 2019 Aug; 55(8): 473.
- Wei-Lun Chao, Hanisha Manickavasagan, and Somashekar G. Krishna. Application of Artificial Intelligence in the Detection and Differentiation of Colon Polyps: A Technical Review for Physicians. Diagnostics (Basel). 2019 Sep; 9(3): 99.
- Thomas KL. Lui, Chuan-Guo Guo, and Wai K. Leung. Accuracy of Artificial Intelligence on Histology Prediction and Detection of Colorectal Polyps: a Systematic Review and Meta-Analysis. Gastrointest Endosc. 2020 Feb 28.
- Cristina Sánchez-Montes, Jorge Bernal, Ana García-Rodríguez, Henry Córdova, and Gloria Fernández-Esparrach. Review of computational methods for the detection and classification of polyps in colonoscopy imaging. Gastroenterol Hepatol. 2020 Apr; 43(4): 222-232.
- Luisa F. Sánchez-Peralta, Luis Bote-Curiel, Artzai Picón, Francisco M. Sánchez-Margallo, J. Blas Pagador. Deep learning to find colorectal polyps in colonoscopy: A systematic literature review. Artificial Intelligence in Medicine. 2020 Aug; 108: 101923.
- Munish Ashat, Jagpal Singh Klair, Dhruv Singh, Arvind Rangarajan Murali, Rajesh Krishnamoorthi. Impact of real-time use of artificial intelligence in improving adenoma detection during colonoscopy: A systematic review and meta-analysis. Endoscopy International Open. 2021 March; 09(04): E513-E521.
- Alexander Hann, Joel Troya, and Daniel Fitting. Current status and limitations of artificial intelligence in colonoscopy. United European Gastroenterology Journal. 2021 Jun; 1-7.
- Michelle Viscaino, Javier Torres Bustos, Pablo Muñoz, Cecilia Auat Cheein, and Fernando Auat Cheein. Artificial intelligence for the early detection of colorectal cancer: comprehensive review of its advantages and misconceptions. World Journal of Gastroenterology. 2021 Oct; 27(38): 6399-6414.
- Britt B S L Houwen, Karlijn J Nass, Jasper L A Vleugels, Paul Fockens, Yark Hazewinkel, Evelien Dekker. Comprehensive review of publicly available colonoscopic imaging databases for artificial intelligence research: availability, accessibility, and usability. Gastrointestinal Endoscopy. 2022 Sep.
Randomized Clinical Trials
Study | Title | Date | Number of patients |
---|---|---|---|
Wang et al. 2019 | Real-time automatic detection system increases colonoscopic polyp and adenoma detection rates: a prospective randomised controlled study | Sep. 2019 | 1058 |
Gong et al. 2020 | Detection of colorectal adenomas with a real-time computer-aided system (ENDOANGEL): a randomised controlled study | Jan. 2020 | 704 |
Wang et al. 2020 | Effect of a deep-learning computer-aided detection system on adenoma detection during colonoscopy (CADe-DB trial): a double-blind randomised study | Jan. 2020 | 1010 |
Liu et al. 2020 | Study on detection rate of polyps and adenomas in artificial-intelligence-aided colonoscopy | Feb. 2020 | 1026 |
Su et al. 2019 | Impact of a real-time automatic quality control system on colorectal polyp and adenoma detection: a prospective randomized controlled study (with videos) | Feb. 2020 | 659 |
Repici et al. 2020 | Efficacy of Real-Time Computer-Aided Detection of Colorectal Neoplasia in a Randomized Trial | Aug. 2020 | 685 |