Multimodal Question Answering in the Medical Domain: A Summary of Existing Datasets and Systems

Existing Medical QA & VQA Datasets

*** Two Main Tasks: Medical Question Answering (QA) & Visual Question Answering (VQA) ***

I) Medical QA Datasets:

  1. Corpus for Evidence Based Medicine Summarization (Mollá, 2010): https://sourceforge.net/projects/ebmsumcorpus
  2. CLEF QA4MRE Alzheimer’s task (Peñas et al., 2012).
  3. BioASQ datasets (2012-2020): http://bioasq.org/participate/challenges
  4. TREC LiveQA-Med (Ben Abacha et al., 2017): https://github.com/abachaa/LiveQA_MedicalTask_TREC2017
  5. MEDIQA-2019 datasets on NLI, RQE, and QA (Ben Abacha et al., 2019): https://github.com/abachaa/MEDIQA2019
  6. MEDIQA-AnS dataset of question-driven summaries of answers (Savery et al., 2020): https://osf.io/fyg46/ Paper: https://www.nature.com/articles/s41597-020-00667-z
  7. MedQuAD Collection of 47k QA pairs (Ben Abacha and Demner-Fushman, 2019): https://github.com/abachaa/MedQuAD
  8. Medication QA Collection (Ben Abacha et al., 2019): https://github.com/abachaa/Medication_QA_MedInfo2019
  9. Consumer Health Question Summarization (Ben Abacha and Demner-Fushman, 2019): https://github.com/abachaa/MeQSum
  10. emrQA: QA on Electronic Medical Records (Pampari et al., 2018). Scripts to generate emrQA from i2b2 data: https://github.com/panushri25/emrQA
  11. EPIC-QA dataset on COVID-19 (Goodwin et al., 2020): https://bionlp.nlm.nih.gov/epic_qa/
  12. BiQA Corpus (Lamurias et al., 2020): https://github.com/lasigeBioTM/BiQA Paper: https://ieeexplore.ieee.org/document/9184044
  13. HealthQA Dataset (Zhu et al., 2019): https://github.com/mingzhu0527/HAR Paper: https://dmkd.cs.vt.edu/papers/WWW19.pdf
  14. MASH-QA Dataset on Multiple Answer Spans Healthcare Question Answering, with 35k QA pairs (Zhu et al., 2020): https://github.com/mingzhu0527/MASHQA Paper: https://www.aclweb.org/anthology/2020.findings-emnlp.342.pdf

II) Medical VQA Datasets (Radiology):

  1. VQA-RAD (Lau et al., 2018): https://osf.io/89kps
  2. VQA-Med 2018 (Hasan et al., 2018): https://www.aicrowd.com/challenges/imageclef-2018-vqa-med
  3. VQA-Med 2019 (Ben Abacha et al., 2019): https://github.com/abachaa/VQA-Med-2019
  4. VQA-Med 2020 (Ben Abacha et al., 2020): https://github.com/abachaa/VQA-Med-2020

III) Online QA Systems:

-- I searched for and tested several systems (e.g., AskHERMES, MiPACQ, SimQ). This list includes only the systems that are still maintained.

  1. CHiQA (Consumer Health Question Answering System): chiqa.nlm.nih.gov
  2. Neural Covidex: covidex.ai

IV) Medical Datasets Relevant to Question Answering:

  1. i2b2 shared tasks (2006-2016): www.i2b2.org/NLP
  2. n2c2 NLP clinical challenges (2018-2019): https://n2c2.dbmi.hms.harvard.edu and https://dbmi.hms.harvard.edu/programs/national-nlp-clinical-challenges-n2c2
  3. TREC Medical Records Track (2012-2013).
  4. TREC Clinical Decision Support Track (2014-2016): http://www.trec-cds.org
  5. TREC Precision Medicine Track (2017-2019): http://www.trec-cds.org
  6. CLEF eHealth (2013-2020): https://clefehealth.imag.fr
  7. COVID dataset (CORD-19): https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge

V) Medical Datasets Relevant to VQA:

  1. ImageCLEF Medical Automatic Image Annotation (2008-2009): https://www.imageclef.org/2008/medaat and https://www.imageclef.org/2009/medanno
  2. ImageCLEF Medical User-oriented Image Retrieval Task (2011): https://www.imageclef.org/2011/medicaluseroriented
  3. ImageCLEF Medical Retrieval Task (2008-2012): https://www.imageclef.org/2012/medical
  4. ImageCLEF AMIA: Medical task (2013): https://www.imageclef.org/2013/medical
  5. ImageCLEFmed: Medical classification (2015): https://www.imageclef.org/2015/medical
  6. ImageCLEF Medical Clustering (2015): https://www.imageclef.org/2015/clustering
  7. ImageCLEFmed (2016): https://www.imageclef.org/2016/medical
  8. ImageCLEFcaption (2017-2020): https://www.imageclef.org/2017/caption
  9. ImageCLEFmedical tasks (2019-2020): https://www.imageclef.org/2019/medical and https://www.imageclef.org/2020/medical
  10. MIMIC-CXR Database (2019): https://physionet.org/content/mimic-cxr/2.0.0/

Last update on January 26, 2021.

