Multimodal Question Answering in the Medical Domain: A Summary of Existing Datasets and Systems

Existing Medical QA & VQA Datasets

*** Two Main Tasks: Medical Question Answering (QA) & Visual Question Answering (VQA) ***

I) Medical QA Datasets:

  1. Corpus for Evidence Based Medicine Summarization (Mollá, 2010): https://sourceforge.net/projects/ebmsumcorpus
  2. CLEF QA4MRE Alzheimer’s task (Peñas et al., 2012).
  3. BioASQ datasets (2012-2020): http://bioasq.org/participate/challenges
  4. TREC LiveQA-Med (Ben Abacha et al., 2017): https://github.com/abachaa/LiveQA_MedicalTask_TREC2017
  5. MEDIQA-2019 datasets on NLI, RQE, and QA (Ben Abacha et al., 2019): https://github.com/abachaa/MEDIQA2019
  6. MEDIQA-AnS dataset of question-driven summaries of answers (Savery et al., 2020): https://osf.io/fyg46/ Paper: https://www.nature.com/articles/s41597-020-00667-z
  7. MedQuAD Collection of 47k QA pairs (Ben Abacha and Demner-Fushman, 2019): https://github.com/abachaa/MedQuAD
  8. Medication QA Collection (Ben Abacha et al., 2019): https://github.com/abachaa/Medication_QA_MedInfo2019
  9. Consumer Health Question Summarization (Ben Abacha and Demner-Fushman, 2019): https://github.com/abachaa/MeQSum
  10. emrQA: QA on Electronic Medical Records (Pampari et al., 2018). Scripts to generate emrQA from i2b2 data: https://github.com/panushri25/emrQA
  11. EPIC-QA dataset on COVID-19 (Goodwin et al., 2020): https://bionlp.nlm.nih.gov/epic_qa/
  12. BiQA Corpus (Lamurias et al., 2020): https://github.com/lasigeBioTM/BiQA Paper: https://ieeexplore.ieee.org/document/9184044
  13. HealthQA Dataset (Zhu et al., 2019): https://github.com/mingzhu0527/HAR Paper: https://dmkd.cs.vt.edu/papers/WWW19.pdf
  14. MASH-QA Dataset on Multiple Answer Spans Healthcare Question Answering, with 35k QA pairs (Zhu et al., 2020): https://github.com/mingzhu0527/MASHQA Paper: https://www.aclweb.org/anthology/2020.findings-emnlp.342.pdf

II) Medical VQA Datasets (Radiology):

  1. VQA-RAD (Lau et al., 2018): https://osf.io/89kps
  2. VQA-Med 2018 (Hasan et al., 2018): https://www.aicrowd.com/challenges/imageclef-2018-vqa-med
  3. VQA-Med 2019 (Ben Abacha et al., 2019): https://github.com/abachaa/VQA-Med-2019
  4. VQA-Med 2020 (Ben Abacha et al., 2020): https://github.com/abachaa/VQA-Med-2020

III) Online QA Systems:

-- I searched and tested several systems (e.g. AskHERMES, MiPACQ, SimQ). This list includes only the systems that are still maintained.

  1. CHiQA (Consumer Health Question Answering System): chiqa.nlm.nih.gov
  2. Neural Covidex: covidex.ai

IV) Medical Datasets Relevant to Question Answering:

  1. i2b2 shared tasks (2006-2016): www.i2b2.org/NLP
  2. n2c2 NLP clinical challenges (2018-2019): https://n2c2.dbmi.hms.harvard.edu https://dbmi.hms.harvard.edu/programs/national-nlp-clinical-challenges-n2c2
  3. TREC Medical Records Track (2012-2013).
  4. TREC Clinical Decision Support Track (2014-2016): http://www.trec-cds.org
  5. TREC Precision Medicine Track (2017-2019): http://www.trec-cds.org
  6. CLEF eHealth (2013-2020): https://clefehealth.imag.fr
  7. COVID dataset (CORD-19): https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge

V) Medical Datasets Relevant to VQA:

  1. ImageCLEF Medical Automatic Image Annotation (2008-2009): https://www.imageclef.org/2008/medaat and https://www.imageclef.org/2009/medanno
  2. ImageCLEF Medical User-oriented Image Retrieval Task (2011): https://www.imageclef.org/2011/medicaluseroriented
  3. ImageCLEF Medical Retrieval Task (2008-2012): https://www.imageclef.org/2012/medical
  4. ImageCLEF AMIA: Medical task (2013): https://www.imageclef.org/2013/medical
  5. ImageCLEFmed: Medical classification (2015): https://www.imageclef.org/2015/medical
  6. ImageCLEF Medical Clustering (2015): https://www.imageclef.org/2015/clustering
  7. ImageCLEFmed (2016): https://www.imageclef.org/2016/medical
  8. ImageCLEFcaption (2017-2020): https://www.imageclef.org/2017/caption
  9. ImageCLEFmedical tasks (2019-2020): https://www.imageclef.org/2019/medical and https://www.imageclef.org/2020/medical
  10. MIMIC-CXR Database (2019): https://physionet.org/content/mimic-cxr/2.0.0/

Last update on January 26, 2021.

