Deep Learning-based Clustering Approaches for Bioinformatics

Code and supplementary materials for our paper "Deep Learning-based Clustering Approaches for Bioinformatics", published in the journal Briefings in Bioinformatics. This repo will be updated periodically; in particular, more complete Jupyter notebooks will be added. In the article, we reviewed deep learning-based approaches for cluster analysis, covering network training, representation learning, parameter optimization, and the formulation of clustering quality metrics. We also discussed how representation learning based on different autoencoder architectures (e.g., vanilla, variational, LSTM, and convolutional) can be more effective than classical ML-based approaches (e.g., PCA) in scenarios such as bio-imaging, gene expression clustering, and clustering of biomedical texts.

Deep learning-based unsupervised/clustering methods: links to papers and code

We provide a list of deep learning-based unsupervised/clustering methods, with links to the papers and code. New articles proposing further approaches will be added to the list, so stay tuned!

| Title | Article | Conference/Journal | Code |
|---|---|---|---|
| Deep Clustering with Convolutional Autoencoders (DCEC) | Link | ICONIP'2017 | GitHub |
| Unsupervised Data Augmentation for Consistency Training (UDA) | Link | arXiv'2019 | GitHub |
| Deep Clustering via Joint Convolutional Autoencoder Embedding and Relative Entropy Minimization (DEPICT) | Link | ICCV'2017 | GitHub |
| Discriminatively Boosted Clustering (DBC) | Link | arXiv'2017 | N/A |
| Variational Deep Embedding (VADE) | Link | IJCAI'2017 | GitHub |
| Convolutional Embedded Networks (CEN) | Link | arXiv'2018 | GitHub |
| Deep Subspace Clustering Networks (DSC-Nets) | Link | NIPS'2017 | GitHub |
| Graph Clustering with Dynamic Embedding (GRACE) | Link | arXiv'2017 | N/A |
| Deep Unsupervised Clustering Using Mixture of Autoencoders (MIXAE) | Link | arXiv'2017 | N/A |
| Deep Embedded Clustering (DEC) | Link | ICML'2016 | GitHub |
| A Survey of Clustering With Deep Learning: From the Perspective of Network Architecture | Link | IEEE Access'2018 | |
| GEMSEC: Graph Embedding with Self Clustering | Link | arXiv'2018 | GitHub |
| Clustering with Deep Learning: Taxonomy and New Methods | Link | arXiv'2018 | GitHub |
| Deep Continuous Clustering (DCC) | Link | arXiv'2018 | GitHub |
| SpectralNet: Spectral Clustering Using Deep Neural Networks | Link | ICLR'2018 | GitHub |
| Subspace Clustering Using a Low-rank Constrained Autoencoder (LRAE) | Link | Information Sciences'2018 | N/A |
| Clustering-driven Deep Embedding with Pairwise Constraints (CPAC) | Link | arXiv'2018 | GitHub |
| Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering | Link | PMLR'2017 | N/A |
| Deep Unsupervised Clustering With Gaussian Mixture Variational AutoEncoders (GMVAE) | Link | ICLR'2017 | GitHub |
| Is Simple Better?: Revisiting Simple Generative Models for Unsupervised Clustering | Link | NIPS'2017 Workshop | GitHub |
| Improved Deep Embedding Clustering (IDEC) | Link | IJCAI'2017 | GitHub |
| Deep Clustering Network (DCN) | Link | arXiv'2016 | GitHub |
| Joint Unsupervised Learning of Deep Representations and Image Clustering (JULE) | Link | CVPR'2016 | GitHub |
| Deep Embedding Network for Clustering (DEN) | Link | ICPR'2014 | N/A |
| Auto-encoder Based Data Clustering (ABDC) | Link | CIARP'2013 | GitHub |
| Learning Deep Representations for Graph Clustering | Link | AAAI'2014 | GitHub |

Running provided Jupyter notebooks

To run the examples interactively, you need Python 3 and the following Python libraries:

  • Python 3
  • Scikit-learn
  • Keras
  • TensorFlow.
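
Before opening the notebooks, it can help to confirm that the libraries listed above are visible to your Python interpreter. The following is a minimal sketch using only the standard library. The helper name `check_environment` and the import names (`sklearn`, `keras`, `tensorflow`) are assumptions for illustration and are not taken from this repo; adjust them if your setup differs.

```python
import importlib.util
import sys

# Assumed mapping from library name to import name (e.g. Scikit-learn is
# imported as "sklearn"). This mapping is illustrative, not part of the repo.
REQUIRED_MODULES = {
    "Scikit-learn": "sklearn",
    "Keras": "keras",
    "TensorFlow": "tensorflow",
}


def check_environment(modules=REQUIRED_MODULES):
    """Return {library name: True/False}, True if the module can be located."""
    status = {}
    for name, import_name in modules.items():
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(import_name) is not None
    return status


if __name__ == "__main__":
    if sys.version_info.major < 3:
        print("Python 3 is required")
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

Any library reported as missing can then be installed before starting Jupyter.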

To get Jupyter Notebook, download it from this Link and install it on your machine. Then, assuming git is already installed, clone this repo using the following command:

git clone https://github.com/rezacsedu/Deep-learning-for-clustering-in-bioinformatics.git

Alternatively, install all the required libraries at once by issuing the following commands:

cd Deep-learning-for-clustering-in-bioinformatics
pip3 install -r requirements.txt
cd Notebooks

Then start Jupyter Notebook by issuing the following command:

jupyter notebook

In the browser window that opens, go to the Jupyter tab and open the notebook:

LSTM_AE_Text_Clustering.ipynb

If you want to skip training, we will soon provide pre-trained weights that you can restore and fine-tune. Happy coding! Leave a comment if you have any questions.

Acknowledgement

The ClusteringLayer class and the target_distribution function are based on the DEC implementation at https://github.com/XifengGuo/DCEC/blob/master/DCEC.py by Xifeng Guo.
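
For readers unfamiliar with DEC, the two computations behind those names can be sketched without any deep learning framework. The code below is an illustrative, dependency-free sketch of the formulas from the DEC paper, not the Keras code used in this repo: a clustering layer produces soft assignments q_ij from a Student's t kernel between embedded points and cluster centroids, and the target distribution sharpens them into p_ij. The function name `soft_assignment`, the list-of-lists inputs, and the toy data are assumptions made for this example.

```python
def soft_assignment(embeddings, centroids, alpha=1.0):
    """Soft assignments q_ij via a Student's t kernel, normalized per point."""
    q = []
    for z in embeddings:
        row = []
        for mu in centroids:
            sq_dist = sum((zi - mi) ** 2 for zi, mi in zip(z, mu))
            row.append((1.0 + sq_dist / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Targets p_ij = (q_ij^2 / f_j) / sum_j'(q_ij'^2 / f_j'), f_j = sum_i q_ij."""
    n_clusters = len(q[0])
    # Soft cluster frequencies f_j, used to normalize against large clusters.
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


# Toy example (made-up data): three 2-D embeddings and two centroids.
embeddings = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]
q = soft_assignment(embeddings, centroids)
p = target_distribution(q)
```

In DEC, training minimizes the KL divergence between p and q, which pushes each point toward the cluster it is already most confidently assigned to.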

Citation request

If you use the code of this repository in your research, please consider citing the following paper:

@article{karim2021deep,
      title={Deep learning-based clustering approaches for bioinformatics},
      author={Karim, Md Rezaul and Beyan, Oya and Zappa, Achille and Costa, Ivan G and Rebholz-Schuhmann, Dietrich and Cochez, Michael and Decker, Stefan},
      journal={Briefings in bioinformatics},
      volume={22},
      number={1},
      pages={393--415},
      year={2021},
      publisher={Oxford University Press}
      }

Contributing

If you find related work that is not listed here, please create a PR or suggest it by filing an issue. Your contribution will be highly appreciated. For any questions, feel free to open an issue or contact us at [email protected].
