• Stars
    star
    153
  • Rank 243,368 (Top 5 %)
  • Language
    Python
  • Created over 7 years ago
  • Updated over 7 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Improved Deep Embedded Clustering (IDEC)

Keras implementation for our IJCAI-17 paper:

and re-implementation for paper:

  • Junyuan Xie, Ross Girshick, and Ali Farhadi. Unsupervised deep embedding for clustering analysis. ICML 2016.

This code requires pretrained autoencoder weights provided. Use IDEC-toy code for a quick start.

Usage

  1. Install Keras v2.0, scikit-learn and git
    sudo pip install keras scikit-learn
    sudo apt-get install git

  2. Clone the code to local.
    git clone https://github.com/XifengGuo/IDEC.git IDEC

  3. Prepare datasets.

     cd IDEC/data/usps   
     bash ./download_usps.sh   
     cd ../reuters  
     bash ./get_data.sh   
     cd ../..
    
  4. Get pre-trained autoencoder's weights.
    Follow instructions at https://github.com/piiswrong/dec to pre-train the autoencoder. Then save the trained weights to a keras model (e.g. mnist_ae_weights.h5) and put it in folder 'ae_weights'.
    If you do not want to install Caffe package, you can download the pretrained weights from
    https://github.com/XifengGuo/data-and-models
    Then put .h5 file in ae_weights in local folder 'ae_weights'.
    Or you can just use IDEC-toy code for a quick start, but the results may be not promising.

  5. Run experiment on MNIST.
    python IDEC.py mnist --ae_weights ae_weights/mnist_ae_weights.h5
    or
    python DEC.py mnist --ae_weights ae_weights/mnist_ae_weights.h5
    The IDEC (or DEC) model is saved to "results/idec/IDEC_model_final.h5" (or "results/dec/DEC_model_final.h5").

  6. Run experiment on USPS.
    python IDEC.py usps --ae_weights ae_weights/usps_ae_weights --update_interval 30
    python DEC.py usps --ae_weights ae_weights/usps_ae_weights --update_interval 30

  7. Run experiment on REUTERSIDF10K.
    python IDEC.py reutersidf10k --ae_weights ae_weights/reutersidf10k_ae_weights --n_clusters 4 --update_interval 3
    python DEC.py reutersidf10k --ae_weights ae_weights/reutersidf10k_ae_weights --n_clusters 4 --update_interval 20

Models

The DEC model:

The IDEC model: