  • Stars
    3
  • Rank 3,963,521 (Top 79 %)
  • Language
    Jupyter Notebook
  • Created over 3 years ago
  • Updated over 3 years ago

Repository Details

We have blood sample data from 355 people, covering the 4 most common cancer types: colon cancer, breast cancer, lung cancer, and prostate cancer. The label file, labels.csv, lists each sample name together with the disease type of the corresponding person. The data are stored in data.csv: each row starts with the sample name of the corresponding person, and the remaining columns give the number of DNA fragments belonging to each microorganism type (virus or bacterium). In total, 1836 different microorganisms appear as features.
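
As a rough illustration of how the two files described above might be combined, here is a minimal sketch that aligns features and labels by sample name. The column layout is an assumption, not something the description confirms: both files are taken to have a header row, the first column of each is taken to be the sample name, and the second column of labels.csv is taken to be the disease type. The function name `load_dataset` and the toy rows are hypothetical and only stand in for the real files.

```python
import csv
import io


def load_dataset(data_file, labels_file):
    """Return (sample_names, feature_names, X, y) aligned by sample name.

    Assumption (not confirmed by the source): both files have a header row,
    the first column of each is the sample name, and the second column of
    the label file is the disease type.
    """
    label_rows = csv.reader(labels_file)
    next(label_rows)  # skip the header row
    label_of = {row[0]: row[1] for row in label_rows if row}

    data_rows = csv.reader(data_file)
    header = next(data_rows)
    feature_names = header[1:]  # one column per microorganism

    sample_names, X, y = [], [], []
    for row in data_rows:
        if not row or row[0] not in label_of:
            continue  # skip blank lines and samples without a label
        sample_names.append(row[0])
        X.append([float(value) for value in row[1:]])  # fragment counts
        y.append(label_of[row[0]])
    return sample_names, feature_names, X, y


# Made-up toy rows standing in for data.csv and labels.csv.
toy_data = io.StringIO(
    "sample,microbe_a,microbe_b\n"
    "s1,10,0\n"
    "s2,3,7\n"
    "s3,0,2\n"
)
toy_labels = io.StringIO(
    "sample,disease\n"
    "s2,lung cancer\n"
    "s1,colon cancer\n"
    "s3,breast cancer\n"
)

sample_names, feature_names, X, y = load_dataset(toy_data, toy_labels)
print(len(X), "samples,", len(feature_names), "features")
print(list(zip(sample_names, y)))
```

With the real files, the same function could be called on handles opened with `open("data.csv", newline="")` and `open("labels.csv", newline="")`. If the files match the description, that should yield 355 samples and 1836 feature columns, provided every sample in data.csv has a label.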