100-Days-of-ML-Code
Daily log to track my progress on the 100 days of ML code challenge.
Day 1 (09-09-18) : Naive Bayes
- Started with the intro to machine learning course on Udacity
- Learnt the basics of a Naive Bayes classifier on the iris dataset
- Working on classifying the Stanley Terrain dataset and graphing the decision surface
Day 2 (10-09-18) : Naive Bayes mini-project
- Working on the Naive Bayes mini-project to classify email.
- Tried really hard to make the Python 2.7 code compatible with 3.6 and learnt about dos2unix and pickling of data.
- Completed the Naive Bayes project with accuracy of 90.24% (Need to improve it!)
Day 3 (11-09-18) : SVM and Linear Algebra
- Improved accuracy to 97.869% and completed the mini-project.
- Started the lesson on Support Vector Machines.
- Completed Week 1 of Mathematics for Machine Learning: Linear Algebra, a course from Imperial College London on Coursera.
Day 4 (12-09-18) : SVM and Decision Trees
- Completed the SVM mini-project with 99.08% accuracy using an rbf kernel
- Started the lesson on Decision Trees
Day 5 (13-09-18) : Decision Trees mini-project
- Working on the Decision tree mini-project
- Referred to 3Blue1Brown's Essence of Calculus playlist
Day 6 (14-09-18) Decision Tree (Entropy and Information gain) and KNN
- Completed the Decision Tree mini-project
- Learnt about the K-Nearest Neighbours classifier and implemented the same
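The entropy and information gain behind a decision-tree split can be sketched in a few lines. This is only an illustrative sketch, not the mini-project code: the `parent` and `children` labels are hypothetical toy data, and the function names are my own.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child splits."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy labels: a perfectly mixed parent split into two pure children.
parent = ["fast", "fast", "slow", "slow"]
children = [["fast", "fast"], ["slow", "slow"]]
gain = information_gain(parent, children)
```

A split that separates the classes completely recovers all of the parent's entropy, which is why a tree prefers it over a split that leaves both children mixed.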
Day 7 (15-09-18) K-Nearest Neighbours
- Implemented the KNN classifier after referring to this Medium article
- Watched 2 more videos from 3Blue1Brown's Essence of Calculus playlist
- Watched Siraj Raval's video on classifiers
Day 8 (16-09-18) RandomForest classifier, Datasets and Questions
- Completed the lesson on datasets and questions to gain key inferences
- Completed the lesson WEB 3.0 from Siraj Raval's Decentralized Applications playlist
- Implemented the RandomForest classifier and read up about AdaBoost
Day 9 (17-09-18) Linear Regression, Unsupervised Learning (K Means)
- Completed the lesson on Regressions and implemented the same in the mini-project
- Completed the analysis of outliers in the enron dataset and the Q&A on the analysis
- Completed the lesson on unsupervised learning (K-Means clustering)
- Implemented K Means clustering on the Enron dataset
- Completed the lesson on feature scaling (MinMaxScaler)
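Min-max feature scaling rescales each feature to the range 0 to 1 with (x - min) / (max - min), which is the formula scikit-learn's MinMaxScaler applies per feature with its default range. A rough pure-Python sketch follows; the `salaries` values are made up for illustration and are not taken from the Enron dataset.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant feature carries no spread; map every value to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical feature values, for illustration only.
salaries = [200_000.0, 400_000.0, 1_000_000.0]
scaled = min_max_scale(salaries)
```

Scaling matters for distance-based methods such as K-Means and SVMs, where a feature with a large numeric range would otherwise dominate the others.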
Day 10 (18-09-18) Bag of words, stemming and TfIdf using NLTK
- Stemming using NLTK(Natural Language Toolkit)
- Completed the lesson on text learning
- Completed implementing the string processing techniques in the dataset (17578 emails)
Day 11 (19-09-18) Feature Selection, Dimensionality Reduction(PCA) and Validation
- Completed the lesson on feature selection
- Implemented Lasso regression to understand regularization
- Completed the lesson on dimensionality reduction
- Working on the eigenfaces mini-project
- Completed the lesson on Validation and its exercises
Day 12 (20-09-18) Evaluation metrics and intro to neural networks
- Completed the lesson on evaluation metrics and its exercises
- Started the Deeplearning.ai course Neural Networks and Deep learning by Andrew Ng
- Completed the intro to machine learning course on Udacity!!
Day 13 (21-09-18) Enron Fraud Detection
- Working on finding the persons of interest from the Enron emails dataset
- Completed Week 1 of the Neural networks and deep learning course
Day 14 (22-09-18) Intro to tensorflow and tensorflow.js
- Read up about Tensorflow from the documentation and medium articles
- Watched 2 Coding Train videos to understand Tensorflow.js
Day 15 (23-09-18) Intro to deep learning
- Implemented classifier and regressor using tensorflow and compared the same with the sklearn implementations
- Learnt about the softmax, one-hot encoding and cross-entropy loss minimization using gradient descent
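To make the softmax, one-hot encoding and cross-entropy pieces concrete, here is a small pure-Python sketch. It is not the TensorFlow code from that day: the `logits` values are arbitrary and the helper names are my own.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def one_hot(index, num_classes):
    """Encode a class index as a vector with a single 1.0."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def cross_entropy(probs, target):
    """Cross-entropy between predicted probabilities and a one-hot target."""
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)


logits = [2.0, 1.0, 0.1]  # arbitrary example scores
probs = softmax(logits)
loss = cross_entropy(probs, one_hot(0, 3))
```

With a one-hot target the loss reduces to the negative log of the probability assigned to the correct class, and gradient descent lowers it by pushing that probability towards 1.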
Day 16 (24-09-18) Data preprocessing and handling missing data
- Learnt best practices to handle missing data and effective feature selection
- Practiced the preprocessing workflow
Day 17 (25-09-18) Stock Predictor App
- Built a basic stock predictor app that predicts the value of the stock and the value of the company
- Referred to this video by Siraj Raval
Day 18 (26-09-18) Data Science on the HI-SEAS dataset
- Analyzed the Mars HI-SEAS dataset using SVM (and PCA) to unearth outliers and analyze for predictive analytics
- Performed data wrangling and analysis using dplyr in R
Day 19 (27-09-18) Intro to deep learning
- Started the Intro to deep learning course by Google Brain's principal scientist Vincent Vanhoucke
Day 20 (28-09-18) Neural network for notMNIST
- Built a neural network with 84% accuracy for the notMNIST dataset
- Completed lesson 1 of the intro to deep learning course
Day 21 (29-09-18) Neural Networks and deep learning
- Working on week 2 of Andrew Ng's course on deep learning and neural networks
- Implemented gradient descent from scratch
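As a reminder of what gradient descent from scratch looks like, here is a minimal sketch that fits a line by minimising mean squared error. The data, learning rate and step count are assumptions picked for the example, not values from the course assignment.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by minimising mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Partial derivatives of the mean squared error w.r.t. w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (illustrative only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
```

On this toy data the fit should approach w = 2 and b = 1. A learning rate that is too large makes the updates diverge, so it usually needs tuning per problem.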
Day 22 (30-09-18) Neural Networks and Deep Learning
- Completed assignment 1 of week 2
- Implemented logistic regression using a neural network approach to classify images
- Completed Week 2 of Andrew Ng's course
Day 23 (1-10-18) Implemented gradient descent from scratch
- Implemented gradient descent from scratch
- Learnt more about activation functions sigmoid, tanh, ReLU and leaky ReLU
- Learnt about the advantages and differences between tensorflow.js and tensorflow
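The four activation functions mentioned above can be written out directly. This is a plain-Python sketch for scalars; the `alpha=0.01` slope for leaky ReLU is a commonly used default that I am assuming here, not something fixed by the course.

```python
import math


def sigmoid(z):
    """Squash z into (0, 1); branch on the sign of z to avoid overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def tanh(z):
    """Squash z into (-1, 1); zero-centred, unlike sigmoid."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through and clip negatives to zero."""
    return max(0.0, z)


def leaky_relu(z, alpha=0.01):
    """Like ReLU, but keep a small slope (alpha, assumed here) for negative inputs."""
    return z if z > 0 else alpha * z
```

Sigmoid and tanh saturate for large inputs, which shrinks gradients, while ReLU and leaky ReLU keep a constant slope on the positive side.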
Day 24 (2-10-18) Planar data classification using a neural network
- Completed planar classification assignment
- Completed Week 3 of Andrew Ng's Neural Networks course
- Started Week 4 of the course
Day 25 (3-10-18) Deep neural networks
- Completed all lecture videos of Week 4 pertaining to deep neural networks
- Working on the programming assignments
- Completed assignment 1
Day 26 (4-10-18) Cat-notCat classifier from scratch
- Working on a cat-notCat binary classifier using a deep neural net
- Completed Week 4 of the course and obtained the certificate!
Day 27 (5-10-18) Hyperparameter tuning and regularization
- Learnt the math behind Frobenius norm and regularization
- Started course 2 of Andrew Ng's Deeplearning.ai specialization
Day 28 (6-10-18) Optimization and regularization
- Completed week 1 materials and working on the optimization exercises
- Implemented l2-regularization from scratch
- Implemented dropout (forward and back-prop) from scratch
- Implemented Gradient checking from scratch
- Completed Week 1 of the course
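Gradient checking compares a hand-derived gradient against a numerical estimate from central differences. The sketch below does this for a toy cost with an L2 penalty. The cost function, `LAMBDA`, `TARGET` and the step size `eps` are all assumptions made for illustration, not the course's network.

```python
import math

LAMBDA = 0.1                # hypothetical L2 regularization strength
TARGET = [0.5, 1.0, -1.0]   # hypothetical targets for the toy cost


def cost(theta):
    """Squared error plus an L2 penalty of (LAMBDA / 2) * sum(theta_i ** 2)."""
    data_loss = sum((t - y) ** 2 for t, y in zip(theta, TARGET))
    l2_penalty = 0.5 * LAMBDA * sum(t ** 2 for t in theta)
    return data_loss + l2_penalty


def analytic_gradient(theta):
    """Hand-derived gradient of cost; the penalty contributes LAMBDA * theta_i."""
    return [2 * (t - y) + LAMBDA * t for t, y in zip(theta, TARGET)]


def numerical_gradient(f, theta, eps=1e-5):
    """Central-difference estimate of the gradient of f at theta."""
    grads = []
    for i in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[i] += eps
        minus[i] -= eps
        grads.append((f(plus) - f(minus)) / (2 * eps))
    return grads


def relative_difference(a, b):
    """||a - b|| / (||a|| + ||b||); a small value suggests the gradients agree."""
    diff = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    scale = math.sqrt(sum(x * x for x in a)) + math.sqrt(sum(y * y for y in b))
    return diff / scale


theta = [1.0, -2.0, 0.5]
check = relative_difference(analytic_gradient(theta), numerical_gradient(cost, theta))
```

A large relative difference usually points to a bug in the backprop code, for example a forgotten regularization term.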
Day 29 (7-10-18) mini-batch gradient descent with momentum and Adam
- Implemented mini-batch gradient descent with momentum
- Implemented Adam optimization from the ICLR 2015 paper
- Completed week 2 of the course
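The Adam update keeps a momentum-style running average of the gradient and a running average of its square, both bias-corrected. Below is a one-dimensional sketch. The `beta1`, `beta2` and `eps` defaults follow the paper, while the learning rate, step count and the quadratic being minimised are assumptions chosen for the example.

```python
import math


def adam_minimize(grad, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Minimise a 1-D function given its gradient, using the Adam update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first moment (momentum term)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (squared gradients)
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

Dropping the second-moment scaling (dividing by 1 instead of the square root of `v_hat`) leaves plain gradient descent with momentum, which is the other optimizer from this day.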
Day 30 (8-10-18) Batch normalization, softmax and Structuring ML Projects!
- Implemented batch normalization from scratch
- Working on the SIGNS dataset to identify numbers from sign language (Using Tensorflow)
- Completed the course on Improving deep neural nets - certificate
Day 31 (9-10-18) Structuring ML Projects and transfer learning
- Completed the course on structuring machine learning projects! Certificate
- Learnt more about transfer learning
Day 32 (10-10-18) Familiarizing myself with Tensorflow
- Read and practiced from the Tensorflow documentation to better understand the workflow
- Understood the importance of GPUs in Deep Learning and the tensorflow-gpu module
Day 33 (11-10-18) Edge detection and convolutions
- Started Week 1 of Andrew Ng's course on Convolutional Neural Networks
- Learnt more about Tensorflow from Jordi Torres' Deep Learning book
Day 34 (12-10-18) Building a Convolutional layer and Pooling
- Completed Week 1 of Convolutional Neural Networks
- Learnt about pooling (POOL) and fully connected (FC) layers
Day 35 (13-10-18) GDG DevFest 2018 and CNN step by step
- Attended GDG DevFest 2018! Was a very informative event for ML/AI practitioners
- Working on building a CNN step by step
Day 36 (14-10-18) What is AlphaGoZero and intro to RL
- Learnt more about Google's AlphaGoZero and why it's such a big breakthrough
- Learnt the very basics of Reinforcement Learning
Day 37 (15-10-18) Basics of Reinforcement Learning
- Learnt about Basics of RL from David Silver's online course
Day 38 (16-10-18) CNNs
- Learnt about Pooling layers for CNNs and improved implementation
- Working on Week 2 content of Andrew Ng's CNNs course
Day 39 (17-10-18) Landing a rocket using Reinforcement Learning
- Learning about PPOs (Proximal Policy Optimization) in RL
- Learning about rocket launches to build an app to track space-flight schedules
- Building and training a ConvNet in TensorFlow for a classification problem
Day 40 (18-10-18) Nasa SpaceApps Preparation
- Spent some time preparing data from Nasa datasets for the topic "Do YOU Know When the Next Rocket Launch Is?"
Day 41 (19-10-18) Data preparation and pre-processing
- Prepared and pre-processed the data for the Nasa SpaceApps competition
Day 42 (20-10-18) Nasa SpaceApps Challenge Nationals
- Using the GLOBE dataset to predict effective sunlight cover on solar panels
Day 43 (21-10-18) Worked on CNNs and Monte Carlo Simulations
- Used Monte Carlo simulations and normalization to predict the conversion factor for solar panels
- Used the conversion factor thus obtained to build a calculator to visualize the data
- Worked on CNNs with a 'selu' activation function for better learning rate with normalization
Day 44 (22-10-18) Revised CNNs from Andrew Ng's course notes
- Revised building CNNs from scratch from Andrew Ng's course notes
Day 45 (23-10-18) Artificial Intelligence
- Studying for internal exam on the subject of Artificial Intelligence
Day 46 (24-10-18) Artificial Intelligence
- Studied for my AI exam on 25th
- This includes predicate logic, Bayesian statistics, Bayesian networks and partitioned semantic nets
Day 47 (25-10-18) AI exam and CNNs for roof exposure estimation
- Gave my AI exam and probably aced it!
- Working on training a CNN on scraped, labelled roof pictures with given dimensions to estimate the solar irradiance incident on the surface
Day 48 (26-10-18) Car detection using YOLOv2
- Working on a You Only Look Once (YOLO) model for car detection
- The ML project pipeline is underway
- Went to the Google office for a meetup called #chAI where early stage AI startups explained the deep learning they have been doing
- Fixed all deployment bugs in the Nasa SpaceApps project and hosted the website
Day 49 (27-10-18) Learnt more about YOLO
- Learnt more about YOLOv2 from medium articles
- Got project guidance and tips on the Solar roof CNNs project from Vibhor Kalra from merak.ai
- He suggested to look into tensorflow.js if browser based real-time models need to be deployed
- Need to learn about deploying a tensorflow project
- Learnt about tf-lite models and their merits and demerits for DL apps
Day 50 (28-10-18) Monte Carlo simulations and curve fitting in R
- Improved the prediction model for the solar project and working on the final submission as today is the last day
- Registered for the Microsoft AI challenge to improve Bing's suggestion box answers using DL models
- Re-doing the plan for the next 50 days to get the most done from this challenge
Day 51 (29-10-18) Sentiment classification
- An implementation from Andrew Trask's blog about sentiment classification to frame problems in deep learning
- Completed CNN implementation from scratch
- Still working on a feedback analysis of the progress thus far to get much more done in the second half of the challenge
Day 52 (30-10-18) Improving CNN Backpropagation
- Studying the math behind backpropagation (for CNNs) from Ian Goodfellow's Deep Learning Textbook
Day 53 (31-10-18) Fully functioning ConvNets using Tensorflow
- Implemented a fully functional CNN using Tensorflow
- Improved the friend dashboard project
- Also created a Genomic and AI related GitHub organization for related projects
Day 54 (1-11-18) GenomicAI's website and revising data science in R from Rafael's Textbook
- Working on GenomicAI's website. Looking to finish it up after Monday's exam
- Revising data science in R from Harvard Prof Rafael's textbook
Day 55 (2-11-18) GCP's How Google does ML
- Completed half the course by Google Cloud Platform on 'How Google does ML'
- Working on the paper on 'Genomic analysis for personalized medicine'
Day 56 (3-11-18) GCP Datalab for ML instances
- Earthquakes project using a Datalab instance
- The common pitfalls in ML deployment. Gosh it has much more to do with stuff other than ML!
- Completed Google Cloud's first course 'How Google does ML'
Day 57 (4-11-18) Launching into ML on GCP
- Started learning the procedure to prepare and pre-process datasets to bucket in cloud instances
- Working on GenomicAI's about page
Day 58 (5-11-18) (Still!)Launching into ML on GCP
- Made UI improvements to the Social Network
- Learnt more about operationalizing ML models for production using the Google Cloud Platform
Day 59 (6-11-18) Completed Launching into ML
- Awaiting the scholarship confirmation. I just about finished the content in the course from Coursera
- Working on the site page for GenomicsAI
Day 60 (7-11-18) Comparing AWS and GCP
- Exploring AWS ML APIs as compared to GCP's ML APIs
- Buying parts for my Deep Learning rig. Got a GTX 1080 and an 8th gen Intel i7 processor. Need to save up and buy the rest of the parts!
Day 61 (8-11-18) Learning up about building a recommendation engine
- Made some progress on Google's GCP challenge on specializing in ML by 30th November
- Learnt about some of the methods to build and operationalize a recommendation engine
Day 62 (9-11-18) Working on a recommendation engine for the social network
- Working on the prototype for a recommendation engine for user's feed in a social network I am building
- The social network is built on a PostgreSQL database with a flask business logic. Check out my profile for details
Day 63 (10-11-18) Started the PyTorch Challenge
- Working on lesson 1 of the content to build a neural network using PyTorch
- Made significant headway in the Social Network project
Day 64 (11-11-18) Learnt more about using PyTorch for DL
- Continued the Udacity course on PyTorch
Day 65 (12-11-18) Started the Move 37 course
- Started Siraj Raval's Move 37 course for Reinforcement Learning
- Working on the social networking application for my DBMS project. Looking for ideas to include ML concepts in it
Day 66 (13-11-18) Completed Lesson 1 in the PyTorch Challenge
- Completed lesson 1 in the PyTorch challenge
- Learnt about the Bellman equation in Reinforcement Learning (Move 37)
Day 67 (14-11-18) PyTorch Challenge
- Fixed performance bugs in the 'Social Network' project (which I have to submit soon)
- Working on the PyTorch challenge lesson 2
Day 68 (15-11-18) CUDA programming using PyTorch
- Learnt about CUDA programming to utilize a GPU to its max
- Learnt about TFlite models for deep learning on a smartphone
Day 69 (16-11-18) Google Cloud : Machine Learning and BigQuery
- Learnt about the basics of using Google Cloud along with BigQuery datasets and ML
- Worked on learning about TFX to build end to end Deep Learning models
- Completed the minimum viable project for DBMS lab!
Day 70 (17-11-18) Completed lesson 3 of PyTorch Challenge
- Completed lesson 3 of the PyTorch challenge
- Working on the DNNs with PyTorch lesson
Day 71 (18-11-18) Learnt more about PyTorch and Tensorflow workflows
- Imperative programming in PyTorch and the dynamic front end is more suited for research implementations
- Learnt more about deploying models as low level C++ and the production-ready Tensorflow workflow
Day 72 (19-11-18) Learnt the Keras workflow for RESNET-50 Implementation
- Read about classic CNN architectures like AlexNet, Lenet-5, VGG-16 and Microsoft Research's Resnet
- Learnt the Keras workflow to implement Resnet
- Learnt more about Skip connections with Convolutional as well as ID blocks
Day 73 (20-11-18) Working on object detection for cars
- Completed the Resnet implementation
- Working on car/object detection
Day 74 (21-11-18) Finished the YOLOv2 implementation
- Implemented part of the YOLO paper released on June 12th 2015, but without the K-Means clustering for drawing the bounding boxes
- Learnt about implementing non-max suppression and IoU for filtering probabilistic results
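Intersection over Union (IoU) and non-max suppression can be sketched without any framework. This is a simplified illustration rather than the YOLO implementation itself: boxes are assumed to be `(x1, y1, x2, y2)` corner tuples, and the 0.5 threshold and sample boxes are made up.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Keep the highest-scoring boxes, dropping any that overlap a kept box too much."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Discard remaining boxes whose IoU with the kept box exceeds the threshold.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep


# Hypothetical detections: two overlapping boxes on one car, plus a separate one.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

Here the second box overlaps the first with an IoU of 81/119 (about 0.68), so it is suppressed, while the third box does not overlap at all and survives.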
Day 75 (22-11-18) Working on Faster R-CNN implementation
- Despite YOLO being a good solution, tried to implement Fast R-CNN and Faster R-CNN from scratch in Tensorflow
- Poor results on this. It is more or less guessing the solution despite using Adam optimization
- YOLO with K-Means clustering seems like a better option. Will look into it soon
Day 76 (23-11-18) O'Reilly Tensorflow for Deep Learning
- Learnt about the Computation graph and how parallelizing TF clusters improves performance
- Ordered parts for my Deep Learning PC!
- Have an i7-8700, an NVIDIA GTX 1080 and the MSI A-Pro Z370 motherboard so far!
Day 77 (24-11-18) Working on Face Recognition
- Working on Siamese Networks for learning Similarity functions
- Improving Happy House with Face recognition
Day 78 (25-11-18) Working on the PyTorch lesson 4
- Working on Udacity's Lesson 4 of the PyTorch challenge
- Got the O'Reilly Data Science book in R with the Tidyverse
- Working on the problem statement for Hackference Hackathon
Day 79 (26-11-18) O'Reilly Tensorflow Models
- Work on the Hackference hack cancelled as the deadline for documentation submission passed hours before we submitted our proposal!
- Learnt more about GPU and CUDA programming for Deep Learning
- Completed 2 chapters of the O'Reilly Deep Learning with Tensorflow book. It is a fantastic book to read!
Day 80 (27-11-18) Pandas and Matplotlib for Data Science in Python
- Watched a few videos by SentDex on Youtube for a hands-on refresher in Pandas and Matplotlib
- Preparing for data science internships and thus reviewed Sampling theory and some sample interview questions
- Learnt more about the Tidyverse in R and how statisticians build their workflow in it
- The initial steps include: Exploratory Data Analysis (EDA) with ggplot2, wrangling with tidyr and dplyr, and programming with magrittr and purrr
Day 81 (28-11-18) Deep Learning in insurance
- Had to spend time on exam preparation, but managed to review data science material in R
- Revised notes from a previous Harvard Data Science in Genomics course that I had audited in Feb
Day 82 (29-11-18) Papers papers everywhere!
- Spent time on reading papers in Computer Vision to further deepen my understanding of CNNs
- Read about CNN architectures in depth
- Ran the numbers and did some research for the Acko insurance hackathon proposal
Day 83 (30-11-18) Neural Style Transfer paper
- Spent time reviewing Deep Dream and Neural Style transfer with Gatys et al and their paper
- Worked on the documentation for the Acko Hack proposal of insurance premium predictions for self-driving car adoption
- Made plans for the home stretch of the challenge with lots of cool stuff planned for the 18 days to come and more!
Day 84 (1-12-18) Worked on CNNs Assignment and my own implementation of the paper
- Worked on the neural style transfer assignment
- Read up more and worked on the implementation of the Neural Style Transfer paper from scratch
Day 85 (2-12-18) Completed the course on CNNs
- Completed the course on CNNs as part of the Deep Learning Specialization Certificate
Day 86 (3-12-18) Hierarchical Clustering
- Spent time learning about the different clustering techniques apart from K-Means to solve the question for an internship interview
- Learnt about the end-to-end data science pipeline
Day 87 (6-12-18) Papers papers everywhere 2!
- Spent a whole lot of time going through different papers handpicked from Arxiv and Arxiv sanity
- Topics and papers include CycleGAN (Didn't really get it!), DeepFace, FaceNet, etc.
Day 88 (7-12-18) Google Cloud Developers Meetup
- Started the Sequence Models course by Andrew Ng
- Attended the Google Cloud Meetup which covered topics including using Kibana, elasticsearch and Cloud ML API
Day 89 (8-12-18) GDG AI Meetup - Hands-on session on NLP
- Attended the GDG AI Meetup at Altimetrik
- Worked on the colab notebook pertaining to building a top class sentiment analyser using spaCy and Altair
- Learnt NLP with spaCy, which is an industry grade NLP library, along with NLTK
Day 90 (9-12-18) Time Series with GCP
- Worked on time series analysis with Google cloud codelab
- GPU accelerated sessions with PyTorch support (CUDA backed)
Day 91 (10-12-18) O'Reilly From Scratch Challenge
- Worked on O'Reilly's Tensorflow from scratch challenge from the email newsletter
- Also had my DBMS lab exam today. Did great by the way!
Day 92 (11-12-18) RNNs and GRUs
- Worked on implementing RNNs and GRUs from scratch
- Understood the efficiency tradeoffs of GRUs while working on it
Day 93 (12-12-18) Breast Cancer Classification
- Using the UCI FNA (Fine Needle Aspiration) dataset to classify tumours
- Worked on implementing the entire Machine learning pipeline from data preprocessing to model validation
- Used pandas for data pre-processing, seaborn for exploratory data analysis and used an SVM
Day 94 (13-12-18) Acko Insurance Hackathon Project: Phase 1
- Working on 3 Kaggle datasets to predict insurance claims in various categories
- Used the AllClaims Insurance dataset to predict the insurance claims in Auto and Life insurance industries
- Working on a chat application for better customer retention
Day 95 (14-12-18) Acko Insurance Hackathon Project: Phase 2
- Working on the Acko insurance project submission
- Learnt about lstms and seq2seq word embeddings to build a chatbot
- All set for the hackathon tomorrow!
Day 96 (15-12-18) Sequence Models
- Completed Week 1 of Andrew Ng's Sequence Models course
- Working on the Jazz production with lstms problem statement
- Successfully submitted the Acko Hackathon solution (We didn't get shortlisted :( )
Day 97 (17-12-18) Working on my research paper
- Resumed work on my Genomics Research paper
- Learnt about Python galaxy for DNA Sequencing
- Spent time learning about Stanford's clusters for gene sequencing and their enormous budgets!
Day 98 (18-12-18) Built my Deep Learning Rig!
Day 99 (21-12-18) ARIMA models for Time series analysis
- Autoregressive Integrated Moving Average models are perfect for time series prediction
- Used it on data that includes a seasonal temporal shift. The data was non-stationary, with trends in the distribution, and thus had to be differenced (the 'integrated' step) as in the Box-Jenkins approach
- Walk-forward validation is extremely accurate as it provides every iteration with all the available data. This is computationally intensive and hence can be used only for small datasets.
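Differencing and walk-forward validation can be sketched in pure Python. Note that `drift_forecast` below is NOT an ARIMA model: it is a deliberately simple stand-in (last value plus the mean first difference) that I am assuming so the example stays self-contained. A fitted ARIMA model would take its place in practice. The `series` values are made up.

```python
def difference(series, lag=1):
    """Differencing: remove a trend by subtracting the value `lag` steps earlier."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]


def drift_forecast(history):
    """Stand-in forecaster (not ARIMA): last value plus the mean first difference."""
    diffs = difference(history)
    return history[-1] + sum(diffs) / len(diffs)


def walk_forward_validation(series, n_test, forecast):
    """Predict one step at a time, then add the true value to the history."""
    history = list(series[:-n_test])
    errors = []
    for actual in series[-n_test:]:
        prediction = forecast(history)       # uses all data seen so far
        errors.append(abs(actual - prediction))
        history.append(actual)               # reveal the true observation
    return sum(errors) / len(errors)         # mean absolute error


# Hypothetical series with a steady upward trend.
series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 22.0, 24.0]
mae = walk_forward_validation(series, n_test=3, forecast=drift_forecast)
```

The forecaster is called once per test point on an ever-growing history. With a real model that means refitting at every step, which is where the computational cost comes from.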
Day 100 (22-12-18) Memory Networks for single supporting fact problems
- Learnt about memory networks and applied it on the bAbI dataset
- Memory networks can also be used to make chatbots as they have more information gain than lstms with seq2seq embeddings
- Worked on the Reddit dataset to build a general purpose dataset
Day 101-105 (23-12-18 - 26-12-18) Genomics for personalised medicine Whitepaper
- Private repo for now. Will make it public soon!
Challenge Complete!
- It has been a wonderful learning curve and I am looking forward to doing another one after my exams!