predicting-number-of-comments-on-reddit-using-random-forest-classifier
We determine which characteristics of a Reddit post contribute most to overall interaction, as measured by the number of comments. We first collect data by scraping Reddit. We then use Natural Language Processing to convert the text data into features suitable for modeling. Finally, we build a Random Forest classifier that predicts whether a given Reddit post will receive more or fewer comments than the median.
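
As a rough illustration of the target and feature construction described above, the sketch below labels posts as above or below the median comment count and turns titles into simple word-count vectors. It uses only the Python standard library. The sample `posts` data and the helper names `median_labels` and `bag_of_words` are hypothetical and are not taken from the project code, which may tokenize and vectorize differently.

```python
from collections import Counter
from statistics import median

# Hypothetical scraped posts: (title, number of comments). Illustrative data only.
posts = [
    ("What is your favorite book", 120),
    ("Cat sleeping in a box", 8),
    ("What book changed your life", 45),
    ("My cat learned a new trick", 15),
]


def median_labels(comment_counts):
    """Return 1 for posts strictly above the median comment count, else 0."""
    cutoff = median(comment_counts)
    return [1 if n > cutoff else 0 for n in comment_counts]


def bag_of_words(titles):
    """Build a sorted vocabulary and one word-count vector per title."""
    tokenized = [title.lower().split() for title in titles]
    vocab = sorted({word for words in tokenized for word in words})
    rows = []
    for words in tokenized:
        counts = Counter(words)
        rows.append([counts[word] for word in vocab])
    return vocab, rows


titles = [title for title, _ in posts]
comment_counts = [n for _, n in posts]

y = median_labels(comment_counts)   # binary target for the classifier
vocab, X = bag_of_words(titles)     # feature matrix, one row per post
```

Under these assumptions, `X` and `y` have the shape a classifier expects, so they could be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`.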