Text-Classification-20-Newsgroups
The dataset is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups.
• Built a vocabulary from the dataset, which was used as the feature set.
• Implemented a Multinomial Naive Bayes classifier from scratch to classify each document into the appropriate newsgroup.
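
The two steps above (vocabulary building and a from-scratch Multinomial Naive Bayes classifier) can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names `tokenize`, `build_vocabulary`, and `MultinomialNB`, the `max_size` cutoff, the regex tokenizer, and the use of Laplace (add-`alpha`) smoothing are all assumptions made for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumed tokenizer: lowercase the text and keep runs of ASCII letters.
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(docs, max_size=2000):
    # Count every token in the corpus and keep the most frequent ones.
    # max_size is a hypothetical cutoff, not a value taken from the project.
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return [word for word, _ in counts.most_common(max_size)]


class MultinomialNB:
    def __init__(self, alpha=1.0):
        # alpha is the additive smoothing strength (1.0 = Laplace smoothing).
        self.alpha = alpha

    def fit(self, docs, labels, vocab):
        self.vocab = set(vocab)
        self.classes = sorted(set(labels))

        # Per-class word counts, restricted to vocabulary words.
        word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            word_counts[label].update(
                t for t in tokenize(doc) if t in self.vocab
            )

        # log P(class): fraction of training documents in each class.
        class_counts = Counter(labels)
        self.log_prior = {
            c: math.log(class_counts[c] / len(docs)) for c in self.classes
        }

        # log P(word | class) with additive smoothing so that a word
        # unseen in a class never gets zero probability.
        vocab_size = len(self.vocab)
        self.log_likelihood = {}
        for c in self.classes:
            denom = sum(word_counts[c].values()) + self.alpha * vocab_size
            self.log_likelihood[c] = {
                w: math.log((word_counts[c][w] + self.alpha) / denom)
                for w in self.vocab
            }
        return self

    def predict(self, doc):
        # Sum log-probabilities over in-vocabulary tokens; words outside
        # the vocabulary are ignored.
        tokens = [t for t in tokenize(doc) if t in self.vocab]
        scores = {
            c: self.log_prior[c]
            + sum(self.log_likelihood[c][t] for t in tokens)
            for c in self.classes
        }
        return max(scores, key=scores.get)
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, which matters for long newsgroup posts. To use the sketch, build the vocabulary from the training documents, call `fit(train_docs, train_labels, vocab)`, and then call `predict` on each held-out document.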