als-recommender-pyspark
A recommender system is an information filtering tool that predicts which products a user will like and, based on that, recommends a few products to the user. For example, Amazon can recommend new shopping items to buy, Netflix can recommend new movies to watch, and Google can recommend news that a user might be interested in. The two widely used approaches for building a recommender system are content-based filtering (CBF) and collaborative filtering (CF).
Named-Entity-Recognition
Named-entity recognition (NER) (also known as entity identification, entity chunking and entity extraction) is a subtask of information extraction that seeks to locate and classify named entities mentioned in unstructured text into pre-defined categories such as person names, organizations, locations, medical codes, time expressions, quantities, monetary values, percentages, etc.
bow_tfidf
This project uses traditional techniques such as Bag of Words and tf-idf to represent the words in a corpus in a numeric format for multilabel classification.
Naive-Bayes-Spam-Classifier-on-PySpark
Spam detection is one of the major applications of machine learning on the web today. Most email service providers have spam detection built in to automatically classify such mail as 'Junk Mail'.
MovieLens_Exploratory
The purpose of this study is to look at the distribution of ratings, movies and users over time, the impact of user mood on the average rating score, and the average rating score per genre over time. The analysis is divided into 4 different 5-year batches so that each section of the data can be analysed separately. The growth, trend and level were found to be stable after the first 5 periods (i.e. after the year 2000). With the frequency of ratings showing high correlation with new movies and users added, the trend for ratings over time shows the combined effect of growth in the user and movie base. Further, weekday-weekend analysis shows that most of the ratings (approx. 70%) happen on weekdays. For the average rating score, a notable observation is the shift in the rating pattern for the latest batch (2011-2015). In this batch approximately 50% of the rating scores are average, with 25% each for poor and high rating scores, whereas in the other batches the split was 80-20 between average and high/poor rating scores. In the genre analysis, users rated a genre below 3 in 9.4% of cases, high in 17.5% and average in 70%.
collaborate-github
In this article we will walk through the steps involved in collaborating over a version control system (VCS) to version-control and proofread code.
EBay_Sales_Analysis
covid-spread-bokeh
This project aims to visualise the spread of COVID in the UK using Bokeh, a Python visualisation package.