• Stars
    star
    6
  • Rank 2,537,063 (Top 51 %)
  • Language
    Python
  • Created over 8 years ago
  • Updated over 8 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Data Mining and Information Retrieval. This project makes use of the #nowplaying dataset which can be found here. A subset of the #nowplaying dataset was extracted using Reservoir Sampling, because the original dataset was too large (13GB). More information on the sampling workflow can be found in the report, accompanying this assignment.

More Repositories

1

thesis-bitcoin-clustering

The Bitcoin currency is a publicly available, transparent, large scale network in which every single transaction can be analysed. Multiple tools are used to extract binary information, pre-process data and train machine learning models from the decentralised blockchain. As Bitcoin popularity increases both with consumers and businesses alike, this paper looks at the threat to privacy faced by users through commercial adoption by deriving user attributes, transaction properties and inherent idioms of the network. We define the Bitcoin network protocol, describe heuristics for clustering, mine the web for publicly available user information and finally train supervised learning models. We show that two machine learning algorithms perform successfully in clustering the Bitcoin transactions based on only graphical metrics measured from the transaction network. The Logistic Regression algorithm achieves an F1 score of 0.731 and the Support Vector Machines achieves an F1 score of 0.727. This work demonstrates the value of machine learning and network analysis for business intelligence; on the other hand it also reveals the potential threats to user privacy.
Jupyter Notebook
34
star
2

sponsored-search-auction

In auction theory, a Vickrey–Clarke–Groves (VCG) auction is a type of sealed-bid auction of multiple items. Bidders submit bids that report their valuations for the items, without knowing the bids of the other people in the auction. This code simulates an auction between bidders and available slots.
Python
6
star
3

click-through-rates-predictions

Predict the user’s click response to each auctioned ad impression in real-time bidding (RTB) display advertising. Specifically, given the information of the incoming bid request, the bid agent should estimate the probability that the user will click on its ad if it is displayed.
Python
1
star