Discover @cacoderquan Open Source projects

random (@cacoderquan)

Stars 23
Global Rank 587,022 (Top 21 %)
Followers 4
Following 2
Registered over 9 years ago
Most used languages
Python 100.0 %
Location 🇭🇰 Hong Kong
Country Total Rank 1,364
Country Ranking
Python 547

Predict-financial-recession

The main goal of this project is to predict financial recession from the frequencies of the top 500 word stems in the reports of financial companies. After applying various learning models, we find that predicting financial recession from the bag of words reaches an accuracy of more than 90%, which indicates a correlation between the two. We compared different learning models (ensemble methods with Decision Tree, SVM, and KNN) under various parameters, using cross-validation on the training data set to find the model with a relatively high average accuracy and a low variance of accuracy. We also tried several pre-processing methods (tf-idf, feature selection, and centroid-based clustering) to improve the accuracy of the learning models. In the end, the best model is Gradient Boosting with Decision Tree on the tf-idf pre-processed data set.
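The evaluation loop described above (tf-idf weighting, then cross-validation that reports both the average and the variance of accuracy) can be sketched roughly as follows. This is a minimal illustration and not the project's code: the tf-idf variant, the interleaved folds, the toy count matrix, and the use of a 1-nearest-neighbour classifier in place of Gradient Boosting are all assumptions made to keep the example small.

```python
import math
from statistics import mean, pvariance

def tfidf(counts):
    """Turn a document-by-term count matrix into tf-idf weights.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(counts)
    n_terms = len(counts[0])
    # Document frequency: number of documents that contain each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    weighted = []
    for row in counts:
        total = sum(row) or 1
        weighted.append([
            (row[j] / total) * math.log(n_docs / df[j]) if df[j] else 0.0
            for j in range(n_terms)
        ])
    return weighted

def knn_predict(train_x, train_y, x, k=1):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], x))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)

def cross_val_accuracy(xs, ys, n_folds=3, k=1):
    """Return one accuracy per fold, using interleaved folds (i % n_folds)."""
    scores = []
    for fold in range(n_folds):
        test_idx = [i for i in range(len(xs)) if i % n_folds == fold]
        train_idx = [i for i in range(len(xs)) if i % n_folds != fold]
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        hits = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores

# Hypothetical toy data: 6 reports, 3 word stems, label 1 = recession.
counts = [[5, 0, 1], [4, 1, 1], [6, 0, 1], [0, 5, 1], [1, 4, 1], [0, 6, 1]]
labels = [1, 1, 1, 0, 0, 0]

weights = tfidf(counts)
scores = cross_val_accuracy(weights, labels)
print("mean accuracy:", mean(scores), "variance:", pvariance(scores))
```

Comparing candidate models then amounts to running the same folds for each model and parameter setting and preferring a high mean with a low variance. Note that this sketch computes the idf weights on all documents before splitting, which a stricter setup would do inside each fold.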

Sentiment-Analysis-on-the-Rotten-Tomatoes-movie-review-dataset

The Rotten Tomatoes movie review corpus is a collection of movie reviews compiled by Pang and Lee [2]. The corpus was analysed in [3], where each sentence is parsed into its tree structure and each node is assigned a fine-grained sentiment label from 1 to 5, representing very negative, negative, neutral, positive, and very positive respectively. In this paper we use this data on ath000 phrases, and all the methods in this paper are assessed by training on a random subset of phrases (and their subphrases) of approximately 4/5 of the data set and testing on the remaining 1/5. The idea is to use non-associative functions and the parse tree structures to modify the feature vectors.
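The 4/5 training and 1/5 testing protocol described above can be sketched as a seeded random split. This is only an illustration: the phrases, the `split_train_test` helper, and the fixed seed are hypothetical, and the sketch splits a flat list of phrases without keeping subphrases of the same sentence together.

```python
import random

# Fine-grained sentiment labels as described in the text.
LABELS = {1: "very negative", 2: "negative", 3: "neutral",
          4: "positive", 5: "very positive"}

def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle a copy of items and cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical (phrase, label) pairs standing in for the real corpus.
phrases = [
    ("a dreadful mess", 1), ("rather dull", 2), ("the film", 3),
    ("quite charming", 4), ("a masterpiece", 5), ("painfully slow", 1),
    ("not very funny", 2), ("opens in Paris", 3), ("solid acting", 4),
    ("utterly brilliant", 5),
]

train, test = split_train_test(phrases)
print(len(train), "training phrases,", len(test), "test phrases")
for text, label in test:
    print(text, "->", LABELS[label])
```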

Visualization-of-Latent-Factors-from-Movies

The goal is to visualize and interpret two-dimensional latent features for the movies in the given data set. We are given the categorization of about 1600 movies into 19 genres, and the ratings that some users gave to specific movies. We applied matrix factorization to the sparse ratings matrix (sparse because not all users rate all movies) to obtain latent factor matrices for the movies and the users. We then used principal component analysis (PCA) to analyze the latent factor matrix of the movies and projected each movie onto the two strongest latent factors. Finally, we visualized and interpreted each category and compared the category averages of the two major latent factors.
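The matrix factorization step can be sketched as stochastic gradient descent over the observed ratings only, so missing entries of the sparse matrix never enter the loss. This is a minimal sketch under stated assumptions: the toy ratings, the hyperparameters, and the `factorize` and `rmse` helpers are hypothetical. It also fits two latent factors directly, so each movie row is already a 2-D point; the project instead applies PCA to a latent factor matrix to obtain the two strongest factors.

```python
import random

def factorize(ratings, n_users, n_movies, k=2, lr=0.02, reg=0.05,
              epochs=300, seed=0):
    """Fit user factors U and movie factors V so that U[u] . V[m] ~ rating.

    Only the observed (user, movie, rating) triples are visited.
    """
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_movies)]
    for _ in range(epochs):
        for u, m, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[m]))
            for f in range(k):
                uf, vf = U[u][f], V[m][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[m][f] += lr * (err * uf - reg * vf)
    return U, V

def rmse(ratings, U, V):
    """Root mean squared error over the observed ratings."""
    sq = [(r - sum(a * b for a, b in zip(U[u], V[m]))) ** 2
          for u, m, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5

# Hypothetical sparse ratings: (user index, movie index, rating 1..5).
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 3, 1),
    (1, 0, 4), (1, 2, 2), (1, 4, 1),
    (2, 1, 1), (2, 3, 5), (2, 4, 4),
    (3, 2, 5), (3, 3, 4), (3, 0, 1),
]

U, V = factorize(ratings, n_users=4, n_movies=5)
print("training RMSE:", rmse(ratings, U, V))
for movie, (x, y) in enumerate(V):
    print("movie", movie, "latent coordinates:", x, y)
```

The rows of `V` are the points one would plot and then average per genre to compare categories.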