visualization of related subreddits
This project builds a graph of related subreddits.
Recommendations follow the pattern "Redditors who commented in this subreddit also commented in...".
Play with it here: https://anvaka.github.io/sayit/
The data
I used two months' worth of comments (August and September 2018), which contain ~38 million user <-> subreddit
records.
You can find the original data by following this discussion.
I computed the Jaccard similarity between subreddits and stored the results on GitHub Pages. The repository is available here. If you are curious to learn more about this or anything else, feel free to reach out to me on Twitter or via issues in this repository.
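As a rough illustration of the similarity step, here is a minimal sketch in JavaScript. It assumes the records are available in memory as [user, subreddit] pairs and treats each subreddit as the set of users who commented in it. The function names (buildCommenterSets, jaccard, relatedSubreddits) and the sample data are hypothetical and are not taken from this repository; the real pipeline is not shown here and may differ.

```javascript
// Group users by subreddit. `records` is an assumed input format:
// an array of [user, subreddit] pairs.
function buildCommenterSets(records) {
  const commenters = new Map();
  for (const [user, subreddit] of records) {
    if (!commenters.has(subreddit)) commenters.set(subreddit, new Set());
    commenters.get(subreddit).add(user);
  }
  return commenters;
}

// Jaccard similarity of two sets: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  // Iterate over the smaller set to keep the intersection count cheap.
  const [small, large] = a.size <= b.size ? [a, b] : [b, a];
  let intersection = 0;
  for (const item of small) {
    if (large.has(item)) intersection += 1;
  }
  return intersection / (a.size + b.size - intersection);
}

// Rank every other subreddit by similarity to `name`, highest first.
function relatedSubreddits(commenters, name, limit = 10) {
  const target = commenters.get(name);
  if (!target) return [];
  const scored = [];
  for (const [other, users] of commenters) {
    if (other === name) continue;
    const score = jaccard(target, users);
    if (score > 0) scored.push({ subreddit: other, score });
  }
  scored.sort((x, y) => y.score - x.score);
  return scored.slice(0, limit);
}

// Made-up sample data, for illustration only.
const sampleRecords = [
  ['alice', 'vuejs'], ['bob', 'vuejs'],
  ['alice', 'javascript'], ['bob', 'javascript'], ['carol', 'javascript'],
  ['dave', 'aww'],
];
console.log(relatedSubreddits(buildCommenterSets(sampleRecords), 'vuejs'));
```

In the sample, vuejs and javascript share two commenters out of three distinct ones, so javascript is the only related entry; aww shares nobody with vuejs and is dropped. Comparing every pair this way is quadratic in the number of subreddits, so a real run over ~38 million records would likely need a more scalable approach.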
Note: for very popular subreddits, Jaccard similarity didn't give meaningful results: they were all connected
to each other (e.g. /r/aww, /r/pics, /r/funny, and so on). For those, I manually collected references to other
subreddits from the subreddit description where one was available. Where the description did not include any
recommendations, I looked at the actual comments and used the most often mentioned subreddits as "related".
You can find the list of all overrides in the sayit-data repository.
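The comment-based fallback described above could be sketched roughly as follows. This is only an illustration: it assumes comments are plain strings and that mentions are written in the /r/name form, and the names MENTION_PATTERN, countMentions, and mostMentioned are hypothetical. The overrides themselves were collected manually, so this is not the code that produced them.

```javascript
// Assumed mention format: "/r/" followed by letters, digits, or underscores.
const MENTION_PATTERN = /\/r\/([A-Za-z0-9_]+)/g;

// Count how often each subreddit is mentioned across the comments,
// ignoring mentions of the subreddit the comments came from.
function countMentions(comments, ownName) {
  const counts = new Map();
  const own = ownName.toLowerCase();
  for (const text of comments) {
    for (const match of text.matchAll(MENTION_PATTERN)) {
      const name = match[1].toLowerCase();
      if (name === own) continue;
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return counts;
}

// Return the `limit` most often mentioned subreddit names.
function mostMentioned(comments, ownName, limit = 5) {
  return [...countMentions(comments, ownName)]
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map(([name]) => name);
}

// Made-up sample comments, for illustration only.
const sampleComments = [
  'Try /r/pics or /r/funny',
  'x-post from /r/pics',
  'I love /r/aww',
];
console.log(mostMentioned(sampleComments, 'aww', 2));
```

Names are lowercased before counting so that /r/Pics and /r/pics are treated as the same subreddit.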
Local Build Setup
# install dependencies
npm install
# serve with hot reload at localhost:8080
npm run dev
# build for production with minification
npm run build
# build for production and view the bundle analyzer report
npm run build --report
For a detailed explanation on how things work, check out the guide and docs for vue-loader.
Thanks!
Credit for the data goes to Jason Baumgartner, also known as u/Stuck_In_the_Matrix.
Huge thanks to Felipe Hoffa for putting the data into BigQuery.
If you like this work and would like to support it, I have a Patreon page, paypal.me, and GitHub Sponsors.
Thank you!