  • Stars: 125
  • Rank: 286,335 (Top 6 %)
  • Language: JavaScript
  • Created: about 11 years ago
  • Updated: over 1 year ago

Repository Details

An interactive map of Stack Exchange tags for all sites.

TagOverflow

An interactive map of tags from Stack Exchange sites. Click here for the live version! As of now it looks more or less like:

[Screenshot of the development version]

History

It is a continuation of my older project, Tag Graph Map of Stack Exchange, which got a warm reception from the Stack Exchange community (see e.g. here and here; I even got t-shirts from the SE team!).

Main ingredients

What's there?

Each question on a Stack Exchange site has one to five tags describing its content. Unlike on Twitter, these tags are well curated (to the point that you can get a Taxonomist badge).

Nodes represent the most popular tags, with their area being proportional to the number of questions with them.
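
A minimal sketch of the area rule, assuming nodes are drawn as circles. The function name `tagRadius` and the `areaPerQuestion` factor are hypothetical, not taken from the TagOverflow code.

```javascript
// Radius of a circle whose AREA (not radius) is proportional to the
// number of questions with a tag.
// `areaPerQuestion` is an assumed tuning factor (square pixels per question).
function tagRadius(questionCount, areaPerQuestion = 1) {
  // area = pi * r^2  =>  r = sqrt(area / pi)
  return Math.sqrt((questionCount * areaPerQuestion) / Math.PI);
}

// A tag with 4x the questions gets 4x the area, so only 2x the radius.
console.log(tagRadius(400) / tagRadius(100));
```

Scaling the radius directly would make popular tags look much bigger than they are, which is why the square root is taken.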

Edges represent relations between tags. Their width is related to the number of questions with both tags (e.g. with both python and list), while their shade shows how much more often the two tags occur together than one would expect by random chance. Default coloring comes from community detection, an automated splitting of a graph into densely connected subgraphs.

You can click on a tag to get additional data, like users who have asked or answered a lot of questions, along with the best questions with this tag. (Who knows, maybe you are one of the top guys and gals?)

Moreover, especially for Stack Overflow, which is a big place, you can draw conditional graphs. That is, consider only questions with a given tag (e.g. javascript). For example, it will count only those occurrences of array that happen to be together with javascript. The conditioning tag itself DOES NOT appear in the graph, for the same reason that the site name does not appear as a tag.
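
The conditional counting can be sketched as follows. This is only an illustration under assumptions: each question is given as an array of tag names, and the function name `conditionalCounts` and the space-separated pair key are hypothetical (tag names on Stack Exchange contain no spaces, so the key is unambiguous).

```javascript
// Count tags and tag pairs only among questions carrying `conditionTag`.
// The conditioning tag itself is left out of the counts.
function conditionalCounts(questions, conditionTag) {
  const tagCounts = new Map();
  const pairCounts = new Map();
  let total = 0;

  for (const tags of questions) {
    if (!tags.includes(conditionTag)) continue; // skip unrelated questions
    total += 1;
    // Remaining tags, deduplicated and sorted so that each pair has one key.
    const rest = [...new Set(tags)].filter((t) => t !== conditionTag).sort();
    for (let i = 0; i < rest.length; i++) {
      tagCounts.set(rest[i], (tagCounts.get(rest[i]) || 0) + 1);
      for (let j = i + 1; j < rest.length; j++) {
        const key = `${rest[i]} ${rest[j]}`;
        pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
      }
    }
  }
  return { total, tagCounts, pairCounts };
}

const exampleQuestions = [
  ["javascript", "array", "sorting"],
  ["python", "array"],
  ["javascript", "array"],
  ["javascript", "jquery"],
];
console.log(conditionalCounts(exampleQuestions, "javascript"));
```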

Methods and tricks

The co-occurrence weight (used for edge shade and strength) is calculated from the observed to expected ratio. It goes as follows:

oe_ratio = (all_questions_count * tag_count_AB) / (tag_count_A * tag_count_B)

It is exactly 1 when two tags appear together just as often as expected by chance, i.e. as if they were independent. If it is more than 1, the tags "like" each other and I draw an edge. (I ignore pairs with oe_ratio less than one, i.e. tags that avoid each other.) Believe me, this measure works much better than computing correlations of some vectors (I tried).
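
As a worked example of the formula above, here is a small sketch. The function names `oeRatio` and `isEdge` are hypothetical, and the counts are made up for illustration.

```javascript
// Observed-to-expected ratio for a pair of tags A and B.
// Under independence the expected number of questions with both tags is
// tagCountA * tagCountB / allQuestionsCount, so the ratio is observed / expected.
function oeRatio(allQuestionsCount, tagCountA, tagCountB, tagCountAB) {
  return (allQuestionsCount * tagCountAB) / (tagCountA * tagCountB);
}

// An edge is drawn only when the tags co-occur more often than by chance.
function isEdge(ratio) {
  return ratio > 1;
}

// 1000 questions, tag A on 100 of them, tag B on 50.
// By chance we would expect 100 * 50 / 1000 = 5 questions with both.
console.log(oeRatio(1000, 100, 50, 20)); // 20 observed vs 5 expected -> 4
console.log(oeRatio(1000, 100, 50, 5));  // exactly as expected -> 1
console.log(oeRatio(1000, 100, 50, 1));  // tags avoid each other -> 0.2
```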

The limit of 100 questions comes from the API limit. However, for dynamic graphs it is also a sane limit. For most sites 32 tags should be more than enough, except for a few sites that are bigger.

In any case, it does a lot of queries and (from time to time) Stack Exchange may block you. Don't worry, it lasts only for a few minutes.

Positions of the nodes come from the D3.js force layout. That is, nodes connected via an edge attract each other. The strength of such attraction depends on the strength of the edge. Plus, all nodes repel each other at a short distance to prevent overlaps.

For community detection I wrote a greedy hierarchical modularity maximization (as in arXiv:cond-mat/0408187). (AFAIK there is no other JavaScript implementation of community detection; if there is a need, I would be happy to implement something more serious like Louvain or Infomap. If you want it to happen, a few encouraging e-mails will work. :) EDIT: there is a good implementation of Louvain in JS.)
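
To illustrate the idea, here is a naive sketch of greedy agglomerative modularity maximization in the spirit of the cited paper. It is not the code used in TagOverflow: it assumes a weighted, undirected graph without self-loops and with at least one edge, it rescans all connected pairs at every step instead of using the efficient data structures from the paper, and the name `greedyModularity` and the `{source, target, weight}` edge format are hypothetical.

```javascript
// Start with every node as its own community and keep merging the pair of
// connected communities that increases modularity Q the most.
function greedyModularity(nodeIds, edges) {
  const totalWeight = edges.reduce((sum, edge) => sum + edge.weight, 0);
  const members = new Map(nodeIds.map((id) => [id, [id]]));
  // a.get(c): fraction of all edge ends attached to community c.
  const a = new Map(nodeIds.map((id) => [id, 0]));
  // e.get(c).get(d): half the fraction of edge weight between c and d.
  const e = new Map(nodeIds.map((id) => [id, new Map()]));

  for (const { source, target, weight } of edges) {
    const half = weight / (2 * totalWeight);
    e.get(source).set(target, (e.get(source).get(target) || 0) + half);
    e.get(target).set(source, (e.get(target).get(source) || 0) + half);
    a.set(source, a.get(source) + half);
    a.set(target, a.get(target) + half);
  }

  // With singleton communities there are no internal edges: Q = -sum(a_c^2).
  let modularity = 0;
  for (const fraction of a.values()) modularity -= fraction * fraction;

  while (true) {
    // Find the connected pair with the largest positive gain 2 * (e_cd - a_c * a_d).
    let best = null;
    for (const [c, neighbours] of e) {
      for (const [d, eCD] of neighbours) {
        const gain = 2 * (eCD - a.get(c) * a.get(d));
        if (gain > 1e-12 && (best === null || gain > best.gain)) {
          best = { c, d, gain };
        }
      }
    }
    if (best === null) break; // no merge improves modularity

    // Merge community d into community c.
    const { c, d, gain } = best;
    members.get(c).push(...members.get(d));
    a.set(c, a.get(c) + a.get(d));
    for (const [k, w] of e.get(d)) {
      if (k === c) continue; // edges between c and d become internal
      const merged = (e.get(c).get(k) || 0) + w;
      e.get(c).set(k, merged);
      e.get(k).set(c, merged);
      e.get(k).delete(d);
    }
    e.get(c).delete(d);
    e.delete(d);
    a.delete(d);
    members.delete(d);
    modularity += gain;
  }
  return { communities: [...members.values()], modularity };
}

// Two triangles joined by a single edge split into two communities.
const triangleEdges = [
  ["a", "b"], ["b", "c"], ["c", "a"],
  ["d", "e"], ["e", "f"], ["f", "d"],
  ["c", "d"],
].map(([source, target]) => ({ source, target, weight: 1 }));
console.log(greedyModularity(["a", "b", "c", "d", "e", "f"], triangleEdges));
```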

There are some tricks. For example, to calculate tag statistics (e.g. average number of answers per tag) it is infeasible to probe all questions, and there is no REST API to get these numbers directly. So, it takes the 100 newest questions with a given tag that are at least a month old (so their scores have stabilized a bit).
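
The sampling trick can be sketched on already-downloaded data as below. This is an assumption-laden illustration: the function name `sampleTagStats` is hypothetical, a month is approximated as 30 days, and the field names `creation_date` (Unix seconds), `score` and `answer_count` follow the question objects returned by the Stack Exchange API. In practice the filtering would more likely be done in the API query itself.

```javascript
const MONTH_SECONDS = 30 * 24 * 60 * 60; // a month approximated as 30 days

// Estimate per-tag statistics from the newest questions that are at least
// a month old, instead of probing every question with the tag.
function sampleTagStats(questions, nowSeconds, sampleSize = 100) {
  const cutoff = nowSeconds - MONTH_SECONDS;
  const sample = questions
    .filter((q) => q.creation_date <= cutoff) // old enough for scores to settle
    .sort((x, y) => y.creation_date - x.creation_date) // newest first
    .slice(0, sampleSize);
  if (sample.length === 0) return null; // nothing to estimate from

  const mean = (key) => sample.reduce((sum, q) => sum + q[key], 0) / sample.length;
  return {
    sampleSize: sample.length,
    meanScore: mean("score"),
    meanAnswers: mean("answer_count"),
  };
}
```

Note that the averages are estimates from a recent sample, so they describe how a tag behaves lately rather than over its whole history.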

The best askers and answerers, unfortunately, do not work properly for conditional tags (as the respective API queries can be done only for a single tag).

As tag statistics (like average score) have long tails but can also be zero or negative, neither a linear nor a log scale fits. So, Marta built an asinh scale! In short, for small values it behaves like a linear scale and for large values like a logarithmic one, and it is antisymmetric.
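
A minimal sketch of such a scale, built on the standard `Math.asinh`. This is not Marta's implementation: the name `asinhScale` and the `linearWidth` parameter (the value range that stays roughly linear) are assumptions made for illustration.

```javascript
// Maps [-domainMax, domainMax] onto [-rangeMax, rangeMax] using asinh.
// asinh(x) ~ x for |x| << 1 and ~ sign(x) * ln(2|x|) for |x| >> 1,
// and asinh(-x) = -asinh(x), so zero and negative values are handled too.
function asinhScale(domainMax, rangeMax, linearWidth = 1) {
  const norm = Math.asinh(domainMax / linearWidth);
  return (value) => (rangeMax * Math.asinh(value / linearWidth)) / norm;
}

const scoreScale = asinhScale(1000, 100);
console.log(scoreScale(0));    // zero stays at zero
console.log(scoreScale(-50));  // negative values are allowed, unlike with a log scale
console.log(scoreScale(50));   // mirror image of the value above
console.log(scoreScale(1000)); // the long tail is compressed into the range
```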

On code quality

Before looking at the code: beware, when you gaze long into the code, the code also gazes into you!

(Some excuses: I started it a long time ago, changed it in various directions, and used it to learn JS and to teach JS, so it has most of the bad practices it could get. I should rewrite it completely in Angular.JS + D3.js; but instead, I decided to show the result, hoping that you will forgive me the dirty code.)

Nonetheless, if something does not work, raise an Issue or (even better!) propose a Pull Request.

Citing

Feel free to use it for anything. Just please refer to it as:

And for any academic papers, please cite:

  • Piotr Migdał, Symmetries and self-similarity of many-body wavefunctions, PhD Thesis (ICFO), arXiv:1412.6796

(If you are wondering about the relation of my PhD thesis to this project: well, one of its main topics is community detection. While introducing basic methods, I use TagOverflow as an example.)

More Repositories

1

science-based-games-list

Science-based games - a collaborative list
1,577
star
2

livelossplot

Live training loss plot in Jupyter Notebook for Keras, PyTorch and others
Python
1,294
star
3

interactive-machine-learning-list

A collaborative list of interactive Machine Learning, Deep Learning and Statistics websites
JavaScript
432
star
4

quantum-game

Quantum Game (old version) - a puzzle game with real quantum mechanics in a browser
JavaScript
343
star
5

thinking-in-tensors-writing-in-pytorch

Thinking in tensors, writing in PyTorch (a hands-on deep learning intro)
Jupyter Notebook
329
star
6

keras-sequential-ascii

ASCII summary for simple sequential models in Keras
Jupyter Notebook
126
star
7

tag-graph-map-of-stackexchange

Generates map in form of a graph from tags on StackExchange sites, e.g. StackOverflow.
Python
53
star
8

weltschmerz

Weltschmerz by age - "I am X years old and... [Google autocomplete]"
JavaScript
23
star
9

keras-mini-examples

Small Keras examples to get you started
Jupyter Notebook
20
star
10

data-science-pl

Data Science PL knowledge base
14
star
11

qubism

Self-similar visualization of many-body wavefunctions (and also: time series, DNA, proteins).
Mathematica
13
star
12

which-ml-are-you

Which ML are you?
Vue
12
star
13

jekyll-blog-pre-2022

Old Piotr Migdał's blog, in Jekyll, pre 2022
HTML
9
star
14

delab-matury

Analysis and visualization of matura exam data from 2010-2014
Jupyter Notebook
8
star
15

wizualizacja-wolnych-lektur

Themes and colours of readings from wolnelektury.pl - a visualization in D3.js.
Python
8
star
16

kfnrd_viz

Data visualizations for Krajowy Fundusz na rzecz Dzieci (Polish Children's Fund)
JavaScript
7
star
17

cv-resume

My Curriculum Vitae / Resume
TeX
7
star
18

uw-ml-python-2017

Jupyter Notebook
6
star
19

hackart-you-in-artwork

Museum treasures (1st place at HackArt, National Museum in Warsaw)
Jupyter Notebook
6
star
20

pytorch-intro

Intro to PyTorch stuff; now internal, for interns
Jupyter Notebook
5
star
21

gossipingcommons

Gossiping Commons - “don't tell alike” and “no author, please” open licenses
5
star
22

stable-diffusion-keras-m1-gpu

Stable diffusion image generation with KerasCV for Macbook M1 GPU
Jupyter Notebook
4
star
23

se-api-py

A lightweight Python wrapper for StackExchange API v2.1
Python
4
star
24

matrix-decomposition-viz

Work in progress
JavaScript
4
star
25

talk_20160119_jupyter_notebook

Presentation on Jupyter Notebook (IPython Notebook + R) - at Data Science Warsaw Meetup
Jupyter Notebook
3
star
26

old-blog-gridsome-pre-2024

Old Piotr Migdał blog - Gridsome, 2022-2024
Vue
3
star
27

A-numerical-model-of-the-Mafia-game

Party game Mafia (a.k.a. Werewolf) investigated numerically.
Python
3
star
28

menger-vr

Menger Sponge - 3D Fractal VR - A-Frame
HTML
3
star
29

piotr_migdal_resume

Piotr Migdal Resume 2022+, LaTeX AltaCV template
TeX
2
star
30

dl-diag-d3js

Deep Learning architecture diagrams - a D3.js library
JavaScript
2
star
31

random_data_explorations

Random data explorations
Jupyter Notebook
2
star
32

art-tensor-diagrams

Tensor Diagrams expository article in RDMarkdown Distill
TeX
2
star
33

python-neuroaspects-2016

First steps with data analysis in Python - Aspects of Neuroscience 2016
Jupyter Notebook
2
star
34

trypo-brainhack

Jupyter Notebook
1
star
35

diffraction-gratings

Diffraction Gratings, Moire Pattern and Spiral Zone Plates - in PostScript
1
star
36

pypi-search-meteor

PyPI interactive package search in Meteor
JavaScript
1
star
37

stared.github.io

Personal website and blog by Piotr Migdał, in Nuxt 3 Content
Vue
1
star
38

szkolomat_dane

Szkołomat - data
1
star
39

yarn-adding-pure-typescript-package-example

TypeScript
1
star
40

trolleython

Trolley with Friends - a cynical game (in dev)
JavaScript
1
star
41

nalogi-viz

Visualization of addictions - what they give, what they fight (words that attract each other)
Jupyter Notebook
1
star