  • Stars: 242
  • Rank: 167,048 (Top 4 %)
  • Language: JavaScript
  • License: Apache License 2.0
  • Created about 9 years ago
  • Updated over 8 years ago

Repository Details

t-distributed stochastic neighbor embedding (t-SNE) algorithm implemented in JavaScript

t-SNE.js

  • Runs in the browser (also runs in Web Workers)

  • Runs in node.js

  • Uses efficient in-place matrix operations via ndarray

  • Follows closely the API of scikit-learn, allowing specification of perplexity and early exaggeration factor, among other parameters.

INTERACTIVE DEMO

Background

t-SNE is a powerful manifold learning technique for embedding data into a low-dimensional space (typically 2-d or 3-d for visualization purposes) while preserving small pairwise distances, i.e. local structure, from the original high-dimensional space. In practice, this often results in a much more intuitive layout in the low-dimensional space than other techniques produce. The low-dimensional embedding is learned by minimizing the Kullback-Leibler divergence between the pairwise-similarity probability distribution over the original data space and the corresponding distribution over the embedding space.
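
As a rough illustration of the objective described above, the Kullback-Leibler divergence between two discrete distributions can be computed as below. The function name `klDivergence` and the flat-array representation are assumptions made for this sketch; they are not part of the t-SNE.js API.

```javascript
// Hypothetical helper, not part of t-SNE.js: KL(P || Q) for two discrete
// distributions given as equal-length arrays of probabilities.
function klDivergence(p, q) {
  if (p.length !== q.length) {
    throw new Error('p and q must have the same length');
  }
  const eps = 1e-12; // guards against log(0) and division by zero
  let sum = 0;
  for (let i = 0; i < p.length; i++) {
    if (p[i] > 0) {
      sum += p[i] * Math.log(p[i] / Math.max(q[i], eps));
    }
  }
  return sum;
}

// Identical distributions have zero divergence; mismatched ones are positive.
console.log(klDivergence([0.5, 0.5], [0.5, 0.5]));
console.log(klDivergence([0.9, 0.1], [0.5, 0.5]));
```

In t-SNE, P would hold the pairwise similarities in the original space and Q those in the embedding, both flattened over all pairs of points.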

An important note is that the objective function is non-convex with numerous local minima, and thus the results are non-deterministic: different runs can converge to different solutions. A few model parameters influence the learning and optimization process. Selecting appropriate parameters for the input data can significantly improve the chances that the model converges on a good solution.
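
Because each run can land in a different local minimum, one common workaround is to fit several times and keep the embedding with the lowest final error. The sketch below assumes that constructing a fresh model and calling `init()` gives a new random starting point, which this README does not confirm; `bestOfRuns` and `createModel` are hypothetical names.

```javascript
// Hypothetical helper: run t-SNE several times and keep the lowest-error result.
// `createModel` is a caller-supplied factory returning a fresh model that
// exposes init(), run() and getOutput() as in the Usage section.
function bestOfRuns(createModel, inputData, nRuns) {
  let best = null;
  for (let i = 0; i < nRuns; i++) {
    const model = createModel();
    model.init({ data: inputData, type: 'dense' });
    const [error, iter] = model.run();
    if (best === null || error < best.error) {
      best = { error, iter, output: model.getOutput() };
    }
  }
  return best;
}
```

With the library installed, `createModel` could be something like `() => new TSNE({ dim: 2, perplexity: 30.0, earlyExaggeration: 4.0, learningRate: 100.0, nIter: 1000, metric: 'euclidean' })`.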

Currently implemented is the exact formulation, which has computational complexity O(dN^2), where d is the original dimensionality of the data and N is the number of samples. Implementation of the O(dN*logN) Barnes-Hut approximation variant is planned (contributions welcome!).
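
The quadratic cost comes from visiting every pair of samples. A minimal sketch of that step for the Euclidean case, using plain nested arrays rather than the ndarray operations the library relies on (`pairwiseSquaredDistances` is a hypothetical name, not a library function):

```javascript
// Hypothetical helper: squared Euclidean distances between all pairs of rows.
// Two loops over N samples and one over d features give O(d * N^2) work.
function pairwiseSquaredDistances(data) {
  const n = data.length;
  const dist = Array.from({ length: n }, () => new Array(n).fill(0));
  for (let i = 0; i < n; i++) {
    for (let j = i + 1; j < n; j++) {
      let sum = 0;
      for (let k = 0; k < data[i].length; k++) {
        const diff = data[i][k] - data[j][k];
        sum += diff * diff;
      }
      dist[i][j] = sum; // the matrix is symmetric, so fill both halves
      dist[j][i] = sum;
    }
  }
  return dist;
}

console.log(pairwiseSquaredDistances([[0, 0], [3, 4], [6, 8]]));
```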

[source](http://lvdmaaten.github.io/tsne/)

Usage

t-SNE.js can be run in node.js or the browser. In the browser, it should ideally be run in a Web Worker so that the computation does not block the main thread.

node.js

$ npm install tsne-js --save

import TSNE from 'tsne-js';

let model = new TSNE({
  dim: 2,
  perplexity: 30.0,
  earlyExaggeration: 4.0,
  learningRate: 100.0,
  nIter: 1000,
  metric: 'euclidean'
});

// inputData is a nested array which can be converted into an ndarray
// alternatively, it can be an array of coordinates (in which case `type` should be specified as 'sparse')
model.init({
  data: inputData,
  type: 'dense'
});

// `error`,  `iter`: final error and iteration number
// note: computation-heavy action happens here
let [error, iter] = model.run();

// rerun without re-calculating pairwise distances, etc.
// (reassign instead of redeclaring: a second `let` for the same names is a SyntaxError)
[error, iter] = model.rerun();

// `output` is unpacked ndarray (regular nested javascript array)
let output = model.getOutput();

// `outputScaled` is `output` scaled to a range of [-1, 1]
let outputScaled = model.getOutputScaled();
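
Since `getOutputScaled()` returns coordinates in [-1, 1], plotting a 2-d embedding usually needs one more mapping to pixel space. A minimal sketch, assuming `dim: 2` and a canvas-style coordinate system with the origin at the top left (`toPixels` is a hypothetical helper, not part of the library):

```javascript
// Hypothetical helper: map points from [-1, 1] x [-1, 1] to pixel coordinates.
function toPixels(outputScaled, width, height) {
  return outputScaled.map(([x, y]) => [
    ((x + 1) / 2) * width,
    ((1 - y) / 2) * height // flip y so larger values are drawn higher on screen
  ]);
}

console.log(toPixels([[-1, -1], [0, 0], [1, 1]], 400, 200));
```
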
browser
<script src="tsne.min.js"></script>

Then it's the same API as above. A browser example using Web Workers is in the example/ folder.
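
A rough sketch of how the worker side might look. It assumes the browser bundle exposes a global named `TSNE`, which is not stated here, and the message shape (`{ data, options }`) is invented for this illustration; the example/ folder remains the reference. The handler is written as a factory so that the constructor and the reply function are passed in explicitly.

```javascript
// Hypothetical worker-side handler. `TSNEClass` is the model constructor and
// `post` is the function used to send results back (self.postMessage in a worker).
function makeMessageHandler(TSNEClass, post) {
  return function onMessage(event) {
    const { data, options } = event.data;
    const model = new TSNEClass(options);
    model.init({ data, type: 'dense' });
    const [error, iter] = model.run(); // heavy work stays off the main thread
    post({ error, iter, output: model.getOutputScaled() });
  };
}

// Inside a worker script, after importScripts('tsne.min.js'), this could be
// wired up as follows (assuming the global is called TSNE):
//   self.onmessage = makeMessageHandler(TSNE, (msg) => self.postMessage(msg));
```

On the page side, the matching call would be `worker.postMessage({ data: inputData, options: { ... } })` with the same options object passed to the constructor in the node.js example.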

Model Parameters
  • dim: number of embedding dimensions, typically 2 or 3

  • perplexity: roughly the effective number of nearest neighbors considered for each point during learning, typically between 5 and 50

  • earlyExaggeration: parameter which influences spacing between clusters, must be at least 1.0

  • learningRate: learning rate for gradient descent, typically between 100 and 1000

  • nIter: maximum number of iterations, should be at least 200

  • metric: distance measure to use for input data, currently implemented measures include

    • euclidean
    • manhattan
    • jaccard (boolean data)
    • dice (boolean data)
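
For reference, the four measures listed above can be written out as below. These follow the standard definitions of each distance; whether t-SNE.js handles edge cases such as two all-false vectors the same way is an assumption. All function names are illustrative and not part of the library.

```javascript
const euclidean = (a, b) =>
  Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));

const manhattan = (a, b) =>
  a.reduce((sum, ai, i) => sum + Math.abs(ai - b[i]), 0);

// Count positions where both are true (tt) and where exactly one is true (diff).
function booleanCounts(a, b) {
  let tt = 0;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] && b[i]) tt++;
    else if (a[i] || b[i]) diff++;
  }
  return { tt, diff };
}

// Jaccard dissimilarity: mismatches over positions where at least one is true.
function jaccard(a, b) {
  const { tt, diff } = booleanCounts(a, b);
  return tt + diff === 0 ? 0 : diff / (tt + diff);
}

// Dice dissimilarity: like Jaccard, but shared trues count twice.
function dice(a, b) {
  const { tt, diff } = booleanCounts(a, b);
  return tt + diff === 0 ? 0 : diff / (2 * tt + diff);
}
```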

Build

To run the build yourself, for both the browser (outputs to build/tsne.min.js) and node.js (outputs to dist/):

$ npm run build

To build for just the browser, run npm run build-browser, and to build for just node.js, run npm run build-node.

Tests

$ npm test

References

The original paper on t-SNE:

L.J.P. van der Maaten and G.E. Hinton.
Visualizing High-Dimensional Data Using t-SNE.
Journal of Machine Learning Research 9(Nov):2579-2605, 2008.

Paper on Barnes-Hut variant t-SNE:

L.J.P. van der Maaten.
Accelerating t-SNE using Tree-Based Algorithms.
Journal of Machine Learning Research 15(Oct):3221-3245, 2014.

License

Apache 2.0
