  • Stars: 2,641
  • Rank: 17,319 (Top 0.4%)
  • Language: JavaScript
  • License: MIT License
  • Created about 12 years ago
  • Updated over 4 years ago

Repository Details

AFINN-based sentiment analysis for Node.js.

sentiment

Sentiment is a Node.js module that uses the AFINN-165 wordlist and Emoji Sentiment Ranking to perform sentiment analysis on arbitrary blocks of input text. Sentiment provides several things:

  • Performance (see benchmarks below)
  • The ability to append and overwrite word / value pairs from the AFINN wordlist
  • The ability to easily add support for new languages
  • The ability to easily define custom strategies for negation, emphasis, etc. on a per-language basis

Installation

npm install sentiment

Usage example

var Sentiment = require('sentiment');
var sentiment = new Sentiment();
var result = sentiment.analyze('Cats are stupid.');
console.dir(result);    // Score: -2, Comparative: -0.666

Adding new languages

You can add support for a new language by registering it using the registerLanguage method:

var frLanguage = {
  labels: { 'stupide': -2 }
};
sentiment.registerLanguage('fr', frLanguage);

var result = sentiment.analyze('Le chat est stupide.', { language: 'fr' });
console.dir(result);    // Score: -2, Comparative: -0.5

You can also define custom scoring strategies to handle things like negation and emphasis on a per-language basis:

var frLanguage = {
  labels: { 'stupide': -2 },
  scoringStrategy: {
    apply: function(tokens, cursor, tokenScore) {
      if (cursor > 0) {
        var prevtoken = tokens[cursor - 1];
        if (prevtoken === 'pas') {
          tokenScore = -tokenScore;
        }
      }
      return tokenScore;
    }
  }
};
sentiment.registerLanguage('fr', frLanguage);

var result = sentiment.analyze('Le chat n\'est pas stupide', { language: 'fr' });
console.dir(result);    // Score: 2, Comparative: 0.4
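
To make the role of the three `apply` arguments concrete, here is a minimal sketch of how a scoring loop might call the strategy once per recognized token. The `scoreTokens` helper and the `frSketch` object are hypothetical and exist only for illustration; this is not the library's internal code, and it assumes the input has already been tokenized.

```javascript
// Hypothetical driver, for illustration only: it walks the tokens, looks each
// one up in `labels`, and lets the strategy adjust the raw score.
function scoreTokens(tokens, language) {
  var total = 0;
  for (var cursor = 0; cursor < tokens.length; cursor++) {
    var token = tokens[cursor];
    // Skip tokens that have no entry in the word list.
    if (!Object.prototype.hasOwnProperty.call(language.labels, token)) continue;
    total += language.scoringStrategy.apply(tokens, cursor, language.labels[token]);
  }
  return total;
}

var frSketch = {
  labels: { 'stupide': -2 },
  scoringStrategy: {
    // Flip the sign of a token's score when it directly follows "pas".
    apply: function (tokens, cursor, tokenScore) {
      if (cursor > 0 && tokens[cursor - 1] === 'pas') {
        return -tokenScore;
      }
      return tokenScore;
    }
  }
};

var negated = scoreTokens(['le', 'chat', "n'est", 'pas', 'stupide'], frSketch);
var plain = scoreTokens(['le', 'chat', 'est', 'stupide'], frSketch);
console.log(negated, plain); // 2 -2
```

Because the strategy receives the whole token array and the cursor, it can look backwards or forwards as far as it needs to, which is what makes per-language rules for negation or emphasis possible.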

Adding and overwriting words

You can append to and/or overwrite values from AFINN by passing key/value pairs in the extras option of an analyze call:

var options = {
  extras: {
    'cats': 5,
    'amazing': 2
  }
};
var result = sentiment.analyze('Cats are totally amazing!', options);
console.dir(result);    // Score: 7, Comparative: 1.75
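
One way to picture the effect of `extras` is as an object merge in which the extra entries win over the base word list. The sketch below assumes that mental model; `baseLabels` and `mergeLabels` are hypothetical names, and the base values are a tiny stand-in for illustration rather than the real AFINN data.

```javascript
// Tiny stand-in word list for illustration (not the real AFINN data).
var baseLabels = { 'amazing': 4, 'stupid': -2 };

// Hypothetical helper: returns a new object in which `extras` entries are
// appended to, or take precedence over, the base entries.
function mergeLabels(base, extras) {
  return Object.assign({}, base, extras || {});
}

var merged = mergeLabels(baseLabels, { 'cats': 5, 'amazing': 2 });
console.log(merged); // { amazing: 2, stupid: -2, cats: 5 }
```

Merging into a fresh object leaves the base list untouched, so overrides of this kind would apply to a single call only.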

API Reference

var sentiment = new Sentiment([options])

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| options | object | false | Configuration options (no options supported currently) |

sentiment.analyze(phrase, [options], [callback])

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| phrase | string | true | Input phrase to analyze |
| options | object | false | Options (see below) |
| callback | function | false | If specified, the result is returned using this callback function |

options object properties:

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| language | string | 'en' | Language to use for sentiment analysis |
| extras | object | {} | Set of labels and their associated values to add or overwrite |

sentiment.registerLanguage(languageCode, language)

| Argument | Type | Required | Description |
| --- | --- | --- | --- |
| languageCode | string | true | International two-letter code for the language to add |
| language | object | true | Language module (see Adding new languages) |

How it works

AFINN

AFINN is a list of words rated for valence with an integer between minus five (negative) and plus five (positive). Sentiment analysis is performed by cross-checking the string tokens (words, emojis) against the AFINN list and looking up their respective scores. The comparative score is simply the sum of the token scores divided by the number of tokens. For example, take the following:

I love cats, but I am allergic to them.

That string results in the following:

{
    score: 1,
    comparative: 0.1111111111111111,
    calculation: [ { allergic: -2 }, { love: 3 } ],
    tokens: [
        'i',
        'love',
        'cats',
        'but',
        'i',
        'am',
        'allergic',
        'to',
        'them'
    ],
    words: [
        'allergic',
        'love'
    ],
    positive: [
        'love'
    ],
    negative: [
        'allergic'
    ]
}
  • Returned Objects
    • Score: Score calculated by adding the sentiment values of recognized words.
    • Comparative: Comparative score of the input string.
    • Calculation: An array of words that have a negative or positive valence with their respective AFINN score.
    • Tokens: All the tokens, such as words or emojis, found in the input string.
    • Words: List of words from input string that were found in AFINN list.
    • Positive: List of positive words in input string that were found in AFINN list.
    • Negative: List of negative words in input string that were found in AFINN list.

In this case, love has a value of 3, allergic has a value of -2, and the remaining tokens are neutral with a value of 0. Because the string has 9 tokens the resulting comparative score looks like: (3 + -2) / 9 = 0.111111111
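
The arithmetic above can be reproduced with a few lines of JavaScript. This is a minimal sketch, not the module's implementation: `sampleLabels` holds only the two scores quoted above, and `scoreAndCompare` is a hypothetical helper that takes an already tokenized input.

```javascript
// Only the two scores mentioned in the text; every other token counts as 0.
var sampleLabels = { 'love': 3, 'allergic': -2 };

// Hypothetical helper: sums the scores of known tokens and divides by the
// total number of tokens (an empty input yields a comparative of 0).
function scoreAndCompare(tokens, labels) {
  var score = 0;
  tokens.forEach(function (token) {
    if (Object.prototype.hasOwnProperty.call(labels, token)) {
      score += labels[token];
    }
  });
  var comparative = tokens.length > 0 ? score / tokens.length : 0;
  return { score: score, comparative: comparative };
}

var sampleTokens = ['i', 'love', 'cats', 'but', 'i', 'am', 'allergic', 'to', 'them'];
var outcome = scoreAndCompare(sampleTokens, sampleLabels);
console.log(outcome.score);       // 1
console.log(outcome.comparative); // 0.1111111111111111
```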

This approach leaves you with a mid-point of 0 and the upper and lower bounds are constrained to positive and negative 5 respectively (the same as each token! 😸). For example, let's imagine an incredibly "positive" string with 200 tokens and where each token has an AFINN score of 5. Our resulting comparative score would look like this:

(max positive score * number of tokens) / number of tokens
(5 * 200) / 200 = 5
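
The same bound holds for any input, not only the extreme case. As a short derivation, write s_i for the score of token i and n for the number of tokens (notation introduced here for illustration):

```latex
\text{comparative} = \frac{1}{n} \sum_{i=1}^{n} s_i,
\qquad -5 \le s_i \le 5
\;\Longrightarrow\;
-5 \le \frac{1}{n} \sum_{i=1}^{n} s_i \le 5
```

This assumes every token score stays within the AFINN range. Custom values outside that range, or a scoring strategy that scales scores, would widen the bound accordingly.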

Tokenization

Tokenization works by splitting the input string into lines, removing special characters, and then splitting the result on spaces. This produces the list of tokens (words and emojis) used for scoring.
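
Those steps can be sketched as follows. This is an illustrative approximation rather than the module's actual tokenizer: the `tokenizeSketch` name and the exact set of characters treated as special are assumptions.

```javascript
// Hypothetical tokenizer sketch; the punctuation set below is an assumption.
function tokenizeSketch(input) {
  return input
    .toLowerCase()                                 // sample output tokens are lowercase
    .replace(/\n/g, ' ')                           // treat line breaks as spaces
    .replace(/[.,\/#!?$%\^&*;:{}=_`"~()]/g, ' ')   // drop special characters
    .split(/\s+/)                                  // split on runs of whitespace
    .filter(function (token) { return token.length > 0; });
}

console.log(tokenizeSketch('I love cats, but I am allergic to them.'));
```

Applied to the earlier example sentence, this yields the same nine lowercase tokens shown in the sample result.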


Benchmarks

A primary motivation for designing sentiment was performance. As such, it includes a benchmark script within the test directory that compares it against the Sentimental module, which provides a nearly equivalent interface and approach. Based on these benchmarks, running on a MacBook Pro with Node v6.9.1, sentiment is nearly twice as fast as Sentimental:

sentiment (Latest) x 861,312 ops/sec ±0.87% (89 runs sampled)
Sentimental (1.0.1) x 451,066 ops/sec ±0.99% (92 runs sampled)

To run the benchmarks yourself:

npm run test:benchmark

Validation

While the accuracy provided by AFINN is quite good considering its computational performance (see above), there is always room for improvement. Therefore, the sentiment module is open to accepting PRs that modify or amend the AFINN / Emoji datasets or implementation, provided that they improve accuracy and maintain similar performance characteristics. To establish this, we test the sentiment module against three labelled datasets provided by UCI.

To run the validation tests yourself:

npm run test:validate

Rand Accuracy

Amazon:  0.726
IMDB:    0.765
Yelp:    0.696
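
For reference, Rand accuracy is the fraction of predictions that agree with the labels: (true positives + true negatives) / total. The sketch below shows that calculation on made-up data. The `randAccuracy` helper, the sample rows, and the rule of treating a score above zero as a positive prediction are all assumptions for illustration and may differ from the module's validation script.

```javascript
// Hypothetical helper: fraction of rows whose predicted class matches the label.
// A row is predicted positive when its sentiment score is greater than 0.
function randAccuracy(rows) {
  if (rows.length === 0) return 0;
  var correct = rows.filter(function (row) {
    var predictedPositive = row.score > 0;
    return predictedPositive === row.positive;
  }).length;
  return correct / rows.length;
}

// Made-up rows, not taken from the UCI datasets.
var madeUpRows = [
  { score: 3, positive: true },    // true positive
  { score: -2, positive: false },  // true negative
  { score: 1, positive: false },   // false positive
  { score: 0, positive: true }     // false negative
];
console.log(randAccuracy(madeUpRows)); // 0.5
```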

Testing

npm test

More Repositories

1. color (Objective-C, 537 stars): A collection of categories and utilities that extend UIColor
2. troll (JavaScript, 332 stars): Language sentiment analysis and neural networks... for trolls.
3. cam (Objective-C, 280 stars): A “keep it simple” approach to handling photo and video capture with AVFoundation.
4. queue (Objective-C, 268 stars): A persistent background job queue for iOS.
5. storage (Objective-C, 257 stars): An iOS library for fast, easy, and safe threaded disk I/O.
6. semver (Objective-C, 115 stars): Semantic Versioning library for Objective-C
7. washyourmouthoutwithsoap (JavaScript, 96 stars): A list of bad words in many languages.
8. fastly (JavaScript, 70 stars): Fastly API client for Node.js
9. conduit (Objective-C, 52 stars): JS to Objective-C... and back again.
10. parallax (Objective-C, 44 stars): Objective-C library for implementation of CoreMotion-controlled parallax distortion.
11. generator (JavaScript, 30 stars): Language agnostic project bootstrapping with an emphasis on simplicity.
12. fork-pool (JavaScript, 27 stars): A generic child process pool for Node.js.
13. logo (JavaScript, 23 stars): A streaming parser for the LOGO programming language.
14. micron-throttle (JavaScript, 16 stars): Token bucket based HTTP request throttle for Node.js
15. trebuchet (JavaScript, 15 stars): A node.js module for throwing email around using the Postmark API.
16. rodeo (JavaScript, 14 stars): Realtime notifications with Redis and Node.js
17. turtle (JavaScript, 14 stars): A collaborative programming environment for the LOGO programming language.
18. orchestra (10 stars): Keyboard-based instruments designed for MaKey MaKey
19. basic (JavaScript, 10 stars): HTTP Basic Authentication for Node.js
20. dpla (JavaScript, 9 stars): Node.js API client for the Digital Public Library of America
21. graffle-json (JavaScript, 9 stars): A node.js utility for converting OmniGraffle .OO3 files into structured JSON
22. simple (JavaScript, 8 stars): A simple static HTTP server
23. tineye (JavaScript, 8 stars): Node.js client for the Tineye search API
24. strainer (JavaScript, 8 stars): Simple filtering of arrays and object streams.
25. namebot (JavaScript, 7 stars): A node.js module for creating usernames based on a specified corpus
26. baseit (JavaScript, 5 stars): A node.js module for simple(r) handling of radix 2 through 36 base encodings.
27. phidget (JavaScript, 5 stars): Node.js bindings for the Phidget line of USB sensor and control interfaces.
28. friendly-phonemes (5 stars): A kid friendly corpus in both JSON and phonetic "DICT" formats
29. assert (Objective-C, 4 stars): Assertion extensions and utilities for OCUnit
30. cc-client (JavaScript, 3 stars): Node.js client for the Constant Contact API
31. 3d-mixer (C, 3 stars): OpenFrameworks based 8-channel 3D sound mixer prototype
32. php-console (Objective-C, 3 stars): PHP Console is a MacOS X (10.6+) Cocoa application that provides users with a simple environment in which to execute arbitrary PHP code.
33. up-client (JavaScript, 3 stars): Node.js client for the (unofficial) Jawbone UP API
34. rij (JavaScript, 3 stars): Safe and sensible work queue for Node.js
35. localq (JavaScript, 3 stars): A persistent job queue for the browser.
36. cork (JavaScript, 2 stars): An API utility belt for request.
37. dotfiles (Shell, 2 stars): My dotfiles. There are many like them, but these are mine.
38. apostle (JavaScript, 2 stars): Node.js API client for Apostle.io
39. sublime (Python, 2 stars): A collection of handy Sublime Text snippets & build scripts
40. hipchat-cli (Shell, 2 stars): A Hipchat CLI using curl
41. vouch (JavaScript, 2 stars): JSON schema validation ... for humans.
42. micron (JavaScript, 2 stars): Minimalist extensions to the Node.js core HTTP server.
43. badgecrawler (JavaScript, 2 stars): Search provider for Mozilla Open Badges
44. randy (JavaScript, 2 stars): Socket.io based realtime notifications with Rodeo.
45. teach-presentation (1 star): How to Teach (Almost) Anything - Presentation Slides
46. dashboard (JavaScript, 1 star): gMail to servo = wat
47. ios-blinkrc-control (Objective-C, 1 star): Quick prototype iOS control application for the "Insurance Liability Bot" (BlinkRC servo controller). Requires Sparrow framework (http://www.sparrow-framework.org).
48. uiimage-io (Objective-C, 1 star): A category for UIImage that provides naive methods for saving UIImage objects to disk. For demo purposes only.
49. dscripts (D, 1 star): A collection of dtrace scripts