📦 English inflection library for noun (plural to singular and singular to plural), verb (gerund, present & past) and adjective (comparative & superlative) transformations/conjugation.

English Inflectors Library

For noun (plural to singular and singular to plural), verb (gerund, present & past) and adjective (comparative, superlative) transformations.

Demo

Here's a quick demo: http://en-inflectors.surge.sh/

Installation

npm install en-inflectors --save

Usage

  • Import the library
// javascript
const Inflectors = require("en-inflectors").Inflectors;
// typescript
import { Inflectors } from "en-inflectors";
  • Instantiate the class
let instance = new Inflectors("book");
  • Adjective Inflection
let instance = new Inflectors("big");
instance.comparative(); // bigger
instance.superlative(); // biggest
  • Verb Conjugation
new Inflectors("rallied").conjugate("VBP"); // rally
new Inflectors("fly").conjugate("VBD"); // flew
new Inflectors("throw").conjugate("VBN"); // thrown
new Inflectors("rally").conjugate("VBS"); // rallies
new Inflectors("die").conjugate("VBG"); // dying

// or you can use the aliases
new Inflectors("rallied").toPresent(); // rally
new Inflectors("fly").toPast(); // flew
new Inflectors("throw").toPastParticiple(); // thrown
new Inflectors("rally").toPresentS(); // rallies
new Inflectors("die").toGerund(); // dying
  • Noun Inflection
const instanceA = new Inflectors("bus");
const instanceB = new Inflectors("ellipses");
const instanceC = new Inflectors("money");

instanceA.isCountable(); // true
instanceB.isCountable(); // true
instanceC.isCountable(); // false

instanceA.isNotCountable(); // false
instanceB.isNotCountable(); // false
instanceC.isNotCountable(); // true

instanceA.isSingular(); // true
instanceB.isSingular(); // false
instanceC.isSingular(); // true

instanceA.isPlural(); // false
instanceB.isPlural(); // true
instanceC.isPlural(); // true

// note that uncountable words return true
// on both plural and singular checks


instanceA.toSingular(); // bus (no change)
instanceB.toSingular(); // ellipsis
instanceC.toSingular(); // money (no change)


instanceA.toPlural(); // buses
instanceB.toPlural(); // ellipses (no change)
instanceC.toPlural(); // money (no change)

How does it work?

  • Adjective inflection

    1. Checks against a dictionary of known irregularities (e.g. little/less/least)
    2. Applies inflection based on:
      • Number of syllables
      • Word ending
  • Noun inflection

    1. Dictionary lookup (known irregularities e.g. octopus/octopi & uncountable words)
    2. Identifies whether the word is plural or singular based on:
      • Dictionary
      • Machine learned regular expressions
    3. Applies transformation based on ending and word pattern (vowels, consonants and word endings)
  • Verb conjugation

    1. Dictionary lookup (known irregularities + 4000 common verbs)
    2. If the passed verb is identified as infinitive, it then applies regular expression transformations that are based on word endings, vowels and consonant phonetics.
    3. Tries to trim characters from the beginning of the verb until the remainder is a known verb, which handles prefixed verbs (e.g. undergoes, overthrown)
    4. Tries to stem the word and get the infinitive form, then apply regular expression transformations.
    5. Applies regular expressions.
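
The prefix-trimming step of verb conjugation (step 3 above) can be illustrated with a minimal sketch. This is not the library's actual code: the table `IRREGULAR_VERBS` and the functions `splitPrefix` and `toPastSketch` are hypothetical names, and the three dictionary entries are only there for illustration.

```typescript
// Hypothetical irregular-verb table: [infinitive, past, past participle].
// A real dictionary would be far larger; these entries are illustrative only.
const IRREGULAR_VERBS: Array<[string, string, string]> = [
  ["go", "went", "gone"],
  ["throw", "threw", "thrown"],
  ["take", "took", "taken"],
];

// Index every known form by its infinitive, and every infinitive by its past form.
const toInfinitive = new Map<string, string>();
const pastOf = new Map<string, string>();
IRREGULAR_VERBS.forEach((entry) => {
  const infinitive = entry[0];
  const past = entry[1];
  const participle = entry[2];
  toInfinitive.set(infinitive, infinitive);
  toInfinitive.set(past, infinitive);
  toInfinitive.set(participle, infinitive);
  pastOf.set(infinitive, past);
});

// Trim one character at a time from the front of the verb until the
// remainder is a known form. Returns the trimmed prefix and the infinitive
// of the remainder, or null when nothing in the dictionary matches.
function splitPrefix(verb: string): { prefix: string; infinitive: string } | null {
  for (let i = 0; i < verb.length - 1; i++) {
    const infinitive = toInfinitive.get(verb.slice(i));
    if (infinitive !== undefined) {
      return { prefix: verb.slice(0, i), infinitive };
    }
  }
  return null;
}

// Conjugate the known part, then re-attach the prefix.
function toPastSketch(verb: string): string | null {
  const parts = splitPrefix(verb);
  if (parts === null) return null;
  const past = pastOf.get(parts.infinitive);
  return past === undefined ? null : parts.prefix + past;
}

console.log(toPastSketch("overthrown")); // overthrew
console.log(toPastSketch("undergo")); // underwent
```

Trimming from the front means the longest matching remainder is tried first, so a verb that is itself in the dictionary is matched whole before any shorter remainder is considered.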

How accurate is it?

First of all, unless you have a dictionary of all the words and verbs that exist in English, you can't really write a regular expression or an algorithm and expect to have a 100% success rate. English has been adopting words from a lot of different languages (French, Greek and Latin for example), and each one of these languages has its own rules of pluralization and singularization, let alone verb conjugation.

Even with dictionaries, you'll still have the problem of compound and made-up words like maskedlocation, and you might have to add dictionaries for specialized fields (such as medicine, which does have its own dictionary).

However, I think what you'll find in this library is what can be achieved with the least amount of compromise.

I've used a set of rules (for detection/transformation) in combination with an exceptions list.
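
As a rough illustration of combining rules with an exceptions list, here is a minimal noun pluralization sketch. It is an assumption about the general shape of such an approach, not the library's actual rule set: `IRREGULAR_PLURALS`, `UNCOUNTABLE`, `PLURAL_RULES` and `pluralizeSketch` are hypothetical names, and the entries are a tiny illustrative sample.

```typescript
// Exceptions are checked first. Illustrative entries only.
const IRREGULAR_PLURALS = new Map<string, string>([
  ["octopus", "octopi"],
  ["ellipsis", "ellipses"],
  ["child", "children"],
]);
const UNCOUNTABLE = new Set<string>(["money", "information"]);

// Ordered rules: the first pattern that matches wins.
const PLURAL_RULES: Array<[RegExp, string]> = [
  [/(s|x|z|ch|sh)$/, "$1es"], // bus -> buses
  [/([^aeiou])y$/, "$1ies"], // city -> cities
  [/$/, "s"], // book -> books (fallback, always matches)
];

function pluralizeSketch(noun: string): string {
  const word = noun.toLowerCase();
  // 1. Uncountable words never change.
  if (UNCOUNTABLE.has(word)) return word;
  // 2. Known irregular forms come from the exceptions list.
  const irregular = IRREGULAR_PLURALS.get(word);
  if (irregular !== undefined) return irregular;
  // 3. Otherwise apply the first matching ending-based rule.
  for (const [pattern, replacement] of PLURAL_RULES) {
    if (pattern.test(word)) return word.replace(pattern, replacement);
  }
  return word;
}

console.log(pluralizeSketch("bus")); // buses
console.log(pluralizeSketch("octopus")); // octopi
console.log(pluralizeSketch("money")); // money
```

Checking the exceptions before the rules matters: without it, "octopus" would hit the first rule and come out as "octopuses" instead of the dictionary form.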

Testing the library was more challenging than anticipated. If you find any inaccuracies or false positives, please submit an issue.

And of course, you can clone this repository, install mocha and run the tests yourself to see it pass all 9900 of them.

License

License: The MIT License (MIT) - Copyright (c) 2017 Alex Corvi
