  • Stars: 340
  • Rank 124,317 (Top 3%)
  • Language: TypeScript
  • License: MIT License
  • Created almost 8 years ago
  • Updated almost 4 years ago

Repository Details

📦 English inflection library for noun (plural to singular and singular to plural), verb (gerund, present & past) and adjectives (comparative & superlative) transformations/conjugation.

English Inflectors Library

For noun (plural to singular and singular to plural), verb (gerund, present & past) and adjective (comparative, superlative) transformations.

Demo

Here's a quick demo: http://en-inflectors.surge.sh/

Installation

npm install en-inflectors --save

Usage

  • Import the library
// javascript
const Inflectors = require("en-inflectors").Inflectors;
// typescript
import { Inflectors } from "en-inflectors";
  • Instantiate the class
let instance = new Inflectors("book");
  • Adjective Inflection
let instance = new Inflectors("big");
instance.comparative(); // bigger
instance.superlative(); // biggest
  • Verb Conjugation
new Inflectors("rallied").conjugate("VBP"); // rally
new Inflectors("fly").conjugate("VBD"); // flew
new Inflectors("throw").conjugate("VBN"); // thrown
new Inflectors("rally").conjugate("VBZ"); // rallies
new Inflectors("die").conjugate("VBG"); // dying

// or you can use the aliases
new Inflectors("rallied").toPresent(); // rally
new Inflectors("fly").toPast(); // flew
new Inflectors("throw").toPastParticiple(); // thrown
new Inflectors("rally").toPresentS(); // rallies
new Inflectors("die").toGerund(); // dying
  • Noun Inflection
const instanceA = new Inflectors("bus");
const instanceB = new Inflectors("ellipses");
const instanceC = new Inflectors("money");

instanceA.isCountable(); // true
instanceB.isCountable(); // true
instanceC.isCountable(); // false

instanceA.isNotCountable(); // false
instanceB.isNotCountable(); // false
instanceC.isNotCountable(); // true

instanceA.isSingular(); // true
instanceB.isSingular(); // false
instanceC.isSingular(); // true

instanceA.isPlural(); // false
instanceB.isPlural(); // true
instanceC.isPlural(); // true

// note that uncountable words return true
// on both plural and singular checks


instanceA.toSingular(); // bus (no change)
instanceB.toSingular(); // ellipsis
instanceC.toSingular(); // money (no change)


instanceA.toPlural(); // buses
instanceB.toPlural(); // ellipses (no change)
instanceC.toPlural(); // money (no change)
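
As one possible way to put the noun methods above to use, the sketch below formats a count with the matching noun form. `NounForms` and `formatCount` are hypothetical names, not part of the library. The interface only mirrors the `toSingular()` and `toPlural()` methods shown above, on the assumption that both return strings as the comments suggest.

```typescript
// Hypothetical interface: mirrors only the two noun methods shown above.
interface NounForms {
  toSingular(): string;
  toPlural(): string;
}

// Hypothetical helper: picks the singular form for a count of exactly 1
// and the plural form for every other count (including 0).
function formatCount(count: number, noun: NounForms): string {
  const form = count === 1 ? noun.toSingular() : noun.toPlural();
  return `${count} ${form}`;
}

// Stand-in object so the sketch runs without the library installed.
const bus: NounForms = {
  toSingular: () => "bus",
  toPlural: () => "buses",
};

console.log(formatCount(1, bus));
console.log(formatCount(3, bus));
```

With the library installed, an `Inflectors` instance could presumably be passed in place of the stand-in object.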

How does it work?

  • Adjective inflection

    1. Checks against a dictionary of known irregularities (e.g. little/less/least)
    2. Applies inflection based on:
      • Number of syllables
      • Word ending
  • Noun inflection

    1. Dictionary lookup (known irregularities e.g. octopus/octopi & uncountable words)
    2. Identifies whether the word is plural or singular based on:
      • Dictionary
      • Machine learned regular expressions
    3. Applies transformation based on ending and word pattern (vowels, consonants and word endings)
  • Verb conjugation

    1. Dictionary lookup (known irregularities + 4000 common verbs)
    2. If the passed verb is identified as infinitive, it then applies regular expression transformations that are based on word endings, vowels and consonant phonetics.
    3. Tries to trim characters from the beginning of the verb, which handles prefixed verbs (e.g. undergoes, overthrown)
    4. Tries to stem the word and get the infinitive form, then apply regular expression transformations.
    5. Applies regular expressions.
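
To make the verb steps more concrete, here is a minimal sketch of past-tense conjugation. It is not the library's code. It covers only dictionary lookup, prefix trimming and ending-based rules, and it tries the prefix step before the regular rules, which is an assumption made to keep the example short. The names `IRREGULAR_PAST`, `VERB_PREFIXES`, `regularPast` and `toPastSketch` are hypothetical, and the word lists are tiny illustrative subsets.

```typescript
// Illustrative dictionary of irregular past forms (assumption: a real
// dictionary would be far larger).
const IRREGULAR_PAST = new Map<string, string>([
  ["go", "went"],
  ["throw", "threw"],
  ["take", "took"],
  ["fly", "flew"],
]);

// Illustrative prefix list, longest first so "under" is tried before "un".
const VERB_PREFIXES = ["under", "over", "out", "mis", "re", "un"];

// Simplified regular rules based on the word ending.
function regularPast(verb: string): string {
  if (verb.endsWith("e")) return verb + "d"; // bake -> baked
  if (/[^aeiou]y$/.test(verb)) return verb.slice(0, -1) + "ied"; // rally -> rallied
  return verb + "ed"; // walk -> walked
}

function toPastSketch(verb: string): string {
  // Dictionary lookup for known irregular verbs.
  const known = IRREGULAR_PAST.get(verb);
  if (known !== undefined) return known;

  // Trim a prefix and retry the dictionary on the remainder,
  // then re-attach the prefix to the irregular form.
  for (const prefix of VERB_PREFIXES) {
    if (verb.startsWith(prefix) && verb.length > prefix.length) {
      const base = IRREGULAR_PAST.get(verb.slice(prefix.length));
      if (base !== undefined) return prefix + base;
    }
  }

  // Fall back to the ending-based rules.
  return regularPast(verb);
}

console.log(toPastSketch("undergo"));
console.log(toPastSketch("overthrow"));
console.log(toPastSketch("rally"));
```

The sketch leaves out stemming and consonant doubling (e.g. stop/stopped), so it will mishandle many verbs that a fuller rule set would cover.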

How accurate is it?

First of all, unless you have a dictionary of all the words and verbs that exist in English, you can't really write a regular expression or an algorithm and expect to have a 100% success rate. English has been adopting words from a lot of different languages (French, Greek and Latin for example), and each one of these languages has its own rules of pluralization and singularization, let alone verb conjugation.

Even with dictionaries, you'll still run into complex and made-up words like maskedlocation, and you might have to add dictionaries for specialized fields (like medicine, which actually has its own dictionary).

However, I think what you'll find in this library is what can be achieved with the least amount of compromise.

I've used a set of rules (for detection/transformation) in combination with an exceptions list.
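
The general idea of combining rules with an exceptions list can be sketched as follows for noun pluralization. This is an illustration under simplifying assumptions, not the library's implementation: the word lists and patterns are small hand-picked samples, and the names `IRREGULAR_PLURALS`, `UNCOUNTABLE`, `PLURAL_RULES` and `pluralizeSketch` are hypothetical.

```typescript
// Exceptions list: irregular plurals (illustrative subset only).
const IRREGULAR_PLURALS = new Map<string, string>([
  ["child", "children"],
  ["person", "people"],
  ["mouse", "mice"],
]);

// Uncountable words are returned unchanged (illustrative subset only).
const UNCOUNTABLE = new Set<string>(["money", "information", "rice"]);

// Ordered rules: the first pattern that matches the word wins.
const PLURAL_RULES: Array<[RegExp, string]> = [
  [/(s|x|z|ch|sh)$/, "$1es"], // bus -> buses, box -> boxes
  [/([^aeiou])y$/, "$1ies"], // city -> cities
  [/$/, "s"], // book -> books (default rule, always matches)
];

function pluralizeSketch(word: string): string {
  // Exceptions are checked before any rule is applied.
  if (UNCOUNTABLE.has(word)) return word;
  const irregular = IRREGULAR_PLURALS.get(word);
  if (irregular !== undefined) return irregular;

  // Apply the first matching ending-based rule.
  for (const [pattern, replacement] of PLURAL_RULES) {
    if (pattern.test(word)) return word.replace(pattern, replacement);
  }
  return word;
}

console.log(pluralizeSketch("bus"));
console.log(pluralizeSketch("city"));
console.log(pluralizeSketch("money"));
```

Unlike the library, this sketch does not detect whether the input is already plural, so it should only be given singular words.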

Testing the library, however, was more challenging than anticipated. If you encounter any inaccuracies or false positives, please submit an issue.

And of course, you can clone this repository, install mocha and run the tests yourself to see it pass all 9900 of them.

License

License: The MIT License (MIT) - Copyright (c) 2017 Alex Corvi

More Repositories

    1. synonyms (JavaScript, 64 stars): 📦 JavaScript library to return the synonyms of the word ~ 27779 words
    2. en-pos (TypeScript, 53 stars): ⚙️ [Processor] A better English POS tagger written in JavaScript
    3. fin (TypeScript, 44 stars): 🚀 Node.js Natural Language Processor written in TypeScript
    4. humannames (JavaScript, 36 stars): 📦 A list, huge one (~200K) of human male/female first/last names.
    5. spelling-variations (TypeScript, 14 stars): 📦 Spelling variations library, with US & UK variations, frequency scores & preferred spellings.
    6. lemmatizer (TypeScript, 13 stars): 📦 English word lemmatizer
    7. cities-list (JavaScript, 11 stars): 📦 A list, huge one (~80K), of cities
    8. en-lexicon (TypeScript, 7 stars): 📦 Extensible English language lexicon for POS tagging with Emojis and around 110K words
    9. fin-sentiment (TypeScript, 4 stars): 🔍 [Detector] sentiment detection for FIN NLP
    10. en-parse (TypeScript, 4 stars): ⚙️ [Processor] English dependency parser written in javascript (work in progress)
    11. lexed (TypeScript, 3 stars): ⚙️ [Processor] Multi-lingual, extensible word and sentence tokenizer, for natural language processing.
    12. fin-slang (TypeScript, 3 stars): 🔩 [extension] Slang decoder
    13. fin-emphasis (TypeScript, 2 stars): 🔍 [Detector] Emphasis detection for FIN NLP
    14. fin-urls (TypeScript, 2 stars): 🔍 [detector] Detecting URLs IPs and emails
    15. strdistance (TypeScript, 1 star): String distance calculator
    16. en-norm (TypeScript, 1 star): English tokens normalizer
    17. fin-sentence-type (TypeScript, 1 star): 🔍 [Detector] Sentence type detection for FIN
    18. fin-negation (TypeScript, 1 star): 🔍 [Detector] Negation detector for FIN NLP
    19. documentation (1 star): 📄 Fin Natural language processing documentation
    20. fin-ukus (TypeScript, 1 star): 🔍 [detector] UK US spelling detection extension for FIN NLP