Detect the language of text.
What’s so cool about franc?
- franc can support more languages (†) than any other library
- franc is packaged with support for 82, 187, or 414 languages
- franc has a CLI
† - Based on the UDHR, the most translated copyright-free document in the world.
What’s not so cool about franc?
franc supports many languages, which means it’s easily confused on small samples. Make sure to pass it big documents to get reliable results.
Install
👉 Note: this installs the franc package, with support for 187 languages (languages which have 1 million or more speakers). franc-min (82 languages, 8 million or more speakers) and franc-all (all 414 possible languages) are also available. Finally, use franc-cli to install the CLI.
This package is ESM only. In Node.js (version 14.14+, 16.0+), install with npm:
npm install franc
In Deno with esm.sh:
import {franc, francAll} from 'https://esm.sh/franc@6'
In browsers with esm.sh:
<script type="module">
import {franc, francAll} from 'https://esm.sh/franc@6?bundle'
</script>
Use
import {franc, francAll} from 'franc'
franc('Alle menslike wesens word vry') //=> 'afr'
franc('এটি একটি ভাষা একক IBM স্ক্রিপ্ট') //=> 'ben'
franc('Alle menneske er fødde til fridom') //=> 'nno'
franc('') //=> 'und' (language code that stands for undetermined)
// You can change what’s too short (default: 10):
franc('the') //=> 'und'
franc('the', {minLength: 3}) //=> 'sco'
console.log(francAll('Considerando ser essencial que os direitos humanos'))
//=> [['por', 1], ['glg', 0.771284519307895], ['spa', 0.6034146900423971], …123 more items]
console.log(francAll('Considerando ser essencial que os direitos humanos', {only: ['por', 'spa']}))
//=> [['por', 1], ['spa', 0.6034146900423971]]
console.log(francAll('Considerando ser essencial que os direitos humanos', {ignore: ['spa', 'glg']}))
//=> [['por', 1], ['cat', 0.5367251059928957], ['src', 0.47461899851037015], …121 more items]
API
This package exports the identifiers franc and francAll.
There is no default export.
franc(value[, options])
Get the most probable language for the given value.
Parameters
- value (string): value to test
- options (Options, optional): configuration
Returns
The most probable language (string).
francAll(value[, options])
Get a list of probable languages for the given value.
Parameters
- value (string): value to test
- options (Options, optional): configuration
Returns
Array containing [language, distance] tuples (Array<[string, number]>).
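Because franc is easily confused on small samples, it can help to check how far the top guess is ahead of the runner-up before trusting it. The following is a minimal sketch, not part of franc: pickConfident and its minMargin parameter are hypothetical names, and the helper assumes the tuples arrive sorted best-first with the top guess scored 1, as in the francAll output shown under Use.

```javascript
// Hypothetical helper, not part of franc: accept the top guess only when it
// is clearly ahead of the runner-up.
// `guesses` is assumed to have the shape francAll returns,
// [[code, score], ...], sorted with the best guess first.
function pickConfident(guesses, minMargin = 0.2) {
  if (guesses.length === 0) return 'und'
  const [bestCode, bestScore] = guesses[0]
  if (bestCode === 'und') return 'und'
  // With a single candidate there is no runner-up to compare against.
  if (guesses.length === 1) return bestCode
  const runnerUpScore = guesses[1][1]
  return bestScore - runnerUpScore >= minMargin ? bestCode : 'und'
}

// Sample data copied from the francAll output shown under Use.
const sample = [
  ['por', 1],
  ['glg', 0.771284519307895],
  ['spa', 0.6034146900423971]
]

console.log(pickConfident(sample)) // margin is about 0.229, so 'por'
console.log(pickConfident(sample, 0.3)) // margin is below 0.3, so 'und'
```

In real use the input would come from francAll(value), for example pickConfident(francAll(text)). The 0.2 default is an arbitrary starting point rather than a recommended value, so tune it against your own data.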
Options
Configuration (Object, optional) with the following fields:
options.only
Languages to allow (Array<string>, optional).
options.ignore
Languages to ignore (Array<string>, optional).
options.minLength
Minimum length to accept (number, default: 10).
CLI
Install:
npm install franc-cli --global
Use:
CLI to detect the language of text
Usage: franc [options] <string>
Options:
-h, --help output usage information
-v, --version output version number
-m, --min-length <number> minimum length to accept
-o, --only <string> allow languages
-i, --ignore <string> disallow languages
-a, --all display all guesses
Usage:
# output language
$ franc "Alle menslike wesens word vry"
# afr
# output language from stdin (expects utf8)
$ echo "এটি একটি ভাষা একক IBM স্ক্রিপ্ট" | franc
# ben
# ignore certain languages
$ franc --ignore por,glg "O Brasil caiu 26 posições"
# src
# output language from stdin with only
$ echo "Alle mennesker er født frie og" | franc --only nob,dan
# nob
Data
Supported languages
Package | Languages | Speakers |
---|---|---|
franc-min | 82 | 8M or more |
franc | 187 | 1M or more |
franc-all | 414 | - |
Language code
👉 Note: franc returns ISO 639-3 codes (three letter codes). Not ISO 639-1 or ISO 639-2. See also GH-10 and GH-30.
To get more info about the languages represented by ISO 639-3, use iso-639-3. There is also an index available to map ISO 639-3 to ISO 639-1 codes, iso-639-3/to-1.json, but note that not all 639-3 codes can be represented in 639-1.
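To illustrate the mapping and its gaps, here is a minimal sketch that uses a small hand-written table in place of the real index. The to1Subset object and the toIso6391 helper are hypothetical names made up for this example. In practice you would load the full index from iso-639-3/to-1.json instead.

```javascript
// Hand-written subset for illustration only; the real index in
// iso-639-3/to-1.json covers far more codes.
const to1Subset = {
  afr: 'af',
  ben: 'bn',
  nno: 'nn',
  nob: 'nb',
  por: 'pt',
  spa: 'es'
}

// Hypothetical helper: returns the two-letter ISO 639-1 code, or undefined
// when the ISO 639-3 code has no entry in the table.
function toIso6391(code, table = to1Subset) {
  // Own-property check so inherited keys such as 'toString' never match.
  return Object.prototype.hasOwnProperty.call(table, code)
    ? table[code]
    : undefined
}

console.log(toIso6391('por')) // 'pt'
console.log(toIso6391('sco')) // undefined: Scots has no ISO 639-1 code
console.log(toIso6391('und')) // undefined: nothing to map
```

Callers need to handle the undefined case explicitly, since franc can return codes such as sco that have no two-letter equivalent.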
Types
These packages are fully typed with TypeScript. They export the additional types TrigramTuple and Options.
Compatibility
These packages are at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+ and 16.0+. They also work in Deno and modern browsers.
Ports
Franc has been ported to several other programming languages.
- Elixir: paasaa
- Erlang: efranc
- Go: franco, whatlanggo
- R: franc
- Rust: whatlang-rs
- Dart: francd
- Python: pyfranc
The works franc is derived from have themselves also been ported to other languages.
Derivation
Franc is a derivative work from guess-language (Python, LGPL), guesslanguage (C++, LGPL), and Language::Guess (Perl, GPL). Their creators granted me the rights to distribute franc under the MIT license: respectively, Kent S. Johnson, Jacob R. Rideout, and Maciej Ceglowski.
Contribute
Yes please! See How to Contribute to Open Source.
Security
This package is safe.