Jamdict
Jamdict is a Python 3 library for manipulating Jim Breen's JMdict, KanjiDic2, JMnedict and kanji-radical mappings.
Documentation: https://jamdict.readthedocs.io/
Main features
- Support querying different Japanese language resources
  - Japanese-English dictionary JMdict
  - Kanji dictionary KanjiDic2
  - Kanji-radical and radical-kanji maps KRADFILE/RADKFILE
  - Japanese Proper Names Dictionary (JMnedict)
- Fast look up (dictionaries are stored in SQLite databases)
- Command-line lookup tool (Example)
Contributors are welcome!
Try Jamdict out
Jamdict is used in Jamdict-web, a free and open-source web-based Japanese reading assistant. Try the demo instance online at:
https://jamdict.herokuapp.com/
There is also a demo Jamdict virtual machine on Repl.it for trying out Jamdict Python code online:
https://replit.com/@tuananhle/jamdict-demo
Installation
Jamdict and the Jamdict database are both available on PyPI and can be installed using pip:
pip install --upgrade jamdict jamdict-data
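After installing, a quick way to confirm that both packages are importable is to ask Python's import machinery for them. This is a minimal sketch using only the standard library; `is_installed` is a hypothetical helper, and the module name `jamdict_data` for the `jamdict-data` package is an assumption.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# "jamdict_data" is assumed to be the import name of the jamdict-data package.
for name in ("jamdict", "jamdict_data"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports `missing`, rerunning the pip command above in the same Python environment is the first thing to try.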
Sample jamdict Python code
from jamdict import Jamdict
jam = Jamdict()
# use wildcard matching to find anything that starts with 食べ and ends with る
result = jam.lookup('食べ%る')
# print all word entries
for entry in result.entries:
    print(entry)
# [id#1358280] たべる (食べる) : 1. to eat ((Ichidan verb|transitive verb)) 2. to live on (e.g. a salary)/to live off/to subsist on
# [id#1358300] たべすぎる (食べ過ぎる) : to overeat ((Ichidan verb|transitive verb))
# [id#1852290] たべつける (食べ付ける) : to be used to eating ((Ichidan verb|transitive verb))
# [id#2145280] たべはじめる (食べ始める) : to start eating ((Ichidan verb))
# [id#2449430] たべかける (食べ掛ける) : to start eating ((Ichidan verb))
# [id#2671010] たべなれる (食べ慣れる) : to be used to eating/to become used to eating/to be accustomed to eating/to acquire a taste for ((Ichidan verb))
# [id#2765050] たべられる (食べられる) : 1. to be able to eat ((Ichidan verb|intransitive verb)) 2. to be edible/to be good to eat ((pre-noun adjectival (rentaishi)))
# [id#2795790] たべくらべる (食べ比べる) : to taste and compare several dishes (or foods) of the same type ((Ichidan verb|transitive verb))
# [id#2807470] たべあわせる (食べ合わせる) : to eat together (various foods) ((Ichidan verb))
# print all related characters
for c in result.chars:
    print(repr(c))
# 食:9:eat,food
# 喰:12:eat,drink,receive (a blow),(kokuji)
# 過:12:overdo,exceed,go beyond,error
# 付:5:adhere,attach,refer to,append
# 始:8:commence,begin
# 掛:11:hang,suspend,depend,arrive at,tax,pour
# 慣:14:accustomed,get used to,become experienced
# 比:4:compare,race,ratio,Philippines
# 合:6:fit,suit,join,0.1
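The `%` in the lookup pattern above reads like an SQL `LIKE` wildcard. Since the dictionaries are stored in SQLite, the following standard-library sketch shows how such a pattern behaves there. It assumes the wildcard follows `LIKE` semantics; the `words` table and its columns are hypothetical and do not reflect jamdict's real schema.

```python
import sqlite3

# Toy in-memory table. The table and column names are hypothetical and do not
# reflect jamdict's real database schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (idseq INTEGER PRIMARY KEY, kanji TEXT)")
conn.executemany(
    "INSERT INTO words (idseq, kanji) VALUES (?, ?)",
    [
        (1358280, "食べる"),
        (1358300, "食べ過ぎる"),
        (1264430, "言語学"),
        (1194580, "花火"),
    ],
)

# '%' matches any run of characters, including an empty one, so the pattern
# matches both the bare verb and longer compounds.
pattern = "食べ%る"
matches = [
    kanji
    for (kanji,) in conn.execute(
        "SELECT kanji FROM words WHERE kanji LIKE ? ORDER BY idseq", (pattern,)
    )
]
print(matches)  # ['食べる', '食べ過ぎる']
```

Because `%` can match nothing at all, 食べる itself is included alongside the longer compounds.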
Command line tools
To make sure that jamdict is configured properly, try looking up a word from the command line:
python3 -m jamdict lookup 言語学
========================================
Found entries
========================================
Entry: 1264430 | Kj: 言語学 | Kn: げんごがく
--------------------
1. linguistics ((noun (common) (futsuumeishi)))
========================================
Found characters
========================================
Char: 言 | Strokes: 7
--------------------
Readings: yan2, eon, 언, Ngôn, Ngân, ゲン, ゴン, い.う, こと
Meanings: say, word
Char: 語 | Strokes: 14
--------------------
Readings: yu3, yu4, eo, 어, Ngữ, Ngứ, ゴ, かた.る, かた.らう
Meanings: word, speech, language
Char: 学 | Strokes: 8
--------------------
Readings: xue2, hag, 학, Học, ガク, まな.ぶ
Meanings: study, learning, science
No name was found.
Using KRAD/RADK mapping
Jamdict has built-in support for KRAD/RADK (i.e. kanji-radical and radical-kanji mapping). The terminology of radicals/components used by Jamdict may differ from that used elsewhere.
- A radical in Jamdict is a principal component; each character has only one radical.
- A character may be decomposed into several writing components.
By default jamdict provides two maps:
- jam.krad is a Python dict that maps each character to a list of components.
- jam.radk is a Python dict that maps each available component to a list of characters.
# Find all writing components (often called "radicals") of the character 雲
print(jam.krad['雲'])
# ['一', '雨', '二', '厶']
# Find all characters with the component 鼎
chars = jam.radk['鼎']
print(chars)
# {'鼏', '鼒', '鼐', '鼎', '鼑'}
# look up the characters info
result = jam.lookup(''.join(chars))
for c in result.chars:
    print(c, c.meanings())
# 鼏 ['cover of tripod cauldron']
# 鼒 ['large tripod cauldron with small']
# 鼐 ['incense tripod']
# 鼎 ['three legged kettle']
# 鼑 []
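The two maps are inverses of each other, which can be sketched with plain Python dicts and no jamdict installation. The helper `invert_krad` is hypothetical and not part of jamdict; the entry for 雲 is taken from the example above, while the entry for 雪 is an illustrative assumption.

```python
from collections import defaultdict


def invert_krad(krad):
    """Build a component -> characters map from a character -> components map."""
    radk = defaultdict(set)
    for char, components in krad.items():
        for component in components:
            radk[component].add(char)
    return dict(radk)


# Toy data: 雲 comes from the example above; 雪 is an illustrative assumption.
krad = {
    "雲": ["一", "雨", "二", "厶"],
    "雪": ["雨", "ヨ"],
}

radk = invert_krad(krad)
print(sorted(radk["雨"]))  # every toy character that contains the component 雨
print(sorted(radk["厶"]))  # only 雲 contains 厶 in this toy data
```

Sets are used for the values here so that a character is listed only once per component, regardless of how the input lists are ordered.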
Finding named entities
# Find all names with 鈴木 inside
result = jam.lookup('%鈴木%')
for name in result.names:
    print(name)
# [id#5025685] キューティーすずき (キューティー鈴木) : Kyu-ti- Suzuki (1969.10-) (full name of a particular person)
# [id#5064867] パパイヤすずき (パパイヤ鈴木) : Papaiya Suzuki (full name of a particular person)
# [id#5089076] ラジカルすずき (ラジカル鈴木) : Rajikaru Suzuki (full name of a particular person)
# [id#5259356] きつねざきすずきひなた (狐崎鈴木日向) : Kitsunezakisuzukihinata (place name)
# [id#5379158] こすずき (小鈴木) : Kosuzuki (family or surname)
# [id#5398812] かみすずき (上鈴木) : Kamisuzuki (family or surname)
# [id#5465787] かわすずき (川鈴木) : Kawasuzuki (family or surname)
# [id#5499409] おおすずき (大鈴木) : Oosuzuki (family or surname)
# [id#5711308] すすき (鈴木) : Susuki (family or surname)
# ...
Exact matching
Use exact matching for faster search.
Find the word 花火 by idseq (1194580):
>>> result = jam.lookup('id#1194580')
>>> print(result.entries[0])
[id#1194580] はなび (花火) : fireworks ((noun (common) (futsuumeishi)))
Find an exact name 花火 by idseq (5170462):
>>> result = jam.lookup('id#5170462')
>>> print(result.names[0])
[id#5170462] はなび (花火) : Hanabi (female given name or forename)
See jamdict_demo.py and jamdict/tools.py for more information.
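One plausible reason why exact lookups are faster is that, in a SQLite store, a key equality test can use an index while a wildcard pattern generally forces a scan of every row. The sketch below illustrates that general SQLite behaviour with the standard library only. The `names` table is hypothetical and is not jamdict's real schema, and nothing here is taken from jamdict's implementation.

```python
import sqlite3

# Hypothetical toy table; not jamdict's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (idseq INTEGER PRIMARY KEY, kanji TEXT)")
conn.executemany(
    "INSERT INTO names (idseq, kanji) VALUES (?, ?)",
    [(5170462, "花火"), (5711308, "鈴木")],
)


def plan(sql, params):
    """Return SQLite's description of how it would execute the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return rows[0][3]  # the fourth column holds the human-readable detail


by_id_plan = plan("SELECT kanji FROM names WHERE idseq = ?", (5170462,))
by_pattern_plan = plan("SELECT kanji FROM names WHERE kanji LIKE ?", ("%鈴木%",))

# The exact wording varies between SQLite versions, but a key lookup is
# reported as a SEARCH and an unindexed pattern match as a full-table SCAN.
print(by_id_plan)
print(by_pattern_plan)
```

On a two-row table the difference is invisible, but on a dictionary with hundreds of thousands of entries a full scan costs noticeably more than a key lookup.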
Useful links
- JMdict: http://edrdg.org/jmdict/edict_doc.html
- kanjidic2: https://www.edrdg.org/wiki/index.php/KANJIDIC_Project
- JMnedict: https://www.edrdg.org/enamdict/enamdict_doc.html
- KRADFILE: http://www.edrdg.org/krad/kradinf.html
Contributors
- Le Tuan Anh (Maintainer)
- alt-romes
- Matteo Fumagalli
- Reem Alghamdi
- Techno-coder