  • Stars: 350
  • Rank: 121,229 (Top 3 %)
  • Language: Python
  • License: MIT License
  • Created about 9 years ago
  • Updated about 1 year ago

Repository Details

A Python Wiktionary Parser

Wiktionary Parser

A Python project that downloads words from English Wiktionary (en.wiktionary.org) and parses the articles' content into an easy-to-use JSON format. It currently parses etymologies, definitions, pronunciations, examples, audio links and related words.

JSON structure

[{
    "pronunciations": {
        "text": ["pronunciation text"],
        "audio": ["pronunciation audio"]
    },
    "definitions": [{
        "relatedWords": [{
            "relationshipType": "word relationship type",
            "words": ["list of related words"]
        }],
        "text": ["list of definitions"],
        "partOfSpeech": "part of speech",
        "examples": ["list of examples"]
    }],
    "etymology": "etymology text"
}]
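
As a minimal sketch of how this structure might be consumed, the snippet below walks a list shaped like the JSON above and groups definition texts by part of speech. The sample entry and the helper name `definitions_by_part_of_speech` are hypothetical and used only for illustration; real parser output would take the place of `sample`.

```python
from collections import defaultdict

# Hypothetical sample in the documented shape; not real parser output.
sample = [{
    "pronunciations": {"text": ["pronunciation text"], "audio": []},
    "definitions": [
        {
            "partOfSpeech": "noun",
            "text": ["A challenge or trial."],
            "examples": [],
            "relatedWords": [],
        },
        {
            "partOfSpeech": "verb",
            "text": ["To challenge.", "To examine."],
            "examples": [],
            "relatedWords": [],
        },
    ],
    "etymology": "etymology text",
}]


def definitions_by_part_of_speech(entries):
    """Group definition strings by part of speech across all entries."""
    grouped = defaultdict(list)
    for entry in entries:
        for definition in entry.get("definitions", []):
            # Fall back to "unknown" if an entry lacks a part of speech.
            key = definition.get("partOfSpeech", "unknown")
            grouped[key].extend(definition.get("text", []))
    return dict(grouped)


print(definitions_by_part_of_speech(sample))
```

Using `.get` with defaults keeps the helper tolerant of entries where a key is missing, which may or may not occur in practice.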

Installation

Using pip
  • run pip install wiktionaryparser
From Source
  • Clone the repo or download the zip
  • cd to the folder
  • run pip install -r "requirements.txt"

Usage

  • Import the WiktionaryParser class.
  • Initialize an object and use the fetch("word", "language") method.
  • The default language is English; it can be changed using the set_default_language method.
  • Include or exclude parts of speech to be parsed using include_part_of_speech(part_of_speech) and exclude_part_of_speech(part_of_speech).
  • Include or exclude relations to be parsed using include_relation(relation) and exclude_relation(relation).

Examples

>>> from wiktionaryparser import WiktionaryParser
>>> parser = WiktionaryParser()
>>> word = parser.fetch('test')
>>> another_word = parser.fetch('test', 'french')
>>> parser.set_default_language('french')
>>> parser.exclude_part_of_speech('noun')
>>> parser.include_relation('alternative forms')

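The calls in the examples can be wrapped in a small helper. The sketch below assumes only the fetch(word, language) method described under Usage and the JSON structure documented earlier. The helper name `first_definitions` and the `StubParser` class are hypothetical; the stub stands in for WiktionaryParser so the example runs without network access.

```python
class StubParser:
    """Hypothetical stand-in for WiktionaryParser; returns canned data."""

    def fetch(self, word, language=None):
        return [{
            "pronunciations": {"text": [], "audio": []},
            "definitions": [{
                "partOfSpeech": "noun",
                "text": [f"{word} (canned definition)"],
                "examples": [],
                "relatedWords": [],
            }],
            "etymology": "",
        }]


def first_definitions(parser, word, language=None):
    """Return (part of speech, first definition line) pairs for a word."""
    # Pass the language only when given, so the parser's default applies otherwise.
    entries = parser.fetch(word, language) if language else parser.fetch(word)
    pairs = []
    for entry in entries:
        for definition in entry.get("definitions", []):
            lines = definition.get("text", [])
            if lines:
                pairs.append((definition.get("partOfSpeech"), lines[0]))
    return pairs


print(first_definitions(StubParser(), "test"))
```

With the library installed, a WiktionaryParser() instance would presumably be passed in place of StubParser().
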
Requirements

  • requests==2.20.0
  • beautifulsoup4==4.4.0

Contributions

If you want to add features or improvements, or fix an issue you have found, feel free to send a pull request!

License

Wiktionary Parser is licensed under MIT.
