  • Stars
    220
  • Rank 180,422 (Top 4 %)
  • Language
    Python
  • License
    MIT License
  • Created about 11 years ago
  • Updated over 1 year ago


Repository Details

A small module meant for use in text generators that lets you filter strings for bad words.

wordfilter

A small module meant for use in text generators. It lets you filter strings for bad words.

Getting Started

Install the module with: npm install wordfilter

var wordfilter = require('wordfilter');
wordfilter.blacklisted('does this string have a bad word in it?'); // false

// clear the list entirely
wordfilter.clearList();

// add new words
wordfilter.addWords(['zebra','elephant']);
wordfilter.blacklisted('this string has zebra in it'); // true

// remove a word
wordfilter.removeWord('zebra');
wordfilter.blacklisted('this string has zebra in it'); // false

Or, with Python, install the module with: pip install wordfilter

from wordfilter import Wordfilter
wordfilter = Wordfilter()
wordfilter.blacklisted('does this string have a bad word in it?')  # False

# clear the list entirely
wordfilter.clearList()

# add new words
wordfilter.addWords(['zebra','elephant'])
wordfilter.blacklisted('this string has zebra in it')  # True
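
Since the module is meant for text generators, a common pattern is to discard generated candidates until one passes the filter. The following is a minimal sketch of that pattern, not part of the package: `first_clean` and `demo_is_blacklisted` are hypothetical names, and the stand-in predicate only exists so the example runs without the package installed. With the package available, `Wordfilter().blacklisted` could presumably be passed in its place.

```python
from typing import Callable, Iterable, Optional


def first_clean(candidates: Iterable[str],
                is_blacklisted: Callable[[str], bool]) -> Optional[str]:
    """Return the first candidate that passes the filter, or None if none do."""
    for text in candidates:
        if not is_blacklisted(text):
            return text
    return None


def demo_is_blacklisted(text: str) -> bool:
    # Hypothetical stand-in predicate for illustration only.
    return "zebra" in text.lower()


candidates = ["a Zebra crossed the road", "an owl crossed the road"]
print(first_clean(candidates, demo_is_blacklisted))  # an owl crossed the road
```

Returning None when every candidate is rejected lets the caller decide whether to generate more text or skip posting altogether.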

Documentation

This is a word filter adapted from code that I use in a lot of my Twitter bots. It is based on a list of words that I've hand-picked for exclusion from my bots: essentially, it's a list of things that I would not say myself. Generally speaking, they are "words of oppression", that is, racist, sexist, or ableist things that I would not say.

The list is not all-inclusive, and I'm always adding words to it. If you'd like to file an issue or a pull request to add more words, please do so, but understand that this is primarily for use in my own projects, and I may not agree to add certain words. (For example, I have no problem with scatological words, so "shit" and "fuck" will never be on this list.)

Matching is case-insensitive.

Also note that due to the complexities of the English language, I treat any string that contains a bad word as a substring as blacklisted. For example, even though "homogenous" is not a bad word, it contains the substring "homo", so it gets filtered. The reason for this is that new slang built from compound words pops up all the time and I can't possibly keep up with it. I'm willing to lose a few words like "homogenous" and "Pakistan" in order to avoid false negatives.
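
The matching rule described above (case-insensitive, substring-based) can be sketched in a few lines. This is an illustration under assumptions, not the module's actual source: `SubstringFilter` is a hypothetical class, and the word list uses the harmless "zebra" from the examples earlier.

```python
class SubstringFilter:
    """Hypothetical stand-in that mimics the described matching rule."""

    def __init__(self, words=()):
        # Store the list lowercased so comparisons ignore case.
        self.words = [w.lower() for w in words]

    def blacklisted(self, text: str) -> bool:
        lowered = text.lower()
        # Substring test: a listed word anywhere inside the text counts as a hit.
        return any(word in lowered for word in self.words)


f = SubstringFilter(["zebra"])
print(f.blacklisted("A ZEBRA appeared"))     # True  (case is ignored)
print(f.blacklisted("zebrafish are small"))  # True  (substring of a longer word)
print(f.blacklisted("a horse appeared"))     # False
```

The second call shows the trade-off: an innocent compound word is rejected because it contains a listed word, which is the false-positive cost accepted in exchange for fewer false negatives.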

Contributing

In lieu of a formal style guide, take care to maintain the existing coding style. Add unit tests for any new or changed functionality. Lint and test your code using Grunt.

License

Copyright (c) 2013 Darius Kazemi. Licensed under the MIT license.
