• Stars
    star
    112
  • Rank 312,240 (Top 7 %)
  • Language
    Go
  • License
    MIT License
  • Created about 6 years ago
  • Updated over 1 year ago

Repository Details

Library for detecting profanities in Go

go-away

go-away is a stand-alone, lightweight library for detecting and censoring profanities in Go.

This library must remain extremely easy to use. Its original intent of not adding overhead will always remain.

Installation

go get -u github.com/TwiN/go-away

Usage

package main

import (
    "github.com/TwiN/go-away"
)

func main() {
    goaway.IsProfane("fuck this shit")                // returns true
    goaway.ExtractProfanity("fuck this shit")         // returns "fuck"
    goaway.Censor("fuck this shit")                   // returns "**** this ****"
    
    goaway.IsProfane("F   u   C  k th1$ $h!t")        // returns true
    goaway.ExtractProfanity("F   u   C  k th1$ $h!t") // returns "fuck"
    goaway.Censor("F   u   C  k th1$ $h!t")           // returns "*   *   *  * th1$ ****"
    
    goaway.IsProfane("@$$h073")                       // returns true
    goaway.ExtractProfanity("@$$h073")                // returns "asshole"
    goaway.Censor("@$$h073")                          // returns "*******"
    
    goaway.IsProfane("hello, world!")                 // returns false
    goaway.ExtractProfanity("hello, world!")          // returns ""
    goaway.Censor("hello, world!")                    // returns "hello, world!"
}

Calling goaway.IsProfane(s), goaway.ExtractProfanity(s) or goaway.Censor(s) uses the default profanity detector. To disable sanitization of leet speak (including numerical characters), special characters or accents, create your own ProfanityDetector instead:

profanityDetector := goaway.NewProfanityDetector().
    WithSanitizeLeetSpeak(false).
    WithSanitizeSpecialCharacters(false).
    WithSanitizeAccents(false)
profanityDetector.IsProfane("b!tch") // returns false because we're not sanitizing special characters

By default, the NewProfanityDetector constructor uses the default dictionaries for profanities, false positives and false negatives. These dictionaries are exposed as goaway.DefaultProfanities, goaway.DefaultFalsePositives and goaway.DefaultFalseNegatives respectively.

If you need to load a different dictionary, you can create a new instance of ProfanityDetector this way:

profanities    := []string{"ass"}
falsePositives := []string{"bass"}
falseNegatives := []string{"dumbass"}

profanityDetector := goaway.NewProfanityDetector().WithCustomDictionary(profanities, falsePositives, falseNegatives)

You may also specify custom character replacements using WithCustomCharacterReplacements on a ProfanityDetector. By default, this is set to goaway.DefaultCharacterReplacements.

Note that all character replacements with a value of ' ' are considered as special characters while all characters with a value that is not ' ' are considered to be leetspeak characters. This means that using profanityDetector.WithSanitizeSpecialCharacters(bool) and profanityDetector.WithSanitizeLeetSpeak(bool) will let you toggle which character replacements are executed during the sanitization process.
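
The rule above can be illustrated with a minimal, self-contained sketch. It does not use go-away itself: the replacements table, the applyReplacements function and its two boolean parameters are hypothetical names chosen for illustration, and the table is much smaller than whatever the library ships by default.

```go
package main

import (
	"fmt"
	"strings"
)

// replacements is a hypothetical table, not the library's default one.
// Entries mapped to ' ' play the role of special characters; every other
// entry plays the role of a leetspeak character.
var replacements = map[rune]rune{
	'-': ' ', '_': ' ', '*': ' ',
	'4': 'a', '$': 's', '!': 'i', '0': 'o',
}

// applyReplacements applies only the groups of replacements that are enabled.
func applyReplacements(s string, sanitizeSpecial, sanitizeLeet bool) string {
	return strings.Map(func(r rune) rune {
		to, ok := replacements[r]
		if !ok {
			return r // no replacement registered for this rune
		}
		isSpecial := to == ' '
		if (isSpecial && sanitizeSpecial) || (!isSpecial && sanitizeLeet) {
			return to
		}
		return r
	}, s)
}

func main() {
	fmt.Println(applyReplacements("b!t_ch", true, true))   // prints "bit ch"
	fmt.Println(applyReplacements("b!t_ch", true, false))  // prints "b!t ch"
	fmt.Println(applyReplacements("b!t_ch", false, false)) // prints "b!t_ch"
}
```

Deriving the group from the replacement value keeps everything in a single table, which is why toggling one of the two options only affects the entries of that kind.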

Limitations

Currently, go-away does not support UTF-8. As such, if the strings you are feeding to this library come from unsanitized user input, you are advised to filter out all non-ASCII characters.
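
One possible way to do that filtering, using only the standard library, is sketched below. stripNonASCII is a hypothetical helper name and is not part of go-away; whether dropping the characters (rather than rejecting the input) is acceptable depends on your application.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripNonASCII is a hypothetical helper (not part of go-away) that drops
// every rune above unicode.MaxASCII before the string is checked.
func stripNonASCII(s string) string {
	return strings.Map(func(r rune) rune {
		if r > unicode.MaxASCII {
			return -1 // a negative value tells strings.Map to drop the rune
		}
		return r
	}, s)
}

func main() {
	fmt.Println(stripNonASCII("héllo, wörld!")) // prints "hllo, wrld!"
}
```

The cleaned string can then be passed to goaway.IsProfane, goaway.ExtractProfanity or goaway.Censor as usual.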

If you'd like to add support for UTF-8, see #43 and #47.

In the background

While a single giant regular expression could handle everything, it would slow the filtering down considerably as more words are added to the list of profanities.

Instead, the following steps are taken before checking for profanities in a string:

  • Numbers are replaced with their letter counterparts (e.g. 1 -> L, 4 -> A, etc.)
  • Special characters are replaced with their letter equivalents (e.g. @ -> A, ! -> i)
  • The resulting string has all of its spaces removed to catch w ords lik e tha t
  • The resulting string has all of its characters converted to lowercase
  • The resulting string has all words deemed as false positives (e.g. assassin) removed
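
The steps above can be sketched roughly as follows. This is not the library's implementation: sanitize, isProfane and the tiny replacement table are hypothetical and heavily trimmed, and the word lists are the ones from the custom dictionary example plus assassin.

```go
package main

import (
	"fmt"
	"strings"
)

// replacer is a hypothetical, trimmed-down table covering the first three steps.
var replacer = strings.NewReplacer(
	"1", "l", "4", "a", "0", "o", "3", "e", // numbers to letters
	"@", "a", "!", "i", "$", "s", // special characters to letters
	" ", "", // remove spaces
)

// sanitize applies the steps in order: replacements, space removal,
// lowercasing, then removal of known false positives.
func sanitize(s string, falsePositives []string) string {
	s = replacer.Replace(s)
	s = strings.ToLower(s)
	for _, fp := range falsePositives {
		s = strings.ReplaceAll(s, fp, "")
	}
	return s
}

// isProfane reports whether any profanity remains in the sanitized string.
func isProfane(s string, profanities, falsePositives []string) bool {
	s = sanitize(s, falsePositives)
	for _, p := range profanities {
		if strings.Contains(s, p) {
			return true
		}
	}
	return false
}

func main() {
	profanities := []string{"ass"}
	falsePositives := []string{"bass", "assassin"}

	fmt.Println(isProfane("@ $ $", profanities, falsePositives))       // prints "true"
	fmt.Println(isProfane("B4SS guitar", profanities, falsePositives)) // prints "false"
	fmt.Println(isProfane("assassin", profanities, falsePositives))    // prints "false"
}
```

After sanitization, the check is a plain substring search per dictionary entry, which is the design choice that avoids maintaining one large regular expression.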

In the future, the following additional steps could also be considered:

  • All non-transformed special characters are removed to prevent s~tring li~ke tha~~t
  • All characters repeated more than twice in a row are collapsed to two occurrences (e.g. poooop -> poop)
    • NOTE: This is obviously not a perfect approach, as words like fuuck wouldn't be detected, but it's better than nothing.
    • The upside of this method is that we only need to add base bad words, and not every variation of said bad word. (e.g. the fuck entry would support fucker, fucking, etc.)
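
The repeated-character idea could look something like the sketch below. Since this step is only described as a possible future addition, collapseRepeats is a purely hypothetical function that shows one way of capping runs of identical characters at two.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseRepeats is a hypothetical helper that keeps at most two
// consecutive occurrences of the same character.
func collapseRepeats(s string) string {
	var b strings.Builder
	var prev rune
	run := 0
	for _, r := range s {
		if r == prev {
			run++
		} else {
			prev, run = r, 1
		}
		if run <= 2 {
			b.WriteRune(r)
		}
	}
	return b.String()
}

func main() {
	fmt.Println(collapseRepeats("poooop")) // prints "poop"
	fmt.Println(collapseRepeats("fuuck"))  // prints "fuuck"
}
```

The second call shows the caveat from the note above: a run of exactly two characters is left as-is, so such spellings would still need their own dictionary entry.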
