  • Language
    Jupyter Notebook
  • License
    Apache License 2.0
  • Created over 4 years ago
  • Updated over 1 year ago


Repository Details

Persian Swear Dataset: a dataset of inappropriate and offensive Persian words that you can use in production to filter unwanted content from text.

Persian-Swear-Words

Persian (Farsi) Swear Words + .json Datasets

  • Author: Amir Shokri
  • Author Email: [email protected]
  • Last Update: 11 October, 2021
  • Data format: JSON Data
  • Functions Available:
    • Java
    • PHP
    • Python
    • Javascript
    • C#
    • Swift
  • Contribute: Fork and send Pull Requests :)
  • DOI : 10.34740/kaggle/dsv/2094967

Note: This is a work-in-progress list of Persian swear words that you can use in production to filter unwanted content. The word list is available in JSON format.
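As a rough illustration of using a JSON word list for filtering, here is a minimal Python sketch. It is not the project's own implementation: the inline sample, the `word` key, and the helper names `load_words`, `has_swear`, and `filter_words` are assumptions made for this example, so check the actual dataset file for its real structure.

```python
import json

# Hypothetical sample standing in for the dataset file.
# The real file name and key name may differ.
SAMPLE_JSON = '{"word": ["badword1", "badword2"]}'


def load_words(raw_json, key="word"):
    """Parse a JSON document and return its word list as a set."""
    data = json.loads(raw_json)
    return set(data[key])


def has_swear(text, words):
    """Return True if any whitespace-separated token is in the word set."""
    return any(token in words for token in text.split())


def filter_words(text, words, symbol="*"):
    """Replace every token found in the word set with `symbol`."""
    return " ".join(symbol if token in words else token for token in text.split())


words = load_words(SAMPLE_JSON)
print(has_swear("this is badword1", words))     # True
print(filter_words("this is badword1", words))  # this is *
```

A set keeps each lookup cheap, and splitting on whitespace keeps the sketch short. It only matches whole tokens, so multi-word entries and punctuation attached to a word would need extra handling.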

Notes:

Some words do not look like swear words at first glance, but certain applications may still need to filter them. Customize the dataset to fit your own needs before using it.

If you are interested, please help complete this dataset.

Use this dataset to filter text in your projects and keep your content clean.

Please avoid sending small PRs and aim for more substantial contributions.

Besides contributing to the dataset, you can write a class or function in your programming language of choice that uses this dataset and add it to the project. Functions are currently available for the following languages:

  • Java
  • PHP
  • Python
  • Javascript
  • C#
  • Swift

Installation

composer

composer require amirshnll/persian-swear-words

npm

npm i persian-swear-words

Usage

Java 🔗 Class

var persianSwear = new PersianSwear();

// add word(s) to DataSet
persianSwear.addWord("word");
persianSwear.addWords(new String[]{"word1", "word2"});

// remove word(s) from DataSet
persianSwear.removeWord("word");
persianSwear.removeWords(new String[]{"word1", "word2"});

// check single word
persianSwear.isBad("الا.غ "); // true
persianSwear.isBad("امروز"); // false

// check existing bad word in text
persianSwear.hasSwear("تو هیز هستی");     // true
persianSwear.hasSwear("تو دوست من هستی"); // false

// replace bad words in text
persianSwear.filterWords("تو هیز هستی");      // تو * هستی
persianSwear.filterWords("تو هیز هستی", "&"); // تو & هستی

PHP 🔗 Class

require('PersianSwear.php');
$persianswear = new PersianSwear();

// is bad
if($persianswear->is_bad('خر'))
	echo 'is bad';
else
	echo 'not bad';

// not bad
if($persianswear->is_bad('امروز'))
	echo 'is bad';
else
	echo 'not bad';

// not bad
if($persianswear->is_bad('چرت و پرت'))
	echo 'is bad';
else
	echo 'not bad';

$persianswear->add_word('چرت و پرت');

// is bad
if($persianswear->is_bad('چرت و پرت'))
	echo 'is bad';
else
	echo 'not bad';

// is bad
if($persianswear->is_bad('گاو'))
	echo 'is bad';
else
	echo 'not bad';

$persianswear->remove_word('گاو');

// not bad
if($persianswear->is_bad('گاو'))
	echo 'is bad';
else
	echo 'not bad';

// not bad
if($persianswear->has_swear('تو دوست من هستی'))
	echo 'is bad';
else
	echo 'not bad';

// is bad
if($persianswear->has_swear('تو هیز هستی'))
	echo 'is bad';
else
	echo 'not bad';

echo $persianswear->filter_words('تو دوست من هستی'); // تو دوست من هستی 
echo $persianswear->filter_words('تو هیز هستی'); // تو * هستی 
echo $persianswear->filter_words('تو هیز هستی', "&"); // تو & هستی 

echo $persianswear->tostring(); // show all swear words

Python 🔗 Class

persianswear = PersianSwear()

print(persianswear.is_bad('خر', ignoreOT=False)) # True

print(persianswear.is_bad('امروز', ignoreOT=False)) # False

print(persianswear.is_bad('چرت و پرت', ignoreOT=False)) # False

persianswear.add_word('چرت و پرت')
print(persianswear.is_bad('چرت و پرت', ignoreOT=False)) # True

print(persianswear.has_swear('تو دوست من هستی', ignoreOT=False)) # False

print(persianswear.has_swear('تو هیز هستی', ignoreOT=False)) # True

print(persianswear.filter_words('تو دوست من هستی', ignoreOT=False)) # تو دوست من هستی

print(persianswear.filter_words('تو هیز هستی', ignoreOT=False)) # تو * هستی

print(persianswear.filter_words('تو هیز هستی', '&', ignoreOT=False)) # تو & هستی

print(persianswear.is_bad('خ.ر', ignoreOT=True)) # True

print(persianswear.is_bad('ام.روز', ignoreOT=True)) # False

print(persianswear.has_swear('تو دو.ست من هستی', ignoreOT=True)) # False

print(persianswear.has_swear('تو اسک.ل هستی', ignoreOT=True)) # True

print(persianswear.filter_words('تو دو.ست من هستی', ignoreOT=True)) # تو دو.ست من هستی

print(persianswear.filter_words('تو هی.ز هستی', ignoreOT=True)) # تو * هستی

print(persianswear.filter_words('تو هی.ز هس.تی', ignoreOT=True)) # تو * هس.تی

print(persianswear.tostring()) # show all swear words
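The examples above suggest that `ignoreOT=True` ignores separator characters, such as a dot, placed inside a word, while leaving clean words untouched in the output. The sketch below shows one way such a check could work. It is an assumption based only on those examples, not the library's actual code, and the names `normalize`, `is_bad`, `filter_words`, and `ignore_ot` are hypothetical.

```python
import re

# Assumed behaviour: drop every character that is not a letter
# (punctuation, digits, underscores, whitespace) before the lookup.
_NON_LETTERS = re.compile(r"[\W\d_]+")


def normalize(token):
    """Return the token with all non-letter characters removed."""
    return _NON_LETTERS.sub("", token)


def is_bad(token, words, ignore_ot=False):
    """Check a single token, optionally ignoring embedded separators."""
    key = normalize(token) if ignore_ot else token
    return key in words


def filter_words(text, words, symbol="*", ignore_ot=False):
    """Replace bad tokens with `symbol`; other tokens stay exactly as written."""
    return " ".join(
        symbol if is_bad(token, words, ignore_ot) else token
        for token in text.split()
    )


words = {"badword"}  # placeholder entry for illustration
print(filter_words("you b.a.d.w.o.r.d per.son", words, ignore_ot=True))   # you * per.son
print(filter_words("you b.a.d.w.o.r.d per.son", words, ignore_ot=False))  # you b.a.d.w.o.r.d per.son
```

Normalizing only for the comparison, rather than rewriting the text, is what keeps a clean token like `per.son` unchanged in the result.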

Javascript 🔗 Function

console.log(PersianSwear.is_bad('خر')); // true
console.log(PersianSwear.is_bad('امروز')); // false

console.log(PersianSwear.is_bad('چرت و پرت')); // false
PersianSwear.add_word('چرت و پرت');
console.log(PersianSwear.is_bad('چرت و پرت')); // true

console.log(PersianSwear.is_bad('گاو')); // true
PersianSwear.remove_word('گاو');
console.log(PersianSwear.is_bad('گاو')); // false

console.log(PersianSwear.has_swear('تو دوست من هستی')); // false
console.log(PersianSwear.has_swear('تو هیز هستی')); // true

console.log(PersianSwear.filter_words('تو دوست من هستی')); // تو دوست من هستی 
console.log(PersianSwear.filter_words('تو هیز هستی')); // تو * هستی 
console.log(PersianSwear.filter_words('تو هیز هستی', '&')); // تو & هستی 

C# 🔗 Helper

Create Filter

First, create an instance of FilterPersianWords:

var filter = new FilterPersianWords();

If you have your own JSON file, you can optionally pass its path to the constructor.

Use Functions

  • Is a single word bad?

var isBadWord = filter.IsBadWord("yourWord");

  • Is a multi-line string bad?

var isBadSentence = filter.IsBadSentence("your long sentence");

  • Get all bad words in a string

var badList = filter.GetBadWords("your long sentence");

  • Remove all bad words from a string

var clearedString = filter.RemoveBadWords("your bad sentence");

This method changes nothing in the string except the bad words.



Swift 🔗 Classes and Protocol

The main class is PersianSwear, which implements the methods:

// add word(s) to DataSet
PersianSwear.shared.addWord("bad-word")
PersianSwear.shared.addWords(["bad-word-1", "bad-word-2"])

// remove word(s) from DataSet
PersianSwear.shared.removeWord("bad-word")
PersianSwear.shared.removeWords(["bad-word-1", "bad-word-2"])

// check single word
let isBadWord = PersianSwear.shared.isBadWord("single word")

// check existing bad word in text
let hasBadWord = PersianSwear.shared.hasBadWord("long text")

// list the bad words found in text
let badWords = PersianSwear.shared.badWords(in: "long text")

// replace bad words in text
let newText = PersianSwear.shared.replaceBadWords(in: "long text", with: "****")

There is also a protocol named PersianSwearDataLoader, whose job is to load the words:


protocol PersianSwearDataLoader {
	func loadWords(
		_ completion: @escaping (Result<PersianSwear.Words, Error>) -> Void
	)
}

As an example, a loader type that loads the words from GitHub is implemented. It can be used as follows:


let loader = GithubPersianSwearDataLoader()
PersianSwear.shared.loadWords(using: loader) { result in
	switch result {
	case .failure(let error):
		print("Error:", error.localizedDescription)
	case .success(let words):
		print("Words:", words.count)
	}
}


