  • Stars
    314
  • Rank 133,353 (Top 3 %)
  • Language
    HTML
  • License
    Apache License 2.0
  • Created over 8 years ago
  • Updated over 1 year ago

Repository Details

⌨️ Wordlists, Dictionaries and Other Data Sets for Writing Software Security Test Cases

Twitter: @decalresponds


werdlists


"Word Lists" for Software Security Test Cases

Word lists, Dictionary Files, Attack Strings, Miscellaneous Datasets and Proof-of-Concept Test Cases With a Collection of Tools for Penetration Testers

Brief Introduction to werdlists ✂️

This project is a collection of word lists, most of them whitespace-delimited or line-based. Although the passes-dicts folder contains inputs for password cracking, the files amassed here are mainly intended to help create insecure program state with the help of a black-box fuzzer or scanning tool. The vast majority of files are plain ASCII with UNIX-style newlines. Beware that this project makes no attempt to be minimalist or concise!
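Since most files are described as plain ASCII with UNIX-style newlines, a check along the following lines can confirm whether a given list matches that description. This is only a sketch: the function name `is_unix_ascii` is hypothetical and is not one of the repository's scripts.

```bash
#!/usr/bin/env bash

# Succeeds when FILE contains only 7-bit ASCII bytes and no carriage returns.
# Returns 2 when FILE cannot be read. (Hypothetical helper, not part of werdlists.)
is_unix_ascii() {
  local file=$1 non_ascii cr
  [[ -r $file ]] || return 2
  # Delete every byte in the ASCII range; whatever is left over is non-ASCII.
  non_ascii=$(LC_ALL=C tr -d '\000-\177' < "$file" | wc -c)
  # Keep only carriage returns, which would indicate DOS-style line endings.
  cr=$(LC_ALL=C tr -cd '\r' < "$file" | wc -c)
  (( non_ascii == 0 && cr == 0 ))
}
```

It can be used as a guard before handing a list to a tool, for example `is_unix_ascii some-list.txt || echo "needs conversion"`, where `some-list.txt` stands in for any data file.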

Inspiration Taken From Similar Projects 💭

werdlists is very similar to fuzzdb and SecLists. SecLists is maintained by my former colleague at IOActive, Daniel Miessler. Admittedly, werdlists shares their mission as a centralized resource for attack strings and input data. Even so, werdlists expands on a number of concepts: it has its own style, organization, original hand-crafted contents, dataset creation/management/validation scripts, scanner springboards, etc.

Unique Features Only Available With werdlists 💯

werdlists cross-references the code repositories of third-party scanners with the datasets of its own that each tool can benefit from. Moreover, specialized parsing scripts exclusive to werdlists extract the results produced by pairing test tools with its data. Output strings are gathered from those results and fed back into the test tools. In other words, a number of interactive and/or tunable feedback loops are implemented. Quite a few of the werdlists data files were created this way.
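The feedback idea can be sketched as a merge step: strings harvested from a tool's results are folded back into the word list that produced them, ready for the next run. The function name `merge_feedback` and the file names in the usage note are hypothetical; the project's own parsing scripts are not reproduced here.

```bash
#!/usr/bin/env bash

# Fold newly discovered strings (one per line in RESULTS) back into WORDLIST,
# leaving WORDLIST sorted and free of duplicates.
# (Hypothetical helper illustrating the feedback loop, not a werdlists script.)
merge_feedback() {
  local wordlist=$1 results=$2
  # LC_ALL=C gives byte-wise ordering so the result is stable across locales.
  # sort -o may safely name one of its own input files as the output.
  LC_ALL=C sort -u -o "$wordlist" "$wordlist" "$results"
}
```

For instance, after a scan has written the directory names it found to `found.txt`, running `merge_feedback dirs.txt found.txt` would grow `dirs.txt` for the next pass.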

Repository Directory Hierarchy and Structure 🔩

The scripts folder consists of shell scripts used for repository maintenance. It has a sub-directory called init, where scripts that initialize data files are stored. If a script filename stored in init contains two dashes, then its output should reflect the contents of the associated data file. For example, compare manpages-environ and clib-package-names. All scripts were written using bash syntax. The contrib folder is for storing scripts contributed via pull request, and the utils folder contains utilities that aren't necessarily specific to the werdlists project, such as scripts for managing any wordlist file. Other data files were composed by hand, and a small handful were created by recycling output strings back into input parameter lists, e.g. dirbdirs-feedback. The tools folder lists security tools that the datasets contained in this repository can be provided to as input. Individual folders are detailed in the Folder Names and Description of Contents section below. All files in each dataset directory are detailed in the local README.md file for that folder (as opposed to the global README.md in the root directory being read now).

Naming Scheme, Syntax and Meaning 💬

Most files have the *.txt extension, signifying the text/plain MIME type. Frequently used formats besides plain text include: Comma-Separated Values (text/csv), Extensible Markup Language (application/xml), HyperText Markup Language (text/html), etc. Any file that is larger than 1 MB uncompressed will be compressed with xz according to the commands in the scripts/xzlarge-files bash script. Other file extensions in use are: *.ans, *.asc, *.bin, *.c, *.conf, *.cpp, *.csv, *.html, *.inf, *.ini, *.json, *.md, *.rpz, *.rst, *.sh, *.txt, *.xml, *.yaml, *.yml, *.zip, and *.zone.
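The size rule could be approximated as shown below. This is not the content of scripts/xzlarge-files, which is not reproduced here; it is a minimal sketch that assumes the threshold means 1,048,576 bytes, that xz is installed, and that anything under .git should be skipped. Both function names are hypothetical.

```bash
#!/usr/bin/env bash

# Print regular files under DIR (default: current directory) that are larger
# than 1,048,576 bytes and not already xz-compressed. Skips any .git directory.
list_large_files() {
  local dir=${1:-.}
  find "$dir" -name .git -prune -o \
    -type f ! -name '*.xz' -size +1048576c -print
}

# Compress each such file in place. By default xz replaces FILE with FILE.xz.
compress_large_files() {
  local dir=${1:-.}
  find "$dir" -name .git -prune -o \
    -type f ! -name '*.xz' -size +1048576c -exec xz {} +
}
```

Running `list_large_files` first gives a dry-run view of what `compress_large_files` would touch.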

Folder Names and Description of Contents 📋

Folder Name    Description of Contents
apple-paths 🚀 Pathnames found on macOS file systems
apple-data 🍎 Data identifiers and such from Apple's macOS operating system
arpa-headers 📧 Header fields transmitted over RFC 2822-style protocols like SMTP
ascii-art 🎨 "Low bit" a.k.a. 7-bit ASCII art items without control characters
biology-info 🔬 Reference information useful in the study of biological issues
browser-data 🚪 Data related to GUI browser software like Chrome, Firefox, etc.
cert-data 📜 Information commonly utilized by cryptographic certificate materials
char-encodes Various character encodings provided by different locales/charsets
char-sequence ✒️ Various character sequences modeled after ctype.h
chat-data 😮 Additional data on IRC, XMPP and other such messaging protocols
cipher-data 🐡 Data denoting or used by cryptographic algorithm implementations
cmd-usage 🔨 Help text shown in a terminal when attempting to execute CLI programs
code-keywords ☕ Computer language identifiers, reserved words and similar syntax
cpu-arch Low-level computer architecture and hardware subjects
crypt-output ✨ Cipher text string outputs created by cryptographic hash functions
database-strs 💾 Strings often encountered when working with database software
dns-domains A list of domains that may have been found in the live DNS tree at one point
dns-hostnames 🔦 The host name part of an FQDN
dns-records 🎫 Data specific to RRs (resource records) in the DNS
dns-servers 🔋 Data provided to, produced by or related to DNS name servers
dns-toplevel 🔝 TLDs, or top-level domains, in the uppermost part of the DNS hierarchy
environ-vars ⛺ Environment variable names, settings, etc.
exploit-info 🎱 Technical information on exploitation of security vulnerabilities
file-extens ⚓ Data on filename extensions, i.e. the part after the dot
file-specs File format specifications as distributed by vendor(s)/author(s)
ftp-data 📤 Various FTP data from RFCs and elsewhere
glibc-data ⚙️ Data taken from the source code of the GNU C Library
html-words ⌨️ Words not uncommon to come across when parsing HTML dialects
http-agents Software version banners for HTTP User Agents also known as browsers
http-headers 🍪 Header fields sent in requests/responses by browser/server software
http-methods ▶️ Names of request methods browsers send in the first line of an HTTP request
http-params 🔡 Parameters browsers sometimes send when requesting server URI paths
http-security HTTP security info such as Content Security Policy
http-servers Information related to the usage of web server software
http-status 🎰 Numeric HTTP status codes in server replies, as RFC 7231 specifies
inet-addrs 🔌 Numeric Internet addresses a.k.a. IP addresses--mostly version 4
inet-routes Data useful in the maintenance and use of an Internet routing table
inet-services ⛲ Lists of Internet protocols/daemons--similar to /etc/services
infosec-people :neckbeard: Noteworthy individuals known from information security communities
iso-codes ✔️ Codes, numbers and such as standardized by ISO
java-data ☀️ Data found in or related to source code of programs written with Java
linux-data 🔟 Data identifiers and such from the Linux operating system
linux-paths 🖇️ Pathnames found on file systems created by Linux installations
malware-iocs 💀 IOCs (indicators of compromise) for identifying malware infections
mobile-devs 📱 Mobile device development for "handheld" form factors
net-attacks ♨️ Info about attacks on telecommunications and Internetworks
net-ifaces 🖥️ Detailed information which can be extracted from network interfaces
ntfs-paths 📂 File paths expected to be seen in NTFS folders
owasp-data Data from or for OWASP
passes-dicts 🔑 Dictionary files for brute-force attacks against account passwords
passes-sites 🔓 Hashed or unencrypted passwords that were publicized after the breach of a well-known site
perl-data 🐫 Data often seen in Perl (Practical Extraction and Report Language)
php-data 📄 Files containing information about the PHP programming language
postal-data 📬 United States Postal Service information
python-data 🐍 Data used by the Python scripting language interpreter at runtime
radio-data 📻 Things commonly used in radio frequency transmissions
regex-data 💬 Regular expression patterns used to launch/detect attacks
ruby-data 💎 Data typically seen within the syntax of the Ruby scripting language
search-dorks 🔎 General purpose search-engine queries likely to find insecure sites
smtp-messages ✉️ Messages (i.e. signatures, auto-replies, etc.) sent by SMTP servers
soap-messages 📨 SOAP (Simple Object Access Protocol) messages
social-data 👀 Sociological or social media related data sets including logins and user names
software-strs 💽 Strings describing software engineering, programming languages, etc.
string-enums 🎡 Enumerations of values that aren't too terribly unusual
system-admin 👔 System administration and BOFH related materials
system-notices ⚠️ Disclaimer/warning messages shown by networked computer systems
telco-data 📞 Voice telecommunications technologies: POTS, PCS, VoIP, SMS etc.
text-files 📌 Zine articles and such like those archived at Jason Scott's textfiles.com
text-words Lists of words likely to be found in an actual hard copy dictionary
top-secret 👽 Files and/or data related to documents that were/are classified
unicode-data 🔣 Unicode character usage and representation
unix-data 🐚 Data associated with various flavors of the UNIX OS and its clones
unix-paths 🗄️ File path names found in various UNIX file systems
uri-attacks 💥 Malicious URI materials specially crafted for attack targets
uri-schemes 📎 Lists containing references for URI schemes (part before colon)
uri-data 🔗 Universal Resource Identifier related data
vuln-data 📊 Information about security vulnerabilities found in server software
webapp-attacks 💉 Proof-of-concept samples demonstrating attacks against web applications
webapp-data 💼 Data associated with applications hosted on web servers
webapp-dirs 📑 Directories related to applications running on a web server
webapp-files 📇 Files related to applications running on a web server
webapp-paths Path names related to applications running on a web server
webapp-words 💭 Words related to applications running on a web server
web-sites 🌎 Addresses to and/or information on significant WWW sites
wifi-networks 📡 IEEE 802.11 Wi-Fi network information
windows-data 💼 Data only found within the Microsoft Windows series of OSes


More Repositories

1. ssltest-stls (Python, 43 stars)
   🛠️ Proof-of-concept code for Heartbleed a.k.a. CVE-2014-0160 with STARTTLS support for various protocols

2. pathgro (Scheme, 13 stars)
   🌱 combinatoric pathname wordlist expansion--it's like Miracle-Gro(tm) for your dirbusting technique!

3. irc-uncloak-nse (Lua, 8 stars)
   👥 NSE (Nmap Scripting Engine) Lua code for detecting IRC networks with IP address uncloaking weakness

4. scevron (Ruby, 7 stars)
   SCan EVerything with ruby RONin (Derbycon 4.0 "Family Rootz" Code)

5. bounty-targets (7 stars)
   🎯 Information About Bug Bounty Program Targets

6. hex4vbs (Python, 7 stars)
   👨‍💻 Hex-encoded VBScript Statements Pair Obfuscation & Circumvention of MSIE or Its Sister-Product IIS

7. zap-attack (Ruby, 6 stars)
   ⚡ Conduct attacks based on information gathered from the OWASP ZAP API

8. pafogenz (C, 6 stars)
   Pathname generator for directory brute-forcing web apps

9. nmap2csv (Shell, 4 stars)
   Convert default nmap scan output to comma-separated values format

10. fjorge (C, 4 stars)
    netcat-like program for HTTP(S)

11. cgiaudit (C, 4 stars)
    📦 general-purpose, "black box" CGI auditing tool (ARCHIVE)

12. inetaddr (Shell, 3 stars)
    :electron: Bash functions to convert numeric Internet addresses from/to single value and dotted-quad representations with decimal, octal and/or hexadecimal

13. combos_permutedirs (Ruby, 3 stars)
    🕸️ Use the combinatorics gem to permute the power set of slash-delimited directory names in local and network path names as a penetration testing reconnaissance technique

14. iisleakenv (HTML, 3 stars)
    💧 IIS CGI Environment Variable Leakage Proof-of-Concept

15. uncloakirc-freenode (Ruby, 2 stars)
    👥 Ruby script to uncloak users' virtual hosts on the FreeNode IRC network

16. githose (Shell, 2 stars)
    🚰 BASH Helper Scripts for Mapping Parallelized GitHub API Functions to User Names in Technicolor!

17. strglob (C, 2 stars)
    ⏩ Shared library that expands supplied globbing pattern syntax into multiple strings

18. cmdlog (C, 1 star)
    Command Logging LSM (Linux Security Module)

19. uri-encodings-fuzzer (Perl, 1 star)
    😼 Fuzz a given URI with encoding techniques provided by LibWhisker a.k.a. LW2

20. tac (Shell, 1 star)
    🐾 tac (reverse cat) brief bash script

21. dotfiles (Shell, 1 star)
    🗄️ Configuration and runcom files often hidden in $HOME

22. twitter-userid-enum (Ruby, 1 star)
    🐦 Enumerate Twitter Users Concurrently Based on ID Numbers

23. ip6cat (1 star)
    The Swiss Army Knife of IPv6 Addressing

24. porkbind (C, 1 star)
    🐷 Nameserver security scanner (ARCHIVE)