• Stars: 156
  • Rank: 239,589 (Top 5%)
  • Language: Python
  • Created: over 4 years ago
  • Updated: about 1 year ago


Create your Custom Wordlist For Fuzzing

CWFF - Custom wordlists for fuzzing

CWFF is a tool that builds a high-quality custom fuzzing/content discovery wordlist for you as fast as possible using concurrency. It is heavily inspired by @tomnomnom's "Who, What, Where, When, Wordlist" talk from #NahamCon2020.

Usage

CWFF [-h] [--threads] [--github] [--subdomains] [--recursive] [--js-libraries] [--connected-websites] [--juicy-files] [--use-filter-model] [-o] domain

positional arguments:
  domain                Target website(ofc)

optional arguments:
  -h, --help            Show this help message and exit
  --threads             The number of maximum concurrent threads to use (Default:1000)
  --github              Collect endpoints from a given github repo (ex:https://github.com/google/flax)
  --subdomains          Extract endpoints from subdomains also while search in the wayback machine!
  --recursive           Work on extracted endpoints recursively (Adds more endpoints but less accurate sometimes)!
  --js-libraries        Extract endpoints from JS libraries also, not just the JS written by them!
  --connected-websites  Include endpoints extracted from connected websites
  --juicy-files         Include endpoints extracted from juicy files like sitemap.xml and robots.txt
  --use-filter-model    Filter result endpoints with filter_model file
  -o                    The output directory for the endpoints and parameters. (Default: website name)
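
For example, a typical run combining a few of the flags above (example.com is just a placeholder target) could look like this:

python3 cwff.py example.com --threads 500 --juicy-files --recursive -o example_output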

Description (Important)

In short, CWFF collects endpoints and parameters for the target and its subdomains from several sources, described below:

  1. Wayback Machine archive: CWFF goes through all records of the target website and its subdomains and pulls the URLs that returned a 200 status code.

Many tools only go through the top Wayback page to save time, but CWFF walks all records in relatively little time; that is also why using the --subdomains flag can make a run take a lot longer. A rough sketch of this lookup follows.
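
A minimal sketch of that Wayback lookup, querying the public CDX API directly (one request, no concurrency, so not CWFF's actual code; example.com is a placeholder):

# Pull the URLs of all 200-status records for a domain from the Wayback CDX API.
import requests

def wayback_urls(domain):
    params = {
        "url": f"{domain}/*",          # every record under the domain
        "output": "json",
        "fl": "original",              # return only the original URL column
        "filter": "statuscode:200",    # keep records that answered with 200
        "collapse": "urlkey",          # drop duplicate URLs
    }
    rows = requests.get("http://web.archive.org/cdx/search/cdx",
                        params=params, timeout=60).json()
    return [row[0] for row in rows[1:]]  # the first row is the column header

print(len(wayback_urls("example.com")))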

  2. JavaScript files collected during the Wayback phase, plus the ones found by parsing the target page for <script> tags.

CWFF tries to separate JS libraries from the JS files actually written by the website's developers, and it does that by looking at the JS file names. By default, CWFF extracts endpoints only from the JS written by the developers; to also use JS libraries (mostly not helpful), activate the --js-libraries flag. A sketch of the filename heuristic follows.
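
A minimal sketch of that filename heuristic; the library-name pattern below is illustrative, not CWFF's actual list:

import re

# Treat a script as a library when its file name matches a known library pattern.
LIBRARY_NAME = re.compile(
    r"(jquery|bootstrap|angular|react|vue|polyfill)[-.]?[\d.]*(\.min)?\.js$", re.I)

def is_js_library(script_url):
    return bool(LIBRARY_NAME.search(script_url.split("?")[0]))

scripts = ["https://example.com/static/jquery-3.4.1.min.js",
           "https://example.com/static/app.js"]
custom_js = [s for s in scripts if not is_js_library(s)]  # used by default
libraries = [s for s in scripts if is_js_library(s)]      # only with --js-libraries
print(custom_js, libraries)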

  3. Common Crawl CDX index and AlienVault OTX (Open Threat Exchange).
  4. If you pass the --juicy-files flag, CWFF also extracts endpoints from files like sitemap.xml and robots.txt (more could be added in the future).
  5. If you give CWFF a GitHub repository with the --github flag, it extracts paths from that repo using the GitHub API (no API key needed).

To be clear, CWFF only uses the repository's file and directory paths, so it won't extract endpoints from inside the files themselves! A sketch of this lookup follows.
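
A minimal sketch of that lookup through the GitHub REST API (unauthenticated calls work but are rate-limited; google/flax is just the example repo from the help text above):

import requests

def github_paths(repo):  # repo given as "owner/name", e.g. "google/flax"
    # Find the default branch, then list the whole tree recursively.
    branch = requests.get(f"https://api.github.com/repos/{repo}",
                          timeout=30).json()["default_branch"]
    tree = requests.get(f"https://api.github.com/repos/{repo}/git/trees/{branch}",
                        params={"recursive": "1"}, timeout=60).json()["tree"]
    return [entry["path"] for entry in tree]  # paths only, file contents are never read

print(github_paths("google/flax")[:10])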

  6. With the --connected-websites flag, CWFF uses the BuiltWith website API (needs a key, but it's free) to find the websites connected to the target from its relationship profile, then extracts endpoints from those websites' source.

Note: you can get your API key from the BuiltWith API page and set the variable in the API_keys.py file. A sketch of the extraction step follows.
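
A minimal sketch of the extraction step only, assuming you already have the connected domains from the BuiltWith relationship profile (the regex and example.org below are illustrative):

import re
import requests

# Pull every href/src value out of each connected website's HTML source.
ATTR = re.compile(r"""(?:href|src)\s*=\s*["']([^"']+)["']""", re.I)

def endpoints_from_sites(domains):
    found = set()
    for domain in domains:
        try:
            html = requests.get(f"https://{domain}", timeout=15).text
        except requests.RequestException:
            continue
        found.update(ATTR.findall(html))
    return sorted(found)

print(endpoints_from_sites(["example.org"])[:20])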

After collecting endpoints from all these sources, if you used the --recursive flag, CWFF recursively extracts parts from the collected endpoints (a sketch follows the example below).

  • Example: an endpoint like parseq/javadoc/1.1.0/com will become all these endpoints:
    parseq/javadoc/1.1.0/com
    parseq/javadoc/1.1.0/
    parseq/javadoc/
    parseq/
    javadoc/
    1.1.0/
    com

Note: all collected endpoints/parameters are cleaned, deduplicated, and sorted so the result contains unique entries only.
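
A minimal sketch of that --recursive expansion (not CWFF's exact code); it reproduces the example above and relies on a set for the deduplication mentioned in the note:

def expand(endpoint):
    parts = [p for p in endpoint.split("/") if p]
    results = {endpoint}                          # the full endpoint itself
    for i in range(1, len(parts)):
        results.add("/".join(parts[:i]) + "/")    # prefixes: parseq/, parseq/javadoc/, ...
    for part in parts[1:-1]:
        results.add(part + "/")                   # middle segments: javadoc/, 1.1.0/
    if len(parts) > 1:
        results.add(parts[-1])                    # the last segment on its own: com
    return results

print(sorted(expand("parseq/javadoc/1.1.0/com")))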

Filtering results

Of course, after pulling from all these sources there will be plenty of unwanted or useless endpoints mixed in with the important ones, and this is where filtering comes in to save time and resources.

In CWFF you can detect and remove the unwanted endpoints using three methods:

  • Remove endpoints that end with any string from a given list (extensions, for example).
  • Remove endpoints that contain any string from a given list of strings.
  • And finally the big one: remove endpoints that match any regular expression from a given list.

All these filter options are set as variables in the filter_model.py file; then use the --use-filter-model flag when starting CWFF (a rough sketch of the three rules follows). If you're not sure how to set these variables, see the comments I left in the file: it's the model I mostly use, and it lowered the number of collected endpoints from 26,177 to 3,629 in one run. In case you forgot to use filtering while running CWFF, don't worry, I've got you covered 😄
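
A minimal sketch of those three rules; the variable names and values below are illustrative, the real ones live in filter_model.py:

import re

BAD_ENDSWITH = (".png", ".jpg", ".css", ".woff")               # 1. drop by extension
BAD_CONTAINS = ("wp-content/uploads", "/static/vendor/")       # 2. drop by substring
BAD_REGEX = [re.compile(r"^\d+$"), re.compile(r"\.min\.js$")]  # 3. drop by regex

def keep(endpoint):
    if endpoint.endswith(BAD_ENDSWITH):
        return False
    if any(s in endpoint for s in BAD_CONTAINS):
        return False
    return not any(r.search(endpoint) for r in BAD_REGEX)

endpoints = ["admin/login", "logo.png", "12345", "api/v1/users"]
print([e for e in endpoints if keep(e)])  # ['admin/login', 'api/v1/users']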

You can use the filter.py script to filter endpoints you already have, as shown below; it loads the filter_model.py file automatically without having to rerun CWFF:

python filter.py wordlist.txt output.txt

Requirements

  • Python 3.6+
  • It should work on any operating system, but I have only tested it on Manjaro Linux.
  • The installation steps below.

Installation

python3 -m pip install -r requirements.txt
python3 cwff.py --help

Contact

TODO

  • Merge endpoints recursively
  • Extract website unique words by comparing to RFC.

Donation

If my work has been useful for you, feel free to thank me by buying me a coffee.


Disclaimer

CWFF was created to help with penetration testing, and the author is not responsible for any misuse or illegal purposes.

Copying code from this tool or using it in another tool is fine as long as you mention the source 😄
