

Update of uncaptcha2 from 2019

YouTube Video Proof of Concept

I created a new YouTube video with a technical explanation of how to break Google's audio reCAPTCHAs:

Breaking Audio ReCaptcha Video

Click on the image below to see the bot in action:

Breaking Audio ReCaptcha Video

Conclusion

After a while, Google blocks you based on one or a combination of the following:

  1. Your IP address
  2. Your Browser fingerprint
  3. Your Browser JavaScript configuration if using headless browsers
  4. Or the lack of human-like behavior such as mouse events or touch events

BUT: The Audio reCAPTCHA itself is completely, utterly broken. The normal, visual reCAPTCHA still works more or less. But it's a matter of time I guess.

My estimation is: In 2021 and the near future, there is no other way to tell humans apart from bots than to:

  1. Record massive amounts of real human website behavior (mouse movements, scrolling, touch events, window resizing)
  2. Train an advanced artificial neural network with this data
  3. And classify live behavioral data of website users
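To make the first and third step a bit more concrete: before any model can be trained, raw events have to be turned into numbers. The following is only a minimal sketch of that idea, not a method described in this repository or used by Google. The function name `mouse_features`, the sample format and the three chosen features are all assumptions for illustration.

```python
import math
from typing import Dict, List, Tuple

# Assumed sample format: (timestamp in seconds, x, y)
Sample = Tuple[float, float, float]


def mouse_features(samples: List[Sample]) -> Dict[str, float]:
    """Summarize one recorded mouse trajectory as a few numeric features."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")

    # Total distance travelled along the recorded path.
    path = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(samples, samples[1:]):
        path += math.hypot(x1 - x0, y1 - y0)

    duration = samples[-1][0] - samples[0][0]
    # Straight-line distance between the first and the last point.
    direct = math.hypot(samples[-1][1] - samples[0][1],
                        samples[-1][2] - samples[0][2])

    return {
        "path_length": path,
        "mean_speed": path / duration if duration > 0 else 0.0,
        # 1.0 means a perfectly straight path, which is typical for naive bots.
        "straightness": direct / path if path > 0 else 1.0,
    }
```

Feature vectors like this could then be fed into a classifier. Which features actually separate humans from bots is an open question that this sketch does not answer.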

Kinda interesting times.

Introduction

This repository uses the research work from the authors of uncaptcha2.

The original scientific paper can be found here.

The authors propose a method to solve Google's Audio ReCaptcha with Google's own Speech-to-Text API.

Yes, you read that correctly: It is possible to solve the Audio version of ReCaptcha v2 with Google's own Speech-to-Text API.
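The general idea can be sketched in a few lines of Python. This is not the code from solveAudioCaptcha.py. It is a minimal sketch that assumes the audio URL is already known, uses curl and ffmpeg for download and conversion, and uses the third-party SpeechRecognition package (an assumption, it is not confirmed to be a dependency of this repository) to talk to Google's speech recognition. All function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_download_cmd(audio_url: str, mp3_path: Path) -> List[str]:
    """Build a curl command that saves the captcha audio to mp3_path."""
    return ["curl", "--silent", "--location",
            "--output", str(mp3_path), audio_url]


def build_convert_cmd(mp3_path: Path, wav_path: Path) -> List[str]:
    """Build an ffmpeg command that converts the MP3 to 16 kHz mono WAV."""
    return ["ffmpeg", "-y", "-i", str(mp3_path),
            "-ar", "16000", "-ac", "1", str(wav_path)]


def transcribe(wav_path: Path) -> str:
    """Send the WAV file to Google's speech recognition and return the text.

    Assumption: relies on the SpeechRecognition package, imported lazily
    so the command builders above work without it.
    """
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)
    return recognizer.recognize_google(audio)


def solve_audio(audio_url: str, workdir: Path) -> str:
    """Download, convert and transcribe one audio challenge."""
    mp3_path = workdir / "captcha.mp3"   # hypothetical file names
    wav_path = workdir / "captcha.wav"
    subprocess.run(build_download_cmd(audio_url, mp3_path), check=True)
    subprocess.run(build_convert_cmd(mp3_path, wav_path), check=True)
    return transcribe(wav_path)
```

Note that `recognize_google` calls Google's web speech endpoint, which may differ from the Speech-to-Text API the original authors used.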

Since uncaptcha2 was released on January 18, 2019, the proof-of-concept code does not work anymore (as the authors correctly predicted).

This repository attempts to keep the proof of concept up to date and working.

Changes compared to uncaptcha2

Audio Download Option was removed

The ReCaptcha audio download link does not work anymore because Google removed the download option.

Therefore, the audio download link has to be obtained via the Developer Console and a small JavaScript snippet.

If I am not mistaken, ReCaptcha penalizes the opening of dev tools.

Therefore, the better way is to start the Chrome browser in debug mode and to obtain the audio download URL via puppeteer and the Chrome remote debugging protocol. This is the method currently used. It is implemented in the script getCaptchaDownloadURL.js.

However, I fear that there are ways for ReCaptcha to detect if the browser is started in debug mode with the command line flag --remote-debugging-port=9222.
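For readers who want to see what a browser started with --remote-debugging-port=9222 exposes, here is a minimal Python sketch. It does not replace getCaptchaDownloadURL.js and does not extract the audio URL. It only queries the HTTP discovery endpoint of the debugging port and filters the result. The function names and the "/recaptcha/" substring heuristic are assumptions, and whether the captcha iframe shows up as its own target depends on the browser configuration.

```python
import json
from typing import Dict, List
from urllib.request import urlopen


def list_targets(port: int = 9222) -> List[Dict]:
    """Return the open targets of a Chrome started with
    --remote-debugging-port=<port> on the local machine."""
    with urlopen(f"http://127.0.0.1:{port}/json", timeout=5) as response:
        return json.load(response)


def find_recaptcha_frames(targets: List[Dict]) -> List[Dict]:
    """Keep only targets whose URL looks like a reCAPTCHA frame.

    Assumption: matching on the "/recaptcha/" substring is a heuristic
    chosen for this sketch, not something taken from the repository.
    """
    return [t for t in targets if "/recaptcha/" in t.get("url", "")]
```

Each returned target normally carries a `webSocketDebuggerUrl` field, which is what a client such as puppeteer connects to in order to drive the page.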

Randomized Mouse Movements

I randomized the mouse movements a bit: the pointer now passes through random intermediate positions before going to the target destination.

Much more could be done here.
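The idea of random intermediate movements can be sketched as follows. This is not the code from solveAudioCaptcha.py. The function names, the number of waypoints, the jitter size and the duration range are all assumptions for illustration.

```python
import random
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def random_waypoints(start: Point, target: Point, count: int = 3,
                     jitter: int = 40,
                     rng: Optional[random.Random] = None) -> List[Point]:
    """Return `count` jittered points between start and target,
    followed by the exact target itself."""
    rng = rng or random.Random()
    points: List[Point] = []
    for i in range(1, count + 1):
        fraction = i / (count + 1)
        # Point on the straight line, shifted by a random offset.
        x = start[0] + (target[0] - start[0]) * fraction + rng.randint(-jitter, jitter)
        y = start[1] + (target[1] - start[1]) * fraction + rng.randint(-jitter, jitter)
        # Clamp to non-negative screen coordinates.
        points.append((max(0, round(x)), max(0, round(y))))
    points.append(target)
    return points


def move_like_human(target: Point, rng: Optional[random.Random] = None) -> None:
    """Move the real mouse pointer to target via random waypoints."""
    import pyautogui  # imported lazily because it needs a display

    rng = rng or random.Random()
    start = tuple(pyautogui.position())
    for x, y in random_waypoints(start, target, rng=rng):
        pyautogui.moveTo(x, y, duration=rng.uniform(0.2, 0.6))
```

Straight segments between waypoints are still easy to tell apart from human movement. Curved paths and variable speed would be a natural next step.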

Known Issues

Of course, Google is not easily tricked. After all, ReCaptcha v3 is still based on ReCaptcha v2. If you think that 91% of all captchas can be solved with this method in production, I need to warn you:

Google is very reluctant to serve the audio captcha. After all, audio captchas are supposed to be solved by visually impaired people.

I assume that there is a simple counter for serving audio captchas. Once more than X audio captchas have been served, Google will simply block you.

Even if you navigate to the audio captcha as a real human being, you will often get banned by ReCaptcha. If you are not logged into a Google account, you will very often get the following error when attempting to solve the audio captcha:

[Screenshot: Google says no to the audio captcha]

I do not know how Google decides to block you, but I strongly suspect that the very simple act of repeatedly requesting the audio captcha is enough to become suspicious.

Installation

The code was developed and tested on Ubuntu 18.04.

The following software needs to be installed:

aplay
chromium-browser
xclip
ffmpeg
curl

In order to install the Python 3.7 dependencies, create a virtual environment with pipenv:

# create pipenv
pipenv --python 3.7

# install dependencies
pipenv install -r requirements.txt

# create pipenv shell
pipenv shell

After those commands, the program solveAudioCaptcha.py may be executed:

python solveAudioCaptcha.py

Adjust Coordinates

The captcha is solved through mouse pointer automation with the Python module pyautogui, which moves and clicks at fixed screen coordinates.

Your setup very likely differs from my setup.

Therefore, you need to adjust the coordinates in solveAudioCaptcha.py.

You can also modify the time.sleep() calls in order to speed up or slow down the bot.
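One way to make both adjustments easier is to keep the coordinates and a global speed factor in one place. The following is a minimal sketch, not the structure of solveAudioCaptcha.py. The names `COORDS`, `SPEED`, `scaled_delay`, `pause` and `print_mouse_position`, as well as the coordinate values, are hypothetical.

```python
import time
from typing import Dict, Tuple

# Hypothetical coordinates; replace them with values from your own screen.
COORDS: Dict[str, Tuple[int, int]] = {
    "checkbox": (100, 200),
    "audio_button": (150, 400),
}

# Values above 1.0 make the bot faster, values below 1.0 make it slower.
SPEED = 1.0


def scaled_delay(seconds: float, speed: float = SPEED) -> float:
    """Return the delay after applying the speed factor."""
    if speed <= 0:
        raise ValueError("speed must be positive")
    return seconds / speed


def pause(seconds: float, speed: float = SPEED) -> None:
    """Drop-in replacement for time.sleep() that honors the speed factor."""
    time.sleep(scaled_delay(seconds, speed))


def print_mouse_position() -> None:
    """Print the current pointer position to help find new coordinates."""
    import pyautogui  # imported lazily because it needs a display

    x, y = pyautogui.position()
    print(f"x={x}, y={y}")
```

To find the coordinates for your setup, hover over the element you want to click and call `print_mouse_position()` from a Python shell.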
