
Awful AI

Awful AI is a curated list to track current scary usages of AI - hoping to raise awareness to its misuses in society

Artificial intelligence in its current state is unfair, easily susceptible to attacks, and notoriously difficult to control. AI systems and their predictions often amplify existing systemic biases even when the data is balanced. Nevertheless, more and more concerning uses of AI technology are appearing in the wild. This list aims to track all of them. We hope that Awful AI can be a platform to spur discussion for the development of possible preventive technology (to fight back!).

You can cite the list and raise more awareness through Zenodo.



Discrimination

Dermatology App - Google's dermatology app was trained on a dataset in which only 3.5 percent of images came from people with darker skin, so it risks misclassifying conditions in people of color. The app was released without adequate testing, despite the knowledge that it might not work for a large share of the population. People unaware of this issue may spend time and money treating a sickness they don't have, or believe they don't have to worry about a sickness they do have.

AI-based Gaydar - Artificial intelligence can accurately guess whether people are gay or straight based on photos of their faces, according to new research that suggests machines can have significantly better “gaydar” than humans. [summary]

Infer Genetic Disease From Your Face - DeepGestalt can accurately identify some rare genetic disorders using a photograph of a patient's face. This could lead to payers and employers analyzing facial images and discriminating against individuals who have pre-existing conditions or may develop medical complications. [Nature Paper]

Racist Chat Bots - Microsoft's chatbot Tay spent a day learning from Twitter and began spouting antisemitic messages.

Racist Auto Tag and Recognition - A Google image recognition program labeled the faces of several Black people as gorillas. Amazon's Rekognition labeled darker-skinned women as men 31 percent of the time; lighter-skinned women were misidentified 7 percent of the time. Rekognition helps the Washington County Sheriff's Office in Oregon shorten the time it takes to identify suspects from hundreds of thousands of photo records. Zoom's face recognition, like many other systems, struggles to recognize black faces. [ABC report on Rekognition bias] [Wired story on recognizing black faces]

Depixelizer - An algorithm that reconstructs a high-resolution face from a low-resolution image; due to bias in its training data, it consistently turns a pixelated photo of Barack Obama into a white man.

Twitter autocrop - Twitter crops user images to generate previews. Users noticed that the automatic crop tends to focus on women's breasts and to favor white faces over Black ones.

ChatGPT and LLMs - Large Language Models (LLMs), like ChatGPT, inherit worrying biases from the datasets they were trained on: when asked to write a program that would determine “whether a person should be tortured,” OpenAI's answer is simple: if they're from North Korea, Syria, or Iran, the answer is yes. While OpenAI is actively trying to prevent harmful outputs, users have found ways to circumvent these safeguards.

Autograding - An algorithm used in the UK to predict grades from early-semester performance and historical data was found to be biased against students from poor backgrounds.

Sexist Recruiting - AI-based recruiting tools such as HireVue, PredictiveHire, or an Amazon internal software scan various features such as video or voice data of job applicants and their CVs to tell whether they're worth hiring. In the case of Amazon, the algorithm quickly taught itself to prefer male candidates over female ones, penalizing CVs that included the word "women's," such as "women's chess club captain." It also reportedly downgraded graduates of two women's colleges. [summary][Post article about HireVue]

Sexist Image Generation - Researchers have demonstrated that AI-based image-generation algorithms can exhibit racist and sexist ideas. Feed one a photo of a man cropped right below his neck, and 43% of the time, it will autocomplete him wearing a suit. Feed the same one a cropped photo of a woman, even a famous woman like US Representative Alexandria Ocasio-Cortez, and 53% of the time, it will autocomplete her wearing a low-cut top or bikini. Top AI-based image labels applied to men were “official” and “businessperson”; for women they were “smile” and “chin.” [Wired article]

Lensa - Lensa, a viral AI avatar app, undresses women without their consent. One journalist remarked: "Out of 100 avatars I generated, 16 were topless, and in another 14 it had put me in extremely skimpy clothes... I have Asian heritage...My white female colleague got significantly fewer sexualized images. Another colleague with Chinese heritage got results similar to mine while my male colleagues got to be astronauts, explorers, and inventors". Lensa also reportedly generates nudes from childhood photos.

Gender Detection from Names - Genderify was a biased service that promised to identify someone’s gender by analyzing their name, email address, or username with the help of AI. According to Genderify, Meghan Smith is a woman, but Dr. Meghan Smith is a man.

GRADE - GRADE, an algorithm that filtered PhD applications at UT, was found to be biased. In certain tests, the algorithm ignored letters of recommendation and statements of purpose, which usually help applicants who don't have a perfect GPA. After 7 years of use, 'at UT nearly 80 percent of undergraduates in CS were men'. The university recently decided to phase out the algorithm; the official reason is that it is too difficult to maintain.

PredPol - PredPol, a program for police departments that predicts hotspots where future crime might occur, could potentially get stuck in a feedback loop of over-policing majority black and brown neighbourhoods. [summary][statistics]

COMPAS - A risk assessment algorithm used in legal courts by the state of Wisconsin to predict the risk of recidivism. Its manufacturer refuses to disclose the proprietary algorithm, and only the final risk assessment score is known. The algorithm is biased against Black defendants (COMPAS performs worse than a human evaluator). [summary][NYT opinion]

Infer Criminality From Your Face - A program that judges if you’re a criminal from your facial features. [summary]

Forensic Sketch AI-rtist - A generative AI-rtist that creates "hyper-realistic forensic sketches" from a witness description. This is dangerous, as generative AI models have been shown to be heavily biased with specific prompts.

Homeland Security - Homeland Security, together with DataRobot, is creating a terrorist-prediction algorithm that tries to predict whether a passenger or a group of passengers is high-risk by looking at age, domestic address, destination and/or transit airports, route information (one-way or round trip), duration of the stay, luggage information, etc., and comparing them with known instances.

ATLAS - Homeland security's ATLAS software scans the records of millions of immigrants and can automatically flag naturalized Americans to potentially have their citizenship revoked based on secret criteria. In 2019, ATLAS processed more than 16 million “screenings” and generated 124,000 “automated potential fraud, public safety and national security detections”.

iBorderCtrl - AI-based polygraph test for travellers entering the European Union (trial phase). Likely to produce a high number of false positives, considering how many people cross EU borders every day. Furthermore, facial recognition algorithms are prone to racial bias. [summary]

Faception - Based on facial features, Faception claims that it can reveal personality traits e.g. "Extrovert, a person with High IQ, Professional Poker Player or a threat". They build models that classify faces into categories such as Pedophile, Terrorist, White-Collar Offenders and Bingo Players without prior knowledge. [classifiers][video pitch]

Persecuting ethnic minorities - Chinese start-ups have built algorithms that allow the government of the People’s Republic of China to automatically track Uyghur people. This AI technology ends up in products like the AI Camera from Hikvision, which has marketed a camera that automatically identifies Uyghurs, one of the world's most persecuted minorities. [NYT opinion]

SyRI - 'Systeem Risico Indicatie' ('Risk Identification System') was an AI-based anti-fraud system used by the Dutch government from 2008 to 2020. The system used large amounts of personal data held by the government to estimate whether an individual was likely to commit fraud. Individuals the system deemed likely fraudsters were recorded on a special list that could block them from accessing certain government services. SyRI was discriminatory in its judgement and never caught an individual who was proven to have committed fraud. A Dutch court ruled in February 2020 that the use of SyRI violated human rights. [amicus curiae]

Deciding unfair vaccine distribution - Only 7 of over 1,300 frontline hospital residents had been prioritized for the first 5,000 doses of the covid vaccine. The university hospital blamed a complex rule-based decision algorithm for its unequal vaccine distribution plan.

Predicting future research impact - The authors claim a machine-learning model can be used to predict the future “impact” of research published in scientific literature. However, such models can incorporate institutional bias, and if researchers and funders follow their advice, they could inhibit the progress of creative science and its funding.

Influencing, disinformation, and fakes

Cambridge Analytica - Cambridge Analytica uses Facebook data to change audience behaviour for political and commercial causes. [Guardian article]

Deep Fakes - Deep Fakes is an artificial intelligence-based human image synthesis technique. It is used to combine and superimpose existing images and videos onto source images or videos. Deepfakes may be used to create fake celebrity pornographic videos and revenge porn, undress women or scam businesses [CNN Interactive Report][Deep Nudes][DreamPower]

Fake News Bots - Automated accounts are being programmed to spread fake news. In recent times, fake news has been used to manipulate stock markets, make people choose dangerous health-care options, and manipulate elections, including the 2016 US presidential election. [summary][NYT Article]

Attention Engineering - From Facebook notifications to Snapstreaks to YouTube auto-plays, they're all competing for one thing: your attention. Companies prey on our psychology for their profit.

Social Media Propaganda - The Military is studying and using data-driven social media propaganda to manipulate news feeds to change the perceptions of military actions. [Guardian article]

Convincing Lies - As Large Language Models (LLMs) like ChatGPT get more articulate and convincing, they will mislead people or simply lull them into misplaced trust by making up facts. This is concerning as LLMs are slowly replacing search engines and have been tested as medical chatbots, where one told a mock patient to kill themselves. Meta's Galactica was supposed to help scientists write academic articles. Instead, it mindlessly spat out biased and incorrect nonsense and survived only three days.

Surveillance

Anyvision Facial Recognition - Facial recognition software, previously funded by Microsoft, which has become infamous for its use by the Israeli government to surveil, track, and identify those living under military occupation throughout the West Bank. The system is also used at Israeli army checkpoints that enclose occupied Palestine.

Clearview.ai - Clearview AI built a facial recognition database of billions of people by scanning their social media profiles. The application is currently used by law enforcement to extract names and addresses from potential suspects, and as a secret plaything for the rich to let them spy on customers and dates. Clearview AI is developed by far-right employees.

Predicting Mass Protests - The US Pentagon funds and uses technologies such as social media surveillance and satellite imagery to forecast civil disobedience and infer location of protesters via their social networks around the world. There are indications that this technology is increasingly used to target Anti-Trump protests, leftwing groups and activists of color.

Gait Analysis - Your gait is highly complex, essentially unique, and hard, if not impossible, to mask in this era of CCTV. It only needs to be recorded once and associated with your identity for you to be tracked in real time. In China this kind of surveillance is already deployed, and in the West multiple people have been convicted on their gait alone. We can no longer stay even modestly anonymous in public.

SenseTime & Megvii - Based on face recognition technology powered by deep learning, SenseTime's SenseFace and Megvii provide integrated intelligent video analysis solutions for target surveillance, trajectory analysis, and population management. The technology has advanced to detect the faces of people wearing masks. [summary][forbes][The Economist (video)]

Uber - Uber's "God View" let Uber employees see all of the Ubers in a city and the silhouettes of waiting Uber users who have flagged cars - including names. The data collected by Uber was then used by its researchers to analyze private intent such as meeting up with a sexual partner. [rides of glory]

Palantir - A billion-dollar startup that focuses on predictive policing, intelligence, and AI-powered military defense systems. [summary]

Censorship - WeChat, a messaging app used by millions of people in China, uses automatic analysis to censor text and images within private messaging in real-time. Using optical character recognition, the images are examined for harmful content — including anything about international or domestic politics deemed undesirable by the Chinese Communist Party. It’s a self-reinforcing system that’s growing with every image sent. [research summary]

Social credit systems

Social Credit System - Using a secret algorithm, Sesame credit constantly scores people from 350 to 950, and its ratings are based on factors including considerations of “interpersonal relationships” and consumer habits. [summary][Foreign Correspondent (video)][travel ban]

Health Insurance Credit System - Health insurance companies such as Vitality offer deals based on access to data from fitness trackers. However, they can also charge more, and even remove access to important medical devices, if patients are deemed non-compliant, exposing them to unfair pricing. [ProPublica]

Misleading platforms, and scams

Misleading Show Robots - Show robots such as Sophia are being used as a platform to falsely represent the current state of AI and to actively deceive the public into believing that current AI has human-like intelligence or is very close to it. This is especially harmful as it appeared on the world's leading forum for international security policy. By giving a false impression of where AI is today, it helps defence contractors and those pushing military AI technology to sell their ideas. [Criticism by LeCun]

Zach - an AI, developed by the Terrible Foundation, claimed to write better reports than medical doctors. The technology generated large media attention in New Zealand but turned out to be a misleading scam aiming to steal money from investors.

Accelerating the climate emergency

Increase fossil fuel production - As the oil and gas industry confronts the end of the oil age and deteriorating earnings, major oil corporations such as Shell, BP, Chevron, ExxonMobil and others have turned to tech companies and artificial intelligence to find and extract more oil and gas, reduce production costs, and extend global warming. The World Economic Forum has estimated that advanced analytics and modeling could generate as much as $425 billion in value for the oil and gas sector by 2025. AI technologies could boost production levels by as much as 5%. [Video]

Overestimate carbon credits - Forest carbon credits are bought by emitters to get to net zero. Over-issuing carbon credits has a devastating effect, allowing emitters to emit more than legally allowed. This is already happening at a systematic level: CarbonPlan found that 29% of the offsets it analyzed were over-credited, totaling an additional 30 million tCO₂e. Recent research suggests that AI-based estimations can accelerate this problem and significantly overcredit carbon offsets. [Technical Report][map]

Autonomous weapon systems and military

Lethal autonomous weapons systems - Autonomous weapons locate, select, and engage targets without human intervention. They include, for example, armed quadcopters (video) that can search for and eliminate enemy combatants in a city using facial recognition. [NY Times (video)]

Known autonomous weapons projects:

  • Automated machine gun - The Kalashnikov group presented an automatic weapon control station using AI that provides the operator with automatic recognition and target illumination and automatic tracking of ground, air and sea targets. Samsung developed and deployed SGR-A1, a robot sentry gun, which uses voice recognition and tracking.
  • Armed UAVs - Ziyan UAV develops armed autonomous drones with light machine guns and explosives that can act in swarms.
  • Autonomous Tanks - Uran-9 is an autonomous tank, developed by Russia, that was tested in the Syrian Civil War.
  • Robot dogs with guns - Ghost Robotics equips robotic dogs with lethal weapons. The SPUR gun is designed to be fitted onto a variety of robotic platforms and is unmanned.

Known incidents:

'Machine-gun with AI' used to kill Iranian scientist - A machine-gun mounted on a Nissan pick-up was equipped with an intelligent satellite system that zoomed in on an Iranian scientist. Controlled through artificial intelligence, the machine-gun shot the scientist without hurting his wife, who was only 25 cm away.


Awful AI Award

Every year this section gives out the Awful AI award for the most unethical research or event happening within the scientific community and beyond. Congratulations to AI researchers, companies and media for missing ethical guidelines - and failing to provide moral leadership.

Winner 2022: Commercial AI Image Generators

'Awful data stealing' 🥇

Laudation:

Congratulations to commercial AI image generators such as DALL·E-2, Midjourney, Lensa, and others for unethically stealing from artists without their consent, making a profit out of models that have been trained on their art without compensating them, and automating artists' work and putting them out of business. A special shoutout goes to OpenAI and Midjourney for keeping their training databases of stolen artworks secret 👏

Winner 2021: FastCompany & Checkr

'Awful media reporting' 🥇

Laudation:

Congratulations to FastCompany for awarding Checkr, a highly controversial automated background check company, the World Changing Ideas Award for "fair" hiring. Instead of slow fingerprint-based background checks, Checkr uses several machine learning models to gather reports from public records, which will contain bias and mistakes. Dozens of lawsuits have been filed against Checkr since 2014 for erroneous information. Despite these ongoing controversies, we congratulate FastCompany on the audacity of turning the narrative around and awarding Checkr its prize for "ethical" and "fair" AI use 👏

Winner 2020: Google Research & the AI Twitter Community

'Awful role model award' 🥇

Laudation:

Congratulations to Google Research for sending an awful signal by firing Dr. Timnit Gebru, one of very few Black women Research Scientists at the company, from her position as Co-Lead of Ethical AI after a dispute over her research, which focused on examining the environmental and ethical implications of large-scale AI language models 👏

Congratulations to the AI Twitter community for its increasing efforts at creating a space of unsafe dialogue and toxic behaviour that mobbed out many AI researchers such as Anima Anandkumar (who led the renaming of NIPS's controversial acronym to NeurIPS) 👏

Winner 2019: NeurIPS Conference

'Scary research award' 🥇

Laudation:

Congratulations to NeurIPS 2019, one of the world's top venues for AI research, and its reviewers for accepting unethical papers into the conference. Some examples are listed below 👏

Face Reconstruction from Voice using Generative Adversarial Networks - This paper addresses the challenge to reconstruct someone's face from their voice. Given an audio clip spoken by an unseen person, the proposed algorithm pictures a face that has as many common elements, or associations as possible with the speaker, in terms of identity. The model can generate faces that match several biometric characteristics of the speaker and results in matching accuracies that are much better than chance. [code] Category: Surveillance

Predicting the Politics of an Image Using Webly Supervised Data - This paper collects a dataset of over one million unique images and associated news articles from left- and right-leaning news sources, and develops a method to predict and adjust the image's political leaning, outperforming strong baselines. Category: Discrimination

Update (2020): NeurIPS 2020 has since implemented ethical reviews that flag and reject unethical papers.

Contestational research

Research to create a less awful and more privacy-preserving AI

Differential Privacy - A formal definition of privacy that allows us to make theoretical guarantees on data breaches. AI algorithms can be trained to be differentially private. [original paper]
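
As a concrete illustration of the idea (a minimal sketch by this list's editors, not code from the cited paper), the classic Laplace mechanism makes a counting query ε-differentially private: a count has sensitivity 1, so adding Laplace noise with scale 1/ε bounds what any single person's presence in the data can reveal.

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. Exponential(1/scale) draws is Laplace(0, scale).
    return random.expovariate(1 / scale) - random.expovariate(1 / scale)

def dp_count(records, predicate, epsilon):
    """Release a count with epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one person
    changes the true count by at most 1), so Laplace(1/epsilon) noise suffices.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1 / epsilon)

# Toy data (illustrative): how many people are 40 or older?
ages = [23, 35, 47, 52, 61, 19, 44]
noisy_answer = dp_count(ages, lambda a: a >= 40, epsilon=0.5)
```

Smaller ε means stronger privacy and noisier answers; a real deployment would also track the cumulative privacy budget spent across queries.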

Privacy-Preservation using Trusted Hardware - AI algorithms that can run inside trusted hardware enclaves (or private blockchains that build upon it) and train without any shareholder having access to private data.

Privacy-Preservation using Secure Computation - Using secure computation techniques like secret sharing, Yao's garbled circuits, or homomorphic encryption to train and deploy private machine learning models on private data using existing machine learning frameworks.
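
To make the secret-sharing idea concrete, here is a minimal sketch of additive secret sharing over a finite field (a toy illustration with assumed parameters, not a production protocol): each party holds one random-looking share, any incomplete subset of shares reveals nothing about the secret, and parties can add shared values without ever seeing them.

```python
import random

PRIME = 2**61 - 1  # toy field modulus; a real protocol fixes this by specification

def share(secret, n_parties=3):
    # Split a secret into n additive shares that sum to it modulo PRIME.
    shares = [random.randrange(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares):
    # Only the sum of ALL shares recovers the secret.
    return sum(shares) % PRIME

def add_shared(a_shares, b_shares):
    # Each party adds its own two shares locally; no secret is ever revealed.
    return [(a + b) % PRIME for a, b in zip(a_shares, b_shares)]
```

Multiplication of shared values needs extra machinery (e.g. Beaver triples), which is where most of the engineering in real secure-computation frameworks goes.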

Fair Machine Learning & Algorithm Bias - A subfield in AI that investigates different fairness criteria and algorithm bias. A recent best paper (in ICLR18), e.g. shows that implementing specific criteria can have a delayed impact on fairness.

Adversarial Machine Learning - Adversarial examples are inputs, which cause the model to make a mistake. Research in adversarial defences includes but is not limited to adversarial training, distillation and Defense-GAN.
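
The kind of attack being defended against fits in a few lines. Below is a toy version of the fast gradient sign method (FGSM) applied to a hand-written logistic-regression model (the model weights and inputs are illustrative assumptions): the input is nudged in the direction that increases the loss, which can flip the prediction while barely changing the input.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    # Probability of the positive class under logistic regression.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(x, w, b, y_true, eps):
    """One FGSM step. For logistic regression with cross-entropy loss,
    d(loss)/dx_i = (p - y) * w_i, so we step eps in the sign of that gradient."""
    p = predict(x, w, b)
    grad = [(p - y_true) * wi for wi in w]
    return [xi + eps * math.copysign(1.0, g) for xi, g in zip(x, grad)]

w, b = [2.0, -3.0], 0.0
x = [1.0, 0.0]                      # correctly classified as positive (p ~ 0.88)
x_adv = fgsm(x, w, b, y_true=1.0, eps=0.5)   # misclassified (p ~ 0.38)
```

Adversarial training, mentioned above, augments the training set with exactly such perturbed examples so the model learns to resist them.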

Towards Truthful Language Models - Language models like GPT-3 are useful for many different tasks, but have a tendency to “hallucinate” information when performing tasks requiring obscure real-world knowledge. OpenAI has a research model that is required to cite its sources, allowing humans to evaluate factual accuracy by checking whether a claim is supported by a reliable source.

Contestational tech projects

These open-source projects try to spur discourse, offer protection or awareness to awful AI

Have I Been Trained - With HaveIBeenTrained, artists can search the databases on which large image generation models like Stable Diffusion have been trained for links to their work and flag them for removal. Spawning (the creator of HaveIBeenTrained) partners with Laion, which built these datasets, to remove those links. This helps ensure that future models will not be trained on work that has been opted out.

BLM Privacy & Anonymous Camera - AI facial recognition models can recognize blurred faces and are used by authorities to arrest protesters. BLM Privacy and Anonymous Camera try to discourage attempts to recognize or reconstruct pixelated faces by masking people with an opaque mask. [code][BLM privacy]

AdNauseam - AdNauseam is a lightweight browser extension to fight back against tracking by advertising networks. It works like an ad-blocker (it is built atop uBlock-Origin) to silently simulate clicks on each blocked ad, confusing trackers as to one's real interests. [code]

Snopes.com - Founded by David Mikkelson in 1994, Snopes.com has since grown into the oldest and largest fact-checking site on the Internet, widely regarded by journalists, folklorists, and laypersons alike as one of the world's essential resources.

Facebook Container - Facebook Container isolates your Facebook activity from the rest of your web activity to prevent Facebook from tracking you outside of the Facebook website via third-party cookies. [code]

TrackMeNot - TrackMeNot is a browser extension (Chrome, Firefox) that helps protect your online searches by creating fake search queries. This creates noise in data that makes it harder to track and profile user behaviour. [code]

Center for Democracy & Technology - Digital Decisions is an interactive graphic that helps you ask the right questions when designing/implementing or building a new algorithm.

TensorFlow KnowYourData - A platform to help researchers, engineers, product teams, and decision makers understand 70+ datasets with the goal of improving data quality, and helping mitigate fairness and bias issues.

Model and dataset cards - Model and dataset cards encourage transparent reporting for ML models and datasets. They are short documents accompanying ML models or datasets that provide benchmarked evaluation in a variety of conditions, such as across different cultural, demographic, or phenotypic groups (e.g., race, geographic location, sex, Fitzpatrick skin type) and intersectional groups (e.g., age and race, or sex and Fitzpatrick skin type). They also disclose the context and limits in which datasets and models are intended to be used, details of the performance evaluation procedures, and other relevant information. [paper][blog]

Evil AI Cartoons - Educate and stimulate discussion about the societal impacts of Artificial Intelligence through the cartoon/comics medium. Each cartoon is accompanied by a brief blog post that provides more context and useful pointers to further reading.

License

CC0

To the extent possible under law, David Dao has waived all copyright and related or neighbouring rights to this work.
