In science fiction, facial recognition technology is a hallmark of a dystopian society. The truth of how it was created, and how it’s used today, is just as freaky.
In a new study, researchers conduct a historical survey of more than 100 datasets, compiled over the last 43 years, that were used to train facial recognition systems. The broadest revelation is that, as the need for more data (i.e., photos) increased, researchers stopped bothering to ask for the consent of the people in the photos they used as data.
Researchers Deborah Raji of Mozilla and Genevieve Fried of AI Now published the study on Cornell University’s free distribution service, arXiv.org. The MIT Technology Review published its analysis of the paper Friday, describing it as “the largest ever study of facial-recognition data” that “shows how much the rise of deep learning has fueled a loss of privacy.”
The study charts the evolution of facial recognition datasets, and along the way it surfaces revealing moments and facts from the technology's development. They show that facial recognition is a flawed technology when applied to real-world scenarios, one created with the express purpose of expanding the surveillance state, with the effect of degrading our privacy.
Here are 9 scary and surprising takeaways from 43 years of facial recognition research.
1. The gulf between how well facial recognition performs in academic settings vs. real-world applications is vast.
One of the reasons the researchers give for undertaking their study is to understand why facial recognition systems that perform at near 100 percent accuracy in testing are deeply flawed when they’re applied in the real world. For example, they say, New York City’s MTA halted a facial recognition pilot after it had a 100 percent error rate. Facial recognition, which has been proven to be less accurate on Black and brown faces, recently led to the arrest of three Black men who were incorrectly identified by the tech.
2. The Department of Defense is responsible for the original boom in this technology.
Though efforts to develop facial recognition began in academic settings, it took off in 1996 when the DoD and National Institute of Standards and Technology (NIST) allocated $6.5 million to create the largest dataset to date. The government got interested in this area because of its potential for surveillance that did not require people to actively participate, unlike fingerprinting.
3. The early photos used to create facial recognition data came from portrait sessions, which enabled big flaws.
It seems almost quaint, but before the mid-2000s, the way researchers amassed databases was by having people sit for portrait sessions. Because some of the foundational facial recognition tech today came from these datasets, the flaws of the portrait technique still resonate: namely, a non-diverse set of participants and staged settings that don’t accurately reflect real-world conditions.
4. When portrait sessions weren’t enough, researchers just started scraping Google — and stopped asking for consent.
Yep, when researchers wanted to expand datasets beyond portraits, this is literally what happened. A 2007 dataset called Labeled Faces in the Wild was built from photos scraped from Google, Flickr, YouTube, and other online repositories. That included photos of children. While this led to a greater variety of photos, it also discarded the privacy rights of the subjects.
“In exchange for more realistic and diverse datasets, there was also a loss of control, as it became unmanageable to obtain subject consent, record demographic distributions, maintain dataset quality and standardize attributes such as image resolution across Internet-sourced datasets,” the paper reads.
5. The next boom in facial recognition came from Facebook.
The researchers cite a turning point in facial recognition when Facebook revealed the creation of its DeepFace system in 2014. Facebook showed how a collection of millions of photos could be used to train neural networks that were far better at facial recognition tasks than previous systems, making deep learning a cornerstone of modern facial recognition.
6. Surprise surprise, Facebook’s massive facial recognition undertaking violated users’ privacy.
Facebook has since been fined by the FTC and paid a settlement to the state of Illinois for using the photos users uploaded to Facebook to enable its facial recognition without getting users’ affirmative consent. The way DeepFace manifested was through “Tag Suggestions,” a feature that was able to suggest the person in your photo you might want to tag. Accepting or rejecting tags in turn made Facebook’s systems smarter. Tag Suggestions were opt-out, which meant participating in this technology was the default.
7. Facial recognition has been trained on the faces of 17.7 million people — and that’s just in the public datasets.
In reality, we don’t know the number or identity of people whose photos made them unwitting participants in the development of facial recognition tech.
8. Automation in facial recognition has led to offensive labeling systems and unequal representation.
Facial recognition systems have evolved beyond identifying a face or a person. They can also label people and their attributes in offensive ways.
“These labels include the problematic and potentially insulting labels regarding size – ‘chubby’, ‘double chin’ – or inappropriate racial characteristics such as ‘Pale skin,’ ‘Pointy nose,’ ‘Narrow eyes’ for Asian subjects and ‘Big nose’ and ‘Big lips’ for many Black subjects,” the paper reads. “Additionally there is the bizarre inclusion of concepts, such as ‘bags under eyes,’ ‘5 o’clock shadow’ and objectively impossible labels to consistently define, such as ‘attractive.’”
Faces considered “western” became the default in training sets. And other datasets expressly created to increase diversity were problematic themselves: One such dataset’s purpose was to “train unbiased and discrimination-aware face recognition algorithms,” but the researchers point out it “divide[d] human ethnic origins into only three categories.”
These faults go beyond just being offensive. Research has shown that discrimination in AI can reinforce discrimination in the real world.
9. The applications of facial recognition tech today range from government surveillance to ad targeting.
Facial recognition has both stayed true to its roots and expanded beyond what its creators in the 1970s could possibly have imagined.
“We can see from the historical context that the government promoted and supported this technology from the start for the purpose of enabling criminal investigation and surveillance,” the authors write. For example, Amazon has already sold its problematic Rekognition tech to an untold number of police departments.
On the other end of the spectrum, some training sets promise that they can help develop systems to analyze the sentiment of shoppers and better track and understand potential customers.
Which is more dystopian: The surveillance state or an all-knowing capitalist advertising machine? You decide.