Facial Recognition Algorithms Struggle to Detect Faces Under Masks, NIST Study Finds

A new study from the National Institute of Standards and Technology found facial recognition algorithms developed pre-pandemic struggle to identify masked faces.

Unlocking devices just isn’t the same during the COVID-19 pandemic. With millions of Americans forced to wear face masks to prevent the spread of coronavirus, facial recognition software like the biometric readers that allow users to access their smartphones has become less effective, according to a new National Institute of Standards and Technology report.

The study found that facial recognition algorithms developed prior to the pandemic had error rates as high as 50% when attempting to identify people wearing masks. Released Monday, the report is the first in a series of NIST studies investigating how face masks affect facial recognition software. 

“With the arrival of the pandemic, we need to understand how face recognition technology deals with masked faces,” Mei Ngan, a NIST computer scientist and an author of the report, said in a statement announcing the study. “We have begun by focusing on how an algorithm developed before the pandemic might be affected by subjects wearing face masks.”

Ngan said NIST plans to test algorithms designed with face masks in mind later this summer.

The study, which was conducted in collaboration with the Department of Homeland Security’s Science and Technology Directorate, Office of Biometric Identity Management, and Customs and Border Protection, used two large datasets of photographs currently in government use to test the effectiveness of 89 algorithms. 

One dataset contained unmasked photos from applications for immigrant benefits. The second contained photos taken at U.S. border crossings. Researchers digitally added masks of various shapes and sizes to the second dataset, then used the algorithms to try to match these masked photos to the first set of unmasked application photos.

This setup was meant to mimic real-life applications of facial recognition technology. Mask-wearing individuals need to be able to authenticate their identity against unmasked passport or visa photos.

The best facial recognition algorithms on the market usually miss a true match, meaning they fail to find a match when there is one, about 0.3% of the time. When a digital mask was applied, these same algorithms missed a true match about 5% of the time. Lower quality algorithms had even worse rates, failing to match the photos between 20% and 50% of the time.
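
For readers who want to see how a miss rate of this kind is calculated, here is a minimal sketch in Python. It assumes a matcher that returns a similarity score for each same-person photo pair and declares a match when the score reaches a cutoff. The scores, the cutoff and the function name are invented for illustration and are not NIST data or NIST's code. Only the last line uses figures from the report.

```python
def false_non_match_rate(genuine_scores, threshold):
    """Share of same-person comparisons that score below the match threshold."""
    if not genuine_scores:
        raise ValueError("need at least one genuine comparison score")
    misses = sum(1 for score in genuine_scores if score < threshold)
    return misses / len(genuine_scores)


# Hypothetical similarity scores for ten same-person photo pairs (not NIST data).
unmasked_scores = [0.91, 0.88, 0.95, 0.79, 0.93, 0.86, 0.90, 0.97, 0.84, 0.92]
masked_scores = [0.71, 0.55, 0.82, 0.48, 0.77, 0.62, 0.69, 0.85, 0.51, 0.74]
THRESHOLD = 0.60  # assumed cutoff; real systems tune this value

fnmr_unmasked = false_non_match_rate(unmasked_scores, THRESHOLD)
fnmr_masked = false_non_match_rate(masked_scores, THRESHOLD)

# Rates reported in the article for the best algorithms: 0.3% unmasked, 5% masked.
reported_ratio = 0.05 / 0.003  # roughly a 17-fold increase in missed matches
```

In this toy data, every unmasked pair clears the cutoff while three of the ten masked pairs fall below it, which is the pattern the study describes on a much larger scale.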

On top of the higher false negative rates, making a second attempt at identification may not help the algorithms. Sometimes users adjust the way they are standing or the expression on their face, and algorithms are able to positively match them with their identity photo. But the NIST study said this might not work for people wearing masks if “the failure is a systematic property of the algorithm.” 
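
The difference between a chance failure and a systematic one can be shown with a little arithmetic. The Python sketch below compares two idealized extremes: attempts that fail independently of one another, and a failure that repeats every time for the same person and mask. Both cases are simplifying assumptions made for illustration, not a model taken from the NIST report, and the function name is hypothetical.

```python
def failure_after_retries(p_fail, attempts, systematic):
    """Chance that every attempt fails, under two idealized assumptions.

    systematic=False: each attempt fails independently with probability p_fail.
    systematic=True: the same failure repeats on every attempt, so retrying
    does not change the outcome.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    if systematic:
        return p_fail
    return p_fail ** attempts


# 5% is the masked miss rate the article cites for the best algorithms.
independent = failure_after_retries(0.05, attempts=2, systematic=False)
repeated = failure_after_retries(0.05, attempts=2, systematic=True)
```

Under the independence assumption, a second try cuts the chance of being locked out from 5% to about 0.25%. If the failure is systematic, it stays at 5% no matter how many times the person tries.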

Researchers also found the shape and color of masks matter for accuracy. The study used a set of nine digitally created masks that varied in how much of the face was covered, from smaller round masks to wide masks covering the entire bottom half of the face and nose.

“Unsurprisingly masks that occlude more of the face give larger false nonmatch rates,” the study reads.

When noses were covered, according to the study, algorithms struggled most. This is important because research suggests keeping noses covered is critical to preventing the spread of COVID-19. 

While the digitally created masks offer a number of advantages, masks in the real world come in a nearly endless variety of shapes, styles and colors. And people wear them in many different ways, so the study was somewhat limited in that respect. 

The study also did not take into account people wearing glasses and it did not include an analysis of how various demographic groups may be affected differently when interacting with facial recognition software while wearing masks. 

A NIST study published last year found that even without masks, facial recognition software isn’t as good at identifying the faces of people of color as it is at identifying the faces of white people. For domestic law enforcement images used in that study, Native American, African American and Asian populations all had elevated false positive rates.