Lawmaker Asks DHS Secretary to Reassess Facial Recognition Programs After NIST Report

A National Institute of Standards and Technology report found that age, race and gender can affect the accuracy of some commonly used facial recognition algorithms.

The House Homeland Security Committee chairman urged the Homeland Security secretary to investigate whether the department should pause and assess its facial recognition operations after a National Institute of Standards and Technology report found potential for bias in some leading facial recognition algorithms. 

Rep. Bennie Thompson, D-Miss., penned a letter Friday raising concerns over NIST’s “troubling report,” which he noted captured inaccuracies across demographic groups of algorithms “like those [Homeland Security] uses for facial recognition.”

“The results of this study are shocking,” Thompson wrote. “Given the disparities found by NIST, DHS should conduct an immediate assessment of whether to halt current facial recognition operations and plans for future expansion until such disparities can be fully addressed.”

Unveiled Thursday, the study is the third in a series of reports NIST is producing as part of its ongoing facial recognition vendor tests. In this case, researchers evaluated 189 software algorithms from 99 developers using four large datasets encompassing around 18 million photos of roughly 8 million people. The images were collected in U.S. government applications and provided by the State Department, the Homeland Security Department and the FBI; they consisted of mugshots, visa photos, images taken at border crossings and others.

The research aimed to assess how well each algorithm performs two of the most common facial recognition tasks: one-to-one matching, which verifies that a photo matches a different photo of the same person in a database, and one-to-many matching, which determines whether a photo matches any record in a larger database. To evaluate each algorithm’s performance, researchers measured the two classes of error the software can make: false positives and false negatives.
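The two matching tasks and two error classes can be sketched in code. The embeddings, similarity threshold and matching logic below are illustrative assumptions for readers unfamiliar with the terms, not NIST's actual test methodology:

```python
import numpy as np

# Hypothetical setup: each face photo is reduced to an embedding vector,
# and two photos "match" when their cosine similarity clears a threshold.
# The threshold and vectors are invented for illustration.
THRESHOLD = 0.8

def similarity(a, b):
    """Cosine similarity between two face-embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def one_to_one(probe, reference):
    """Verification: does the probe photo show the same person as the reference?"""
    return similarity(probe, reference) >= THRESHOLD

def one_to_many(probe, gallery):
    """Identification: indices of all gallery records that match the probe."""
    return [i for i, ref in enumerate(gallery) if similarity(probe, ref) >= THRESHOLD]

# A false positive is two different people judged to match;
# a false negative is two photos of the same person judged not to match.
alice_visa   = np.array([1.0, 0.1, 0.0])
alice_border = np.array([0.9, 0.2, 0.1])
bob_mugshot  = np.array([0.0, 1.0, 0.2])

assert one_to_one(alice_visa, alice_border)     # true match accepted
assert not one_to_one(alice_visa, bob_mugshot)  # non-match rejected
```

In this toy setup, a one-to-many search of a gallery containing both reference photos would return only the index of the matching record.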

Patrick Grother, a NIST computer scientist and the report’s primary author, explained in a statement that in the study, researchers “found empirical evidence” that features such as age, race and gender affected performance and accuracy “in the majority of the face recognition algorithms we studied.” 

For instance, in one-to-one matching, researchers observed higher rates of false positives—meaning the software wrongly judged photos of two different individuals to show the same person—for Asian and African American faces relative to white faces. The team noted that in this case, Asian and African American people were 10 to 100 times more likely to be misidentified than white people. For one-to-many matching, the study indicated higher rates of false positives for black women. Women were more likely to be misidentified than men, and relatively young and relatively old people were more likely to be misidentified than middle-aged people, according to the NIST research. 

“While we do not explore what might cause these differentials, this data will be valuable to policymakers, developers and end-users in thinking about the limitations and appropriate use of these algorithms,” Grother said. 

In the letter, Thompson reiterated that Customs and Border Protection has already deployed facial recognition to verify the identities of passengers entering or exiting the U.S. at 26 major domestic airports and plans to expand its screening operations using the technology going forward. He also noted that in July, CBP’s Deputy Executive Assistant Commissioner for Field Operations John Wagner testified to Congress that internal Homeland Security tests validated the accuracy of the facial recognition systems the agency and its components use.

“The results of NIST’s study raise serious questions as to how DHS’s internal reviews could have missed such drastic disparities apparently inherent to these technologies,” Thompson wrote. “DHS must explain to the Committee and the American public how it failed to identify such troubling disparities prior to deploying these technologies.”