Many federal efforts that arguably use data mining might be flying under the radar because the law requiring agencies to report on such activities applies a very narrow definition of the practice, according to a Constitution Project report released on Tuesday. Broadening that definition was among several policy recommendations the nonprofit organization made in the report.
Data mining, as currently defined, is the technique of performing searches to uncover patterns that could identify individuals involved in terrorist or criminal activities. Agencies need not include subject-based searches, where officials scan data for content that meets specific parameters, in the annual reports mandated by the 2007 Data Mining Reporting Act.
The 2009 report from the Office of the Director of National Intelligence, for instance, did not include many of ODNI's counterterrorism efforts that use link analysis tools. As a result, it lacked analyses of the tools' effectiveness and descriptions of the technologies in use, elements agencies are required to disclose for data mining activities that meet the law's definition.
Unlike pattern-identification tools, link analysis programs start with a known person of interest and then trace connections between that subject and potential associates.
"We recommend expanding the definition of data mining programs under the act to reflect the broader definition applied in this report, and thereby require reporting on a greater number of programs," the Constitution Project study stated. The project's senior counsel, Sharon Bradford Franklin, acknowledged that Congress would have to mandate such a change, which likely would take time.
The federal government has applied data mining tools to detect tax fraud, as well as to investigate potential misuse of economic stimulus funds. The technique has broad security applications, but critics are concerned the collection and retention of data for mining might violate privacy, due process and free speech rights.
Homeland Security Department Chief Privacy Officer Mary Ellen Callahan discussed the report during a briefing on Tuesday with other information policy experts, but was not at liberty to endorse or oppose any of the suggestions. Regarding the meaning of data mining, she said, "My reservation would be -- to expand the definition of data mining to cover all activities [would extend it] to essentially every time you set up a database."
Three DHS systems meet the legal definition of data mining, according to the department's 2009 report, which came out last December. The 2010 document is set to be distributed shortly and will not identify any new programs, Callahan said. The three programs covered are the Automated Targeting System, which checks traveler and cargo information against intelligence and other enforcement data to prevent terrorists and weapons from entering the country; the Data Analysis and Research for Trade Transparency System, which generates leads for investigations into trade-based money laundering, contraband smuggling and other import-export crimes; and the Freight Assessment System, which identifies parcels that could pose a heightened risk to passenger aircraft.
Callahan said data mining activities do not include queries into E-Verify, a DHS system that checks the immigration status of workers by matching information that employers enter against DHS, State Department and Social Security Administration databases.
She said transparency into data mining activities has helped rein in potentially intrusive projects.
"When people have a program that I think is out of line, I say, 'OK, do you want that in the data mining report?' So I think disclosure is an important tool," Callahan said.