16 Agencies Create One Confidential Data Process to Rule Them All


In three years, creators of the Standard Application Process condensed 16 different agencies’ processes into one central portal for confidential data requests.

After three years of work spanning 16 federal agencies, more than 1,300 datasets and a cascade of privacy laws and interagency agreements underpinned by decades-old entrenched processes, a team of federal employees launched the Standard Application Process in December. The launch created, for the first time, a single portal through which U.S. researchers can request access to the mountains of confidential statistical data generated by the federal government.

The Foundations for Evidence-Based Policymaking Act of 2018—better known as the Evidence Act—requires federal agencies to generate, use and share more data to improve all aspects of society. That includes sensitive and confidential data that could be useful for researchers in and out of government but must be treated with utmost care and only shared with responsible parties.

The Evidence Act acknowledges this issue and calls for the creation of a standardized application process to better enable researchers to apply for access to confidential data held by the federal government’s 16 statistical agencies.

Each agency is charged with collecting important data on its niche topic and maintaining those datasets for the public benefit. However, while the data is generated in order to be used, the subjects of that data—real people with privacy to protect—must be kept confidential and anonymous.

“The purpose of the SAP is to try to streamline the process by which researchers and data users request access to confidential data,” said Heather Madray, program director for the Data Access, Confidentiality and Quality Assessment project based out of the National Science Foundation’s National Center for Science and Engineering Statistics.

“This isn’t public use data,” she said. “This is data that’s protected by various confidentiality laws.”

That work began in September 2019 at the Census Bureau, where Madray and a team launched a pilot with a basic online application and data catalog. Over the next few years, the team grew to include a working group with representatives from all 16 statistical agencies, including a governance board and a policy board.

By mid-2020, the group was building out a metadata catalog to categorize all types of data held by the agencies—an effort that would turn out to be just as significant as the SAP itself.

The full catalog was completed in August 2022 and the first full version of the application went live in December.

“Before the SAP, every agency had their own way—their own application, their own process for people to apply for data. So, if you wanted data from five different agencies, you had to go to five different websites—five different applications, five different processes—[to] try to figure it all out. It was quite cumbersome and quite difficult trying to navigate all that,” Madray told Nextgov.

“What this does is bring all the principal statistical agencies and units—so, 16 right now—together under one common application and one standard process,” she said.

While establishing the SAP was the main goal of the project, the data catalog has been a huge achievement and value-add, Madray said. With the catalog in place, requestors can look in one place to discover all the types of data they can get access to, rather than having to search through each agency’s repository separately.

As of the end of February, 13 agencies had uploaded 1,337 datasets and the associated metadata to allow a user to quickly find the right data. That inventory will grow as three more agencies work to get their datasets included in the catalog.

“The metadata inventory is huge,” Madray said. “Having all of that restricted use data—and all of the information about that data—available in one location, it’s much easier for users to find information on data, contact an agency if they need additional information: That’s really a game-changer, I think.”

But getting to this point took three years of dogged work involving lots of people, process and technology challenges.

The first major issue was synthesizing processes and requirements from 16 distinct federal agencies, each of which had developed its own way of doing things over the years.

“But the great thing about that was everybody coming together. We learned a lot about each other’s processes; we learned a lot about what the different challenges that different agencies face with their own research programs,” Madray said. “We had to make a lot of compromises but I think it built relationships.”

In developing a single approach, it was important to weed out bad processes that had developed over time at individual agencies. The team went “back to basics” to establish what was required by statute and then what supported administrative needs at each agency.

Finally, the team looked at the standing agreements—deals between agencies and with non-profits and academic institutions—that “had to be honored.” Sometimes, those conversations got uncomfortable, Madray said, but the team just had to “talk it through.”

“We all—every single one of us that was working on this—we had our security blankets, things we’d be doing for years that we had to think about, ‘Do we really need that or can we work in a different space?’” she said.

After getting through the people and process issues came the last part of any technology modernization project: the technology.

The first technological issue was realizing that the process could only be standardized up to a point. Each of the statistical agencies represents a different sector, different set of data and different potential use cases. While the final application needed to have a standardized functionality, it also needed some customization options.

The other major technical hurdle was building a system that could securely accept large document uploads. Because the data requested through this process is sensitive and confidential, the participating agencies need to review the methodology of the research the datasets will be used for.

“The first criteria: The use has to be statistical in nature,” Madray explained. “No law enforcement uses, no regulatory uses, no commercial uses.

“Then, the use of the data has to be allowed under the agency’s laws, their individual statute or any agreements they have in place to use the data,” she said. “For example, if they have a data sharing agreement with an agency, the use of the data has to be consistent with the terms of that data sharing agreement.”

The approving agency must make sure that any output from the studies, including charts and tables, doesn’t compromise the confidentiality of the real people represented by that data. This becomes even more important when data is being requested from multiple agencies, because combining sources can produce what’s known as the mosaic effect, in which real people can be reidentified by combining multiple sources of anonymous data.
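To make the mosaic effect concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two datasets, the field names and the choice of ZIP code, birth year and sex as linking fields are invented for illustration and are not drawn from any agency's data or from the SAP itself. It shows how two files that each lack names can still be joined on shared attributes, so that a person who is unique on those attributes ends up with a combined, more revealing record.

```python
from collections import Counter

# Hypothetical, made-up records. Neither file contains a name or ID.
health = [
    {"zip": "20001", "birth_year": 1980, "sex": "F", "diagnosis": "asthma"},
    {"zip": "20001", "birth_year": 1975, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "20002", "birth_year": 1980, "sex": "F", "diagnosis": "flu"},
    {"zip": "20002", "birth_year": 1980, "sex": "F", "diagnosis": "migraine"},
]
employment = [
    {"zip": "20001", "birth_year": 1980, "sex": "F", "occupation": "teacher"},
    {"zip": "20001", "birth_year": 1975, "sex": "M", "occupation": "nurse"},
    {"zip": "20002", "birth_year": 1980, "sex": "F", "occupation": "engineer"},
]

# Assumed linking fields (often called quasi-identifiers).
QUASI_IDS = ("zip", "birth_year", "sex")


def key(record):
    """Return the tuple of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDS)


def link_unique(left, right):
    """Join records whose quasi-identifier combination is unique in both files."""
    left_counts = Counter(key(r) for r in left)
    right_counts = Counter(key(r) for r in right)
    right_by_key = {key(r): r for r in right}
    linked = []
    for rec in left:
        k = key(rec)
        if left_counts[k] == 1 and right_counts.get(k) == 1:
            linked.append({**rec, **right_by_key[k]})
    return linked


linked = link_unique(health, employment)
for row in linked:
    print(row)
```

In this toy data, two of the four health records match exactly one employment record, so diagnosis and occupation end up side by side for those people. The two records that share the same ZIP code, birth year and sex are not linked, which is the intuition behind requiring that released outputs not single anyone out.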

From there, the agency has to validate that the research could not be done with publicly available data; that the research proposal is feasible; that the use of the data is consistent with and contributes to the agency’s mission; and that the work won’t jeopardize the public’s trust in the agency.

Some agencies are also required by law to ensure that the research work produces some public benefit, Madray added.
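The review criteria Madray describes can be read as a checklist. The sketch below is one hypothetical way to represent that checklist in Python. The class, field names and function are assumptions made for illustration, not the SAP's actual data model, and real determinations are made by agency reviewers rather than by boolean flags.

```python
from dataclasses import dataclass

# Hypothetical checklist mirroring the criteria described in the article.
CRITERIA = {
    "statistical_use": "Use is statistical (not law enforcement, regulatory or commercial)",
    "permitted_by_law_and_agreements": "Use is allowed under the agency's laws and agreements",
    "outputs_protect_confidentiality": "Outputs do not compromise confidentiality",
    "needs_restricted_data": "Research could not be done with publicly available data",
    "feasible": "Research proposal is feasible",
    "supports_agency_mission": "Use is consistent with and contributes to the agency's mission",
    "preserves_public_trust": "Work will not jeopardize public trust in the agency",
}


@dataclass
class Proposal:
    """A reviewer's yes/no findings for one application (illustrative only)."""
    statistical_use: bool
    permitted_by_law_and_agreements: bool
    outputs_protect_confidentiality: bool
    needs_restricted_data: bool
    feasible: bool
    supports_agency_mission: bool
    preserves_public_trust: bool


def unmet_criteria(proposal):
    """Return descriptions of every criterion the proposal fails."""
    return [
        description
        for name, description in CRITERIA.items()
        if not getattr(proposal, name)
    ]


example = Proposal(
    statistical_use=True,
    permitted_by_law_and_agreements=True,
    outputs_protect_confidentiality=True,
    needs_restricted_data=False,  # public data would suffice
    feasible=True,
    supports_agency_mission=True,
    preserves_public_trust=True,
)
print(unmet_criteria(example))
```

An agency with an additional statutory requirement, such as a public-benefit test, could add another entry to the checklist without changing the rest of the sketch.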

While each agency needs to go through each application using its own determination process, the SAP portal needed to be able to accommodate collection of all that relevant information.

With the initial SAP live and in use, Madray and the team are now looking to future capabilities, such as developing application programming interfaces, or APIs, to connect directly to individual agency systems. This will allow agencies to have a direct feed of incoming applications, rather than having to download them from the portal.

One of the big goals of future implementations will be to allow for a single application that requests data from multiple agencies.

“This is part technology, part working within the legal framework,” Madray said. “The ultimate down the road would be: Someone could go in and ask for data from five different agencies, they could link it and they could use it in one environment. We’re not quite there yet.”

Madray said the team has to work through conflicting statutes, security setups, data-sharing agreements and the like.

As it stands, the application can only process multi-agency requests if those agencies already have data sharing agreements in place.
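As a rough illustration of that constraint, the Python sketch below checks whether a multi-agency request could proceed. It assumes, purely for the example, that an agreement must exist between every pair of agencies named in the request. The agency names and the list of agreements are placeholders, and the article does not describe how the SAP actually evaluates this.

```python
from itertools import combinations

# Placeholder agreements between hypothetical agencies.
AGREEMENTS = {
    frozenset({"Agency A", "Agency B"}),
    frozenset({"Agency B", "Agency C"}),
}


def can_process(requested_agencies, agreements=AGREEMENTS):
    """Return True if every pair of requested agencies has an agreement.

    A request naming a single agency needs no agreement.
    The pairwise rule is an assumption made for this sketch.
    """
    agencies = set(requested_agencies)
    return all(
        frozenset(pair) in agreements
        for pair in combinations(sorted(agencies), 2)
    )


print(can_process(["Agency A", "Agency B"]))              # covered by an agreement
print(can_process(["Agency A", "Agency C"]))              # no agreement on file
print(can_process(["Agency A", "Agency B", "Agency C"]))  # A and C lack one
```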

Achieving a common framework for data requests would also allow the SAP to develop a platform on which the requestors could work with the data, allowing for more agency oversight and security.

“The multi-agency piece really is one of the Holy Grails,” Madray said.

Other capabilities in progress include a way to amend applications without having to withdraw and resubmit the request; an online appeal process so requestors can get more information when an application is denied; and establishment of “application windows” for agencies that are only able to accept applications at certain times of the year.

Further into the future, Madray said she hopes the program might include some identity and credential proofing, as well, to help agencies with the people side of the determination process.

For now, the team is soliciting feedback on the SAP and data catalog and providing metrics on whether the new process improves speed, ease of use and transparency.
