Sensitive Data Must Be Protected. That Doesn’t Mean It Can’t Also Be Used.


Two promising tools may allow agencies to share data while maintaining privacy and security requirements.

Earlier this year, the Health and Human Services Department announced a multimillion-dollar contract for artificial intelligence tools, including machine learning, natural language processing, and more. It was just the latest example of a government agency making a significant investment in AI and leveraging big data to drive actionable insights that can guide future decision making.

That agreement was also another sign that data has become one of the hottest commodities in federal IT. It’s a centerpiece of the President’s Management Agenda, which calls for “leveraging data as a strategic asset.” It’s the focus of the Office of Management and Budget’s Federal Data Strategy. And it’s the nutrient that nourishes AI models and algorithms, which help IT professionals and data scientists extract information from massive troves of data they have at their disposal.

The data set challenge: Trust no one?

But while data is the foundation of any successful AI initiative, identifying trusted and verifiable data sets can be an enormous challenge. Agencies must ensure that the data they’re using is clean and reliable. If it isn’t, the results derived from the AI will be inaccurate—the old “garbage in, garbage out” principle. 

Data that has been cultivated within other government organizations is, ideally, trustworthy, but it may also be considered highly sensitive or proprietary. Many agencies may be understandably hesitant to share this data (if not outright prohibited from doing so), even with counterparts who may use it to further research that benefits the U.S. This is particularly true if the data in question includes personally identifiable or classified information.

How can agencies strike the delicate balance between using data to its fullest extent and meeting privacy and security requirements? Two promising data analysis techniques, homomorphic encryption (HE) and federated learning, may hold the answers. Let’s take a look at what these tools are, how they work, and what agencies need to know as they consider implementing them.

What is HE?

HE allows an AI algorithm to analyze encrypted data, letting data scientists glean valuable insights without having to decrypt sensitive data sets. Essentially, HE supports fundamental algebraic operations, such as addition and multiplication, directly on encrypted data; decrypting the result gives the same answer as running those operations on the unencrypted data.

Put another way, HE could be considered a form of x-ray vision that allows machines to “see” the underlying statistics within the encrypted data while still keeping that data private. This can be enormously beneficial for government agencies. They can gain valuable insights hidden within encrypted data sets without compromising the security or privacy of the information contained within.
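To make the idea of computing on encrypted data concrete, here is a minimal sketch in Python of the Paillier cryptosystem, a well-known scheme that supports addition on ciphertexts. It is an illustration only and is not taken from any agency system or from the tools discussed in this article: the primes are far too small to be secure, the function names and salary figures are hypothetical, and Paillier handles only addition, whereas the HE schemes used for machine learning support a richer set of operations.

```python
import math
import secrets

def generate_keypair(p, q):
    """Build a toy Paillier key pair from two distinct primes."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse; valid because we use g = n + 1
    return n, (lam, mu)   # public key n, private key (lam, mu)

def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under the public key n."""
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(n, private_key, ciphertext):
    """Recover the plaintext integer from a ciphertext."""
    lam, mu = private_key
    u = pow(ciphertext, lam, n * n)
    return ((u - 1) // n * mu) % n

def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts adds the hidden plaintexts."""
    return (c1 * c2) % (n * n)

# Toy primes for illustration only; real keys use primes of 1024+ bits.
n, private_key = generate_keypair(1009, 1013)

# Hypothetical sensitive values held by the data owner.
salaries = [52000, 61000, 48000]
encrypted = [encrypt(n, s) for s in salaries]

# An analyst who never sees the salaries sums the ciphertexts.
encrypted_total = encrypted[0]
for c in encrypted[1:]:
    encrypted_total = add_encrypted(n, encrypted_total, c)

# Only the key holder can read the result.
print(decrypt(n, private_key, encrypted_total))  # prints 161000
```

The analyst in this sketch works only with the public key and the ciphertexts; the individual salaries and the private key never leave the data owner.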

Once HE has opened the door to analyzing sensitive data sets without exposing them, agencies may feel more comfortable sharing encrypted data and collaborating with one another.

What is federated learning?

Yet their ability to share sensitive data may still be regulated or restricted. This is where federated learning comes into play.

Google introduced the concept of federated learning in 2017 as a means to allow mobile phones to collaboratively learn a shared prediction model while keeping a person’s private data on their device. The result was the popular Google keyboard prediction algorithm that “predicts” what a user is going to type while they enter a phrase. 

The approach offers significant potential benefits to public sector agencies. Organizations can gain insights from different data sources without having to move data from one siloed location to a centralized server. Essentially, the algorithm goes to the data, while the data itself stays put.
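The idea that the algorithm travels while the data stays put can be sketched with federated averaging, a common way to implement federated learning. The Python example below is a simplified illustration under stated assumptions, not the Google keyboard implementation: the two sites, their data points, the linear model, and the function names are all hypothetical. Each site trains on its own records and sends back only model parameters, which a coordinator averages in proportion to how much data each site holds.

```python
def local_update(weights, data, lr=0.05, epochs=20):
    """Run a few gradient-descent steps on one site's private data.

    The model is y = w * x + b; only (w, b) ever leaves the site.
    """
    w, b = weights
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / len(data)
        grad_b = sum((w * x + b - y) for x, y in data) / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(updates, sizes):
    """Combine site models, weighting each by its number of records."""
    total = sum(sizes)
    w = sum(u[0] * s for u, s in zip(updates, sizes)) / total
    b = sum(u[1] * s for u, s in zip(updates, sizes)) / total
    return (w, b)

# Hypothetical private data sets at two sites, both following y = 2x + 1.
site_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
site_b = [(3.0, 7.0), (4.0, 9.0)]
sites = [site_a, site_b]

global_model = (0.0, 0.0)
for _ in range(200):
    # Each site starts from the shared model and trains locally.
    updates = [local_update(global_model, data) for data in sites]
    # The coordinator sees only the returned parameters, never the records.
    global_model = federated_average(updates, [len(d) for d in sites])

print(global_model)  # approaches (2.0, 1.0)
```

Neither site could recover this line as reliably from its own few points as the combined model does, yet no raw record is ever transmitted. Production systems typically add protections such as secure aggregation on top of this basic loop.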

Teams working with data housed in different locations can still collaborate on deep learning projects without having to share their data. For instance, a group working at the Defense Logistics Agency can collaborate with colleagues working at the Defense Information Systems Agency on a joint DoD project—without their data having to leave their respective organizations. 

Keeping the data at rest and in the possession of the various teams protects the integrity of the data, reduces risk and improves security. Meanwhile, teams can continue to iterate and collaborate on their projects without fear of their sensitive data being compromised.

What do agencies need to do?

HE and federated learning are very compelling and powerful options for working with sensitive data, but there are some requirements that agencies will need to consider before implementation. 

First, it’s important to factor in the storage costs associated with HE. A good rule of thumb is to estimate the amount of storage that would be needed to store the data unencrypted—and then double it. Agencies that use HE may end up dealing with large-scale and highly complex data sets. They’ll want to make sure they’ve got the space to store these sets.

Agencies must also ensure that their compute resources are up to the task. Running analytics on HE-encrypted data is considerably more demanding than running them on unencrypted data, and it can introduce latency. Organizations that have already invested in high-performance computing will be set up well to handle these processes.

It’s also worth considering a flexible, open-source software stack with support for multiple machine learning and deep learning frameworks. Such stacks can create an abstraction layer that lets data scientists in different agencies work in their favorite deep learning framework and still create models that can be distributed to share information seamlessly, which is important for federated learning.

But the most important thing to know is that there are now ways to share and gain insights from data without compromising that data’s integrity. HE and federated learning provide agencies with the ability to protect data while also using it to its fullest extent.

Sean McPherson is a deep learning data scientist at Intel. 
