A NASA initiative aims to teach the public how to effectively use open-source tools and software to answer data-centered questions on their own.
America’s space agency captures massive volumes of data from remote platforms each day, and with plans to launch a couple of new satellites in 2022, its data holdings will likely soon grow by about 50 petabytes per year.
Over the last 18 to 24 months, NASA recognized a need to broaden the diversity of people who can use the global data it gathers to answer local questions, according to the chief science data officer within its science mission directorate.
“Public access to that information is critical, right? So it's public data, it's available to anybody for any purpose. And we do all sorts of things with it—we can discover new planets, we discover new activities on Mars, we look at the Earth in new and different ways and use it to help manage how we respond to climate change or natural disasters,” NASA’s Kevin Murphy said Tuesday during the AWS Summit in Washington, D.C. “And so actually having people be able to access that information is incredibly important, especially if they can put the processing next to that.”
One way to meet that goal is to teach the public how to effectively use open-source tools and software, whose source code is made freely available and can be modified, along with other resources, to answer data-centered questions on their own. So NASA is preparing to start a new initiative called Transform to Open Science, Murphy said, to “train the next generation of scientists and computer scientists to use [artificial intelligence, machine learning] and open source tools that are available to conduct scientific work across vast amounts of data pretty quickly.”
A workshop associated with this work, which is ultimately aimed at helping jumpstart the adoption of “open science” across NASA-related communities, is set for next month.
The agency has also been implementing DevOps approaches for a while now, Murphy noted. That’s proving useful in this context as it helps accelerate the movement of data in cloud environments and the compilation of diverse datasets, so scientists can gain more of a holistic systems perspective.
“It's not just looking at an ocean or land—it's how those two interact with the atmosphere,” Murphy said. “So putting those datasets together is really important.”
He added that while conducting scientific research is generally difficult, doing so in the mostly remote reality of the COVID-19 pandemic is even more so, especially for those who don’t have access to proper collaboration tools. Many such tools exist, and Murphy’s agency uses some, but not without limitations.
“We need to do better—not just at NASA, but I think across the government—to enable people to use those capabilities more efficiently,” he said. “I think another big thing is how you bring data from different agencies together, and synthesize information from that.”
Murphy was joined by experts from two other agencies for this discussion, which spanned many topics including experiences helping keep their organizations operating amid the ongoing pandemic.
The Veterans Affairs Department’s Enterprise Cloud Solutions Office Director Dave Catanoso said that since the novel coronavirus emerged, the agency has gone from enabling around 25,000 telehealth sessions a month to more than 45,000 per day. Because veterans have come to expect that level and type of engagement, Catanoso said VA officials don’t see that heightened telehealth usage diminishing in the future.
Small Business Administration Chief Information Officer Keith Bluestein also detailed how, near the start of the pandemic, his agency’s capital access-aligned officials paid out more loans in 14 days than they had in the prior 14 years. Further, more economic injury disaster loans were issued in that period than in the agency’s entire 67-year history.
“People went from a normal way of doing business to suddenly being supersized—and they delivered on those capabilities,” he said.