NIH Unveils Centralized Resource for COVID-19 Patient Data


Insiders hope it will “serve as the foundation for addressing future public health emergencies.”

The National Institutes of Health on Monday unveiled a centralized platform that approved users can use to contribute, access and analyze data derived from COVID-19 patients’ electronic health records, or EHRs, as part of a dedicated effort to more quickly convert clinical information into insights that can accelerate research on the novel coronavirus.

NIH recently developed the National COVID Cohort Collaborative, or N3C, an effort funded by the National Center for Advancing Translational Sciences, or NCATS. According to a news release regarding the work, N3C will systematically capture clinical, diagnostic and laboratory data from participating health care providers nationwide, aggregate that data into a more standardized, easily accessible format, and enable users to swiftly draw new, collaborative research insights from that harmonized information via the NCATS N3C Data Enclave.

“NCATS initially supported the development of this innovative collaborative technology platform to speed the process of understanding the course of diseases, and identifying interventions to effectively treat them,” NCATS Director Christopher Austin said in a statement. “This platform was deployed to stand up this important COVID-19 effort in a matter of weeks, and we anticipate that it will serve as the foundation for addressing future public health emergencies.”

The COVID-19 pandemic continues to generate massive volumes of clinical data that could improve medical experts’ understanding of, and ability to effectively treat, the novel coronavirus. However, as NIH notes, those potentially helpful datasets “often become too large to share, and the networks for data management are so dissimilar that they cannot be combined easily.” Through the initiative, the agency hopes to alleviate that issue by creating and refining a safe, centralized resource that integrates COVID-19-related EHR data, supplied by separate organizations in disparate formats, into a common structure that can be used to advance research to stop the pandemic and improve treatments and interventions.
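To make the idea of a “common structure” concrete, here is a minimal Python sketch of how records from two sites with different field names and date formats might be mapped onto one shared layout. It is an illustration only, not a description of how N3C or its platform works: the site names, field names, date formats and the `harmonize` function are all hypothetical.

```python
from datetime import datetime

# Hypothetical per-site descriptions: how each site's local field names map
# to the shared field names, and which date format the site uses.
SITE_FORMATS = {
    "site_a": {
        "fields": {
            "pt_sex": "sex",
            "lab_name": "lab_test",
            "lab_val": "lab_value",
            "svc_date": "service_date",
        },
        "date_format": "%Y-%m-%d",
    },
    "site_b": {
        "fields": {
            "gender": "sex",
            "test": "lab_test",
            "result": "lab_value",
            "date_of_service": "service_date",
        },
        "date_format": "%m/%d/%Y",
    },
}


def harmonize(site, record):
    """Rename a site's local fields to the shared names and normalize the date."""
    spec = SITE_FORMATS[site]
    common = {
        spec["fields"][name]: value
        for name, value in record.items()
        if name in spec["fields"]
    }
    # Store every service date as an ISO 8601 string, whatever the input format.
    parsed = datetime.strptime(common["service_date"], spec["date_format"])
    common["service_date"] = parsed.date().isoformat()
    return common


records = [
    harmonize("site_a", {"pt_sex": "F", "lab_name": "creatinine",
                         "lab_val": 1.4, "svc_date": "2020-06-01"}),
    harmonize("site_b", {"gender": "M", "test": "creatinine",
                         "result": 2.1, "date_of_service": "06/03/2020"}),
]
```

Once both records share the same field names and date format, they can be pooled and queried together, which is the kind of combination the dissimilar source systems otherwise prevent.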

“Having access to a centralized enclave of this magnitude will help researchers and health care providers answer clinically important questions they previously could not, such as, ‘Can we predict who might need dialysis because of kidney failure?’ or ‘Who might need to be on a ventilator because of lung failure?’” the agency noted. Insiders also add that “the information available via the N3C enclave will be rich in scope and scale.” 

Presently, NIH is collaborating with 35 sites across the U.S. that are contributing data to the work. Those entities will add “demographics, symptoms, medications, lab test results, and outcomes data regularly” for the next half-decade, which the agency notes will enable “both the immediate and long-term study of the impact of COVID-19 on health outcomes.” Users do not have to contribute data to access the resource, but the ability to contribute is available only to entities that execute the NCATS Data Transfer Agreement and establish an official partnership with the N3C. Participating organizations can discontinue their collaborations at any time, but once data is contributed to the effort, it will not be removed.

All data provided to NCATS for the initiative will come in as a Limited Data Set, which NIH repeatedly emphasized retains only two of the 18 HIPAA-defined potentially identifying elements: the ZIP codes of patients’ health care providers and dates of service. On the N3C’s frequently asked questions webpage, the agency also adds that “specific institutions will not be identified, though it might be possible to infer institutional identity.”
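As a rough illustration of what retaining only two identifying elements could look like in practice, the Python sketch below drops identifying fields from a record while keeping the provider ZIP code, the date of service and the clinical values. The field names and the `to_limited_data_set` function are hypothetical, and the list of identifying fields is a small illustrative subset, not the full set of 18 HIPAA elements or NCATS’ actual process.

```python
# Illustrative subset of potentially identifying fields (hypothetical names);
# this is NOT the complete list of 18 HIPAA-defined elements.
IDENTIFYING_FIELDS = {
    "name",
    "phone",
    "email",
    "ssn",
    "medical_record_number",
    "provider_zip",
    "service_date",
}

# The two elements the article says a Limited Data Set retains.
RETAINED_IDENTIFIERS = {"provider_zip", "service_date"}


def to_limited_data_set(record):
    """Drop identifying fields except the two retained ones; keep clinical data."""
    dropped = IDENTIFYING_FIELDS - RETAINED_IDENTIFIERS
    return {key: value for key, value in record.items() if key not in dropped}


full_record = {
    "name": "Jane Doe",
    "phone": "555-0100",
    "medical_record_number": "MRN-001",
    "provider_zip": "20892",
    "service_date": "2020-06-01",
    "lab_test": "creatinine",
    "lab_value": 1.4,
}

limited = to_limited_data_set(full_record)
```

In this sketch, `limited` keeps only `provider_zip`, `service_date`, `lab_test` and `lab_value`; the name, phone number and record number are gone.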

“NCATS is taking multiple precautions for security and privacy to keep these data safe within its protected cloud infrastructure,” according to the agency. Residing in Amazon Web Services GovCloud, the Palantir platform in use is FedRAMP authorized at a Moderate impact level, NIH’s FAQ page notes. Approved users can investigate and analyze the data only within the platform; it cannot be downloaded or removed.

“The N3C data will be used only for COVID-19 research purposes, including clinical and translational research and public health surveillance,” the agency said.