Scientists Have a Sharing Problem

Competition and disorganization within their disciplines prevent many researchers from making their data publicly available, which is stunting scientific progress.

When it comes to sharing information, there seems to be quite a difference of opinion—across areas both trivial and serious—as to how much is enough. Some people broadcast their lives on Facebook; others poke fun at the oversharers in their feeds. The Edward Snowdens of the world fight for greater government transparency, even as some argue for less of it. Last year, some experts called for heightened secrecy in the technology sector, arguing that public expectation stifles creativity.

A study published in October in BioScience revealed that sharing is a tricky topic in science as well. Michigan State University ecology professor Patricia Soranno and colleagues found that while many environmental-science researchers believe data sharing is beneficial (for the replication of analyses, the ability to confirm data integrity, and the overall advancement of science), few actually take the steps to make their own materials publicly available after their research is published.

Save for a few sentences in the “Methods and Analysis” sections, the data used to produce published manuscripts is often kept private, sometimes purposefully. When scientists do request another researcher’s materials, their inquiries might go unanswered or be denied, forcing them to delay or put on hold their own projects. And ecology isn’t the only branch of science grappling with too much secrecy: The same thing is happening in genetics, biology, chemistry, and engineering, too.

This may seem counterintuitive, given that science has traditionally been a field that prizes collaboration. After all, laboratories are rarely one-man operations, and some projects transcend geographic borders, as with global collaborations like the Human Genome Project and its sibling, the Human Brain Project.

The question, then, is why so many scientists are so stingy with their information. Because scientific progress relies so heavily on the process of validating and building upon prior material, it might seem counterproductive to withhold information from other researchers. But even science, a discipline grounded in reason, isn’t immune to the influence of ego and emotion.

The culture of innovation breeds fierce competition, and those on the brink of making a groundbreaking discovery want to be the first to publish their results and receive credit for their ideas. There’s more at stake than just the acknowledgement of being first and a metaphorical blue ribbon; being first to publish can mean invitations to national meetings, academic promotions, industry appointments, and research awards, including the Nobel Prize.

Physician David Blumenthal, now president of the Commonwealth Fund and a longtime researcher in the field of health-information technology and data-sharing practices, witnessed the consequences of competition while working at Massachusetts General Hospital in the late 1970s.

“I remember a peer who was sequencing a gene for a particular protein that had a lot of potential clinical application. He had spent many years working on it in a very prestigious lab, but days before he was about to finish someone beat him to publication,” Blumenthal said. “He lost five years of work at a critical time in his career and ended up leaving research for clinical practice. It wasn’t that he didn’t do good work, but if you’re not first, you don’t get the credit.”

It’s not to say everyone who gets beaten to the finish line drops out or that all researchers are strictly rewards-driven, but if sharing data paves the way for an expert to build upon or dispute other scientists’ results in a revolutionary way, it’s easy to see why some might choose to withhold.

One could pin the problem entirely on the competitive culture, but it’s only one of many reasons why scientists choose not to share their data, even after their studies are published. Among them, of course, is the lack of funding. Transferring data can be expensive: A 2002 study published in the Journal of the American Medical Association by Blumenthal and colleagues found that among geneticists, 45 percent withheld data because it cost too much to send the materials to the scientists who had requested them.

“This was something that we did not anticipate, but when data is a physical thing such as a reagent, an antibody, a chemical, a mouse, or a reengineered organism, the cost and administrative difficulties are an important obstacle,” he said.

In the same study, 80 percent of respondents also reported that the effort required to produce their data prevented them from sending it to other researchers who asked for it. The underlying cause is most likely something more than sheer laziness: According to a 2012 study in the Journal of Computational Science Education that conducted in-depth interviews of researchers in 11 fields, including biology, ecology, and physics, some disciplines don’t even have formal digital repositories for data storage, and others don’t yet have standardized methods of interpreting and annotating it.

The consequences here are twofold: First, the lack of a centralized digital storage space means that data might only be kept on a personal computer or exist solely in paper form, so digging it up and sending it to a requesting party can be time-consuming, especially for scientists who have hundreds of studies under their belts. And second, the absence of uniform methods to record or describe data creates its own challenges.

While scientists do publish the results of experiments, other researchers may also need descriptions of data, called "metadata," including things like the temperature of samples, the make and calibration of equipment, time of day samples were taken, or rate of error of the samples. Though they might appear obvious to that particular researcher, these details would provide important additional information or spur additional research questions. Unfortunately, many fields don't have set rules in place as to how much metadata is required. Having to explain these specifics after publication requires extra time on the part of the original party, which could explain why requests for data are ignored.

Another study, published in Academic Medicine in 2006, found more reasons for scientists’ reluctance to share, including protecting industry relationships or being less familiar with the investigators requesting the data. These findings show that data withholding isn’t always motivated by vengeance or the desire to get ahead; in some cases, a lack of resources simply makes sharing difficult.

Irrespective of the motive, data withholding has produced documented consequences. In his 2002 study, Blumenthal and colleagues found that 28 percent of those surveyed were unable to replicate research as a direct result of another scientist’s refusal to share, 24 percent had a publication significantly delayed, and 21 percent had to abandon a research interest altogether. Despite individual costs, Blumenthal acknowledged that science (genetics, in the case of this study) was still thriving, but wondered “whether [progress] is as rapid as it could be if data sharing were maximized.” Could any of the world’s most pressing scientific or medical problems be solved, or at least greatly ameliorated, if data were fully accessible?

There are a few steps that could be taken to increase transparency, though the issue would have to be tackled not only by scientists, who are the purveyors of information, but also by journals, publishers, universities, funding agencies, and industry professionals. Scientists would need a centralized place to store their data, meaning more digital repositories would need to be created. The field of astronomy has reaped the benefits of designing communal data banks early on.

“Scientists started sharing data from the Hubble telescope and the Sloan Digital Sky Survey because they were collecting it at such high volumes that they needed a place to put it,” Soranno said. “Millions of users have been exposed to the data, which has resulted in thousands of studies being published, even by scientists not affiliated with those two projects.”

The scientific community would also need to establish protocols on how data should be stored, so that it becomes less time-consuming for other researchers to locate and interpret results. Some scientists have even advocated for data to be peer-reviewed and accepted by other collaborators—in the same type of procedure now used for journal articles—to ensure that the data meets scientific standards, is reliable, and was collected using logical methods.

Many experts support a mandate to require data-sharing after publication, a practice that the American Psychological Association (APA) and Public Library of Science (PLoS) currently require for all studies published in their journals. Though it’s not always strictly enforced, implementing such a regulation across the board would at least put greater pressure on researchers to release the information underlying their work.

There is also a push to examine the efficiency of the publication process. Both Soranno and Blumenthal agreed that data should not be shared prior to publication, but they also agreed that it would be beneficial to look into developing a process that allows scientists to copyright material even before publication.

“In the commercial sector, if you file a patent, as soon as you file, you are protected and you can start to share,” Blumenthal explained. “In a similar fashion, you could create opportunities at the level of presentation to register the content so that for the record, your work is recognized publicly once you display it.”

Because many manuscripts are presented at conferences and symposiums before they are published, this method would patent a researcher’s work and protect their first-to-publish privilege, while simultaneously allowing other scientists to build upon their ideas and hypotheses much sooner.

The good news is that many disciplines are already embracing scientific openness, in part due to the influence of social media.

“I see a lot of budding scientists who are blogging, tweeting, and creating websites for their data and presentations,” Soranno said. “They see it as a way to get ideas out and feedback on their projects. I personally have found it very inspiring.”

The practice is trickling into the pharmaceutical industry as well: In a progressive move last month, industry giant GlaxoSmithKline released a sharing system with raw data from 200 clinical trials in order to enhance transparency for new drugs.

Inevitably, the biggest challenge will be changing the culture of secrecy in disciplines that are less prone to collaboration. Blumenthal believes it can happen, but only when certain practices, such as sharing mandates and repositories, are put into place.

“You are not going to limit secrecy just by calling on scientists to be altruists. Some will be, of course, but you need to implement processes and methods to make it easier and less costly to share data,” he said. “You want to make sure their personal interest, that of receiving recognition, and the ethical requirement are aligned.”

In her paper, Soranno calls environmental scientists’ ethics “out of date”: “[They] are increasingly concerned about the ethical importance of promoting inclusivity, including groups that are traditionally underrepresented in science, such as women and racial minorities,” she told me. “But if inclusivity is a central ethical value, then data sharing should also be a central ethical value, because data sharing is essential to promoting inclusivity.”

It could take some time to change existing sharing practices, but once customs of transparency are in place and continue to be nurtured, they can stay put for generations. Blumenthal told me about the successful tradition of transparency in the field of yeast genetics.

“The discipline has a very familial feel to it. It goes back to a group of seminal figures that believed in sharing, and they trained their colleagues and subordinates to embrace openness. So they then passed down this ethic of sharing that’s been thriving ever since,” he said. “I do think it’s worth teaching an ethic of sharing, because a young scientist’s early approach to sharing will likely become their approach for life.”

Another paper, published last year in BioScience, calls for responsibility to be shared between data suppliers and data users, arguing that it’s not enough for people to share their materials; the individuals who use the data also need to provide attribution or co-authorship. Proper recognition where due would help data sharers feel more comfortable and make them more willing to provide information.

The solution isn’t to eliminate withholding entirely. It’s clear that not all materials should be shared: Data that has not yet been published, that violates a patient’s privacy, or that breaks an industry agreement should remain confidential. But there is a lot of information in the scientific community that has the potential to improve, cure, and innovate, and it should get into the hands of scientists who need it and can use it for the greater good.

“If scientists’ role in society is to generate knowledge for both knowledge’s sake and the good of society, then scientists should be sharing both their ideas and their data for everyone to access,” Soranno said. “These practices will ensure that everyone has the opportunity to contribute to moving knowledge forward.”
