PLOS’ New Data Policy: Public Access to Data

UPDATE 7 MARCH: Please see new blog post

UPDATE 26 FEBRUARY: A flurry of interest has arisen around the revised PLOS data policy that we announced in December and which will come into effect for research papers submitted next month. We are gratified to see a huge swell of support for the ideas behind the policy, but we note some concerns about how it will be implemented and how it will affect those preparing articles for publication in PLOS journals. We’d therefore like to clarify a few points that have arisen and once again encourage those with concerns to check the details of the policy or our FAQs, and to contact us if we have not covered them.

Is the policy about what to share, or about how and where to share it?

There is nothing new in the policy about what types and forms of data should be shared. As we said in December, “PLOS journals have requested data be available since their inception, but we believe that providing more specific instructions for authors regarding appropriate data deposition options, and providing more information in the published article as to how to access data, is important for readers and users of the research we publish.” As we have further clarified, the Data Policy states that the ‘minimal dataset’ consists “of the dataset used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. This does not mean that authors must submit all data collected as part of the research, but that they must provide the data that are relevant to the specific analysis presented in the paper.” The ‘minimal dataset’ does not mean, for example, all data collected in the course of research, or all raw image files, or early iterations of a simulation or model before the final model was developed. We continue to request that the authors provide the “data underlying the findings described in their manuscript”. Precisely what form those data take will depend on the norms of the field and the requests of reviewers and editors, but the type and format of data being requested will continue to be the type and format PLOS has always required.

What is changing is that authors need to indicate where the data are housed, at the time of submission. We want reviewers, editors and readers to have that information transparently available when they read the article. We strongly encourage deposition in subject area repositories (such as GenBank for sequences, clinicaltrials.gov for clinical trials data, and PDB for structures) where those exist, and in unstructured repositories such as Dryad or FigShare where there is no appropriate subject-domain repository. Some institutions also provide appropriate centralized repositories for their researchers’ data. We recognize that small amounts of data may be wholly included within the article itself, as they are now, and that for some other smaller data types it might be most appropriate to include Supplementary Files with the article, although we would also like to ensure these files are used optimally.

What if my dataset is too large for any of these solutions?

We appreciate that some people now work with datasets that are too large for any of these solutions, and we would like to work with them to develop methods of sharing that work in these instances. Authors in this position should submit their manuscripts, noting the details of their situation, and we will work with them to arrive at a solution.

What about human patient data?

As with some other types of data, it is often not ethical or legal to share patient data universally, so we provide guidance on the routes available to authors of such data. We encourage anyone with concerns of this type to contact the journal they would like to submit to, or the data team at data@plos.org.

Concerns about someone else benefiting from the data

Some raise the concern that, having collected data, they want to be the ones to analyze it and benefit from it. In our view, this sentiment applies to the period before publication. But after publication (in particular, after publication in an Open Access journal) the data should be available for re-use by others. This is not just our view: many institutions and funding agencies (e.g. NIH) now make data sharing a requirement. We understand that some authors will not want to share data, just as some choose not to make their articles available Open Access, but trust that most authors publish their work precisely in order to allow others to benefit from it.

Liz Silva, PLOS ONE
Theo Bloom, PLOS Biology
Emma Ganley, PLOS Biology
Maggie Winker, PLOS Medicine


ORIGINAL POST: Access to research results, immediately and without restriction, has always been at the heart of PLOS’ mission and the wider Open Access movement. However, without similar access to the data underlying the findings, the article can be of limited use. For this reason, PLOS has always required that authors make their data available to other academic researchers who wish to replicate, reanalyze, or build upon the findings published in our journals.

In an effort to increase access to this data, we are now revising our data-sharing policy for all PLOS journals: authors must make all data publicly available, without restriction, immediately upon publication of the article. Beginning March 3rd, 2014, all authors who submit to a PLOS journal will be asked to provide a Data Availability Statement, describing where and how others can access each dataset that underlies the findings. This Data Availability Statement will be published on the first page of each article.

What do we mean by data?

“Data are any and all of the digital materials that are collected and analyzed in the pursuit of scientific advances.” Examples could include spreadsheets of original measurements (of cells, of fluorescent intensity, of respiratory volume), large datasets such as next-generation sequence reads, verbatim responses from qualitative studies, software code, or even image files used to create figures. Data should be in the form in which it was originally collected, before summarizing, analyzing or reporting.

What do we mean by publicly available?

All data must be in one of three places:

  • the body of the manuscript; this may be appropriate for studies where the dataset is small enough to be presented in a table
  • in the supporting information; this may be appropriate for moderately-sized datasets that can be reported in large tables or as compressed files, which can then be downloaded
  • in a stable, public repository that provides an accession number or digital object identifier (DOI) for each dataset; there are many repositories that specialize in specific data types, and these are particularly suitable for very large datasets

Do we allow any exceptions?

Yes, but only in specific cases. We are aware that it is not ethical to make every dataset fully public; examples include private patient data and specific information relating to endangered species. Some authors also obtain data from third parties and therefore do not have the right to make that dataset publicly available. In such cases, authors must state that “Data is available upon request”, and identify the person, group or committee to whom requests should be submitted. The authors themselves should not be the only point of contact for requesting data.

Where can I go for more information?

The revised data sharing policy, along with more information about the issues associated with public availability of data, can be reviewed in full at:

http://www.plos.org/data-access-for-the-open-access-literature-ploss-data-policy/

http://www.plos.org/update-on-plos-data-policy/

Image: Open Data stickers by Jonathan Gray


57 Responses to PLOS’ New Data Policy: Public Access to Data

  1. Bernard Beckerman says:

    Concerning in silico studies, are the authors expected to publish log and dump files that often add up to many terabytes, or is source code sufficient?

  2. Pingback: PLoS – largest scientific journal in the world – now requires that authors must make all data publicly available, without restriction, immediately upon publication of the article | JME

  3. Pingback: PLoS One requires full data release with publication | Labrigger

  4. Greg Hale says:

    Thanks so much for this!! You guys are in the front of the pack, with these sorts of policies PLoS is really a publisher I believe in, and I hope you are rewarded for the forward-thinking by people sending you their very best work. I’m really excited to see where data-sharing and code-sharing will take us as a community.

  5. Larry says:

    If I’m not mistaken, the invention of the internet was originally to distribute data between researchers.. Doesn’t that mean that the internet actually is doing what it was purposed to do after all? (Instead of showing funny cat pictures with everyone)

  6. GANON says:

    What data about any endangered species could be considered “unethical”?

  7. Michael Dellacava says:

    A sensible policy that will advance science faster and with less interference. And it should illuminate those papers with little substance a lot more brightly. Kudos Plos!

  8. Pingback: PLOS ONE y los datos brutos | Como decíamos ayer...

  9. Pingback: PLoS要求论文作者公开数据 | 我爱互联网

  10. Pingback: PLOS’ New Data Policy || WNBTv - will not be televised

  11. Pingback: Online Accuracy: Not What It Seems : Stephen E. Arnold @ Beyond Search

  12. Pingback: Twitter Open Access Report – 25 Feb 2014 |

  13. Pingback: New PLOS Open data policy | The OpenScience Project

  14. This is all well and good, but I hope that the editors will also be encouraging the authors to make sure that data is in a form that is actually digestible by others, for example spreadsheets should be in text, not Excel or PDF, and any tables that are part of the publication should be available in alternative formats (i.e. text), that will be easier to get into an actual analysis software.

  15. Pingback: PLOS New Data policy and how librarians can help | Dan Trout - Librarian

  16. Pingback: Open access science publisher demands full availability of data » Borg Prime

  17. Pingback: Open access science publisher demands full availability of data | Son Dakika İnternet

  18. Pingback: Open access science publisher demands full availability of data |

  19. Pingback: Open access science publisher demands full availability of data | Binary Reveux

  20. Pingback: PLoS done got it wrong | Pondering Blather

  21. Pingback: PLOS Revises Policy: Authors Must Make All Data Publicly Available Without Restriction Upon Publication of Article | LJ INFOdocket

  22. Pingback: The Blow Magazine | Open access science publisher demands full availability of data

  23. Pingback: PLOS verlangt Forschungsdaten | Schmalenstroer.net

  24. Pingback: Open Data – JumpSeek

  25. Pingback: Журнал «Научная периодика: проблемы и решения» » Новая политика PLoS: открытый доступ к данным

  26. Pingback: Public access to scientific research expanded | Hayleigh Gowans: Journalist

  27. Pingback: PLOS is changing their data policy this will… | Sharing useful stuff with peers

  28. Pingback: What we’re reading: Sex and the single endogenous retrovirus, extinction by hybridization, and the PLOS data-sharing policy | The Molecular Ecologist

  29. Pingback: Augenspiegel 09/14: Crowdfunding, Streitschriften und Open Data - Augenspiegel

  30. Pingback: PLOS’ New Data Policy: Public Access to D...

  31. Pingback: Week in Review – 28 February 2014 | USMA Library Blog

  32. Pingback: Fake papers are not the real problem in science | Achilleas Kostoulas

  33. Pingback: Cleverly Named Bunch o’ Links 3/1/14 | Gravity's Wings

  34. Pingback: Open access science publisher demands full availability of data | Cardiff Computer Rescue

  35. Pingback: Skeptický podcast - Pseudocast #127 - Homosexualita, tuky, Wikipédia

  36. Pingback: February highlights from the world of scientific publishing | sharmanedit

  37. Pingback: Woodruff Library Blog | New Open Data Policy at PLOS

  38. Pingback: New Open Data Policy at PLOS | Data Forwards

  39. Pingback: If you love your data, set it free. | Practical Data Management for Bug Counters

  40. Pingback: Should you share your data? | Peter Combs's Blog

  41. Pingback: Another Week of Anthropocene Antics, March 2, 2014 – A Few Things Ill Considered

  42. Pingback: PLOS’s open data fever dream | Kevin the Librarian

  43. Pingback: Another Week of Anthropocene Antics, March 2, 2014 [A Few Things Ill Considered] | Gaia Gazette

  44. Pingback: Data Sharing and Science — Contemplating the Value of Empiricism, the Problem of Bias, and the Threats to Privacy | The Scholarly Kitchen

  45. Pingback: Fake Papers are Not the Real Problem in Science

  46. Pingback: Hypub Links of the Week #6: new data, open access journals and research startups › Hybrid Publishing Lab Notepad

  47. Pingback: PLOS Opens Roundup (March 7) | PLOS OpensPLOS Opens

  48. Pingback: Following criticism, PLOS apologizes, clarifies new data policy | Retraction Watch

  49. Pingback: Following criticism, PLOS apologizes, clarifies new data policy – Nouvelles et satellite scientifique

  50. Pingback: strong opinions about data sharing mandates–mine included | [citation needed]

  51. Pingback: Lit Review: #PLOSFail and Data Sharing Drama | Data Pub

  52. Pingback: Publishing on PLOS journals requires public access to data | Psychology and Neuroscience Research

  53. Pingback: The PLOS Data Policy: An Overview | LSHTM Research Data Management Support Service

  54. Pingback: Big Data Comes to Life (Sciences) | Trifacta

  55. Pingback: Sharing research data: trends in journal publishing | RESEARCH NEWS from Swansea University Library

  56. Pingback: Why Scientists Need to Learn How to Share - Pacific Standard: The Science of Society

  57. Pingback: Results for Wyoming elk, deer, and antelope hunts draw are in! - Zahal IDF Blog News
