Data Access for the Open Access Literature: PLOS’s Data Policy

Data are any and all of the digital materials that are collected and analyzed in the pursuit of scientific advances. In line with Open Access to research articles themselves, PLOS strongly believes that to best foster scientific progress, the underlying data should be made freely available for researchers to use, wherever this is legal and ethical. Data availability allows replication, reanalysis, new analysis, interpretation, or inclusion into meta-analyses, and facilitates reproducibility of research, all providing a better ‘bang for the buck’ out of scientific research, much of which is funded from public or nonprofit sources. Ultimately, all of these considerations aside, our viewpoint is quite simple: ensuring access to the underlying data should be an intrinsic part of the scientific publishing process.

PLOS journals have requested data be available since their inception, but we believe that providing more specific instructions for authors regarding appropriate data deposition options, and providing more information in the published article as to how to access data, is important for readers and users of the research we publish. As a result, PLOS is now releasing a revised Data Policy that will come into effect on March 1, 2014, in which authors will be required to include a data availability statement in all research articles published by PLOS journals; the policy can be found below. This policy was developed after extensive consultation with PLOS in-house professional and external Academic Editors and Editors in Chief, who are practicing scientists from a variety of disciplines.

We now welcome input from the larger community of authors, researchers, patients, and others, and invite you to comment before March. We encourage you to contact us collectively at data@plos.org; feedback via Twitter and other sources will also be monitored. You may also contact individual PLOS journals directly.

Theo Bloom, Editorial Director for Biology

Emma Ganley, Senior Editor, PLOS Biology

Margaret Winker, Senior Research Editor, PLOS Medicine

for the PLOS Data Group

We thank all the members of the PLOS Data Policy team, PLOS staff, and Academic Editors and Editors in Chief for all their invaluable contributions to this policy and process. We particularly thank Emma Veitch, Senior Editor, PLOS ONE, for her leadership in bringing this policy to fruition.  

Image Credit: jonathangray.com

PLOS Data Policy

from March 1, 2014

PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (see note 1 below).

When submitting a manuscript online, authors must provide a Data Availability Statement describing compliance with PLOS’s policy. The data availability statement will be published with the article if accepted.

Refusal to share data and related metadata and methods in accordance with this policy will be grounds for rejection. PLOS journal editors encourage researchers to contact them if they encounter difficulties in obtaining data from articles published in PLOS journals. If restrictions on access to data come to light after publication, we reserve the right to post a correction, to contact the authors’ institutions and funders, or in extreme cases to retract the publication.

Methods acceptable to PLOS journals with respect to data sharing are listed below, accompanied by guidance for authors as to what must be indicated in their data availability statement and how to follow best practices in reporting. If authors did not collect data themselves but used another source, this source must be credited as appropriate.

Authors who have questions or difficulties with the policy, or readers who have difficulty accessing data, are encouraged to contact the relevant journal office or data@plos.org.

Acceptable data-sharing methods:

Data deposition (strongly recommended): All data and related metadata underlying the findings reported in a submitted manuscript should be deposited in an appropriate public repository (see note 2 below), unless already provided as part of the submitted article. Repositories may be either subject-specific (where these exist) and accept specific types of structured data, or generalist repositories that accept multiple data types, such as Dryad. Guidance on acceptable repositories is included in note 2 below. The Data Availability Statement must specify that data are deposited publicly and list the name(s) of repositories along with digital object identifiers (DOIs) or accession numbers for the relevant datasets. In some cases authors may not be able to obtain DOIs or accession numbers until the manuscript is accepted; in these cases, the authors must provide these numbers at acceptance. In all other cases, these numbers must be provided at submission.

 

Data in supporting information files:

For smaller datasets and certain data types, authors may upload data as supporting information files accompanying the manuscript. Authors should take care to maximize the accessibility and reusability of the data by selecting a file format from which data can be efficiently extracted (for example, spreadsheets are preferable to PDF when providing tabulated data).
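As a minimal sketch of the file-format point above, the following Python writes tabulated data as CSV, a plain-text format that can be read back row by row without manual transcription. The column names and values are hypothetical and serve only as illustration; the function name `rows_to_csv` is likewise an assumption, not something the policy prescribes.

```python
import csv
import io


def rows_to_csv(header, rows):
    """Serialize tabulated data as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical measurements, for illustration only.
header = ["sample_id", "treatment", "measurement_mm"]
rows = [
    ["S1", "control", 12.4],
    ["S2", "treated", 15.1],
]
csv_text = rows_to_csv(header, rows)

# A reader can recover every row and column programmatically.
recovered = list(csv.DictReader(io.StringIO(csv_text)))
```

The same table embedded in a PDF would generally need to be re-keyed or extracted with specialized tools before reuse, which is the accessibility cost the guidance asks authors to avoid.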

If data deposition or provision in supporting information is not ethical or legal (e.g., underlying data pose privacy or legal concerns, or include human participants; see note 3 below), the following two methods may be acceptable alternatives, subject to case-by-case evaluation:

Data made available to all interested researchers upon request. Data Availability Statement must specify “Data available on request” and identify the group to which requests should be submitted (e.g., a named data access committee or named ethics committee). The reasons for restrictions on public data deposition must also be specified. Note that it is not acceptable for the authors to be the sole named individuals responsible for ensuring data access.

Data available from third party. In the case of a primary dataset that was not originally generated by the authors of the submitted manuscript, appropriate data sharing may require that interested researchers obtain third-party data independently from the named original source. In this case, the Data Availability Statement must state the source of the data with full citation and, if the dataset cannot be provided, indicate “Data available from (named source).” The reasons for restrictions on public data deposition must also be specified.
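The requirements for each data-sharing method above can be summarized as a simple checklist. The sketch below is one hypothetical way to encode them in Python; the class, field, and function names are assumptions made for illustration and do not correspond to any PLOS submission system. It checks the state expected at acceptance, when DOIs or accession numbers must be present.

```python
from dataclasses import dataclass, field
from enum import Enum


class Method(Enum):
    REPOSITORY = "data deposited in a public repository"
    SUPPORTING_INFO = "data in supporting information files"
    ON_REQUEST = "data available on request"
    THIRD_PARTY = "data available from third party"


@dataclass
class DataAvailabilityStatement:
    method: Method
    # (repository name, DOI or accession number) pairs
    repositories: list = field(default_factory=list)
    # Group handling requests, e.g. a named data access committee
    contact_group: str = ""
    # True if the authors are the only named contacts for data access
    contact_is_author_only: bool = False
    # Full citation of the original source for third-party data
    source_citation: str = ""
    # Why public deposition is restricted
    restriction_reason: str = ""


def check_statement(statement):
    """Return a list of problems; an empty list means no issue was found."""
    problems = []
    if statement.method is Method.REPOSITORY:
        if not statement.repositories:
            problems.append("no repository listed")
        for name, identifier in statement.repositories:
            if not name or not identifier:
                problems.append("repository entry lacks a name or identifier")
    elif statement.method is Method.ON_REQUEST:
        if not statement.contact_group:
            problems.append("no group named to receive requests")
        if statement.contact_is_author_only:
            problems.append("authors cannot be the sole contacts")
        if not statement.restriction_reason:
            problems.append("reason for restriction not given")
    elif statement.method is Method.THIRD_PARTY:
        if not statement.source_citation:
            problems.append("source of the data not cited")
        if not statement.restriction_reason:
            problems.append("reason for restriction not given")
    return problems


# Hypothetical statements, for illustration only.
deposited = DataAvailabilityStatement(
    method=Method.REPOSITORY,
    repositories=[("Dryad", "doi:10.0000/example")],
)
on_request = DataAvailabilityStatement(
    method=Method.ON_REQUEST,
    contact_is_author_only=True,
)
```

Here `check_statement(deposited)` returns an empty list, while `check_statement(on_request)` reports three problems: no named group, authors as sole contacts, and no stated reason for the restriction. Editors evaluate the two alternative methods case by case, so a checklist like this can only flag missing elements, not decide acceptability.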

Unacceptable data access restrictions: PLOS journals will not consider manuscripts where the following factors influence ability to share data:

- Authors will not share data because of personal interests, such as patents or potential future publications.

- The conclusions depend solely on the analysis of proprietary data (e.g., data owned by commercial interests, or copyrighted data). If proprietary data are used, the manuscript must include an analysis of public data that validates the conclusions so others can reproduce the analysis and build on the findings.

1. Definition of data that must be shared

PLOS defines the “minimal dataset” to consist of the dataset used to reach the conclusions drawn in the manuscript with related metadata and methods, and any additional data required to replicate the reported study findings in their entirety. Core descriptive data, methods, and study results should be included within the main paper, regardless of data deposition. PLOS does not accept references to “data not shown”. Authors who have datasets too large for sharing via repositories or uploaded files should contact the relevant journal for advice.

2. Guidance on data repositories

PLOS requires that authors comply with field-specific standards for preparation and recording of data and select repositories appropriate to their field, for example deposition of microarray data in ArrayExpress or GEO; deposition of gene sequences in GenBank, EMBL or DDBJ; and deposition of ecological data in Dryad. Authors are encouraged to select repositories that meet accepted criteria as trustworthy digital repositories, such as the criteria of the Centre for Research Libraries or the Data Seal of Approval. Large, international databases are more likely to persist than small, local ones. Copyright licensing for data held in repositories may be unclear. If authors use repositories with stated licensing policies, the policies should not be more restrictive than CC-BY.

3. Guidance on sharing datasets that derive from clinical studies or other work involving human participants

For studies involving human participants, data must be handled so as to not compromise study participants’ privacy. PLOS recommends that researchers follow established guidance and applicable local laws in ensuring they do not compromise participant privacy. Resources which researchers may consult for guidance include:

US National Institutes of Health: Protecting the Rights and Privacy of Human Subjects

Canadian Institutes of Health Research Best Practices for Protecting Privacy in Health Research

UK Data Archive: Anonymisation Overview

Australian National Data Service: Ethics, Consent and Data Sharing

Steps necessary to protect privacy may include de-identification, blocking portions of the database, or license agreements directed specifically at privacy concerns. Authors should indicate, as part of the ethics statement, the ways in which the study participants’ privacy was preserved. If license agreements apply, authors should note the process necessary for other researchers to obtain a license.
