ISMB/ECCB 2013: much ado about data sharing

Last month PLOS ONE attended the ISMB/ECCB 2013 conference in Berlin on Intelligent Systems for Molecular Biology. More than 1,500 delegates attended what is the largest conference on computational biology in the world to discuss the latest developments in computational methods that address biological questions.

The opening keynote from PLOS ONE Academic Editor Gil Ast focused on alternative splicing, a mechanism by which several mRNA transcripts are generated from the same mRNA precursor, thus enhancing transcriptome and proteome diversity. He mentioned a paper his group published earlier this year in PLOS ONE, in which they showed that pre-mRNA splicing influences nucleosome organization, suggesting that there is a bi-directional interplay between chromatin organization and splicing. While it is widely accepted that chromatin organization and DNA modification regulate transcription, it is intriguing that splicing can in turn affect chromatin organization, and this may constitute an additional layer of regulation of gene expression. He also presented exciting recent findings showing how pre-mRNA splicing and the creation of new exons in the human genome may be linked to certain genetic disorders and types of cancers.

Understanding the biology of complex human disease is also an objective of Goncalo Abecasis, winner of the ISCB 2013 Overton Prize. Specifically, he is interested in better understanding genetic variation and its connections to human diseases using computational methods and statistical tools. In his talk, he emphasized that the genetic variants that affect human traits may be identified and characterized by examining the link between these traits and the complete genome sequences of thousands of individuals. To collect DNA from as many people as possible, he wondered whether we should make use of social media to call for volunteers to send their DNA samples. Are Facebook and Twitter the key to understanding human genetics?

One topic that generated much discussion at the meeting was data sharing. In her talk, Carole Goble called for all scientists to share their data widely so as to enable reproducibility, a principle underpinning the scientific method. Several journals, including PLOS ONE, require that all data (including all relevant raw data) described in the manuscript be made freely available to any scientist wishing to use them for the purpose of academic, non-commercial research. Well-established and widely supported public repositories already exist for certain types of data such as nucleic acid sequences, and in cases where an appropriate repository does not exist, there are also general data repositories such as Dryad. Assigned accession numbers or digital object identifiers (DOIs) facilitate data citation and ensure accountability. An increasing number of research funding agencies also now support data sharing in the life sciences. Whilst there is indeed increasing discussion about making primary data from published research publicly available, Goble mentioned a paper by Ioannidis and colleagues showing that a substantial proportion of articles published in high-impact journals do not comply (or only weakly comply) with data availability requirements. According to Goble, a lack of data sharing, and thus of reproducibility, could lead to an increase in retracted scientific papers.

She also urged the computational biology community to release their “dark data”, i.e. data that are not published and remain hidden on various USB drives and computers. The point is that, once shared, these results can be used by more people, increasing visibility, accountability and reproducibility. As highlighted by a recent study, data sharing is not an end in itself, but rather a crucial form of scientific knowledge dissemination.

 

Citations:

Keren-Shaul H, Lev-Maor G, Ast G (2013) Pre-mRNA Splicing Is a Determinant of Nucleosome Organization. PLoS ONE 8(1): e53506. doi:10.1371/journal.pone.0053506

Alsheikh-Ali AA, Qureshi W, Al-Mallah MH, Ioannidis JPA (2011) Public Availability of Published Research Data in High-Impact Journals. PLoS ONE 6(9): e24357. doi:10.1371/journal.pone.0024357

Wallis JC, Rolando E, Borgman CL (2013) If We Share Data, Will Anyone Use Them? Data Sharing and Reuse in the Long Tail of Science and Technology. PLoS ONE 8(7): e67332. doi:10.1371/journal.pone.0067332

Images:

Wikimedia by Angelineri

Modified from Schwartz S, Oren R, Ast G (2011) Detection and Removal of Biases in the Analysis of Next-Generation Sequencing Reads. PLoS ONE 6(1): e16685. doi:10.1371/journal.pone.0016685

 
