And This is Why We Should Always Provide Our Data. . .

For a long time now, I’ve been beating the drum of “provide your data.” If you’re willing to take a whole mess of measurements and do a whole bunch of analyses for a published paper, why not share the raw data? New techniques and research questions continually arise, so it can be invaluable for other workers to be able to draw upon previously published databases. Although the situation is improving, it’s still far from perfect. Even today, I’m somewhat embarrassed to point out that some articles in PLOS ONE (a journal whose mission I support as advocate and volunteer academic editor) don’t provide relevant supporting data (recent examples here and here). But rather than dwell on the negatives, I want to point out a recent case study (also in PLOS ONE!) where data reuse benefited authors, journals, and science as a whole.

Derek Larson (now a graduate student at University of Toronto) and Phil Currie (a paleontologist at University of Alberta) published a massive paper in PLOS ONE concerning the identity of carnivorous dinosaur teeth. Paleontologists have a love/hate relationship with these teeth. On the one hand, they’re cool and pointy and fierce-looking. There’s nothing more thrilling than finding a tyrannosaur tooth! On the other hand, teeth kinda stink when it comes to species identification. A dromaeosaurid (“raptor”) tooth is a dromaeosaurid tooth, and you’ll never be able to get down to species level in most cases. So, museums have drawers and drawers of teeth identified as “Tyrannosauridae” or “Troodontidae” or “Dromaeosauridae”. To make matters worse, the delicate skulls and skeletons of many of these carnivores (which are useful for identifying species) are pretty stinkin’ rare.

A selection of carnivorous dinosaur teeth. These are representatives of eight tooth types that occur in rocks spanning 15 million years of evolutionary history–but the general forms surely represent more than eight biological species over this time! Figure 2 from Larson & Currie 2013. CC-BY.

These vague tooth identifications are problematic for issues of paleoecology and evolutionary interpretations. Let’s say you have three different rock formations with dromaeosaurid teeth, spanning 10 million years of geological time. They all look pretty much the same, but it’s almost certain they represent multiple species (evidence from skeletons shows that dinosaur species just didn’t stick around for very long–a million years or so at most). This is a problem!

Skulls and skeletons of Saurornitholestes come from rocks around 76 million years old, but nearly identical teeth are found in rocks up to 10 million years younger. Odds are decent that they aren’t all the same species, though. Image by Emily Willoughby, CC-BY.

So the big question here is: can we actually find evidence that a dromaeosaurid tooth from the Hell Creek Formation (~66 million years old) is (or isn’t) the same species as a superficially similar tooth from the Dinosaur Park Formation (~76 million years old)? Or are teeth just totally useless? Fortunately, Derek and Phil found a clever work-around. By compiling measurements from over 1,200 different dinosaur teeth, they developed an analysis to look at the overall shapes of teeth from each formation. There is strength in numbers! It turns out that even though the teeth are superficially rather similar, there are subtle differences in measurements between the teeth from rocks of different ages. That pattern is consistent with different species living at different time intervals. In other words, that dromaeosaurid tooth from 66 million years ago is probably not the same species as that tooth from 76 million years ago.
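For readers who like to see the idea in code, here is a minimal sketch of the "strength in numbers" logic. It is not the method from the paper (which used multivariate analyses on many measurements); it only compares a single height-to-length ratio between two formations. The measurement values, the `TEETH` list, and both function names are invented for illustration and do not come from any published data set.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical records: (formation, crown height in mm, base length in mm).
# These numbers are invented placeholders, NOT measurements from any paper.
TEETH = [
    ("Dinosaur Park", 10.0, 5.0),
    ("Dinosaur Park", 12.0, 6.2),
    ("Dinosaur Park", 9.0, 4.4),
    ("Hell Creek", 10.0, 6.0),
    ("Hell Creek", 12.0, 7.4),
    ("Hell Creek", 9.0, 5.3),
]


def shape_ratios(records):
    """Group crown height / base length ratios by formation."""
    groups = {}
    for formation, height, length in records:
        groups.setdefault(formation, []).append(height / length)
    return groups


def standardized_difference(a, b):
    """Difference in group means divided by the pooled standard deviation."""
    pooled_var = (
        (len(a) - 1) * stdev(a) ** 2 + (len(b) - 1) * stdev(b) ** 2
    ) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


ratios = shape_ratios(TEETH)
d = standardized_difference(ratios["Dinosaur Park"], ratios["Hell Creek"])
print(f"standardized difference in height/length ratio: {d:.2f}")
```

With only a handful of teeth, a difference like this would be hard to trust; with hundreds per formation, small but consistent differences in shape become much easier to detect, which is why a large pooled data set matters.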

Here’s the really cool part: Derek and Phil were able to do their analysis so thoroughly and with such a large sample because other authors published measurements for theropod teeth! Although many measurements were original to the PLOS ONE paper, the great majority were from previous studies. Folks like Julia Sankey have released countless data tables of tooth measurements, mainly as a way to describe characteristics of particular specimens. Previous authors may not necessarily have been thinking of the type of analysis implemented by Derek and Phil, but nonetheless released data for others to see and use.

This is a win-win situation for everyone. Larson and Currie were able to merge the previously published data with their own new data into a monster analysis (1,200+ data points, remember) that significantly advances science as a whole. These generous previous authors saved Larson and Currie perhaps months of work and thousands of dollars in museum travel! The previous authors also win, through increased utilization of their hard work as well as another citation for their papers. And at the basest level, the journals that allowed and encouraged massive data tables (either as supplementary information or in-text tables) win through an extra citation (which helps the almighty impact factor).

Derek and Phil are also paying it forward–all of their supporting data are accessible. If you want to re-run the analysis tonight, or add your own data, or whatever, you can do it! Here’s a big thank you to all of the folks who advance science by improving data sharing.

Citation
Larson DW, Currie PJ (2013) Multivariate analyses of small theropod dinosaur teeth and implications for paleoecological turnover through time. PLOS ONE 8(1): e54329. doi:10.1371/journal.pone.0054329 [open access]

Creative Commons License
And This is Why We Should Always Provide Our Data. . . by The Integrative Paleontologists, unless otherwise expressly stated, is licensed under a Creative Commons Attribution 3.0 Unported License.

This entry was posted in Open Access, Open Data, Paleontology, PLOS ONE.

5 Responses to And This is Why We Should Always Provide Our Data. . .

  1. Good points, and good work.

    A nit-pick, however, regarding the caption on the tooth illustration from Larson & Currie: these are not just eight “tooth types,” they are eight particular, actual teeth. As such, though they arose from across a long span of time, they came from eight individual creatures, each of which belongs to its own characteristic species. Thus, it cannot be the case that “they surely represent more than eight biological species!”; in fact, because they represent eight individuals, they cannot represent more than eight species, and likely represent exactly eight.

  2. Andrew Farke says:

    Thanks, Kevin, and good point! I’ll revise the caption. [caption now revised]

  3. Pingback: The Permanent Vacation – Around the Web: A presentation on Open Data at the Ontario Library Association Super Conference [Confessions of a Science Librarian]

  4. Pingback: Around the Web: A presentation on Open Data at the Ontario Library Association Super Conference [Confessions of a Science Librarian] - lookfi

  5. Pingback: Why Should I Share My Data? » Data Ab Initio