Why can’t I reuse these tables and figures?

Tables and figures contain the data of a scientific paper in condensed (and often visually appealing) form. This is why they are among the first things we look at, and why they are often reused when we discuss the paper in a presentation or blog post.

From doi:10.1371/journal.ppat.1001042. Image credit: Thomas J. Hannan, Washington University.

Electronic publication has dramatically simplified the reuse of tables and figures, and reuse has therefore become very common – you will probably find reused material in most presentations given in academic institutions or at conferences.

Most authors will probably be happy that their results are disseminated, and reuse is likely to lead to more people reading the full paper and citing the work.

From doi:10.1186/1471-2407-10-286.

But this reuse has two problems. The first problem is copyright. Many journals own the copyright of the papers they publish and don’t allow reuse without prior permission. Unfortunately, copyright is a complicated issue that also differs between countries. Most researchers assume “fair use” when they reuse material, but this might not apply to all situations, e.g. presentations at conferences. And many researchers don’t realize that they have often given away the copyright to their own work, so that they can’t show a figure from one of their own papers without permission.

Many publishers have automated the process of obtaining permissions for copyrighted work, e.g. using the Rightslink system of the Copyright Clearance Center. But it still requires a considerable investment of time (and often money) to obtain all permissions, especially since these are usually one-time permissions only. Limited awareness of the details of copyright law, combined with the extra work required, means that many researchers probably don’t obtain permissions prior to reuse.

The solution to the copyright problem is obviously to use material with a Creative Commons license whenever possible, as I have done in this blog post. And most Open Access papers are published under such a license, so there is plenty of material to choose from.

But there is also a second problem with reusing tables and figures. They were designed to be part of a paper and often look terrible in a presentation. This is particularly true for tables.

From doi:10.1371/journal.pone.0006022.

The solution to this problem is to provide the data behind the table or figure, so that the information can be displayed in a way that makes sense in a presentation. Here we usually have to reduce the amount of information, but it could also mean that we remix the content with other sources. The Creative Commons licenses discussed above are not appropriate for data. Whenever possible, scientific data should be placed in the public domain.
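As a rough illustration of what this could look like in practice, here is a minimal Python sketch. It assumes a publisher provides the table data as a CSV file; the column names and values below are made up for illustration and do not come from any of the papers cited here. The sketch reads the table and keeps only the few rows that would fit on a slide.

```python
import csv
import io

# Hypothetical table data, as a publisher might provide it alongside a paper.
# Column names and values are placeholders for illustration only.
TABLE_CSV = """group,n,mean_value
A,12,3.4
B,15,5.1
C,9,2.2
D,20,4.8
"""


def top_rows(csv_text, sort_column, limit):
    """Return the `limit` rows with the largest numeric value in `sort_column`."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    rows.sort(key=lambda row: float(row[sort_column]), reverse=True)
    return rows[:limit]


def as_slide_lines(rows, label_column, value_column):
    """Format the selected rows as short text lines suitable for a slide."""
    return [f"{row[label_column]}: {row[value_column]}" for row in rows]


if __name__ == "__main__":
    selected = top_rows(TABLE_CSV, "mean_value", 2)
    for line in as_slide_lines(selected, "group", "mean_value"):
        print(line)
```

The same rows could just as easily be passed to a plotting package instead of being printed. Once the data are available in a simple format, reducing or remixing them takes only a few lines.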

It is important to distinguish the publication of table and figure data from the publication of the whole research dataset. The open questions with the latter (e.g. standard data formats, appropriate repositories, archiving) don’t apply to the former. This means that publishers could start providing these data immediately. I’m confident that they would see an increase in paper downloads and citations. But more importantly I hope this would lead to better presentations in seminars and at conferences.


13 Responses to Why can’t I reuse these tables and figures?

  1. Shaddam IV says:

    This may be irreverent (and irrelevant), but those pathogens up there look like the Hand of God.

  2. Travis says:

    I don’t doubt that anything that makes it easier to demonstrate the results of a paper at conferences will lead to increased citations. One question – can’t the data behind a table or figure typically be gleaned from the table or figure itself? There have been several times when I have used a summary table to create my own figure for our blog or a conference presentation. Do you mean that the data should simply be provided in a more user-friendly format (e.g. I wouldn’t have to guesstimate the concentration of plasma triglycerides from looking at a bar graph)?

  3. Neil says:

    “Can’t the data behind a table or figure typically be gleaned from the table or figure itself?”

    In a word: no. The vast majority of figures, tables or calculations in scientific publications cannot be reproduced. This is a big problem.

    If the data were available as, say, a CSV file, anyone could read it into their favourite plotting/stats package, generate an approximation of the published figure or do the analysis for themselves. Taking that idea further leads to literate programming, where a file contains both the code that performs the analysis and the formatting instructions to create a published report with the results.

  4. i am mystified as to why you are writing about this since the blogosphere has already gone through this — several times. we CAN use tables and figures in a commentary or “translation” of a scientific paper without asking for permission.

  5. Martin Fenner says:

    Grrl, I’m writing about this because it is not only about using tables and figures in blog posts. I have just spent 9 days and several emails to get a permission for a single table I want to use in a presentation.

    And, I really want the data that were used for the table or figure, and not the table/figure as image file.

  6. Ideally not only would the data in the table be part of the publication (such as in the form of a CSV file), but the table itself would be represented in machine-readable form. That is, with some kind of metadata indicating the number of rows, columns, and their contents. Does a standard for specifying a table exist? This is important for automated data mining of the conventional literature. I’m also interested in it so that evidence charts can be mined.

  7. What I’m talking about with machine-readable tables is sort of a halfway measure, easier to achieve in the short term than the literate programming mentioned by Neil… which would be even better

  8. Pingback: Quick Links | A Blog Around The Clock

  9. Greg Tyrelle says:

    I find tables in PLOS/BMC journals as PDFs more often than not. Multiple times I’ve had to extract this data with ad hoc pipelines of pdftotext and regexps. My sense is that publishers will gain more traction if they help to solve these problems. Maybe then we could be routinely doing things like this: http://chem-bla-ics.blogspot.com/2010/09/visualizing-data-embedded-in-xhtmlrdfa.html

  10. Travis says:

    Neil,

    Could you give me a specific example? In my area of research, it’s usually pretty straightforward to get the data from the table/figure, but I don’t work with very complicated data. And it seems that if there is enough data for someone to re-do an analysis, wouldn’t that require publishing pretty much the whole dataset?

    Grrl,

    I thought that the Sb experience was that we could re-create tables and figures on our websites, but not copy and paste an image of the tables/figures themselves. Is it true that it’s kosher to simply copy and paste a figure?

  11. Martin Fenner says:

    Alex, I agree that directly embedding the data for tables and figures would be even better. Semantic Web experts will have a better answer to this, but I would prefer a simple solution that can be easily implemented.

    Travis, I agree that table data are often easy to get. But it is still a manual process, and doesn’t work for more complicated tables or figures.

    I will talk about this topic (among other things) at the STM Frankfurt Conference next week. I hope that some of the publishers in the audience will start thinking about embedding table and figure data.

  12. Pingback: Blogging Beyond the PDF | Gobbledygook

  13. Pingback: Blogging Beyond the PDF | Social Media Master