What Users do with PLOS ONE Papers

Inspired by four recent blog posts and their comments ("Comments at journal websites: just turn them off", "Open Access and The Dramatic Growth of PLoS ONE", "No Comment?", and "If you email it, they will comment"), I created a graphic to show what users do with PLoS ONE papers. As always, the data behind the graphic are openly available. I think that the number of times a paper is informally discussed (comments, Facebook, science blogs, etc.) should be much larger than the number of formal citations. The challenge is of course to have technology that captures all these discussions. This is much more difficult than for bookmarks or citations, and is obviously what altmetrics is all about. The blog posts I link to above also express another feeling: that there are still too many barriers for scientists to take part in the informal discussion of scholarly research on the web, in particular as comments on journal websites. Hat tip to David McCandless for inspiration.

Update 08/02/12: The publication of the dataset used in this chart was delayed, but the data are now available at the link provided. 


16 Responses to What Users do with PLOS ONE Papers

  1. Sylvain says:

    The graph seems to imply inclusion between, for instance, PLoS comments and PDF downloads, whereas I assume it only aims at comparing population sizes. Is that right?

  2. Martin Fenner says:

    Sylvain, I wanted to imply inclusion, even though this is of course an oversimplification. But I think that the process often works similarly to this: read a short piece of the paper (abstract, etc.) -> read fulltext -> save in reference manager -> discuss informally -> cite in a scholarly paper. The number of people of course gets smaller at every step.

  3. wohaaa… So. Many. HTML. Views.

    Seriously, I thought the PDF to HTML ratio would be higher, even though I realize you basically “have” to see the HTML first (unlike, say, Wiley’s Ecology and Evolution).

  4. Martin Fenner says:

    Philippe, yes this is indeed a lot of HTML views. You have to add the 12,492,110 HTML views at PubMed Central plus the unknown number of views in institutional repositories. The HTML to PDF ratio of about 4 is surprisingly consistent for most PLOS papers.

  5. Interesting post! I would be very interested to see Twitter mentions of a paper, and discussions of papers on blogs, next to the Facebook likes and discussions.

  6. Martin Fenner says:

    Twitter for PLOS articles was started just recently, so the numbers are still small. Blog posts about PLOS articles are difficult to find; Research Blogging is one of the best sources (ScienceSeeker is another one).

  7. What language did you make the graphic in?

  8. Martin Fenner says:

    Scott, I calculated the numbers with R, and then made the chart with OmniGraffle, using 1/10 of the square root of the numbers as length in px.

  9. Pingback: Predicting the growth of PLoS ONE « LIBREAS.Library Ideas

  10. Why do these data not line up with the underlying dataset? There are 47,029 PLoS ONE articles in the dataset, yielding 278,270 CrossRef citations, as I tally it in the Excel spreadsheet you reference as the source for these data.

  11. Martin Fenner says:

    Kent, I think you looked at the dataset of all PLOS papers (including PLOS Biology, PLOS Medicine, etc.) whereas I was interested in the PLOS ONE subset.

  12. Yes, that’s what it is. The Excel just suppresses the fields holding the data for the other journals. Thanks for the help.

  13. Martin Fenner says:

    There is another small difference: I looked at the July data dump, which should be available shortly. The download link still points to the April data.

    One exercise you can do with this dataset is to compare citation counts from CrossRef, Web of Science, Scopus and PubMed Central. A preliminary analysis shows that they are very highly correlated, but I would like to understand where they differ.

  14. Pingback: Data Integrity and Presentation — Journalism, Verification, Skepticism, and the Age of Haste « The Scholarly Kitchen

  15. I’d suggest you update the blog post itself to indicate that the underlying data aren’t available yet and only extend through April. That will make it clear to visitors and readers.

  16. Pingback: Como redes sociais dão medidas do impacto de um artigo científico
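
A note on the scaling mentioned in comment 8: the lengths in the chart were described as 1/10 of the square root of the numbers, in px. The original numbers were calculated in R and the chart was drawn by hand in OmniGraffle, so the Python below is only a minimal sketch of that one rule, not the code behind the graphic. The function name `length_px` and the `scale` parameter are made up for illustration, and the two example counts are simply the figures quoted in comments 4 and 10.

```python
import math


def length_px(count, scale=0.1):
    """Length in px for a count, using scale * sqrt(count).

    scale=0.1 corresponds to "1/10 of the square root" from comment 8.
    The function name and parameter are hypothetical.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    return scale * math.sqrt(count)


# Figures quoted in the comment thread, used here only as examples.
examples = {
    "HTML views at PubMed Central (comment 4)": 12_492_110,
    "CrossRef citations, all PLOS journals (comment 10)": 278_270,
}

for label, count in examples.items():
    print(f"{label}: {count:,} -> {length_px(count):.1f} px")
```

Because the length grows with the square root, a shape drawn with that side length has an area proportional to the count: 100 times as many events gives a side that is only 10 times as long.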