The Beauty of Scientific Papers, The End of PDFs

There are endless new web technologies that academic journals should experiment with, yet they don’t all translate to the world of PDFs. Do we bound ahead, making richly interactive and accessible HTML articles while the PDF suffers? Should we support continued innovation on the PDF side?

We raise these questions in response to an appreciated mention of the PLOS journals in a post by Stefan Washietl over at Paperpile’s blog. Stefan looks at design trends in academic publishing and at beautiful papers that have come out of some of the most influential journals of the past 350 years.


Of course, it’s great to have both the PLOS article PDFs and HTML pages highlighted as beautiful examples, but we think there is plenty more work to be done. We’re working on a mobile-optimized version of our journals, new journal homepages, and we have a few tweaks and refinements to the article page coming down our development pipeline, all based on user feedback and reactions to our December 2012 redesign. And like Stefan, we’re excited to see new article viewers like eLife’s Lens and NCBI’s PubReader exploring new ways of interacting with research content on the web.

So in light of all these exciting developments on the web, we echo Stefan’s final question: “for how much longer will we print PDFs?” And toward that future, what do you think it will take for the community to completely abandon PDFs in favor of HTML articles?

This entry was posted in Tech.

5 Responses to The Beauty of Scientific Papers, The End of PDFs

  1. Martin Fenner says:

    I agree that there are many good reasons to keep copies of articles offline, in particular if they are from subscription journals. epub is the standard format for doing this with HTML, and it has become the standard for ebooks (together with the related mobi used by Amazon). PubMed is providing article downloads in epub format, but overall the format is not yet widely supported by publishers, reference managers, etc.

  2. Chris Rusbridge says:

    While appreciating the advantages of HTML-based articles, so far they lack one of the major advantages of PDFs. The latter are portable and relatively fixed. I can save them, transfer them from device to device, leave them and open them many years later, and they will be the same.

    Try saving an HTML article, moving it to a different computer, opening it in a different browser, leaving it and coming back a few years later. It’s not impossible that these things will go well, but it is not the norm; it requires care and work, whereas with a PDF, it just kinda happens by default.

    HTML articles clearly have real advantages over PDF articles for some research. But fixity and portability are important, and currently broken in HTML.

    If you want to concentrate on articles in HTML without the capability of a PDF as well, then fixity and portability are areas to concentrate on. Which probably means a packaging standard in HTML, or at least among journals. The Safari .webarchive is an example, but not sufficient as it is not supported by other browsers (and probably not Open, either).

  3. John Peloquin says:

    Printing articles to physical paper will likely be with us for a long time, regardless of whether they are delivered as html or pdf. Anecdotally, most of the scientists I work with use hard copies and file folders to organize their literature collections. As their collections grow, the time required to convert these collections to a new organizational system also grows. The time available to undertake such a conversion decreases as one’s career progresses, so I expect most of the people who print articles today as graduate students will still be printing articles when they retire. Organizing paper copies in file folders has certain advantages: reliability, no vendor lock-in, unlimited viewing area, ease of browsing, and easy annotation. The principal disadvantage is difficulty searching and reorganizing the collection. Overall, the system works quite well, especially when combined with citation management software. In this usage case, a migration of publishers from pdf to html is essentially irrelevant; it can bring no new advantage.

    The choice of html or pdf is somewhat more important for those of us who digitally maintain our article collections. However, digital organizational schemes must solve the same set of problems as a paper-based scheme, and end up resembling the physical filing cabinet to some extent. Some people reproduce the physical filing cabinet exactly with files and folders on their hard drives; others (including myself) use searchable databases. Annotation of articles (scribbling in the margins, attaching notes) remains important, as does the ability to reliably use the collection over decades. This means not depending on third party webservers that may change or disappear. Annotation tools for html have unfortunately not yet matured. Since there are options for annotating pdfs, I solve the html annotation problem by printing html articles to pdf. Local storage and viewing of web content is, however, possible by downloading a page through the browser or with a database-driven application like Evernote or Zotero. Results vary between sites, especially when the sites make heavy use of javascript (like most journal websites). Overall, the digital journal article collection doesn’t at the moment benefit from html any more than the paper-based system.

    Still, I do think that html content delivery has the potential to be a net benefit. Html (really, xml) can be repurposed much more easily than postscript by using different stylesheets or a different viewing application. This raises the possibility of displaying the article content in more convenient ways than a static pdf can. The way Lens presents figures alongside their in-text references is very convenient; not having to flip through the article to cross-reference figure and equation numbers is a huge help to understanding the text. Html can also be dynamically reformatted depending on the device, for instance to splay across three widescreen monitors or fit on a phone display. It is also easier to rearrange or excerpt xml content than pdf content, for example to produce a summary of one’s notes alongside the relevant bits of article text. Dynamic display may also have a detrimental effect on recall, though: I find that I remember articles in part by the appearance of their typeset layout. Even so, I’m definitely looking forward to experiencing the advantages of dynamically formatted content.

    In summary, viewing html articles in web apps like Lens is superior to pdf when reading for fun, but for work, the necessity of organizing and integrating information from many content providers makes pdf a better choice. To replace the pdf ecosystem with html, we need 1) convenient storage of html-based articles for local viewing, 2) many options for local viewing of html content, just like we have many choices when viewing pdfs, 3) html annotations that are compatible between viewer applications, and 4) these html article viewers must provide a reading experience superior to pdf viewers. These aims require the adoption of an open standard for html delivery of journal articles. Even though this might at first seem like reinventing the wheel, formats that separate content from presentation, like html, are a fundamentally stronger choice than pdf for display and reuse of content in diverse environments.

  4. Grant Jacobs says:

    Final paragraph confuses two different questions. People can want to use/read PDFs without printing them.

    Don’t presume everyone OK doing (extensive) reading on mobile devices – many find the small screens too limiting (incl. those with poor eyesight).

    PDFs good for filing for later reading. (Don’t forget that full-text searching of own private library can be very useful for some.)

    Yes, you can store HTML locally, but as a format it naturally “wants” to be distributed; web archives are still pretty clumsy.

  5. Tom Ulrich says:

    Even though wireless connectivity is growing by leaps and bounds, I think there will always be a need for some kind of option for offline viewing of web-based content while in low/no bandwidth environments. For instance, I love the recent crop of online science mags, but when I’m on a subway train with my WiFi-only iPad, unless I’ve saved them already with an offline viewer like Pocket or Instapaper, they might as well be locked in a safe someplace. The same for papers.
