Diving into the haystack to make more hay?

Diving into the haystack to make hay is one of the most inefficient activities imaginable (as well as a figurative absurdity). Doing science inevitably entails discovery, but the process has historically been far more difficult without effective tools to support it. With the quickening pace of scholarly communications, the vast volume of information available is overwhelming. The key puzzle piece might be as hard to find as a needle, and moreover, no scholar has the time to dive into the hay-verse to make more hay.

With recent advances in data science and information management, research discovery has become one of the most pressing and fastest-growing areas of interest. Reference managers (ReadCube, Mendeley, Zotero) and dedicated services (PubChase, Sparrho, Eigenfactor Recommends) have been experimenting with novel ways to deliver relevant articles to scholars. While we have by no means exhausted the ways to innovate in article discovery and make those innovations mainstream, we have yet to turn our attention to scholarly objects beyond the article. The utility of finding the full research narrative is a given. But the potential value of discovering and accessing scholarly outputs that are created before final results are communicated, or that are never integrated into an article at all, is almost untapped. Until now.

Scholarly Recommendations

Last year, our partnership with figshare began with the hosting and display of Supporting Information files on the article, accommodating the broad range of file types found in this mixing pot. Both the Supporting Information file viewer and figshare’s PLOS figure and data portal increase the accessibility of the data and article component content associated with our articles. The latter makes PLOS figures, tables, and SI files searchable based upon key terms of interest.

Beyond the article: Today, we continue to build on the figshare offering with the launch of research recommendations, a service which delivers relevant outputs associated with PLOS articles and beyond. This begins to fill a critical need for tools that address the full breadth of research content. Rather than being limited by the article as a container, we can now present a far broader universe of scholarly objects: figures, datasets, media, code, software, filesets, scripts, etc.

figshare recommendations widget

 Hansen J, Kharecha P, Sato M, Masson-Delmotte V, Ackerman F, et al. (2013) Assessing “Dangerous Climate Change”: Required Reduction of Carbon Emissions to Protect Young People, Future Generations and Nature. PLoS ONE 8(12): e81648. doi: 10.1371/journal.pone.0081648

While papers tell the story of the final research narrative, data – the building blocks of science – are especially critical to the progress of research. They underlie the results expressed in a narrative that is published in papers. The most rigorous and expansive path of discovery includes not only related articles, but arguably even more fundamentally, data and a host of research outputs that lead up to the paper or may even be independent of an article. PLOS’ data availability policy was the foundational step, ensuring that data is publicly accessible. Delivering strong recommendations to surface relevant research now adds even more value for the scholarly community.

Beyond the publisher: the recommendations delivered by figshare extend beyond research outputs attached to PLOS publications. In fact, they are retrieved from the entire figshare corpus of over 1.5 million objects. We want to enrich the discovery experience for users with the full breadth of OA research outputs, regardless of whether they have been published as part of a research paper. Not every scholarly output fits in an article, yet it may well be instrumental to others’ research.

Right at your fingertips

The recommendations are displayed for every PLOS article on the Related Content tab. To select the most relevant ones, figshare uses Latent Semantic Analysis across the entire PLOS corpus to build a “semantic” matrix, which is then used to retrieve a list of the most closely related entries for each article. Five recommendations are displayed, with the option to load more. The type of file is denoted by an icon, or by a thumbnail when available, with a preview of the object on hover. The full view of the file is available by clicking on the thumbnail. The content and all its metadata are available on figshare via the highlighted title. Keyword tags are also displayed and can be used to find other associated content of that kind.
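For readers curious about what Latent Semantic Analysis involves, here is a minimal sketch of the general technique. It is not figshare's implementation: the toy corpus, the function names, and the choice of raw word counts (rather than a weighting such as tf-idf) are all assumptions made for illustration. It uses only the Python standard library, so instead of calling an SVD routine it finds the top eigenvectors of the document-by-document Gram matrix, which yields the same document coordinates as a truncated SVD of the term-document matrix.

```python
import math
import random
from collections import Counter

def term_doc_matrix(docs):
    """One row per document, one column per vocabulary term (raw counts)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    rows = []
    for d in docs:
        row = [0.0] * len(vocab)
        for w, c in Counter(d.lower().split()).items():
            row[index[w]] = float(c)
        rows.append(row)
    return rows

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def top_eigenpairs(G, k, iters=500, seed=0):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration plus deflation."""
    n = len(G)
    rng = random.Random(seed)           # fixed seed keeps the sketch deterministic
    G = [row[:] for row in G]           # work on a copy
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        norm = math.sqrt(dot(v, v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [dot(row, v) for row in G]
            norm = math.sqrt(dot(w, w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        lam = dot(v, [dot(row, v) for row in G])
        pairs.append((lam, v))
        for i in range(n):              # deflate: remove the component just found
            for j in range(n):
                G[i][j] -= lam * v[i] * v[j]
    return pairs

def lsa_embeddings(docs, k=2):
    """Document coordinates in a k-dimensional latent space (rows of U_k S_k)."""
    A = term_doc_matrix(docs)
    G = [[dot(a, b) for b in A] for a in A]   # Gram matrix A A^T
    pairs = top_eigenpairs(G, k)
    return [[v[i] * math.sqrt(max(lam, 0.0)) for lam, v in pairs]
            for i in range(len(docs))]

def cosine(a, b):
    na, nb = math.sqrt(dot(a, a)), math.sqrt(dot(b, b))
    return dot(a, b) / (na * nb) if na and nb else 0.0

def recommend(emb, i, top_n=5):
    """Indices of the items most similar to item i, best first."""
    scores = [(cosine(emb[i], e), j) for j, e in enumerate(emb) if j != i]
    scores.sort(key=lambda t: (-t[0], t[1]))
    return [j for _, j in scores[:top_n]]

# Hypothetical descriptions standing in for article and object metadata.
docs = [
    "carbon emissions climate warming policy",
    "climate warming carbon dataset temperature",
    "primate skeleton fossil eocene morphology",
    "fossil primate morphology skeleton images",
    "temperature records climate carbon emissions",
]
emb = lsa_embeddings(docs, k=2)
print(recommend(emb, 0, top_n=2))   # the two other climate items
```

In this toy corpus the climate and fossil descriptions share no words, so the two latent dimensions separate them cleanly. A real corpus would call for term weighting, many more dimensions, and a proper sparse SVD library.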

figshare recommendations widget2

Franzen JL, Gingerich PD, Habersetzer J, Hurum JH, von Koenigswald W, et al. (2009) Complete Primate Skeleton from the Middle Eocene of Messel in Germany: Morphology and Paleobiology. PLoS ONE 4(5): e5723. doi: 10.1371/journal.pone.0005723

Mark Hahnel, founder of figshare, said “PLOS has continuously demonstrated their desire to advance academic publishing and we’re always very happy to play a part in their innovations. The latest developments will ultimately make figshare content more discoverable and benefits our user base as well as PLOS readers and authors.”

With the figshare recommendations, it is our aim to advance the process of discovery and accelerate the research lifecycle itself. Please check them out on the Related Content tab of every PLOS article, dig into the offerings, and see where your research goes. We welcome your thoughts and reflections. Are these useful and relevant? Would you like them delivered through additional channels? Feel free to comment here or contact @jenniferlin for PLOS information. figshare is also available via email, Twitter, Facebook or Google+.

Cross-posted on figshare blog.


2 Responses to Diving into the haystack to make more hay?

  1. Jason Schuman says:

    I am very encouraged with the drive for increased transparency by providing a mechanism for raw data or a greater amount of experimental data to be publicly available. What can be done to increase the amount of data corresponding to publications that is made publicly available?

  2. Pingback: Make data sharing easy: PLOS launches its Data Repository Integration Partner Program | PLOS Tech
