Goodbye PLOS Blogs, Welcome Github Pages

This is the last Gobbledygook post on PLOS Blogs, and at the same time the first post at the new Github blog location. I have been blogging at PLOS Blogs since the PLOS Blogs Network was launched in September 2010, so this step wasn’t easy. But I have two good reasons.

In May 2012 I started to work as technical lead for the PLOS Article-Level Metrics project. Although this is contract work, and I also do other things – including spending 5% of my time as a clinical researcher at Hannover Medical School – this created the awkward situation that I was never quite sure whether I was blogging as Martin Fenner or as someone working for PLOS. This was all in my head, as PLOS never placed any restrictions on my blogging. With the recent launch of the PLOS Tech Blog there is now a good venue for the kind of topics I like to write about, and I have started to work on two posts for this new blog.

There will always be topics for which the PLOS Tech Blog is not a good fit, and for these posts I have launched the new personal blog at Github. But the main reason for this new blog is a technical one: I’m moving away from blogging on WordPress to writing my posts in markdown (a lightweight markup language), which are then transformed into static HTML pages using Jekyll and Pandoc. Last weekend I co-organized the workshop Scholarly Markdown together with Stian Haklev. A full workshop report will follow in another post, but the discussions before, at and after the workshop convinced me that Scholarly Markdown has a bright future and that it is time to move more of my writing to markdown. At the end of the workshop each participant suggested a todo item that he/she would be working on, and my todo item was “Think about document type where MD shines”. Markdown might be good for writing scientific papers, but I think it really shines in shorter scientific documents that can easily be shared with others. And blog posts are a perfect fit.

The new site is a work in progress. Over time I will copy over all old blog posts from PLOS Blogs, and will work on the layout as well as additional features. Special thanks to Carl Boettiger for helping me to get started with Jekyll and Github pages.

Category: Thoughts | 6 Comments

re3data.org: registry of research data repositories launched

Earlier this week re3data.org – the Registry of Research Data Repositories – officially launched. The registry is nicely described in a preprint also published this week.

re3data.org offers researchers, funding organizations, libraries and publishers an overview of the heterogeneous research data repository landscape. Information icons help researchers to identify an adequate repository for the storage and reuse of their data.

I really like re3data.org, and that is not because I personally know several of the people involved in this project, or because they cited this blog in their preprint. I think that we are just at the beginning of building the infrastructure needed for research data management, and re3data.org fills an important need. In my opinion it is not enough to provide lists of research data repositories; we need additional information that can help guide researchers in selecting an appropriate research data repository. re3data.org has addressed this nicely by providing a vocabulary for the registration and description of research data repositories, and by creating a simple icon system:

[Figure: The re3data icon system. Possible values for each icon. From http://dx.doi.org/10.7287/peerj.preprints.21v1]

Future directions I would like re3data.org to take include:

  • Training and education. Researchers probably pick research data repositories mainly based on the familiarity of the repository within their community rather than the criteria developed by re3data.org. A lot more training and education is needed before researchers understand the importance of persistent identifiers, licenses and other criteria.
  • Integration. re3data.org could be made easier to integrate into existing scientific infrastructure, e.g. by using persistent identifiers such as DOIs for research data repositories, or by providing an API that makes it easier for other services to integrate with re3data.org.
  • Governance. Whether or not scientific infrastructure such as re3data.org is accepted and used by the community depends on many factors, and governance is one of the most important ones. re3data.org should seek the support of other organizations, in particular from outside Germany. Possible strategies include a governing board, running re3data.org as an independent organization, and coordinating with similar efforts such as Databib.

Category: Thoughts | 2 Comments

Metrics and attribution: my thoughts for the panel at the ORCID-Dryad symposium on research attribution

This Thursday I will take part in a panel discussion at the Joint ORCID – Dryad Symposium on Research Attribution. Together with Trish Groves (BMJ) and Christine Borgman (UCLA) I will discuss several aspects of attribution. Trish will speak about ethics, Christine will highlight problems, and I will add my perspective on metrics. This blog post summarizes the main points I want to make.


Category: Thoughts | 1 Comment

New DataCite / ORCID Integration Tool

A new service allows researchers to add research datasets – and other content with DataCite DOIs, including all figshare content – to their ORCID profile by integrating with the DataCite Metadata Store. The tool is an adaptation (or fork) of the CrossRef Metadata Search developed by Karl Ward, and was developed by Gudmundur Thorisson and myself as part of work in the EU-funded ODIN project. More details can be found here.


Category: Thoughts | 1 Comment

Announcing Markdown for Science Workshop on June 8th

On Saturday June 8th – exactly a month from today – the PLOS San Francisco offices will host a workshop/hackathon about using markdown for science. A lot of people are experimenting with markdown for authoring scientific articles – see blog posts here and here, or my post here, and the scientific manuscript here.

Markdown is a simple markup language for text, and is primarily used for HTML content on the web, but can also be converted to PDF, LaTeX and others. One challenge with markdown is that there are a number of slightly different “flavors” out there, from the original markdown to multimarkdown, github-flavored markdown and pandoc. Some of the advanced formatting of scientific documents – tables, citations, math – is still a challenge for markdown.

Will markdown become our next authoring format for scientific content? Will there be yet another flavor, scholarly markdown? How will markdown writing tools be different from LaTeX tools or Microsoft Word? If you care about any of these questions and are in or near San Francisco, join us for a full day on June 8th. Free registration is open at http://mdsci13.eventbrite.com. We are collecting workshop ideas at https://github.com/karthikram/markdown_science/wiki/workshop, and the Twitter hashtag is #mdsci13.

This event is organized by Stian Haklev and myself, with generous support from a 1K Challenge prize from Force11, and hosting provided by PLOS.

Category: Snippets | 4 Comments

Baby steps toward better metrics

Article-Level Metrics provide new ways to look at the impact of scholarly research. Two important concepts are a) to track metrics for individual scholarly articles instead of using numbers aggregated by journal, and b) to go beyond citations and also include usage stats and altmetrics.

Article-Level Metrics is also doing something else: instead of tracking impact by year, it looks at usage, altmetrics and citations in real time. There might have been technical reasons for yearly tracking 20 years ago, but in 2013 there really is no longer any reason why scholarly impact should be tracked on a yearly basis. Unfortunately there is one big stumbling block:

The publication date of a scholarly article is often difficult or impossible to obtain. Publication year may be the only available information.

A good example is CrossRef. They provide a lot of interesting metadata about an article and make this information available in a very nice search interface. But they only require the publisher to provide the publication year; the publication month and day are optional. There are many other examples of journals and services that just can’t tell you when exactly an article was published. This might have made sense when periodicals were printed on paper, but doesn’t work for digital content.
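
One way to live with this limitation is to record how precise each publication date actually is, so that real-time metrics can at least flag articles whose age is only known to the year. The following is a minimal sketch in Python, not how CrossRef or the PLOS Article-Level Metrics application handle dates: the names `PublicationDate`, `parse_partial_date` and `days_since_publication`, and the list-of-integers input (year, optional month, optional day), are assumptions made for this illustration.

```python
from datetime import date
from typing import NamedTuple, Sequence


class PublicationDate(NamedTuple):
    earliest: date   # earliest day the article could have been published
    precision: str   # "year", "month" or "day"


def parse_partial_date(parts: Sequence[int]) -> PublicationDate:
    """Turn [year], [year, month] or [year, month, day] into a date plus precision.

    Hypothetical helper: a missing month or day defaults to 1, and the
    precision field records how much of the date was actually provided.
    """
    if not 1 <= len(parts) <= 3:
        raise ValueError("expected 1 to 3 date parts: year, month, day")
    year = parts[0]
    month = parts[1] if len(parts) > 1 else 1
    day = parts[2] if len(parts) > 2 else 1
    precision = ("year", "month", "day")[len(parts) - 1]
    return PublicationDate(date(year, month, day), precision)


def days_since_publication(pub: PublicationDate, today: date) -> int:
    """Days elapsed since the earliest possible publication date."""
    return (today - pub.earliest).days
```

With only a publication year, `parse_partial_date([2012])` has to assume January 1st, so the computed age of an article that really appeared in December 2012 would be off by almost a year. Keeping the `precision` value next to the date makes that uncertainty visible instead of hiding it.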

 

Category: Snippets | 3 Comments

You should be able to install my software in less than one hour – or why DevOps is important

Cameron Neylon yesterday wrote a great blog post about appropriate business models for shared scholarly communications infrastructure. This is an area I have also been thinking about a lot recently, and in this post I want to add a technical perspective (and an announcement) to the discussion.


Category: Thoughts | Comments Off

Mendeley and Elsevier

Earlier this week the rumors that started in January became official: Elsevier is buying Mendeley (see also here). A lot has been written about this announcement, in particular about the fear that Mendeley as a product and organization will turn into something not as open and collaborative as before.

I first met Victor and Jan from Mendeley in 2008 and did an interview with Victor in September 2008. We worked together in the organization of two Science Online London conferences (2009 and 2010, together with Nature.com and others), and my current job started with an entry for an API programming contest co-organized by PLOS and Mendeley, with the first lines of code written in the Mendeley offices during the Science Online London 2011 hackathon. I wish Mendeley all the best with their new parent.

What this acquisition signals to me is that commercial publishers are now moving into the software tools for scientists business at full speed. They have always done this, but with ReadCube by Digital Science (a Nature Publishing Group sister company) in 2011, the acquisition of Papers by Springer last year and now Mendeley, reference management now often means using a tool owned by a publisher. This market used to be dominated by academic software such as Zotero and commercial software vendors such as Thomson Reuters (Endnote) or ProQuest (RefWorks).

For me this trend signals that publishers have realized that we are moving into an Open Access publishing model, which in contrast to subscription publishing is not about owning the content, but about providing valuable services around content that is free to read and reuse.

Category: Thoughts | 9 Comments

Comment: the case for open preprints in biology

Last week Philippe Desjardins-Proulx et al. published the article The case for open preprints in biology – naturally as a preprint on figshare. The article sees preprint servers as a great opportunity for open science, and discusses the status of preprints in the biological sciences. In this blog post I want to add some comments to the text.


Category: Reviews | 2 Comments

Some Thoughts on Beyond the Paper

Today the journal Nature has released a special on the Future of Publishing. It includes a lot of interesting reading, but I want to focus on the comment Beyond the Paper by Jason Priem. In the comment Jason describes his vision of the future of scholarly communication, a future where many of today’s roles for articles and journals will be replaced by the decoupled journal and online tools taking the lead in dissemination and filtering of scholarly content.

Jason makes a strong case for this vision, and takes his time to also discuss the concerns and challenges. He doesn’t have the space to discuss in more detail how we get to that future, and in particular what the role of researchers, publishers, libraries and funders will be in that transition.

Jason’s vision will probably be overwhelming for many researchers, and might not directly address what is probably the biggest issue for most researchers: funding for grants and jobs is limited, and the processes we use to select for good science and good scientists are inefficient and often arbitrary. Most students entering graduate school will not be able to have a career in academia, and most academics will say that they spend far too much time with evaluations – of their own work and the work of others. It is unclear to me how we can get from the current system – where one misstep such as a denied grant or submission to the wrong journal can mean the end of a career – to the system that Jason envisions. The current climate doesn’t really foster experimentation by researchers and I am interested to understand how researchers can take part in this process of change.

The vision of the decoupled journal is very threatening for some of the stakeholders of the current scholarly communication ecosystem, in particular publishers and libraries. Every journal publisher and library knows that it has to reinvent itself to survive the digital transformation, but a vision that is built around a new ecosystem of service providers needs to be clear about how publishers and libraries can be part of the transformation process.

Lastly, I disagree with the notion that today’s publication silos will be replaced by a set of decentralized, interoperable services that are built on a core infrastructure of open data and evolving standards — like the Web itself. I would argue that both scholarly communication and the web in general have a tendency for centralization, and that scientific infrastructure needs to be interoperable first and decentralized second. Without a focus on interoperability the future of scholarly communication will not be open and in the hands of many, but will be a race to become one of the dominant players in this new ecosystem, and we might end up with not 1000s of libraries and publishers but just a handful of technology companies holding the keys to our scientific infrastructure.

Priem, J. (2013). Scholarship: Beyond the paper. Nature, 495(7442), 437-440. DOI: 10.1038/495437a

Category: Thoughts | 9 Comments