On Microattribution

At the Science Online London Conference later this week I will moderate a session on microattribution, together with Mike Peel, Bora Zivkovic and Scott Edmunds. I thought that microattribution would be an established concept, so I was surprised to find so little information about it. Wikipedia doesn’t know about microattribution. A search in PubMed retrieves only two hits (the 2011 Nature Genetics paper mentioned below and an editorial in the same issue). One of the first mentions of the term appears to be an August 2007 Editorial in Nature Genetics (Compete, collaborate, compel), but at the time the term also included what we today would call data citation. It therefore might be a good idea to try a definition:

Microattribution ascribes a small scholarly contribution to a particular author.

What matters in this definition is the small scholarly contribution, which until now we have been unable to associate appropriately with its contributor. Such a contribution is too small to merit a scholarly paper or publication as a dataset. And in most cases we have not bothered to provide unique identifiers for the contribution, the author, or both.
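To make the definition a little more concrete, here is a minimal sketch of what a microattribution record could look like: one contribution identifier tied to one contributor identifier. The class name, the field names and the placeholder identifiers are all assumptions made for illustration. They do not describe an existing scheme or any particular database.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Microattribution:
    """One small contribution ascribed to one contributor (hypothetical schema)."""
    contributor_id: str   # assumed: some unique identifier for the author
    contribution_id: str  # assumed: some unique identifier for the contribution
    description: str      # free-text note on what was contributed


def contributions_by_author(records):
    """Group contribution identifiers by contributor identifier."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.contributor_id].append(record.contribution_id)
    return dict(grouped)


# Placeholder identifiers, invented for this example only.
records = [
    Microattribution("author:0001", "variant:example-1", "Reported a variant"),
    Microattribution("author:0001", "variant:example-2", "Reported a second variant"),
    Microattribution("author:0002", "variant:example-3", "Reported a third variant"),
]

print(contributions_by_author(records))
```

The point of the sketch is only that both sides of the link need an identifier. Without one for the author or for the contribution, the grouping step has nothing reliable to group on.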

A very good example of where microattribution can be valuable is the description of genetic variation. A March 2011 Nature Genetics paper (Systematic documentation and analysis of human genetic variation in hemoglobinopathies using the microattribution approach) concluded that microattribution demonstrably increased the reporting of human variants, leading to a comprehensive online resource for systematically describing human genetic variation. One example described in the paper is the genetic variation in the promoter of the KLF1 erythroid transcription factor, which explains differences in the level of fetal hemoglobin (HbF). Most of these variants were never published in a paper (the blue squares in the figure).

Figure 3 from Giardine B et al. Nature Genetics 2011. dx.doi.org/10.1038/ng.785.

For the first time we now seem to have both the technology and the willingness to enable microattributions on a large scale. There will be ample time for discussion in the microattribution session on Friday, but I’m personally most interested in the next practical steps to move microattribution forward. My background is in unique author identifiers (ORCID), and my co-moderators bring their experience with Wikipedia (Mike Peel), science blog aggregation (Bora Zivkovic) and crowdsourcing of sequencing efforts (Scott Edmunds). Some of the questions that I would like to address in the session are:

  1. Should all microattribution information be collected in one, several or many places?
  2. Do we need one, several or many identifier schemes for contributions and authors?
  3. What level of detail should we allow for microattributions?
  4. Should microattributions use persistent identifiers?
  5. Can and should we keep our scholarly contributions separate from other contributions?

My simplified view of the scholarly record, updated

I have said before that I think attribution should be separated from evaluation, and I think this is also true for microattribution. The main reason is that evaluation is something we still know very little about, and a scholarly record available to everybody will make it much easier to make progress here. I don’t think anybody knows yet what distinguishes a good microattribution from a bad one, or whether it is possible to compare the scientific impact of 10 or 100 microattributions to that of one scholarly paper.

Disclaimer: I sit on the Board of Directors of the Open Researcher & Contributor ID (ORCID) initiative which aims to help solve this and related problems.

