Unmeasurable Science

On Wednesday PLoS BLOGs launched with a splash. We (both PLoS BLOGs as a whole and me individually) got a lot of positive feedback and words of encouragement, so we are off to a good start. As both our community manager Brian Mossop and I are currently in London for the Science Online London Conference, we could celebrate the launch in person, with a good pint of British ale on Thursday evening.

Today I want to talk about something that has been stuck in my head since a conversation a few weeks ago with some friends (all esteemed professors in biology or medicine) over another beer. This has of course been discussed before, both on this blog and elsewhere. Doing science is not only about doing exciting research and communicating the results to your peers and the public. Measuring scientific output – of funded research projects, a particular researcher, or an institution – seems to be equally important. I would argue that in recent years this has even become the most important aspect of doing science. The successful researcher of today is not necessarily a brilliant mind, skillful experimenter or gifted communicator, but a good manager of science. Grants need to be written, collaborations and networks built and maintained, and papers published.

This is all well and good in the sense that researchers should be held accountable for how they use their funding, which often comes from public sources. And we want to fund the right projects, i.e. those with the highest likelihood of achieving something new and exciting. But there are two very big problems with this:

First, we don’t really know how to evaluate science, particularly in numbers that can be used to compare research projects. This problem is aggravated by the fact that funding (and hiring) decisions are predictions based on past performance. Excellent science is by definition new and groundbreaking, and predicting scientific progress is really hard to do. The past performance of a researcher, the research environment they are working in (colleagues, scientific equipment, etc.), and of course the project outline written down in the proposal are all very helpful. But can we really predict the next Nature or Cell paper before a project has even started? The evaluation of scientific output is also extremely difficult. Do you just look at published papers? And if so, how do you evaluate the scientific impact of those papers? Through peer review? By the number of times they were cited? Citation counts have several problems, one of them being that they take a few years to accumulate. The journal a paper is published in? How good an indicator of the impact of the individual paper is that? Should you rather look at download counts or other article-level metrics? We need to think much more about these important issues, as our funding (and hiring) decisions depend on them. I am very happy to have been invited to a workshop by the National Science Foundation this November that will address some of these issues. Unique digital author identifiers will play an increasing role in our efforts to tackle the technical aspects of this problem, and I will say something there about the ORCID initiative.

Second, the evaluation of science is taking up more and more time that is then no longer available for doing research. Before the first experiment is even started, a project has taken months or sometimes even years of grant writing and grant reviewing. The regulatory requirements are also increasing, and in the case of clinical research involving patients (something I do) can be overwhelming. After a research project is finished, paper writing and reviewing (and writing a report for the funding agency) again takes many months. In the end it might have taken us two years to do the experiments, but five years from the beginning to the end of the project.

If we want researchers to do more research – and less grant writing, manuscript writing and peer reviewing (because all this output has to be evaluated by someone) – we have to ask funding organizations and institutions to do something about this. There are many possible solutions, and some of them have already been realized:

  • Grants can be given for longer periods of time, e.g. five years instead of three
  • The review of grants could look just at the researcher, and not try to predict scientific discoveries based on the proposed work
  • Smaller grants, e.g. those under $50,000, don’t need a concluding report; a link to a paper published with the results should be enough. Similarly, we might need fewer progress reports for larger grants.
  • Many aspects of grant and paper writing could be made less time-consuming by standardization and automation. 

There is a lot of potential in the last point, as the workflow is broken in many places. Why do researchers have to list their publications in their CV when this information is publicly available? Why does every funding organization want a slightly different format for their grant proposals – can’t we standardize this? Why don’t we have better paper-writing tools? Microsoft Word doesn’t really know about the different required sections of a manuscript and is far from perfect for collaborative writing.

But I have to stop writing now, as the second day of the Science Online London Conference is about to begin. I’m wearing my new PLoS BLOGs T-shirt and look forward to another great conference day. I will write about the conference in a separate post in the next few days, but for now let’s just say this conference is even better than in 2009.


9 Responses to Unmeasurable Science

  1. mangrist says:

    These are great ideas. I think many scientists got fat and happy during the period when the NIH budget doubled. Those days are clearly over.

  2. Peter says:

    These ideas sound very good and I agree with them, but I think this is only half the problem. The change has to go both ways. Researchers and their institutions have to become much more professional when it comes to acquiring grants and evaluating projects. We need better standards, but they will be wasted if we, the researchers (and our institutions), do not invest in using them efficiently, which includes money and time for training.

    Research institutions often do not offer any kind of support or training for their own researchers when it comes to grant writing, at any level. Neither do they offer support for a later evaluation. Why is that so? Why don’t we require a certain amount of overhead to be spent on developing such support? Why don’t we employ people who specialize in giving this kind of support? After all, even standards should be revised regularly, and as a researcher I don’t want to be the one who has to keep track all the time.

    As an example, I could imagine that continuous reporting/evaluating could be less draining than the big reports that nobody reads. It should be possible to require researchers to keep a blog-like log for a project. After all, your lab notes are already there, so why not keep them digitally, with some additional reflection once in a while? Writing a ‘report’ the length of a blog post, or just a good long email at the end of the week, seems like much less work than a huge report at the end of six months or a year.

    But again, this would have to be developed professionally — just as writing in general is a skill that requires training, so would such a kind of reporting/evaluating. Do we have/develop tools for that? Do we share the tools that we have? Do we train our students to use them? Do we have methods to allow trusted outsiders to evaluate such ‘status updates’? While the project is running?

    Such transparency could even shift the focus back to “successfully failing” instead of blowing up every single small result into the smallest publishable unit. After all, science is about failing, again and again, until we have revised our ideas enough to make progress in understanding. Imagine: our failures would become our successes again.

  3. Martin Fenner says:

    Peter, thank you for the very detailed response and the many important points you mention. I would like to add another one, as English is not my first language: practically all our manuscripts and many grant proposals are written in English, and it would be a big help (and time saver) if institutions in non-English-speaking countries were better at training and support for science writing.

    I like your “continuous reporting” idea. A good evaluation tool should not require any extra input from researchers, and should blend into their workflow. It also fits with the general trend towards Open Notebook Science. At the Science Online London Conference this weekend, Alice Bell talked about upstream reporting, which is the same concept from a journalistic perspective.

    Mischa, thanks for pointing out that you can’t talk about evaluating science without talking about the economics around it. And this leads to a very difficult discussion of which disciplines should be funded. Do we want to spend a finite amount of money on climate research, particle accelerators or paleontology? And can we really evaluate the quality of science across disciplines? Impact factors are an example where cross-disciplinary comparisons fail badly.

  4. Akshat Rathi says:

    Great post, and yes, there may never be a silver bullet for this problem. But I like the idea of automation, and I think one can envision, even with current technology, tools for funding bodies that keep continuous track of the many metrics of a particular project. This might require the researcher to ‘cite’ the project code every time they publish something in a mainstream journal (or on the progress blog, as mentioned by Peter). This way, at the end of the project the researcher needs only to write a small report about what they learned from the failures of the project (because the successes have already been ‘recorded’) and where they see future research going.

  5. Martin Fenner says:

    Akshat, I’m a big fan of standards and automation, and a standard identifier for research grants would help a great deal. I haven’t really seen anything in that direction yet. It might be easier to assign universal identifiers to granting agencies and then use their internal number.

  6. Pingback: Quick Links | A Blog Around The Clock

  7. Pingback: ORCID as unique author identifier: what is it good for and should we worry or be happy? | Gobbledygook

  8. Peter says:

    Thanks for your kind comment, Martin. Unfortunately, I was not able to reply sooner. You wrote

    ‘a good evaluation tool should not require any extra input from researchers, and should blend in into their workflow’.

    While I agree very much with the latter, I somewhat strongly disagree with the former (in a certain way, though, and maybe we actually agree).

    As I wrote, I think we (as researchers) must become more professional with respect to reporting and evaluation tools. But evaluation takes time — we should not believe that new tools can eliminate the work as such.

    I think that if there are to be automated tools, we need to invest time in learning how to use them well, and plan for the time necessarily spent on using them. At the same time, this must be accepted by whoever pays for our research as ‘time well spent’, i.e., it must become a valued activity.

    Let me make a comparison with professional programmers at the small-project level (I’m not thinking of Google here, but of your average small business with small projects). From what I hear, one of their biggest problems is documentation.

    A lot of code is written without being documented properly. As I understand it, lack of documentation leads to two problems. On the one hand, nobody can take over even a small project from another programmer, since it turns out to be easier to just rewrite the damn thing than to understand undocumented code. On the other hand, because of small hacks, workarounds and general ‘deadline approaching’ rushing, mistakes creep in, leading to errors and security risks.

    The more abstract problem, as I understand it, is that writing documentation is often not considered ‘real work’. That is, programmers are discouraged from documenting well, since, well, they should spend their time programming! But good documentation is a difficult thing. It drains time in the short term, and the long-term benefits do not always seem to outweigh that. I think the same is true for a good evaluation or reporting tool in general.

    If we, as researchers, want to keep better records to make evaluations easier or automated, we need to accept that this costs time in the short term (because, well, that weekly blog post might require an hour, and when you have just learned how to do it maybe three, and you could already be out for some TGIF action or with your loved ones).

    So after writing, again, too long a response: evaluation should be long-term and automatic, but the work for the researcher happens in the short term, and it will be work! And this conflict needs to be considered in automation or any other means of making evaluation less of a pain.

    Above all, the work that goes into this needs to be considered good, serious, scientific(!) work.

  9. Martin Fenner says:

    Peter, again thanks for a very thoughtful comment. I think that we are actually very much thinking along the same lines. “Integrated” is probably a better word than “automated” for the evaluation that I have in mind.

    The scientific paper as synthesis of a large amount of work is of course still required, but there is no reason why the paper should be the only documentation of a project.