The Price of Innovation – my Thoughts for Beyond the PDF

The Beyond the PDF Conference is currently taking place in Amsterdam. Unfortunately I am unable to attend in person this time (I took part in the first Beyond the PDF in January 2011), but I was watching the livestream of the Business Case panel discussion yesterday afternoon.

Globe of Science and Innovation at CERN, Flickr photo by nico h.

How to pay for the development of new scientific infrastructure and tools is something that I have thought about a lot since moving away from academia to become a developer of scientific software last year. I would assume three things:

  • there are a lot of great ideas out there to improve scholarly communication
  • there is enough money out there to pay for improvements in scholarly communication
  • we are frustrated because progress is much slower than we anticipate

If we have enough great ideas and enough money, but don’t see the results we expect, something must be going wrong. A simple answer would be that it is different people and organizations that have the ideas from those that have the money, but I don’t think that this is the reason. My suspicion is that there is a deeper problem, and that the approach we take to scholarly innovation is broken. Below is how innovation is approached by the major players:

  • individual scientists and/or software developers come up with great ideas, but don’t get past the prototype stage because of limited resources
  • academic tools and infrastructure are built as part of a funded project (anywhere from 6 months to a few years), but there are no resources to turn this into a service that is persistent beyond the project
  • publishers and large academic institutions have the resources to build these tools. They are often less innovative because of their size
  • funders pay for projects (see above), but rarely for infrastructure, and they rarely get involved in innovative projects themselves
  • commercial organizations can quickly bring great ideas to market (in particular small startups), but it is often unclear how their services are paid for in the long run

At the end of the day it seems that we have a lot of great ideas, but many of them never reach critical mass, and an even smaller number has long-term sustainability. I can think of a number of great projects that have never gained traction, and of a number of great tools and services where I have no idea how their development and service is paid for. The idea to get to a large number of users no matter what it costs, and figure out the business plan later, is popular with internet startups, but dangerous when we care about tools we want to still use two years from now. Two projects that are not specific to science, but are important for science and have made this work are Wikipedia and GitHub. From the long list of tools for scientists I would not pick Mendeley or figshare (both great services, but still in search of sustainability), but arXiv and Papers.

It also doesn’t help that most scientists are a conservative bunch when it comes to technology, and that the scientific market is fairly small compared to the overall number of users. Another big challenge is to innovate in an open environment, i.e. to make the innovation available to as many people as possible without barriers of access. Some of my personal conclusions from all this are the following:

  • we should acknowledge that we have an innovation problem, and it is not simply solved by getting more money
  • we have a collaboration problem: too many people are doing similar things without talking to each other and working together
  • scientific infrastructure and tools cost money. We need the right people to pay (ideally not the individual researcher), fair prices and intelligent business models
  • funders should reconsider how they pay for scientific infrastructure, as the project-based approach is broken
  • large organizations (commercial, non-commercial and academic) should think about their approach to innovation, in particular how they support innovation outside of their organization

You can follow the Beyond the PDF livestream today or follow the Twitter hashtag #btpdf2.

14 Responses to The Price of Innovation – my Thoughts for Beyond the PDF

  1. Phil Lord says:

    I think you missed an important point. We *can* innovate in publication. It’s not hard, nor expensive. The difficulty is that we are forced to publish with legacy publishers because a) the libraries will offer them services such as archiving that they will not offer individual scientists and b) universities and governments like to judge research by the cover of the book (erm, journal) it is published in.

    Publishers do not need to innovate. They just need to maintain the reputation structures that they have built up. Scientists can innovate in publication, but get little credit for doing so.

  2. Ian Mulvany says:

    Great post, I’ve been thinking about the same issues for a long time too. I’ll add a few quick observations.

    Getting from idea to a tool that is used by a lot of people is a problem that is not unique to this space. Creating products with market fit, and that gain market adoption is intrinsically hard. There are a lot of other tools out there that are doing well that have not gained huge market or mind share, but that provide value to their base and seem to be sustainable e.g. – citeulike, labarchives. There are lots that have hit the deadpool e.g. – JOVE, Connotea, Nature Precedings.

    I hope that there will always be this process of new ideas emerging, some working, some not working. It’s going to be hard to figure out ahead of time which are the tools that emerge and gain traction that provide a phase transition to the way things are done in general. It may be that tools of that scale will always come from the consumer space e.g. – google, twitter. I’m not convinced that publishers will ever get their shit together on this, and that’s because they have too much technical and process debt.

    I do worry about interoperability.

    I do worry about the amount of time that I seem to be waiting to see good tools built on top of data mining, today I’m encouraged by Impact Story, the PLOS ALM tool and Altmetric.com.

    I don’t worry too much about sustainability, if the tools in question are currently adding value into the ecosystem. I’m doubtful that researchgate adds that much value, I think today mendeley and figshare are undoubtedly adding value. If they hit the deadpool they will leave behind a cohort of people who are expecting better from their tools, and who will find reasonable alternatives.

    I totally agree that project based funding tends not to work. It’s useful for getting a few publications out, and for keeping some dev positions going within academia. It’s rarely useful for anything else, and you can wrap that statement up in RDF and take it home to the triplestore if you wish.

    I agree that we should talk more.

    I don’t know where the future is, but my hope rests on our tools and platforms becoming easier to work with, and so making experiments cheaper to try out. That’s what I’m trying to do at eLife right now.

  3. Jason Hoyt says:

    Great analysis, Martin. I too was watching portions of #btpdf2 from abroad yesterday.

    There might be an innovation “problem,” but is that any different than other industries? Take the financial industry, or healthcare, tech, the housing market, etc… In every industry it is extremely difficult to 1) come up with a sustainable idea, 2) bring together the right people to make it happen, and 3) penetrate a market full of incumbents who have the money, network, and lobbying power to stop new businesses. And large organizations within every industry either resist internal change or are incapable as well.

    To my eye, science and scholarly communication is not entirely different from what is happening outside. One thing that might be different is the apparent number and experience of innovators in this smaller science market. That means innovation is slower. The number of people interested and experienced in “sustainable innovation” within science is definitely growing, however. We are all learning from the initial wave of innovation, what has worked, and what doesn’t work.

    Science innovation is behind the curve, but it is also catching up.

    Disclosure: I am the co-founder of PeerJ and previously VP of R&D at Mendeley.

  4. Martin Fenner says:

    Phil, I think we disagree here, or maybe we look at this from a different angle.

    Ian and Jason, thank you for adding your thoughts. It is certainly true that these problems are not unique to scholarly communication. But I think it makes sense to think about what we as the scientific community can do to promote innovation. Two ideas I like are to lower the barriers for innovation, and then quickly move on when things don’t work out for one reason or another, and to focus on interoperability. This is probably why I like persistent identifiers and good APIs so much.

    We should not only talk more to each other, but also work closer together, and open APIs are obviously good for that. The emerging altmetrics / article-level metrics space is a good example for great things we can do together, but also for opportunities we can miss because we insist on one particular technical platform, business model, etc.

  5. Stacy says:

    >> libraries will offer [publishers] services such as archiving that they will not offer individual scientists <<

    Phil, could you elaborate? My understanding is that librarians often do offer archiving to scientists in the form of repositories. Few researchers use those services, though, and that might be because repositories have their own innovation and usability problems.

  6. Nice post Martin.

    You mentioned Github above. My feeling is their model is one to strive for. That is, they make money off not only individuals, but enterprise customers. They are able to promote open source (good for innovation), and make a lot of their code base open source (more innovation). Their formula seems to work – open source almost everything, make money off the enterprise, and keep consumers happy by creating good looking/easy to use consumer experience. This model seems to be absent in publishing – though PeerJ looks more like a tech startup than a publisher, so that’s good.

    Another problem is incentives. There is no incentive for academics to write software – though this may be slowly changing with at least NSF saying you can get credit for data products – maybe software soon. So, academics can be involved in hackathons, etc., but then you go home and think well I’d better work on this paper instead of that software. Thus, innovation seems more likely to happen outside academia.

  7. Mr. Gunn says:

    Oh, man, are you singing my tune here, Martin!

    ” …too many people are doing similar things without talking to each other and working together”

    Yes, absolutely, but I think this is necessary. Getting a project off the ground and running is a huge effort. The people who start startups pour their entire lives into those things. You won’t often find enough passion to start a truly innovative project, but not the ego to insist on doing it your own way, in the same person or even on the same team. So it’s really more of a Darwinian progress here, in my view. God bless Jason, Ian, Mark, Elizabeth Iorns, and Dan Whaley for taking these things on. You too, Martin – you know I’m a huge fan of what ORCID is doing, but it remains to be seen how they’re going to make it work. For example, they’re asking companies to become partners, but it’s not clear what the value proposition is for them, unless they’re a publisher. What about toolmakers like Mendeley or Science Exchange?

    The point about academic projects being really good ideas but being ignored and forgotten about when the grant runs out is an excellent one, and not only because I’ve been making that point myself for years. It’s so frustrating to hear about these excellent technologies at places like Code4Lib that never see the light of commercial day. I think a lot of the dynamic you lay out above comes down to academic projects being good ideas that aren’t sustainable because there’s no effort put into product-market fit, whereas startups might see the same problem and come up with a solution that gets adoption, but often doesn’t attract the people with the good ideas from academia because of the “holier than thou” attitude of academia with respect to commercial enterprise.

    I can’t tell you how many funders I’ve talked to who say things like, “yeah, that’s a great idea, but we can’t give money to a for-profit”. Sloan is one of the only groups I can think of that’s actually thinking about this problem.

    Also, Martin, I would disagree with your comment that Mendeley is still looking for sustainability. I think we’ve found a sustainable solution with the Institutional Edition and future data products to come.

  8. Martin Fenner says:

    Scott, I agree that academic software development is really hard because a) you don’t get the same scientific “credit” as your colleagues writing papers and b) the salary and job prospects are often not competitive with a job in industry.

    William, thanks for the long comment. I didn’t specifically mention ORCID or article-level metrics/altmetrics because I would be biased. But persistent identifiers for people, data and publications are a very good example of wasted resources because there are so many different standards out there. I think that APIs and persistent identifiers are the glue that can hold many of these things together (probably more so than linked open data, but the two concepts are of course related). I really like what Mendeley is doing with its own API, and you can be really proud of the number of API calls you receive every month.

    The comment about Mendeley sustainability was of course an outsider’s view and referred more to the past than the future. A few years ago we had a big hype of building “facebooks for scientists”. Everybody was trying to grow as fast as possible, burning a lot of money at least initially.

  9. Mike Taylor says:

    We have innovation ideas coming out of our ears. Honestly, by far the biggest problem we as a community face is the roadblock of traditional publishers, with their government-funded monopolies on knowledge and systematic opposition to anything that enriches the world without giving them a slice. It really is that simple.

  10. Martin Fenner says:

    Mike, innovation is always opposed by the incumbents, so traditional publishers behave as you would expect. I am more interested in the behavior of researchers and worry that most of them don’t seem to be particularly interested in innovation in scholarly communication. Is it that the innovation ideas aren’t good enough, or what else is holding them back? To give just one example, most manuscripts are probably still sent around between coauthors as Microsoft Word documents using email during the authoring process.

  11. Mike Taylor says:

    That’s true enough. Clay Shirky said it best: “Institutions will try to preserve the problem to which they are the solution”. But it’s always depressing to watch an industry become the exact opposite of what it set out to be: in this case, publishers that do everything they can to prevent dissemination.

    As for process innovation: it’s true that we could and should do a lot better than email MS-Word manuscripts back and forth. I think here the biggest impediment to change is the need for all co-authors on a given project to use the same, or at least compatible, software for text editing, reference management, version control and so on. It’s tough to get a lot of people to shift at once.

  12. Mr. Gunn says:

    If you listen to the people who talk about product adoption cycles, they’ll tell you it’s only the early adopters who care about new ideas and who will be interested in something because it brings new benefits. The late adopters, the majority, are more motivated by fear of missing out or being exposed to some kind of risk. Can we make a statement to inform these people?

  13. Ian Mulvany says:

    In terms of getting credit for software I’d like to give a shout out to the journal of open research software (http://openresearchsoftware.metajnl.com/), a meta-journal from ubiquity press. I’m on the editorial board. My fellow editorial board members (http://openresearchsoftware.metajnl.com/editorial-board/) and I feel strongly that people who build tools should get credit. We hope that this initiative can start pushing things in the right direction. We have been reviewing a number of submissions over the past months, and the goal is to provide citable credit for the underlying software, while at the same time ensuring that the software is open and reusable.

  14. Martin Fenner says:

    Thanks Ian. I am actually quite positive that we can change something in this area, as I think the work to put this together is relatively straightforward, and I had a number of discussions on this topic recently.

    You want to put your scientific software into a software repository that can issue persistent identifiers (ideally DataCite DOIs) and has a long-term preservation strategy. Even better is a publication that describes this software and links to the DataCite DOI. You then want to use ORCID to pull all this together for the researcher, and sprinkle some usage stats and altmetrics on top to demonstrate that the software is actually being used.