The open data movement seemingly doesn’t differ much from other “open” movements – its goal is free access to data (you can substitute articles/source code/creative content for data) without certain restrictions and mechanisms of control. Quite recently a group of the usual suspects (Cameron Neylon, John Wilbanks, Peter Murray-Rust and Rufus Pollock) put together a set of principles for open data in science called the Panton Principles.
It’s easy to think about open data in a similar way to open access or open source. However, I’ve come to think that open data, especially in the life sciences, is different. The background story is that we had to remove the “open data” component from a big grant proposal because it was incompatible with the policy (aka long-term strategy) of the funder. It didn’t violate any of the requirements, but in the get-a-grant game you don’t want to lower your chances on purpose. Hence no “open data” on that project. Konrad pointed out in this FriendFeed thread that Poland is no different from other countries with respect to the compatibility of open data principles with the long-term strategy of science development. And indeed – the Wikipedia entry on open scientific data states that it is already being challenged by individual institutions as well as by grant agencies:
As the term Open Data is relatively new it is difficult to collect arguments against it. Unlike Open Access where groups of publishers have stated their concerns, Open Data is normally challenged by individual institutions. Their arguments may include:
- this is a non-profit organisation and the revenue is necessary to support other activities (e.g. learned society publishing supports the society)
- the government gives specific legitimacy for certain organisations to recover costs (NIST in US, Ordnance Survey in UK)
- government funding may not be used to duplicate or challenge the activities of the private sector (e.g. PubChem)
And this made me think that, from the point of view of a funder that wants some return on its investment in science, open data and open source are pretty much incompatible. In an abundance of open source code, data becomes an asset and is more protected. In an abundance of open data, analysis methods become more valuable, and as such more protected (meaning less likely to be released under an open source license).
Current business models don’t fit well with an abundance of everything (that is, of both data and software), so I don’t expect that science funders (the least innovative group in the world) will get out of the “scarcity/competitiveness” frame of thinking anytime soon. I have a feeling that our issue with open data being incompatible with the long-term strategy of science development will return some day, when we try to put open source into yet another grant.