Creative Commons for Science: Interview with Puneet Kishor

Creative Commons provides copyright licenses to help standardize and simplify the sharing of scientific content and other creative works. PLOS applies the Creative Commons attribution license to all published works, and Creative Commons licenses are essential for Open Access publications. The long-awaited version 4.0 of the Creative Commons licenses was released last week. This release provided a perfect opportunity to ask Puneet Kishor from Creative Commons a few questions.

1. What is Creative Commons, and what is Science Commons?

Creative Commons (CC) is a nonprofit organization that enables sharing and use of creativity and knowledge through free legal tools. Our easy-to-use copyright licenses provide a simple, standardized way to give the public permission to share and use your creative work—on conditions of your choice. CC licenses let you easily change your copyright terms from the default of all rights reserved to some rights reserved. Creative Commons licenses are not an alternative to copyright. They work alongside copyright and enable you to modify your copyright terms to best suit your needs.


Science Commons was a project within CC that started around mid-2005 to explore what CC could accomplish in the area of science. Science Commons had three interlocking initiatives designed to accelerate the research cycle—the continuous production and reuse of knowledge that is at the heart of the scientific method. Together, they form the building blocks of a new collaborative infrastructure to make scientific discovery easier by design: making scientific research re-useful by helping people and organizations open and mark their research and data for reuse; enabling one-click access to research materials by streamlining the materials-transfer process so researchers can easily replicate, verify and extend research; and integrating fragmented information sources by helping researchers find, analyze and use data from disparate sources by marking and integrating the information with a common, computer-readable language. Science Commons ended in 2010, and the CC Science program was launched in 2011. I joined CC as the science lead in August 2012.

2. What is your role at Creative Commons?

My mouthful of a job title is Project Coordinator for Science and Data. I do everything from educating those who may have never heard of CC to assisting our long-time users and partners in the adoption, application and promotion of various CC tools. But I see my role as much broader than just ensuring CC license adoption. Everything I do, and everything CC Science gets involved in, is driven by CC’s vision of “nothing less than realizing the full potential of the internet.” A vision that grand requires work of broad scope. Getting a CC license on scientific content is predicated on an entire system of science geared toward openness, from the toolmakers to practitioners to decision-makers and policy-makers. If any part of the system is closed or resistant to the open practice of science, it closes off the products. The entire information lifecycle of science has to be open; otherwise scientific inquiry is short-changed. So, I devote my energies across the spectrum.

3. What did you do before working for Creative Commons?

Immediately before I joined Creative Commons I was a researcher in the Department of GeoSciences at the University of Wisconsin-Madison where I created Earth-Base. I am still involved with Earth-Base, and in my ridiculously non-existent spare time I am plotting EB2, the next iteration of Earth-Base.

Before the University of Wisconsin-Madison, I worked at a private geographic information systems (GIS) consulting company for nine years, mostly on state and local governmental GIS projects.

And before that, I was a GIS Specialist at the World Bank in Washington DC, providing operational support to the Bank’s lending programs and also helping with targeted research projects.

I have been working with geospatial technologies since 1988, and with open source geospatial for the past decade. I have been an elected Charter Member of the Open Source GeoSpatial Foundation (OSGeo) from its very start.

4. Why are licenses important for scientific content?

Licenses remove uncertainty by spelling out what may or may not be done with the licensed work. Science is both interdisciplinary and international, crossing disciplinary and jurisdictional boundaries, and since science builds upon existing work, it depends on being able to reuse content.

There is, however, no such thing as an international copyright that will automatically protect an author’s writings throughout the world. Protection against unauthorized use in a particular country depends on the national laws of that country. An author who wants copyright protection for her work in a particular country should first determine the extent of the protection available to works of foreign authors in that country. As you can gather, this is onerous for those seeking to license their work, and onerous for those wanting to legally use works of others.

5. Are there too many licenses? Should CC BY be the default for text?

Those are two questions. For the first, yes, I do believe there are too many licenses. While a license can and does remove uncertainty, it is also important that different licenses work well together. As I mentioned above, science is inherently transnational and increasingly interdisciplinary. Scientists want to use and reuse existing data, mix them together, analyze them, adapt them, create new data and uncover insights. Doing that with works under different licenses creates confusion as it is not clear whether or not the different works can be mixed. This brings us back to uncertainties. A globally recognizable license with clear terms and conditions alleviates those uncertainties.

The second question ventures into the territory of opinion. My personal opinion is that yes, CC BY is a good choice of license for scientific reports and, under certain conditions, even data. In most cases, opting out of copyright altogether by releasing data under a CC0 public domain dedication is an even better choice. It may seem surprising to some, but you lose nothing by relinquishing all rights, and actually gain a lot both for yourself and for science in general. This is truly a case of “give and ye shall receive.”

When mandated by policy or funding conditions, such as when the work is funded by public monies, all material should be made available to everyone without any restrictions other than for security or privacy considerations. For personal works, however, it is a personal choice, and everyone should make their own decision. My job is to educate others about the consequences of choosing different licenses and give them sufficient information so they can make the right decision. And, hopefully, they will choose an open license that will enrich the commons.

6. How are licenses for data different from licenses for text?

Facts of nature are not copyrightable in the United States and other jurisdictions. As such, copyright licenses don’t even apply. Additionally, it really is very difficult to ascertain where raw, uninterpreted, discovered data end and where human interpretation of such data begins. Note however that there may be other rights one has in the data.

This is why we recommend using CC0 for data, as it completely removes any complications that may arise from using a license where it doesn’t belong or from imposing burdensome legal obligations.

That said, the latest version of the CC licenses, version 4.0, does cover database rights, also known as sui generis database rights (SGDR), which are particularly relevant in Europe. Besides CC0, using CC BY 4.0 or CC BY-SA 4.0 will ensure that both the creator and the user of content are covered appropriately.

7. How do best practices for citing scientific content relate to CC licenses?

As mentioned above in question 6, either the CC BY license or CC0 can be used for data. But we do recommend CC0 when it is quite clear the data don’t have copyrightable elements. CC0 removes any legal obligation to provide attribution, which makes using data much easier: a dataset may have numerous contributors or may have undergone numerous changes, and attributing each one would be very onerous. That said, CC0 doesn’t relieve the user from the normative practice of giving credit where credit is due. This is easily achieved by citing the source dataset properly, a good scientific practice anyway. Science has thrived for centuries by not only building upon the work of others but also properly citing that work, both to demonstrate due diligence and recognize the works of others and to allow downstream readers of the reports to locate the cited works. Keep in mind, CC0 does not place any legal burden to cite, but good scholars should give credit where credit is due. CC0 depends on such good behavior.

A note about data citation: many of us in the community are working quite hard to come up with clear principles for data citation to encourage the practice and make data into first-class citable entities. See FORCE11’s Draft Declaration of Data Citation Principles, which is open for community comment until Dec 31, as well as its precursor, the CODATA “Out of Cite, Out of Mind” report published in the Data Science Journal on 13 September 2013. I have been deeply involved in both initiatives, and am happy to see this very important need being addressed with a good, simple set of principles that can set the stage for a new era of properly citing data, thereby encouraging the collection, curation, management and publishing of data itself as a good scientific practice.

8. Can you provide some links to resources where researchers can learn more about CC licenses?

The best place to start is the CC web page on licenses. It describes our licenses and the three layers of each license: the legal code, the human-readable deed, and the machine-readable layer.

The blog entry introducing the new CC4 licenses should be the next stop.  We also describe the policy decisions that went into the design of the new licenses, and the changes in the new suite.  Those intending to apply the license should visit the license chooser.  Finally, our guide to marking work correctly with a CC license is a good place to learn the best practices appropriate to one’s content.

9. Is license information for scientific content readily available, e.g. from CrossRef and DataCite? Are there standards for Open Access metadata, e.g. from NISO?

There have been many attempts to create and standardize metadata that describe permissions and conditions for usage of content. CC’s licenses themselves encode license information in RDFa using the CC Rights Expression Language (ccREL). NISO has been convening a working group on Open Access Metadata and Indicators (OAMI). The success of these initiatives, however, has arguably been mixed. Embedding correct metadata is extra work, and unless it is automated, it will not become mainstream. A new initiative called Commons Machinery, funded by the Shuttleworth Foundation, is working to raise awareness of metadata and encourage adoption of open standards for metadata through technological solutions that automatically embed and retain metadata, with information about the license and creator of a work, in the digital work itself.
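To make the idea of machine-readable license metadata concrete, here is a minimal sketch of how such markup could be generated automatically. It is an illustration only, not CC's own tooling: the function name `license_markup`, the example title, author, and `example.org` URL are all hypothetical, and the attribute layout is assumed to follow the general ccREL pattern (`rel="license"`, `cc:attributionName`, `cc:attributionURL`).

```python
from html import escape

# Namespaces assumed for the RDFa prefixes used below.
CC_NS = "http://creativecommons.org/ns#"
DCT_NS = "http://purl.org/dc/terms/"


def license_markup(title, author, work_url, license_url, license_name):
    """Build an HTML fragment carrying license metadata as RDFa attributes.

    Hypothetical helper for illustration; not part of any CC library.
    """
    # Escape every value, since each one lands in HTML text or an attribute.
    t, a, n = escape(title), escape(author), escape(license_name)
    w, lic = escape(work_url), escape(license_url)
    return (
        f'<div xmlns:cc="{CC_NS}" xmlns:dct="{DCT_NS}" about="{w}">\n'
        f'  <span property="dct:title">{t}</span> by\n'
        f'  <a rel="cc:attributionURL" property="cc:attributionName" '
        f'href="{w}">{a}</a>\n'
        f'  is licensed under <a rel="license" href="{lic}">{n}</a>.\n'
        f'</div>'
    )


if __name__ == "__main__":
    # Placeholder values; replace with the details of a real work.
    print(license_markup(
        "Example dataset",
        "A. Researcher",
        "https://example.org/data/1",
        "https://creativecommons.org/licenses/by/4.0/",
        "CC BY 4.0",
    ))
```

Generating the fragment from a publishing workflow, rather than typing it by hand, is one way the "extra work" of embedding metadata could be automated.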

10. What is new in CC4?

As mentioned above, the change most relevant to scientific content in CC4 is the coverage of SGDR, aka database rights, which are mostly an EU peculiarity. That means a database creator in the EU, or in any other jurisdiction where SGDR might exist, can use a CC4 license to allow use of the database without the user worrying about violating any database rights. In other words, using CC4 relieves the creator from separately licensing database rights, and it relieves the downstream database user, in particular one located in a region where SGDR apply, from worrying about violating those rights. See Section 4 of the legal code for details.

CC4 makes attribution requirements much easier to meet by making them more flexible and clearer. The user can satisfy the attribution requirements “in any reasonable manner based on the medium, means, and context in which You Share the Licensed Material.” More specifically, attribution can be completed by including a URI to another page that has all the information needed. This may or may not be attractive to scientific researchers, especially those applying licenses to data, where it could be onerous to have to include all those details in or alongside the data itself. To be clear, this is possible under 3.0 also, but CC chose to make it unquestionable in 4.0 by stating it explicitly. See Section 3a of the legal code for details.

We also now require re-users to indicate if licensed material has been modified, even when doing so hasn’t created an adaptation. In the data context, this puts downstream re-users of a database that has changed on notice that something has been done to it. Ideally, the person who made the changes indicates just what has been changed, but even if not, the notice tells re-users they should look back at the original. This is a matter of provenance and of making sure that changed data is indicated as such, which is important in the scientific context.

11. What will you be working on after CC4 has been released?

Now that the CC4 licenses have been released, most of our energy will be devoted to education and outreach. From helping those who currently use an older version of the license move to CC4, to explaining the benefits of adopting CC4 to those looking at licenses for data and other scientific content, there is much to be done.

Of course, putting a license on a work is only a part of the story, and usually the end part. It is predicated on many other prerequisites—tools, workflows, a culture, a system of rewards and incentives, policies all working toward practicing, supporting and promoting open science. We will continue to support all such activities by either leading them, or assisting others with advice, connections and other resources.

We have a couple of major initiatives that we will focus on—we are building a Science Affiliates Network to augment our current network of Affiliates worldwide. We are also planning a series of Science Technology and Policy Salons on a variety of scientific topics and how they are impacted by IP laws. We hope to shed more light on topics such as text and data mining, and cloud computing.
