Open data for science education

Open data is the idea that scientific data should be freely available to all, without restrictions, in searchable online repositories. The open data movement is gaining momentum in the scientific community because of its promise to enable more frequent replication of studies and to accelerate the pace of research. But the advantages for science education are just as compelling.

Science students can benefit greatly from educational materials that expose them to real-world phenomena and data. Unlike learning from broad generalizations and pre-fabricated “cookbook” labs, examining and working with real data can increase interest and better prepare students for careers in science. As states begin to adopt the Next Generation Science Standards, which emphasize practices such as analyzing and interpreting data, and mathematical and computational thinking, developers of K–12 science curriculum materials are increasingly looking for ways to incorporate scientific data into their lessons and assessments.

However, barriers exist that prevent educators from effectively using much of the data that scientists produce. As a reader of PLOS blogs, you are likely familiar with the open access movement in scholarly publishing. But even access to journal articles, though valuable, is often not sufficient for educators’ purposes. Data in journal articles are usually in the form of a few graphs. These graphs are typically frozen in PDFs as part of a paper that conveys the authors’ interpretation of the results in the context of their particular study. And the data presentation choices were made with one audience in mind: experts in the field.

Open Data by Colleen Simon for opensource.com (CC-BY-SA)

Using data as it is presented in papers is almost never pedagogically sound at the middle or high school level; much must be changed about the presentation. Jargon and acronyms might have to be removed from axis titles, individual data sets might need to be separated if they are layered into a single figure, or a section of the graph that describes phenomena outside the scope of the lesson might have to be removed. Making these kinds of educationally necessary modifications, while maintaining scientific accuracy, often requires access to full original datasets.
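To make those modifications concrete, here is a minimal sketch in Python of what an educator could do once the full dataset is in hand: trim out-of-scope values, separate layered series, and swap jargon labels for plain language. Everything in it is an assumption made for illustration. The sample rows, the column names (`site`, `depth_m`, `DO_mg_L`), and the helper functions are hypothetical and do not come from any real study or repository.

```python
from collections import defaultdict

# Invented sample data; the column names and values are hypothetical.
RAW_ROWS = [
    {"site": "A", "depth_m": 0, "DO_mg_L": 8.1},
    {"site": "A", "depth_m": 10, "DO_mg_L": 6.4},
    {"site": "A", "depth_m": 50, "DO_mg_L": 2.0},
    {"site": "B", "depth_m": 0, "DO_mg_L": 7.9},
    {"site": "B", "depth_m": 10, "DO_mg_L": 5.8},
]

# Plain-language replacements for jargon-heavy column names (assumed wording).
PLAIN_LABELS = {
    "site": "Lake",
    "depth_m": "Depth (meters)",
    "DO_mg_L": "Dissolved oxygen (milligrams per liter)",
}


def restrict(rows, column, low, high):
    """Keep only rows whose value in `column` lies within [low, high]."""
    return [row for row in rows if low <= row[column] <= high]


def split_by(rows, column):
    """Separate layered data into one list of rows per value of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


def relabel(rows, labels):
    """Rename column names; the measured values are left untouched."""
    return [
        {labels.get(key, key): value for key, value in row.items()}
        for row in rows
    ]


def prepare_for_lesson(rows):
    """Trim to the lesson's scope, split by site, then relabel for students."""
    in_scope = restrict(rows, "depth_m", 0, 10)
    by_site = split_by(in_scope, "site")
    return {
        site: relabel(site_rows, PLAIN_LABELS)
        for site, site_rows in by_site.items()
    }


lesson_data = prepare_for_lesson(RAW_ROWS)
```

None of these steps alters a measured value, which is the point: the presentation changes while the science stays intact. None of them is possible when the only source is a graph frozen in a PDF.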

Unfortunately, most scientific data is not archived and readily available online. Educators have to contact the study authors and see if they are willing and able to pass it along. Just as with journal articles, this “write to the author” stop-gap is wildly inefficient. Study authors often can’t or won’t respond to requests for original data for a variety of reasons. Sometimes they are simply out of town and not checking email. Sometimes they want to publish more papers and are afraid of getting scooped. And sometimes, especially with older studies, they actually can’t find their data.

In a 2002 survey of geneticists, of those who admitted to denying at least one request from a colleague for published data, the most commonly given reason was the “effort required to actually produce the information” (80 percent of respondents). As Todd Vision, a biologist at UNC and contributor to the Data Dryad open data repository, explained in BioScience:

Unarchived data files are often misplaced, corrupted, or the software in which they were produced becomes obsolete. Memories fade.

Science education materials developers need full access to the data in order to determine its pedagogical strengths and weaknesses. This process often involves investigating many different data sets until settling on the ones that will best address the learning goals for their particular project. Following up on hundreds of individual papers, with a dismal rate of return, isn’t feasible for a small education nonprofit or a lone teacher trying to innovate at a struggling school. As a result, vast amounts of data that could be more educationally useful remain untapped.

I talked to Sandra Porter, whom I met at the last Science Online conference, about her experience with obtaining data for curriculum materials development. Sandra is the president of Digital World Biology, and one of her collaborative projects, Bio-ITEST, involved the development of bioinformatics curriculum materials for secondary students. In genetics and bioinformatics, which are inherently data-focused, data archiving requirements are more common, and Sandra and her colleagues were able to take advantage of open data resources such as the National Center for Biotechnology Information (NCBI) and the Barcode of Life Data (BOLD) Systems. Yet even in these fields, raw data (the kind that practicing scientists would encounter in their careers) can be tricky to obtain. Sandra commented:

The raw data was useful for us because we needed to know what raw data looks like so we could work out analysis problems in advance. These types of data files are not likely to be available from many places since these raw data are usually processed and analyzed through many pipeline steps before they get submitted to a database.

There are many worthy reasons to support the open science movement, but the argument for science education holds its own among them. It has never been easier to bring real scientific data into classrooms, and the benefits to young scientists-in-training are clear. It would be a shame for all of that educational potential to languish on old hard drives.

Open data for science education by Sci-Ed, unless otherwise expressly stated, is licensed under a Creative Commons Attribution 3.0 Unported License.

13 Responses to Open data for science education

  4. Open Data seems a great resource through which people can access all scientific data very easily. It is very helpful especially for science students as they can have all educational materials.

  11. Giving students access to real data is a great idea and raises lots of interesting issues for science education. Among them are, what levels of technical and analytical expertise are required to work with real data sets? What does a curriculum that provides adequate preparation for this type of analysis look like? Can it be delivered to a wide range of students or is this an exercise for the “gifted” and privileged?

    Taken in another direction, what about analyzing the educational data that the students generate? What do students’ answers to data-based questions, and their own experimental analyses, reveal about what they have learned, what level of sophistication they have attained, and what ideas are challenging? The answers to these questions could help improve science education more broadly.

    • Jean Flanagan says:

      Thanks for commenting! You raise some very good points – these are questions that need to be addressed in the science education research community. In this post, I was focused on the much simpler goal of access to data (in a way that allows for modification). I think there is great potential in having students work with real data – in guided/scaffolded ways – but the details of what that instruction should look like are very important and worthy of further study.
