How to build better collective data on school innovation in 5 simple steps

Sep 19, 2019

Educators, researchers, intermediaries, and funders are frequently on the hunt for information about how schools are innovating, and new knowledge-building efforts often arise to fill this need. Consider these scenarios:

  • A non-profit organization launched a database on innovative schools five years ago, but the data hasn’t been updated regularly since then. 
  • An organization supporting school redesign shares a list of schools in a PDF document with short descriptions of each school’s model.
  • A funder wants to surface innovative middle schools in the Southwest and pays a consultant to run a survey to discover schools.

Important data-building efforts like these are underway constantly in organizations of all stripes—and yet, as we’ve argued, knowledge of school innovation remains deeply fragmented. Databases go out of date, static lists of schools can’t be merged with other data, and siloed datasets inevitably result in duplicative efforts to collect information from schools on the frontlines of student-centered learning.

In an ideal world, efforts to build knowledge about school innovation would be transparent and complementary so that new efforts could build on each other. The Christensen Institute’s recently launched Canopy project is an effort to prototype a method of structuring data in just this way. By reimagining where data comes from and how it’s structured, the Canopy shows the kinds of insights that could be surfaced if the field captured data in a consistent way. In the course of the project, we learned several relatively simple ways for organizations to share knowledge more coherently today, even within the confines of our current data silos.

Before starting out: some strategic questions

Most new efforts to build data are designed with a specific audience or use case in mind: for example, a map for school leaders to learn about schools that are implementing experiential learning in new, promising ways. However, that narrowness is part of what perpetuates a walled-garden effect in school innovation data. Focusing on a particular user, at a particular moment, with a particular (and perhaps duplicative) data collection method can hamper the long-term utility of the data. A narrow focus upfront can lead to design decisions that make the data less usable for a broader audience, or may obscure opportunities for the new data to build on existing efforts across the field.

Before building (or funding) a new dataset or list, consider some big-picture questions that could make or break the long-term sustainability and utility of that data. Who else might have a use for the data beyond your purposes? What structural decisions will make the data more usable by people and researchers outside of your organization? How might you minimize the burden on practitioners who will provide the data? Are you replicating someone else’s process that you might build upon instead? 

The nitty-gritty: Five simple tips for sharing more coherent data

What does it actually mean to make data more “usable”? Researchers, funders, and school intermediary organizations should consider these five simple steps to sharing data in a more collective manner.

1. Track NCES School ID numbers for each school. Most public and charter schools, as well as districts, have one. These unique numbers can be used to match schools when merging two datasets. Also record the full legal name of the school or district in your data source, even if you’re comfortable referring to it in shorthand; this matters most when a school does not yet have an NCES ID.

2. Allow users to download a CSV file of the full dataset or list. While the static CSV version of a complex dataset may not be pretty, at least it allows users to access the raw data in full. And remember that lists are data, too! If you’re making a list of schools on a website or in a PDF, format an additional version of the list in a spreadsheet with some basic fields that will help others sort the data and merge it with other files.

3. Share openly using Creative Commons. Creative Commons licenses ensure that people can legally use the information you share, and make transparent the terms under which they can use it.

4. Identify and capture relevant metadata. Metadata helps users understand how data was collected and built. For example, if you are making a spreadsheet of schools to visit, record metadata such as when a school was added, who added it, when it was last visited, or how the information about the school was collected (phone interview, site visit, survey, etc.). That way, future users will be better equipped to compare your data with another dataset, or to combine the two.

5. Consider using tags from existing tagging systems. Common language can create more coherent data across the field. For example, if you are tracking information that maps onto Canopy tags (visit the Canopy website and download the dataset to see the full list), consider using the same tag variable to store the information. You might use the “restorative_practices” tag to indicate that a school is adopting a restorative approach instead of a traditional punishment-based disciplinary system. Be aware that most tagging systems are evolving, and will certainly change as practices shift and as more stakeholders participate in building new tags. Nevertheless, beginning to use common tags could help bridge data silos to some degree before a more standard method is formalized.
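To make these tips concrete, here is a minimal Python sketch of how a list kept this way might be merged with a partner's data and exported. It is an illustration under stated assumptions, not a description of how the Canopy dataset is built: the school names, ID values, column names (such as `nces_id`, `date_added`, and `collection_method`), and the enrollment figures are all invented. Only the “restorative_practices” tag comes from the text above.

```python
import csv
import io

# Every name, ID, and field below is made up for illustration.
# IDs are stored as strings so that any leading zeros are preserved.
our_schools = [
    {"nces_id": "000000000001", "school_name": "Example Middle School",
     "tags": "restorative_practices", "date_added": "2019-09-01",
     "collection_method": "site visit"},
    {"nces_id": "000000000002", "school_name": "Sample Charter Academy",
     "tags": "", "date_added": "2019-09-10",
     "collection_method": "survey"},
]
partner_schools = [
    {"nces_id": "000000000001", "enrollment": "412"},
    {"nces_id": "000000000003", "enrollment": "230"},
]

# Column order for the exported file (tip 2).
FIELDS = ["nces_id", "school_name", "tags", "date_added",
          "collection_method", "enrollment"]


def merge_on_nces_id(left, right):
    """Left join: keep every row of `left`, adding fields from `right`
    when the IDs match. On a conflict, the value from `left` wins."""
    right_by_id = {row["nces_id"]: row for row in right}
    return [{**right_by_id.get(row["nces_id"], {}), **row} for row in left]


def to_csv(rows, fieldnames):
    """Serialize rows to CSV text, leaving missing fields blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


merged = merge_on_nces_id(our_schools, partner_schools)
csv_text = to_csv(merged, FIELDS)
print(csv_text)
```

Because both lists share an ID column, the match does not depend on how each organization happens to spell a school's name. The metadata columns travel with each row, so a later user can see when and how each record was collected.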

The next step: Moving towards shared data standards

Ultimately, developing the technical mechanisms to merge data seamlessly will require collective work over time. Part of that work involves standardization. Data standards—mutually understood ways of defining different pieces of information, how they relate to each other, and how they’re shared—would enable more comparable and cumulative data, and more efficient sharing, towards a better understanding of who’s doing what in school innovation. However, standards related to data on school design are currently inadequate or completely lacking. In the longer term, building towards standardization would offer a structure for collaboration by enabling organizations developing and working with datasets to share and merge data in predictable and complementary ways, and at scale.

In the meantime, data managers can use existing standards where they are already present. Organizations that manage ways of codifying and sharing education data have already standardized many types of information about schools, courses, students, and more. Existing standards range from basic elements like school name and level, to elements related to instruction such as blended-learning model, work-based learning opportunities, language of instruction, and more. For existing database managers, it’s not necessary to change current data fields—simply create a translation guide that maps the database’s fields onto standardized concepts where they exist. For new data-building efforts, design them from the start by building on elements from existing standards wherever possible.
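A translation guide of this kind can be as simple as a lookup table. The sketch below assumes a database with its own abbreviated field names. Both those names (`sch_nm`, `lvl`, `lang`, `advisory_model`) and the target names on the right are hypothetical placeholders, not elements of any specific published standard; in practice the targets would be taken from whichever existing standard you adopt.

```python
# Hypothetical translation guide: local field name -> name of the matching
# element in an existing standard. None marks a field that has no
# standardized equivalent yet.
TRANSLATION_GUIDE = {
    "sch_nm": "school_name",
    "lvl": "school_level",
    "lang": "language_of_instruction",
    "advisory_model": None,
}


def translate(record, guide):
    """Split a record into (standardized, unmapped).

    Fields with a target in the guide are renamed to that target.
    Fields mapped to None, or absent from the guide, are returned
    unchanged in `unmapped` so nothing is silently dropped.
    """
    standardized, unmapped = {}, {}
    for field, value in record.items():
        target = guide.get(field)
        if target is None:
            unmapped[field] = value
        else:
            standardized[target] = value
    return standardized, unmapped


# Invented example record using the local field names.
record = {"sch_nm": "Example Middle School", "lvl": "Middle",
          "advisory_model": "daily"}
standardized, unmapped = translate(record, TRANSLATION_GUIDE)
print(standardized)
print(unmapped)
```

The existing database keeps its own field names; only the shared export is renamed. The unmapped fields are a useful by-product, since they show where standards for school design data are still missing.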

Are you actively building and managing data on school innovation? Our full recommendations for better collective solutions to data fragmentation will be available in October 2019. Download the full Canopy Project dataset via the Canopy website to receive an email notification once the recommendations are released.

Chelsea is a research fellow at the Institute focusing on blended and personalized learning in K-12 education, where she analyzes how innovation theory can inform the design of new instructional models.