Bi-level Metadata Registry Development

Christopher Patton
Christopher is a new master's student in computer science at the University of California, Davis. His academic interests and work experience have been broad to date, ranging from computer vision to networks. In graduate school, he'll focus on the theoretical foundations of computer networks. His current work for DataONE involves designing and implementing a crowd-sourced, online dictionary for metadata terms.

Project Description: 

The goal of the proposed summer internship is to prototype a metadata registry framework in two parts: a vernacular part consisting of evolving, freely contributed terms and a lightly supervised canonical part consisting of stable terms that crowd-sourced, reputation-based methods have brought to prominence. Leveraging social technologies while benefiting from expert moderation, this bi-level mechanism can be used in any subject domain to create highly relevant metadata registries that avoid the inefficient and unresponsive maintenance pattern plaguing almost every mature registry.
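
As a rough illustration of the bi-level idea, the sketch below models a registry in which freely contributed (vernacular) terms collect reputation-weighted votes and become canonical only once a score threshold is met and a moderator approves. It is a minimal sketch under stated assumptions, not the project's design: the class and method names, the voting scheme, and the threshold value are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff; the project description does not specify one.
PROMOTION_THRESHOLD = 5


@dataclass
class Term:
    name: str
    definition: str
    votes: dict = field(default_factory=dict)  # contributor -> +1 or -1
    canonical: bool = False                    # False means "vernacular"

    def score(self, reputation):
        # Weight each vote by the voter's reputation (default weight 1).
        return sum(vote * reputation.get(user, 1)
                   for user, vote in self.votes.items())


class Registry:
    def __init__(self, threshold=PROMOTION_THRESHOLD):
        self.terms = {}
        self.reputation = {}  # contributor -> reputation weight
        self.threshold = threshold

    def propose(self, name, definition):
        # Low barrier: any contributor may add a vernacular term.
        term = Term(name, definition)
        self.terms[name] = term
        return term

    def vote(self, name, user, value):
        if value not in (1, -1):
            raise ValueError("vote must be +1 or -1")
        self.terms[name].votes[user] = value

    def promote(self, name, moderator_approves):
        # Light supervision: promotion needs both community score
        # and an explicit moderator decision (an assumed policy).
        term = self.terms[name]
        if moderator_approves and term.score(self.reputation) >= self.threshold:
            term.canonical = True
        return term.canonical
```

In this sketch a term stays vernacular until both conditions hold, so community prominence alone never makes a term canonical, and a moderator cannot promote a term the community has not endorsed.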

The intern will work with the DataONE PAMWG (Preservation and Metadata Working Group) to begin populating a registry instance that emphasizes, but is not limited to, the earth and environmental sciences. In keeping with working group goals, the instance will feature a low barrier to contribution, transparent review processes, and support for balanced discussion and lightweight moderation by elders (experts). Stack Overflow, Hacker News, and Wikipedia have shown, through a range of reputation-based approaches, that quality can be achieved by drawing the best from user communities. Pooling resources across the sciences will reduce duplicated effort and spending, and support greater interoperability within DataONE and among other scientific data initiatives.

Primary Mentors:
Jane Greenberg, John Kunze