Provenance-enabled Reproducibility: Developments in DataONE
Reproducible research is enabled, in part, by provenance metadata that describes the lineage and processing history of data and knowledge artifacts. Provenance plays an important role in many scientific applications and use cases, yet it is often not tracked as thoroughly or systematically as science metadata. DataONE has been working on tools to display provenance information and to support recording provenance metadata through programming languages such as R and MATLAB and through an intuitive, user-friendly, web-based UI.
During this webinar we will describe the history of this work to date, showcase the tools developed, and demonstrate the new web-based provenance editor. We will also highlight the collaborative efforts in building a community around provenance, and introduce future integration with Whole Tale and other community initiatives.
Chris Jones is a Software Engineer at the National Center for Ecological Analysis and Synthesis (NCEAS), at the University of California, Santa Barbara. He has worked on informatics projects for the last fifteen years, focusing on generic solutions to common data management needs in the earth and ecological sciences. Chris has built systems to document and archive data for regional and international consortia and to stream data in near real time from arrays of oceanographic sensors deployed across the insular Pacific, and has been involved in metadata standards and ontology development. Chris tries to handle computer systems in stride, despite their frequent tantrums. He lives in Colorado.
Bryce Mecum is a scientific software engineer with expertise in data analysis, programming, and data management systems, including tools such as R, GitHub, repository software, Python, and UNIX. He has a background in fisheries modeling and management, and builds software systems supporting environmental synthesis.
Matthew Jones is the Director of Informatics Research at the National Center for Ecological Analysis and Synthesis, and co-PI on DataONE. His research focuses on environmental informatics, and particularly software for management, integration, analysis, and modeling of heterogeneous environmental data. Products have included metadata standards like Ecological Metadata Language, data systems like the KNB Data Repository and DataONE, and scientific workflow systems such as Kepler for tracking the structure and provenance of analysis.