Provenance and DataONE: Facilitating Reproducible Science

Date of Webinar: 
Tuesday, May 12, 2015
9 am Pacific / 10 am Mountain / 11 am Central / 12 noon Eastern

Webinar Abstract

Provenance is a form of metadata that describes the lineage and processing history of data and knowledge artifacts, and it plays an important role in many scientific applications and use cases. For example, an ecologist might want to combine different datasets for a study, but needs to know how the candidate datasets were derived. A climate scientist might need to document the processing history of climate model outputs to facilitate reproducibility. A natural history collection manager might want to run automated data curation tools on specimen collection data, but has to understand the proposed “repairs” before executing them. In these and many similar cases, provenance information plays a crucial role. In this webinar, we will first give an overview of the different types of provenance information and how they can be used, e.g., to facilitate reproducible science.

We then show how a DataONE user can search and navigate provenance information using the new UI currently under development in DataONE. After this user-oriented view of provenance, we finally take a look “behind the scenes” at the DataONE provenance technologies and present plans for future developments.

Bertram Ludäscher is a professor at the Graduate School of Library and Information Science (GSLIS) at the University of Illinois, Urbana-Champaign, and the director of the Center for Informatics Research in Science and Scholarship (CIRSS). He also holds faculty affiliate appointments at the National Center for Supercomputing Applications (NCSA) and the Department of Computer Science at UIUC. Prior to joining the iSchool at Illinois he was a professor at the Department of Computer Science and the Genome Center at the University of California, Davis. His research interests span a number of areas in the data-to-knowledge lifecycle, from modeling and design of databases and workflows to knowledge representation and reasoning. His current research focus includes both the theoretical foundations of provenance and practical applications, in particular to support automated data quality control and workflow-supported data curation. He is one of the founders of the Kepler scientific workflow system, and a member of the DataONE leadership team, focusing on data and workflow provenance. Until 2004 Ludäscher was a research scientist at the San Diego Supercomputer Center (SDSC) and an adjunct faculty member at the CSE Department at UC San Diego. He received his M.S. (Dipl.-Inform.) in computer science from the Technical University of Karlsruhe (K.I.T.) and his PhD (Dr.rer.nat.) from the University of Freiburg, both in Germany.
Chris Jones is a Software Engineer at the National Center for Ecological Analysis and Synthesis (NCEAS), at the University of California, Santa Barbara. He has worked on informatics projects for the last fifteen years, focusing on generic solutions to common data management needs in the earth and ecological sciences. Chris has built systems to document and archive data for regional and international consortia, stream data in near real time from arrays of oceanographic sensors deployed across the insular Pacific islands, and has been involved in metadata standards development and ontology development. Chris tries to handle computer systems in stride, despite their frequent tantrums. He lives in Colorado.
Lauren Walker is the Software Designer at the National Center for Ecological Analysis and Synthesis and for DataONE. Her work focuses on creating user-minded interfaces and web applications for environmental scientists.