Provenance and DataONE: Facilitating Reproducible Science
Provenance is a form of metadata that describes the lineage and processing history of data and knowledge artifacts, and it plays an important role in many scientific applications and use cases. For example, an ecologist might want to combine different datasets for a study, but needs to know how the candidate datasets were derived. A climate scientist might need to document the processing history of climate model outputs to facilitate reproducibility. A natural history collection manager might want to run automated data curation tools on specimen collection data, but has to understand the proposed “repairs” before executing them. In these and many similar cases, provenance information plays a crucial role. In this webinar, we will first give an overview of the different types of provenance information and how they can be used, e.g., to facilitate reproducible science.
We then show how a DataONE user can search and navigate provenance information using the new UI currently under development in DataONE. After this user-oriented view of provenance, we take a look “behind the scenes” at the DataONE provenance technologies and present plans for future developments.