Provenance for Self or Others? A Study with Hands-on Experiments

DataONE 2019 Intern Yilin Xia

Yilin Xia is a second-year master's student in the School of Information Sciences at the University of Illinois at Urbana-Champaign (UIUC).
He received his bachelor's degree in Information Management and Information System (Financial Intelligence) in 2018 from Southwestern University of Finance and Economics. His research interests lie primarily at the intersection of machine learning, data visualization, and data provenance. In his free time, Yilin enjoys cooking, traveling, and reading.

Project Description: 

Data provenance is an important form of metadata that captures the lineage and processing history of data products resulting from data-driven analyses and workflows. Provenance information can increase the transparency, reproducibility, and reuse of data products. Recent years have seen considerable research and development efforts devoted to standards, tools, and applications that capture, store, query, and visualize provenance.
The goal of this project is to study the contemporary use of provenance in different stages of the data life cycle in order to answer questions such as: Who is creating or using provenance, and for what purposes? Is provenance capture and use already an ingrained best practice in some domains, or is it viewed as yet another “metadata chore” that scientists reluctantly deal with?
This project consists of two parts: (i) an “environmental scan” / survey of the research literature on data provenance with a focus on provenance tools and applications (possibly including some limited survey work), and (ii) a hands-on part whose goal is to use commonly mentioned tools in their prototypical settings. A key outcome is a report with findings and recommendations based on the literature survey and the intern’s own hands-on experiences.

Primary Mentor: 
Bertram Ludäscher
Secondary Mentors: 
Michael Gryk, Robert Sandusky