Understanding and Using Provenance from Digital Notebooks

Tianhong Song

Tianhong Song is a PhD candidate in Computer Science at the University of California, Davis. He received his BS in Applied Biological Science and Enterprise Management from Zhejiang University. His research interests include workflow design and analysis, large-scale data analysis and integration, and algorithm design and optimization in general.

Project Description: 

The capture and analysis of provenance from scientific workflow environments and databases are well studied and understood. Scientists can use provenance information to better understand, debug, and document their findings, which greatly simplifies and enhances reproducible science. Digital notebooks such as IPython can be understood as a new way of marrying ideas from high-level scripting, interactive workflows, and even “executable papers”: similar to the idea of Literate Programming, a digital notebook combines documentation (the paper) with the code that produces the results. Thus, by design, notebooks are self-documenting and greatly enhance transparency and reproducibility. However, the provenance models used for digital notebooks are less well studied than those for databases and workflow systems. Capturing the provenance of data obtained during multiple interactive sessions is therefore one of the enablers of emerging, dynamic models of scholarly communication. Based on existing and to-be-developed notebooks, the project will investigate the potential for provenance capture in these environments, identifying technical challenges and assessing both complexities and opportunities. Furthermore, it will explore the modeling of provenance in such contexts and the adaptation of existing querying and analysis techniques.

Primary Mentor: 
Bertram Ludäscher
Secondary Mentor: 
Paolo Missier