Webinar

Developing, Packaging and Sharing Reproducible Research Objects: The Whole Tale Approach

Speakers

Craig Willis

National Center for Supercomputing Applications

Craig Willis is a senior research programmer at the National Center for Supercomputing Applications (NCSA) where he works as part of the Whole Tale development team. He is also a PhD candidate at the School of Information Sciences at the University of Illinois at Urbana-Champaign. His research focuses on methods of verification and dissemination of computational research artifacts for transparency and reproducibility.
Bertram Ludäscher

University of Illinois at Urbana-Champaign

Bertram Ludäscher is Director of the Center for Informatics Research in Science and Scholarship and Professor at the School of Information Sciences, the National Center for Supercomputing Applications, and the Department of Computer Science at the University of Illinois at Urbana-Champaign. He conducts research in scientific data management, scientific workflows, and data provenance. His research interests also include foundations of databases, knowledge representation, and reasoning. Ludäscher applies this work in a number of domains, e.g., biodiversity informatics and taxonomy.
It has been recognized for some time now that sharing data is critical when publishing research findings. For example, the FAIR principles for scientific data management and stewardship require that data be findable, accessible, interoperable, and reusable. Scientific data repositories and networks such as DataONE provide researchers with ways to share and publish research data, bundled with appropriate metadata to allow peers to interpret and reuse the data products associated with research papers. Increasingly, the need to share the code (i.e., scripts and analysis programs) used in the generation, analysis, and visualization of data, alongside the data products themselves, has been recognized as well. However, it remains challenging (if not impossible) for many potential users of a research object to deal with the installation of complex software dependencies and the appropriate parameterization and execution of multiple scripts contained in often complicated, nested research objects.
A key objective of the NSF-funded Whole Tale project is to make the development, sharing, and reuse of reproducible research objects more seamless, both for creators and users of “tales”. Tales can be seen as a special kind of research object that bundles not only data and metadata, but also the code and the runtime execution environment necessary to reproduce the computational aspects of a research paper. In particular, a human-centered narrative, e.g., in the form of a Jupyter (or RStudio) notebook, can be used as the central element of research tales that interleave scientific explanations, code, and visualizations.
By making it easier to develop, package, share, and execute tales, the Whole Tale approach supports different types of users. Researchers can easily combine data from different sources (e.g., DataONE member repositories), analyze and visualize the data, and then bundle up their research products, together with the software environment used to generate them, and share the resulting tales. Peers can use the shared tales, e.g., as part of a review process (associated with a scientific publication), or make them the basis for their own research. In this webinar, we will illustrate the Whole Tale approach using a number of simple examples and demonstrations.