DataONE and colleagues from Indiana University will be running a 3-hour workshop titled "Big Data Means Your Metadata Must Work" at Supercomputing 2011 in Seattle, November 12-18.
Below is a copy of our abstract. We hope it encourages you to join us!
Data-intensive computing produces large volumes of valuable data from parallel and graph computations and remote instruments. However, a recent Science article estimated that only 1% of ecological data is accessible after the research has been published. This tutorial addresses why metadata is critical to the use and reuse of scientific data. It discusses techniques for metadata capture, communication, and use in data discovery using four tools: XMC Cat, Metacat, Morpho, and Mercury. Participants will be taken through (1) installing and configuring a metadata catalog, (2) capturing domain-specific metadata both programmatically and through web-based user interfaces, (3) searching for metadata, and (4) configuring their search to return the right results. The tutorial will use examples from the geosciences and environmental sciences. By the end, participants will have a deeper familiarity with currently available metadata tools and will be on their way to making big data shareable!