Meeting Agenda (long version)
Remote attendance is available for those sessions marked with an asterisk. Remote attendees can join in listen-only mode using the links below and may ask questions using the chat message feature.
Monday July 16th 2018
|0830||Session 1*: Welcome to the DUG, Introduction to DataONE and Update||Amber Budden, Bill Michener||Madera|
|0900||Session 2*: DataONE Member Node Federation||Amy Forrester and DataONE Member Nodes||Madera|
|0940||Session 3*: DUG Business Meeting (epad notes)||Co-chairs: Karl Benedict and Bob Sandusky||Madera|
|1030||Session 4*: DataONE Sustainability (epad notes)||Panel Discussion incl: Bryan Heidorn, Karl Benedict, Bob Sandusky, Bill Michener, Erin Robinson, Matt Jones, Amber Budden||Madera|
|1130||Session 5: Birds of a Feather**||Sustainability of infrastructure||Madera (epad notes)|
|Community engagement and open data advocacy||Canyon A (epad notes)|
|FAIR data||Canyon C (epad notes)|
|1330||Session 6*: Breakout Sessions|
|Breakout A: Technical Sustainability through DataONE (epad notes)||Matt Jones, Dave Vieglais, Karl Benedict||Canyon A|
|Breakout B: Community Outreach and Education (epad notes)||Amber Budden, Amy Forrester||Canyon C|
|1550||Session 6 cont'd: Report Back from Breakouts||Mod: Bob Sandusky||Madera|
|1600||Session 7: Community Oral Presentations||Madera|
|Supporting reusable, reproducible, and interoperable computational modeling||Allen Lee|
|Enabling FAIR Data Project, the DataONE Users Group, and a Discussion on Implementation and Transition||Shelley Stall|
|Repositories for Scientific Data Papers||Bill Michener|
|1730||Session 8: Reception and Poster Session||West Foyer|
**Potential Topics: Sustainability of services, software and infrastructure; FAIR data; Open science advocacy and education
Optional Social Event
Following the end of the poster session, we invite you to join us across the street at Gentle Ben's Micro Brewery, 865 East University Blvd. They have a full menu in addition to a wide range of brews. No pre-registration required.
Tuesday July 17th 2018
The DataONE Users Group is supporting two workshops as part of the Tuesday ESIP agenda. These and other ESIP activities are not included in the DUG registration; you must register with ESIP for full or one-day attendance.
- Jul 17th 0930: Interoperability within the DataONE federation
- Jul 17th 1600: Enabling transparency and reproducibility in science through practical provenance frameworks
Breakout A: Technical Sustainability through DataONE
Part I: Rich Metadata Models
Part II: Service APIs and Discovery
The intent of this breakout is to provide an overview of how metadata is handled by DataONE across various scenarios of dataset description for access and reuse. The session will cover the purpose of metadata, good practices for describing datasets, how to describe access to datasets, and how the DataONE indexing and search UI helps. The focus will be on these services within the context of technical sustainability and on how Member Nodes and the DUG can play a role in sustaining DataONE infrastructure.
Breakout B: Community Outreach and Education
Part I: New Education Resources
Part II: Member Node Support Documentation.
In Part I of this breakout we will introduce two new education resources, the Researcher's Guide to DataONE and the Data Management Skillbuilding Hub, and explore the potential for extending the hub as a platform for community products. In Part II we will discuss mechanisms for increased support of upcoming and current Member Nodes through enhanced documentation, and the framework for presenting that documentation.
Oral Presentation Abstracts
Michael Barton, Marco Janssen, Allen Lee, Kenneth Buetow, and Calvin Pritchard
Supporting reusable, reproducible, and interoperable computational modeling
Code is data, and there is a growing awareness that it should be archived and citable alongside all the other digital artifacts that support modern research with computational dependencies. Although making code available is a great first step, there's a lot more work needed to make it amenable to reuse and evaluation. To this end, CoMSES Net is dedicated to establishing community standards and guides to good practice for documentation, semantic interoperability, and modular decomposition of computational modeling components that facilitate evaluation and reuse. We aspire towards a future where computational modelers can discover, evaluate, and connect well-encapsulated, domain-specific models that can be run on commodity compute resources or supercomputers, and configure them to test alternate hypotheses and implementations. As a concrete example, a hydrology model that routes water across a landscape tells an agent-based model, which simulates farmer decisions (what to plant, where, as well as terraforming land use decisions), how much water will be available to a given grid cell; expected yields are then derived from a crop systems model. This particular problem space is fraught and littered with arguably failed scientific pipeline software (as in failed to gain widespread adoption in the broader scientific community). But we should keep trying!
Shelley Stall, Lynn Yarmey, Erin Robinson, Kerstin Lehnert, Mark Parsons, Leslie Wyborn, Brooks Hanson, Brian Nosek, Joel Cutcher-Gershenfeld
Enabling FAIR Data Project, the DataONE Users Group, and a Discussion on Implementation and Transition
The Enabling FAIR Data project in the Earth, space, and environmental sciences is moving into the implementation phase shortly after the DUG 2018 meeting. In this session, the project's recommendations, guidelines, and timelines will be reviewed, along with the tools and information that can be used to support the transition.
Although more and more publishers are requiring data to be available upon publication, practices vary widely. Many still allow “available from authors” statements, and even when deposition is mandated, data are commonly just placed in the supplementary information section of the manuscript as a PDF file or a spreadsheet in a proprietary format, such as Excel, without any metadata. These supplements are not indexed. Some researchers use repositories that have solid preservation practices, but in many fields this is haphazard and not mandated, so that similar data are treated inconsistently.
To address these problems, AGU is convening a coalition of leading publishers and repositories in the Earth, space, and environmental sciences to enable FAIR data standards across the community, with major funding from the Laura and John Arnold Foundation. The solution will require researchers to place their data in a repository that supports persistent identifiers, data citation, community standards for metadata, and access to data before publication to support publication peer review.
Bill Michener, Matt Jones
Repositories for Scientific Data Papers
The talk includes an overview of the history of data publication in the ecological and environmental sciences and presents an approach for streamlining peer-reviewed data publication. We discuss how data repositories can play a new role in the data publication process. Example data papers are used to illustrate the benefits to science and society.
Steves, I, Goldstein, J, Jones, M & the NSF Arctic Data Center Team
Open source tools and workflows for repository management
The Arctic Data Center is geographically focused rather than domain specific, which fosters cross-disciplinary data discovery and synthesis. However, developing guidelines and support for heterogeneous data and metadata across social, physical, and biological sciences can be challenging. Open source tools and workflows help us collaborate as a team to address some of these issues.
Mach, M, Budden, A, Whitmire, A, Bloom, D, Rauch, S, Meyer, J, Hutchison, V
A Researcher's Guide to DataONE
For researchers new to open science and data management, understanding the complexity of the landscape, the role of various organizations and the support available can be challenging. There are multiple initiatives designed to consolidate this information, organized thematically by service or content type. Our new Researcher's Guide to DataONE walks researchers through the many resources available to them at DataONE within the framework of a 'roadmap'. The poster presented represents this step-by-step journey and summarizes more comprehensive, dynamic information available through the DataONE website.
Vieglais, D & the Make Data Count Team
How to Make Your Data Count
Make Data Count (MDC), a Sloan-funded project between California Digital Library, DataONE and DataCite, provides incentives and aims to show researchers the value of their research data by displaying data usage and citation metrics.
NEON within the landscape of data repositories
The U.S. National Ecological Observatory Network (NEON), sponsored by the U.S. National Science Foundation and managed cooperatively by Battelle, is a continental-scale, long-term observation facility that supports research of ecological drivers and responses. Now in its first year of operations, NEON collects and disseminates a full suite of observations from field crews (including soil, water, and organismal samples), automated instruments, and remote-sensing airborne platforms from 80 field sites across the U.S. Currently, more than 170 data products are freely available through a data portal, a public API, and a network of repositories, including AeroNet, MG-RAST, BOLD, OpenTopography, AmeriFlux, Phenocam Gallery, and NCBI SRA. In addition, tens of thousands of samples are available upon request, and NEON is standing up a new partnership with Arizona State University to manage its bioarchive. As a DataONE member node, we ask how NEON can more efficiently improve the discoverability of data across the landscape of archives, external repositories, and data/metadata aggregators.
Walworth, DH, Bradley, J, Smith, S
Alaska Data Integration working group Metadata ToolKit
The Alaska Data Integration working group (ADIwg) Metadata Toolkit is an open source suite of applications for authoring and editing metadata for spatial and non-spatial projects and datasets. The main goal of the toolkit is to promote the creation and use of metadata by lowering the level of technical expertise required to produce archival quality metadata.
The mdTranslator application supports translation between multiple metadata formats. Currently mdTranslator reads mdJSON, FGDC CSDGM, and sbJSON (the native format for the U.S. Geological Survey ScienceBase catalog) and outputs metadata in multiple standards, including ISO 19115-2, 19110, HTML, mdJSON, sbJSON and FGDC CSDGM. Support for ISO 19115-1 will be coming soon (December 2018).
mdEditor is an open source metadata authoring tool for documenting projects, data sets and other data resources. mdEditor may be used to create mdJSON records and, by interfacing with mdTranslator, to produce metadata in any of the supported output formats. mdEditor can be used by anyone to create metadata and requires no prior knowledge of any of the supported metadata formats.
EARTH AND SPACE SCIENCES DATA ARE A WORLD HERITAGE
Community Partnership to Develop Best Practices Across the Data Lifecycle to Advance Open and FAIR Data
Integrity and transparency within research are solidified by a complete set of research products that are findable, accessible, interoperable, and reusable. These are known as the FAIR Guiding Principles. Unfortunately, not all research artifacts are saved in such a way that they can be understood by other researchers reading the publication, and reused and repurposed in multiple other research endeavors.
To accelerate this process, the American Geophysical Union and a set of partners representing the international Earth and space science community have been awarded a grant from the Laura and John Arnold Foundation to develop a collaborative solution across researchers, journals, and repositories. The solution will evolve the Earth and Space Science (ESS) publication process to include not just the publication but all research inputs into that publication and related derived data products, helping to develop a unified process that is efficient and standardized for researchers and supports their work from grant application through to publishing.