DUG 2018 Agenda

Meeting Agenda (long version)

Remote attendance is available for sessions marked with an asterisk. Remote attendees can join in listen-only mode using the links below and may ask questions using the chat message feature.

Plenary Sessions: remote participation link
Breakout A: remote participation link
Breakout B: remote participation link

Monday July 16th 2018

| Time | Session | Speaker | Location |
|---|---|---|---|
| 0800 | Breakfast | | Madera |
| 0830 | Session 1*: Welcome to the DUG, Introduction to DataONE and Update | Amber Budden, Bill Michener | Madera |
| 0900 | Session 2*: DataONE Member Node Federation | Amy Forrester and DataONE Member Nodes | Madera |
| 0940 | Session 3*: DUG Business Meeting (epad notes) | Co-chairs: Karl Benedict and Bob Sandusky | Madera |
| 1010 | Break | | |
| 1030 | Session 4*: DataONE Sustainability (epad notes) | Panel Discussion incl: Bryan Heidorn, Karl Benedict, Bob Sandusky, Bill Michener, Erin Robinson, Matt Jones, Amber Budden | Madera |
| 1130 | Session 5: Birds of a Feather** | Sustainability of infrastructure | Madera (epad notes) |
| | | Community engagement and open data advocacy | Canyon A (epad notes) |
| | | FAIR data | Canyon C (epad notes) |
| 1230 | Lunch | | Pima |
| 1330 | Session 6*: Breakout Sessions | | |
| | Breakout A: Technical Sustainability through DataONE (epad notes) | Matt Jones, Dave Vieglais, Karl Benedict | Canyon A |
| | Breakout B: Community Outreach and Education (epad notes) | Amber Budden, Amy Forrester | Canyon C |
| 1530 | Break | | |
| 1550 | Session 6 cont'd: Report Back from Breakouts | Moderator: Bob Sandusky | Madera |
| 1600 | Session 7: Community Oral Presentations | | Madera |
| | Supporting reusable, reproducible, and interoperable computational modeling | Allen Lee | |
| | Enabling FAIR Data Project, the DataONE Users Group, and a Discussion on Implementation and Transition | Shelley Stall | |
| | Repositories for Scientific Data Papers | Bill Michener | |
| 1700 | Adjourn | | |
| 1730 | Session 8: Reception and Poster Session | | West Foyer |

**Potential Topics: Sustainability of services, software and infrastructure; FAIR data; Open science advocacy and education

Optional Social Event
Following the poster session, we invite you to join us across the street at Gentle Ben's Micro Brewery, 865 East University Blvd. They have a full menu in addition to a wide range of brews. No pre-registration is required.

Tuesday July 17th 2018
The DataONE Users Group is supporting two workshops as part of the Tuesday ESIP agenda. These and other ESIP activities are not included in the DUG registration; you must register with ESIP for full or one-day attendance.

Breakout Descriptions

Breakout A: Technical Sustainability through DataONE
Part I: Rich Metadata Models
Part II: Service APIs and Discovery
The intent of this breakout is to provide an overview of how DataONE handles metadata across various scenarios of dataset description for access and reuse. The session will cover the purpose of metadata, good practices for describing datasets, how to describe access to datasets, and how the DataONE indexing and search UI helps. The focus will be on these services within the context of technical sustainability and on the role Member Nodes and the DUG can play in sustaining DataONE infrastructure.

Breakout B: Community Outreach and Education
Part I: New Education Resources
Part II: Member Node Support Documentation
In Part I of this breakout we will introduce some new education resources, the Researcher's Guide to DataONE and the Data Management Skillbuilding Hub, and explore the potential for extending the hub as a platform for community products. In Part II we will discuss mechanisms for increased support of upcoming and current Member Nodes through enhanced documentation, and the framework for presenting that documentation.

Oral Presentation Abstracts

Michael Barton, Marco Janssen, Allen Lee, Kenneth Buetow, and Calvin Pritchard
Supporting reusable, reproducible, and interoperable computational modeling

Code is data, and there is a growing awareness that it should be archived and citable alongside all the other digital artifacts that support modern research with computational dependencies. Although making code available is a great first step, a lot more work is needed to make it amenable to reuse and evaluation. To this end, CoMSES Net is dedicated to establishing community standards and guides to good practice for documentation, semantic interoperability, and modular decomposition of computational modeling components that facilitate evaluation and reuse. We aspire towards a future where computational modelers can discover, evaluate, and connect well-encapsulated, domain-specific models that can be run on commodity compute resources or supercomputers, and configure them to test alternate hypotheses and implementations. As a concrete example, a hydrology model that routes water across a landscape tells an agent-based model that simulates farmer decisions (what to plant, where, as well as terraforming land use decisions) how much water will be available to a given grid cell, and expected yields are ultimately derived from a crop systems model. This particular problem space is fraught and littered with arguably failed scientific pipeline software (as in, software that failed to gain widespread adoption in the broader scientific community). But we should keep trying!

Shelley Stall, Lynn Yarmey, Erin Robinson, Kerstin Lehnert, Mark Parsons, Leslie Wyborn, Brooks Hanson, Brian Nosek, Joel Cutcher-Gershenfeld
Enabling FAIR Data Project, the DataONE Users Group, and a Discussion on Implementation and Transition

The Enabling FAIR Data project in the Earth, space, and environmental sciences is moving into the implementation phase shortly after the DUG 2018 meeting. This session will review the project's recommendations, guidelines, and timelines, along with the tools and information that can be used to support the transition.
Project background:
Although more and more publishers are requiring data to be available upon publication, practices vary widely. Many still allow “available from authors” statements, and even when deposition is mandated, data are commonly just placed in the supplementary information section of the manuscript as a PDF file or a spreadsheet in a proprietary format, such as Excel, without any metadata. These supplements are not indexed. Some researchers use repositories that have solid preservation practices, but in many fields this is haphazard and not mandated, so that similar data are treated inconsistently.
To address these problems, AGU is convening a coalition of leading publishers and repositories in the Earth, space, and environmental sciences to enable FAIR data standards across the community, with major funding from the Laura and John Arnold Foundation. The solution will require researchers to place their data in a repository that supports persistent identifiers, data citation, community standards for metadata, and access to data before publication to support publication peer review.

Bill Michener, Matt Jones
Repositories for Scientific Data Papers

The talk includes an overview of the history of data publication in the ecological and environmental sciences and presents an approach for streamlining peer-reviewed data publication. We discuss how data repositories can play a new role in the data publication process. Example data papers are used to illustrate the benefits to science and society.

Poster Presentations

Steves, I, Goldstein, J, Jones, M & the NSF Arctic Data Center Team
Open source tools and workflows for repository management

The Arctic Data Center is geographically focused rather than domain specific, which fosters cross-disciplinary data discovery and synthesis. However, developing guidelines and support for heterogeneous data and metadata across social, physical, and biological sciences can be challenging. Open source tools and workflows help us collaborate as a team to address some of these issues.

Mach, M, Budden, A, Whitmire, A, Bloom, D, Rauch, S, Meyer, J, Hutchison, V
A Researcher's Guide to DataONE

For researchers new to open science and data management, understanding the complexity of the landscape, the role of various organizations and the support available can be challenging. There are multiple initiatives designed to consolidate this information, organized thematically by service or content type. Our new Researcher's Guide to DataONE walks researchers through the many resources available to them at DataONE within the framework of a 'roadmap'. The poster presented represents this step-by-step journey and summarizes more comprehensive, dynamic information available through the DataONE website.

Vieglais, D & the Make Data Count Team
How to Make Your Data Count

Make Data Count (MDC), a Sloan-funded project between California Digital Library, DataONE and DataCite, provides incentives and aims to show researchers the value of their research data by displaying data usage and citation metrics.

Laney, C
NEON within the landscape of data repositories

The U.S. National Ecological Observatory Network (NEON), sponsored by the U.S. National Science Foundation and managed cooperatively by Battelle, is a continental-scale, long-term observation facility that supports research on ecological drivers and responses. Now in its first year of operations, NEON collects and disseminates a full suite of observations from field crews (including soil, water, and organismal samples), automated instruments, and remote-sensing airborne platforms from 80 field sites across the U.S. Currently, more than 170 data products are freely available through a data portal, a public API, and a network of repositories, including AeroNet, MG-RAST, BOLD, OpenTopography, AmeriFlux, Phenocam Gallery, and NCBI SRA. In addition, tens of thousands of samples are available upon request, and NEON is standing up a new partnership with Arizona State University to manage its bioarchive. As a DataONE member node, we ask how NEON can more efficiently improve the discoverability of data across the landscape of archives, external repositories, and data/metadata aggregators.

Walworth, DH, Bradley, J, Smith, S
Alaska Data Integration working group Metadata Toolkit

The Alaska Data Integration working group (ADIwg) Metadata Toolkit is an open source suite of applications for authoring and editing metadata for spatial and non-spatial projects and datasets. The main goal of the toolkit is to promote the creation and use of metadata by lowering the level of technical expertise required to produce archival-quality metadata.
mdJSON is the metadata format that ties the suite of tools together. Based on JavaScript Object Notation (JSON), mdJSON is capable of capturing 90% of ISO 19115-1 and FGDC CSDGM. Schemas are available for validation and documentation of mdJSON metadata records. mdJSON is an excellent lightweight alternative to current XML-based metadata formats.
The mdTranslator application supports translation between multiple metadata formats. Currently mdTranslator reads mdJSON, FGDC CSDGM, and sbJSON (the native format for the U.S. Geological Survey ScienceBase catalog) and outputs metadata in multiple standards, including ISO 19115-2, 19110, HTML, mdJSON, sbJSON and FGDC CSDGM. Support for ISO 19115-1 will be coming soon (December 2018).
mdEditor is an open source metadata authoring tool for documenting projects, datasets and other data resources. mdEditor may be used to create mdJSON records and, by interfacing with mdTranslator, to produce metadata in any of the supported output formats. mdEditor can be used by anyone to create metadata and requires no prior knowledge of any of the supported metadata formats.

Stall, S
Community Partnership to Develop Best Practices Across the Data Lifecycle to Advance Open and FAIR Data

Integrity and transparency within research are solidified by a complete set of research products that are findable, accessible, interoperable, and reusable. These are known as the FAIR Guiding Principles. Unfortunately, not all research artifacts are saved in such a way that they can be understood by other researchers reading the publication, and reused and repurposed in other research endeavors.
To accelerate this process, the American Geophysical Union and a set of partners representing the international Earth and space science community have been awarded a grant from the Laura and John Arnold Foundation to develop a collaborative solution across researchers, journals and repositories. The solution will evolve the Earth and Space Science (ESS) publication process to include not just the publication, but all research inputs into that publication and related derived data products. The aim is a unified process that is efficient and standardized for researchers and supports their work from grant application through to publishing.