California Digital Library and Partners Launch DataUp

October 2, 2012 - The University of California’s California Digital Library (CDL) and its partners today launched DataUp, a free data management tool.

Researchers struggling to meet new data management requirements from funders, journals and their own institutions can now use the DataUp web application and a Microsoft Excel add-in to document and archive their tabular data.

“DataUp will change the way scientists do their work, making it easy for them to manage and preserve their spreadsheet data for future use,” said Bill Michener, principal investigator for the DataONE project.

Scientific datasets have immeasurable value, but they are useless without proper documentation and long-term storage. Data sharing is also strongly encouraged in the scientific community but is not the norm in many disciplines, including Earth, ecological and environmental sciences. DataUp addresses these issues.

CDL partnered with the Gordon and Betty Moore Foundation, Microsoft Research Connections, and DataONE to create the DataUp tool, which is free to use and creates a direct link between researchers and data repositories. It was also announced today that the DataUp project has been contributed to the Outercurve Foundation’s Research Accelerator Gallery.

The DataUp add-in operates within a program many researchers already use: Microsoft Excel. The web application allows users to upload tabular data in either Excel format or comma-separated value (CSV) format. Both the add-in and the web application allow users to:

• Perform a “best practices check” to ensure data are well-formatted and organized
• Create standardized metadata, or a description of the data, using a wizard-style template
• Retrieve a unique identifier for their dataset from their data repository
• Post their datasets and associated metadata to the repository.

Although hundreds of data repositories are available for archiving, many scientific researchers are either unaware of their existence or do not know how to access them. One of the major outcomes of the DataUp project is the ONEShare repository, created specifically for DataUp, where users can deposit tabular data and metadata directly from the tool.

An added advantage of ONEShare is its connection to the DataONE network of repositories. DataONE links existing data centers and enables users to search for data across participating repositories through a single search interface. Data deposited into ONEShare will be indexed and made available to any DataONE user, facilitating collaboration and enabling data re-use.

"Although many tools exist for managing data, DataUp is uniquely positioned because it improves the quality and documentation of data in Microsoft Excel, the tool of choice for many researchers that would otherwise not participate in data preservation initiatives," said Matthew Jones, Director of Informatics at UC Santa Barbara's National Center for Ecological Analysis and Synthesis. "Scientific synthesis will benefit tremendously from the infusion of these small but information-rich data sets from Excel into the DataONE ecosystem of shared data."

CDL envisions the future of DataUp as directed by the participating community at large. Interested developers can expand on and increase the tool’s functionality to meet the needs of a broad array of researchers. Code for both the add-in and web application is open source, and participation in its improvement is strongly encouraged.

About the University of California Curation Center (UC3) at the California Digital Library
UC3 is a creative partnership bringing together the expertise and resources of the University of California. Together with the UC Libraries we provide high quality and cost-effective solutions that enable campus constituencies — museums, libraries, archives, academic departments, research units, and individual researchers — to have direct control over the management, curation, and preservation of the information resources underpinning their scholarly activities. For more information, visit http://www.cdlib.org/services/uc3/

Partners:
Microsoft Research Connections collaborates with and supports the work of the world’s top academic researchers and institutions. We establish partnerships to advance the state of the art in computer science and develop technologies that fuel data-intensive scientific research. By connecting leading researchers around the world, we aspire to accelerate the scientific discoveries and breakthroughs that respond to some of the world’s most urgent global challenges. Our fellowships, grants, and awards help to inspire the next generation of computer scientists and the broader research community.

The Gordon and Betty Moore Foundation is committed to making a meaningful difference in environmental conservation, patient care and scientific research. Gordon, co-founder of Intel, and his wife Betty established the foundation in 2000 to create positive outcomes for future generations. The Moore Foundation focuses on that goal around the world and in the San Francisco Bay Area. Learn more at www.Moore.org.

DataONE is the foundation of new innovative environmental science through a distributed framework and sustainable cyber-infrastructure that meets the needs of science and society for open, persistent, robust, and secure access to well-described and easily discovered Earth observational data. Supported by a $20 million award made as part of the National Science Foundation's (NSF) DataNet program (Grant #OCI-0830944), and with coordination nodes at the University of New Mexico, University of California Santa Barbara, and University of Tennessee, DataONE represents a collaboration of universities and government agencies coalesced to address the mounting need for organizing and serving up vast amounts of highly diverse, inter-related, but often heterogeneous scientific data.

The Outercurve Foundation is a not-for-profit foundation providing software IP management and project development governance to enable and encourage organizations to develop software collaboratively in open source communities for faster results. Outercurve is the only open source foundation that is platform, technology, and license agnostic. For more information about the Outercurve Foundation, contact info@Outercurve.org.

Contacts
Carly Strasser
510-987-0179, carly.strasser@ucop.edu

Trisha Cruse
510-987-9016, patricia.cruse@ucop.edu