New partnership targets data persistence

Tuesday, August 5, 2014

A new collaboration between the Global Biodiversity Information Facility (GBIF) and the Data Observation Network for Earth (DataONE) aims to support long-term persistence of the biodiversity data shared through the network.

GBIF Secretariat and DataONE have signed a memorandum of cooperation that includes plans for a pilot repository to archive datasets shared through GBIF, using the services built by DataONE.

The collaboration is part of a commitment in the GBIF Work Programme to explore data archival services that offer redundancy against scenarios such as technical failure and the disappearance of projects or institutions that share data through the network. Future developments could include adapting GBIF's Integrated Publishing Toolkit (IPT) to store datasets redundantly in DataONE as part of the standard data publishing process, and to enable publishing institutions to deposit a wider range of content types than are currently processed through GBIF.org.

The partnership will also support DataONE's objective of improving its own data indexing services by building on GBIF's solutions for handling large-scale tabular data.

Other areas addressed by the agreement include:

  • finding efficient ways to exchange ecological data of the type shared by many of DataONE's partners
  • using GBIF's wide network of publishers and participating institutions to connect additional members to DataONE
  • informing further development of GBIF's work with Hadoop and other 'big data' technologies to produce highly efficient and scalable tools for working with very large quantities of occurrence data

GBIF Secretariat's head of informatics, Tim Robertson, said the partnership would open up new possibilities for improving GBIF's services for users of biodiversity data: "When people use data mobilized through GBIF, they often clean records, integrate with other content and produce a new collection of derived data. We're excited to work with DataONE so that GBIF will be able to offer a repository for these derived datasets that can be referenced in manuscripts using a Digital Object Identifier (DOI)."

"Ultimately we hope this will support replicability of research using data accessed through GBIF," Robertson concluded.