Data Citation and Attribution

Attribution for DataONE Web Resources

If you use information from the DataONE website in published or unpublished work, please include attribution for the resources you use. Contributors to the material will either be found within the resource itself, or listed under 'credits' linked from the resource page.

General format and sequence:
Author [if known]. (Date published [if available], or n.d. [indicating no date]) Title of resource. Title of web site. Retrieved date. From URL. doi: [if available]

DataONE Tutorial on Data Citation:
Tutorials on Data Management: Data Citation. DataONE. Retrieved June 12, 2012. From http://www.dataone.org/sites/all/documents/L09_DataCitation.pptx
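The general format and sequence above can be assembled programmatically when you need to attribute many resources. The following Python function is only a minimal sketch: the function name and its parameters are hypothetical, not part of any DataONE tool, and it applies the template literally, so it prints "(n.d.)" whenever no publication date is supplied.

```python
from typing import Optional


def format_web_attribution(
    title: str,
    site: str,
    retrieved: str,
    url: str,
    author: Optional[str] = None,
    published: Optional[str] = None,
    doi: Optional[str] = None,
) -> str:
    """Build an attribution string in the sequence:
    Author. (Date or n.d.) Title. Site. Retrieved date. From URL. doi: ...
    """
    parts = []
    if author:  # author is included only if known
        parts.append(f"{author}.")
    # Use "n.d." when no publication date is available
    parts.append(f"({published or 'n.d.'})")
    parts.append(f"{title}.")
    parts.append(f"{site}.")
    parts.append(f"Retrieved {retrieved}.")
    if doi:  # the DOI is appended only if available
        parts.append(f"From {url}.")
        parts.append(f"doi: {doi}")
    else:
        parts.append(f"From {url}")
    return " ".join(parts)


tutorial = format_web_attribution(
    title="Tutorials on Data Management: Data Citation",
    site="DataONE",
    retrieved="June 12, 2012",
    url="http://www.dataone.org/sites/all/documents/L09_DataCitation.pptx",
)
print(tutorial)
```

Keeping the optional elements (author, date, DOI) as keyword arguments mirrors the "if known" and "if available" qualifiers in the template.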

DataONE Best Practices Primer:
The Best Practices Primer has a DOI and therefore can be cited as:
Strasser, C., Cook, R., Michener, W., & Budden, A. (2012). Primer on data management: What you always wanted to know. A DataONE publication. http://dx.doi.org/doi:10.5060/D2251G48 (available via CDL)

Citing Data

If you use data in published work, please formally credit those who contributed to the collection and hosting of the data according to legal requirements, repository guidelines, and community norms. If you are citing DataONE web pages and web resources other than data, please see Attribution for DataONE Web Resources above.

For legal requirements, please consult the licensing or citation terms for the dataset you wish to reference. Because the datasets in DataONE come from a variety of different sources and member nodes, there is no universal type of license across DataONE, though there are some best practices for citation, described below in Finding Dataset Licenses and Citations.

For repository guidelines, refer to the best practices information, if available, from the member node data repository that hosts the data you wish to cite. The hosting member node is shown alongside the search results in the DataONE Search interface (see the Find Data page for more information about DataONE Search). Some examples are given in the Finding Repository Citation Guidelines section below.

For community norms, consult recent journal articles in your field as well as pages for relevant societies and organizations. For example, the American Geophysical Union (AGU) has a policy on referencing data and a specific entry on referencing data sets in their Author Reference Sheet.

For more general data citation norms, please consider the information provided by DataCite, the article by Parsons, Duerr, and Minster, and the DataONE Citation Best Practices section below.

Finding Dataset Licenses and Citations

DataONE enables access to data from a wide variety of sources, and those sources differ in their licenses and in the citations they request. Often, these are reflected in the metadata for the dataset and/or ancillary documentation. If you have a dataset from DataONE, look in the directory with the downloaded data for documentation that lists license information. For example, datasets from the ORNL DAAC will often come with a Guide Document, which lists any use restrictions and a requested citation.

Many metadata records also contain this information. To find the metadata record for a dataset, if you don't already have it, go to the DataONE Search interface and search for that dataset. Paste the dataset identifier into the "Identifier" box and press Enter. Find the dataset in the resulting list, and click on the record title to access the metadata.

Finding Repository Citation Guidelines

Many data repositories have specific guidelines on how the data they host should be attributed. These guidelines are often linked from the repository web site. They can also be found by searching for "citation" along with the name of the repository, or in the metadata for specific datasets, as described above.

For example, from the Dryad help pages:

How should I cite data from Dryad?

When citing data found in Dryad, please cite both the original article, as well as the Dryad data package. You can see both of these citations on the Dryad page for each data package. For example:

Westbrook JW, Kitajima K, Burleigh JG, Kress WJ, Erickson DL, Wright SJ (2011) What makes a leaf tough? Patterns of correlated evolution between leaf toughness traits and demographic rates among 197 shade-tolerant woody species in a neotropical forest. American Naturalist 177(6): 800-811. http://dx.doi.org/10.1086/659963

Westbrook JW, Kitajima K, Burleigh JG, Kress WJ, Erickson DL, Wright SJ (2011) Data from: What makes a leaf tough? Patterns of correlated evolution between leaf toughness traits and demographic rates among 197 shade-tolerant woody species in a neotropical forest. Dryad Digital Repository. http://dx.doi.org/10.5061/dryad.8525

If you are using a large number of data sources, it may be appropriate to provide a list of referenced data packages, rather than citing each individually in the references section. This list of data packages can then be deposited in Dryad, so others who read your publication can locate all of the original data.

Another example, from the ORNL DAAC's citation policy:

Citation Style, On-Line Data Set

Turner, D.P., W.D. Ritts, and M. Gregory. 2006. BigFoot NPP Surfaces for North and South American Sites, 2002-2004. Data set. Available on-line [http://daac.ornl.gov] from Oak Ridge National Laboratory Distributed Active Archive Center, Oak Ridge, Tennessee, U.S.A. doi:10.3334/ORNLDAAC/750.

The content of citations should include as much of the following information as possible:

  • contributing investigators/authors
  • year of publication
  • product title
  • medium (for items other than printed text)
  • online location (i.e., URL)
  • publisher
  • publisher's location
  • date accessed
  • digital object identifier
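The list of elements above lends itself to a simple completeness check. The sketch below is illustrative only: the `DatasetCitation` class, its field names, and the `missing_elements` helper are hypothetical and are not part of the ORNL DAAC policy. It records the nine elements and reports which ones a given citation still lacks, using the BigFoot example above as input.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetCitation:
    # One field per element in the list above; all are optional because
    # the policy asks for "as much ... as possible", not every item.
    authors: Optional[str] = None
    year: Optional[int] = None
    title: Optional[str] = None
    medium: Optional[str] = None
    url: Optional[str] = None
    publisher: Optional[str] = None
    publisher_location: Optional[str] = None
    date_accessed: Optional[str] = None
    doi: Optional[str] = None


def missing_elements(citation: DatasetCitation) -> List[str]:
    """Return the names of elements that have not been filled in."""
    return [f.name for f in fields(citation) if not getattr(citation, f.name)]


bigfoot = DatasetCitation(
    authors="Turner, D.P., W.D. Ritts, and M. Gregory",
    year=2006,
    title="BigFoot NPP Surfaces for North and South American Sites, 2002-2004",
    medium="Data set",
    url="http://daac.ornl.gov",
    publisher="Oak Ridge National Laboratory Distributed Active Archive Center",
    publisher_location="Oak Ridge, Tennessee, U.S.A.",
    doi="10.3334/ORNLDAAC/750",
)
print(missing_elements(bigfoot))  # the example citation gives no access date
```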

In addition to data repository guidelines, individual journals may have specific guidelines for how to cite data. For example, BioMed Central offers this citation style guideline for its authors:

Dataset with persistent identifier:

Zheng, L-Y; Guo, X-S; He, B; Sun, L-J; Peng, Y; Dong, S-S; Liu, T-F; Jiang, S; Ramachandran, S; Liu, C-M; Jing, H-C (2011): Genome data from sweet and grain sorghum (Sorghum bicolor). GigaScience. http://dx.doi.org/10.5524/100012.

DataONE Citation Best Practices

When in doubt, cite a dataset as you would cite a paper, include actionable links in the full text inline citation and the references section, and credit others as you would like others to credit you.
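One way to make a citation link actionable is to express its DOI as a resolver URL. The following Python sketch normalizes the DOI forms that appear on this page (a bare DOI, a "doi:" prefix, or a dx.doi.org URL) into a single link. The function name is hypothetical, and the default resolver base `https://doi.org/` is an assumption that you can override; the sketch also does not percent-encode unusual characters in a DOI.

```python
# Prefixes that may precede the DOI itself (compared case-insensitively).
_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def doi_to_url(identifier: str, resolver: str = "https://doi.org/") -> str:
    """Strip any known prefixes from a DOI and return a resolver URL."""
    doi = identifier.strip()
    stripped = True
    while stripped:  # repeat, since prefixes can be stacked
        stripped = False
        for prefix in _PREFIXES:
            if doi.lower().startswith(prefix):
                doi = doi[len(prefix):].strip()
                stripped = True
    if not doi.startswith("10."):  # every DOI begins with the "10." directory code
        raise ValueError(f"not a DOI: {identifier!r}")
    return resolver + doi


print(doi_to_url("doi:10.3334/ORNLDAAC/750"))
print(doi_to_url("http://dx.doi.org/10.5061/dryad.8525"))
```

Stripping prefixes in a loop handles inputs where a URL and a "doi:" label have been combined, so both the inline citation and the reference list can point to the same canonical link.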

If you would like to support the growing practice of data citation, we suggest the following as best practices:

  • Coordinate with publishers: Work with journal publishers and data repositories to archive data (with embargo period, if appropriate) during the publication process. This allows information about how to access a data set to be published with the associated article and allows the article citation and DOI to be included with the archived dataset.
  • Use persistent identifiers: Acquire a persistent identifier such as a Digital Object Identifier (DOI) or an Archival Resource Key (ARK) for each dataset you create and include this identifier with metadata for a data set. Some data repositories will provide a persistent identifier for each archived data set.
  • Use standardized keywords: When preparing metadata for a data set, use standardized keywords to describe data such as those from the USGS Biocomplexity Thesaurus or the NASA Global Change Master Directory Keywords. Use of standardized terms will support discovery of your data by other researchers interested in reuse of those data.
  • Create good metadata as you go: As you collect and/or work with your data, use an application such as Morpho, Metavist, or Mermaid that supports metadata creation to capture essential information about your dataset. Tracking such information throughout data collection and analysis ensures availability of high-quality metadata when data are ready to be uploaded to a repository.
  • Encourage data attribution, sharing, and curation: Encourage other data authors to cite data and to make their own data available for reuse by
    • providing full citation information for data whenever you publish work that makes use of other researchers' data,
    • archiving your own data in a repository that supports data discovery and reuse, and
    • updating your archived data sets when newer versions are available.

If you would like to be part of ongoing initiatives to standardize data citation practice, check out the DataCite initiative. Other guidelines and discussions on data citation include: