Describe method to create derived data products

When describing the process for creating derived data products, the following information should be included in the data documentation or the companion metadata file:

  • Description of primary input data and derived data
  • Why processing is required
  • Data processing steps and assumptions
    • Assumptions about primary input data
    • Additional input data requirements
    • Processing algorithm (e.g., volts to mol fraction, averaging)
    • Assumptions and limitations of algorithm
    • How algorithm is applied (e.g., manually, or using R or IDL)
  • How outcome of processing is evaluated
    • How problems are identified and rectified
    • Tools used to assess outcome
    • Conditions under which reprocessing is required
  • How uncertainty in processing is assessed
    • Numeric estimate of uncertainty
  • How processing technique changes over time, if applicable
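
As an illustration of the checklist above, the following minimal Python sketch derives an averaged mole fraction from raw voltages and records the processing details alongside the result. The calibration constants, function names, and metadata fields are hypothetical and chosen only for this example; a real instrument calibration and uncertainty budget would differ.

```python
import json
import statistics

# Hypothetical linear calibration (an assumption for this sketch):
# mole fraction (ppm) = SLOPE * volts + INTERCEPT
CAL_SLOPE_PPM_PER_VOLT = 200.0
CAL_INTERCEPT_PPM = 0.0


def volts_to_mole_fraction(volts):
    """Convert raw voltages to mole fraction (ppm) with the linear calibration."""
    return [CAL_SLOPE_PPM_PER_VOLT * v + CAL_INTERCEPT_PPM for v in volts]


def derive_product(volts):
    """Return the averaged mole fraction, its standard error, and process metadata.

    Requires at least two readings, because the sample standard deviation
    is undefined for a single value.
    """
    ppm = volts_to_mole_fraction(volts)
    mean_ppm = statistics.mean(ppm)
    # Numeric uncertainty estimate: standard error of the mean.
    sem_ppm = statistics.stdev(ppm) / len(ppm) ** 0.5
    metadata = {
        "primary_input": "raw detector voltages (V)",
        "derived_product": "mean mole fraction (ppm)",
        "algorithm": "linear calibration followed by arithmetic averaging",
        "assumptions": "detector response is linear over the measured range",
        "applied_with": "Python script",
        "uncertainty_method": "standard error of the mean",
        "uncertainty_ppm": sem_ppm,
        "n_readings": len(ppm),
    }
    return mean_ppm, sem_ppm, metadata


if __name__ == "__main__":
    mean_ppm, sem_ppm, metadata = derive_product([1.0, 2.0, 3.0])
    # The metadata record can be saved next to the derived data file.
    print(json.dumps(metadata, indent=2))
```

Keeping the metadata record in the same function that produces the derived value makes it harder for the documentation and the processing to drift apart when the technique changes.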

Document steps used in data processing

Many types of new data may be created in the course of a project, such as visualizations, plots, statistical outputs, and new datasets created by integrating multiple datasets. Whenever possible, document your workflow (the process used to clean, analyze, and visualize data), noting what data products are created at each step. Depending on the nature of the project, this documentation might be a computer script or notes in a text file describing the process you used (i.e., process metadata). If workflows are preserved along with the data products, they can be re-executed to reproduce those products.
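
One way to capture a workflow as a script is sketched below in Python. Each step is a plain function, and a small runner records which data product each step creates. The step names, product names, and log fields are hypothetical and serve only to illustrate the idea; they are not a prescribed format.

```python
import json
from datetime import datetime, timezone


def clean(values):
    """Step 1: drop missing readings (represented here as None)."""
    return [v for v in values if v is not None]


def analyze(values):
    """Step 2: compute simple summary statistics from the cleaned values."""
    return {"n": len(values), "mean": sum(values) / len(values)}


# Each entry: (step name, function to run, name of the data product it creates).
STEPS = [
    ("clean", clean, "cleaned_values"),
    ("analyze", analyze, "summary_statistics"),
]


def run_workflow(steps, data):
    """Run the steps in order and log the product created at each step."""
    log = []
    for name, func, product in steps:
        data = func(data)
        log.append({
            "step": name,
            "product": product,
            "finished_utc": datetime.now(timezone.utc).isoformat(),
        })
    return data, log


if __name__ == "__main__":
    result, process_log = run_workflow(STEPS, [1.0, None, 3.0])
    print(result)
    # The log is the process metadata; it can be stored with the data products.
    print(json.dumps(process_log, indent=2))
```

Because the script itself is the documentation, rerunning it on the same inputs reproduces the data products, and the log shows which product came from which step.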