Data quality and quality control methods should be described in the data information file.
Documentation and metadata associated with a data file should be checked against the file itself to make sure that what is described is actually found in the data.
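One way to make this check routine is to compare the column names listed in the documentation with the header of the data file. The following is a minimal sketch, assuming a CSV file with a header row; the function name `check_header` and the sample column names are hypothetical.

```python
import csv
import io


def check_header(csv_text, documented_columns):
    """Compare a CSV header against the columns named in the documentation.

    Returns (undocumented, absent):
      undocumented: columns in the file that the documentation does not mention
      absent: columns the documentation describes but the file lacks
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    undocumented = [c for c in header if c not in documented_columns]
    absent = [c for c in documented_columns if c not in header]
    return undocumented, absent


# Hypothetical sample data and documentation, for illustration only.
sample = "site,date,temp_c\nA1,2020-01-01,12.4\n"
documented = ["site", "date", "temp_c", "depth_m"]
undocumented, absent = check_header(sample, documented)
print("undocumented:", undocumented)
print("absent:", absent)
```

Either list being non-empty signals that the documentation and the data file have drifted apart and one of them needs correcting.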
Data sets that are to be integrated should be compatible and comparable.
Prepare and document a plan for assuring the quality of your data, so that you and others can trust they are accurate.
Whether compiled or entered from paper records, new digital data should be double-checked for accuracy to avoid input errors.
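Double entry is one common way to double-check transcribed data: two people key in the same records independently and the two passes are compared. The sketch below assumes each pass is a list of dictionaries in the same row order; `compare_entries` and the sample values are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """Return (row_index, column, first_value, second_value) for each mismatch."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must contain the same number of rows")
    mismatches = []
    for i, (a, b) in enumerate(zip(first_pass, second_pass)):
        # Check every column that appears in either pass.
        for column in sorted(set(a) | set(b)):
            if a.get(column) != b.get(column):
                mismatches.append((i, column, a.get(column), b.get(column)))
    return mismatches


# Hypothetical example: the second pass transposed two digits in row 1.
first = [{"site": "A1", "temp_c": "12.4"}, {"site": "A2", "temp_c": "13.1"}]
second = [{"site": "A1", "temp_c": "12.4"}, {"site": "A2", "temp_c": "31.1"}]
discrepancies = compare_entries(first, second)
for row, column, v1, v2 in discrepancies:
    print(f"row {row}, column {column}: {v1!r} vs {v2!r}")
```

Each reported mismatch should be resolved against the original paper record rather than by guessing which pass is right.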
Quality control is highly dependent on the data collection methodology, but there are some general practices that can be applied.
Often search terms and query strings are used to discover and capture data sets. By documenting the steps and terms used, one can more easily reproduce a dataset.
Missing values in a data set should be standardized through the consistent use of well-defined missing value codes in data tables, so that they can be identified reliably.
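As an illustration, ad hoc markers for missing data can be mapped to a single documented code. This is a sketch only: the code `-9999` and the list of markers are assumptions, and the code actually chosen should be stated in the metadata and must never collide with a valid measurement.

```python
import math

# Hypothetical choices; record the real ones in the data documentation.
MISSING_CODE = "-9999"
AD_HOC_MARKERS = {"", "na", "n/a", "nan", "null", "none", "-", "."}


def standardize_missing(value):
    """Map None, NaN, and common ad hoc markers to one missing value code."""
    if value is None:
        return MISSING_CODE
    if isinstance(value, float) and math.isnan(value):
        return MISSING_CODE
    if isinstance(value, str) and value.strip().lower() in AD_HOC_MARKERS:
        return MISSING_CODE
    return value


raw = ["12.4", "NA", "", None, "n/a", "13.1"]
cleaned = [standardize_missing(v) for v in raw]
print(cleaned)
```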
Use statistical and visual methods to identify potentially erroneous data points.
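One simple statistical screen is the interquartile range (IQR) rule, which marks values lying more than 1.5 times the IQR outside the quartiles. The sketch below uses Python's standard `statistics.quantiles`; the function name `iqr_suspects`, the multiplier default, and the sample readings are illustrative assumptions. A value that fails the screen is a candidate for review, not proof of an error.

```python
import statistics


def iqr_suspects(values, k=1.5):
    """Return (index, value) pairs lying outside the IQR fences."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [(i, v) for i, v in enumerate(values) if v < low or v > high]


# Hypothetical temperature readings with one implausible value.
readings = [12.1, 12.4, 12.3, 12.6, 12.2, 45.0, 12.5]
suspects = iqr_suspects(readings)
print(suspects)
```

Plotting the same values as a box plot or time series gives the visual counterpart to this check.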
In a data file, when actual measurements could not be obtained, estimated values should be identified.
Problematic data should be flagged to indicate known issues, so that potential users are aware of the limitations of the data.
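Estimated and problematic values are often identified with a separate flag column, which leaves the measurement itself untouched. The following sketch assumes a hypothetical three-code scheme ("G" good, "E" estimated, "Q" questionable); whatever codes are used should be defined in the documentation.

```python
# Hypothetical flag codes; define the real scheme in the metadata.
FLAG_GOOD = "G"
FLAG_ESTIMATED = "E"
FLAG_QUESTIONABLE = "Q"


def add_flags(records, estimated_rows, questionable_rows):
    """Return copies of the records with a 'flag' field added.

    A row listed as both estimated and questionable is flagged questionable.
    """
    flagged = []
    for i, record in enumerate(records):
        if i in questionable_rows:
            flag = FLAG_QUESTIONABLE
        elif i in estimated_rows:
            flag = FLAG_ESTIMATED
        else:
            flag = FLAG_GOOD
        flagged.append({**record, "flag": flag})
    return flagged


# Hypothetical records: row 1 was estimated, row 2 is known to be suspect.
records = [{"temp_c": 12.4}, {"temp_c": 12.5}, {"temp_c": 45.0}]
flagged = add_flags(records, estimated_rows={1}, questionable_rows={2})
for row in flagged:
    print(row)
```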
Provide versions of data products with defined identifiers to enable discovery and use.
Items to consider when versioning data products: