
Writing a data management plan: examples and levels of data documentation
Some examples of data documentation are:
- laboratory notebooks and experimental protocols
- questionnaires, codebooks, data dictionaries
- software syntax and output files
- information about equipment settings and instrument calibration
- database schema
- methodology reports
- provenance information about sources of derived data
Research data need to be documented at various levels:
- Project level: what the study set out to do, how it contributes new knowledge to the field, what the research questions or hypotheses were, what methodologies were used, what sampling frames were used, what instruments and measures were used, and so on. A complete thesis normally contains this information, but a published article may not. If a dataset is shared, a succinct technical report may need to be included for the user to understand how to make best use of the data for their purposes.
- File or database level: how all the files (or tables in a database) that make up the dataset relate to each other; what format they are in; whether they supersede or are superseded by previous files. A readme.txt file is the classic way of accounting for all the files and folders in a project.
- Variable or item level: the key to understanding research results is knowing exactly how an object of analysis came about. A variable name at the top of a spreadsheet file is not enough, for example; the full label is also needed, explaining the meaning of that variable in terms of how it was operationalised.
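
As an illustration of variable-level documentation, the sketch below shows one possible way to keep a small data dictionary alongside a dataset and to check that every column in a file has an entry. It is a minimal example only: the variable names (`resp_id`, `hh_income`), the chosen fields (label, type, units, operationalisation) and the helper functions are all hypothetical and would need to be adapted to the project's own conventions.

```python
import csv
import io

# Hypothetical data dictionary: one entry per variable (column) in the dataset.
# The variable names and field choices are illustrative assumptions.
DATA_DICTIONARY = {
    "resp_id": {
        "label": "Respondent identifier",
        "type": "integer",
        "units": "",
        "operationalisation": "Assigned sequentially at the time of interview",
    },
    "hh_income": {
        "label": "Annual household income",
        "type": "integer",
        "units": "GBP per year",
        "operationalisation": "Self-reported gross income of all household members",
    },
}

FIELDS = ["variable", "label", "type", "units", "operationalisation"]


def undocumented_columns(header, dictionary):
    """Return the column names in `header` that have no dictionary entry."""
    return [name for name in header if name not in dictionary]


def dictionary_to_csv(dictionary):
    """Serialise the data dictionary as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for name, entry in dictionary.items():
        writer.writerow({"variable": name, **entry})
    return buffer.getvalue()


if __name__ == "__main__":
    # A spreadsheet header with one column ("q7") that nobody has documented yet.
    print(undocumented_columns(["resp_id", "hh_income", "q7"], DATA_DICTIONARY))
    print(dictionary_to_csv(DATA_DICTIONARY))
```

Saving the CSV output next to the data files gives a future user the full label and operationalisation for each variable, rather than only the short column name.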