Research Data Management

Resource for best practices in managing, storing, and sharing your research data

Documenting Data

[Image: Folders vs Metadata illustration]

Proper data documentation is necessary if you plan to share your data, because it gives others the context they need to interpret it. Ideally, documentation starts at the beginning of the project. Part of this documentation is metadata: structured information that describes content and makes data easier to find, use, and cite. Metadata records can be embedded in the data or stored separately, and metadata standards vary across fields and subject areas.

While metadata standards differ, at a minimum include:

  • Title - a name for the research project and its associated data
  • Creator - names and contact info for key organizations and people who created the data
  • Dates - any important dates associated with the data including project start and end dates, modification dates, and the time period covered by the data
  • Subject keywords - keywords or phrases describing your data to help others locate the data
  • Funders - agencies and organizations who funded the research
  • Rights - intellectual property rights held for the data
  • Language - the language of the content
  • File formats - formats for the data, e.g. TIFF, HTML, SPSS, as well as the software needed to view the data
  • Methodology - how the data was generated including equipment, software, or tools used, experimental protocols, and other information that might be included in a lab notebook

See the links below for more comprehensive lists of metadata fields for various disciplines and communities as well as a template for creating a descriptive readme.txt file for your data.
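
As a rough illustration of what such a readme might contain, the minimum fields listed above could be written out by a short script. This is only a sketch: the function name `build_readme`, the "Field: value" layout, and every value in `example` are hypothetical placeholders, not a prescribed standard or the template referred to above.

```python
# Field names follow the minimum list above; the order is an arbitrary choice.
REQUIRED_FIELDS = [
    "Title", "Creator", "Dates", "Subject keywords", "Funders",
    "Rights", "Language", "File formats", "Methodology",
]


def build_readme(metadata: dict[str, str]) -> str:
    """Return readme text with one 'Field: value' line per required field."""
    # Refuse to build a readme that silently omits a minimum field.
    missing = [f for f in REQUIRED_FIELDS if not metadata.get(f, "").strip()]
    if missing:
        raise ValueError("Missing metadata fields: " + ", ".join(missing))
    lines = [f"{field}: {metadata[field].strip()}" for field in REQUIRED_FIELDS]
    return "\n".join(lines) + "\n"


# Every value below is a made-up placeholder for illustration only.
example = {
    "Title": "Example stream temperature survey",
    "Creator": "Example Lab, Example University (contact@example.org)",
    "Dates": "Project 2023-01-15 to 2023-12-20; data cover 2023-03 to 2023-10",
    "Subject keywords": "stream temperature; hydrology; field sensors",
    "Funders": "Example Funding Agency",
    "Rights": "Placeholder rights statement",
    "Language": "English",
    "File formats": "CSV (any spreadsheet or text editor)",
    "Methodology": "Hourly logger readings; see protocol notes in lab notebook",
}

print(build_readme(example))
```

The returned text can be saved next to the data as readme.txt. Raising an error on a missing field is one possible design choice; a discipline-specific standard may require more fields or different names.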

Image by John Norris from Flickr

Documentation Best Practices

[Image: Best Practices road sign]

Some common best practices for data documentation include:

  • Establish a convention - use consistent methods for data entry and storage
  • Be descriptive - use descriptive filenames and information rather than generic ones
  • Include dates when applicable - system-generated dates can be unreliable and misleading
  • Do not use special characters - these can create problems with operating systems, software, and hierarchical systems
  • Use a folder structure that works for you - e.g. tags vs. a hierarchy of folders
  • Leave instructions - spell out any potentially confusing information in a readme file
  • Use a batch rename - do not rename files one at a time
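
Several of the practices above (descriptive names, explicit dates, no special characters, batch renaming) can be combined in a short script. The following is a minimal sketch, not a recommended tool: the `project_date_name.ext` pattern, the function names, and the choice of a YYYY-MM-DD date stamp are all assumptions made for illustration. The plan is built first so it can be reviewed before any file is renamed.

```python
import re
from pathlib import Path


def clean_name(stem: str) -> str:
    """Lowercase a name and replace each run of special characters with one underscore."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", stem).strip("_")
    return cleaned.lower()


def plan_renames(folder: Path, project: str, date_stamp: str) -> list[tuple[Path, Path]]:
    """Pair each visible file in folder with a project_date_name.ext path; renames nothing."""
    plan = []
    for old_path in sorted(folder.iterdir()):
        # Skip subfolders and hidden files such as .DS_Store.
        if not old_path.is_file() or old_path.name.startswith("."):
            continue
        new_name = (
            f"{clean_name(project)}_{date_stamp}_"
            f"{clean_name(old_path.stem)}{old_path.suffix.lower()}"
        )
        new_path = old_path.with_name(new_name)
        if new_path != old_path:
            plan.append((old_path, new_path))
    return plan


def apply_renames(plan: list[tuple[Path, Path]]) -> None:
    """Carry out a reviewed plan, refusing to overwrite an existing file."""
    for old_path, new_path in plan:
        if new_path.exists():
            raise FileExistsError(f"Refusing to overwrite {new_path}")
        old_path.rename(new_path)


print(clean_name("Trial #3 (final)!"))  # prints "trial_3_final"
```

A typical use would be to call `plan_renames(Path("raw_data"), "Stream Survey", "2023-06-01")`, where the folder, project name, and date are hypothetical, inspect the returned pairs, and only then pass them to `apply_renames`. Writing the date as YYYY-MM-DD keeps filenames in chronological order when sorted alphabetically.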

Image by Barry Dahl from Flickr