Research Data Management

Resource for best practices in managing, storing, and sharing your research data

Data Organization

Think about the end goal of your research proposal and build your organization plan around that goal. You may need to modify the plan as the research progresses, but it helps to have a system in place from the start. The key is to find a system that works for you and to follow it from the beginning of the project. This may seem like a simple task, but digital data can quickly get out of hand when organizational procedures are not followed.

There are several pieces to keep in mind while setting up your file organization structure:

File Version Control

To keep track of versions of documents and datasets:

Use naming conventions

Always record every change to a file, no matter how small.

Discard obsolete versions after backups have been made.
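As a minimal sketch of the practices above, the following Python snippet saves a new numbered version of a file and records the change in a log. The `_vNN` suffix, the `CHANGELOG.txt` file name, and both function names are assumptions made for illustration; this guide does not prescribe them.

```python
import re
import shutil
from datetime import date
from pathlib import Path

# Hypothetical convention: versions carry a "_vNN" suffix, e.g. survey_v03.csv
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<num>\d+)$")


def next_version_path(path: Path) -> Path:
    """Return the path for the next version (survey_v03.csv -> survey_v04.csv)."""
    match = VERSION_PATTERN.match(path.stem)
    if match:
        stem, num = match["stem"], int(match["num"]) + 1
    else:
        # An unversioned file is treated as the starting point for version 1.
        stem, num = path.stem, 1
    return path.with_name(f"{stem}_v{num:02d}{path.suffix}")


def save_new_version(path: Path, note: str, log_name: str = "CHANGELOG.txt") -> Path:
    """Copy `path` to its next version and append a dated note to the change log."""
    new_path = next_version_path(path)
    if new_path.exists():
        raise FileExistsError(f"{new_path} already exists")
    shutil.copy2(path, new_path)
    # One tab-separated line per change: date, new file name, description.
    with (path.parent / log_name).open("a", encoding="utf-8") as log:
        log.write(f"{date.today().isoformat()}\t{new_path.name}\t{note}\n")
    return new_path
```

Calling `save_new_version(Path("survey_v01.csv"), "corrected column headers")` would leave the original untouched, create `survey_v02.csv` beside it, and add one line to the log. Obsolete versions can then be discarded once they have been backed up.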

Directory Structure Naming Conventions

Directory top-level folders should include the project title, a unique identifier, and the date (year).

The substructure should have a clear, consistent naming convention, e.g., uniform conventions for labeling each run of an experiment, each version of a dataset, and/or each person in the group.
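One way to keep such a structure consistent is to generate it with a script rather than by hand. The sketch below is illustrative only: the subfolder names (`data_raw`, `data_processed`, `docs`), the `run_NN` labels, and the function names are assumptions, not conventions required by this guide.

```python
from pathlib import Path


def top_level_name(title: str, identifier: str, year: int) -> str:
    """Combine project title, unique identifier, and year into one folder name."""
    # Spaces in the title are replaced with underscores to keep the name portable.
    return f"{'_'.join(title.split())}_{identifier}_{year}"


def create_project_tree(root: Path, title: str, identifier: str, year: int,
                        runs: int = 3) -> Path:
    """Create the top-level folder plus a uniformly named substructure."""
    project = root / top_level_name(title, identifier, year)
    # Hypothetical subfolders; adapt these to your own project.
    for sub in ("data_raw", "data_processed", "docs"):
        (project / sub).mkdir(parents=True, exist_ok=True)
    # Each experimental run gets the same zero-padded label: run_01, run_02, ...
    for n in range(1, runs + 1):
        (project / "data_raw" / f"run_{n:02d}").mkdir(exist_ok=True)
    return project
```

Zero-padding the run numbers keeps folders in the expected order when they are sorted alphabetically.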

File Naming Conventions

Identify the activity or project in the file name, as well as the date the dataset was created.

Avoid using special characters (e.g., $ % & # @ /), as these can easily be corrupted or misinterpreted by various operating systems or software.

Instead of using spaces, use CamelCase or Pot_hole_case, as some software and operating systems have trouble processing spaces.
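These rules can be applied automatically when files are created. The following is a minimal sketch, assuming a hypothetical `project_description_date.extension` pattern and an ISO 8601 date (YYYY-MM-DD); both are illustrative choices rather than requirements of this guide.

```python
import re
from datetime import date


def make_file_name(project: str, description: str, created: date, extension: str) -> str:
    """Build a name such as SoilStudy_pH_readings_2024-03-15.csv."""
    def clean(text: str) -> str:
        # Replace runs of whitespace with underscores, then drop anything
        # other than letters, digits, underscores, and hyphens.
        text = re.sub(r"\s+", "_", text.strip())
        return re.sub(r"[^A-Za-z0-9_-]", "", text)

    return f"{clean(project)}_{clean(description)}_{created.isoformat()}.{extension.lstrip('.')}"
```

An ISO 8601 date has the added benefit that files sort chronologically when listed by name.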

File Renaming

Renaming files individually can be tedious and can introduce errors into your naming conventions, so it is best to use a free batch renaming tool instead.

For macOS users, bulk renaming can be completed without additional software by following these directions
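Where a scripted approach is preferred, a few lines of Python can do the same job on any operating system. The sketch below replaces spaces with underscores in every file name in one folder; the function name and the `dry_run` parameter are hypothetical, and the renaming rule is only one example of a convention you might enforce.

```python
from pathlib import Path


def batch_rename(folder: Path, dry_run: bool = True) -> list[tuple[str, str]]:
    """Replace spaces with underscores in every file name directly inside `folder`.

    Returns the (old, new) name pairs. With dry_run=True nothing is changed,
    so the plan can be reviewed first.
    """
    planned = []
    for path in sorted(folder.iterdir()):
        if not path.is_file():
            continue
        new_name = path.name.replace(" ", "_")
        if new_name == path.name:
            continue
        target = path.with_name(new_name)
        if target.exists():
            # Never overwrite an existing file; skip it for manual review.
            continue
        planned.append((path.name, new_name))
        if not dry_run:
            path.rename(target)
    return planned
```

Running the function once with `dry_run=True` and checking the returned pairs before running it again with `dry_run=False` reduces the risk of an unwanted rename. Work on a backed-up copy of the folder where possible.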

File Naming Conventions for Specific Disciplines

For example:

DOE's Atmospheric Radiation Measurement (ARM) program

The Open Biological and Biomedical Ontologies

GIS Datasets

Unique Identifiers

Dataset identifiers allow your data to be referenced and shared. Data identifiers must be globally unique and persistent: they must not be repeated elsewhere, and they must not change over time.

Identifier schemes: