Data Management

This guide aims to help Tulane faculty, staff, and students manage, store, and share their research data.

File Organization

A well-planned project includes both a file naming convention and a directory structure, which together ease the research process and increase efficiency.

A brief yet descriptive file naming convention makes it easier to find files later and to determine quickly what they contain. The tips below also improve the cross-compatibility of files across programming languages and software.

File name tips:

  • Use names that are brief but descriptive.
  • Make sure all data producers use the same naming convention. 
  • Identify the version of the file.
  • Identify when the file was created.
  • Use a three-letter file extension (e.g. .rtf, .tif, .txt).
  • Avoid spaces and special characters (e.g. *, #, %).
  • Do not rely on letter case alone to distinguish files (e.g. datasetA.txt vs. dataseta.txt).
  • Include:
    • Project or experiment name or acronym,
    • Location/spatial coordinates,
    • Researcher name/initials,
    • Date or date range of experiment, and/or
    • Type of data.
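
Several of the tips above can be checked automatically. The following is a minimal sketch, not an official tool: the function names `check_name` and `case_collisions` are hypothetical, and the definition of a "safe" character (ASCII letters, digits, and underscores only) is an assumption made for illustration.

```python
import re
from pathlib import PurePath


def check_name(filename):
    """Return a list of problems found in one file name (empty if none)."""
    path = PurePath(filename)
    problems = []
    if " " in path.name:
        problems.append("contains spaces")
    # Assumption: only ASCII letters, digits, and underscores count as safe.
    if re.search(r"[^A-Za-z0-9_ ]", path.stem):
        problems.append("contains special characters")
    # Assumption: "three-letter" is read as three letters or digits (e.g. .mp3).
    if not re.fullmatch(r"\.[A-Za-z0-9]{3}", path.suffix):
        problems.append("extension is not three characters")
    return problems


def case_collisions(filenames):
    """Group file names that differ only by letter case."""
    groups = {}
    for name in filenames:
        groups.setdefault(name.lower(), []).append(name)
    return [names for names in groups.values() if len(names) > 1]
```

For example, `check_name("bb 432.csv")` reports the space, and `case_collisions(["datasetA.txt", "dataseta.txt"])` returns the two names as one colliding group.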

Filename Example:

SABOR_CK_04072014_S1_bb_432.csv

SABOR is the project name
CK is the data collector's first and last initials
04072014 is the date (DDMMYYYY) the data was collected
S1 is the station number/location of the data
bb is the variable collected (backscatter)
432 is the wavelength at which the data was collected
csv is the file extension, indicating the file type: ASCII comma-separated values

Instead of "bb 432" or "bb-432" use "bb_432".
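
A convention like this one can be generated and read back by a script, which helps every data producer apply it the same way. The sketch below assumes the six underscore-separated fields of the example file name, in that order, and the DDMMYYYY date format; the names `FIELDS`, `build_name`, and `parse_name` are hypothetical.

```python
from datetime import date, datetime

# Assumption: field order follows the SABOR example file name.
FIELDS = ("project", "initials", "collected", "station", "variable", "wavelength")


def build_name(project, initials, collected, station, variable, wavelength, ext="csv"):
    """Join the fields with underscores; the date is written as DDMMYYYY."""
    parts = [project, initials, collected.strftime("%d%m%Y"),
             station, variable, str(wavelength)]
    return "_".join(parts) + "." + ext


def parse_name(filename):
    """Split a file name built by build_name back into its fields."""
    stem, _, ext = filename.rpartition(".")
    values = stem.split("_")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    record = dict(zip(FIELDS, values))
    record["collected"] = datetime.strptime(record["collected"], "%d%m%Y").date()
    record["wavelength"] = int(record["wavelength"])
    record["extension"] = ext
    return record


name = build_name("SABOR", "CK", date(2014, 7, 4), "S1", "bb", 432)
record = parse_name(name)
```

Because fields are split on underscores, an individual field must not contain an underscore itself under this scheme.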

Suggested File Formats

Ideally, file formats are chosen during a project's planning stages, with a specific data repository or archive in mind.

When choosing a file format for your research, prefer formats with the following characteristics:

  • Non-proprietary
  • Uncompressed
  • Unencrypted
  • Commonly used by the general research community
  • Open, documented standards
  • Using standard character encodings (ASCII, UTF-8)
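
The character encoding guideline can be checked before a text file is deposited. This is a minimal sketch using only the Python standard library; the function name `is_utf8` is hypothetical. Since ASCII is a subset of UTF-8, a plain ASCII file also passes.

```python
def is_utf8(path):
    """Return True if the file's bytes decode cleanly as UTF-8."""
    with open(path, "rb") as handle:
        data = handle.read()
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True
```

The check reads the whole file into memory, so very large files may call for a chunked approach instead.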

Preferred File Formats:

  • Text: XML, PDF/A, HTML, ASCII, UTF-8 (not Word)
  • Tabular Data: CSV (not Excel)
  • Still Images: TIFF, JPEG 2000, PDF, PNG, BMP (not GIF or JPG)
  • Moving Images: MOV, MPEG, AVI, MXF (not Quicktime)
  • Sounds: WAVE, AIFF, MP3, MXF
  • Databases: XML, CSV
  • Statistics: ASCII, DTA, POR, SAS, SAV
  • Containers: TAR, GZIP, ZIP
  • Geospatial: SHP, DBF, GeoTIFF, NetCDF
  • Web Archive: WARC
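
The formats that the list above explicitly discourages (Word, Excel, GIF, JPG) can be flagged by file extension. The sketch below is illustrative only: the mapping from extensions to formats is an assumption, and the names `DISCOURAGED` and `suggest_format` are hypothetical.

```python
from pathlib import PurePath

# Assumption: these extensions stand in for the discouraged formats named above.
TEXT_FORMATS = "XML, PDF/A, HTML, ASCII, or UTF-8 text"
IMAGE_FORMATS = "TIFF, JPEG 2000, PDF, PNG, or BMP"
DISCOURAGED = {
    ".doc": TEXT_FORMATS,
    ".docx": TEXT_FORMATS,
    ".xls": "CSV",
    ".xlsx": "CSV",
    ".gif": IMAGE_FORMATS,
    ".jpg": IMAGE_FORMATS,
    ".jpeg": IMAGE_FORMATS,
}


def suggest_format(filename):
    """Return the preferred alternatives for a discouraged format, else None."""
    extension = PurePath(filename).suffix.lower()
    return DISCOURAGED.get(extension)
```

A return value of `None` means only that the extension is not in this short table, not that the format is acceptable to a given repository.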

Oregon State University maintains a table of other acceptable formats, in addition to the preferred file formats listed above.

Version Control

Version control is necessary when a file is updated regularly or managed by more than one person.  Tracking changes in files over time can be done manually or through version control systems.
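
As one way to track changes manually, a script can append a row to a log each time a file's contents change. This is a sketch under stated assumptions, not a replacement for a version control system: the log layout (file, version, timestamp, author, SHA-256 checksum, note) and the names `file_digest` and `record_version` are hypothetical.

```python
import csv
import hashlib
from datetime import datetime, timezone
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def record_version(path, log_path, author, note):
    """Log a new version if the file changed; return its version number."""
    log_path = Path(log_path)
    rows = []
    if log_path.exists():
        with log_path.open(newline="", encoding="utf-8") as handle:
            rows = [r for r in csv.reader(handle) if r and r[0] == str(path)]
    digest = file_digest(path)
    if rows and rows[-1][4] == digest:
        return int(rows[-1][1])  # unchanged since the last logged version
    version = len(rows) + 1
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerow([str(path), version, stamp, author, digest, note])
    return version
```

The log records who changed a file and when, but it does not keep the earlier contents; copies of old versions still need to be stored separately.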

Simple version control systems:

Google Drive:  

  • Any time you edit files created on Google Drive (Docs, Sheets, Slides), new versions are saved as you go.
  • Version information includes who was editing the file and the date and time the new version was created.
  • You can also see the changes made and revert to a previous version at any time.

Tulane Box:

Any type of document can be stored and versioned with Box.

  • Any time you edit a document or upload a new version, Box replaces the current version with the updated one. You do not need to rename new versions.
  • Box keeps track of your old versions should you want to restore a previous version. 
  • You can add comments to help indicate changes between versions.
  • Box allows you to share files and track who uploaded or updated each file and when.

Advanced version control systems:

GitHub: a hosting service for projects that use Git, a free and open-source distributed version control system.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.