
Data Management

This guide aims to help Tulane faculty, staff, and students manage, store, and share their research data.

Documentation and Metadata

Data documentation provides descriptive information about a data set and ensures its future use by you and others. Metadata is one way to document your data. Metadata and documentation should: 

  • Enable efficient organization of the research data
  • Facilitate discovery
  • Facilitate research data sharing
  • Identify the creator(s) of the data
  • Provide permanent identifiers for the data
  • Link the data to other related products, such as articles and other data
  • Support archiving and preservation

What to document?

It is important to begin documenting your data at the very start of your research project and to continue throughout the project. Doing so makes data documentation easier and reduces the likelihood that you will forget aspects of your data later in the research project. Don’t wait until the end to start documenting your research project and its data.

Research Project Documentation:

  • Context of data collection
  • Data collection methods
  • Structure, organization of data files
  • Data sources used 
  • Data validation, quality assurance
  • Transformations of data from the raw data through analysis
  • Information on confidentiality, access & use conditions

Dataset Documentation:

  • Variable names and descriptions
  • Explanation of codes and classification schemes used
  • Algorithms used to transform data
  • File format 
  • Software used, including its version and operating system
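One common way to capture the dataset-level items above is a data dictionary: a small table with one row per variable. The following is a minimal sketch in Python, not a required format. The variable names, column headings, and code values are hypothetical and should be replaced with your own.

```python
import csv
import io

# Hypothetical variables for illustration only; replace with your own.
VARIABLES = [
    {"name": "site_id", "description": "Identifier of the sampling site",
     "type": "string", "units": "", "codes": ""},
    {"name": "water_temp", "description": "Water temperature at collection",
     "type": "float", "units": "degrees Celsius", "codes": "-999 = missing"},
    {"name": "habitat", "description": "Habitat class of the site",
     "type": "integer", "units": "",
     "codes": "1 = marsh; 2 = swamp; 3 = open water"},
]

# Column headings of the data dictionary (an assumption, not a standard).
FIELDS = ["name", "description", "type", "units", "codes"]


def data_dictionary_csv(variables):
    """Return the data dictionary as CSV text, one row per variable."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(variables)
    return buffer.getvalue()


def undocumented(variables):
    """Return the names of variables that have no description."""
    return [v["name"] for v in variables if not v["description"].strip()]


if __name__ == "__main__":
    print(data_dictionary_csv(VARIABLES))
```

Saving the CSV text next to the data file keeps variable descriptions, units, and code explanations together with the data. The `undocumented` helper is a simple check for variables that still lack a description.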

How to document?

A readme file provides information about a data file to help ensure that the data can be interpreted correctly, whether by you at a later date or by others when you share or publish the data. Standards-based metadata is generally preferable, but where no appropriate standard exists, for internal use, writing “readme” style metadata is an appropriate strategy.

Want a readme file template? Cornell University provides a readme file template to help you get started. Feel free to adapt this template to meet your needs.  
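If you prefer to generate a readme from a script so that it stays consistent across data sets, the idea can be sketched in Python as follows. This is an illustrative sketch only: the field names and layout are assumptions and do not reproduce the Cornell template.

```python
from datetime import date


def build_readme(title, creator, contact, collected, files, notes=""):
    """Return plain-text readme content for a data set.

    `files` maps each file name to a one-line description.
    All field names here are hypothetical; adapt them to your project.
    """
    lines = [
        f"Title: {title}",
        f"Creator: {creator}",
        f"Contact: {contact}",
        f"Date of data collection: {collected}",
        f"Readme last updated: {date.today().isoformat()}",
        "",
        "Files:",
    ]
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    if notes:
        lines += ["", "Notes:", f"  {notes}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    text = build_readme(
        title="Example water quality survey",
        creator="A. Researcher",
        contact="researcher@example.org",
        collected="2023-06-01 to 2023-08-31",
        files={"survey.csv": "Raw field measurements",
               "codebook.csv": "Variable names and descriptions"},
        notes="Missing values are coded as -999.",
    )
    print(text)
```

Writing the returned text to a file named, for example, `README.txt` in the same folder as the data keeps the documentation with the files it describes.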

What is metadata?

Metadata is data about your data.  

  • Answers who, what, why, when, where and how about your data set.
  • Often includes the purpose, time, geographic location, creator, access, variables, variable units, and terms of use of the data.
  • Is organized using a schema, which allows the data set to be easily indexed and retrieved from a repository.

What metadata schema should I use?

Many disciplines have metadata schemas designed specifically for their types of data.  Funders may require the data that you will share to use a specific schema. Repositories may require that all data submitted to them for deposit use a specific standard (a standard identifies the required schema). It is best practice to choose a repository early in a project so you can identify which standard you will be required to use.

Example metadata standards and schemas:

  • DataCite: schema used for a DOI assigned to a dataset.
  • DDI (Data Documentation Initiative): international standard for social, behavioral and economic sciences. 
  • Dublin Core: basic and widely used standard. 
  • EML (Ecological Metadata Language): ecological standard supported by the Ecological Society of America. 
  • ISO 19115 or FGDC's Content Standard for Digital Geospatial Metadata: used to describe geospatial data. 
  • MIBBI (Minimum Information for Biological and Biomedical Investigations)

A more comprehensive list of metadata standards and schemas is provided by the UK's Digital Curation Centre (DCC).
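To show what schema-based metadata can look like in practice, here is a minimal sketch that builds a simple Dublin Core record as XML with Python's standard library. The element names (`title`, `creator`, `date`, `description`, `rights`) come from the Dublin Core element set. The `metadata` wrapper element and all values are hypothetical, and a repository may expect a different wrapper or additional fields.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a wrapper element with one dc:* child per (element, value) pair.

    The wrapper name "metadata" is an assumption made for illustration.
    """
    root = ET.Element("metadata")
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Placeholder values for illustration only.
record = dublin_core_record([
    ("title", "Example water quality survey"),
    ("creator", "A. Researcher"),
    ("date", "2023-08-31"),
    ("description", "Field measurements of water temperature by site."),
    ("rights", "CC BY-NC 4.0"),
])

xml_text = ET.tostring(record, encoding="unicode")

if __name__ == "__main__":
    print(xml_text)
```

Because the element names are fixed by the schema, a repository can index records like this one by title, creator, or date without knowing anything else about the data set.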

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.