
Data Curation, Preservation, and Reuse

This guide aims to help researchers through the data curation process at any stage in a project's lifecycle.

What are data citations?

Like article or book citations, data citations help researchers and research stakeholders preserve research data. By creating a data citation, you will have a structured way of locating your data in the vast data ecosystem. According to the Research Data Management Service Group at Cornell University, data citation provides the following benefits:

Benefits for data producers:

  • provides proper attribution and credit
  • creates a bibliographic "trail", connecting publications and supporting data, and establishing a timeline of publication and usage
  • demonstrates the impact of their work and establishes research data as an important contribution to the scholarly record

Benefits for data users:

  • citation makes it easier to find datasets
  • supports persistence of datasets
  • encourages the reuse of data for new research questions

Benefits for everyone:

  • increases transparency and reproducibility

Reference: Data citation | Research Data Management Service Group (cornell.edu)

Aspects of a Data Citation

While citation conventions vary between disciplines and stakeholder requirements, the Joint Declaration of Data Citation Principles generally recommends the following citation components in all contexts:

Author(s)

Dataset Name/Title

Publication Year

Publisher (Repository, Data Archive, etc.)

Edition/Version

Software used for analysis (Optional)

Persistent Identifier (DOI, URL, etc.), normally provided by the hosting institution
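
As a rough illustration of how the components above fit together, the following Python sketch assembles them into a single citation string. The `DataCitation` class, its field names, and the comma-separated ordering are assumptions made for this example, loosely modelled on the Dataverse-style citations in this guide; they are not a format prescribed by any repository or style guide. The optional "software used for analysis" component is left out for brevity. Always check the conventions of your own repository or discipline.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class DataCitation:
    """Hypothetical container for the recommended citation components."""

    authors: list[str]                # e.g. ["Guan, Lu"]
    title: str                        # dataset name/title
    year: int                         # publication year
    publisher: str                    # repository or data archive
    version: Optional[str] = None     # edition/version, if the repository assigns one
    identifier: Optional[str] = None  # persistent identifier such as a DOI URL

    def format(self) -> str:
        """Join the components with commas (an assumed, illustrative ordering)."""
        parts = [
            "; ".join(self.authors),
            str(self.year),
            f'"{self.title}"',
            self.publisher,
        ]
        # Optional components are appended only when they are present.
        if self.version:
            parts.append(self.version)
        if self.identifier:
            parts.append(self.identifier)
        return ", ".join(parts)


# Components taken from one of the example citations in this guide.
example = DataCitation(
    authors=["Guan, Lu"],
    title=(
        "Census of Twitter Users: Scraping and Describing "
        "the National Network of South Korea"
    ),
    year=2022,
    publisher="Harvard Dataverse",
    version="V6",
    identifier="https://doi.org/10.7910/DVN/9GRCYU",
)

print(example.format())
```

Keeping the components as separate fields, rather than storing one pre-formatted string, makes it easier to re-render the same dataset citation in whichever style a journal or repository requires.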

Repository Guidelines

Visit the following websites for more information on data citations for specific repositories and disciplines:

Dataverse

Dryad

Roper Center (Cornell)

Data Citation Examples

Reference the following data citations when creating citations for your project.

Hanmer, Michael J.; Banks, Antoine J.; White, Ismail K., 2013, “Replication data for: Experiments to Reduce the Over-reporting of Voting: A Pipeline to the Truth”, Harvard Dataverse, V1, http://dx.doi.org/10.7910/DVN/22893 UNF:5:eJOVAjDU0E0jzSQ2bRCg9g==

Guan, Lu, 2022, "Census of Twitter Users: Scraping and Describing the National Network of South Korea", https://doi.org/10.7910/DVN/9GRCYU, Harvard Dataverse, V6, UNF:6:sFlPT5eaD7gIOLoOuWr5MA== [fileUNF]

Armbruster, Jonathan; Lujan, Nathan; Armbruster, Jonathan W.; Lujan, Nathan K. (2016), Data from: A new species of Peckoltia from the Upper Orinoco (Siluriformes: Loricariidae), Dryad, Dataset, https://doi.org/10.5061/dryad.f94f2

George Mason University Center for Climate Change Communication/Yale University Project on Climate Change Communication. (2022). Climate Change in the American Mind Survey (Version 1) [Dataset]. Cornell University, Ithaca, NY: Roper Center for Public Opinion Research. doi:

O’Donohue, W. (2017). Content analysis of undergraduate psychology textbooks (ICPSR 21600; Version V1) [Data set]. ICPSR. https://doi.org/10.3886/ICPSR36966.v1

Data Citation Recommendations from Style Guides

Style and citation guides also provide recommendations for citing data.

APA

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.