Skip to Main Content

Tools for Organizing and Cleaning Data Workshop

An introductory guide to methods and tools for cleaning, organizing, and exporting data.

Before You Export

Before you export your data:

1. Think carefully about the kind of data you are working with

  • Does the data include commas and/or quotation marks?
    • If so, you may want to use TSV instead of CSV
  • Does the data include added-value elements such as links, visualizations, or multiple sheets that will not export unless you keep a particular file format?
    • If so, you may want to maintain the Excel file formatting. You could also export the data in multiple files: one with visualizations, one for each sheet, or one with link values and another with the rest of the data.

2. Check the import options of any tools you may want to use with this data in the future

  • Does the tool anticipate a particular encoding?
    • Use the encoding necessary. Almost every software that imports data will accept CSV, and most accept TSV.
  • Does the tool allow you to customize which delimiting character should be used to mark columns and rows?
    • If not, you will probably want to use the delimiting character it expects (usually a comma or tab).
  • Does the tool accept proprietary formats (such as xlsx) or does it require data in csv/txt formats?
    • If the tool accepts XLSX and your data is natively in XLSX, you should try to stick with that. Every time you export data into a new type of file, you run the risk of creating new errors.
  • Are you uploading the file to share it with a wide audience or in hopes of preserving it for the long term?
    • In this case, try to save your file as CSV or TXT...and verify that the encoding treats your data accurately. This will make sure that others using a different platform or software should be able to access your data. It also ensures that the code necessary for reading these files will still exist in 10-20 years, unlike the proprietary and often secret code necessary to read excel files.

Exporting from a Spreadsheet Tool

Visualization Options in Excel

Not all visualization options require export. Another option is to use the visualization options in Excel and export those using Copy and Paste. The easiest way to begin exploring some of these options is by using "Insert" > "Recommended Charts".

 

Recommended Visualization

Excel will creation a visualization based on it's understanding of your data. By selecting "Recommended Charts" you can see a quick preview of the results of the recommended charts. The second tab, "All Charts" will let you see other chart options. Finally, once you select a chart, you can edit the columns, names, and values displayed using the filter option to the left of the chart.

For this set of data, I recommend creating a visualization from pivot chart on the cuisine list created on the Organizing tab, or creating a new Pivot table for "Restaurant" and using "Recommended Chart with that data highlighted.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.