Data Cleaning and Inspection with Python

This guide provides an introduction to techniques used to clean data in Python.

Looking to clean and inspect your data with Python?


Use this guide to learn more about using the pandas library to clean and inspect your data.

This guide introduces the following techniques (the section covering each one is shown in parentheses):

  • Import .csv files using pandas (Importing CSV files w/ Pandas)
  • Create data frames from .csv files with pandas (Importing CSV files w/ Pandas)
  • Generate descriptive information about data frames (Generating Descriptive Statistics)
  • Check data types (Checking and Changing Data Type)
  • Change data types (Checking and Changing Data Type)
  • Count the missing values in a column (Handling Missing Data)
  • Delete rows with missing data (Handling Missing Data)
  • Replace missing values with mean values (Handling Missing Data)
  • Delete unwanted columns (Edits to Columns)
  • Split text columns (Edits to Columns)
  • Rename columns (Edits to Columns)
  • Create new columns based on a condition (Edits to Columns)
  • Group records based on certain attributes (Grouping Based on Attributes)
  • Save an edited data frame to a .csv file (Saving Data to a CSV file)
  • Create data reports with SweetViz (SweetViz Library)
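
As a quick preview, several of the techniques listed above can be sketched in a few lines of pandas. This is a minimal illustration, not the guide's own code: the sample data, the column names (name, dept, score), and the passing threshold of 85 are all made up for the example, and the CSV is read from an in-memory string so the snippet runs without a file on disk.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real .csv file.
csv_text = """name,dept,score
Ada Lovelace,math,90
Alan Turing,cs,
Grace Hopper,cs,80
"""

# Import the CSV into a data frame (pass a file path for a real file).
df = pd.read_csv(io.StringIO(csv_text))

# Inspect the data: summary statistics, data types, missing values per column.
summary = df.describe()
dtypes = df.dtypes
missing_counts = df.isna().sum()

# Delete rows with missing data (returns a new data frame).
dropped = df.dropna()

# Alternatively, replace missing scores with the column mean.
df["score"] = df["score"].fillna(df["score"].mean())

# Change a data type (float to integer).
df["score"] = df["score"].astype(int)

# Split a text column into two new columns, then delete the original.
df[["first_name", "last_name"]] = df["name"].str.split(" ", n=1, expand=True)
df = df.drop(columns=["name"])

# Rename a column.
df = df.rename(columns={"dept": "department"})

# Create a new column based on a condition (85 is an arbitrary threshold).
df["passed"] = df["score"] >= 85

# Group records by an attribute and average the scores in each group.
mean_by_department = df.groupby("department")["score"].mean()

# Save the edited data frame as CSV text (pass a path to write a file).
csv_out = df.to_csv(index=False)
```

Each step is covered in more detail in the section named next to it in the list. With a real dataset, replace the in-memory string with the path to your .csv file in `pd.read_csv` and pass a file name to `to_csv`.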


Python is a programming language that can be used for a variety of data-intensive tasks, from descriptive analytics to machine learning. With the help of community-created libraries, you can use Python to tackle your next academic, professional, or personal data project. For more information about Python, see the Welcome to Python site.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.