Hathi is pronounced "hah-tee" and is the Hindi word for elephant. HathiTrust is a partnership of academic and research institutions that offers a collection of millions of titles digitized from libraries around the world.
The HathiTrust Research Center (HTRC) enables computational analysis of the HathiTrust corpus. It is a collaborative research center launched jointly by Indiana University and the University of Illinois, along with HathiTrust, to help meet the technical challenges researchers face when dealing with massive amounts of digital text. It develops cutting-edge software tools and cyberinfrastructure to enable advanced computational access to the growing digital record of human knowledge.
This is the LibGuide for the workshop we offer each semester on basic data cleaning and organizing tools. It also describes how to add coordinates in Google Sheets to a spreadsheet of addresses, which can help you create a dataset suitable for the mapping tools discussed in other workshops.
This is the lesson plan for an intermediate-level data cleaning workshop we offer once a year or on request. It shows how OpenRefine can clean data in ways that go beyond the capabilities of Excel or Google Sheets, and it focuses on incorporating small bits of code in a way that is accessible to non-programmers.
Next Steps: Carrying New Skills Into The Classroom