Analyze Digital Text as Data

A guide to introduce and support researchers interested in distant reading approaches to digital texts.

Web Scraping

Web scraping (also known as web harvesting or web data extraction) is the extraction of data, usually text, from websites. It can be done manually or automatically, typically with a program that crawls a website's pages and collects their content.
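As a rough illustration of the automated approach, the sketch below pulls the visible text out of an HTML page using only the Python standard library. The names `TextExtractor`, `extract_text`, and `fetch_html`, as well as the user-agent string, are hypothetical choices made for this example and do not come from any particular scraping tool. Real projects often use third-party libraries instead, and pages that build their content with JavaScript will need a different approach.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the text fragments of an HTML string joined by single spaces."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def fetch_html(url, user_agent="example-research-scraper/0.1"):
    """Download one page. The user-agent value is a placeholder assumption."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


if __name__ == "__main__":
    # Offline demonstration on an inline HTML snippet (no network access).
    sample = (
        "<html><head><style>p {color: red}</style></head>"
        "<body><h1>Title</h1><p>First paragraph</p>"
        "<script>var x = 1;</script></body></html>"
    )
    print(extract_text(sample))
```

To scrape a live page, the two functions could be combined as `extract_text(fetch_html(url))`, where `url` is a page you are permitted to collect.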

Take care to understand the ethical and legal implications of this activity, which vary from site to site. Please consult the link below for additional details and methods.
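One part of this check that can be automated is reading a site's robots.txt file, which states which paths the site asks crawlers to avoid. The sketch below uses Python's standard `urllib.robotparser` module on an inline sample file. The sample rules, the `example.org` URLs, the agent name, and the helper `build_robots_parser` are all assumptions made for illustration. Passing this check does not settle the legal or ethical question, since a site's terms of use and applicable law still apply.

```python
from urllib.robotparser import RobotFileParser


def build_robots_parser(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


# Hypothetical robots.txt content, used here instead of a live download.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

AGENT = "example-research-scraper"  # placeholder crawler name

if __name__ == "__main__":
    parser = build_robots_parser(SAMPLE_ROBOTS)
    # Ask whether this agent may fetch each URL under the sample rules.
    print(parser.can_fetch(AGENT, "https://example.org/articles/1"))
    print(parser.can_fetch(AGENT, "https://example.org/private/data"))
    # Requested pause between requests, in seconds, if the file states one.
    print(parser.crawl_delay(AGENT))
```

For a live site, `RobotFileParser` can also load the file itself through its `set_url` and `read` methods before `can_fetch` is called.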

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.