Corpus Data

This site contains downloadable, full-text corpus data from ten large corpora of English -- iWeb, COCA, COHA, NOW, Coronavirus, GloWbE, TV Corpus, Movies Corpus, SOAP Corpus, Wikipedia -- as well as the Corpus del Español and the Corpus do Português. The data is being used at hundreds of universities throughout the world, as well as in a wide range of companies.
With this full-text data, you have the actual corpora on your computer, and you can use the data in any way that you'd like. The data for each corpus comes in three different formats: data for relational databases, word/lemma/PoS, and words (paragraph format). When you purchase the data, you purchase the rights to any and all of these formats.