Search results
Content. The Corpus of Contemporary American English (COCA) is composed of one billion words as of November 2021. [1][2][4] The corpus is constantly growing: in 2009 it contained more than 385 million words; [5] in 2010 it grew to 400 million words; [6] and by March 2019 it had grown to 560 million words. [7]
The table also includes frequencies from other corpora. Beyond differences in usage, lemmatisation may differ from corpus to corpus, for example in whether the prepositional use of "to" is split from its use as a particle. In addition, the Corpus of Contemporary American English (COCA) list uses dispersion as well as frequency to calculate rank.
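To illustrate how dispersion can change a frequency-based rank, here is a minimal sketch. The snippet above does not say which dispersion measure or formula the COCA list uses, so the choice of Juilland's D, the scoring rule (total frequency multiplied by D), the function names, and the toy counts are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, pstdev


def juilland_d(part_freqs):
    """Juilland's D for a word's counts across equal-sized corpus parts.

    D is close to 1 when the word is spread evenly and close to 0 when
    it is concentrated in a single part.
    """
    n = len(part_freqs)
    m = mean(part_freqs)
    if n < 2 or m == 0:
        return 0.0
    cv = pstdev(part_freqs) / m  # coefficient of variation
    return 1 - cv / sqrt(n - 1)


def rank_words(counts_by_part):
    """Rank words by total frequency weighted by dispersion (assumed rule)."""
    scores = {
        word: sum(freqs) * juilland_d(freqs)
        for word, freqs in counts_by_part.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical counts in four equal-sized sections of a corpus.
counts = {
    "the": [100, 98, 102, 100],
    "genome": [0, 0, 0, 120],  # frequent, but only in one section
    "walk": [20, 22, 18, 20],
}
ranking = rank_words(counts)
print(ranking)
```

In this toy data "genome" has a higher raw count than "walk" (120 against 80), but because all of its occurrences fall in one section its dispersion is near zero, so it ranks below "walk".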
Corpus of Contemporary American English (COCA): 425 million words, 1990–2011; freely searchable online. Corpus Resource Database (CoRD): more than 80 English language corpora. [2] Coruña Corpus: a corpus of late Modern English scientific writing covering the period 1700–1900, developed by the Muste research group at the University of A Coruña.
Mark E. Davies (born 1963) is an American linguist. He specializes in corpus linguistics and language variation and change. He is the creator of most of the text corpora from English-Corpora.org (including the Corpus of Contemporary American English, or COCA) as well as the Corpus del español and the Corpus do português.
Word list. A word list (or lexicon) is a list of a language's lexicon (generally sorted by frequency of occurrence either by levels or as a ranked list) within some given text corpus, serving the purpose of vocabulary acquisition. A lexicon sorted by frequency "provides a rational basis for making sure that learners get the best return for ...
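As a rough illustration of a ranked word list, the sketch below counts tokens in a short text and sorts them by frequency of occurrence. The tokenizer (lowercased runs of letters and apostrophes), the function name, and the sample sentence are assumptions for illustration; real word lists are built from large corpora and usually involve lemmatisation.

```python
import re
from collections import Counter


def frequency_list(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    # Naive tokenization: lowercase, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens).most_common()


# Hypothetical sample text standing in for a corpus.
sample = "the cat sat on the mat and the dog sat too"
ranked = frequency_list(sample)

for rank, (word, count) in enumerate(ranked[:3], start=1):
    print(rank, word, count)
```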
The Cambridge International Corpus (CIC) is a collection of over 800 million words of real spoken and written English. The texts are stored in a database that can be searched to see how English is used. The CIC also contains the Cambridge Learner Corpus, a unique collection of over 60,000 exam papers from Cambridge ESOL.