Search results
The Cambridge International Corpus (CIC) is a collection of over 2 billion words [1] of real spoken and written English. The texts are stored in a database that can be searched to see how English is used. The CIC also contains the Cambridge Learner Corpus, a unique collection of over 60,000 exam papers from Cambridge ESOL.
Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by corpus linguists and within other branches of linguistics for statistical analysis, hypothesis testing, finding patterns of language use, investigating language change and variation, and teaching language proficiency.
This is a comparison of English dictionaries, which are dictionaries about the language of English. The dictionaries listed here are categorized into "full-size" dictionaries (which extensively cover the language, and are targeted to native speakers), "collegiate" (which are smaller, and often contain other biographical or geographical information useful to college students), and "learner's ...
The Cambridge English Profile Corpus (CEPC) is a corpus of learner English produced by students worldwide, and is being built by Cambridge University Press and the Cambridge English Language Assessment, in collaboration with a network of participating educational establishments across the world. These establishments include schools ...
The Teacher Word Book contains 30,000 lemmas or ~13,000 word families (Goulden, Nation and Read, 1990). A corpus of 18 million written words was analysed by hand. The size of its source corpus increased its usefulness, but its age, and language changes, have reduced its applicability (Nation 1997). The General Service List (West, 1953)
Each corpus contains one million words in 500 texts of 2000 words, [7] following the sampling methodology used for the Brown Corpus. Unlike Brown or the Lancaster-Oslo-Bergen (LOB) Corpus (or indeed mega-corpora such as the British National Corpus), however, the majority of texts are derived from spoken data.
Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). [1] Corpora are balanced, often stratified collections of authentic, "real world", text of speech or writing that aim to represent a given linguistic variety. [1]
Stefan Th. Gries (['ʃtɛfɐn 'tʰoːmɐs 'ɡʁiːs]) is Professor of Linguistics in the Department of Linguistics at the University of California, Santa Barbara (UCSB), Honorary Liebig-Professor of the Justus-Liebig-Universität Giessen (since September 2011), [1] and since 1 April 2018 also Chair of English Linguistics [2] (Corpus Linguistics with a focus on quantitative methods, 25%) in the ...