Search results
Collocation extraction is the task of using a computer to extract collocations automatically from a corpus. The traditional method is to apply a formula, based on statistical quantities of the words involved, that calculates a score for every word pair.
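As a rough illustration of such a scoring formula, the sketch below uses pointwise mutual information (PMI) over adjacent word pairs. PMI is only one of several association measures and is an assumption here, not something the snippet names; the function name `pmi_scores` and the toy sentence are likewise hypothetical.

```python
import math
from collections import Counter

def pmi_scores(tokens, min_count=1):
    """Score each adjacent word pair with pointwise mutual information.

    PMI is an assumed choice of association measure for this sketch.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)
    n_bi = max(len(tokens) - 1, 1)
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # skip pairs seen too rarely to score
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        # log2 of observed pair probability over the probability
        # expected if the two words occurred independently
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores

# Toy corpus, invented for illustration only
tokens = "strong tea and strong coffee but weak tea".split()
scores = pmi_scores(tokens)
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

On a corpus this small the scores mean little; in practice a minimum frequency threshold (the `min_count` parameter here) is usually needed, because PMI tends to overrate pairs of rare words.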
His Corpus, Concordance, Collocation formulated the "idiom principle". [4] Though he had written many books, at his valedictory lecture in 2000 he stated that none of his many published articles passed successfully through peer-review, and that even an article he had been invited to write for a journal was peer-reviewed by mistake and rejected.
A word sketch triple is a triple consisting of headword, grammatical relation, collocation (e.g. man, modifier, young). Considering an underlying text corpus, a word sketch quintuple is a quintuple consisting of headword, grammatical relation, collocation, position of headword in the corpus, position of collocation in the corpus (e.g. man, modifier, young, 104, 103).
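The two structures can be sketched as plain records. The class and field names below (`SketchTriple`, `SketchQuintuple`, `headword_pos`, and so on) are hypothetical and chosen for illustration; they are not taken from any particular tool.

```python
from typing import NamedTuple

class SketchTriple(NamedTuple):
    headword: str
    relation: str      # grammatical relation, e.g. "modifier"
    collocate: str

class SketchQuintuple(NamedTuple):
    headword: str
    relation: str
    collocate: str
    headword_pos: int   # token position of the headword in the corpus
    collocate_pos: int  # token position of the collocate in the corpus

    def triple(self):
        """Drop the corpus positions, leaving the corpus-independent triple."""
        return SketchTriple(self.headword, self.relation, self.collocate)

# The example from the text: "young" at position 103 modifies "man" at 104
q = SketchQuintuple("man", "modifier", "young", 104, 103)
print(q.triple())
```

Keeping the positions in the quintuple makes it possible to go back from a collocation to the exact place in the corpus where it was observed.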
Each of the modules offers a number of other features in relation to the text corpus or text being analysed. Thus, for example, collocation and dispersion plots are computed with a concordance search. In addition, there are a number of further modules that are useful for preparing, cleaning up and formatting the text corpus.
Sketch Engine is a product of Lexical Computing, a company founded in 2003 by the lexicographer and research scientist Adam Kilgarriff. [4] He started a collaboration with Pavel Rychlý, a computer scientist working at the Natural Language Processing Centre, Masaryk University, [5] and the developer of Manatee and Bonito (two major parts of the software suite).
Corpus linguists specify a key word in context and identify the words immediately surrounding it, to illustrate the way words are used in practice. The processing of collocations involves a number of parameters, the most important of which is the measure of association, which evaluates whether the co-occurrence is purely by chance or ...
In recent years, linguists have used corpus linguistics and concordancing software to find such hidden associations. Specialised software is used to arrange key words in context from a corpus of several million words of naturally occurring text. The collocates can then be arranged alphabetically according to first or second word to the right or ...
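A minimal sketch of a key-word-in-context (KWIC) listing, with the lines then sorted by the first word to the right of the key word. The function names `kwic` and `sort_by_right`, the window size, and the sample sentence are assumptions made for illustration; real concordancers work on far larger corpora and offer many more sort options.

```python
def kwic(tokens, keyword, window=2):
    """Return (left context, keyword, right context) for every hit."""
    rows = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = tokens[max(0, i - window):i]
            right = tokens[i + 1:i + 1 + window]
            rows.append((left, tok, right))
    return rows

def sort_by_right(rows, offset=1):
    """Sort alphabetically by the offset-th word to the right of the keyword."""
    def key(row):
        right = row[2]
        # hits too close to the end of the text sort first
        return right[offset - 1] if len(right) >= offset else ""
    return sorted(rows, key=key)

# Toy text, invented for illustration only
tokens = "she drank strong tea while he drank weak tea and cold coffee".split()
rows = sort_by_right(kwic(tokens, "tea"))
for left, word, right in rows:
    print(f"{' '.join(left):>15}  [{word}]  {' '.join(right)}")
```

Sorting on a neighbouring word groups identical contexts together, which is what makes recurring collocates stand out when scanning the listing.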
Corpus of Contemporary American English (COCA) 425 million words, 1990–2011. Freely searchable online; Corpus Resource Database (CoRD), more than 80 English language corpora. [2] Coruña Corpus, a corpus of late Modern English scientific writing covering the period 1700–1900, developed by the Muste research group at the University of A Coruña