Search results
Results from the WOW.Com Content Network
Each of the modules offers a number of other features in relation to the text corpus or text being analysed. Thus, for example, collocation and dispersion plots are computed with a concordance search. In addition, there are a number of additional modules that are useful for the preparation, clean-up and format the text corpus.
His Corpus, Concordance, Collocation formulated the "idiom principle". [4] Though he had written many books, at his valedictory lecture in 2000 he stated that none of his many published articles passed successfully through peer-review, and that even an article he had been invited to write for a journal was peer-reviewed by mistake and rejected.
Key Word In Context (KWIC) is the most common format for concordance lines. The term KWIC was coined by Hans Peter Luhn. [1] The system was based on a concept called keyword in titles, which was first proposed for Manchester libraries in 1864 by Andrea Crestadoro.
Collocation extraction is the task of using a computer to extract collocations automatically from a corpus.. The traditional method of performing collocation extraction is to find a formula based on the statistical quantities of those words to calculate a score associated to every word pairs.
A word sketch triple is a triple consisting of headword, grammatical relation, collocation (e.g. man, modifier, young).Considering an underlying text corpus, a word sketch quintuple is a quintuple consisting of headword, grammatical relation, collocation, position of headword in the corpus, position of collocation in the corpus (e.g. man, modifier, young, 104, 103).
Corpus linguists specify a key word in context and identify the words immediately surrounding them, to illustrate the way words are used in practice. The processing of collocations involves a number of parameters, the most important of which is the measure of association, which evaluates whether the co-occurrence is purely by chance or ...
Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). [1] Corpora are balanced, often stratified collections of authentic, "real world", text of speech or writing that aim to represent a given linguistic variety . [ 1 ]
A concordancer is a computer program that automatically constructs a concordance.The output of a concordancer may serve as input to a translation memory system for computer-assisted translation, or as an early step in machine translation.