enow.com Web Search

Search results

  1. Results from the WOW.Com Content Network
  2. International Corpus of English - Wikipedia

    en.wikipedia.org/.../International_Corpus_of_English

    Each corpus contains one million words in 500 texts of 2000 words, [7] following the sampling methodology used for the Brown Corpus. Unlike Brown or the Lancaster-Oslo-Bergen (LOB) Corpus (or indeed mega-corpora such as the British National Corpus), however, the majority of texts are derived from spoken data.

  3. Text corpus - Wikipedia

    en.wikipedia.org/wiki/Text_corpus

    When the language of the corpus is not a working language of the researchers who use it, interlinear glossing is used to make the annotation bilingual. Some corpora have further structured levels of analysis applied. In particular, smaller corpora may be fully parsed. Such corpora are usually called Treebanks or Parsed Corpora. The difficulty ...

  4. Word list - Wikipedia

    en.wikipedia.org/wiki/Word_list

    Some major pitfalls are the corpus content, the corpus register, and the definition of "word". While word counting is a thousand years old, with still gigantic analysis done by hand in the mid-20th century, natural language electronic processing of large corpora such as movie subtitles (SUBTLEX megastudy) has accelerated the research field.

  5. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Corpus Resource Database (CoRD), more than 80 English language corpora. [2] Coruña Corpus, a corpus of late Modern English scientific writing covering the period 1700–1900, developed by the Muste research group at the University of A Coruña; DBLP Discovery Dataset (D3), a corpus of computer science publications with sentient metadata. [3]

  6. Longman Grammar of Spoken and Written English - Wikipedia

    en.wikipedia.org/wiki/Longman_Grammar_of_Spoken...

    While targeting "English language students and researchers" (p. 45), an abridged version of the grammar was released in 2002, Longman Student Grammar of Spoken and Written English, together with a workbook entitled Longman Student Grammar of Spoken and Written English Workbook, to be used by students on university and teacher-training courses.

  7. Corpus linguistics - Wikipedia

    en.wikipedia.org/wiki/Corpus_linguistics

    Corpus linguistics is an empirical method for the study of language by way of a text corpus (plural corpora). [1] Corpora are balanced, often stratified collections of authentic, "real world", text of speech or writing that aim to represent a given linguistic variety. [1] Today, corpora are generally machine-readable data collections.

  8. Corpus of Contemporary American English - Wikipedia

    en.wikipedia.org/wiki/Corpus_of_Contemporary...

    The Corpus of Contemporary American English (COCA) is composed of one billion words as of November 2021. [1] [2] [4] The corpus is constantly growing: In 2009 it contained more than 385 million words; [5] in 2010 the corpus grew in size to 400 million words; [6] by March 2019, [7] the corpus had grown to 560 million words.

  9. Category:English corpora - Wikipedia

    en.wikipedia.org/wiki/Category:English_corpora
