enow.com Web Search

Search results

  1. Results from the WOW.Com Content Network
  2. British National Corpus - Wikipedia

    en.wikipedia.org/wiki/British_National_Corpus

    The British National Corpus (BNC) is a 100-million-word text corpus of samples of written and spoken English from a wide range of sources. [1] The corpus covers British English of the late 20th century from a wide variety of genres, with the intention that it be a representative sample of spoken and written British English of that time.

  3. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by corpus linguists and within other branches of linguistics for statistical analysis, hypothesis testing, finding patterns of language use, investigating language change and variation, and teaching language proficiency.

  4. General Service List - Wikipedia

    en.wikipedia.org/wiki/General_Service_List

    The General Service List (GSL) is a list of roughly 2,000 words published by Michael West in 1953. [1] The words were selected to represent the most frequent words of English and were taken from a corpus of written English. The target audience was English language learners and ESL teachers. To maximize the utility of the list, some frequent ...

  5. CLAWS (linguistics) - Wikipedia

    en.wikipedia.org/wiki/CLAWS_(linguistics)

    The CLAWS4 was used for the 100-million-word British National Corpus (BNC). A general-purpose grammatical tagger, it is a successor of the CLAWS1 tagger. [11] In tagging the BNC, the many rounds of work that went into CLAWS4 focused on making the CLAWS program independent from the tagsets.

  6. Most common words in English - Wikipedia

    en.wikipedia.org/wiki/Most_common_words_in_English

    A list of 100 words that occur most frequently in written English is given below, based on an analysis of the Oxford English Corpus (a collection of texts in the English language, comprising over 2 billion words). [1]

  7. Text corpus - Wikipedia

    en.wikipedia.org/wiki/Text_corpus

    An example of annotating a corpus is part-of-speech tagging, or POS-tagging, in which information about each word's part of speech (verb, noun, adjective, etc.) is added to the corpus in the form of tags. Another example is indicating the lemma (base) form of each word. When the language of the corpus is not a working language of the ...

  8. Lancaster-Oslo-Bergen Corpus - Wikipedia

    en.wikipedia.org/wiki/Lancaster-Oslo-Bergen_Corpus

    The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen, to provide a British counterpart to the Brown Corpus compiled by Henry Kučera and W. Nelson Francis for American English in ...

  9. Bank of English - Wikipedia

    en.wikipedia.org/wiki/Bank_of_English

    The Bank of English totals 650 million running words. [1] Copies of the corpus are held both at HarperCollins Publishers and the University of Birmingham. The version at Birmingham can be accessed for academic research. The Bank of English forms part of the Collins Word Web together with the French, German and Spanish corpora.