enow.com Web Search

Search results

  2. Brown Corpus - Wikipedia

    en.wikipedia.org/wiki/Brown_Corpus

    The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples of American English, the first major structured corpus of varied genres. This corpus first set the bar for the scientific study of the frequency and distribution of word categories in ...

  3. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by corpus linguists and within other branches of linguistics for statistical analysis, hypothesis testing, finding patterns of language use, investigating language change and variation, and teaching language proficiency.

  4. Most common words in English - Wikipedia

    en.wikipedia.org/wiki/Most_common_words_in_English

    The researchers published their analysis of the Brown Corpus in 1967. Their findings were similar, but not identical, to the findings of the OEC analysis. According to The Reading Teacher's Book of Lists, the first 25 words in the OEC make up about one-third of all printed material in English, and the first 100 words make up about half of all ...

  5. International Computer Archive of Modern and Medieval English

    en.wikipedia.org/wiki/International_Computer...

    The ICAME group hosts academic conferences that focus on corpus linguistic studies of historical changes and contemporary grammatical descriptions of English, and makes corpora of different varieties of English available to scholars, starting with editions of the 1960s Brown Corpus.

  6. Part-of-speech tagging - Wikipedia

    en.wikipedia.org/wiki/Part-of-speech_tagging

    Research on part-of-speech tagging has been closely tied to corpus linguistics. The first major corpus of English for computer analysis was the Brown Corpus developed at Brown University by Henry Kučera and W. Nelson Francis, in the mid-1960s. It consists of about 1,000,000 words of running English prose text, made up of 500 samples from ...

  7. Lancaster-Oslo-Bergen Corpus - Wikipedia

    en.wikipedia.org/wiki/Lancaster-Oslo-Bergen_Corpus

    The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen, to provide a British counterpart to the Brown Corpus compiled by Henry Kučera and W. Nelson Francis for American English in ...

  8. Corpus linguistics - Wikipedia

    en.wikipedia.org/wiki/Corpus_linguistics

    The Brown Corpus was the first computerized corpus designed for linguistic research.[6] Kučera and Francis subjected the Brown Corpus to a variety of computational analyses and then combined elements of linguistics, language teaching, psychology, statistics, and sociology to create a rich and variegated opus.

  9. W. Nelson Francis - Wikipedia

    en.wikipedia.org/wiki/W._Nelson_Francis

    W. Nelson Francis (October 23, 1910 – June 14, 2002) was an American author, linguist, and university professor. He served as a member of the faculties of Franklin & Marshall College and Brown University, where he specialized in English and corpus linguistics.