enow.com Web Search

Search results

  1. Results from the WOW.Com Content Network
  2. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Text corpora (singular: text corpus) are large and structured sets of texts, which have been systematically collected. Text corpora are used by corpus linguists and within other branches of linguistics for statistical analysis, hypothesis testing, finding patterns of language use, investigating language change and variation, and teaching language proficiency.

  3. Text corpus - Wikipedia

    en.wikipedia.org/wiki/Text_corpus

    To exploit a parallel text, some kind of text alignment identifying equivalent text segments (phrases or sentences) is a prerequisite for analysis. Machine translation algorithms for translating between two languages are often trained using parallel fragments comprising a first-language corpus and a second-language corpus, which is an element ...

  4. Category:Corpora - Wikipedia

    en.wikipedia.org/wiki/Category:Corpora


  5. Category:Corpus linguistics - Wikipedia

    en.wikipedia.org/wiki/Category:Corpus_linguistics

    This page was last edited on 3 November 2019, at 01:44 (UTC).

  6. List of datasets for machine-learning research - Wikipedia

    en.wikipedia.org/wiki/List_of_datasets_for...

    Fine-grain categorization and topic codes; 810,000; Text; Classification, clustering, summarization; 2002 [25]; Reuters. The Reuters Corpus Volume 2: Large corpus of Reuters news stories in multiple languages; Fine-grain categorization and topic codes; 487,000; Text; Classification, clustering, summarization; 2005 [26]; Reuters.

  7. Brown Corpus - Wikipedia

    en.wikipedia.org/wiki/Brown_Corpus

    The Brown University Standard Corpus of Present-Day American English, better known as simply the Brown Corpus, is an electronic collection of text samples of American English, the first major structured corpus of varied genres. This corpus first set the bar for the scientific study of the frequency and distribution of word categories in ...

  8. Oxford English Corpus - Wikipedia

    en.wikipedia.org/wiki/Oxford_English_Corpus

    The corpus is generally available only to researchers at Oxford University Press, but other researchers who can demonstrate a strong need may apply for access. [2] [3] The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software. [4] By April 27, 2006, the dictionary database had 1 ...

  9. Ancient text corpora - Wikipedia

    en.wikipedia.org/wiki/Ancient_text_corpora

    Ancient text corpora are the entire collection of texts from the period of ancient history, defined in this article as the period from the beginning of writing up to 300 AD. These corpora are important for the study of literature, history, linguistics, and other fields, and are a fundamental component of the world's cultural heritage.