enow.com Web Search

Search results

  2. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Corpus of Political Speeches contains four collections of political speeches in English and Chinese from The Corpus of U.S. Presidential Speeches (1789–2015), The Corpus of Policy Address by Hong Kong Governors (1984–1996) and Hong Kong Chief Executives (1997–2014), The Corpus of Speeches given on New Year's days and Double Tenth days by ...

  3. American National Corpus - Wikipedia

    en.wikipedia.org/wiki/American_National_Corpus

    The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently, the ANC includes a range of genres, including emerging genres such as email, tweets, and web data that are not included in earlier corpora such as the British National Corpus.

  4. Category:English corpora - Wikipedia

    en.wikipedia.org/wiki/Category:English_corpora


  5. Switchboard Telephone Speech Corpus - Wikipedia

    en.wikipedia.org/wiki/Switchboard_Telephone...

    The Switchboard Telephone Speech Corpus is a corpus of spoken English consisting of almost 260 hours of speech. It was created in 1990 by Texas Instruments via a DARPA grant, and released in 1992 by NIST. The corpus contains 2,400 telephone conversations among 543 US speakers (302 male, 241 female).

  6. Corpus of Contemporary American English - Wikipedia

    en.wikipedia.org/wiki/Corpus_of_Contemporary...

    The Corpus of Contemporary American English (COCA) is composed of one billion words as of November 2021. [1] [2] [4] The corpus is constantly growing: In 2009 it contained more than 385 million words; [5] in 2010 the corpus grew in size to 400 million words; [6] by March 2019, [7] the corpus had grown to 560 million words.

  7. Manually Annotated Sub-Corpus - Wikipedia

    en.wikipedia.org/wiki/Manually_Annotated_Sub-Corpus

    Manually Annotated Sub-Corpus (MASC) is a balanced subset of 500K words of written texts and transcribed speech drawn primarily from the Open American National Corpus (OANC). The OANC is a 15 million word (and growing) corpus of American English produced since 1990, all of which is in the public domain or otherwise free of usage and ...

  8. TenTen Corpus Family - Wikipedia

    en.wikipedia.org/wiki/TenTen_Corpus_Family

    The first text corpora were created in the 1960s, such as the 1-million-word Brown Corpus of American English. Over time, many further corpora were produced (such as the British National Corpus and the LOB Corpus), and work also began on larger corpora and on corpora covering languages other than English.

  9. Category:Corpora - Wikipedia

    en.wikipedia.org/wiki/Category:Corpora
