enow.com Web Search

Search results

  1. Results from the WOW.Com Content Network
  2. Lancaster-Oslo-Bergen Corpus - Wikipedia

    en.wikipedia.org/wiki/Lancaster-Oslo-Bergen_Corpus

    The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen, to provide a British counterpart to the Brown Corpus compiled by Henry Kučera and W. Nelson Francis for American English in ...

  3. International Computer Archive of Modern and Medieval English

    en.wikipedia.org/wiki/International_Computer...

    The International Computer Archive of Modern and Medieval English (ICAME) is an international group of linguists and data scientists working in corpus linguistics to digitise English texts. [1] The organisation was founded in Oslo, Norway, in 1977 as the International Computer Archive of Modern English, before being renamed to its current title.

  4. List of text corpora - Wikipedia

    en.wikipedia.org/wiki/List_of_text_corpora

    Timestamped JSI web corpora: web corpora of news articles crawled from a list of RSS feeds. The newsfeed corpora are prepared within a project run by the Jožef Stefan Institute, a Slovenian scientific research institute, [43] and published in Sketch Engine. More information about the project is on the project websites.

  5. International Corpus of English - Wikipedia

    en.wikipedia.org/.../International_Corpus_of_English

    Comparable varieties would be British English, American English, and Indian English, each represented by a computer corpus. [2] Researchers use the corpora to compare the syntax of the varieties of English. [3] Once complete, the ICE corpora would support comprehensive linguistic analysis of the varieties of English that have emerged. [4]

  6. Cambridge English Corpus - Wikipedia

    en.wikipedia.org/wiki/Cambridge_English_Corpus

    The Cambridge International Corpus (CIC) is a collection of over 2 billion words [1] of real spoken and written English. The texts are stored in a database that can be searched to see how English is used. The CIC also contains the Cambridge Learner Corpus, a unique collection of over 60,000 exam papers from Cambridge ESOL.

  7. American National Corpus - Wikipedia

    en.wikipedia.org/wiki/American_National_Corpus

    The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently, the ANC includes a range of genres, including emerging genres such as email, tweets, and web data that are not included in earlier corpora such as the British National Corpus.

  8. Oxford English Corpus - Wikipedia

    en.wikipedia.org/wiki/Oxford_English_Corpus

    The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software. [4] By April 27, 2006, the dictionary database had 1 billion words. [5] Each document in the OE Corpus is accompanied by metadata including: title; author (if known; many websites make this difficult to determine reliably)

  9. Category:English corpora - Wikipedia

    en.wikipedia.org/wiki/Category:English_corpora

    Pages in category "English corpora": The following 18 pages are in this ...