Search results
The Lancaster-Oslo/Bergen (LOB) Corpus is a one-million-word collection of British English texts which was compiled in the 1970s in collaboration between the University of Lancaster, the University of Oslo, and the Norwegian Computing Centre for the Humanities, Bergen, to provide a British counterpart to the Brown Corpus compiled by Henry Kučera and W. Nelson Francis for American English in ...
The International Computer Archive of Modern and Medieval English (ICAME) is an international group of linguists and data scientists working in corpus linguistics to digitise English texts. [1] The organisation was founded in Oslo, Norway, in 1977 as the International Computer Archive of Modern English and was later renamed to its current title.
Timestamped JSI web corpora: web corpora of news articles crawled from a list of RSS feeds. The newsfeed corpora are prepared within a project run by the Jožef Stefan Institute, a Slovenian scientific research institute, [43] and are published in Sketch Engine. More information about the project is available on the project websites.
Comparable varieties, such as British English, American English, and Indian English, are each represented by a computer corpus. [2] Researchers use the corpora to compare the syntax of these varieties of English. [3] Once complete, the ICE corpora are intended to support comprehensive linguistic analysis of the varieties of English that have emerged. [4]
The Cambridge International Corpus (CIC) is a collection of over 2 billion words [1] of real spoken and written English. The texts are stored in a database that can be searched to see how English is used. The CIC also contains the Cambridge Learner Corpus, a unique collection of over 60,000 exam papers from Cambridge ESOL.
The American National Corpus (ANC) is a text corpus of American English containing 22 million words of written and spoken data produced since 1990. Currently, the ANC includes a range of genres, including emerging genres such as email, tweets, and web data that are not included in earlier corpora such as the British National Corpus.
The digital version of the Oxford English Corpus is formatted in XML and usually analysed with Sketch Engine software. [4] By April 27, 2006, the dictionary database had 1 billion words. [5] Each document in the OE Corpus is accompanied by metadata including: title; author (if known; many websites make this difficult to determine reliably)