Search results
The corpus is much larger than the CCL (470 million characters), the CNC (100 million characters), the SUBTLEX-CH (47 million characters) and the LCMC (less than 2 million characters). It seems as if the frequency lists derived from this corpus might be the most reliable frequency lists currently available.
I would read in the BCC corpus frequency list as a dictionary. Then, having concatenated all the news/magazine articles as plain text, I would build a dictionary of all the words in the news/magazine articles up to 8 characters long, counting their number of occurrences with the help of the BCC frequency list (which tells us which combinations ...
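The approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the frequency list is assumed to be tab-separated `word<TAB>count` lines (the real BCC file layout may differ), and the function names `load_frequency_list` and `count_known_words` are hypothetical. It counts every substring of up to 8 characters that appears in the list, so overlapping matches are all counted; it does not attempt proper word segmentation.

```python
from collections import Counter


def load_frequency_list(lines):
    """Parse 'word<TAB>count' lines into a dict (file format is an assumption)."""
    freq = {}
    for line in lines:
        parts = line.strip().split("\t")
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed layout
        word, count = parts
        if count.isdigit():
            freq[word] = int(count)
    return freq


def count_known_words(text, known_words, max_len=8):
    """Count every substring of up to max_len characters found in known_words.

    Overlapping matches are all counted; this is not a segmenter.
    """
    counts = Counter()
    for start in range(len(text)):
        for length in range(1, max_len + 1):
            candidate = text[start:start + length]
            if len(candidate) < length:
                break  # ran past the end of the text
            if candidate in known_words:
                counts[candidate] += 1
    return counts


# Tiny made-up sample standing in for the real frequency list and articles.
sample_list = ["中国\t100", "人民\t80", "中国人\t30"]
known = load_frequency_list(sample_list)
article_counts = count_known_words("中国人民，中国", known)
```

In practice the concatenated article text would replace the short sample string, and the list would be read from the downloaded frequency file rather than from an inline sample.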
I'm honestly a little wary of adding built-in frequency listings because I don't think they're a very good way to learn Chinese; even a really excellent corpus will probably be several years out of date for slang vocabulary, so a term that comes up as uncommon may actually be quite common now (or vice versa) - people are constantly repurposing old words - plus I don't believe they're accurate ...
Thank you very much for your detailed explanation! Yes, that makes sense. Also, by importing the card as a user dictionary you gain additional benefits without losing anything! So, if my understanding is correct, it seems there are no significant downsides :) You're welcome! Yeah, it's true, for...
The Beijing Language and Culture University created a balanced corpus of 15 billion characters. It’s based on news (人民日报 / People's Daily 1946-2018, 人民日报海外版 / People's Daily Overseas Edition 2000-2018), literature (books by 472 authors, including a significant portion of non-Chinese writers), non-fiction books, blog and Weibo entries as well as...
With a small corpus of 650 articles from People's Daily, downloaded using a Python script, I hope to start providing a more modern frequency list of media-related vocabulary. The frequency list has the following features: It uses all sections of the 人民日报 / People's Daily newspaper, including the sports section.
Hey Mike, I'm a big user of vocab lists and I'm about 1.5 months away from finishing the HSK4 list. Recently I've been studying some colloquial stuff and...
The Pleco dictionary shows frequencies from 1 to 5. How many words are in each category? How have the frequencies been measured? I am familiar with some frequency research from years ago. I especially would like to know if the frequencies are based on research about written or spoken...