Thursday, January 12, 2012

number of known words

In English, it seems the average native speaker knows about 17,000 word families. A study comparing the vocabulary of Dutch students entering university with that of non-native speakers of Dutch found that the native speakers' average vocabulary size was approximately 18,800 words.

The Common European Framework of Reference for Languages (CEFR) appears to cover the most frequent 5,000 words at the C2 level, which is the top level of the test. You can check here on page 186. To pass C2 in English, one needs a vocabulary of around 4,000 or more words. For French, the learner should know 3,300 - 3,700 words to pass C2. These figures are a far cry from what the native speaker knows, but apparently they are enough to perform well on the most advanced level of the CEFR test. So I wouldn't quite call C2 mastery of the language, nor equivalent to a native speaker.

However, the description of C2, if true, does sound like someone fluent in the language:
Can understand with ease virtually everything heard or read. Can summarize information from different spoken and written sources, reconstructing arguments and accounts in a coherent presentation. Can express him/herself spontaneously, very fluently and precisely, differentiating finer shades of meaning even in the most complex situations.
So, if I were planning to learn a European language and wanted to know what kind of target I should set for the number of words that I needed to learn within my time frame, I think I would go for about 5,000 words. If I were learning French, I would expect to feel pretty advanced by the time I reached the 3,500 known-words mark. If I didn't feel so, I would start to wonder what was wrong.

Because I don't need to set a deadline, I would calculate my arrival at "advanced level" by taking my target words per day and figuring out how many days that should take me. I don't think I would learn more than 5 words a day, so reaching 3,500 words would take about 700 days, or roughly 2 years, to learn French. However, I have already studied French, so I wouldn't be starting out as a complete beginner.
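As a rough sketch of that calculation, here is the arithmetic in Python. The function name, the `known_words` parameter, and the 365-day year are my own assumptions for illustration; the targets and the 5 words a day are the figures from this post.

```python
import math

def days_to_target(target_words, words_per_day, known_words=0):
    """Days needed to reach target_words at a steady daily pace.

    known_words is a hypothetical head start (words already learned).
    """
    remaining = max(target_words - known_words, 0)
    return math.ceil(remaining / words_per_day)

# Targets discussed in the post: about 3,500 for French, 5,000 in general.
for target in (3500, 5000):
    days = days_to_target(target, words_per_day=5)
    print(f"{target} words at 5/day: {days} days (~{days / 365:.1f} years)")
# 3500 words at 5/day: 700 days (~1.9 years)
# 5000 words at 5/day: 1000 days (~2.7 years)
```

Passing a non-zero `known_words` shortens the estimate, which is the situation for anyone who is not starting as a complete beginner.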

In reference to the CEFR level descriptions, my Japanese is around a B2 level, in case you were wondering. I passed the Japanese Language Proficiency Test level 2, five years ago. I wish I had been learning 5 words a day since then.


  1. Well, I don't know man...

    "Native speakers' vocabularies vary widely within a language, and are especially dependent on the level of the speaker's education. A 1995 study estimated the vocabulary size of college-educated speakers at about 17,000 word families, and that of first-year college students (high-school educated) at about 12,000."

    Most people are high-school educated, aren't they? So I would be content with just 12,000.

    and the tests:

    Hanyu Shuiping Kaoshi (HSK)
    HSK Advanced: Characters: 2865, Words: 8840.

    Japanese Language Proficiency Test (JLPT)
    Level 1: Kanji: ~2000 (1926), Vocabulary: ~10,000 (8009).

    "Students require a vocabulary of between 8,000 and 15,000 words to pass the TOEFL, TOEIC, AP, and Cambridge tests."

  2. Good post, Keith. From a quick read of that doc they seemed to be saying that they only measured up to 5,000, so I think it might be hard to draw conclusions about the C2 level, or mastery, etc. I guess the suggestion is that you *could* pass a C2 exam with something in this upper range of vocabulary.

    I don't know that many people with C2 level English, but the ones I know who have achieved high scores in English tests covering the B2-C1 range indicate to me that C1 is still a long way from fluent, in almost any area you can imagine. Still, I'd be happy to reach it in my second language!

    By the way, isn't Level 2 in the JLPT higher than B2? Maybe you're just being too hard on yourself!

  3. Well, there are 4 sentences in the description of C1 and I think I fail on all of them in Japanese. But for B2, I think I can pass on all of those points. When I passed JLPT 2, five years ago, I may have been at a B1 level or lower. JLPT 1 is probably about C1 level. JLPT 2 is probably about B1 level. There's a big gap between the two levels.

    The James Milton book (linked to in the post) does not merely suggest that you could pass C2 with a certain amount of vocabulary. The research used X-Lex to measure the vocabulary known by those who did pass the test, and then gave the average size of those vocabularies for the various levels.

    My point was that even if 4 or 5 thousand known words allow you to speak a foreign language really well and pass the C2, that should not be mistaken for "complete mastery." However, it is a good goal to shoot for: you don't need to learn 20,000 words; you can just get up to 3, 4, or 5 thousand and you should be good to go.

    Note, as Igor has pointed out, Chinese and Japanese tests require a lot more vocabulary for "mastery." I don't know what the actual passing level requirements are, though. But all of the 8,800 HSK words are clearly defined in an HSK dictionary. Word lists have been compiled from past JLPT papers too.

    Look at the James Milton book again: on p. 181, a TOEFL score of 630 is an "Advanced level of performance" and has an X-Lex rating of 4500 - 4740 words. That is far different from the quote that Igor has gotten from unreliable sources on the internet.

  4. On your point that there's a big gap between B1 and C1... I read a while ago, I think on a forum for ex-pats in Germany, someone's comment that they had done tests in German for every level of the CEFR and they felt like the biggest jump was from B1 to B2. They personally felt like that was the stage where they started to really understand and interact in the language.

    So, yeah, if that was a common experience, it would make the gap between levels seem very large, perhaps more so than an A1 to B1 jump?

  5. According to the Chartered Institute of Linguists, their Diploma qualifications are equal to both a C2 and a Masters level Degree with the University of Birmingham.

    After passing the exam you are a qualified interpreter or translator (chartered linguist). I think to work in these fields you would need more than 5,000 words.

    Also, the British educational system has been dumbed down over the past decade. In my day, an "A level" (taken at 18 before university) required around 4,000 words. "A levels" are equivalent to level 3 on the UK national qualifications framework which is the same as CEFR B2.

    If we assume that the A1 level requires 850 words, which we round to 1,000, then we have a linear relationship between the CEFR levels. Basically 1,000 words per level, with C2 being 6,000 and over.

