Sunday, June 03, 2012

reading like a native

I know many people have the notion that you'll never be as good as a native speaker in a foreign language. A few of us have the audacity to aspire to native-equivalent performance in our second languages, and even fewer strive for native-sounding pronunciation. Even if we are delusional, I think that for those of us who really want to be extremely good in another language, there is no reason we can't expect to read like a native.

For languages with alphabetic or other phonetic scripts, reading as well as a native may not sound like a questionable goal. All you need to do is put in the time, and even if you don't feel like looking up unknown words, eventually you'll get a pretty good idea of what most of the words mean.

But for Chinese and Japanese (any others?), reading is quite a challenge even at the advanced stages of learning. I don't have much experience reading Chinese, but I do with Japanese. So let me just talk about reading Japanese.

Japanese has two phonetic syllabaries as well as Chinese characters called Kanji. The phonetic syllabaries, Hiragana and Katakana, are about as different as uppercase and lowercase characters in the alphabet. They just take some practice reading and eventually it becomes no problem.

Kanji, on the other hand, is the problem. Many Kanji have 2 or 3 readings, while the most heavily used characters have many more. Looking up words with characters you don't know requires you to write the characters on a dictionary device of one type or another. Then you'd better make a note somewhere, because you don't want to have to look it up again.

Really, it is quite exhausting, so I don't recommend trying to look up every word you come across. It's better to wait until it comes up a second time. So if you remember seeing it before, then look it up. Statistically, I think this makes sense. The least frequent 1 or 2 percent of words are likely only going to show up one time in your reading. These 1 or 2 percent words may make up half of your unknown words, so you can save yourself a lot of trouble.

To illustrate what I mean, let's say your book has about 10,000 unique Kanji words, and let's say you know 4,000 of those already. That leaves about 6,000 words you would have to look up while reading the book if you were to go through the trouble of looking up every word. But if 3,000 of those words are only going to show up in the book 1 time each, you would forget even having seen them by the time you see them in another book. So it's not worth making the extra 3,000 look ups and notes.
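Here is a rough sketch of that comparison in Python. The function name `count_lookups` and the toy word list are made up for illustration, and it assumes the text has already been split into words, which is itself not trivial for Japanese.

```python
from collections import Counter

def count_lookups(tokens, known):
    """Compare two look-up policies for the unknown words in a text.

    Returns a pair:
      - look-ups needed if every unknown word is looked up once
      - look-ups needed if a word is looked up only on its second sighting
    """
    # Count how often each unknown word appears in the text.
    counts = Counter(t for t in tokens if t not in known)
    every_word = len(counts)
    # Words seen only once never trigger a look-up under the second policy.
    second_sighting = sum(1 for c in counts.values() if c >= 2)
    return every_word, second_sighting

# Toy example (hypothetical data): one known word, four unknown words.
tokens = ["猫", "犬", "猫", "鳥", "魚", "犬", "馬"]
known = {"猫"}
print(count_lookups(tokens, known))  # (4, 1)
```

In the toy list only 犬 shows up twice, so waiting for the second sighting cuts four look-ups down to one.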

OK, of course I'm just making those numbers up, but you do want to save time and trouble, don't you? The number of distinct words that make up the final 2% of a text is about the same as the number that make up the first 98%.

I have a book on Chinese vocabulary frequency. It tells of a Chinese corpus by BLI Press that has 1.8 million characters and contains 31,159 individual words. Out of those, 16,593 of the words occurred 3 or more times in the corpus. 8,000 of those words make up 95.1% of the text.

So, you see, if you look up every word in your reading, you are, in fact, wasting a lot of time.
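To put the corpus figures in proportion, here is a quick calculation in Python. The three input numbers come from the book as quoted above; the variable names are my own.

```python
# Figures quoted from the frequency book.
total_words = 31_159      # individual words in the corpus
three_or_more = 16_593    # words that occurred 3 or more times
core_words = 8_000        # words that make up 95.1% of the text

# Words that occurred only once or twice in 1.8 million characters.
rare_words = total_words - three_or_more
print(rare_words)                                   # 14566
print(f"{rare_words / total_words:.1%} of the vocabulary is rare")
print(f"{core_words / total_words:.1%} of the vocabulary covers 95.1% of the text")
```

So nearly half of the individual words appear only once or twice in the entire corpus, while about a quarter of them account for almost all of the running text.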

How many hours does it take to reach the same reading level as a native? I don't know! How long would it take you to read 1.8 million characters in Chinese? You probably don't know! But we're too far along to stop now. So let's keep going!



  1. "How many hours does it take to reach the same reading level as a native?"

    10 million (comprehensible) words

  2. So am I not supposed to count the unknown words?

  3. Of course what I meant is about 99% comprehensibility, like in the good old days when you were a child and the only not-so-sure-what-it-means :)) words were: posterior, metacognitive, hydraulically...

  4. well ajatt said 200 books. basically you have to read a lot. so far i've read like 50 books and i read the internet so i'm getting there.

  5. I'm learning Japanese too. here's my dokusho meter
    I don't put every single book i read because some of them are terrible and that's why i said i read like 50.

    I think for Japanese I've come to the realization that readings get very annoying after a while. I know all the possible readings of the kanji that make up the word but then i would be unsure if there's onten or not and sometimes the word ends up being read as kunyomi+onyomi or onyomi+kunyomi or kunyomi+kunyomi even though it doesn't seem like it would. i feel like as long as you know what the word means it's fine.

    I have one anki deck just for reading Japanese (ex. cards: 苗木 なえぎ, 並木 なみき, 地で行・く (じ)). It's a nightmare. So if you know what the word means but don't want to waste time looking up how you read it just because you know all the possibilities... then i think you don't have to. It's your time.

  6. so because it takes so much reading to be able to read like a native the only people who are going to be able to do it are people who genuinely like reading. you can't just read for the sake of omg i gotta read fluently like a native. enjoyment is necessary since the sheer amount is just so great.

    It also depends what level of Japanese you want to read at. If you find light novels or young adult novels really interesting and that's your main focus... learning to read those fluently is obviously going to be easier than koten (古典). I think even Japanese people have trouble reading it. some people like manga a lot. depending on the manga it could be anywhere from difficult to very easy.

    So I say read what you want and I wouldn't focus on "fluent".

  7. Thanks for your comments, tv4evergurl.

    Measuring by the number of books is nice but not consistent. Some books may have about 100 pages while others have around 300.

    Anyway, how many hours would it take you to read 200 books?

  8. 200 min-length novels, at least 50,000 words each, is 40,000 pages.
    I've experimented in the past with reading speeds, so in my native language I read 50 pages per hour (Cards on the Table by Agatha Christie, Spider's Web etc.), including very short breaks between chapters to drink water, go to the bathroom etc.
    So if you're conservative in your estimation (it's always a good idea), your average reading speed in Japanese would be 25 pages per hour.
    That means 1600 hours at most.

  9. 200 books actually IS 10 million words.
    We can't be very far from the truth.

  10. 200 books ... 10 million words ... 1600 hours ...

    The "truth" just keeps getting worse and worse.

    If we could read Japanese for 2 hours a day, this would take about 26 straight months.

    Of course, I think to accomplish that 25 pages per hour rate, we'd definitely have to skip all dictionary look-ups. With dictionary use, it's going to be more like 10 pages per hour I guess.

  11. I timed myself on a couple of pages. From the book that I am reading now, it takes 5 or 6 minutes to read one page. So at most, I can read 12 pages an hour, and that would be without stopping to look up any words.

    And I am not a super slow reader in Japanese either. I can read at approximately my speaking pace.

    The book I am reading has 19 columns (vertical text) per page and can fit about 40 characters in a column, plus some punctuation.

  12. 250 words (5 characters per word on average) per page is the industry standard. How many words are there in 19 columns? 760?! :) Sorry, I have not had much experience with Japanese texts, it's different.

  13. It's difficult to define what a word is in Japanese for counting purposes. If there were an average of 3 characters per word, then there would be 250 words per page. If 4 characters per word, then 190 words per page. And 5 characters per word would make 152 words per page.

    Of course, that's just for the current book I'm reading.
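For anyone who wants to redo the arithmetic from the comments with their own numbers, here is a small Python sketch. The function name `reading_hours` is made up; the inputs are the figures quoted in the thread (200 books, 50,000 words per book, 250 words per page, 25 pages per hour, and a page of 19 columns by 40 characters).

```python
def reading_hours(books, words_per_book, words_per_page, pages_per_hour):
    """Estimate total reading time in hours for a stack of books."""
    pages = books * words_per_book / words_per_page
    return pages / pages_per_hour

# 200 minimum-length novels at 25 pages per hour.
hours = reading_hours(200, 50_000, 250, 25)
print(hours)        # 1600.0
print(hours / 2)    # 800.0 days at 2 hours a day

# Characters on one page of the book I am reading now.
chars_per_page = 19 * 40
print(chars_per_page)   # 760

# Words per page under different guesses at characters per word.
for chars_per_word in (3, 4, 5):
    print(chars_per_word, round(chars_per_page / chars_per_word))
```

Changing `pages_per_hour` is the interesting part, since the 25 pages per hour figure assumes no dictionary look-ups at all.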

