Thursday, January 31, 2008

Let me count the ways

The Japanese language has more words than the English language. But how do you count "words" in Japanese? All verbs have multiple forms. If you count each form of the verb as a word then you would have the 31 words below derived from the verb "to speak."

話さない not speak
話しません not speak
話しなさるな don't speak!
話そう going to speak
or: let's speak
話しましょう going to speak
or: let's speak
話すまい not going to speak
話しますまい not going to speak
話せば if (I) speak
話さなければ if (I) don't speak
話しませんなら if (I) don't speak
話さないで not speaking
話さなくて not speaking
or: had spoken
or: have spoken
or: had spoken
or: have spoken
話さなかった did not speak
話したら if (I) should speak
or: if (I) were to speak
or: when (I) speak
or: when (I) spoke
話しましたら if (I) should speak
or: if (I) were to speak
or: when (I) speak
or: when (I) spoke
話さなかったら if (I) should not speak
or: if (I) were to not speak
or: when (I) didn't speak
or: when (I) hadn't spoken
話したり speaking and
話しましたり speaking and
話さなかったり not speaking and
話せる can speak
or: able to speak
話される is spoken
or: will be spoken
話させる make (someone) speak
or: will make (someone) speak
or: allow (someone) to speak
or: will allow (someone) to speak
話させられる is made to speak
or: will be made to speak

If you count each form of the Japanese verb, there are 31 listed. On the English side, there are about 23 distinct words in my list. Do the same with a second Japanese verb and you add another 31 words to your Japanese vocabulary, while the English vocabulary would only add about 4 words (replacing speak, spoke, spoken, speaking).
Now it's Japanese 62, English 27. Already the difference is growing drastically. As a little formula, with N as the number of verbs, this is J = 31 × N and E = 19 + (4 × N). That would mean a vocabulary of 500 English verbs is equivalent to 2,019 words while 500 Japanese verbs equates to 15,500 words.
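The arithmetic is easy to check with a few lines of Python. This is only a sketch of the two formulas as stated above; the function names are my own labels, and the 19 is simply the part of the English list that stays the same from verb to verb.

```python
def japanese_form_count(n_verbs: int) -> int:
    """J = 31 x N: every verb contributes 31 forms."""
    return 31 * n_verbs


def english_form_count(n_verbs: int) -> int:
    """E = 19 + (4 x N): 19 words shared by all verbs, plus 4 forms per verb."""
    return 19 + 4 * n_verbs


# Print the counts for 1, 2, and 500 verbs.
for n in (1, 2, 500):
    print(n, japanese_form_count(n), english_form_count(n))
```

The Japanese count grows almost eight times faster per verb, which is all the "more words" claim amounts to when every form is counted.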

So if you count words by each form, then you have to admit that Japanese has a lot more words than English. But why would you count words this way?! Those 500 Japanese verbs are not going to be any harder to learn than those 500 English verbs. It only makes sense to count word families: 500 verbs should be counted as 500 words. Even plural forms should not increase your count. "Event" and "Events" should be counted as one word only. Counting this way gives us a better basis for comparing across languages, since Japanese does not have these plural forms for every noun.

In order to say which language has more words (and thus more work for the students of that language), we need a manual counting method. Just throwing a few simple rules into a computer program to do the counting for us is not going to yield dependable results. It takes a human to decide on a case-by-case (or word-by-word) basis.

If I were starting a project to accomplish this task, here is the way I would go about doing it. Basically, I would take a concept and see how many words there are for that concept.

For example, the concept of "mother." I would start with the word and find how many other words there are which are used for the concept.

  • mother
  • mom
  • mum
  • mommy
  • momma

In Japanese we might have this list for the same concept:

  • 母親
  • お母さん
  • ママ
  • 母上
  • おふくろ

There may be some missing, but as an example we will use the above lists. In this example, Japanese and English each have 5 words for mother. Notice that I would count a word only once in Japanese no matter how many conventional ways there are of writing it, such as in Kanji, Hiragana, or Katakana. A computer program, by contrast, would need a lot of extra effort to incorporate this simple rule.
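Here is a rough sketch in Python of what the hand-built data for this one concept might look like. The words are the two lists exactly as shown above. The alternate spellings (ははおや, おかあさん, お袋) are my own additions for illustration, and the names `MOTHER`, `count_words`, and `count_spellings` are made up for this example.

```python
# One entry per word. Every way of writing that word is grouped in one set,
# so a word written in Kanji and in Hiragana still counts once.
# The extra spellings are illustrative additions, not part of the lists above.
MOTHER = {
    "English": [
        {"mother"}, {"mom"}, {"mum"}, {"mommy"}, {"momma"},
    ],
    "Japanese": [
        {"母親", "ははおや"},
        {"お母さん", "おかあさん"},
        {"ママ"},
        {"母上"},
        {"おふくろ", "お袋"},
    ],
}


def count_words(words: list[set[str]]) -> int:
    """Count words, not spellings: each group of spellings counts once."""
    return len(words)


def count_spellings(words: list[set[str]]) -> int:
    """What a naive program counting distinct strings would report."""
    return len(set().union(*words))


for language, words in MOTHER.items():
    print(language, count_words(words), count_spellings(words))
```

The grouping itself is the part a person has to do by hand. Once a human has decided which spellings belong together, the counting is trivial.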

By using my method of concepts and finding out how many words there are for each concept, we can create a nice method of comparison. We would have covered the same number of concepts in each language and then we could compare the total number of words. That would give us a nice idea of which language is richer in vocabulary.

Some concepts, of course, will not have an equivalent in every language. I would not throw out those concepts. I would just give a zero to the language which is missing the concept. That would make for another interesting comparison in the end: seeing which language has more concepts.
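As a sketch of how the totals might be tallied, here is a toy version of the database in Python. Only the "mother" entry comes from this post. The second concept and its word are made-up placeholders standing in for any concept that only one language has a word for, and `DATABASE`, `LANGUAGES`, and `summarize` are names I invented for the example.

```python
# Toy database: concept -> language -> list of words.
# "placeholder-concept" is a hypothetical entry, not real data.
DATABASE = {
    "mother": {
        "English": ["mother", "mom", "mum", "mommy", "momma"],
        "Japanese": ["母親", "お母さん", "ママ", "母上", "おふくろ"],
    },
    "placeholder-concept": {
        "Japanese": ["placeholder-word"],
    },
}
LANGUAGES = ("English", "Japanese")


def summarize(database, languages):
    """Return {language: (total words, concepts with at least one word)}."""
    summary = {}
    for language in languages:
        # A language with no entry for a concept scores zero for it.
        counts = [len(words.get(language, [])) for words in database.values()]
        summary[language] = (sum(counts), sum(1 for c in counts if c > 0))
    return summary


for language, (total_words, concepts) in summarize(DATABASE, LANGUAGES).items():
    print(f"{language}: {total_words} words across {concepts} concepts")
```

Every language is scored against the same list of concepts, so the totals stay comparable even when a concept is missing on one side.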

Once the project was completed, the database could then be used to count a learner's vocabulary, although one more level of detail would be required: all of the word forms that we exclude from counting would need to be mapped to their original word. Anything in the learner's reading that is not found in the database, such as a name, would not be counted. It does not matter how many names you know; they do not contribute to your vocabulary.
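A minimal sketch of that lookup in Python might look like the following. The mapping is hand-built from examples used in this post, the sample sentence and the name in it are invented, and `FORM_TO_WORD` and `learner_vocabulary` are hypothetical names. The simple whitespace splitting only works for English; Japanese text would need proper word segmentation first.

```python
# Hand-built mapping from each word form to its original word.
FORM_TO_WORD = {
    "speak": "speak", "spoke": "speak", "spoken": "speak", "speaking": "speak",
    "event": "event", "events": "event",
    "mother": "mother",
}


def learner_vocabulary(text: str) -> set[str]:
    """Return the original words seen in the text; unknown tokens are skipped."""
    known = set()
    for token in text.lower().split():
        token = token.strip(".,!?\"'")  # drop surrounding punctuation
        word = FORM_TO_WORD.get(token)
        if word is not None:  # names and anything else not in the database are ignored
            known.add(word)
    return known


sample = "Tanaka spoke at two events. Her mother was speaking too."
print(sorted(learner_vocabulary(sample)))
```

In this toy run the name is skipped, and "spoke" and "speaking" count as a single word. Ordinary words like "at" are skipped too, but only because the toy mapping is so small.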

So there you have it. That is the way I would count vocabulary and be able to make comparisons between languages. This method would give you an accurate count to then determine how many words you need to know in a particular language. Once it is done for all languages we could figure out if there are differences in the requirements for becoming fluent in a language. And until such a method as I have described is employed, you can take with a grain of salt any figures you hear about how many words are needed to learn a language.
