Fundamentals of Understanding, Translating and Matching CJK Names
June 7, 2017
Cross-lingual name matching between English and Japanese is straightforward in one respect: it is obvious which name is in English and which is in Japanese. The same holds for any pair of names written in different scripts: Arabic and Cyrillic, Devanagari and Latin, and so on. Differentiating between names written in the same script is trickier. Chinese, Japanese, and Korean names can all be written in the Han script (Chinese ideographs), though thankfully this is not a problem for Rosette’s fuzzy name matching and name translation.
First you have to know what language you are starting in. In many cases, you need to know how the characters are pronounced in order to match a name to the same name in a different language, or to transliterate it correctly into your chosen language. Even though the three languages share a script, names in one language are usually represented differently in the other two, which calls for a tool that can accurately select the right language model.
Preferred Ways to Translate
Take the Korean name 김정은, which is usually written in hangul. Chinese writers would use the hanja equivalent 金正恩, because hanja are Chinese ideographs. Japanese writers, however, would write it phonetically in katakana as キム・ジョンウン.
|English||Korean||Chinese||Japanese|
|Kim Jong-un||김정은||金正恩 (takes Korean hanja)||キム・ジョンウン (phonetic of Korean)|
|Yoko Ono||오노 요코 (phonetic of Japanese)||小野洋子 (takes Japanese kanji)||小野洋子|
|Beat Takeshi||키타노 타케시||北野武||ビートたけし|
Conversely, the Japanese name 小野洋子 would be represented as is in Chinese, but a Japanese name such as ビートたけし (a combination of katakana and hiragana) would be transliterated into Chinese characters that sound like the Japanese.
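The examples above mix hangul, Han characters, katakana, and hiragana, so a useful first step is simply to see which scripts a name contains. The following is a minimal sketch, not Rosette’s method: the function names `scripts_in` and `candidate_languages` are hypothetical, and the script is inferred from the official Unicode character names exposed by Python’s standard `unicodedata` module.

```python
import unicodedata

def scripts_in(name):
    """Return the set of CJK scripts found in `name` (hypothetical helper)."""
    found = set()
    for ch in name:
        try:
            uname = unicodedata.name(ch)
        except ValueError:
            continue  # unnamed code point, skip it
        if uname.startswith("HANGUL"):
            found.add("hangul")
        elif uname.startswith("HIRAGANA"):
            found.add("hiragana")
        elif uname.startswith("KATAKANA"):
            # also covers the middle dot and the prolonged sound mark
            found.add("katakana")
        elif uname.startswith(("CJK UNIFIED IDEOGRAPH",
                               "CJK COMPATIBILITY IDEOGRAPH")):
            found.add("han")
    return found

def candidate_languages(name):
    """Narrow down the language a written name could be in, by script alone."""
    scripts = scripts_in(name)
    if "hangul" in scripts:
        return ["Korean"]
    if scripts & {"hiragana", "katakana"}:
        return ["Japanese"]
    if "han" in scripts:
        # Han-only names stay ambiguous between all three languages
        return ["Chinese", "Japanese", "Korean"]
    return []

print(candidate_languages("김정은"))        # hangul only
print(candidate_languages("ビートたけし"))  # katakana plus hiragana
print(candidate_languages("小野洋子"))      # Han only: still ambiguous
```

Script alone settles the hangul and kana cases. A Han-only name still needs more evidence.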
Varying Pronunciations for the Same Character
Variations in character pronunciation are another reason you must know which language you are starting from. For example, the character 金 is pronounced differently in each language:
|Japanese pronunciation of 金||kah-nay or kin|
|Chinese pronunciation of 金||jin|
|Korean pronunciation of 金||kim or gim|
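One way to picture the table above is as a lookup keyed by both character and language. The sketch below is purely illustrative: the `READINGS` table and the `readings` function are hypothetical, the language codes are an assumption, and "kane" is used as the usual romanization of the reading written "kah-nay" above.

```python
# Hypothetical, hand-built table holding only the readings listed above.
READINGS = {
    "金": {
        "ja": ["kane", "kin"],
        "zh": ["jin"],
        "ko": ["kim", "gim"],
    },
}

def readings(char, lang):
    """Return the known readings of `char` in `lang`, or an empty list."""
    return READINGS.get(char, {}).get(lang, [])

# The same character yields different romanizations per language,
# so a transliterator cannot pick one without knowing the language.
for lang in ("ja", "zh", "ko"):
    print(lang, readings("金", lang))
```

Because the three reading lists do not overlap, choosing the wrong source language produces a romanization that will not match the intended name.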
As a non-speaker, how do you know what language a name is in?
Determining the language is relatively easy for someone who knows one of the languages: although all three share the Han script (Chinese ideographs), the characters used to form names in each language are distinctive. This skill is analogous to an English speaker recognizing that “Jose Perez” is probably Spanish, while “Olivier Cousteau” is probably French.
The character 金 is the very common Korean surname “Kim”, but it is very rarely a Japanese surname and only occasionally a Chinese one, as with the famous Qing dynasty writer 金聖嘆 (Jin Shengtan). Even so, both Japanese and Chinese frequently combine 金 with other characters to form given names.
Rosette’s name matching function understands the idiosyncrasies of the Han script. If the starting language of a name is unknown, Rosette determines it and then proceeds with name matching. Put simply, Rosette doubles as a language identifier and name matcher for these three languages.
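The surname observation above can be turned into a toy heuristic for Han-only names. This is a sketch under stated assumptions and not how Rosette works: the `SURNAME_HINTS` table holds only the single qualitative example from the text, the numeric ranks are arbitrary, and the code assumes the family name is the first character of the name.

```python
# Hypothetical hint table; the labels restate the text, nothing more.
SURNAME_HINTS = {
    "金": {"ko": "very common", "zh": "occasional", "ja": "very rare"},
}

# Arbitrary ordering used only to sort the qualitative labels.
RANK = {"very common": 3, "occasional": 2, "very rare": 1}

def rank_languages(name):
    """Order candidate languages by how typical the leading surname is.

    Assumes `name` is written in Han characters with the family name
    as its first character. Returns [] when there is no hint.
    """
    if not name:
        return []
    hints = SURNAME_HINTS.get(name[0], {})
    return sorted(hints, key=lambda lang: RANK[hints[lang]], reverse=True)

print(rank_languages("金正恩"))   # Korean ranked first
print(rank_languages("小野洋子"))  # no hint for this surname
```

A real system would need frequency data for many surnames and given-name characters. The sketch only shows how such evidence could be combined into a ranking.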
Talk to the experts
We’ve been experts at working with multilingual names for over 20 years. Recognizing the complications of working with CJK names, we added CJK name translation support in Rosette API 1.7. Sign up for a free API account today (no credit card required) to test it out.
Dealing with the complexities of Asian names in your business or just want to learn more? Drop us a line.
Curious about how each of these languages expresses foreign names compared to the other two? Read on.
Chinese is written only in Han characters (called hanzi), so all names are written in hanzi. Each character has a basic meaning and usually one pronunciation:
|Family name||Given name|
Foreign names in Chinese
Since Chinese has no script besides hanzi, foreign names are written phonetically by selecting hanzi that approximate the sound of the foreign name. For example, in China, Obama is transliterated as 奥巴马 (ao ba ma), and in Taiwan it’s transliterated as 歐巴馬 (ou ba ma).
For Korean names, Chinese writers use the equivalent hanja version where it is commonly used and known. If the hanja are not known, the name is transliterated phonetically into Chinese. A good example is the Korean actress Gong Hyo Jin (공효진 in hangul), whose hanja name is 孔曉振 but whose name is also sometimes written phonetically in Chinese as 孔孝真.
The exception is Japanese names in kanji and Korean names in hanja, which are used as is—although the Chinese are sure to pronounce them differently!
Chinese characters were borrowed by Korea as a writing system (called hanja) hundreds of years ago. Exactly when is unknown, but hanja was already in use when the Korean king Sejong the Great commissioned scholars in the 1440s to come up with the uniquely Korean script, hangul, which is used almost exclusively to write Korean today. Hangul is a purely phonetic representation. Although Korean names are now written mostly in hangul, they often “map” to particular hanja for their meaning. A given hangul name can map to several different hanja, as there are many homonyms in Korean.
|Family name||Given name|
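To see what “purely phonetic” means in practice, a hangul syllable can be split into its component letters (jamo). The sketch below relies on the fact that Unicode canonical decomposition (NFD) breaks a precomposed hangul syllable into its conjoining jamo. The function name `jamo_of` is hypothetical.

```python
import unicodedata

def jamo_of(syllable):
    """Return (code point, Unicode name) for each jamo in a hangul syllable."""
    decomposed = unicodedata.normalize("NFD", syllable)
    return [(hex(ord(ch)), unicodedata.name(ch)) for ch in decomposed]

# 김, the surname "Kim", decomposes into an initial consonant,
# a vowel, and a final consonant.
for code_point, name in jamo_of("김"):
    print(code_point, name)
```

Each jamo carries sound only. The meaning comes from whichever hanja the name maps to, which is why two identically spelled hangul names can correspond to different hanja.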
Foreign Names in Korean
In Korean, foreign names are simply transliterated phonetically and written in Hangul. What about names in kanji or hanzi?
Japanese names come in the greatest variety, as the language has three scripts. Kanji are Chinese characters, borrowed probably around the 4th century, from which the Japanese created hiragana and katakana, whose characters have no inherent meaning and simply represent sounds. Kanji usually have at least two different readings, depending on the word they appear in. Names can be written in any of the three scripts, but each person has ONE official spelling of their name, and the different ways of writing the same-sounding name are not interchangeable, just as “Cyndi Lauper” is not the same as “Cindy Lawper” in English. The name “Yoko” can be written with several different kanji, as there are also many homonyms in Japanese.
|Family name||Given name|
|Kanji, Hiragana and Kanji||菅野||よう子|
Foreign Names in Japanese
Generally speaking, all foreign names are written phonetically in Japanese using the katakana script. For Korean names in hanja and Chinese names in hanzi, the hanja or hanzi name may also appear next to the katakana, but it is rare to see either without the katakana as well.