China's national language authority has published a new Chinese Coded Character Set that allows more than 17,000 rare Chinese characters to be recognized by computers, China Central Television reported on Sunday. The expansion, a major step forward for both computing and the protection of Chinese characters, brings the total number of characters in the set to 88,115.
The new set will be put into use nationwide from August 1 and will be used at institutions such as hospitals and police stations to make it easier to input the rare Chinese characters found in some people's names. The digitization of these characters is also conducive to preserving traditional culture.
"It is great progress," Zhang Ji, a computer science insider, told the Global Times.
"The American Standard Code for Information Interchange, which is widely known and covers the 26 English letters, is too limited to represent Chinese characters on computers," he said.
"Now China is using a different standard code named GB18030, which allows both simplified and traditional Chinese characters to be typed into computers and mobile phones."
The encoding of a character relies on the work of the International Organization for Standardization (ISO), which maintains the ISO/IEC 10646 standard. Zhang explained that the process for coding these rare characters is very complicated.
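To make the contrast between ASCII and GB18030 concrete, here is a minimal Python sketch. It relies on Python's built-in "ascii" and "gb18030" codecs, which may not reflect the newest revision of the standard. The helper name encoded_length and the two sample characters are illustrative assumptions and do not come from the new character set itself.

```python
from typing import Optional


def encoded_length(ch: str, codec: str) -> Optional[int]:
    """Return how many bytes `ch` occupies in `codec`, or None if it cannot be encoded."""
    try:
        return len(ch.encode(codec))
    except UnicodeEncodeError:
        return None


# Sample characters chosen only for illustration (assumptions, not from the article).
common = "\u4e2d"      # a very common Chinese character
rare = "\U00020000"    # a character from the CJK Extension B block

for label, ch in (("common", common), ("rare", rare)):
    ascii_len = encoded_length(ch, "ascii")
    gb_len = encoded_length(ch, "gb18030")
    print(f"{label}: ascii={ascii_len}, gb18030={gb_len}")
    # GB18030 should round-trip the character without loss.
    assert ch.encode("gb18030").decode("gb18030") == ch
```

In this sketch ASCII cannot encode either character, while GB18030 stores the common one in two bytes and the Extension B one in four. That variable-length design is what leaves room for tens of thousands of characters.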
"Every year, about 1,000 Chinese characters are submitted by China. The ISO , which is responsible for encoding Chinese characters, only meets twice a year to make decisions. As a result, it basically takes four to five years for Chinese characters to be recognized by the organization."
There is no specific definition of what counts as a "rare" Chinese character, yet experts say that any character with "extremely low usage" that many people cannot recognize can be classified as "rare."
"A large number of characters have appeared and been lost over the thousands of years of history in China. As some of them are no longer used, they have become what we call rare characters," Tan Jingchun, a language researcher at the Chinese Academy of Social Sciences, noted.
Public data from April shows that there were about 60 million people in the country whose names contain rare Chinese characters.
Linguist Tan Ruwei once noted that many people choose uncommon characters when naming their children in order to avoid sharing a name with others. Most of these characters have beautiful meanings but can only be found in ancient poems rather than in daily use.
While this may make for interesting names, it can also create some issues.
A hospital employee named Wang Zhe told the Global Times on Monday that his name has caused him a lot of inconvenience.
His given name Zhe means "sagacious" and "philosophical." Yet the character was not in any computer database, so no software could recognize it.
"For example, when I went to the public security office to update my ID card, it used to be a complicated matter for the staff to recognize my name," Wang recalled.
"Not to mention what a nuisance it was every time I wanted to type my name online."
His situation only improved a few years ago when one day he found that the character in his name could finally be recognized by computers.
The August version of the set covers most of the rare characters used in the names of people and places in the country, as well as the characters used in professional fields such as literature, science and technology.
"Just as researchers learned about the society of the Shang Dynasty (c. 1600 BC-1046 BC) by decoding the characters in the Oracle Bone Inscriptions, these rare Chinese characters also need to be protected to allow future study. They are a reflection of the past," Liu Yongge, director of the Key Laboratory of Oracle Information Processing Department at Anyang Normal University, noted.