About Chinese Characters (Hanzi)

Chinese characters developed from a pictographic writing system in use more than 4,000 years ago. The earliest uncontroversial examples are the so-called "oracle bone inscriptions" of the Shang Dynasty period (most of the 2nd millennium B.C.E.). These consist of elaborate carvings on bones (believed to have been used in divination) that resemble modern characters in many ways. Between three and four thousand of these characters have been identified, of which perhaps a quarter have been conclusively deciphered. Even at this stage, oracle bone writing had already moved beyond the merely pictographic: it expressed abstract relations and made use of sound correspondences.

Scholars have discerned five primary methods by which these characters were constructed, and in fact the modern characters in use today may be divided into these same classes (1). It is important to remember, however, that this is a post-hoc analysis -- no one knows for sure how most individual characters developed.

Character Construction

Pictographs are the foundation of the whole system, although they make up only about 4% of modern characters. They originated as pictorial representations of objects -- e.g., sun, knife, tree. Over the centuries, these representations were streamlined and abstracted somewhat to make them easier to write. The figure depicts the evolution of four of the pictographic characters (moon, person, eye, mountain) through the first five of the developmental stages outlined below.

Indicative characters (about 1%) are slightly more abstracted pictorial representations. For example, the character for "above" consists of a small line above a large one, abstractly representing one object above another. Often these characters are embellishments on pictographic characters, such as the character for "root" (a tree with an additional line below the middle level), or one version of "sweet" (a tongue with a dot in the middle indicating the spot where sweetness is tasted).

Associative characters (about 13%) are combinations of two or more pictographic or indicative characters. There are two main types. The first kind consists of repeated versions of one character, such as "follow", made up of two people, one behind the other, or "forest", made up of three trees. The second kind consists of `situational sketches', such as the character for "dawn", a sun over a horizon, or the character for "rest", made up of a person leaning on a tree.

The picto-phonetic method of construction was by far the most prolific, underlying around 80% of modern characters. It took advantage of the prevalence of homonymy in spoken Chinese to represent a syllable with two parts, each of them pictographic, indicative, or associative in origin. One part is the character for a different word that happens to have the same or a similar pronunciation as the word being written, while the other is a simple sign that clarifies which word with this pronunciation is meant. This clarifying component is usually called the radical, or `semantic component', while the other half is usually called the phonetic component. Although most of these constructions developed in periods when pronunciation was very different from today's, characters with the same phonetic component still have the same or similar pronunciations most of the time, owing to the tendency for sounds to change in parallel across different words.

The remaining 2% or so of modern characters originate from the wholesale co-opting of a character to express a word with the same pronunciation but a different meaning. This method differs from the picto-phonetic one in that it doesn't add a radical to indicate the new meaning.
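The approximate shares quoted above for the five construction methods can be tallied in a few lines. This is only an illustrative sketch: the percentages are the rough figures given in the text, while the dictionary name and the label "borrowed" (for the last, co-opted class) are assumptions made here for convenience, not established terms from the sources.

```python
# Approximate share (in percent) of modern characters per construction method,
# using the rough figures quoted in the text. The labels are illustrative;
# "borrowed" is a hypothetical name for the co-opted characters.
CONSTRUCTION_SHARES = {
    "pictographic": 4,
    "indicative": 1,
    "associative": 13,
    "picto-phonetic": 80,
    "borrowed": 2,
}


def total_share(shares):
    """Return the sum of all percentage shares."""
    return sum(shares.values())


def largest_class(shares):
    """Return the name of the construction method with the largest share."""
    return max(shares, key=shares.get)


if __name__ == "__main__":
    for name, pct in CONSTRUCTION_SHARES.items():
        print(f"{name:15s} {pct:3d}%")
    print("total:", total_share(CONSTRUCTION_SHARES), "%")
    print("largest class:", largest_class(CONSTRUCTION_SHARES))
```

Since the figures are all approximations, the fact that they add up to a round total should not be read as a precise count.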

Development of the System

Development subsequent to the oracle bone period consisted of several stages.

  • Bronze inscriptions: Many bronze works such as bells and cauldrons from the later Shang and early Zhou dynasties (1st millennium B.C.E.) were engraved with characters. Their forms, like those on the oracle bones, were highly variable -- the same character was often written in different ways in different places -- but unlike the oracle bone inscriptions, their size and orientation were regular.

  • Seal script: The standard style of writing during the later Zhou and early Qin dynasties (end of 1st millennium B.C.E.) was more regular in form (the same characters were nearly always written the same way), and the shapes of all the characters were made more squarish.

  • Official script: Paradoxically, the so-called `official script' was originally a style used by the common people that was easier to write than the seal script. It had straight lines where the seal script had curves, and simplified versions of radicals. During the later Qin and the Han dynasties (250 B.C.E. through 250 C.E.), the government gradually incorporated these modifications into the officially-sanctioned style. By the end of this period, the characters had forms very similar to those used today.

  • Regular script: The end result of that gradual development stabilized around 250 C.E., and the changes since then have consisted mainly of a little further cleanup and streamlining, along with the straightening of undulating strokes.

  • Simplification and standardization: During the 20th century, mainly after the Communist Revolution of 1949, the process of development of the regular script was continued further: alternative forms for the same character were eliminated, many shortened forms for characters in use among the common people were given the official nod, and a number of components were given less elaborate forms.

Simplified and Traditional Characters

The last set of changes has not been adopted by Taiwan. The reasons are no doubt partly political, but there is also a widespread belief that the simplification did not succeed in its avowed purpose of making the characters easier to learn. The chief reason is that the changes seem to reduce the natural systematicity of the characters somewhat, particularly as many of the modifications were applied inconsistently. There is also some sentiment that the traditional forms are more beautiful, which perhaps explains why they are still used in many circumstances on the mainland itself. Many disagree with this -- there is beauty in simplicity, and the traditional forms are often very cluttered in appearance. Also, the simplification may well have added more systematicity than it took away -- though we don't have statistics to support this. ;-)

Number of Characters

One of the most common questions about Chinese characters is how many there are. The answer depends on what you want to consider a character. If you mean all the characters that have ever been used during the `modern' period (regular script onwards), there are about 60,000. However, the vast majority of these are no longer used, and many were only ever used in very special circumstances, such as the name of one particular place or person. Most estimates place the total number of `viable' characters at 10,000 to 15,000. But in fact, modern printers are able to get by with between 7,000 and 9,000 type-pieces, even for specialized material, making 7,000 an approximate upper limit for complete literacy. A typical book contains about 2,500 to 3,000 different characters, and children are taught 2,500 in elementary school and an additional 1,000 in middle school. A person with a college education might know 5,000. Frequency counts show that knowing the 2,500 most frequent characters suffices to recognize 99% of the characters that one comes across (in newspapers and non-specialist literature).
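The kind of frequency count behind that last figure can be sketched as follows. This is a minimal illustration, not the method used in the cited studies: the function name `coverage` and the toy sample string are assumptions introduced here, and a real count would be run over a large corpus of newspapers and general literature rather than a ten-character string.

```python
from collections import Counter


def coverage(text, top_n):
    """Fraction of character occurrences in `text` that are accounted for
    by its `top_n` most frequent characters (whitespace is ignored)."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(top_n))
    return covered / total


if __name__ == "__main__":
    # Toy stand-in for a corpus: four distinct characters, ten occurrences.
    sample = "的的的的一一一是是不"
    for n in (1, 2, 3, 4):
        print(n, coverage(sample, n))
```

Applied to a large corpus with `top_n` set to 2,500, a function along these lines would yield the sort of coverage figure quoted above. The exact result depends on the corpus chosen.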


  1. Ramsey, S.R. (1987); The Languages of China; Princeton University Press, Princeton, New Jersey.
  2. Yin, Binyong, and Rohsenow, J.S. (1994); Modern Chinese Characters; Sinolingua, Beijing.