Chinese characters

From Wikipedia, the free encyclopedia

Script type: Logographic
Time period: c. 13th century BCE – present
Direction:
  • Left-to-right (modern)
  • Top-to-bottom, columns right-to-left (historical)[a]
ISO 15924: Hani (500), Han (Hanzi, Kanji, Hanja)
Unicode alias: Han
"Chinese character" written in traditional (left) and simplified (right) forms
Chinese name
  • Simplified Chinese: 汉字
  • Traditional Chinese: 漢字
  • Literal meaning: Han characters
Vietnamese name
  • Vietnamese alphabet: chữ Hán, chữ Nho, Hán tự
  • Hán-Nôm: 𡨸漢, 𡨸儒
  • Chữ Hán: 漢字
Thai name
  • Thai: อักษรจีน
Zhuang name
  • Zhuang: 𭨡倱,[1] Sawgun
Korean name
  • Hangul: 한자
  • Hanja: 漢字
Japanese name
  • Kanji: 漢字
  • Hiragana: かんじ
  • Katakana: カンジ
Khmer name
  • Khmer: តួអក្សរចិន

Chinese characters[b] are logographs used to write the Chinese languages and others from regions historically influenced by Chinese culture. Chinese characters have a documented history spanning over three millennia, representing one of the four independent inventions of writing accepted by scholars; of these, they comprise the only writing system continuously used since its invention. Over time, the function, style, and means of writing characters have evolved greatly. Informed by a long tradition of lexicography, modern states using Chinese characters have standardised their forms and pronunciations: broadly, simplified characters are used to write Chinese in mainland China, Singapore, and Malaysia, while traditional characters are used in Taiwan, Hong Kong, and Macau.

After being introduced to other countries in order to write Literary Chinese, characters were eventually adapted to write the local languages spoken throughout the Sinosphere. In Japanese, Korean, and Vietnamese, Chinese characters are known as kanji, hanja, and chữ Hán respectively. Each of these countries used existing characters to write both native and Sino-Xenic vocabulary, and created new characters for their own use. These languages each belong to separate language families, and generally function differently from Chinese. This has contributed to Chinese characters largely being replaced with alphabets in Korean and Vietnamese, leaving Japanese as the only major non-Chinese language still written with Chinese characters.

Unlike letters in alphabets, which correspond to a language's units of sound, called phonemes, Chinese characters correspond to morphemes, a language's smallest units of meaning. Writing systems that function this way are known as logographies. Morphemes in Chinese are usually a single syllable in length, but characters may represent morphemes comprising multiple syllables as well.[c] Chinese characters are not ideographs, as they correspond to the morphemes of a particular language rather than to abstract ideas themselves. Most characters are made of smaller components that may provide information regarding the character's meaning or pronunciation.

Development

Chinese characters are accepted as representing one of four independent inventions of writing in human history.[d] According to Qiu Xigui, in each instance writing evolved from a system using two distinct types of ideographs. Ideographs could either be pictographs visually depicting objects or concepts, or fixed signs representing concepts only by shared convention. These systems are classified as proto-writing, because the techniques they used were insufficient to carry the meaning of spoken language by themselves.[4]

Qiu notes various innovations that were required for Chinese characters to emerge from proto-writing. Firstly, pictographs became distinct from simple pictures in use and appearance: for example, the pictograph 大, meaning 'large', was originally a picture of a large man, but one would need to be aware of its specific meaning in order to interpret the sequence 大鹿 as signifying 'large deer', rather than being a picture of a large man and a deer next to one another. Due to this process of abstraction, as well as to make characters easier to write, pictographs gradually became more simplified and regularised, often to the extent that the original objects represented are no longer obvious.[5]

The limitations of this system compelled an innovation which allowed spoken language to be encoded directly in the written symbols.[6] In each historical case, this was accomplished by some form of the rebus technique, where the symbol for a word is used to indicate a different word with a similar pronunciation, depending on context. This allowed for words that lacked a plausible pictographic representation to be written down for the first time. This technique, called jiajie (假借) in Chinese, pre-empted more sophisticated methods of character creation that would further expand the lexicon. The process whereby writing emerged from proto-writing took place over a long period; when the purely pictorial use of symbols disappeared, leaving only those representing spoken words, the process was complete.[7]

Classification

Chinese characters have been used in several different writing systems throughout history. The concept of a writing system includes the written symbols that are used, called graphemes—these may include characters, numerals, or punctuation—as well as the rules by which the graphemes are used to record language.[8] Chinese characters are logographs, graphemes that denote words or morphemes of the language. Writing systems that use logographs are called logographies, as contrasted with alphabets and syllabaries, where graphemes correspond to the phonetic units in a language.[9] In special cases, characters may correspond to non-morphemic syllables; due to this, written Chinese is often characterised as morphosyllabic.[10][e]

The Sinosphere has a long tradition of lexicography attempting to explain and refine the use of characters; for most of history, analysis revolved around a model first popularised in the 2nd-century Shuowen Jiezi dictionary.[12] Newer models have since appeared, often attempting to describe the methods by which characters were created, the characteristics of their structures, and the way they presently function.[13]

Structural analysis

Most characters can be analysed structurally as compounds made of smaller components (偏旁; piānpáng), which may have their own functions. Phonetic components provide a hint to a character's pronunciation, and semantic components indicate some element of the character's meaning. Components that serve neither function may be classified as forms with no particular meaning, other than their presence distinguishing one character from another.[14]

A straightforward structural classification scheme may consist of three pure classes of semantographs, phonographs, and signs, having only semantic, phonetic, and form components respectively, as well as four classes corresponding to each possible combination of the three component types.[15] According to Yang Runlu, of the 3,500 characters used frequently in Standard Chinese, pure semantographs are the rarest, accounting for about 5% of the lexicon, followed by pure signs with 18%, and semantic–form and phonetic–form compounds together accounting for 19%. The remaining 58% are phono-semantic compounds.[16]
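The class count in this scheme follows from taking every non-empty combination of the three component types. The following is a minimal Python sketch of that arithmetic; the names COMPONENT_TYPES and structural_classes are illustrative and do not come from the cited sources. It also checks that the four shares quoted from Yang Runlu add up to the whole set of 3,500 characters.

```python
from itertools import combinations

# The three component types named in the text.
COMPONENT_TYPES = ("semantic", "phonetic", "form")

def structural_classes():
    """Return every non-empty combination of component types."""
    classes = []
    for size in range(1, len(COMPONENT_TYPES) + 1):
        classes.extend(combinations(COMPONENT_TYPES, size))
    return classes

classes = structural_classes()
pure = [c for c in classes if len(c) == 1]   # semantographs, phonographs, signs
mixed = [c for c in classes if len(c) > 1]   # compounds of two or three types

# Shares quoted in the text for the 3,500 frequent characters (percent).
shares = {
    "pure semantographs": 5,
    "pure signs": 18,
    "semantic-form and phonetic-form compounds": 19,
    "phono-semantic compounds": 58,
}

print(len(pure), len(mixed), len(classes))
print(sum(shares.values()))
```

The enumeration yields three pure classes and four mixed ones, and the quoted percentages sum to 100.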

Qiu presents "three principles" of character formation, with semantographs describing all characters whose forms are wholly related to their meaning, regardless of the method by which the meaning was originally depicted, phonographs that include a phonetic component, and loangraphs encompassing existing characters that have been borrowed to write other words. He also acknowledges the existence of character classes that fall outside of these principles, such as pure signs.[17]

Semantographs

Pictographs

Graphical evolution of the pictographs 日 ('Sun'), 山 ('mountain'), and 象 ('elephant')

While relatively few in number, most of the earliest characters originated as pictographs, representational pictures of physical objects.[18] In practice, their forms have become regularised and simplified after centuries of iteration in order to make them easier to write. Examples include 日 ('Sun'), 月 ('Moon'), and 木 ('tree').[A]

As character forms developed, distinct depictions of various physical objects within pictographs became reduced to instances of a single written component.[19] As such, what a pictograph is depicting is often not immediately evident, and may be considered as a pure sign without regard for its origin in picture-writing. However, if a character's use in compounds, such as 日 in 晴 ('clear sky'), still reflects its meaning and is not phonetic or arbitrary, it can still be considered as a semantic component.[20]

Due to the regularisation of character forms, individualised components may form part of a compound pictograph: for example, within a given character the component 口 'MOUTH' often carries a meaning related to mouths, but within 高 ('tall'), a pictograph of a tall building, it instead depicts a window, ultimately lending to the character's meaning of 'tallness'. In another instance, the same 'MOUTH' radical depicts the lip of a vessel in the modern form of the pictograph meaning 'full'.[B]

Pictographs have often been extended from their original concrete meanings to take on additional layers of metaphor and synecdoche, which sometimes even displace the pictograph's original meaning. Over time, this process sometimes creates excess ambiguity between different senses of a character, which is then usually resolved by adding additional components to create new characters used for specific senses. This can result in new pictographs, but usually results in other character types.[21]

Indicatives

Also called simple ideographs, characters in this small category represent abstract concepts that lack concrete physical forms, but nonetheless can be depicted visually in an intuitive way. Examples include 上 ('up') and 下 ('down'); these characters originally had forms consisting of dots placed above and below a line, which later evolved into their present forms, which have less potential for graphical ambiguity in context.[22] More complex indicatives include 凸 ('convex'), 凹 ('concave'), and 平 ('flat and level').[23]

Compound ideographs

Also referred to as logical aggregates, associative idea characters, or syssemantographs, characters in this class are formed by combining two or more pictographs or ideographs to suggest a new, synthetic meaning. A canonical example is 明 ('bright'), interpreted as the juxtaposition of the two brightest objects in the sky: 日 ('Sun') and 月 ('Moon'), together expressing their shared quality of brightness. Though the historicity of this particular etymology has been contested in recent scholarship, it is definitively a canonical reading: for example, the common compound word 明白 means 'understanding', touching on the derived association of 明 with 'illumination'. The addition of the abbreviated 艹 'GRASS' radical on top results in the compound ideograph 萌 ('to sprout'), alluding to the heliotropic behaviour of plant life. Other commonly cited examples include 休 ('rest'), composed of the pictographs 亻 'MAN' and 木 'TREE', and 好 ('good'), composed of 女 'WOMAN' and 子 'CHILD'.[C]

The compound character 好 illustrated as its component characters 女 and 子 positioned side by side

Many traditional examples of compound ideographs are now believed to have actually originated as phono-semantic compounds, made obscure by subsequent changes in form.[24] Peter A. Boodberg and William Boltz go so far as to deny that any compound ideographs were devised in antiquity, maintaining that "secondary readings" that are now lost are responsible for the apparent absence of phonetic indicators,[25] but their arguments have been rejected by other scholars.[26]

An example of a modern compound ideograph used in written Chinese is 砼 ('concrete'), which combines the 人 'MAN', 工 'WORK', and 石 'STONE' radicals.[D] Compound ideographs are common in kokuji, characters originally coined in Japan.[27]

Phonographs

Phono-semantic compounds

These characters are composed of at least one semantic component and one phonetic component.[28] They may be formed by one of several methods, often by adding a phonetic component to disambiguate a loangraph, or by adding a semantic component to represent an extended sense of the original character. A compound's phonetic component may also have been selected to indicate an additional layer of meaning for the character as a whole. As a result, determining whether a given character is a phono-semantic compound or an ideographic compound is often non-trivial.[29]

Examples of phono-semantic compounds include 河 (hé; 'river'), 湖 (hú; 'lake'), 流 (liú; 'stream'), 沖 (chōng; 'surge'), and 滑 (huá; 'slippery'). On the left-hand side, each of these characters has three short strokes: 氵, a reduced form of the 水 'WATER' radical. This indicates to the reader that the meaning of each character is related to the concept of "water". The remainder of each character is the phonetic component: 湖 (hú) is pronounced identically to 胡 (hú) in Standard Chinese, 河 (hé) is pronounced similarly to 可 (kě), and 沖 (chōng) is pronounced similarly to 中 (zhōng).[f]

While the phonetic components within some compounds do precisely relate the pronunciation, most only provide an approximation, even before the emergence of any later sound changes. Some may only share the initial or final sounds of their phonetic components.[32] The table below lists characters that each use 也 for their phonetic part, save the final one, which uses a previous character in the list; it is apparent that none of them share its modern pronunciation. The Old Chinese pronunciation of 也 has been reconstructed by Baxter and Sagart (2014) as /*lAjʔ/, similar to that for each compound.[33] The table illustrates the sound changes that have taken place since the Shang and Zhou dynasties, when most of the characters in question entered the lexicon. For a modern reader, the resulting drift is such that the phonetic component no longer provides any hint as to each character's pronunciation.[34]

Phono-semantic compounds sharing the phonetic component 也
Char. | Gloss | Components (sem.; phon.) | OC[α] | MC[β] | Mandarin[γ] | Cantonese[γ] | Japanese[γ]
也 | PTC[g] | | /*lAjʔ/ | yaeX | yě [jè] | jaa5 [jaː˩˧] | ya [ja̠]
池 | 'pool' | 氵 'water'; 也 /*lAjʔ/ | /*Cə.lraj/ | drje | chí [ʈʂʰǐ] | ci4 [tsʰiː˩] | chi [tɕi]
馳 | 'gallop' | 馬 'horse' | /*[l]raj/ | drje | chí [ʈʂʰǐ] | ci4 [tsʰiː˩] | chi [tɕi]
弛 | 'loosen' | 弓 'bow' | /*l̥ajʔ/ | syeX | chí [ʈʂʰǐ], shǐ [ʂì] | ci4 [tsʰiː˩] | chi [tɕi], shi [ɕi]
施 | 'set up' | 㫃 'flag' | /*l̥aj/ | sye | shī [ʂí] | si1 [siː˥] | se [se̞], shi [ɕi]
地 | 'ground' | 土 'earth' | /*[l]ˤej-s/ | dijH | dì [tî] | dei6 [tei˨] | ji [dʑi], chi [tɕi]
他 | 3-PR | 𠂉; 亻 'person' | /*l̥ˤaj/ | tha | tā [tʰá] | taa1 [tʰaː˥] | ta [ta̠]
她 | 3-PR-F | 女 'female' | [h] | [h] | tā [tʰá] | taa1 [tʰaː˥] | ta [ta̠]
拖 | 'drag' | 扌 'hand'; 他 /*l̥ˤaj/ | /*l̥ˤaj/ | thaH | tuō [tʰwó] | to1 [tʰɔː˥] | ta [ta̠], da [da̠]

This method is still used to form new characters: for example, 钚 (bù; 'plutonium') is the 金 'GOLD' radical plus the phonetic 不 (bù), described in Chinese as "不 gives sound, 金 gives meaning". Many Chinese names for chemical elements and other characters related to chemistry were formed in this way.[35]

Loangraphs

The phenomenon of an existing character being adapted to write another word with a similar pronunciation was necessary to the emergence of the Chinese writing system, and it has remained common in the writing system ever since. Some loangraphs may represent words that have never been written another way; this is often the case with abstract grammatical particles such as 之 and 其, but this is not always so.[36]

Loangraphs are also used to write words borrowed from other languages, such as the Buddhist terminology introduced to China in antiquity, as well as contemporary non-Chinese words and names. For example, in the name 罗马尼亚; 羅馬尼亞 (Luómǎníyà; 'Romania'), each character is commonly used as a loangraph for its respective syllable. However, the barrier between a character's pronunciation and meaning is never total: when transcribing into Chinese, loangraphs are often chosen deliberately so as to create certain connotations. This is regularly done with corporate brand names: for example, Coca-Cola's Chinese name is 可口可乐; 可口可樂 (Kěkǒu Kělè; 'the mouth can be happy'), with the loangraphs selected so as to carry a plausible meaning of "delicious and enjoyable".[37]

Signs

Some characters and components are merely signs, whose meaning derives purely from their having a fixed, distinctive form. Basic examples of pure signs are found with the numerals beyond four, e.g. 五 ('five') and 八 ('eight'), whose forms do not give visual hints to the quantities they represent.[38]

Traditional Shuowen Jiezi classification

The Shuowen Jiezi is a character dictionary authored by the scholar Xu Shen c. 100 CE. In its postface, Xu analyses what he sees as all the methods by which characters are created, introducing a categorisation scheme which would later become known as the liùshū (六書; 六书; 'six writings'). Mature formulations of this scheme stated that every character belonged to one of six categories, each mentioned with varying emphasis in the Shuowen Jiezi. For nearly two millennia afterwards, this framework would serve as the traditional lens through which characters were analysed throughout the Sinosphere.[39] Xu based most of his analysis on examples of Qin seal script that were written down several centuries before his time—these were usually the oldest forms available to him, but Xu stated that he was aware of the existence of even older forms.[40]

Modern scholars agree that the theory presented in the Shuowen Jiezi is problematic, failing to fully capture the nature of Chinese writing, both in the present and at the time Xu was writing.[41][42] Traditional Chinese lexicography as embodied in the Shuowen Jiezi presupposes either a phonetic or semantic purpose for every character component, providing implausible etymologies for characters later accepted as being pure signs.[43][44] However, the model has proven resilient, and it continues to serve as a guide for students in the process of memorising characters. One of the most important innovations contained in the Shuowen Jiezi is its grouping of characters under a component considered to be of particular structural importance, called a radical. Over 500 radicals are recognised within the Shuowen Jiezi; while this number would be reduced substantially in future dictionaries, the underlying concept would remain ubiquitous.[45]

History

Comparison of the abstraction of pictographs over time in cuneiform, Egyptian hieroglyphs, and Chinese characters

According to Qiu Xigui, the broadest trend in the evolution of Chinese characters over their history has been simplification, both in graphical shape (字形; zìxíng), the "external appearances of individual graphs", and in graphical form (字体; 字體; zìtǐ), "overall changes in the distinguishing features of graphic[al] shape and calligraphic style, [...] in most cases refer[ring] to rather obvious and rather substantial changes".[46]

Traditional invention narrative

Several works of Classical Chinese literature indicate that knotted cords were used to keep records prior to the invention of writing.[47][48] Works that reference the practice include chapter 80 of the Tao Te Ching[49] and the "Xici II" chapter within the I Ching.[50]

According to tradition, Chinese characters were invented during the 3rd millennium BCE by Cangjie, a scribe of the legendary Yellow Emperor. Cangjie is said to have invented symbols called 字 (zì) due to his frustration with the limitations of knotting, taking inspiration from his study of animals, landscapes, and the stars in the sky. On the day that these first characters were created, grain rained down from the sky; that night, the people heard the wailing of ghosts and demons, lamenting that humans could no longer be cheated.[51]

Neolithic

A series of inscribed graphs and pictures have been discovered at Neolithic sites in China, including Jiahu (c. 6500 BCE), Dadiwan and Damaidi from the 6th millennium BCE, and Banpo from the 5th millennium BCE. The marks at these sites appear one at a time, and do not seem to imply any greater context. As such, Qiu concludes that "we do not have any basis for stating that these constituted writing nor is there reason to conclude that they were ancestral to Shang dynasty Chinese characters."[52] However, they do demonstrate sign use in the Yellow River valley from the Neolithic through to the Shang period.[53]

Oracle bone script

Ox scapula inscribed with characters recording the result of divinations

The earliest attested Chinese writing comprises a body of inscriptions made during the late Shang dynasty (c. 1250 – 1050 BCE), with the very earliest examples dated to c. 1250–1200 BCE.[54][55][56][57] Many of these inscriptions were made on oracle bones—usually either ox scapulae or turtle shells—and recorded official divinations carried out by the Shang royal house. Contemporaneous inscriptions in a related but distinct style were also made on ritual bronze vessels. This oracle bone script was first documented in 1899, after specimens being sold as "dragon bones" for medicinal purposes were discovered, with the symbols carved into them identified as early Chinese character forms. By 1928, the source of the bones had been traced to a village near Anyang in Henan, which was excavated by the Academia Sinica between 1928 and 1937. To date, over 150,000 oracle bone fragments have been found.[58]

Oracle bone inscriptions recorded divinations undertaken to communicate with the spirits of royal ancestors.[58] The inscriptions range from a few characters in length at their shortest, to around 40 characters at their longest. The Shang king would communicate with his ancestors by means of scapulimancy, inquiring about subjects such as the royal family, military success, and weather forecasting. The interpreted answers would be recorded on the divination material itself.[58]

Oracle bone script is the direct ancestor of later forms of written Chinese. The oldest known inscriptions already represent a well-developed writing system,[59][60] which suggests an initial emergence predating the late second millennium BCE. Although written Chinese is first attested in official divinations, it is widely believed that writing was also used for other purposes during the Shang, but that the media used in other contexts—likely bamboo and wooden slips—were less durable than bronzes or oracle bones, and have not been preserved.[61]

Zhou scripts

The Shi Qiang pan, a bronze ritual basin dated c. 900 BCE. Long inscriptions on the surface describe the deeds and virtues of the first seven Zhou kings.

The traditional notion of an orderly procession of scripts, with each suddenly invented and displacing the one previous, has been conclusively superseded by modern archaeological finds and scholarly research. More often, two or more scripts coexisted in a given area, and scripts evolved gradually. As early as the Shang, the oracle bone script existed as a simplified form alongside another used in bamboo books, as well as elaborate pictorial forms often used for clan emblems. These other forms have been preserved in bronze inscriptions.[62]

Study of these bronze inscriptions has revealed that the mainstream script underwent slow, gradual evolution during the Shang and Zhou dynasties, until assuming the form that is now known as small seal script within the state of Qin.[63][64] Other scripts in use during the late Zhou include the bird-worm seal script, as well as the regional forms used in non-Qin states. Examples of these styles were preserved as variants in the Shuowen Jiezi.[65]

Qin unification and small seal script

Following the Qin's conquest of the other Chinese states and the founding of the imperial Qin dynasty in 221 BCE, the Qin small seal script was standardised for use throughout the entire country. However, more than one script was in use at the time. A little-known, rectilinear, "vulgar" form had also been used in Qin for centuries prior to their conquest of China. The popularity of this form grew as the practice of writing became more widespread. By the Warring States period, an immature form of clerical script had emerged in Qin, often called "early clerical" or "proto-clerical". It was based on the vulgar form, and influenced by seal script. The coexistence of proto-clerical and seal script runs counter to the traditional belief that only the latter was used by the Qin, with clerical script being suddenly invented during the early Han.[66]

Han clerical script

The proto-clerical script matured gradually, and by the early Han period its sophistication was comparable to small seal script.[67] Recently discovered bamboo slips show the emergence of mature clerical script by the end of Emperor Wu of Han's reign (141–87 BCE).[68]

As in previous eras, multiple scripts were in use during the Han, although mature clerical script, also called 八分 (bāfēn),[69] was dominant. An early type of cursive script was also in use as early as 24 BCE,[i] incorporating cursive forms popular at the time, as well as elements from the vulgar writing that originated in the state of Qin. By the time of the Jin dynasty, this Han cursive style had become known as 章草 (zhāngcǎo), sometimes known in English as 'clerical cursive', 'ancient cursive', or 'draft cursive'. Some believe this name, which uses the character 章 ('orderly'), arose because the style was considered by the Jin to be a more orderly form than what would become the modern form of cursive, called 今草 (jīncǎo), which had first emerged during the Jin and has remained in use since.[70]

Neo-clerical

Around the midpoint of the Eastern Han, a simplified and easier form of clerical script appeared, which Qiu terms 'neo-clerical' (新隶体; 新隸體; xīnlìtǐ).[71] By the end of the Han, this had become the dominant script used by scribes, though clerical script remained in use for formal works, such as engraved stelae. Qiu describes neo-clerical as a transitional form between clerical and regular script, remaining in use through the Three Kingdoms period and into the Jin dynasty.[72]

Semi-cursive

By the late Han, an early form of semi-cursive script[71] had begun developing from a cursive form of neo-clerical script.[j] This semi-cursive script was traditionally attributed to Liu Desheng (劉德升; c. 147 – 188 CE), although such attributions refer to early masters of a script rather than to their actual inventors, since the scripts generally evolved into being over time. Qiu provides examples of early semi-cursive script, lending credence to its having popular origins, rather than being solely Liu's invention.[73]

Regular script

A page from a printed Song publication in a regular script typeface, which resembles the handwriting of Tang-era calligrapher Ouyang Xun

The innovations of regular script have traditionally been credited to Cao Wei calligrapher Zhong Yao (c. 151 – 230), often called the "father of regular script". The earliest surviving manuscripts written in regular script are copies of Zhong Yao's work, including at least one copied by Wang Xizhi (303–361), often called the "Sage of Calligraphy". Regular script developed out of a neatly written form of early semi-cursive, with the addition of a 'pause' (頓; dùn) technique to end horizontal strokes, plus heavy tails on strokes written on the downward-right diagonal. Thus, early regular script emerged from a neat, formal form of semi-cursive, which had itself emerged from neo-clerical, a simplified, convenient form of clerical script. It developed further during the Eastern Jin (317–420) in the hands of Wang Xizhi and his son Wang Xianzhi (344–386). However, its use was still not widespread, with most writers continuing to use the neo-clerical and semi-cursive styles for their daily writing. The modern cursive script began to emerge during this time, influenced by both semi-cursive and regular script and exemplified by calligraphers such as Wang. It was not until the Northern and Southern dynasties period (420–589) that regular script became the predominant form.[74]

Structure

Structural templates used in compounds, with red marking possible positions for radicals

Broadly, Chinese characters are rectilinear units of uniform width. Within the square allotted to each character, most are constructed from smaller components, which are in turn drawn with a series of strokes.[75] Strokes can be considered both the basic unit of handwriting and the basic unit of graphemic organisation within the system. Individual strokes are generally categorised according to technique and graphemic function, as exemplified by the Eight Principles of Yong. In the transition from seal to clerical script, many formerly bespoke, interlinked character components became discrete and regularised.[76]

Characters are assembled according to predictable visual patterns, with some components usually not seen in certain positions within a character, and some taking distinct, visually congruous forms only when in a certain position, such as the 刀 'KNIFE' radical appearing as 刂 on the right side of characters, but as ⺈ at the top of characters. Both the order in which strokes are drawn within a given component and the order in which components are written within a character are largely fixed.[77] This is summed up in practice with a few rules of thumb: generally, components and characters are assembled from left to right and from top to bottom, with 'enclosing' components started before, then closed after, the components they enclose.[78]

For example, is made up of two components, with each in turn composed of three strokes, drawn in the following order:

Character Component Stroke
(1)
(2)
(3)
(4)
(5)
(6)

Variants and allographs

Variants of the Chinese character for 'turtle', collected c. 1800 from printed sources. The traditional form 龜 (left) is used in Taiwan and Hong Kong. The simplified form 龟 is used in China, and the simplified form 亀 is used in Japan. A few of the above forms more closely resemble the modern simplified form of the character 电 ('lightning').

Over a character's history, graphical variants called allographs emerge via several processes while retaining the semantics of previous forms. This is comparable to the visually distinct double-storey ⟨a⟩ and single-storey ⟨ɑ⟩ forms equally representing the Latin letter A. In Chinese, character variants also emerge to facilitate ease of handwriting or for aesthetic reasons, but also to create a more 'correct' composition to the writer according to the received principles of character formation.[79] For example, individual components may be replaced with visually-, phonetically-, or semantically similar alternatives.[80] The boundary between character structure and style, and thus between allographs of the same character versus semantically distinct characters, is often non-trivial or unclear.[81]

Methods and styles

Ordinary handwriting on a lunch menu in Hong Kong

There are numerous styles, or "scripts" (書; 书; shū), in which characters can be written. Most that are used throughout the Sinosphere originated within China, but may have minor regional variations. Styles created outside China tend to remain localised in their use; these include the Japanese edomoji and the Vietnamese lệnh thư script.[82]

Calligraphy

Chinese calligraphy of mixed styles by Song poet Mi Fu

Calligraphy was considered one of the four arts to be mastered by Chinese scholars. It is usually done with an ink brush, following a deliberately minimalist set of rules. Strict regularity is not required, since strokes may be accentuated for dramatic effect or individual style. Calligraphy was considered an artful means by which scholars could express their thoughts and teachings.[83]

Printing and typefaces[edit]

There are four broad classes of typefaces for Chinese characters:[84]

  • Song typefaces, also called "Ming"—with "Song" generally used with simplified Chinese forms, and "Ming" with other forms—broadly correspond to Western serif styles. Broadly, Song typefaces are in the tradition of historical Chinese print; both names refer to eras where woodblock printing is considered to have flourished in the Sinosphere. While most typefaces during the Song dynasty (960–1279) resembled the regular script style of a particular calligrapher, most modern Song typefaces are designed for general purpose use, and with an emphasis on neutrality.
  • Sans-serif typefaces, called "black form" in Chinese and "Gothic" (ゴシック体) in Japanese, are characterised by simple lines of even thickness for each stroke, akin to sans-serif styles in Western typography.
  • "Kai" typefaces directly imitate handwritten regular script.
  • Fangsong or "Imitation Song" typefaces, also called merely "Song" in Japan, comprise semi-script styles in the Western paradigm.

Use with computers[edit]

The first four characters of the Thousand Character Classic in different typefaces and historical styles. From right to left: seal script, clerical script, regular script, Ming, and sans-serif

Even before the advent of computers, the earliest electro-mechanical input/output devices and text encoding methods were designed for alphabet-based writing systems, as exemplified by typewriters, Morse code, and the ASCII standard. Adapting these technologies for use with a logography of thousands of characters was non-trivial.[85]

Like English and other languages, Chinese characters are output on printers and screens in different fonts.[86] In addition to the international system of measuring with points, Chinese characters are also measured by a unit called zihao (字号), first invented for Chinese printing in 1859.[87]

Input methods[edit]

Predominantly, Chinese characters are input as strings of Latin characters, which enables the use of a standard keyboard. Phonetic encodings are usually based on existing transcription schemes, such as pinyin for Mandarin, and Jyutping for Cantonese. Writing a given character usually involves typing out its phonetic transcription, possibly followed by a number representing the tone: for example, 香港 ('Hong Kong') could be input as xiang1gang3 using pinyin, or as hoeng1gong2 using Jyutping.

Input codes for characters may also be based on their form. Using the existing rules of stroke order and how components are assembled into whole characters,[88] a character may be assigned a shorthand code that is more distinctive than its phonetic transcription using one of several methods, potentially increasing the speed of typing. Popular form-based encoding methods include Wubi on the mainland, and Cangjie (named after the mythological inventor of writing) in Taiwan and Hong Kong. For example, 疆 ('border') is encoded as NGMWM using the Cangjie method, with the letters corresponding to the components 弓土一田一 respectively, and with some components omitted according to predictable rules.[89]
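The letter-to-component correspondence in the NGMWM example can be sketched as a simple lookup. This is only an illustration: the table is a partial mapping containing just the four keys that the example uses, and the function name `decode_cangjie` is invented here rather than taken from any real input method's API.

```python
# Partial Cangjie key table: only the four keys needed for the NGMWM example.
# A full Cangjie layout assigns a component to each of 24 letter keys.
CANGJIE_KEYS = {"N": "弓", "G": "土", "M": "一", "W": "田"}


def decode_cangjie(code: str) -> str:
    """Return the component sequence named by a Cangjie letter code."""
    return "".join(CANGJIE_KEYS[letter] for letter in code.upper())


print(decode_cangjie("NGMWM"))  # 弓土一田一
```

A real input method works in the opposite direction as well, looking up the characters that match a typed code; that requires a full code table, which is not reproduced here.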

Contextual constraints may be used to improve candidate character selection. When tones are ignored, 大学 and 大雪 are both transcribed as daxue; the system may then prioritize which candidate should appear first based on the surrounding context.[90]
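The candidate-selection idea can be sketched in a few lines. Everything in this example is an assumption made for illustration: the tiny lexicon, the context hints, and the function names are hypothetical, and real input methods rely on much larger dictionaries and statistical language models.

```python
import re

# Toy lexicon: toneless pinyin -> candidate words (illustrative entries only).
LEXICON = {
    "daxue": ["大学", "大雪"],
    "xianggang": ["香港"],
}

# Hypothetical context hints: nearby words that make a candidate more likely.
CONTEXT_HINTS = {
    "大学": {"学生", "教授"},
    "大雪": {"天气", "冬天"},
}


def strip_tones(typed: str) -> str:
    """Drop tone digits, so that 'xiang1gang3' becomes 'xianggang'."""
    return re.sub(r"[1-5]", "", typed.lower())


def candidates(typed: str, context=()) -> list:
    """Return candidate words, those sharing more context hints first."""
    words = LEXICON.get(strip_tones(typed), [])
    seen = set(context)
    # sorted() is stable, so candidates with equal scores keep lexicon order.
    return sorted(words, key=lambda w: -len(CONTEXT_HINTS.get(w, set()) & seen))


print(candidates("xiang1gang3"))          # ['香港']
print(candidates("daxue", ["冬天"]))      # ['大雪', '大学']
```

Without any context the function simply returns the lexicon order, which stands in for the frequency-based default ordering a real system would use.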

Encoding and interchange[edit]

Text is represented digitally as a sequence of numbers called code points. The Unicode Standard is the predominant text encoding worldwide; according to the philosophy of the Unicode Consortium, each distinct graph is assigned a unique code point, while the choice of a particular allograph is left to the typeface used to display the text.[91] Unicode's Basic Multilingual Plane (BMP) represents the standard's 2^16 (65536) smallest code points.[92] Of these, 20992 (or 32%) are assigned to "CJK Unified Ideographs", a designation comprising characters used in each of the Chinese family of scripts. As of version 15.1, Unicode defines a total of 97680 Chinese characters.[93]
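The figures above can be checked with a short calculation. The sketch below assumes that the 20992 code points in question are the main "CJK Unified Ideographs" block, which spans U+4E00 to U+9FFF; the variable names are chosen here for illustration.

```python
# Each character in a Python string corresponds to one Unicode code point.
for ch in "汉字":
    print(ch, f"U+{ord(ch):04X}")  # 汉 U+6C49, then 字 U+5B57

# The Basic Multilingual Plane holds the 2**16 smallest code points.
BMP_SIZE = 2 ** 16

# Assumed range of the main "CJK Unified Ideographs" block.
CJK_START, CJK_END = 0x4E00, 0x9FFF
block_size = CJK_END - CJK_START + 1
share = block_size / BMP_SIZE

print(block_size)      # 20992
print(f"{share:.0%}")  # 32%
```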

Prior to Unicode, in 1980 the Chinese government released GB 2312 as its standard encoding, which included 6763 simplified characters. Following the widespread adoption of Unicode, GB 2312 was supplanted by GB 18030 in 2005, which remains the official standard encoding used in China. It includes both simplified and traditional forms, with code points corresponding one-to-one with the corresponding segments of The Unicode Standard.[94] CNS 11643 is Taiwan's official standard character encoding; its 1992 revision included 48027 characters; its most recent revision as of 2024 includes more than 96000.[95] Originally developed during the 1980s by a consortium of five Taiwanese IT companies, Big5 is the second-most popular Chinese character encoding behind Unicode, particularly in Taiwan, Hong Kong, and Macau. As of 2024, Big5 encodes 13053 characters.[96]
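The practical difference between these encodings is easiest to see by encoding the same word under each. The sketch below uses the codec names shipped with Python's standard library (`gb2312`, `gb18030`, `big5`, `utf-8`); the helper `can_encode` is a hypothetical name introduced for this example, and the behaviour shown is that of Python's codecs rather than a statement about any particular revision of the standards.

```python
def can_encode(text: str, codec: str) -> bool:
    """Report whether every character of text is representable in codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# Simplified and traditional spellings of "Chinese characters".
for word in ("汉字", "漢字"):
    for codec in ("gb2312", "gb18030", "big5", "utf-8"):
        if can_encode(word, codec):
            print(word, codec, word.encode(codec).hex())
        else:
            print(word, codec, "not representable")
```

The same word generally produces different byte sequences in each encoding, and some repertoires cannot represent it at all, which is why text must be decoded with the codec it was written in.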

Vocabulary and adaptation[edit]

Writing first emerged during a stage of development in the Chinese language known as Old Chinese. In most cases, each character corresponds to a morpheme that was originally an independent Old Chinese word.[97] However, in most modern varieties, many words are compounds of two or more morphemes, and are therefore written with several characters. In Japanese and Korean, morphemes are often multiple syllables, and as such single characters may represent several spoken syllables.[98]

Classical Chinese is the form of written Chinese attested in the classic works of Chinese literature from approximately the 5th century BCE to the 2nd century CE.[99] The language of the classics was imitated by later authors, and became entrenched as spoken language diverged throughout the country, establishing a form generally referred to as Literary Chinese. The use of Literary Chinese in the Sinosphere was loosely analogous to that of Latin in pre-modern Europe, and remained the predominant writing system until the 20th century. While not static over time, Literary Chinese retained many of the properties of spoken Old Chinese. With numerous sound mergers occurring in different varieties over time, polysyllabic words increasingly served to reduce ambiguity between words that had become homophonic.[100] It has been estimated that over two-thirds of the 3,000 most common words in modern Standard Chinese are polysyllabic, with the vast majority of these being two-syllable words.[101]

Chinese texts were read using different literary and colloquial readings across China, informed by the local spoken varieties. Moreover, Chinese characters were introduced to surrounding countries alongside the other facets of Chinese culture. The non-Sinitic-speaking elites of areas including Vietnam, Korea, Japan, and the Ryukyu Islands[102] each adopted writing for record-keeping, histories, and official communications, forming what is now called the Sinosphere.[103] The Japanese, Korean, and Vietnamese languages each belong to a different family from Chinese; the notion of special literary reading techniques was further extended to adapt Literary Chinese text for those who did not necessarily speak Chinese themselves. For example, when read aloud by a Japanese speaker, the vocabulary and syntax of a literary text are adapted to reflect those of Japanese. However, when writing, Japanese scribes would "reverse" this technique, retaining the Chinese structure and producing a fully normative Literary Chinese text. The resulting literary culture was less directly tied to a spoken language than those using phonetic scripts. This is exemplified by the cross-linguistic phenomenon of brushtalk, where mutual literacy allowed speakers of different languages to engage in face-to-face conversations.[104][105]

Following the introduction of Literary Chinese, people across the Sinosphere also began using characters to write local languages, though Literary Chinese remained predominant throughout the region until the modern era. In Japanese and Vietnamese, characters were used to write both the corresponding native vocabulary and loanwords with pronunciations borrowed from Chinese, referred to as Sino-Xenic pronunciations. Characters in these languages may have native pronunciations, Sino-Xenic pronunciations, or both. Some characters have multiple Sino-Xenic readings borrowed at different points in time from different varieties of Chinese.[106] The comparison of Sino-Xenic pronunciations has been useful in the reconstruction of Middle Chinese phonology. In Korea, hanja were usually used to write Sino-Korean vocabulary, though there is evidence vernacular Korean readings were sometimes used with texts.[107]

Chinese characters were used in Vietnam during the millennium of Chinese rule that began in 111 BCE; they were adapted to write Vietnamese c. the 13th century, creating the chữ Nôm script. Writing also arrived in Korea during the 2nd century BCE, alongside cultural elements such as Buddhism; the practice of writing in Korea became widespread over the following three centuries. From Korea, writing spread to Japan during the 5th century CE.[108] Currently, the only non-Chinese language normally written with Chinese characters is Japanese. Vietnam abandoned the use of chữ Nôm and Literary Chinese in the early 20th century in favour of a Latin alphabet, and Korea has largely replaced the use of hanja with hangul. Due to a decreased emphasis on Chinese character education in South Korea, the use of hanja is rapidly disappearing.[109]

Old Chinese[edit]

Line drawings of various ordinary objects such as books, baskets, buildings, and musical instruments are displayed beside their corresponding Chinese characters
Excerpt from a 1436 primer on Chinese characters

Words in Old Chinese were generally monosyllabic; as such, each character denoted an independent word.[110] Affixes could be added to form a new word, which was often written with the same single character. In many cases, the pronunciations then diverged due to the systematic sound changes caused by the affixes. For example, many additional readings in modern varieties reflect the Middle Chinese 'departing tone' (去聲; qùshēng), the major source of the 4th tone in modern Standard Chinese. Many scholars now believe that this Middle Chinese tone is the reflex of an Old Chinese derivational suffix /*-s/ that served a range of semantic functions, possibly the only example of inflectional morphology extant in the otherwise analytic language.[111][112] For example:

Character   OC[δ]     MC[β]     Modern   Gloss
傳[113]     *drjon    drjwen    chuán    'to transmit'
傳          *drjons   drjwenH   zhuàn    'a record'
磨[113]     *maj      ma        mó       'to grind'
磨          *majs     maH       mò       'grindstone'
宿[114]     *sjuk     sjuwk     sù       'to stay overnight'
宿          *sjuks    sjuwH     xiù      'celestial mansion'
說[115]     *hljot    sywet     shuō     'speak'
說          *hljots   sywejH    shuì     'exhort'

Another common sound change occurred between voiced and voiceless initials, though the phonemic voicing distinction has disappeared in most modern varieties. This is believed to reflect an Old Chinese de-transitivising prefix, but scholars disagree on whether the voiced or voiceless form reflects the original root. Each pair of examples below reflects two words of opposite transitivity.

Character   OC[δ]    MC[β]   Modern   Gloss
見[116]     *kens    kenH    jiàn     'to see'
見          *gens    henH    xiàn     'to appear'
敗[116]     *prats   pæjH    bài[k]   'to defeat'
敗          *brats   bæjH    bài      'to be defeated'
折[117]     *tjat    tsyet   zhé      'to bend'
折          *djat    dzyet   shé      'to be broken by bending'

Vernacular Chinese varieties[edit]

Multi-syllable words began entering the language during the Western Zhou period; it is estimated that between 25% and 30% of the vocabulary used in Warring States-era texts is polysyllabic. The process has accelerated over the centuries as phonetic change has increased the number of homophones.[118] The most common process of Chinese word formation after the Classical period has been to create compounds of existing words. Words have also been created by appending affixes to words, by reduplicating words, and by borrowing words from other languages.[119] While polysyllabic words are generally written with one character per syllable, abbreviations are occasionally used.[120]

In addition, there are a number of 'dialect characters' (方言字; fāngyánzì) that are not used in standard written vernacular Chinese (a form that largely corresponds to spoken Standard Chinese, in turn based on the Beijing dialect of Mandarin) but are found in other spoken varieties. The most complete example of an orthography based on a variety other than Standard Chinese is Written Cantonese. It is common to use standard characters to transcribe previously unwritten words in Chinese dialects when obvious cognates exist. When no obvious cognate exists due to factors like irregular sound changes, semantic drift, or an ultimate origin in a non-Chinese language substratum or loanword, characters are borrowed to transcribe the word, either ad hoc or according to the rebus principle.[121] These new characters are generally phono-semantic compounds, e.g. ('I', 'person'), although there are examples of compound ideographs, e.g. ('bad').[E] In Taiwan, there is also a body of semi-official characters used to represent Taiwanese Hokkien and Hakka. An example of a Hakka vernacular character is (cii11, 'kill').[F]

Japanese[edit]

In Japanese, the word meaning "Chinese characters" is rendered as kanji. Japanese historically borrowed many words from Chinese, which were written with their original characters, while native Japanese words were also written with orthographic borrowings of Chinese characters with similar meanings. Most kanji arrived via both borrowing processes, and thus have both native Japanese readings, known as kun'yomi, as well as Chinese-original readings known as on'yomi. Moreover, Chinese words were often borrowed multiple times from different varieties, resulting in several distinct on'yomi readings for the same character.[122]

The Japanese writing system has also incorporated scripts called kana to represent sounds rather than morphemes. Prior to the Heian era (794–1185), writers simply used certain kanji to represent their sound values instead, in a system known as man'yōgana. Starting in the 9th century, specific man'yōgana were graphically simplified to create two distinct syllabaries called hiragana and katakana, which slowly replaced the earlier convention. Modern Japanese retains the use of kanji to represent most word stems, while kana syllabograms are generally used for grammatical affixes, particles, and loanwords. The forms of hiragana and katakana are visually distinct from one another, owing in large part to different methods of simplification: katakana were derived from subcomponents taken from each man'yōgana, while hiragana were derived from the cursive forms of man'yōgana in their entirety. In addition, the hiragana and katakana for some moras were derived from different man'yōgana.[123]

Due to Japanese being a synthetic language, many kanji have multi-syllable readings. For example, the kanji 刀 has a native kun'yomi reading of katana. In different contexts, it can also be read with the on'yomi reading tō, such as in the Chinese loanword 日本刀 (nihontō; '[Japanese] sword'), with a pronunciation corresponding to that in Chinese at the time of borrowing. Prior to the universal adoption of katakana, loanwords were typically written with unrelated kanji with on'yomi readings matching the syllables in the loanword. These spellings are called ateji: for example, 亜米利加 was the ateji form for modern アメリカ (Amerika; 'America'). As opposed to man'yōgana used solely for their pronunciation, ateji still corresponded to specific Japanese words. Some are still in use: the official list of jōyō kanji includes 106 ateji readings.[124]

Korean[edit]

As early as the Gojoseon period, Literary Chinese was the dominant form of written communication in Korea. Although the hangul alphabet was invented by the Joseon king Sejong in 1443, it was not taken up by the Korean literati, and its use did not become widespread until the late 19th century.[125][126] Even today, much of the Korean vocabulary, especially in areas of science and sociology, comes directly from Chinese. However, due to the lack of tones in the Korean language, many dissimilar Sino-Korean words took on identical pronunciations, and as such are spelled identically in hangul.[127] For example, the phonetic dictionary entry for 기사 (gisa) yields more than 30 different entries. In the past, this ambiguity was efficiently resolved by parenthetically displaying the associated hanja. While hanja are sometimes used for Sino-Korean vocabulary, their use for native Korean words is rare.

When learning to write hanja, students are taught to memorise a native Korean word with the same meaning and the Sino-Korean pronunciation for each character.[128] Examples of listings include:

Hanja   Hangul (native translation)   Hangul (Sino-Korean)   Gloss
水      물, mul                        수, su                 'water'
人      사람, saram                    인, in                 'person'
大      큰, keun                       대, dae                'big'
小      작을, jakeul                   소, so                 'small'
下      아래, arae                     하, ha                 'down'
父      아비, abi                      부, bu                 'father'
韓      나라 이름, nara ireum          한, han                'Korea'

South Korea[edit]

Hanja are still used in South Korea, particularly in newspapers, weddings, place names, and the practice of calligraphy, although to nowhere near the extent of kanji use in Japanese society. At present, Chinese characters are sometimes used for the disambiguation of homophonous words. Additionally, their use still possesses connotations of erudition and cultural Confucianism; knowledge of Chinese characters is considered to be a high-class attribute by many Koreans, and an indispensable part of a classical education.[126] There is a clear trend toward the exclusive use of hangul in ordinary South Korean contexts.[129] The extent of hanja use has become a politically contentious issue in the country, with some seeing its total abandonment, including ending hanja education in schools, as a "purification" of the national language and culture. Others support returning to a level of ordinary hanja use previously seen during the 1970s and 80s.[130] Some hanja are still widely used alongside their hangul counterparts, such as the word for 'voice', with the hanja still being considered higher in register.[131]

Policies regarding the teaching of hanja have historically vacillated, often swayed by the inclinations of individual education ministers. Students in grades 7–12 are presently taught 1,800 characters,[130] albeit with a principal focus on simple recognition, with the aim of achieving newspaper literacy.[126] Hanja retains its prominence in Korean academia, as the vast majority of Korean documents, history, and literature—such as the Veritable Records of the Joseon Dynasty—were written in Literary Chinese. Therefore, a working knowledge of Chinese characters is still important for anyone wishing to interpret and study older Korean texts, or anyone who wishes to read scholarship in the humanities. Working knowledge of hanja is also useful for understanding the etymology of Sino-Korean vocabulary.[132]

North Korea[edit]

A 1949 law in North Korea apparently banned the use of all so-called foreign languages, which has been interpreted as including hanja. However, due to the country's isolation, accurate reports about its use of hanja are difficult to obtain. A textbook for university history departments published in the country in 1971 contained 3,323 distinct characters, and in the 1990s North Korean school children were still expected to learn 2,000 characters, more than in South Korea or Japan.[133] A 2013 textbook appears to integrate the use of hanja in secondary school education.[134] Currently, North Korea is estimated to teach around 3,000 hanja to students by the time they graduate university; in some cases, the characters appear within advertisements and newspapers, but cultural use is narrower than in the South, mostly restricted to dictionaries and textbooks.[135]

Vietnamese[edit]

The first two lines of the classic Vietnamese epic poem The Tale of Kieu, written in both chữ Nôm and the Vietnamese alphabet
  Borrowed characters representing Sino-Vietnamese words
  Borrowed characters representing native Vietnamese words
  Invented chữ Nôm representing native Vietnamese words

Until the early 20th century, Literary Chinese was used for all formal writing in Vietnam. However, the chữ Nôm script began to be developed around the 13th century to record folk literature in the Vietnamese language. Chinese characters, called chữ Hán (𡨸漢), chữ Nho (𡨸儒), or Hán tự (漢字), are rarely used in modern Vietnam; their use is often limited to traditional practices such as calligraphy.

The oldest written Chinese text found in Vietnam is an inscription dated to the year 618, erected by local Sui officials in Thanh Hóa.[136] Similar to Zhuang sawndip, some chữ Nôm characters were created by combining semantic character components with phonetic components that resembled Vietnamese syllables.[137] This process resulted in a highly complex system whose use was limited to a small portion of the Vietnamese population, never more than 5%.[138] The oldest chữ Nôm written alongside Chinese is a Buddhist inscription dated to 1209.[137] Before 1945, the library of the French School of the Far East (EFEO) in Hanoi collected a total of around 20,000 Chinese and Vietnamese epigraphy rubbings from throughout Indochina.[139] The oldest extant manuscript in Vietnamese is a late 15th-century bilingual copy of the Buddhist Sutra of Filial Piety, currently kept by the EFEO. It features Chinese text in larger characters, and an Old Vietnamese translation in smaller characters glossing the text.[140] Every Hán Nôm book in Vietnam after the Phật thuyết is dated between the 17th and the 20th centuries, with most being hand-copied works, and few printed texts. By 1987, the library of the Institute of Hán-Nôm Studies in Hanoi had collected a total of 4,808 Hán Nôm manuscripts.[141]

A page from a bilingual copy of the Sutra of Filial Piety, with Literary Chinese alongside an early form of chữ Nôm, representing the Old Vietnamese pronunciation. Sometimes, pairs of characters are used to represent the consonant clusters present in Old Vietnamese

Literary Chinese and chữ Nôm fell out of use during the French colonial period, and were gradually replaced with the Vietnamese alphabet, which uses Latin characters and remains the primary writing system for Vietnamese.[142][143]

Vietnamese imperial edict in Literary Chinese

Other languages[edit]

Several minority languages of South and Southwest China were formerly written with scripts that were based on Chinese characters but also included many locally created characters. The most extensive is the sawndip script used to write the Zhuang languages of Guangxi, which is still in use despite efforts to encourage the writing of Zhuang with a Latin-based alphabet. Other non-Sinitic languages of China written with Chinese characters include Miao, Yao, Bouyei, Mulam, Kam, Bai, and Hani.[144] All these languages are now officially written using Latin-based scripts. According to surveys, traditional sawndip script has twice as many users as the official Latin script.[145]

Dynasties founded by non-Han peoples that ruled northern China between the 10th and 13th centuries developed scripts that were inspired by Chinese characters but did not use them directly: the Khitan large script, Khitan small script, Tangut script, and Jurchen script—though Chinese characters were used to phonetically transcribe the language of the Jurchen people, renamed the 'Manchu' after the founding of the Qing dynasty. Nüshu was a script used by Yao women to write the Xiangnan Tuhua language,[146] and bopomofo is a semi-syllabary invented during the 20th century to phonetically represent Standard Chinese;[147] both use forms graphically derived from Chinese characters. Other scripts within China that have adapted some characters but are otherwise distinct include the Geba script, Sui script, Yi script, and the Lisu syllabary.[144]

Transcription[edit]

Excerpt from the Secret History of the Mongols featuring Chinese characters used to transcribe Mongolian, with glosses to the right of each row

In addition to Persian and Arabic scripts, the Mongolian language was also written with Chinese characters selected to represent the sounds of spoken Mongolian. This phonetic writing system using repurposed logographs was used in the only manuscripts of the Secret History of the Mongols that have survived from the medieval era.[148] According to the 19th-century missionary John Gulick:

The inhabitants of other Asiatic nations, who have had occasion to represent the words of their several languages by Chinese characters, have as a rule used unaspirated characters for the sounds g, d, b. The Muslims from Arabia and Persia have followed this method ... The Mongols, Manchu, and Japanese also constantly select unaspirated characters to represent the sounds g, d, b, and j of their languages. These surrounding Asiatic nations, in writing Chinese words in their own alphabets, have uniformly used g, d, b, etc., to represent the unaspirated sounds.[149]

Special cases[edit]

Contractions and abbreviations[edit]

Some compound words and set phrases have been represented by single-character contractions, often considered ligatures instead of characters representing a single morpheme. They are often used in handwriting or for decorative purposes, but are sometimes seen in print. They are called 合文 (héwén), 合书; 合書 (héshū) or 合体字; 合體字 (hétǐzì) in Chinese; in the special case where two characters are combined, they are known as 'two-syllable characters' (双音节汉字; 雙音節漢字; shuāngyīnjié hànzì). For the sake of standardisation, the Chinese government has sought to limit the use of polysyllabic characters in writing.[2] A popular example is the 'double happiness' character 囍, formed as a ligature of 喜喜 and referred to by its disyllabic name 双喜; 雙喜 (shuāngxǐ).[G] Numerals are also sometimes written as ligatures: for example, 廿 (niàn; 'twenty') is normally read as 二十 (èrshí) in Standard Chinese,[H][2] and as jaa6 in Cantonese.[150]

In oracle bone script, personal names, ritual items, and even whole phrases were contracted into single characters: for example, 受又 (shòu yòu; 'receive blessings') becomes (yòu). An example found in medieval manuscripts writes 'bodhisattva' (菩薩; púsà) as a contracted character, composed of four arranged in a 2×2 grid, derived from the 'GRASS' components within the original characters. Other historical examples include contractions used to represent SI units, which have generally fallen out of use. In Chinese, SI units usually consist of two morphemes, such as 'centimetre' (厘米; límǐ) and 'kilowatt' (千瓦; qiānwǎ). In the 19th century, these were often contracted, with 瓩 used for 千瓦 and 糎 used for 厘米. Some of these were also used in Japan, where they were read with pronunciations borrowed from European languages. Miscellaneous examples include 𱕸; 圕 (tuān), a contraction of 图书馆; 圖書館 (túshūguǎn; 'library').[I]

Multi-syllable morphemes[edit]

A small number of morphemes in Chinese are disyllabic, some of which even date back to the Classical period.[151] Excluding loanwords, these are typically words for plants and small animals, usually written with a pair of phono-semantic compounds sharing a common radical. Examples are 蝴蝶 (húdié; 'butterfly') and 珊瑚 (shānhú; 'coral'): the first character of 'butterfly' and the second character of 'coral' each have 胡 for a phonetic component, with the 'INSECT' and 'JADE' radicals as their respective semantic components, also present within the other character of each word. Neither of the aforementioned characters exists as an independent morpheme, except as a poetic abbreviation of the disyllabic words.

A notable example is the name for the pipa, a type of lute. The instrument's name 枇杷 was originally shared with one for the loquat,[l] which has a shape reminiscent of the instrument. The name for the instrument was originally written with the 'HAND' radical as 批把, referring to the upward and downward strokes made when playing the instrument. The name for the fruit was later changed to its present 枇杷, with the 'TREE' radical; the name for the instrument became 琵琶, with ('guqin') incorporated into both characters.[J]

The erhua phenomenon in some varieties of Mandarin is reflected in writing by means of the suffix 儿; 兒 (ér). As such, some monosyllabic words may be written with two characters, such as huār (花儿; 'flower').

Rare and complex characters[edit]

Rare or antiquated character variants appear more often in personal or place names. Extremely stroke-rich characters tend to be rare. One of the most complex characters included in modern Chinese dictionaries is 齉 (nàng; 'snuffle') with 36 strokes.[K] Stroke-rich characters are often composed of other characters in triplicate or quadruplicate, such as the triplicated 靐 (bìng) with 39 strokes, and the quadruplicated 䨻 (bèng) with 52, both meaning 'the loud noise of thunder'. 龘 (dá; 'appearance of a dragon in flight') consists of the 'DRAGON' radical in triplicate, for a total of 48 strokes. 鬱 (yù; 'luxuriant', 'lush', 'gloomy') is the character with the most strokes in the jōyō kanji list, with 29. In Japanese, an 84-stroke kokuji exists: 𱁬, normally read taito. It is composed of the triple 'cloud' character 䨺 atop the aforementioned triple-'dragon' character, also meaning 'appearance of a dragon in flight'.[152]
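The stroke totals quoted for the reduplicated characters follow directly from the stroke counts of their components. The short check below assumes the usual traditional-form counts of 13 strokes for 雷 ('thunder'), 16 for 龍 ('dragon'), and 12 for 雲 ('cloud'), and assumes that the 84-stroke character stacks three 'cloud' components over three 'dragon' components; the names used are illustrative only.

```python
# Assumed stroke counts of the base components (traditional forms).
BASE_STROKES = {"雷": 13, "龍": 16, "雲": 12}


def repeated_strokes(component: str, copies: int) -> int:
    """Stroke count of a character built from identical copies of one component."""
    return BASE_STROKES[component] * copies


thunder_x3 = repeated_strokes("雷", 3)       # triplicated 'thunder'
thunder_x4 = repeated_strokes("雷", 4)       # quadruplicated 'thunder'
dragon_x3 = repeated_strokes("龍", 3)        # triplicated 'dragon'
taito = repeated_strokes("雲", 3) + dragon_x3  # three clouds over three dragons

print(thunder_x3, thunder_x4, dragon_x3, taito)  # 39 52 48 84
```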

Standardisation[edit]

In the modern period, each polity using Chinese characters has standardised their forms, pronunciation, and stroke orders. Most characters have a single standard stroke order, but some may differ by region, occasionally resulting in different stroke counts. The latest published standards for character forms are:

Polity        Standard                                                           Characters   Latest revision
China         Table of General Standard Chinese Characters                       8105         2013[153]
Hong Kong     List of Graphemes of Commonly-Used Chinese Characters[m]           4762         2012[154]
Hong Kong     Reference Glyphs for Chinese Computer Systems in Hong Kong[n]      -            2016[155]
Taiwan[o]     Chart of Standard Forms of Common National Characters              4808         1983[157]
Taiwan        Chart of Standard Forms of Less-Than-Common National Characters    6341         1983[158]
Taiwan        Chart of Rarely-Used National Characters                           18388        2017[156]
Japan         Jōyō kanji                                                         2136         2010[159]
South Korea   Basic Hanja for Educational Use                                    1800         2000[160]

Received forms[edit]

From left to right: the regional forms for the character in the Noto Serif CJK typeface family, as used in mainland China, Taiwan, and Hong Kong (top), as well as in Japan and Korea (bottom)

With the use of woodblock printing, there was a considerable standardisation in forms during the Ming and Qing dynasties, which developed into an orthography used in print, later dubbed the jiu zixing ('old character shapes'), which prefigured standardisation in the 20th century. The 1716 Kangxi Dictionary is emblematic of these forms.[161]

Simplified characters[edit]

The first official list of simplified characters, published in 1935 and consisting of 324 characters[162]

Though most closely associated with the People's Republic, the idea of a mass simplification of character forms first gained traction in China during the early 20th century. In 1909, the educator and linguist Lufei Kui formally proposed the use of simplified characters in education for the first time. Over the following years—marked by the 1911 Xinhai Revolution that toppled the Qing dynasty, followed by growing social and political discontent that further erupted into the 1919 May Fourth Movement—many anti-imperialist intellectuals throughout China began to see the country's writing system as a serious impediment to its modernisation. Many began calling for script reform, or even for Chinese characters to be entirely replaced with an alphabet. During the 1930s and 1940s, discussions regarding simplification took place within the ruling Kuomintang (KMT) party. Many members of the Chinese intelligentsia maintained that simplification would increase literacy rates throughout the country.[163] In 1935, the first official list of simplified forms was published, consisting of 324 characters collated by Qian Xuantong. However, fierce opposition within the KMT resulted in the list being rescinded in 1936.[164]

Comparison of strokes between the traditional (們) and simplified (们) forms of a character,[p] showing systematic simplification of the component 'GATE'

Cursive script served as a source for many simplified character forms; others had already been used in print, though usually not in formal works. The broader KMT initiative of simplifying the Chinese writing system with the goal of increasing functional literacy was ultimately inherited by the Communist Party (CCP), who began work on script reform in earnest following the proclamation of the People's Republic of China in 1949. Since the 1950s, the PRC has officially encouraged the use of simplified characters on the mainland. The Republic of China, as well as Hong Kong and Macau—still under colonial rule at the time—were not affected by the reforms.[165]

People's Republic of China

Most simplified forms in widespread use are the direct result of PRC initiatives during the 1950s and 1960s. Prior to 1958, when Zhou Enlai announced the government's intent to focus on simplifying the existing system, some within the Chinese intelligentsia, including Mao Zedong, also considered the total replacement of Chinese characters with an alphabet. Gwoyeu Romatzyh and Latinxua Sin Wenz were two alphabets that had been developed in previous decades, the latter by the Communists themselves, partly to investigate the viability of replacing Chinese characters.[166]

The PRC initiated the first round of simplifications with two documents published in 1956 and 1965. The reforms both simplified the forms of many characters in use and reduced the total number of characters in the lexicon.[167] The majority of first-round characters were drawn from conventional abbreviations or ancient forms.[168] For example, the orthodox character 來 ('to come') was written as 来 in the earlier clerical script. The latter form used one fewer stroke, and was thus adopted as a simplification. The character 雲 ('cloud') was written as 云 in the ancient oracle bone script. This simpler form had remained in use later as a phonetic loan with a meaning of 'to say', while the original meaning of 'cloud' was instead written with an added 'RAIN' radical as a semantic indicator. When using simplified forms, these two characters are merged into 云.[L]

A second round of simplifications was promulgated in 1977, but was poorly received by the public and quickly fell out of official use. It was formally rescinded in 1986. The second round was unpopular in large part because the vast majority of its forms were completely new, in contrast to the many familiar variants present in the first round.[169] Two revised lists of simplified forms were published in 1988: the List of Commonly Used Characters in Modern Chinese with 2,500 common characters and 1,000 less common characters, and the Chart of Generally Utilised Characters of Modern Chinese with 7,000 characters, including those in the smaller list. In 2013, the revised Table of General Standard Chinese Characters supplanted the 1988 lists as the standard. It includes a total of 8,105 characters, with 3,500 categorised as primary, 3,000 as secondary, and 1,605 as tertiary.[170] The Chinese Proficiency Test (HSK) covers 2,663 characters and 5,000 words at its highest level, while the Chinese Proficiency Grading Standards for International Chinese Language Education covers 3,000 characters and 11,092 words at its highest level.[171][172][173]

Singapore

Singapore underwent three successive rounds of character simplification promulgated by the Ministry of Education, with the first two having some simplifications that differed from those used in mainland China. The first round was published in 1969, and consisted of 498 simplified and 502 traditional characters. The second round in 1974 consisted of 2,287 simplified characters, including 49 differences from the PRC system that were removed with the final round in 1976.[174] In 1993, Singapore adopted the revisions made by mainland China in 1986. Unlike in mainland China, where personal names may only be registered using simplified characters, Singaporean parents have the option of registering their children's names in traditional characters.[175]

Malaysia

Malaysia uses simplified characters in Chinese-language schools. Chinese-language newspapers in the country are published in either simplified or traditional characters—often, headlines are printed with traditional forms, and the body with simplified forms.[176]

Philippines

In the Philippines, most Chinese schools and businesses still use traditional characters with bopomofo, owing to Taiwanese influence due to a shared Hokkien heritage. Recently, more Chinese schools have switched to using simplified characters alongside pinyin, and many schools use some combination of the two. Since most of the readership of Chinese-language newspapers in the country belongs to an older generation, these newspapers are still largely published using traditional characters.[177]

Traditional characters

Regional allographs of a single character in Chinese, Japanese, Korean, and Vietnamese styles

Taiwan

In Taiwan, the Ministry of Education's Chart of Standard Forms of Common National Characters lists 4,808 characters; the Chart of Standard Forms of Less-Than-Common National Characters lists another 6,341 characters. The Test of Chinese as a Foreign Language (TOCFL) covers 8,000 words at its highest level. The Taiwan Benchmarks for the Chinese Language (TBCL), a guideline designed to describe levels of Chinese language proficiency, covers 3,100 characters and 14,425 words at the highest level.[178][179]

Hong Kong

In Hong Kong, which uses traditional characters, the Education and Manpower Bureau's List of Graphemes of Commonly-Used Chinese Characters, containing 4,759 characters, is intended for use in elementary and junior secondary education.

North America

Most Chinese-language newspapers and signage in the United States and Canada use traditional characters.[180] Owing to recent immigration from mainland China, there have been some efforts to persuade municipal governments to introduce more signage in simplified characters.[181]

Kanji

After World War II, the Japanese government instituted its own program of orthographic reforms. Some characters were assigned simplified forms called shinjitai; the older forms were then labelled kyūjitai. Inconsistent use of different variant forms was discouraged, and lists of characters to be taught to students at each grade level were developed. The first of these was the 1,850-character tōyō kanji list in 1946, later replaced by the 1,945-character jōyō kanji list in 1981. In 2010, the list of jōyō kanji was revised, expanding it to a total of 2,136 characters. The Japanese government restricts the characters that may be used in names: in addition to the jōyō kanji, names may also include the jinmeiyō kanji, an additional list of 863 characters (as of 2017) historically prevalent in names.[182]

Hanja

The South Korean Basic Hanja for Educational Use is a set of 1,800 characters standardised in 1972, with the first 900 hanja taught to middle school students, and the rest taught to high school students.[160]

In March 1991, the Supreme Court of Korea published the 2,854-character Table of Hanja for Use in Personal Names.[183] The list expanded gradually: by 2015 there were 8,142 hanja, including the set of basic hanja, permitted for use in Korean names.[184]

Lexicography

Cumulative frequency of simplified Chinese characters in modern text[185]

Dozens of schemes have been devised for indexing Chinese characters and sorting them into dictionaries. Most of these are specific to the dictionary for which they were invented, and relatively few have seen widespread use. Often, character dictionaries incorporate several mechanisms by which users may locate entries. Methods for arranging Chinese dictionaries are divided into form-based orders that sort by visual properties, sound-based orders usually based on an extant transliteration scheme, and meaning-based orders.[186]

Many character dictionaries are indexed using a technique known as radical-and-stroke sorting, in which characters are grouped by radical, the radicals are ordered by stroke count, and the characters under each radical are ordered by their number of remaining strokes. Classification by radical was introduced by the Shuowen Jiezi, which used 540 radicals. The set of 214 Kangxi radicals was popularised by the 1716 Kangxi Dictionary but was originally introduced in the Zihui in 1615. Another form-based system is the four-corner method, where characters are classified according to the shapes at each of the character's corners. In modern Chinese, characters and words are also ordered by their frequency of use within a given corpus. Stroke-based sorting includes techniques that combine sorting by stroke count and stroke order. Most modern Chinese dictionaries arrange the main character entries alphabetically according to pinyin spelling, while also providing a traditional radical-based index.[187]
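As a rough illustration of radical-and-stroke sorting, the following Python sketch orders a handful of characters by the stroke count of their radical, then by Kangxi radical number, then by the number of strokes outside the radical. The table `SAMPLE_INDEX` and the function names are hypothetical and hand-entered for this example; a real dictionary index would draw on a full radical-stroke database and would need further tie-breaking rules (such as stroke order) that are omitted here.

```python
# Hand-entered sample data (an illustrative assumption, not a complete index):
# character -> (Kangxi radical number, radical, radical stroke count, remaining strokes)
SAMPLE_INDEX = {
    "海": (85, "水", 4, 7),
    "好": (38, "女", 3, 3),
    "森": (75, "木", 4, 8),
    "明": (72, "日", 4, 4),
    "媽": (38, "女", 3, 10),
    "河": (85, "水", 4, 5),
    "林": (75, "木", 4, 4),
    "時": (72, "日", 4, 6),
}


def radical_stroke_key(char):
    """Build the sort key: radical strokes, radical number, remaining strokes."""
    radical_number, _radical, radical_strokes, remaining = SAMPLE_INDEX[char]
    return (radical_strokes, radical_number, remaining)


def radical_stroke_sort(chars):
    """Return the characters in radical-and-stroke order."""
    return sorted(chars, key=radical_stroke_key)


ordered = radical_stroke_sort(SAMPLE_INDEX)
print("".join(ordered))
```

Because Kangxi radical numbers already follow stroke-count order, the first element of the key is redundant in practice; it is kept here only to make the "radicals sorted by stroke count" step explicit.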

Studies have suggested that literate individuals within China have an active vocabulary of three to four thousand characters, while specialists in fields like classical literature or history may have a working vocabulary of five to six thousand.[188] Estimates of the total number of characters in modern use can be sourced from encoding schemes and dictionaries: according to sources from mainland China, Taiwan, Hong Kong, Japan, and Korea, this number is likely around 15,000.[189] There exist roughly 1,500 Japanese kokuji,[190] Korean gukja, over 10,000 sawndip used to write Zhuang, and almost 20,000 Nôm characters created in Vietnam.[191]
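Ordering by frequency of use, and the cumulative coverage that vocabulary-size estimates of this kind rest on, can be sketched for a toy corpus. The sample string and the function name `cumulative_coverage` below are illustrative assumptions rather than data or methods from the cited studies, and the filter only recognises the basic CJK Unified Ideographs block.

```python
from collections import Counter


def cumulative_coverage(text):
    """Return (character, count, cumulative share) rows, most frequent first."""
    # Keep only characters in the basic CJK Unified Ideographs block
    # (an assumption: extension blocks are ignored in this sketch).
    chars = [c for c in text if "\u4e00" <= c <= "\u9fff"]
    counts = Counter(chars)
    total = len(chars)
    rows = []
    running = 0
    # Sort by descending count, breaking ties by code point for determinism.
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        running += n
        rows.append((ch, n, running / total))
    return rows


sample = "的的的一一是不了的一是"  # hypothetical toy corpus
for ch, n, share in cumulative_coverage(sample):
    print(ch, n, f"{share:.0%}")
```

Run over a large corpus instead of the toy string, the third column gives the share of running text covered by the most frequent characters up to that rank.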

See also

Notes

  1. ^ Some Chinese-language works are still printed with vertical layouts, but this is increasingly uncommon.
  2. ^ 漢字; simplified as 汉字.
    Chinese pinyin: hànzì; Wade–Giles: han4-tzŭ4; Jyutping: hon3-zi6.
    Japanese rōmaji: kanji; Korean romanization: hanja; Vietnamese: Hán tự.
  3. ^ There are exceptions to these general correspondences, including multi-syllable morphemes, syllables written with multiple characters, particles and affixes lacking strong independent meaning, and multiple syllables written with a single character.[2]
  4. ^ Zev Handel lists:[3]
    1. Sumerian cuneiform emerging c. 3200 BCE
    2. Egyptian hieroglyphs emerging c. 3100 BCE
    3. Chinese characters emerging c. 13th century BCE
    4. Maya script emerging around 2000 years before present
  5. ^ According to Handel: "While monosyllabism generally trumps morphemicity—that is to say, a bisyllabic morpheme is nearly always written with two characters rather than one—there is an unmistakable tendency for script users to impose a morphemic identity on the linguistic units represented by these characters."[11]
  6. ^ Baxter provides the reconstructed Old Chinese pronunciations of this pair as /*ɡ-ljuŋ/[30] and /*k-ljuŋ/[31] respectively.
  7. ^ Originally a pictograph of a vulva. The Shuowen Jiezi gives the origin of 也 as 女陰也; 'female yin [organ]'. By the 6th century BCE, the original definition had fallen into disuse. The use of the character in the definition itself is as a declarative sentence-final particle, and all appearances of the character in Classical texts from that time forward use it as a phonetic loan for the grammatical particle. In addition to being a Classical particle, in modern vernacular Chinese 也 has acquired a meaning of 'also'.
  8. ^ a b 他 was originally the third-person personal pronoun regardless of gender or animacy in Chinese. The feminine-specific form 她 only emerged in the early 20th century, after the bulk of Japanese orthographic borrowing had already occurred.
  9. ^ Qiu 2000, pp. 132–133 provides archaeological evidence for this dating, in contrast to unsubstantiated claims dating the emergence of cursive anywhere from the Qin to the Eastern Han.
  10. ^ Qiu 2000, pp. 140–141 mentions examples of neo-clerical with "strong overtones of cursive script" from the late Eastern Han.
  11. ^ In this case, the pronunciations have converged in Standard Chinese, but they have not in other varieties.
  12. ^ Compare 卢橘; 盧橘 (lou4-gwat1), an unrelated name for the fruit which was eventually borrowed from Cantonese into English.
  13. ^ Reference for education
  14. ^ Reference for font foundries
  15. ^ Collectively the Standard Form of National Characters, which has been published online in full by Taiwan's Ministry of Education since 2017.[156]
  16. ^ The character 們 (simplified as 们) is a plural suffix particle for pronouns.
  1. ^ Baxter–Sagart (2014) reconstruction of Old Chinese.
  2. ^ a b c Baxter's transcription for Middle Chinese.
  3. ^ Standard Chinese and Cantonese readings are given in pinyin and Jyutping, respectively. Japanese on'yomi readings are given in rōmaji.
  4. ^ a b Baxter (1992) reconstruction of Old Chinese.

References

Citations

  1. ^ 广西壮族自治区少数民族古籍整理出版规划领导小组 [Central Leadership Planning Group for the Organization and Publication of Early Written Materials of Guangxi Zhuang Ethnic Minority Autonomous Region], ed. (1989). 古壮字字典 [Dictionary of the Old Zhuang Script] (in Chinese) (2nd ed.). Nanning: Guangxi Nationalities Publishing House. ISBN 978-7-536-30614-1.
  2. ^ a b c Mair 2011.
  3. ^ Handel 2019, p. 1.
  4. ^ Qiu 2000, p. 2.
  5. ^ Qiu 2000, pp. 3–4.
  6. ^ Qiu 2000, p. 5.
  7. ^ Qiu 2000, p. 11.
  8. ^ Qiu 2000, p. 1; Handel 2019, pp. 4–5.
  9. ^ Qiu 2000, pp. 13–15.
  10. ^ Qiu 2000, pp. 22–26.
  11. ^ Handel 2019, p. 33.
  12. ^ Handel 2019, p. 51; Yong & Peng 2008, pp. 33–37.
  13. ^ Qiu 2000, pp. 19, 162–168.
  14. ^ Qiu 2000, pp. 14–18.
  15. ^ Yin 2007, pp. 97–100; Su 2014, pp. 102–111.
  16. ^ Yang 2008, pp. 147–148.
  17. ^ Qiu 2000, pp. 163–171.
  18. ^ Qiu 2000, p. 154; Norman 1988, p. 87.
  19. ^ Qiu 2000, pp. 44–45; Zhou 2003, p. 61.
  20. ^ Qiu 2000, pp. 18–19.
  21. ^ Yip 2000, pp. 39–42.
  22. ^ Qiu 2000, p. 46.
  23. ^ Norman 1988, p. 87.
  24. ^ Sampson & Chen 2013, p. 261.
  25. ^ Boltz 1994, pp. 104–110.
  26. ^ Sampson & Chen 2013, pp. 265–268.
  27. ^ Handel 2019, p. 193.
  28. ^ Norman 1988, p. 88.
  29. ^ Qiu 2000, p. 154.
  30. ^ Baxter 1992, p. 750.
  31. ^ Baxter 1992, p. 810.
  32. ^ Williams 2010.
  33. ^ Baxter & Sagart 2014, p. 371.
  34. ^ Norman 1988, p. 94.
  35. ^ Wright, David (2000). Translating Science: The Transmission of Western Chemistry Into Late Imperial China, 1840–1900. Brill. p. 211. ISBN 978-9-004-11776-1.
  36. ^ Qiu 2000, pp. 261–265.
  37. ^ Gnanadesikan, Amalia E. (2011). The Writing Revolution: Cuneiform to the Internet. John Wiley & Sons. p. 61. ISBN 978-1-4443-5985-5.
  38. ^ Qiu 2000, p. 168; Norman 1988, p. 79.
  39. ^ Norman 1988, pp. 67–69.
  40. ^ Norman 1988, pp. 86–87.
  41. ^ Qiu 2000, pp. 153–154, 161.
  42. ^ Norman 1988, p. 195.
  43. ^ Qiu 2013, pp. 102–108.
  44. ^ Norman 1988, p. 89.
  45. ^ Handel 2019, p. 43; Yong & Peng 2008, pp. 102–103.
  46. ^ Qiu 2000, pp. 44–45.
  47. ^ Yang Yuxin (2018). Unveiling and Activating the "Uncertain Heritage" of Chinese Knotting (PDF). The Asian Conference on Cultural Studies 2018. p. 3.
  48. ^ Mair, Victor H. "Prehistoric notation systems in Peru, with Chinese parallels". Language Log. Retrieved 31 July 2023.
  49. ^ The Way of Lao Tzu (Tao Te Ching). Translated by Chan, Wing-tsit. The Bobbs-Merrill Company. 1963. p. 238. ISBN 0-02-320700-0. "Let the people again knot cords and use them (in place of writing)"
  50. ^ 系辞下 [Xi Ci II]. The Book of Changes 易經. Translated by Legge, James. 1899. Archived from the original on 24 September 2020 – via The Chinese Text Project. In the highest antiquity, government was carried on successfully by the use of knotted cords (to preserve the memory of things). In subsequent ages the sages substituted for these written characters and bonds. By means of these (the doings of) all the officers could be regulated, and (the affairs of) all the people accurately examined.
  51. ^ Yang, Lihui; An, Deming (2008). Handbook of Chinese Mythology. Oxford University Press. pp. 84–86. ISBN 978-0-195-33263-6.
  52. ^ Qiu 2000, p. 31.
  53. ^ Rincon, Paul (17 April 2003). "'Earliest Writing' Found in China". BBC News.
  54. ^ Bagley, Robert (2004). "Anyang writing and the origin of the Chinese writing system". In Houston, Stephen (ed.). The First Writing: Script Invention as History and Process. Cambridge University Press. pp. 190–249. ISBN 978-0-521-83861-0 – via Google Books.
  55. ^ Boltz, William G. (1999). "Language and Writing". In Loewe, Michael; Shaughnessy, Edward L. (eds.). The Cambridge History of Ancient China: From the Origins of Civilization to 221 BC. Cambridge University Press. p. 109. ISBN 978-0-521-47030-8. Retrieved 3 April 2019 – via Google Books.
  56. ^ Liu, Kexin; Wu, Xiaohong; Guo, Zhiyu; Yuan, Sixun; Ding, Xingfang; Fu, Dongpo; Pan, Yan (2021). "Radiocarbon Dating of Oracle Bones of the Late Shang Period in Ancient China". Radiocarbon. 63 (1): 155–175. Bibcode:2021Radcb..63..155L. doi:10.1017/RDC.2020.90.
  57. ^ Takashima, Ken-ichi (2012). "Literacy to the South and the East of Anyang in Shang China: Zhengzhou and Daxinzhuang". In Li, Feng; Branner, David Prager (eds.). Writing and Literacy in Early China: Studies from the Columbia Early China Seminar. University of Washington Press. p. 142. ISBN 978-0-295-80450-7.
  58. ^ a b c Kern 2010, p. 1.
  59. ^ Boltz 1986, p. 424.
  60. ^ Keightley 1996.
  61. ^ Kern 2010, p. 2.
  62. ^ Qiu 2000, pp. 63–64, 66, 86, 88–89, 104–107, 124.
  63. ^ Qiu 2000, pp. 59–150.
  64. ^ Chen Zhaorong (陳昭容) (2003). 秦系文字研究﹕从漢字史的角度考察 [Research on the Qin Writing System: Through the Lens of the History of Writing in China]. Institute of History and Philology Monograph (in Chinese). Academia Sinica. ISBN 978-9-576-71995-0.
  65. ^ Louis, François (2003). "Written Ornament: Ornamental Writing: Birdscript of the Early Han Dynasty and the Art of Enchanting". Ars Orientalis. 33: 18–27. ISSN 0571-1371. JSTOR 4434272.
  66. ^ Qiu 2000, pp. 59, 104–107, 119.
  67. ^ Qiu 2000, p. 123.
  68. ^ Qiu 2000, pp. 119, 123–124.
  69. ^ Qiu 2000, p. 121.
  70. ^ Qiu 2000, pp. 130–138.
  71. ^ a b Qiu 2000, pp. 113, 139.
  72. ^ Qiu 2000, pp. 138–139.
  73. ^ Qiu 2000, pp. 139–142.
  74. ^ Qiu 2000, pp. 143–148.
  75. ^ Peking University 2004, pp. 148–152; Zhang 2013.
  76. ^ Norman 1988, p. 86; Zhou 2003, p. 58.
  77. ^ Yin 2016, pp. 58–59.
  78. ^ Li, Wendan (2009). Chinese Writing and Calligraphy. Honolulu: University of Hawai‘i Press. p. 70. ISBN 978-0-824-83364-0.
  79. ^ Qiu 2000, pp. 204–215, 373.
  80. ^ Zhou 2003, pp. 57–60, 63–65.
  81. ^ Qiu 2000, pp. 297–300, 373.
  82. ^ Nawar, Haytham (2020). "Transculturalism and Posthumanism". Language of Tomorrow: Towards a Transcultural Visual Communication System in a Posthuman Condition. Intellect. pp. 130–155. ISBN 978-1-789-38183-2. JSTOR j.ctv36xvqb7.
  83. ^ Li, Wendan (2009). Chinese Writing and Calligraphy. Honolulu: University of Hawai‘i Press. pp. 180–183. ISBN 978-0-824-83364-0.
  84. ^ Lunde 2008, pp. 23–25.
  85. ^ Su 2014, p. 218.
  86. ^ Li 2013, p. 62.
  87. ^ Zhang 2006.
  88. ^ National Language Commission 1997.
  89. ^ Zhang 2016, p. 422.
  90. ^ Su 2014, p. 222.
  91. ^ Language Institute, Chinese Academy of Social Sciences 2020.
  92. ^ Unicode Character Count V15.1, 2023, archived from the original on 9 October 2023, retrieved 28 November 2023
  93. ^ "UAX #38: Unicode Han Database (Unihan)". The Unicode Consortium.
  94. ^ Lunde, Ken (4 August 2022). "The GB 18030-2022 Standard". Medium. Retrieved 7 August 2022.
  95. ^ "About CNS". Taiwan Ministry of Digital Affairs.
  96. ^ Lunde 2008, pp. 85–87, 112–122.
  97. ^ Norman 1988, pp. 74–75.
  98. ^ Tong, Xiuli; Liu, Phil D.; McBride-Chang, Catherine (2009). "Metalinguistic and subcharacter skills in Chinese literacy acquisition". In Wood, Clare Patricia; Connelly, Vincent (eds.). Contemporary Perspectives on Reading and Spelling. New York: Routledge. pp. 202–218. ISBN 978-0-415-49716-9 – via Google Books. p. 203: Often, the Chinese character can function as an independent unit in sentences, but sometimes it must be paired with another character or more to form a word. [...] Most words consist of two or more characters, and more than 80 per cent make use of lexical compounding of morphemes (Packard, 2000).
  99. ^ Vogelsang, Kai (2021). Introduction to Classical Chinese. Oxford University Press. pp. xvii–xix. ISBN 978-0-198-83497-7.
  100. ^ Wilkinson 2012, p. 22.
  101. ^ Yip 2000, p. 18.
  102. ^ Kornicki 2018, pp. 268–269.
  103. ^ Rabasa, José; Sato, Masayuki; Tortarolo, Edoardo; Woolf, Daniel, eds. (29 March 2012). The Oxford History of Historical Writing: Volume 3: 1400–1800. Vol. 3. Oxford University Press. p. 2. ISBN 978-0-199-21917-9. ...East Asia had been among the first regions of the world to produce written records of the past. Well into modern times Chinese script, the common script across East Asia, served—with local adaptations and variations—as the normative medium of record-keeping and written historical narrative, as well as official communication. This was true, not only in China itself, but in Korea, Japan, and Vietnam.
  104. ^ Denecke, Wiebke (2014). "Worlds Without Translation: Premodern East Asia and the Power of Character Scripts". In Bermann, Sandra; Porter, Catherine (eds.). A Companion to Translation Studies. Oxford: Wiley. pp. 204–216. ISBN 978-0-470-67189-4.
  105. ^ Kornicki 2018, pp. 72–73.
  106. ^ Handel 2019, p. 212.
  107. ^ Kornicki 2018, p. 168.
  108. ^ Handel 2019, pp. 64–65.
  109. ^ "공문서 한글전용·초중등 한자교육 선택 고시 '합헌'(종합)" (in Korean). Maeil Kyungje. 24 November 2016. Archived from the original on 4 February 2022. Retrieved 4 February 2022.
  110. ^ Norman 1988, p. 58.
  111. ^ Zhang, Shuya (2022). "Rethinking the *-s suffix in Old Chinese: with new evidence from Situ Rgyalrong" (PDF). Folia Linguistica. 56 (s43–s1): 129–167. doi:10.1515/flin-2022-2014. ISSN 0165-4004. S2CID 248002645 – via Academic Search Complete.
  112. ^ Baxter 1992, pp. 315–317.
  113. ^ a b Baxter 1992, p. 315.
  114. ^ Baxter 1992, p. 316.
  115. ^ Baxter 1992, pp. 197, 305.
  116. ^ a b Baxter 1992, p. 218.
  117. ^ Baxter 1992, p. 219.
  118. ^ Norman 1988, p. 112.
  119. ^ Norman 1988, pp. 155–156.
  120. ^ Norman 1988, p. 74.
  121. ^ Cheung Kwan-hin (張系顯); Bauer, Robert S. (2002). "The Representation of Cantonese with Chinese Characters". Journal of Chinese Linguistics Monograph Series (18): 12–20. ISSN 2409-2878. JSTOR 23826053.
  122. ^ Coulmas 1991, pp. 122–129.
  123. ^ Coulmas 1991, pp. 129–132.
  124. ^ Taylor, Insup; Taylor, M. Martin (2014). Writing and literacy in Chinese, Korean and Japanese. Studies in written language and literacy (Revised ed.). Amsterdam: John Benjamins. pp. 275–279. ISBN 978-9-027-21809-4.
  125. ^ "알고 싶은 한글". 국립국어원 (in Korean). National Institute of Korean Language. Retrieved 22 March 2018.
  126. ^ a b c Fischer, Stephen Roger (2004). A History of Writing. Globalities. London: Reaktion Books. pp. 189–194. ISBN 1-86189-101-6. Retrieved 3 April 2009.
  127. ^ Handel 2019, pp. 75–82.
  128. ^ Handel 2019, pp. 80–81.
  129. ^ Choo, Miho; O'Grady, William (1996). Handbook of Korean Vocabulary: An Approach to Word Recognition and Comprehension. University of Hawai‘i Press. p. ix. ISBN 0-8248-1815-6.
  130. ^ a b Hannas 1997, pp. 68–72.
  131. ^ Pan, Yuling; Sha, Mandy (9 July 2019). The Sociolinguistics of Survey Translation. London: Routledge. ISBN 978-0-429-29491-4. S2CID 198632812.
  132. ^ Byon, Andrew Sangpil (2017). Modern Korean Grammar: A Practical Guide. Taylor & Francis. pp. 3–18. ISBN 978-1-351-74129-3.
  133. ^ Hannas 1997, p. 68.
  134. ^ "북한의 한문교과서를 보다". Chosun NK (in Korean). 14 March 2014.
  135. ^ Kim, Hye-jin 김혜진 (4 June 2001). 북한의 한자정책 – "漢字, 3000자까지 배우되 쓰지는 말라" [North Korea's Chinese character policy – "Learn up to 3,000 Chinese characters, but do not use them."]. Han Mun Love (in Korean). Chosun Ilbo. Archived from the original on 17 December 2014. Retrieved 21 November 2014.
  136. ^ Kiernan, Ben (2017). Viet Nam: A History from Earliest Times to the Present. Oxford University Press. p. 108. ISBN 978-0-190-62730-0.
  137. ^ a b Kornicki 2018, p. 63.
  138. ^ DeFrancis 1977, p. 19.
  139. ^ Clementin-Ojha, Catherine; Manguin, Pierre-Yves; Reid, Helen (2007). A Century in Asia: The History of the École Française D'Extrême-Orient, 1898–2006. Editions Didier Millet. p. 141. ISBN 978-9-81415-597-7.
  140. ^ Handel 2019, p. 135.
  141. ^ Shih, Chih-yu; Manomaivibool, Prapin; Marwah, Reena (2018). China Studies In South And Southeast Asia: Between Pro-china And Objectivism. World Scientific Publishing Company. p. 117. ISBN 978-9-81323-526-7.
  142. ^ Coulmas 1991, pp. 113–115.
  143. ^ DeFrancis 1977, pp. 75–219.
  144. ^ a b Zhou Youguang (周有光) (1991). Mair, Victor H. (ed.). "The Family of Chinese Character-Type Scripts (Twenty Members and Four Stages of Development)". Sino-Platonic Papers. 28. Archived from the original on 21 November 2023. Retrieved 7 June 2011.
  145. ^ Tang Weiping (唐未平) (2006). 广西壮族人文字使用现状及文字社会声望调查研究—以田阳、田东、东兰三县为例 [A Survey and Study on the Use Status and Literary Culture Attitude towards the Guangxi Zhuang Writing System—Using Counties Tianyang, Tiandong and Donglan as Examples] (碩士 thesis) (in Chinese). Guangxi daxue.
  146. ^ Zhao, Liming (1998). "Nüshu: Chinese women's characters". International Journal of the Sociology of Language. 129 (1). doi:10.1515/ijsl.1998.129.127. ISSN 0165-2516.
  147. ^ DeFrancis 1984, p. 242.
  148. ^ Hung, William (1951). "The Transmission of The Book Known as The Secret History of The Mongols". Harvard Journal of Asiatic Studies. 14 (3/4). Cambridge, MA: Harvard–Yenching Institute: 481. JSTOR 2718184.
  149. ^ Gulick, John (1870). "On the Best Method of Representing the Unaspirated Mutes of the Mandarin Dialect". The Chinese Recorder and Missionary Journal. 3: 153–155.
  150. ^ Matthews, Stephen; Yip, Virginia (2011). Cantonese: A Comprehensive Grammar (2nd ed.). London: Routledge. p. 445. ISBN 978-0-415-47131-2.
  151. ^ Norman 1988, pp. 8–9.
  152. ^ "漢字の現在:幽霊文字からキョンシー文字へ?" [From Ghost Character to Vampire Character?]. dictionary.sanseido-publ.co.jp (in Japanese). Retrieved 24 January 2015.
  153. ^ 国务院关于公布《通用规范汉字表》的通知 [Notice of the State Council on the Publication of the "General Standard Chinese Character List"] (in Chinese). State Council of the People's Republic of China. 5 June 2013.
  154. ^ 常用字字形表:二零零七年重排本:附粤普字音及英文解釋 [Commonly Used Characters Glyph Table: 2007 Rearranged Edition with Cantonese and Mandarin Pronunciations and English Explanations] (in Chinese). Hong Kong Education Bureau. 2012. ISBN 978-9-888-12393-3.
  155. ^ "Reference Glyphs for Chinese Computer Systems in Hong Kong". www.ccli.gov.hk. Common Chinese Language Interface Website. Retrieved 25 March 2024.
  156. ^ a b Dictionary of Chinese Character Variants.
  157. ^ 常用國字標準字體表 [Chart of Standard Forms of Common National Characters] (in Chinese). Taipei: Zhengzhong shuju. 1983. ISBN 978-9-570-90664-6.
  158. ^ Lunde 2008, pp. 81–82.
  159. ^ 改定常用漢字表、30日に内閣告示 閣議で正式決定 [The Amended List of Jōyō Kanji Receives Cabinet Notice on 30th: To Be Officially Confirmed in Cabinet Meeting.] (in Japanese). Nihon Keizai Shimbun. 24 November 2010. Archived from the original on 3 March 2016. Retrieved 1 February 2015.
  160. ^ a b Lunde 2008, p. 84.
  161. ^ Yong & Peng 2008, pp. 280–282, 293–297.
  162. ^ Chen 1999, p. 153.
  163. ^ Lü Bolin (呂柏林). 简化字的昨天、今天和明天 [Simplified Chinese characters for yesterday, today and tomorrow]. 乾坤再造在中华 (in Chinese). Archived from the original on 14 July 2011.
  164. ^ Chen 1999, pp. 150–153.
  165. ^ Chen 1999, p. 151.
  166. ^ Chen 1999, pp. 182–186.
  167. ^ Chen 1999, p. 154.
  168. ^ Ramsey 1987, p. 147.
  169. ^ Chen 1999, pp. 155–156.
  170. ^ 国务院关于公布《通用规范汉字表》的通知 [State Council Announcement of the Table of General Standard Chinese Characters] (in Chinese). Central People's Government of the People's Republic of China. 5 June 2013. Retrieved 8 November 2023.
  171. ^ "China's HSK Language Test to be Overhauled for the First 11 years". The Beijinger (blog). 3 April 2021.
  172. ^ Zhao Xiaoxie (赵晓霞) (9 April 2021). 《国际中文教育中文水平等级标准》来了 汉语水平考试会有啥变化 [HSK 3.0 is here. What changes will there be?] (in Chinese). People's Daily Overseas Edition. Archived from the original on 20 May 2021. Retrieved 20 May 2021. 日前,《国际中文教育中文水平等级标准》(GF0025-2021)(下称《标准》)由教育部、国家语言文字工作委员会发布,作为国家语委语言文字规范自2021年7月1日起正式实施。......汉语水平考试(HSK)自1984年开创以来已走过37年,经历了基础、初中等、高等"3等11级"的HSK1.0和"一级到六级"6个级别的HSK2.0两个阶段,即将迎来"三等九级"的HSK3.0新阶段。
  173. ^ "What is Chinese Proficiency Test?". China's University and College Admission System. Archived from the original on 22 December 2015.
  174. ^ Chen 1999, p. 161.
  175. ^ Chia Shih Yar (谢世涯). 新加坡与中国调整简体字的 [A Comparative Study of the Revision of Simplified Chinese Characters Proposed by Singapore and China]. Paper presented at The International Conference on Culture of Chinese Character. Convened by Beijing Normal University and Liaoning People Publishing House. Dandong, Liaoning, China. 9-11 Nov 1998 (in Chinese) – via huayuqiao.org.
  176. ^ Lin Youshun (林友順) (June 2009). 大馬華社遊走於簡繁之間 [The Malaysian Chinese Community Wanders Between Simplified and Traditional Characters] (in Chinese). Yazhou Zhoukan. Archived from the original on 23 May 2021. Retrieved 30 March 2021.
  177. ^ Yang, Shimin (2014). Written at Science and Technology College, Jiangxi Normal University. "Several Thoughts on Current Chinese Education in the Philippines" (PDF). Nanchang: Atlantis Press. p. 197.
  178. ^ "Taiwan Benchmarks for the Chinese Language". National Academy for Educational Research. The TBCL sets out seven levels of Chinese language proficiency in the five skills: listening, speaking, reading, writing and translating. It also includes lists which contains 3,100 Chinese characters, 14,425 words, and 496 grammar points for learners of level 1 to 5.
  179. ^ Lin Qinglong (林慶隆) (1 August 2020). 遣辭用「據」:臺灣華語文能力第一套標準 [The First Set of Standards for Chinese Language Proficiency in Taiwan] (PDF) (in Chinese). Taipei: National Academy for Educational Research. ISBN 9789865460082. Archived (PDF) from the original on 20 May 2021. 本字表各級收錄字數:第1級246個字、第2級258個字、第3級297個字;第4級499個字、第5級600個字;第6級600個字、第7級600個字,共計3,100個字。
  180. ^ Hua, Vanessa (8 May 2006). "For Students of Chinese, Politics Fill the Characters / Traditionalists Bemoan Rise of Simplified Writing System Promoted by Communist Government to Improve Literacy". SFGATE. Retrieved 28 February 2018.
  181. ^ Kane, Mathew (November–December 2012). "Chinese Character Usage in New York City" (PDF). The ATA Chronicle. pp. 20–23. Archived from the original (PDF) on 15 December 2018. Retrieved 22 July 2019.
  182. ^ 人名用漢字に「渾」追加 司法判断を受け法務省 改正戸籍法施行規則を施行、計863字に ["渾" added to kanji usable in personal names; Ministry of Justice enacts revised Family Registration Law Enforcement Regulations following judicial ruling, totaling 863 characters.]. The Nikkei (in Japanese). 25 September 2017.
  183. ^ "Summary of the deliberation results of the Korean Language Council on the scope of Chinese characters for personal use". National Academy of the Korean Language (in Korean). 1991. Archived from the original on 19 March 2016.
  184. ^ "'인명용(人名用)' 한자 5761→8142자로 대폭 확대". Chosun Ilbo (in Korean). 20 October 2014. Retrieved 23 August 2017.
  185. ^ Da, Jun (2010). "Chinese Text Computing".
  186. ^ Su 2014, p. 183.
  187. ^ Yong & Peng 2008, pp. 145, 400–401.
  188. ^ Norman 1988, p. 73.
  189. ^ Su 2014, pp. 47, 51.
  190. ^ Sugawara Yoshizō (菅原義三), ed. (18 December 1990). 国字の字典 [Dictionary of National Characters] (in Japanese). Tōkyōdō Shuppan. ISBN 978-4-490-10279-6.
  191. ^ Phan, John (2013). "Chữ Nôm and the Taming of the South: A Bilingual Defense for Vernacular Writing in the Chỉ Nam Ngọc Âm Giải Nghĩa". Journal of Vietnamese Studies. 8 (1). University of California Press: 1. doi:10.1525/vs.2013.8.1.1.

Example lexemes

  1. ^
  2. ^ Hanyu Da Zidian, p. 4893, 高
  3. ^
  4. ^ Hanyu Da Zidian, p. 2594, 砼
  5. ^ Dictionary of Chinese Character Variants
  6. ^ Dictionary of Frequently-Used Taiwan Hakka
  7. ^ "囍". Education Encyclopedia (in Chinese).
  8. ^ Li Na (李娜) Chen Shuangxin (陈双新) (12 August 2018). ""廿"该怎么读" [How to pronounce "廿"]. news.gmw.cn (in Chinese). Guangming Online. Retrieved 31 October 2023.
  9. ^ "圕"字怎么念?什么意思?谁造的? [How is "圕" pronounced? What does it mean? Who created it?]. Singtao Net (in Chinese). 21 April 2006. Archived from the original on 3 October 2011.
    ""圕" zì zì zěnme niàn? Tái jiàoyù bùmén fùzé rén bèi kǎo dào" "圕"字字怎么念?台教育部门负责人被考倒 [How is the character "圕" pronounced? Taiwan's education chief is stumped]. Xinhua News Agency (in Chinese). 20 March 2009. Archived from the original on 25 March 2009.
  10. ^ Dictionary of Chinese Character Variants
  11. ^ A Chinese–English Dictionary 1994, 齉
  12. ^
  • 汉语大字典 [Hanyu Da Zidian] (in Chinese) (8 vol. ed.). Sichuan cishu chubanshe. 1990. ISBN 978-7-805-43156-7 – via Home in Mists.
  • 汉英词典 [A Chinese–English Dictionary] (in Chinese and English). Beijing Foreign Studies University. 1994. ISBN 978-7-560-00739-7.
  • 異體字字典 [Dictionary of Chinese Character Variants] (in Chinese). Academia Sinica. 2017.
  • 客家語常用詞辭典 [Dictionary of Frequently-Used Taiwan Hakka] (in Mandarin and Hakka Chinese). Taiwan Ministry of Education. 2019.
  • 國語辭典簡編本 [Concised Mandarin Chinese Dictionary] (in Chinese and English) (3rd ed.). Taiwan Ministry of Education. 2021.

Works cited

 This article incorporates text from Chinese Recorder and Missionary Journal, vol. 3, a publication from 1871, now in the public domain in the United States.

Further reading

Works of historical interest

External links

Online references and databases

  • Unihan Database – Official Unicode site on Chinese characters and Han unification, with reference glyphs, readings, and meanings for all characters encoded in the standard
  • Chinese Text Project Dictionary – Comprehensive character dictionary, including data for all Chinese characters within Unicode, and examples of use in Classical Chinese texts
  • zi.tools – Character lookup by component description, character etymology, phonology, orthography, and dictionary
  • Chinese Etymology by Richard Sears
  • Chinese text computing – Statistics regarding the use of Chinese characters, by Jun Da