Chinese in OS X 10.4 Tiger

Installation

Under the Language tab in System Preferences... International, you will find a list of languages supported by OS X 10.4. Chinese is installed by default. The language at the top of the list is used by the Finder. Adjustments to this list affect the default font behavior in applications that use Apple's built-in text engine, like Mail, Safari, and iWork. Unless you are running the Finder in Japanese or Korean (or Traditional Chinese), we recommend that you order the East Asian languages as follows: Simplified Chinese (简体中文), Traditional Chinese (繁體中文), Japanese (日本語), Korean Hangul (한글).

The "Order for sorted lists" pop-up menu has five choices for Chinese:

  1. Chinese ~ Sort by standard Unicode order: by radical, then by number of strokes. This is the same as choosing "Standard" at the top of the list.
  2. Chinese (big5han) ~ Sort by Big Five code.
  3. Chinese (gb2312han) ~ Sort by GB code.
  4. Chinese (Pinyin Order) ~ Sort by pronunciation, in Hanyu Pinyin romanization.
  5. Chinese (Stroke Order) ~ Sort by number of strokes, then radical.

Note: These sort orders only apply to Unicode's CJK Unified Ideographs block. Characters from CJK Unified Ideographs Extension A and Extension B are listed in turn by Unicode order (radical/stroke), after the original block.
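The difference between the code-based sort orders can be illustrated with a short Python sketch. This is only an approximation under stated assumptions: it uses Python's own "gb2312" and "big5" codecs rather than Apple's collation tables, and the function names and sample characters are made up for the example.

```python
# Illustrative sketch: approximates three of the sort orders with Python codecs.
# It does not use Apple's collation tables.

def sort_by_unicode(chars):
    """Sort by Unicode code point (radical, then stroke count, within the main CJK block)."""
    return sorted(chars, key=ord)

def sort_by_codec(chars, codec):
    """Sort by the byte sequence that a legacy encoding assigns to each character."""
    return sorted(chars, key=lambda ch: ch.encode(codec))

sample = ["中", "一", "人", "大"]

print(sort_by_unicode(sample))          # ['一', '中', '人', '大']
print(sort_by_codec(sample, "gb2312"))  # ['大', '人', '一', '中']
print(sort_by_codec(sample, "big5"))    # ['一', '人', '大', '中']
```

For these four common characters, GB code order happens to match Pinyin order (da, ren, yi, zhong), while Big Five code order matches stroke count.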

To see whether an application is localized for Chinese (i.e., can run with menus and dialogs in Chinese), select its icon in the Finder and choose Get Info from the File menu, then look in the Languages section of the window that opens. Apple uses "zh_CN" for Simplified Chinese and "zh_TW" for Traditional Chinese. To run an application in Chinese, uncheck all languages listed above Chinese in the Language tab (see above).
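The same check can be sketched in Python. This assumes that an application bundle keeps its localizations as ".lproj" folders inside Contents/Resources, which is not stated in this article; the function names and the TextEdit path are only illustrative.

```python
from pathlib import Path

def list_localizations(app_path):
    """Return the names of the .lproj folders found in an application bundle.

    Assumes the layout <app>/Contents/Resources/<language>.lproj.
    """
    resources = Path(app_path) / "Contents" / "Resources"
    if not resources.is_dir():
        return []
    return sorted(
        p.stem for p in resources.iterdir()
        if p.is_dir() and p.suffix == ".lproj"
    )

def supports_chinese(app_path):
    """Report whether zh_CN and zh_TW localizations are present."""
    names = set(list_localizations(app_path))
    return {"simplified": "zh_CN" in names, "traditional": "zh_TW" in names}

# Hypothetical usage; the path is only an example.
print(supports_chinese("/Applications/TextEdit.app"))
```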

Troubleshooting:

  • You may need to use the File Name Encoding Repair utility to see Chinese file and folder names created in OS 9 and earlier.

Fonts

Five groups of Chinese fonts are installed in Tiger:

  • Five GB 18030 fonts:
    • In the /System/Library/Fonts folder: 华文黑体 ST (SinoType) Hei Regular and 华文细黑 ST Hei Light. They both appear in the Font panel under STHeiti.
    • In the /Library/Fonts folder: 华文楷体 ST Kai Regular, 华文宋体 ST Song Regular, and 华文仿宋 ST Fangsong Regular.
  • Two GB 2312 fonts:
    • In the /System/Library/Fonts folder: Hei.
    • In the /Library/Fonts folder: Kai.
  • Two Big Five fonts that support the Big-5E and HKSCS 2001 extensions:
    • In the /System/Library/Fonts folder: LiHei 儷黑 Pro.
    • In the /Library/Fonts folder: LiSong 儷宋 Pro.
    • These contain a selection of 17,607 characters from the Unicode CJK Unified Ideographs block, 512 from Extension A, and 1,640 from Extension B.
  • Three standard Big Five fonts:
    • In the /System/Library/Fonts folder: Apple LiGothic Medium.
    • In the /Library/Fonts folder: Apple LiSung Light and BiauKai.
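The three Unicode blocks mentioned for the LiHei Pro and LiSong Pro fonts occupy fixed code point ranges. The following Python sketch classifies characters by block; the ranges are taken from the Unicode standard rather than from this article, and the function names are illustrative.

```python
# Code point ranges as defined by the Unicode standard (an addition to this article).
BLOCKS = [
    ("CJK Unified Ideographs", 0x4E00, 0x9FFF),
    ("Extension A", 0x3400, 0x4DBF),
    ("Extension B", 0x20000, 0x2A6DF),
]

def block_of(ch):
    """Return the name of the CJK block that contains ch, or 'other'."""
    cp = ord(ch)
    for name, first, last in BLOCKS:
        if first <= cp <= last:
            return name
    return "other"

def count_by_block(text):
    """Count how many characters of text fall into each block."""
    counts = {}
    for ch in text:
        name = block_of(ch)
        counts[name] = counts.get(name, 0) + 1
    return counts

# U+4E2D (main block), U+3400 (Extension A), U+20000 (Extension B), and a Latin letter.
print(count_by_block("\u4e2d\u3400\U00020000a"))
```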

OS X also makes the fonts in your OS 9/Classic System folder available. Thus, if you install the Chinese Language Kit, the old Macintosh system fonts Taipei and Beijing will be available. You can also install these fonts directly into OS X, but it really is not necessary if you already have them installed in OS 9/Classic.

For more information and a complete list of the fonts that come with Tiger, see: http://docs.info.apple.com/article.html?artnum=301332

Input Methods

Input Menu

Under the Input Menu tab in System Preferences... International, you will find check boxes that activate the components of the Traditional Chinese input method and the Simplified Chinese input method. Make sure that the "Show input menu in menu bar" box is also checked. You can also check the Character Palette box to make it appear in the Input menu, and so on:

[Screenshot: Input menu preferences]

"Keyboard Shortcuts..." leads to the Keyboard Shortcuts tab in System Preferences... Keyboard & Mouse, where you will find two keyboard shortcuts listed under the "Input Menu" heading:

  • Command-space [⌘Space] ~ Selects the previous input source. Toggles back and forth between the last two input sources selected in the Input menu.
  • Command-option-space [⌘⌥Space] ~ Selects the next input source. Cycles through the keyboards and input methods in the Input menu.

Checking the "Use one input source in all documents" box means that when you switch from one document to another, the active input source remains the same. If you check the other box, "Allow a different input source for each document," then the input source will change to whichever one was active the last time you used the document you switch to.

The Chinese input methods and plug-ins you choose will appear right away in the Input menu itself, which appears on the right side of the Menu bar:

[Screenshot: Input menu]

To activate a keyboard or input method, choose it from the menu. Its icon will appear in the Menu bar and it will have a check mark beside it in the menu. In the above example, U.S. and Japanese keyboards and input modes are followed by ITABC, a built-in Simplified Chinese Pinyin input method. Next comes Hanin, a built-in Traditional Chinese input method, and then Biauyin, a Traditional Chinese input method plug-in for typing Chinese romanizations. The last item is QIM, a powerful third-party input method that supports both Simplified and Traditional Chinese.

Chinese Input

To input Chinese, you can use Apple's built-in Traditional Chinese input method (TCIM) or Simplified Chinese input method (SCIM). There are also other input methods available.

The Input window is the first step in entering Chinese characters and words. As you type the input keys for a character, they appear in a window with a line under them to indicate they are in the active input area. Inline input eliminates the Input window and causes the input area to appear in place in the text of your document. As you type the input keys for a character, they appear in the text with a line under them to indicate the active input area. Most applications support inline input by default. Some applications allow the user to turn it on or off.

After you are done typing the input keys, press the space bar and the Candidate window will appear if it hasn't already:

[Screenshot: Candidate window]

Characters are arranged in rows. Use the up or down arrow keys to move between rows and the right or left arrow keys to move within a row, or use the mouse. Press the return key to enter the selected character into text. There are two shortcuts to enter characters into text:

  • Type the number next to a character to select and enter it.
  • Double-click on a character to select and enter it.

You can adjust the direction, font, and point size for the Candidate window in the TCIM and SCIM Preferences.

Roman Input

You can easily switch to a Roman keyboard layout using the Input menu and its keyboard shortcuts.

In addition, the Chinese input methods allow you to directly enter Roman characters as either single-byte (a.k.a. "half-width") or double-byte (a.k.a. "full-width") characters:

  • To enter single-byte Roman characters, press the caps lock key and type as you normally do.
  • To enter double-byte Roman characters, choose "Use Two Byte Roman Characters" from the Input menu. Double-byte Roman characters align with Chinese text. This is a useful property in certain contexts, such as tables and forms.
  • To enter one lower-case double-byte Roman character, press the tilde (~) key, then press the key you want. This is useful for entering numbers.
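The relationship between half-width and full-width Roman characters can be shown with a small Python sketch. It relies on the fact that Unicode places the full-width forms of the printable ASCII characters exactly 0xFEE0 code points above their half-width counterparts; the function name is illustrative, and Python's "gb2312" codec is used only to show the byte counts.

```python
def to_fullwidth(text):
    """Convert printable ASCII characters to their full-width Unicode forms."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == " ":
            out.append("\u3000")             # ideographic (full-width) space
        elif 0x21 <= code <= 0x7E:
            out.append(chr(code + 0xFEE0))   # e.g. 'A' (U+0041) becomes U+FF21
        else:
            out.append(ch)                   # leave everything else unchanged
    return "".join(out)

wide = to_fullwidth("A1")
print(wide)
# In a legacy Chinese encoding the half-width form takes one byte
# and the full-width form takes two.
print(len("A".encode("gb2312")), len(to_fullwidth("A").encode("gb2312")))
```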

Character Palette

In Cocoa applications, the Character Palette is always accessible via Edit > Special Characters... There are multiple ways to view Chinese characters. To input characters into text in an application, just double-click on the character you want, or use the "Insert" button:

[Screenshot: Character Palette, Simplified Chinese view]

  • Simplified Chinese displays the GB 18030 character set in the "by Radical" tab (shown above, includes both Simplified and Traditional characters). You can also use the "by Category" tab (Unicode blocks). If you highlight a character and then pause the mouse over it, a little info panel will appear, giving the UTF-16, UTF-8, and GB code points.
  • Traditional Chinese displays the Big Five character set in the "by Radical" tab (Traditional characters only). If you highlight an individual character and then pause the mouse over it, a little info panel will appear, giving the UTF-16, UTF-8, and Big-5E and/or HKSCS-2001 code points.
  • All Characters displays all of the characters defined in Unicode. Chinese characters are found in the "by Radical" tab.
  • Code Tables displays Chinese characters in both the "Unicode" tab and the "Other Encodings" tab. Other Encodings provides tables of four Chinese encodings: Big-5E, HKSCS-2001, GB2312, and GB18030.
  • Glyph displays the complete contents of the selected font.
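The kind of information shown in the info panel can be approximated in Python. This is a sketch under assumptions: Python's "gb18030" and "big5hkscs" codecs stand in for the GB and Big-5E/HKSCS-2001 tables, and they may not match the versions used by the Character Palette. The function name is illustrative.

```python
def info_panel(ch):
    """Return hexadecimal code values for one character in several encodings.

    A value of None means the character cannot be represented in that encoding.
    """
    codecs_to_try = [
        ("UTF-16", "utf-16-be"),
        ("UTF-8", "utf-8"),
        ("GB 18030", "gb18030"),
        ("Big5-HKSCS", "big5hkscs"),
    ]
    result = {}
    for label, codec in codecs_to_try:
        try:
            result[label] = ch.encode(codec).hex().upper()
        except UnicodeEncodeError:
            result[label] = None
    return result

print(info_panel("中"))
# {'UTF-16': '4E2D', 'UTF-8': 'E4B8AD', 'GB 18030': 'D6D0', 'Big5-HKSCS': 'A4A4'}
```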

In the Character Info section (shown above), you will find a list of characters related to the selected character, along with the input key sequences for the Apple input methods. You can drag/copy any character from an application and drop/paste it into the Character Info section to get information about that character.

In the Font Variation section, click on the triangle to see all available glyphs for the selected character in the different fonts on the system. In addition, you can choose between "glyph variants" for a single Unicode character. Currently, the only fonts that contain glyph variants are Japanese: the Hiragino fonts and Adobe's Kozuka Pro fonts. Try U+9957, for example. Not all applications support glyph variants.

The pop-up menu in the bottom left of the palette provides access to Font Book via "Manage Fonts..."

[Screenshot: Pop-up menu]

If you select a character in an application like TextEdit or Pages and then choose "Show Character Selected in Application" in the Character Palette, it will jump to that character.

Last but not least, there is the search window at the bottom right of the palette. Here you can search for Chinese characters using their Hanyu Pinyin readings, in two categories, Simplified Chinese (the GB 18030 character set) Pinyin and Traditional Chinese Pinyin. Double-click on a character in the list of search results to bring it up in the Character Palette.

[Screenshot: Search]

You can also search for Zhuyin readings, Japanese readings, Korean readings, Unicode character names, code points, and so on.

Utilities

The TCIMTool and the SCIMTool are located in the /System/Library/Components folder, within the TCIM.component and the SCIM.component.

Chinese Text Converter

Stored in the /System/Library/Services folder. Converts plain-text documents between Chinese encodings. It is available in the Services menu in Cocoa applications, and it is also built into the Input menu of the Chinese input methods, which work in all applications.
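For readers who want to see what converting between encodings involves, here is a minimal Python sketch. It is not how Chinese Text Converter itself works; it only decodes bytes from one encoding and re-encodes them in another, using Python's built-in codecs. The function name is illustrative.

```python
def convert_encoding(data, source, target):
    """Re-encode bytes from the source encoding into the target encoding.

    Raises UnicodeDecodeError if data is not valid in the source encoding,
    and UnicodeEncodeError if a character does not exist in the target encoding.
    """
    return data.decode(source).encode(target)

big5_bytes = "中文".encode("big5")
utf8_bytes = convert_encoding(big5_bytes, "big5", "utf-8")
print(utf8_bytes.decode("utf-8"))  # 中文
```

Note that this changes only the encoding, not the characters: a Traditional character that is missing from the target character set causes an error rather than being replaced by its Simplified form.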

Input Method Plug-in Converter

Stored in the /System/Library/CoreServices folder. Converts plain text (.txt) source files into Chinese input method plug-in data (.dat) files. Use the "Generate IM Plug-in" command in the TCIM or SCIM Input menu to install plug-ins. They will be installed into the /Library/Input Method Plug-Ins folder and they should appear immediately in the Input menu. To uninstall them, simply remove them from the /Library/Input Method Plug-Ins folder, then log out and log back in. Finally, select the plug-in in the Input menu, and it will disappear.

Troubleshooting:

  • If you do not have a pre-installed /Library/Input Method Plug-ins/ folder, you will have to create one. If you do not have multiple user accounts on your machine, you can simply add a new folder to your /Library/ folder and name it "Input Method Plug-ins". If you do have multiple user accounts on your machine, you must log in as the root user first and only then add the new folder.

File Name Encoding Repair

OS 8 and OS 9 use Apple's WorldScript encodings to enter and display file names. OS X uses Unicode to enter and display file names. OS 9 converts file names to Unicode for use on OS X, but when the encoding differs from the system default in OS 9 (for example, a Chinese file name on an English system) the conversion to Unicode can be incorrect. This utility corrects many common cases of incorrect conversion. See: http://www.apple.com/support/downloads/filenameencodingrepairutility.html
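The kind of damage described here, and its repair, can be sketched in Python. This is an illustration only, not the utility's actual method. Python does not ship Apple's WorldScript Chinese encodings, so "gb2312" is used as a stand-in for the original encoding and "mac_roman" for the wrongly assumed one; both function names are made up for the example.

```python
def garble(name, true_codec="gb2312", wrong_codec="mac_roman"):
    """Simulate a bad conversion: bytes in one encoding read as another."""
    return name.encode(true_codec).decode(wrong_codec)

def repair(garbled, true_codec="gb2312", wrong_codec="mac_roman"):
    """Undo the bad conversion by reversing the wrong decoding step."""
    return garbled.encode(wrong_codec).decode(true_codec)

broken = garble("中文")
print(repr(broken))    # unreadable Roman characters
print(repair(broken))  # 中文
```

The repair works only when the wrong decoding step loses no information, that is, when every byte of the original name maps to a distinct character in the wrongly assumed encoding.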

UnicodeChecker

UnicodeChecker allows you to browse the Unicode character set on your machine. For any character, it will tell you the decimal Unicode number, hexadecimal Unicode scalar value, hexadecimal UTF-8, UTF-16 and UTF-32 code, Unicode name, and more. You can also install Unihan.txt for direct access to the information from the Unihan database. Use version 1.15.1 for OS X 10.4 and above. See: http://www.earthlingsoft.net/UnicodeChecker/
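A rough equivalent of the per-character information listed above can be produced with Python's standard library. This is only a sketch of the same kind of data, not UnicodeChecker's implementation; the function name is illustrative.

```python
import unicodedata

def describe(ch):
    """Return basic Unicode facts about a single character."""
    cp = ord(ch)
    return {
        "decimal": cp,
        "scalar": f"U+{cp:04X}",
        "utf8": ch.encode("utf-8").hex().upper(),
        "utf16": ch.encode("utf-16-be").hex().upper(),   # surrogate pair above U+FFFF
        "utf32": ch.encode("utf-32-be").hex().upper(),
        "name": unicodedata.name(ch, "<unnamed>"),
    }

# U+20000 is the first character of CJK Unified Ideographs Extension B.
for key, value in describe("\U00020000").items():
    print(key, value)
```

For U+20000 the UTF-16 value is the surrogate pair D840 DC00, which shows why characters from Extension B need special handling in software that assumes one 16-bit unit per character.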

Jedit X

This text editor handles problems with CJK text documents especially well. Download Rev. 2 for OS X 10.4 and above: http://www.artman21.com/en/jedit_x/download.html

Applications

Mail 2

Mail in Tiger automatically sets the encoding of outgoing messages based on content. If your system is set to run in English (in the Language tab of System Preferences... International), or anything other than Chinese or Japanese, the default encoding for outgoing Chinese messages is UTF-8. When the system language is set to Traditional Chinese, the default is Big Five. For Simplified Chinese it is GB 2312, and for Japanese it is ISO-2022-JP.
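The defaults described above amount to a simple lookup table, sketched here in Python. This is not Mail's code; the function name is hypothetical, and the language codes "ja" and "en" are assumptions added for the example ("zh_CN" and "zh_TW" are the codes Apple uses, as noted earlier).

```python
# Default charset for outgoing Chinese messages, keyed by system language.
DEFAULT_CHARSETS = {
    "zh_TW": "big5",         # Traditional Chinese
    "zh_CN": "gb2312",       # Simplified Chinese
    "ja": "iso-2022-jp",     # Japanese ("ja" is an assumed language code)
}

def default_outgoing_charset(system_language):
    """Return the default charset; any other language falls back to UTF-8."""
    return DEFAULT_CHARSETS.get(system_language, "utf-8")

print(default_outgoing_charset("zh_TW"))  # big5
print(default_outgoing_charset("en"))     # utf-8
```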

You can manually set the encoding of an outgoing message in Message > Text Encoding. For example, "Simplified Chinese (EUC)" sets the charset name to GB2312. Unfortunately, this only changes the encoding of the body of the message. The encoding of the message subject does not change. One solution for this problem is to set your system language to Chinese or Japanese. The message subject line will then be set to the default for that language, as discussed above. Another solution has been outlined by Tom Gewecke in the Apple Discussions User Tips Library.
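The reason the body and the subject can end up in different encodings is that an e-mail message declares them separately: the body charset goes in the Content-Type header, while the subject carries its own charset inside an encoded word. The Python sketch below builds such a message with the standard email package. It illustrates the message format only, not Mail's behavior; the function name and sample text are made up.

```python
from email.header import Header, decode_header
from email.mime.text import MIMEText

def build_message(subject, body, body_charset, subject_charset):
    """Build a message whose body and subject use independent charsets."""
    msg = MIMEText(body, "plain", body_charset)
    msg["Subject"] = Header(subject, subject_charset)
    return msg

msg = build_message("测试", "你好", "gb2312", "utf-8")

# The body charset is declared in the Content-Type header.
print(msg.get_content_charset())

# The subject names its own charset inside the encoded word.
raw_subject = msg["Subject"].encode()
print(raw_subject)
print(decode_header(raw_subject))
```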

TextEdit 1.4

You can customize the pop-up menu for encodings in Preferences. At the bottom of the menu is a "Customize Encodings List..." command, which brings you to a checklist of all supported encodings.