IME on Linux
January 21, 2008. Posted by globalizer in Unicode.
Here’s a short guide to enabling an IME (Input Method Editor) on Linux. I am writing it down for no particular reason, other than that I have just had to retrace my steps to figure out how to do it for some education material I am preparing.
This should work in a more or less similar fashion across various Linux distributions, but in the steps below I use Debian.
The SCIM (Smart Common Input Method) IME comes pre-installed on the Debian distribution, but you do have to do a little additional installation and configuration to really get it to work.
Install the locales you want to use the IME in. I recommend using the UTF-8 locales for ja_JP, zh_CN, zh_TW and ko_KR.
On Debian, the easiest way is to run the command “dpkg-reconfigure locales”. This brings up a dialog where you can select the locales (from a very long and impressive list, by the way, especially when you consider that all those locales actually contain real localized versions). Once you are done and exit the dialog, the locales are automatically generated for you (even though you are told you need to do this yourself).
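To double-check that the locales really were generated, something like the following sketch can help. It assumes the usual “locale -a” output (one locale per line, with the codeset typically spelled “utf8”); the function name missing_locales is made up for this example.

```shell
#!/bin/sh
# Print every wanted locale that is absent from a "locale -a" style list
# read from stdin. Case and hyphens are ignored, so "ja_JP.UTF-8" matches
# the "ja_JP.utf8" spelling that "locale -a" usually prints.
missing_locales() {
    available=$(tr '[:upper:]' '[:lower:]' | tr -d '-')
    for wanted in "$@"; do
        key=$(printf '%s' "$wanted" | tr '[:upper:]' '[:lower:]' | tr -d '-')
        printf '%s\n' "$available" | grep -qxF "$key" || printf '%s\n' "$wanted"
    done
}

# Usage, after "dpkg-reconfigure locales" has finished:
#   locale -a | missing_locales ja_JP.UTF-8 zh_CN.UTF-8 zh_TW.UTF-8 ko_KR.UTF-8
```

No output means all the requested locales are present.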
Install appropriate fonts. Again, on Debian, this is really easy. Use the “apt-cache search” command to look for available fonts – “apt-cache search Chinese” to find Chinese fonts, for instance. Or even better, “apt-cache search ttf | grep Chinese” to whittle the results down to something like this:
cjk-latex – A LaTeX macro package for CJK (Chinese/Japanese/Korean)
ttf-arphic-bkai00mp – “AR PL KaitiM Big5” Chinese TrueType font by Arphic Technology
ttf-arphic-bsmi00lp – “AR PL Mingti2L Big5” Chinese TrueType font by Arphic Technology
ttf-arphic-gbsn00lp – “AR PL SungtiL GB” Chinese TrueType font by Arphic Technology
ttf-arphic-gkai00mp – “AR PL KaitiM GB” Chinese TrueType font by Arphic Technology
ttf-arphic-ukai – “AR PL ZenKai Uni” Chinese Unicode TrueType font Kaiti style
ttf-arphic-uming – “AR PL ShanHeiSun Uni” Chinese Unicode TrueType font Mingti style
ttf2pt1-chinese – Chinese fonts encoding maps for ttf2pt1
ttfprint – A utility to print Chinese text using TrueType fonts
Then use “apt-get install ttf-arphic-uming” to install one of the Unicode fonts. Repeat for each script or language you want to support.
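If you want to pull just the font package names out of such a search result, a small filter along these lines should do. It is only a sketch: it assumes the package name is the first word of each line and that font packages use the ttf- prefix seen in the listing above; font_packages is a name invented here.

```shell
#!/bin/sh
# Keep only the package names that start with "ttf-", which drops helper
# packages such as cjk-latex, ttf2pt1-chinese and ttfprint from the list.
font_packages() {
    awk '$1 ~ /^ttf-/ { print $1 }'
}

# Usage:
#   apt-cache search ttf | grep Chinese | font_packages
```

After installing a font, “fc-list :lang=zh” (assuming fontconfig is installed) lists the fonts the system can use for Chinese.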
Install and configure SCIM. This is the only step that required some rooting around on the web for instructions, and I actually did not succeed in finding any that I would call really good or final.
Anyway, here’s what I did:
A. “apt-cache search scim” (our trusty old apt-cache again).
This gives you a fair-sized list of packages, and it is not entirely clear exactly what you need. The base package, scim, comes with the Debian distribution. I picked the following additional packages to do “apt-get install” on, and that seems to work for me for Chinese, Japanese and Korean so far:
scim-chewing – Chewing IM engine module for SCIM
scim-chinese – Smart pinyin IM engine module for SCIM
scim-gtk2-immodule – GTK2 IMModule with SCIM
scim-hangul – Hangul Input for SCIM
scim-m17n – M17N Input for SCIM
scim-tables-ja – SCIM Japanese Input Method table data (Hiragana, Katakana, etc.)
scim-tables-ko – SCIM Korean Input Method table data (Hangul, Hanja, etc.)
scim-tables-zh – SCIM Chinese Input Method table data (WuBi, CangJie, etc.)
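Once the installs have run, a quick check like the sketch below can confirm nothing was missed. It assumes the two-column output of “dpkg --get-selections” (package name, then state); not_installed is a hypothetical helper name.

```shell
#!/bin/sh
# Read "dpkg --get-selections" style lines from stdin and print each
# requested package that is not listed with the state "install".
not_installed() {
    selections=$(cat)
    for pkg in "$@"; do
        printf '%s\n' "$selections" |
            awk -v p="$pkg" '
                { sub(/:.*/, "", $1) }   # drop an architecture suffix such as ":amd64"
                $1 == p && $2 == "install" { found = 1 }
                END { exit !found }' ||
            printf '%s\n' "$pkg"
    done
}

# Usage:
#   dpkg --get-selections | not_installed scim-chewing scim-chinese \
#       scim-gtk2-immodule scim-hangul scim-m17n \
#       scim-tables-ja scim-tables-ko scim-tables-zh
```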
B. Add the locales you want to use SCIM in to /etc/scim/global.
My file had this line to begin with:
/SupportedUnicodeLocales = en_US.UTF-8
I added the other locales I wanted to use it in, separated with commas:
/SupportedUnicodeLocales = en_US.UTF-8,ja_JP.utf8,zh_CN.utf8,zh_TW.utf8,ko_KR.utf8
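The same edit can be scripted instead of done by hand. The following is a minimal sketch, assuming the file contains a single /SupportedUnicodeLocales line like the one shown above; append_scim_locales is an invented name, and the filter writes to standard output rather than touching the file.

```shell
#!/bin/sh
# Filter: append the comma-separated locales given as $1 to the
# /SupportedUnicodeLocales line. All other lines pass through unchanged.
append_scim_locales() {
    sed "s|^\(/SupportedUnicodeLocales[[:space:]]*=.*[^[:space:]]\)[[:space:]]*\$|\1,$1|"
}

# Usage: write the result to a scratch file, review it, then copy it
# over /etc/scim/global as root.
#   append_scim_locales ja_JP.utf8,zh_CN.utf8,zh_TW.utf8,ko_KR.utf8 \
#       < /etc/scim/global > /tmp/scim-global.new
```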
C. Create a scim startup script: /etc/X11/Xsession.d/74custom-scim_startup
with the following content:
D. Execute “scim -d” when you log in or when the shell starts.
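The listing for the startup script in step C is not reproduced above. Scripts of this kind commonly export the input-method environment variables for X, GTK and Qt, so here is a sketch under that assumption. It is not the author's file, and the variable values are the conventional ones for SCIM rather than anything taken from this post.

```shell
#!/bin/sh
# Assumed contents for a startup script such as
# /etc/X11/Xsession.d/74custom-scim_startup (an illustration, not the
# original file): point the X, GTK and Qt input-method hooks at SCIM.
export XMODIFIERS="@im=SCIM"
export GTK_IM_MODULE="scim"
export QT_IM_MODULE="scim"

# Step D: start the SCIM daemon, skipping quietly when scim is not installed.
if command -v scim >/dev/null 2>&1; then
    scim -d
fi
```

After logging in again, “echo $GTK_IM_MODULE” in a terminal should print “scim” if the script was picked up.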
Now I can turn the scim IME on and off by using the Ctrl+space key combination, and this gives me access to these selections:
I thus have access to a variety of input methods for each language, allowing even somebody like me, with absolutely no ability to speak or read any of these languages, to input characters from the respective scripts with relative ease. For instance, I can simply type Pinyin romanization and get suggestions for the Chinese characters that correspond to those sounds in Mandarin Chinese: