Tk Kanji is a graphical user interface to the freely available Kanji dictionary files compiled by Jim Breen at Monash University in Melbourne. It is a very rough cut at an application which could do a lot more. For the time being, it provides a functional browser for the dictionary; a couple of kanji study drills to aid my progress through James W. Heisig's Remembering the Kanji, I: A complete course on how not to forget the Meaning and Writing of Japanese characters; a game to entertain my 3-year-old; an example of an internationalized application in Tcl/Tk; and, no doubt, a large number of flawed design decisions and bugs which I'll correct when I start doing a better job of earning a living.
Kanji means "Chinese Character" in Japanese. It denotes the set of Chinese pictographs and ideographs adopted for the purposes of writing Japanese starting in the 4th century of the common era. There are 2229 kanji in common use in Japan, according to the Jouyou/Jinmeiyou lists compiled by the Japanese government. The JISX-0208-1990 encoding specifies the 6355 kanji which are most likely to be encountered, which include the Jouyou/Jinmeiyou lists. The JISX-0212-1990 encoding specifies an additional 5801 kanji which are less frequently encountered. Tk Kanji allows you to browse kanjidic and kanjd212, Jim Breen's dictionaries describing the 12156 kanji covered by the JISX encodings.
Tk Kanji is a Tcl/Tk application. Tcl is a scripting language which runs on Unix, Windows 95/98/NT, and Apple Macintosh computers. Tk is a graphical user interface toolkit based on Tcl which runs on the same platforms. Tk Kanji illustrates the speed with which cross-platform applications can be built in Tcl/Tk, the extent and ease of use of the internationalization features of Tcl/Tk, and a bug in the Tcl internationalization support.
It's hard to appreciate how rapidly a Tcl/Tk programmer can generate useful applications. My study of Kanji began when the books which Amazon shipped on July 31, 1999 arrived. I began building a Kanji study application sometime after that. But work on Tk Kanji could only begin when I finally stumbled onto Jim Breen's dictionaries on August 13, 1999. So in Tk Kanji version 0.1 you are seeing the fruits of no more than 7 days' work.
It's also very hard to spot the part of Tk Kanji which makes it a Japanese-competent program, so I'll tell you where it is. If you look at the procedures which read files, you'll find the lines:
fconfigure $fp -encoding $encoding
fconfigure $fp -encoding euc-jp
These lines tell Tcl the encoding used in the files it is reading. The second line explicitly specifies that the file uses the Extended Unix Code for Japanese. Knowing the encoding of a file, Tcl is able to read the file and translate its contents into Unicode, a character set which represents all the ways currently used by human beings to write their languages. That, and installation of the appropriate fonts, is all it took to bootstrap Tcl/Tk to the point where it was displaying error alerts with kanji embedded in them.
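As a rough illustration of what that one fconfigure line buys you, here is a minimal sketch of a round trip through an euc-jp file. It is not the code from Tk Kanji: the procedure name readEncodedFile and the scratch file name tkkanji-demo.txt are made up for this example.

```tcl
# Hypothetical helper (not taken from Tk Kanji): read a whole file,
# telling Tcl which encoding the bytes on disk are in.
proc readEncodedFile {path {encoding euc-jp}} {
    set fp [open $path r]
    fconfigure $fp -encoding $encoding
    set text [read $fp]
    close $fp
    return $text
}

# Write three kanji (U+65E5 U+672C U+8A9E, "nihongo") to a scratch
# file as euc-jp. The file name is an assumption for this demo.
set path [file join [pwd] tkkanji-demo.txt]
set fp [open $path w]
fconfigure $fp -encoding euc-jp
puts -nonewline $fp "\u65e5\u672c\u8a9e"
close $fp

# Read it back: Tcl hands us Unicode characters, not raw bytes.
set text [readEncodedFile $path]
puts [string length $text]   ;# 3 characters in memory
puts [file size $path]       ;# 6 bytes on disk, two per kanji
file delete $path
```

The string Tcl returns is already Unicode, so the rest of the program never needs to know how the file was encoded.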
The bug in Tcl/Tk's internationalization that kanjd212 turned up is that only half of the euc-jp encoding is implemented. The Extended Unix Code is used in Chinese, Japanese, and Korean to mix single-byte codes for the Latin alphabet, double-byte codes for kanji, an alternate single-byte code set for hangul and kana, and an alternate double-byte code for even more kanji. The Japanese code is the only one that uses the alternate double-byte coding, which is introduced by the prefix byte \x8f, and Tcl/Tk doesn't currently support it. I expect this to be fixed in a future release of Tcl/Tk. Tk Kanji uses a workaround to read kanjd212, but other files will display the string "\x8f" followed by a kanji from the JISX-0208-1990 character set whenever a JISX-0212-1990 character is encountered. My apologies for any confusion this causes.
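To make the byte layout concrete, here is a small sketch that walks raw euc-jp bytes and labels each character by the code set it belongs to. It is not the workaround Tk Kanji uses; the procedure name classifyEucJp and the label names are invented for this example, and the byte ranges are the usual euc-jp ones as I understand them (\x8e prefixes a single-byte kana, \x8f prefixes a two-byte JISX-0212 character).

```tcl
# Hypothetical helper (not Tk Kanji's workaround): label each character
# in a string of raw euc-jp bytes by its code set.
proc classifyEucJp {bytes} {
    set kinds {}
    set n [string length $bytes]
    set i 0
    while {$i < $n} {
        # Convert the byte at position i to its integer value.
        scan [string index $bytes $i] %c b
        if {$b < 0x80} {
            lappend kinds ascii;    incr i 1   ;# single byte, Latin
        } elseif {$b == 0x8e} {
            lappend kinds kana;     incr i 2   ;# prefix + one kana byte
        } elseif {$b == 0x8f} {
            lappend kinds jisx0212; incr i 3   ;# prefix + two kanji bytes
        } elseif {$b >= 0xa1 && $b <= 0xfe} {
            lappend kinds jisx0208; incr i 2   ;# ordinary two-byte kanji
        } else {
            lappend kinds invalid;  incr i 1   ;# not a valid lead byte
        }
    }
    return $kinds
}

# Sample bytes: "A", one JISX-0208 kanji, one single-byte kana,
# and one \x8f-prefixed JISX-0212 kanji.
set sample [binary format H* 41c6fc8eb18fb0a1]
puts [classifyEucJp $sample]   ;# ascii jisx0208 kana jisx0212
```

A decoder that does not know about the \x8f prefix will pass that byte through and then read the next two bytes as an ordinary JISX-0208 kanji, which matches the symptom described above.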
To install Tk Kanji you will need:
-
Tcl/Tk version 8.1.1 or later. Scriptics has Windows and Macintosh self-installing binaries, sources, and perhaps even a few prebuilt Unix binaries. I haven't had time to test against the latest Tcl/Tk release, 8.2.0, but it almost certainly will work, too.
-
Japanese fonts. You may have some Japanese fonts already installed on your computer. The configuration page in Tk Kanji will let you preview the presentation of kanji in all the available font families at the sizes it uses.
There will be some deja vu involved in the previewing because Tk substitutes a font family if the family selected doesn't contain the kanji characters, so don't be alarmed if the Courier, Helvetica, and Mincho families look suspiciously similar. If all the choices look ragged or boring at large point sizes, then you probably want some nicer fonts.
For Windows, you can take Internet Explorer 5 to the "Windows Update" item in its "Tools" menu and then select the Japanese Language Support update for download. This gets you, I believe, the "MS Gothic" font. The Far-Eastern support kit for Office has an "MS Mincho" font. You can find an installer for this at the Nihongo Archive.
For X11, the best fonts derive from the GNU support for internationalized emacs. You can find these at the GNU ftp archives, along with instructions for installing them and tricking your X server into finding them. The Nihongo Archive also has the GNU fonts. I also found them packaged in rpm format via RPM finder, but they seem to have since disappeared.
-
Kanji dictionaries. You will need a copy of kanjidic.gz. There are instructions for decompressing gzipped files at the Nihongo Archive. You might want a copy of kanjd212.gz, but you don't need it to use Tk Kanji. You might also find the documentation files for kanjidic, kanjidic.doc, and for kanjd212, kanjd212.doc, helpful in explaining some of what Tk Kanji is showing you.
-
Tk Kanji itself. You can download it as a gzipped tar archive (24K), a zip archive (35K), or a tclshar archive (130K). Each of these will create a directory named TkKanji in the current directory and unpack the Tk Kanji sources into it. If everything else is set, you should be able to execute tkkanji.tcl, specify the location of your dictionaries, and go.
The tclshar archive can be unpacked by the copy of Tcl/Tk that you installed in the first step, so it's the best choice of archive if zip and gzipped tar don't ring any bells.
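If you want to check the requirements above before launching, something like the following sketch can be run in tclsh. It is not part of Tk Kanji: the procedure name checkPrerequisites is invented here, and it assumes the decompressed dictionary is named kanjidic and sits in the directory you pass in. It does not check fonts, which needs Tk and a display.

```tcl
# Hypothetical pre-flight check (not shipped with Tk Kanji).
# Returns a list of problems; an empty list means all checks passed.
proc checkPrerequisites {dictDir} {
    set problems {}
    # Tk Kanji needs Tcl/Tk 8.1.1 or later.
    if {[package vcompare [info patchlevel] 8.1.1] < 0} {
        lappend problems "Tcl [info patchlevel] is older than 8.1.1"
    }
    # The dictionaries are read with the euc-jp encoding.
    if {[lsearch -exact [encoding names] euc-jp] < 0} {
        lappend problems "this Tcl has no euc-jp encoding"
    }
    # Assumes the decompressed dictionary file is named kanjidic.
    if {![file exists [file join $dictDir kanjidic]]} {
        lappend problems "kanjidic not found in $dictDir"
    }
    return $problems
}

# Check the current directory and report anything that looks wrong.
foreach problem [checkPrerequisites [pwd]] {
    puts "warning: $problem"
}
```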
Thanks are due to several people and organizations.
The work of the Unicode Consortium made all the world's writing systems available for inclusion in computer programs.
The developers of Tcl/Tk made Unicode available to programs running on Unix, Windows, and Macintosh computers.
GNU and Microsoft supplied the fonts to display the Unicode.
Jim Breen and his students and collaborators at the Nihongo Archive compiled and provided the dictionaries which make the program really interesting.
Bruce Gingery and Larry Virden provided feedback on web page configuration and installation instruction lapses.
Domo arigato.
Tk Kanji is Copyright © 1999 by Roger E Critchlow Jr, Santa Fe, New Mexico, USA.
This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.