HOW TO ADD A NEW CHARACTER SET MAPPING.

* Create a struct unicode_info structure.  This structure defines the
  official character set name, as well as pointers to conversion functions.

* Add the name of the character set, and the name of your structure, to
  unicode/charsetlist.txt.

Multiple entries in unicode/charsetlist.txt can be used to define aliases
for the same character set.  For example, "IBM869" and "CP869" specify the
same character set: both point to the unicode_IBM_869 object, which is
defined in ibm869.c.

The source file charsetlist.c is generated automatically by a script from
charsetlist.txt.  That's how character sets end up being linked into the
code, and how individual character sets can be selectively included or
excluded.

The struct unicode_info structure contains pointers to the following
functions:

+ Convert text in this character set to unicode.

+ Convert unicode to text in this character set.

+ Convert text in this character set to uppercase.

+ Convert text in this character set to lowercase.

+ Convert text in this character set to titlecase.

If the character set allows for convenient conversion to
upper/lower/titlecase, the conversion should be coded directly.  Otherwise,
the library has a set of convenience functions that work against the
unicode master table.  Text in any character set can be
upper/lower/titlecased by converting it to unicode, running it through
unicode_uc/unicode_lc/unicode_tc, then converting the unicode back to the
original character set.  See utf8_chset.c for an example.

Note that unicode_uc/unicode_lc/unicode_tc carry a heavy penalty, and
should be avoided where possible: unicode_[ult]c() adds about 26Kb of data
tables.

Finally, all this code has to be added to libunicode.a.  It can simply be
added to libunicode_a_SOURCES.

After doing all that, run make to build libunicode.a and the unicode-info
program.  Run unicode-info.
If the character set is listed by unicode-info, you should be all set, provided that the conversion functions actually work as advertised.
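
The overall shape of the steps above can be sketched in C.  This is only an
illustration, not the library's code: the real struct unicode_info member
names and function signatures are not shown in this document, so everything
prefixed with demo_ below (the structure, the alias table that stands in for
the generated charsetlist.c, and the toy demo_toy_uc() that stands in for
unicode_uc) is hypothetical.  It shows one character set registered under
two names, a directly coded uppercase conversion, and the slower
round trip through unicode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; NOT the library's real types or signatures. */
typedef unsigned long demo_unichar;

struct demo_charset_info {
    const char *name;                                /* official name */
    demo_unichar *(*to_unicode)(const char *text);   /* malloc'd, 0-terminated */
    char *(*from_unicode)(const demo_unichar *text); /* malloc'd, NUL-terminated */
    char *(*to_upper)(const char *text);             /* malloc'd, NUL-terminated */
};

/* Toy 7-bit character set: every byte maps to the same code point. */
static demo_unichar *demo_ascii_to_unicode(const char *text)
{
    size_t i, n = strlen(text);
    demo_unichar *u = malloc((n + 1) * sizeof(*u));

    if (!u)
        return NULL;
    for (i = 0; i < n; i++) {
        unsigned char c = (unsigned char)text[i];

        if (c > 0x7F) { free(u); return NULL; }  /* not in this charset */
        u[i] = c;
    }
    u[n] = 0;
    return u;
}

static char *demo_ascii_from_unicode(const demo_unichar *text)
{
    size_t i, n = 0;
    char *s;

    while (text[n])
        n++;
    if (!(s = malloc(n + 1)))
        return NULL;
    for (i = 0; i < n; i++) {
        if (text[i] > 0x7F) { free(s); return NULL; }  /* unmappable */
        s[i] = (char)text[i];
    }
    s[n] = '\0';
    return s;
}

/* Uppercase coded directly: no unicode tables needed for this charset. */
static char *demo_ascii_to_upper(const char *text)
{
    size_t i, n = strlen(text);
    char *s = malloc(n + 1);

    if (!s)
        return NULL;
    for (i = 0; i <= n; i++)  /* <= n also copies the terminating NUL */
        s[i] = (text[i] >= 'a' && text[i] <= 'z')
            ? (char)(text[i] - 'a' + 'A') : text[i];
    return s;
}

static const struct demo_charset_info demo_ascii = {
    "DEMO-ASCII", demo_ascii_to_unicode, demo_ascii_from_unicode,
    demo_ascii_to_upper
};

/* Stand-in for the generated list: two names, one structure (an alias). */
static const struct demo_charset_entry {
    const char *name;
    const struct demo_charset_info *info;
} demo_charsetlist[] = {
    { "DEMO-ASCII", &demo_ascii },
    { "DEMO-646",   &demo_ascii },
    { NULL, NULL }
};

static const struct demo_charset_info *demo_find(const char *name)
{
    size_t i;

    for (i = 0; demo_charsetlist[i].name; i++)
        if (strcmp(demo_charsetlist[i].name, name) == 0)
            return demo_charsetlist[i].info;
    return NULL;
}

/* Toy per-character uppercase; a real one would use the master table. */
static demo_unichar demo_toy_uc(demo_unichar c)
{
    return (c >= 'a' && c <= 'z') ? c - 'a' + 'A' : c;
}

/* Generic fallback: charset -> unicode -> uppercase -> charset. */
static char *demo_upper_via_unicode(const struct demo_charset_info *info,
                                    const char *text,
                                    demo_unichar (*uc)(demo_unichar))
{
    demo_unichar *u;
    char *s;
    size_t i;

    assert(info && text && uc);
    if (!(u = info->to_unicode(text)))
        return NULL;
    for (i = 0; u[i]; i++)
        u[i] = uc(u[i]);
    s = info->from_unicode(u);
    free(u);
    return s;
}
```

In this sketch, demo_find("DEMO-ASCII") and demo_find("DEMO-646") return the
same pointer, which is the aliasing arrangement described for "IBM869" and
"CP869".  The direct demo_ascii_to_upper() and the generic
demo_upper_via_unicode() produce the same text; the direct version simply
avoids dragging in the case-mapping tables.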