+++ /dev/null
-
-HOW TO ADD A NEW CHARACTER SET MAPPING.
-
- * Create a struct unicode_info structure. This structure defines the
- official character set name, as well as pointers to conversion functions.
-
- * Add the name of the character set, and the name of your structure, to
- unicode/charsetlist.txt. Multiple entries in unicode/charsetlist.txt can
- be used to define aliases for the same character set. For example,
- "IBM869" and "CP869" specify the same character set: both point to the
- unicode_IBM_869 object, which is defined in ibm869.c.
-
- The source file charsetlist.c is generated automatically by a script from
- charsetlist.txt. That's how character sets end up being linked into the
- code, and how individual character sets can be selectively included or
- excluded.
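To make the alias mechanism concrete, here is a minimal sketch of the kind of name table a generated charsetlist.c could contain. The real generated file and the real struct unicode_info are not shown in this document, so every demo_ name below is hypothetical, and the exact-match lookup is an assumption rather than the library's actual behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct unicode_info; the real structure also
   carries pointers to the conversion functions. */
struct demo_charset_info {
    const char *chset;              /* official character set name */
};

static const struct demo_charset_info demo_IBM_869   = { "IBM869" };
static const struct demo_charset_info demo_ISO8859_1 = { "ISO-8859-1" };

/* One row per line of the (assumed) charset list: a name and the object
   it maps to. Aliases are simply extra rows pointing at the same object. */
struct demo_charset_entry {
    const char *name;
    const struct demo_charset_info *info;
};

static const struct demo_charset_entry demo_charsetlist[] = {
    { "IBM869",     &demo_IBM_869   },
    { "CP869",      &demo_IBM_869   },  /* alias for IBM869 */
    { "ISO-8859-1", &demo_ISO8859_1 },
    { NULL,         NULL            }   /* end-of-table marker */
};

/* Look a character set up by name (exact match, by assumption).
   Returns NULL when the name is not in the table. */
const struct demo_charset_info *demo_find_charset(const char *name)
{
    const struct demo_charset_entry *e;

    for (e = demo_charsetlist; e->name != NULL; e++)
        if (strcmp(e->name, name) == 0)
            return e->info;
    return NULL;
}
```

In this sketch, leaving a row out of the table is all it takes to exclude a character set, since nothing else refers to its object.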
-
- The struct unicode_info structure contains pointers to the following
- functions:
-
- + Convert text in this character set to unicode.
-
- + Convert unicode to text in this character set.
-
- + Convert text in this character set to uppercase.
-
- + Convert text in this character set to lowercase.
-
- + Convert text in this character set to titlecase.
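The five function pointers listed above can be pictured with a sketch like the following, which fills in a structure for US-ASCII and codes the case conversions directly. The field names, signatures and the demo_unichar type are assumptions made for illustration, not the real struct unicode_info declaration, and "titlecase" is taken here to mean "uppercase the first character, lowercase the rest".

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_unichar;     /* hypothetical: one code point */

/* Hypothetical layout. All functions return malloc()ed, 0-terminated
   buffers that the caller frees, or NULL on failure. */
struct demo_charset_info {
    const char *chset;                            /* official name */
    demo_unichar *(*c2u)(const char *text);       /* charset -> unicode */
    char *(*u2c)(const demo_unichar *uc);         /* unicode -> charset */
    char *(*toupper_func)(const char *text);
    char *(*tolower_func)(const char *text);
    char *(*totitle_func)(const char *text);
};

static demo_unichar *ascii_c2u(const char *text)
{
    size_t n = strlen(text), i;
    demo_unichar *uc = malloc((n + 1) * sizeof(*uc));

    if (!uc) return NULL;
    for (i = 0; i < n; i++) {
        unsigned char ch = (unsigned char)text[i];
        if (ch >= 0x80) { free(uc); return NULL; }    /* not ASCII */
        uc[i] = ch;
    }
    uc[n] = 0;
    return uc;
}

static char *ascii_u2c(const demo_unichar *uc)
{
    size_t n = 0, i;
    char *text;

    while (uc[n]) n++;
    text = malloc(n + 1);
    if (!text) return NULL;
    for (i = 0; i < n; i++) {
        if (uc[i] >= 0x80) { free(text); return NULL; } /* unmappable */
        text[i] = (char)uc[i];
    }
    text[n] = 0;
    return text;
}

static char up(char c) { return (c >= 'a' && c <= 'z') ? (char)(c - 'a' + 'A') : c; }
static char lo(char c) { return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c; }

/* mode: 0 = upper, 1 = lower, 2 = title (first char upper, rest lower) */
static char *ascii_map(const char *text, int mode)
{
    size_t n = strlen(text), i;
    char *out = malloc(n + 1);

    if (!out) return NULL;
    for (i = 0; i < n; i++)
        out[i] = (mode == 0 || (mode == 2 && i == 0)) ? up(text[i]) : lo(text[i]);
    out[n] = 0;
    return out;
}

static char *ascii_upper(const char *t) { return ascii_map(t, 0); }
static char *ascii_lower(const char *t) { return ascii_map(t, 1); }
static char *ascii_title(const char *t) { return ascii_map(t, 2); }

const struct demo_charset_info demo_US_ASCII = {
    "US-ASCII", ascii_c2u, ascii_u2c, ascii_upper, ascii_lower, ascii_title
};
```

A caller would go through the structure, for example demo_US_ASCII.toupper_func("abc"), and free() the returned buffer when done.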
-
- If the character set allows for convenient conversion to
- upper/lower/titlecase, the conversion should be coded directly.
- Otherwise, the library has a set of convenience functions that work
- against the unicode master table. Text in any character set can be
- upper/lower/titlecased by converting it to unicode, running it through
- unicode_uc/unicode_lc/unicode_tc, then converting the unicode back to the
- original character set. See utf8_chset.c for an example.
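The round trip described above (charset to unicode, case-map, unicode back to charset) might look like the sketch below for ISO-8859-1. demo_unicode_uc() is a tiny stand-in for unicode_uc(), whose real signature and tables are not shown in this document; it only knows a-z, U+00E0 through U+00FE, and U+00FF. All other names are hypothetical as well, and keeping the original character when its uppercase form does not fit the target set is a design choice of this sketch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned long demo_unichar;     /* hypothetical: one code point */

/* Stand-in for unicode_uc(): a few hard-coded ranges, not the master table. */
static demo_unichar demo_unicode_uc(demo_unichar c)
{
    if (c >= 'a' && c <= 'z') return c - 'a' + 'A';
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7) return c - 0x20;
    if (c == 0xFF) return 0x178;        /* uppercase lies outside ISO-8859-1 */
    return c;
}

/* ISO-8859-1 -> unicode: every byte value is its own code point. */
static demo_unichar *latin1_c2u(const char *text)
{
    size_t n = strlen(text), i;
    demo_unichar *uc = malloc((n + 1) * sizeof(*uc));

    if (!uc) return NULL;
    for (i = 0; i < n; i++)
        uc[i] = (unsigned char)text[i];
    uc[n] = 0;
    return uc;
}

/* unicode -> ISO-8859-1; NULL if any code point is above U+00FF. */
static char *latin1_u2c(const demo_unichar *uc)
{
    size_t n = 0, i;
    char *text;

    while (uc[n]) n++;
    text = malloc(n + 1);
    if (!text) return NULL;
    for (i = 0; i < n; i++) {
        if (uc[i] > 0xFF) { free(text); return NULL; }
        text[i] = (char)(unsigned char)uc[i];
    }
    text[n] = 0;
    return text;
}

/* Uppercase ISO-8859-1 text by way of unicode. Caller frees the result. */
char *latin1_toupper_via_unicode(const char *text)
{
    demo_unichar *uc = latin1_c2u(text);
    char *out;
    size_t i;

    if (!uc) return NULL;
    for (i = 0; uc[i]; i++) {
        demo_unichar u = demo_unicode_uc(uc[i]);
        if (u <= 0xFF)                  /* keep the original if it won't fit */
            uc[i] = u;
    }
    out = latin1_u2c(uc);
    free(uc);
    return out;
}
```

The extra allocation and the double conversion are the price of not coding the case tables for the character set by hand.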
-
- Note that unicode_uc/unicode_lc/unicode_tc carry a heavy size penalty and
- should be avoided when the conversion can be coded directly:
- unicode_[ult]c() adds about 26Kb of data tables.
-
- Finally, all this code has to be added to libunicode.a. It can simply be
- added to libunicode_a_SOURCES.
-
- After doing all that, run make to build libunicode.a and the
- unicode-info program, then run unicode-info. If the character set is listed
- by unicode-info, you should be all set, provided that the conversion
- functions actually work as advertised.
-