--- /dev/null
+
+HOW TO ADD A NEW CHARACTER SET MAPPING.
+
+ * Create a struct unicode_info structure. This structure defines the
+ official character set name, as well as pointers to conversion functions.
+
+ * Add the name of the character set and the name of your structure to
+ unicode/charsetlist.txt. Multiple entries in unicode/charsetlist.txt can
+ be used to define aliases for the same character set. For example,
+ "IBM869" and "CP869" specify the same character set: both point to the
+ unicode_IBM_869 object, which is defined in ibm869.c.
+
+ A script generates the source file charsetlist.c from charsetlist.txt.
+ That's how character sets end up being linked into the code, and how
+ individual character sets can be selectively included or excluded.
+
+ The struct unicode_info structure contains pointers to the following
+ functions:
+
+ + Convert text in this character set to unicode.
+
+ + Convert unicode to text in this character set.
+
+ + Convert text in this character set to uppercase.
+
+ + Convert text in this character set to lowercase.
+
+ + Convert text in this character set to titlecase.
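+
+ The five entry points listed above could be sketched as follows. This is
+ only an illustration, not the library's actual declaration: the type
+ sketch_unichar, the struct sketch_charset_info, its field names and the
+ function signatures are all assumptions made for this example. Check the
+ library's header for the real struct unicode_info layout. Titlecase is
+ taken here to mean "first character upper, the rest lower", which is
+ also an assumption.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the library's unicode character type. */
typedef unsigned long sketch_unichar;

/* Hypothetical layout: the official name plus the five function pointers.
 * Every function returns a malloc'ed, 0-terminated buffer, or NULL. */
struct sketch_charset_info {
    const char *chset;
    sketch_unichar *(*to_unicode)(const char *text);
    char *(*from_unicode)(const sketch_unichar *utext);
    char *(*to_upper)(const char *text);
    char *(*to_lower)(const char *text);
    char *(*to_title)(const char *text);
};

/* US-ASCII to unicode: every byte maps to the same code point. */
static sketch_unichar *ascii_to_unicode(const char *text)
{
    size_t i, n;
    sketch_unichar *u;

    assert(text != NULL);
    n = strlen(text);
    u = malloc((n + 1) * sizeof(*u));
    if (u == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        u[i] = (unsigned char)text[i];
    u[n] = 0;
    return u;
}

/* Unicode to US-ASCII: fails (NULL) on anything above 127. */
static char *ascii_from_unicode(const sketch_unichar *utext)
{
    size_t i, n = 0;
    char *s;

    assert(utext != NULL);
    while (utext[n] != 0)
        n++;
    s = malloc(n + 1);
    if (s == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        if (utext[i] > 127) {
            free(s);
            return NULL;
        }
        s[i] = (char)utext[i];
    }
    s[n] = '\0';
    return s;
}

enum sketch_case { SKETCH_UPPER, SKETCH_LOWER, SKETCH_TITLE };

/* Case conversion coded directly, with no unicode tables involved. */
static char *ascii_recase(const char *text, enum sketch_case mode)
{
    size_t i, n;
    char *s;

    assert(text != NULL);
    n = strlen(text);
    s = malloc(n + 1);
    if (s == NULL)
        return NULL;
    for (i = 0; i <= n; i++) {          /* also copies the terminator */
        char c = text[i];
        int want_upper = mode == SKETCH_UPPER ||
                         (mode == SKETCH_TITLE && i == 0);

        if (want_upper && c >= 'a' && c <= 'z')
            c = (char)(c - 'a' + 'A');
        else if (!want_upper && c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');
        s[i] = c;
    }
    return s;
}

static char *ascii_upper(const char *t) { return ascii_recase(t, SKETCH_UPPER); }
static char *ascii_lower(const char *t) { return ascii_recase(t, SKETCH_LOWER); }
static char *ascii_title(const char *t) { return ascii_recase(t, SKETCH_TITLE); }

/* The object that a charsetlist.txt entry would name (hypothetical). */
const struct sketch_charset_info sketch_US_ASCII = {
    "US-ASCII",
    ascii_to_unicode, ascii_from_unicode,
    ascii_upper, ascii_lower, ascii_title
};
```
+
+ A caller would go through the structure, for instance
+ sketch_US_ASCII.to_upper("ibm869"), and free() the returned buffer.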
+
+ If the character set allows for convenient conversion to
+ upper/lower/titlecase, the conversion should be coded directly.
+ Otherwise, the library has a set of convenience functions that work from
+ the unicode master table. Text in any character set can be
+ upper/lower/titlecased by converting it to unicode, running it through
+ unicode_uc/unicode_lc/unicode_tc, then converting the unicode back to the
+ original character set. See utf8_chset.c for an example.
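+
+ The round trip through unicode can be sketched as below, using ISO-8859-1
+ as the example character set. Nothing here is the library's own code:
+ rt_unichar, rt_uc() and the latin1_* helpers are hypothetical names, and
+ rt_uc() is a tiny, incomplete stand-in for unicode_uc that only knows
+ ASCII and the Latin-1 letters whose uppercase form is also in Latin-1.
+
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the library's unicode character type. */
typedef unsigned long rt_unichar;

/* Stand-in for unicode_uc (assumption: maps one code point to uppercase).
 * Covers a-z and U+00E0..U+00FE except U+00F7 (the division sign). */
static rt_unichar rt_uc(rt_unichar c)
{
    if (c >= 'a' && c <= 'z')
        return c - 'a' + 'A';
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return c - 0x20;
    return c;
}

/* ISO-8859-1 to unicode: each byte is its own code point. */
static rt_unichar *latin1_to_unicode(const char *text)
{
    size_t i, n = strlen(text);
    rt_unichar *u = malloc((n + 1) * sizeof(*u));

    if (u == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        u[i] = (unsigned char)text[i];
    u[n] = 0;
    return u;
}

/* Unicode to ISO-8859-1: fails (NULL) on code points above 255. */
static char *latin1_from_unicode(const rt_unichar *u)
{
    size_t i, n = 0;
    char *s;

    while (u[n] != 0)
        n++;
    s = malloc(n + 1);
    if (s == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        if (u[i] > 255) {
            free(s);
            return NULL;
        }
        s[i] = (char)(unsigned char)u[i];
    }
    s[n] = '\0';
    return s;
}

/* Uppercase by way of unicode: convert, map each code point, convert back.
 * Returns a malloc'ed string, or NULL on failure. */
char *latin1_upper_via_unicode(const char *text)
{
    rt_unichar *u;
    char *result;
    size_t i;

    assert(text != NULL);
    u = latin1_to_unicode(text);
    if (u == NULL)
        return NULL;
    for (i = 0; u[i] != 0; i++)
        u[i] = rt_uc(u[i]);
    result = latin1_from_unicode(u);
    free(u);
    return result;
}
```
+
+ The lowercase and titlecase variants would differ only in the per
+ character mapping function that is applied in the middle step.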
+
+ Note that unicode_uc/unicode_lc/unicode_tc carry a heavy size penalty and
+ should be avoided where possible: unicode_[ult]c() adds about 26 KB of
+ data tables.
+
+ Finally, all this code has to be added to libunicode.a. It can simply be
+ added to libunicode_a_SOURCES.
+
+ After doing all that, run make to build libunicode.a and the
+ unicode-info program, then run unicode-info. If the character set is
+ listed by unicode-info, you should be all set, provided that the
+ conversion functions actually work as advertised.
+