1 -*-text-*-
2
3 Problems, fixmes and other issues in the emacs-unicode branch
4 -------------------------------------------------------------
5
6 Notes by fx to record various things of variable importance. handa
7 needs to check them -- don't take too seriously, especially with
8 regard to completeness.
9
10 _Do take seriously that you don't want this branch unless you're
11 actually working on it; you risk your data by actually using it._ If
12 you just want to edit Unicode and/or unify iso-8859 et al, see the
13 existing support and the extra stuff at
14 <URL:ftp://dlpx1.dl.ac.uk/fx/emacs/Mule>, mostly now in the CVS trunk.
15 (Editing support is mostly orthogonal to the internal representation.)
16
17 * SINGLE_BYTE_CHAR_P returns true for Latin-1 characters, which has
18 undesirable effects.
19
20 * Rationalize character syntax and its relationship to the Unicode
21 database. Specifically, the latin-N.el files aren't consistent for
22 common characters (and obviously have redundancies except in
23 unibyte mode).
24
25 * Fontset handling and customization need work. We want to relate
26 fonts to scripts, probably based on the Unicode blocks. The
27 presence of small-repertoire 10646-encoded fonts in XFree 4 is a
28 pain, not currently worked round.
29
30 * Work is also needed on charset and coding system priorities.
31
32 * The relevant bits of latin1-disp.el need porting (and probably
33 re-naming/updating). See also cyril-util.el.
34
35 * Quail files need more work now the encoding is irrelevant.
36
37 * What to do with the old coding categories stuff?
38
39 * Syntax for symbols &c in characters.el needs looking at.
40
41 * The preferred-coding-system property of charsets should probably be
42 junked unless it can be made more useful now.
43
44 * find-coding-systems-for-charsets needs re-writing or removing.
45
46 * find-multibyte-characters needs looking at.
47
48 * Implement Korean cp949/UHC and any other important missing
49 charsets.
50
51 * Check up on definitions of tcvn and alternativnyj.
52
53 * Lazy-load tables for unify-charset somehow?
54
55 * Translation tables for {en,de}code currently aren't supported.
56
57 * Defining CCL coding systems currently doesn't work.
58
59 * iso-2022 charsets get unified on i/o.
60
61 * Revisit locale processing: look at treating the language and
62 charset parts separately. (Language should affect things like
63 spelling and calendar, but that's not a Unicode issue.)
64
65 * Handle Unicode combining characters usefully, e.g. diacritics, and
66 handle more scripts specifically (à la Devanagari). There are
67 issues with canonicalization.
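
To illustrate the canonicalization issue with combining characters, here
is a minimal sketch in Python (not Emacs code; it assumes only the
standard unicodedata module). The helper name canonically_equal is made
up for this example. The same visible character can be stored as one
precomposed code point or as a base letter plus a combining mark, and a
naive comparison treats the two as different.

```python
import unicodedata

composed = "\u00e9"      # LATIN SMALL LETTER E WITH ACUTE, one code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

def canonically_equal(a, b):
    """Compare two strings after normalizing both to NFC.

    Hypothetical helper for illustration only.
    """
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# A plain comparison sees two different sequences.
print(composed == decomposed)
# After normalization they compare equal.
print(canonically_equal(composed, decomposed))
# NFD splits the precomposed form back into base letter + combining mark.
print([hex(ord(c)) for c in unicodedata.normalize("NFD", composed)])
```

Searching, sorting and case conversion would presumably all have to
decide which of the two forms they operate on.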
68
69 * Bidi is a separate issue with no support currently.
70
71 * DTRT with X keysyms. We should get the right Unicode character for a given
72 keysym, not decode raw bytes in some ill-defined coding system.
73 (fx has some data on keysyms v. unicodes.)
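
A rough sketch of what a direct keysym-to-Unicode mapping might look
like, written in Python for illustration. It assumes the convention
documented in current X11 keysymdef.h: Latin-1 keysyms equal their code
points, and keysyms 0x01000100 to 0x0110ffff encode a Unicode character
as 0x01000000 plus the code point. The function name and the two-entry
LEGACY_KEYSYMS table are hypothetical; a real table would be much larger
and is not the data fx refers to above.

```python
# Tiny, illustrative excerpt of a legacy-keysym table (assumption: values
# taken from keysymdef.h comments; a complete table has many more entries).
LEGACY_KEYSYMS = {
    0x01A1: 0x0104,  # XK_Aogonek    -> LATIN CAPITAL LETTER A WITH OGONEK
    0x06C1: 0x0430,  # XK_Cyrillic_a -> CYRILLIC SMALL LETTER A
}

def keysym_to_unicode(keysym):
    """Return the Unicode code point for KEYSYM, or None if unknown."""
    # Latin-1 keysyms coincide with their Unicode code points.
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return keysym
    # Directly encoded Unicode keysyms: 0x01000000 + code point.
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return keysym - 0x01000000
    # Everything else needs a lookup table; function keys have no mapping.
    return LEGACY_KEYSYMS.get(keysym)

for ks in (0xE9, 0x010020AC, 0x01A1, 0xFF0D):
    print(hex(ks), keysym_to_unicode(ks))
```

The point of the sketch is that no coding system is involved at all: the
result depends only on the keysym value.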
74
75 * We need tabular input methods, e.g. for maths symbols. (Not
76 specific to Unicode.)
77
78 * Need multibyte text in menus, e.g. for the above. (Not specific to
79 Unicode.)
80
81 * Still can't have case pairs which have different byte lengths --
82 can that be fixed for Turkish, at least?
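
For concreteness, the Turkish case pairs are I/dotless-i (U+0131) and
dotted-I (U+0130)/i. The Python sketch below measures their lengths in
UTF-8, on the assumption that the branch's internal representation gives
these characters the same byte lengths as UTF-8 does. The names
TURKISH_PAIRS and utf8_len are made up for the example.

```python
# (upper, lower) case pairs used in Turkish.
TURKISH_PAIRS = [
    ("I", "\u0131"),   # LATIN CAPITAL LETTER I / LATIN SMALL LETTER DOTLESS I
    ("\u0130", "i"),   # LATIN CAPITAL LETTER I WITH DOT ABOVE / small i
]

def utf8_len(ch):
    """Number of bytes CH occupies when encoded as UTF-8."""
    return len(ch.encode("utf-8"))

# Collect the pairs whose two members differ in encoded length.
mismatched = [(u, l) for (u, l) in TURKISH_PAIRS if utf8_len(u) != utf8_len(l)]

for upper, lower in TURKISH_PAIRS:
    print(hex(ord(upper)), utf8_len(upper), hex(ord(lower)), utf8_len(lower))
```

Both pairs mix a one-byte ASCII letter with a two-byte character, so
changing case in place would change the length of the text.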
83
84 * There's currently no support for Unicode normalization.
85
86 * Populate char-width-table correctly for Unicode characters and
87 worry about what happens when double-width charsets covering
88 non-CJK characters are unified.
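
One possible starting point for the widths is the East Asian Width
property from the Unicode database. The Python sketch below is only an
illustration of the problem, not a proposal for how char-width-table
should be filled; display_width and its ambiguous_wide flag are
hypothetical. It ignores zero-width combining characters. Greek alpha is
classed as "ambiguous": narrow in most contexts, but double-width in
legacy CJK charsets, which is the awkward case once such charsets are
unified.

```python
import unicodedata

def display_width(ch, ambiguous_wide=False):
    """Guess the column width of CH from its East Asian Width class.

    Hypothetical helper: wide (W) and fullwidth (F) characters take two
    columns; ambiguous (A) characters take two only when requested.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2
    if eaw == "A" and ambiguous_wide:
        return 2
    return 1

for ch in ("A", "\u4e2d", "\u03b1"):
    print(hex(ord(ch)), unicodedata.east_asian_width(ch),
          display_width(ch), display_width(ch, ambiguous_wide=True))
```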
89
90 * Emacs 20/21 .elc files are currently not loadable. It may or may
91 not be possible to do this properly.
92
93 * Encoding issues in babyl files/rmail need sorting out.
94
95 * Gnus still needs some attention, and we need to get changes
96 accepted by Gnus maintainers...
97
98 * You can grep the code for lots of fixmes.