README.unicode

   1                                                                    -*-text-*-
   2
   3 Problems, fixmes and other issues in the emacs-unicode branch
   4
   5 Notes by fx to record a few things.  handa needs to check them --
   6 don't take them too seriously, especially with regard to completeness.
   7
   8 Do take seriously that you don't want this CVS branch unless you're
   9 actually working on it.  If you just want to edit Unicode and/or unify
  10 iso-8859 et al, see the existing support and the extra stuff at
  11 <URL:ftp://dlpx1.dl.ac.uk/fx/emacs/Mule>.  Editing support is mostly
  12 orthogonal to the internal representation.
  13
  14  * SINGLE_BYTE_CHAR_P returns true for Latin-1 characters.
  15
  16  * Grok UTF-8 surrogates.
  17
  18  * Rationalize character syntax and its relationship to the Unicode
  19    database.  Specifically, the latin-N.el files aren't consistent for
  20    common characters.
  21
  22  * Fontset handling and customization need work.
  23
  24  * Likewise for charset and coding system priorities.
  25
  26  * The relevant bits of latin1-disp.el need porting (and probably
  27    re-naming/updating).  See also cyril-util.el.
  28
  29  * Quail files need work now that the encoding is irrelevant.  E.g.
  30    make unified Latin prefix and postfix input methods.
  31
  32  * What to do with the old coding categories stuff?
  33
  34  * Syntax for symbols &c in characters needs looking at.
  35
  36  * The preferred-coding-system property of charsets should probably be
  37    junked unless it can be made more useful now.
  38
  39  * find-coding-systems-for-charsets needs re-writing.
  40
  41  * find-multibyte-characters needs looking at.
  42
  43  * Implement Korean cp949/UHC and any other important missing
  44    charsets.
  45
  46  * Check up on tcvn and alternativnyj.
  47
  48  * Lazy-load tables for unify-charset somehow?
  49
  50  * Should translation tables for {en,de}code and input work now or be
  51    scrapped?
  52
  53  * Defining CCL coding systems currently doesn't work.
  54
  55  * iso-2022 charsets get unified on i/o.
  56
  57  * Revisit locale processing: look at treating the language and
  58    charset parts separately.  (Language should affect things like
  59    spelling and calendar, but that's not a Unicode issue.)
  60
  61  * Handle Unicode combining characters usefully, e.g. diacritics, and
  62    handle more scripts specifically (à la Devanagari).  There are
  63    issues with canonicalization.
  64
  65  * Bidi is a separate issue.
  66
  67  * DTRT with X keysyms.  We should get the right Unicode character for
  68    a given keysym, not decode raw bytes in some ill-defined coding
  69    system.  (fx has some data on keysyms v. Unicode code points.)
  70
  71  * We need tabular input methods, e.g. for maths symbols.  (Not
  72    specific to Unicode.)
  73
  74  * Need multibyte text in menus, e.g. for the above.  (Not specific to
  75    Unicode.)
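
Illustration for the "Grok UTF-8 surrogates" item above.  This is only a
sketch, written in Python rather than in the branch's own C/Lisp, and it
assumes the item means input in which each half of a UTF-16 surrogate
pair has been encoded as its own three-byte UTF-8 sequence.  The function
names are made up for this note; nothing here is taken from the Emacs
sources.

```python
def combine_surrogates(high: int, low: int) -> int:
    """Combine a high/low surrogate pair into one code point."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


def decode_utf8_with_surrogates(data: bytes) -> str:
    """Decode UTF-8, folding separately encoded surrogate pairs.

    Lone surrogates are passed through unchanged rather than rejected;
    that policy is an assumption of this sketch.
    """
    # "surrogatepass" lets the decoder accept the 3-byte surrogate forms.
    text = data.decode("utf-8", "surrogatepass")
    out = []
    i = 0
    while i < len(text):
        cp = ord(text[i])
        if (0xD800 <= cp <= 0xDBFF and i + 1 < len(text)
                and 0xDC00 <= ord(text[i + 1]) <= 0xDFFF):
            out.append(chr(combine_surrogates(cp, ord(text[i + 1]))))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

E.g. the six bytes ED A0 BD ED B8 80 encode the pair D83D/DE00, which
the sketch folds into the single character U+1F600.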