Commit | Line | Data |
---|---|---|
23f87bed | 1 | ;;; utf7.el --- UTF-7 encoding/decoding for Emacs -*-coding: iso-8859-1;-*- |
e84b4b86 | 2 | |
ab422c4d | 3 | ;; Copyright (C) 1999-2013 Free Software Foundation, Inc. |
c113de23 GM | 4 | |
c113de23 GM | 5 | ;; Author: Jon K Hellan <hellan@acm.org> |
23f87bed | 6 | ;; Maintainer: bugs@gnus.org |
c113de23 GM | 7 | ;; Keywords: mail |
c113de23 GM | 8 | |
c113de23 GM | 9 | ;; This file is part of GNU Emacs. |
c113de23 GM | 10 | |
5e809f55 | 11 | ;; GNU Emacs is free software: you can redistribute it and/or modify |
c113de23 | 12 | ;; it under the terms of the GNU General Public License as published by |
5e809f55 GM | 13 | ;; the Free Software Foundation, either version 3 of the License, or |
5e809f55 GM | 14 | ;; (at your option) any later version. |
c113de23 GM | 15 | |
c113de23 GM | 16 | ;; GNU Emacs is distributed in the hope that it will be useful, |
c113de23 GM | 17 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of |
c113de23 GM | 18 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
c113de23 GM | 19 | ;; GNU General Public License for more details. |
c113de23 GM | 20 | |
c113de23 GM | 21 | ;; You should have received a copy of the GNU General Public License |
5e809f55 | 22 | ;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. |
c113de23 GM | 23 | |
c113de23 GM | 24 | ;;; Commentary: |
23f87bed MB | 25 | |
23f87bed MB | 26 | ;; UTF-7 - A Mail-Safe Transformation Format of Unicode - RFC 2152 |
23f87bed MB | 27 | ;; This is a transformation format of Unicode that contains only 7-bit |
23f87bed MB | 28 | ;; ASCII octets and is intended to be readable by humans in the limiting |
23f87bed MB | 29 | ;; case that the document consists of characters from the US-ASCII |
23f87bed MB | 30 | ;; repertoire. |
23f87bed MB | 31 | ;; In short, runs of characters outside US-ASCII are encoded as base64 |
23f87bed MB | 32 | ;; inside delimiters. |
23f87bed MB | 33 | ;; A variation of UTF-7 is specified in IMAP 4rev1 (RFC 2060) as the way |
23f87bed MB | 34 | ;; to represent characters outside US-ASCII in mailbox names in IMAP. |
23f87bed MB | 35 | ;; This library supports both variants, but the IMAP variation was the |
23f87bed MB | 36 | ;; reason I wrote it. |
23f87bed MB | 37 | ;; The routines convert UTF-7 -> UTF-16 (16 bit encoding of Unicode) |
23f87bed MB | 38 | ;; -> current character set, and vice versa. |
23f87bed MB | 39 | ;; However, until Emacs supports Unicode, the only Emacs character set |
23f87bed MB | 40 | ;; supported here is ISO-8859.1, which can trivially be converted to/from |
23f87bed MB | 41 | ;; Unicode. |
23f87bed MB | 42 | ;; When decoding results in a character outside the Emacs character set, |
23f87bed MB | 43 | ;; an error is thrown. It is up to the application to recover. |
23f87bed MB | 44 | |
23f87bed MB | 45 | ;; UTF-7 should be done by providing a coding system. Mule-UCS does |
23f87bed MB | 46 | ;; already, but I don't know if it does the IMAP version and it's not |
23f87bed MB | 47 | ;; clear whether that should really be a coding system. The UTF-16 |
23f87bed MB | 48 | ;; part of the conversion can be done with coding systems available |
23f87bed MB | 49 | ;; with Mule-UCS or some versions of Emacs. Unfortunately these were |
23f87bed MB | 50 | ;; done wrongly (regarding handling of byte-order marks and how the |
23f87bed MB | 51 | ;; variants were named), so we don't have a consistent name for the |
23f87bed MB | 52 | ;; necessary coding system. The code below doesn't seem to DTRT |
23f87bed MB | 53 | ;; generally. E.g.: |
23f87bed MB | 54 | ;; |
55 |