Commit | Line | Data |
---|---|---|
23f87bed | 1 | ;;; utf7.el --- UTF-7 encoding/decoding for Emacs -*-coding: iso-8859-1;-*- |
e84b4b86 | 2 | |
5f3710a2 | 3 | ;; Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, |
5df4f04c | 4 | ;; 2008, 2009, 2010, 2011 Free Software Foundation, Inc. |
c113de23 GM | 5 | |
c113de23 | 6 | ;; Author: Jon K Hellan <hellan@acm.org> |
23f87bed | 7 | ;; Maintainer: bugs@gnus.org |
c113de23 GM | 8 | ;; Keywords: mail |
c113de23 | 9 | |
c113de23 | 10 | ;; This file is part of GNU Emacs. |
c113de23 | 11 | |
5e809f55 | 12 | ;; GNU Emacs is free software: you can redistribute it and/or modify |
c113de23 | 13 | ;; it under the terms of the GNU General Public License as published by |
5e809f55 GM | 14 | ;; the Free Software Foundation, either version 3 of the License, or |
5e809f55 | 15 | ;; (at your option) any later version. |
c113de23 GM | 16 | |
c113de23 | 17 | ;; GNU Emacs is distributed in the hope that it will be useful, |
c113de23 | 18 | ;; but WITHOUT ANY WARRANTY; without even the implied warranty of |
c113de23 | 19 | ;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
c113de23 | 20 | ;; GNU General Public License for more details. |
c113de23 | 21 | |
c113de23 | 22 | ;; You should have received a copy of the GNU General Public License |
5e809f55 | 23 | ;; along with GNU Emacs. If not, see <http://www.gnu.org/licenses/>. |
c113de23 GM | 24 | |
c113de23 | 25 | ;;; Commentary: |
23f87bed MB | 26 | |
23f87bed | 27 | ;; UTF-7 - A Mail-Safe Transformation Format of Unicode - RFC 2152 |
23f87bed | 28 | ;; This is a transformation format of Unicode that contains only 7-bit |
23f87bed | 29 | ;; ASCII octets and is intended to be readable by humans in the limiting |
23f87bed | 30 | ;; case that the document consists of characters from the US-ASCII |
23f87bed | 31 | ;; repertoire. |
23f87bed | 32 | ;; In short, runs of characters outside US-ASCII are encoded as base64 |
23f87bed | 33 | ;; inside delimiters. |
23f87bed | 34 | ;; A variation of UTF-7 is specified in IMAP 4rev1 (RFC 2060) as the way |
23f87bed | 35 | ;; to represent characters outside US-ASCII in mailbox names in IMAP. |
23f87bed | 36 | ;; This library supports both variants, but the IMAP variation was the |
23f87bed | 37 | ;; reason I wrote it. |
23f87bed | 38 | ;; The routines convert UTF-7 -> UTF-16 (16 bit encoding of Unicode) |
23f87bed | 39 | ;; -> current character set, and vice versa. |
23f87bed | 40 | ;; However, until Emacs supports Unicode, the only Emacs character set |
23f87bed | 41 | ;; supported here is ISO-8859.1, which can trivially be converted to/from |
23f87bed | 42 | ;; Unicode. |
23f87bed | 43 | ;; When decoding results in a character outside the Emacs character set, |
23f87bed | 44 | ;; an error is thrown. It is up to the application to recover. |
23f87bed | 45 | |
23f87bed | 46 | ;; UTF-7 should be done by providing a coding system. Mule-UCS does |
23f87bed | 47 | ;; already, but I don't know if it does the IMAP version and it's not |
23f87bed | 48 | ;; clear whether that should really be a coding system. The UTF-16 |
23f87bed | 49 | ;; part of the conversion can be done with coding systems available |
23f87bed | 50 | ;; with Mule-UCS or some versions of Emacs. Unfortunately these were |
23f87bed | 51 | ;; done wrongly (regarding handling of byte-order marks and how the |
23f87bed | 52 | ;; variants were named), so we don't have a consistent name for the |
23f87bed | 53 | ;; necessary coding system. The code below doesn't seem to DTRT |
23f87bed | 54 | ;; generally. E.g.: |
23f87bed | 55 | ;; |
56 |