1 GNU Emacs NEWS -- history of user-visible changes.
2
3 Copyright (C) 2007 Free Software Foundation, Inc.
4 Copyright (C) 2007
5 National Institute of Advanced Industrial Science and Technology (AIST)
6 Registration Number H14PRO021
7 See the end of the file for license conditions.
8
9 Please send Emacs bug reports to bug-gnu-emacs@gnu.org.
10 If possible, use M-x report-emacs-bug.
11
12 This file is about changes in the Emacs "unicode" branch.
13
14 \f
15 * Changes in Emacs Unicode
16
17 ** The Emacs character set is now a superset of Unicode.
18 (It has about four times the code space, which should be plenty).
19
20 The internal encoding used for buffers and strings is now
21 Unicode-based and called `utf-8-emacs'. utf-8-emacs is backwards
22 compatible with the UTF-8 encoding of Unicode. The `emacs-mule'
23 coding system can still read and write data in the old internal
24 encoding.
25
26 Since the internal encoding is also used by default for byte-compiled
27 files -- i.e. the normal coding system for byte-compiled Lisp files is
28 now utf-8-emacs -- Lisp containing non-ASCII characters which is
29 compiled by Emacs 23 can't be read by earlier versions of Emacs. Files
30 compiled by Emacs 20, 21, or 22 are loaded correctly as emacs-mule
31 (whether or not they contain multibyte characters), which makes loading
32 them somewhat slower than Emacs 23-compiled files. Thus it may be worth
33 recompiling existing .elc files which don't need to be shared with older
34 Emacsen.
35
36 ** There are assorted new coding systems/aliases -- see
37 M-x list-coding-systems.
38
39 ** New charset implementation with many new charsets.
40 See M-x list-character-sets. New charsets can be defined conveniently
41 as tables of Unicode code points.
42
43 The dimension of a charset is now 0, 1, 2, or 3, and the size of each
44 dimension is no longer limited to 94 or 96.
45
46 A dynamic charset priority list is used to infer the charset of
47 characters for display.
48
49 ** New minor mode Auto Composition Mode composes characters automatically
50 when they are displayed. This mode is globally on by default.
51
52 ** Emacs now supports local fonts (fonts installed on the same machine
53 that Emacs is running on) via the FreeType and fontconfig libraries. On
54 X, they are driven via the Xft library with antialiasing support.
55 Fontconfig-like font names (e.g. monospace-12) are also accepted.
56
57 ** New language environments Chinese-GBK, Chinese-GB18030, and
58 TaiViet.
59
60 ** The following facilities are obsolete:
61
62 Minor modes: unify-8859-on-encoding-mode, unify-8859-on-decoding-mode
63
64 \f
65 * Lisp changes in Emacs Unicode
66
67 ** Character code, representation, and charset changes.
68
69 The character code space is now 0x0..0x3FFFFF with no gap. Within it,
70 characters with codes 0x0..0x10FFFF are the Unicode characters with the
71 same code points. Characters with codes 0x3FFF80..0x3FFFFF are raw 8-bit
72 bytes.
73
74 Generic characters no longer exist.
75
76 In a multibyte buffer or string, characters are represented by UTF-8
77 byte sequences.
78
79 The concept of a charset has changed. A single character may belong to
80 multiple charsets (e.g. a-grave (U+00E0) belongs to the charsets unicode,
81 iso-8859-1, iso-8859-3, etc.).
82
83 *** The new function `characterp' returns t if and only if the argument
84 is a character.
85
86 *** The new function `max-char' returns the maximum character code
87 (currently it is #x3FFFFF).
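
The layout of the code space can be probed from Lisp. The following is
a minimal sketch, assuming a recent released Emacs (23 or later).
`demo-code-space' is a hypothetical name, and the use of
`unibyte-char-to-multibyte' and `string-bytes' to show the raw-byte
range and the UTF-8 representation is our choice, not something this
file prescribes.

```elisp
;; Sketch: probing the character code space.
(defun demo-code-space ()
  "Return a list of facts about the character code space."
  (list (max-char)                        ; largest character code
        (characterp #x10FFFF)             ; last Unicode code point
        (characterp (1+ (max-char)))      ; just past the code space
        (unibyte-char-to-multibyte #x80)  ; raw 8-bit byte as a character
        (string-bytes (string #xE0))      ; a-grave as UTF-8: 2 bytes
        (string-bytes (string #x20AC))))  ; euro sign as UTF-8: 3 bytes

(demo-code-space)
```

In a recent Emacs the result should be (#x3FFFFF t nil #x3FFF80 2 3).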
88
89 *** The functions `encode-char' and `decode-char' now accept any
90 charset.
91
92 *** The function `define-charset' now accepts a completely different
93 form of arguments (old-style arguments still work).
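
Because a character may belong to several charsets, `encode-char' can
be asked for its code in each of them. A minimal sketch, assuming the
charsets `unicode', `iso-8859-1', `iso-8859-2' and `iso-8859-3' are
defined (as they are in a released Emacs); `demo-char-in-charsets' is
a hypothetical helper.

```elisp
;; Sketch: one character, several charsets.
(defun demo-char-in-charsets (char charsets)
  "Return an alist mapping each of CHARSETS to the code of CHAR in it.
The code is nil when CHAR does not belong to that charset."
  (mapcar (lambda (cs) (cons cs (encode-char char cs))) charsets))

;; a-grave (U+00E0) has the code #xE0 in all three charsets.
(demo-char-in-charsets #xE0 '(unicode iso-8859-1 iso-8859-3))

;; a-ogonek (U+0105) is in iso-8859-2 (code #xB1) but not in iso-8859-1.
(demo-char-in-charsets #x0105 '(iso-8859-1 iso-8859-2))

;; `decode-char' goes the other way: charset and code point to character.
(decode-char 'iso-8859-2 #xB1)
```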
94
95 *** The new function `define-charset-alias' defines an alias of a
96 charset.
97
98 *** The value of the function `char-charset' depends on the current
99 priorities of charsets.
100
101 *** The new function `charset-priority-list' returns the list of
102 charsets ordered by priority.
103
104 *** The new function `set-charset-priority' sets priorities of
105 charsets.
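
The priority functions can be combined to change the order only
temporarily. A minimal sketch, assuming the signatures found in a
released Emacs (`set-charset-priority' takes the charsets as separate
arguments, and `charset-priority-list' accepts an optional argument to
return only the highest one); `demo-with-top-charset' is a
hypothetical helper.

```elisp
;; Sketch: temporarily raise a charset's priority.
(defun demo-with-top-charset (charset fn)
  "Call FN with CHARSET at the head of the charset priority list.
The previous priority order is restored afterwards."
  (let ((saved (charset-priority-list)))
    (unwind-protect
        (progn
          (set-charset-priority charset)
          (funcall fn))
      ;; Passing the whole saved list puts every charset back in place.
      (apply #'set-charset-priority saved))))

;; The highest-priority charset inside the call is the one we asked for.
(demo-with-top-charset 'iso-8859-3
                       (lambda () (charset-priority-list t)))

;; `char-charset' consults the same list, so its value for a-grave may
;; differ inside and outside the call.
(demo-with-top-charset 'iso-8859-3
                       (lambda () (char-charset #xE0)))
```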
106
107 *** The new function `unibyte-charset' returns the current unibyte
108 charset. The unibyte charset determines how unibyte/multibyte
109 conversion is done.
110
111 *** The new function `set-unibyte-charset' sets the unibyte charset.
112
113 *** The new function `unibyte-string' makes a unibyte string from
114 bytes.
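
A unibyte string built this way is a natural input for decoding. A
minimal sketch, assuming a released Emacs; the helper name
`demo-utf-8-bytes-to-string' is hypothetical.

```elisp
;; Sketch: building a unibyte string from raw bytes.
(defun demo-utf-8-bytes-to-string (&rest bytes)
  "Decode BYTES, a sequence of UTF-8 encoded bytes, into a string."
  (decode-coding-string (apply #'unibyte-string bytes) 'utf-8))

(multibyte-string-p (unibyte-string #xC3 #xA0)) ; nil: two raw bytes
(demo-utf-8-bytes-to-string #xC3 #xA0)          ; one character, U+00E0
```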
115
116 ** Code conversion changes
117
118 *** The new function `define-coding-system' should be used to define a
119 coding system instead of `make-coding-system' (which is obsolete now).
120
121 *** The functions `encode-coding-region' and `decode-coding-region'
122 have an optional 4th argument to specify where the result of
123 conversion should go.
124
125 *** The functions `encode-coding-string' and `decode-coding-string'
126 have an optional 4th argument specifying a buffer to store the result
127 of conversion.
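
The destination arguments described above can be sketched as follows,
assuming the behavior of a released Emacs: a destination of t makes
the region functions return the result as a string, and a buffer
passed to the string functions receives the converted text while the
function returns its length. Both helper names are hypothetical.

```elisp
;; Sketch: sending conversion results somewhere other than the default.
(defun demo-encode-region-to-string (string coding-system)
  "Encode STRING in a scratch buffer and return the bytes as a string.
A DESTINATION of t makes `encode-coding-region' return the result
instead of replacing the region."
  (with-temp-buffer
    (insert string)
    (encode-coding-region (point-min) (point-max) coding-system t)))

(defun demo-decode-string-into-buffer (bytes coding-system)
  "Decode BYTES into a scratch buffer; return (LENGTH . TEXT).
With a buffer as 4th argument, `decode-coding-string' inserts the
decoded text there and returns its length."
  (with-temp-buffer
    (let ((len (decode-coding-string bytes coding-system nil
                                     (current-buffer))))
      (cons len (buffer-string)))))

(demo-encode-region-to-string (string #xE0) 'utf-8)
(demo-decode-string-into-buffer (unibyte-string #xC3 #xA0) 'utf-8)
```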
128
129 *** The new macro `with-coding-priority' executes its body with
130 the specified coding system priority order.
131
132 *** The new function `check-coding-systems-region' checks if the text
133 in the region is encodable by the specified coding systems.
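
A minimal sketch of such a check, assuming the released-Emacs behavior
that a string may be given in place of the region start and that the
value is an alist keyed by the coding systems that failed;
`demo-unencodable-by' is a hypothetical helper.

```elisp
;; Sketch: find which coding systems cannot encode a piece of text.
(defun demo-unencodable-by (text coding-systems)
  "Return those of CODING-SYSTEMS that cannot encode all of TEXT."
  ;; A string may be passed in place of START; END is then ignored.
  (mapcar #'car (check-coding-systems-region text nil coding-systems)))

;; The euro sign (U+20AC) is not in Latin-1 but is encodable by UTF-8.
(demo-unencodable-by (string ?a #xE0 #x20AC) '(iso-latin-1 utf-8))
```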
134
135 *** The new function `coding-system-aliases' returns a list of aliases
136 of a coding system.
137
138 *** The new function `coding-system-charset-list' returns a list of
139 charsets supported by a coding system.
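
These two functions make it easy to inspect a coding system. A
minimal sketch, assuming (as in a released Emacs) that the alias list
starts with the base name of the coding system;
`demo-same-coding-system-p' is a hypothetical helper.

```elisp
;; Sketch: inspecting a coding system.
(coding-system-aliases 'latin-1)          ; base name first, then aliases
(coding-system-charset-list 'iso-latin-1) ; charsets it supports

(defun demo-same-coding-system-p (a b)
  "Return non-nil if A and B name the same coding system.
Two names are treated as the same when they share a base name, which
is the first element returned by `coding-system-aliases'."
  (eq (car (coding-system-aliases a))
      (car (coding-system-aliases b))))

(demo-same-coding-system-p 'latin-1 'iso-8859-1)
(demo-same-coding-system-p 'latin-1 'utf-8)
```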
140
141 *** The new function `coding-system-priority-list' returns a list of
142 coding systems ordered by their priorities.
143
144 *** The new function `set-coding-system-priority' sets priorities of
145 coding systems.
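
The coding system priority facilities above, including
`with-coding-priority', can be sketched together. This assumes the
released-Emacs forms: `with-coding-priority' takes an expression
yielding a list of coding systems, and `coding-system-priority-list'
accepts an optional argument to return only the highest one.
`demo-top-coding-system-with' is a hypothetical helper.

```elisp
;; Sketch: coding system priorities.
(defun demo-top-coding-system-with (coding-system)
  "Return the top-priority coding system while CODING-SYSTEM is preferred.
`with-coding-priority' restores the previous order when the body exits."
  (with-coding-priority (list coding-system)
    (coding-system-priority-list t)))

(demo-top-coding-system-with 'japanese-shift-jis)
```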
146
147 ** Composition changes
148
149 *** New functions and variables `auto-composition-mode' and
150 `global-auto-composition-mode' toggle the new minor mode Auto
151 Composition Mode locally and globally.
152
153 *** New variable `auto-composition-function' is a function used in
154 Auto Composition Mode to compose characters. The default value is the
155 function `auto-compose-chars'.
156
157 *** New variable `auto-compose-current-font' is set to the current
158 font-object while characters are being composed in Auto Composition
159 Mode.
160
161 ** Font Backend changes.
162
163 *** New frame parameter `font-backend' specifies a list of
164 font-backends supported by the frame's graphic device. On X, they are
165 currently `x' and `xft'.
166
167 *** New function `fontp' checks if the argument is a font-spec
168 or font-entity.
169
170 *** New function `font-spec' creates a new font-spec object.
171
172 *** New function `font-get' returns a font property value.
173
174 *** New function `font-put' sets a font property value.
175
176 *** New function `list-fonts' returns a list of font-entities matching
177 the given specification.
178
179 *** New function `list-families' returns a list of family names of
180 available fonts.
181
182 *** New function `find-font' returns a font-entity best matching
183 the given specification.
184
185 *** New function `font-xlfd-name' returns an XLFD name of a given font
186 (font-spec, font-entity, or font-object).
187
188 *** New function `clear-font-cache' clears all font caches.
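
A font-spec can be built and inspected without a display. The
following is a minimal sketch, assuming an Emacs built with
window-system support (the font functions are not defined otherwise,
hence the guard). The family name "Monospace" is only an example and
`demo-font-spec-size' is a hypothetical helper. The functions that
return font-entities need a graphical frame and are not shown.

```elisp
;; Sketch: building and inspecting a font-spec.
(defun demo-font-spec-size ()
  "Return the :size of a freshly built font-spec, or nil if unsupported."
  (when (fboundp 'font-spec)
    (let ((spec (font-spec :family "Monospace" :size 12)))
      (font-put spec :weight 'bold)   ; add a property after creation
      (and (fontp spec)               ; a font-spec satisfies `fontp'
           (font-get spec :size)))))

(demo-font-spec-size)
```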
189
190 ** The function `get-char-code-property' now accepts many basic Unicode
191 character properties. They are `name', `general-category',
192 `canonical-combining-class', `bidi-class', `decomposition',
193 `decimal-digit-value', `digit-value', `numeric-value', `mirrored',
194 `old-name', `iso-10646-comment', `uppercase', `lowercase', and
195 `titlecase'.
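
A minimal sketch of looking up several of these properties at once,
assuming the Unicode property tables shipped with a released Emacs are
installed; `demo-char-properties' is a hypothetical helper.

```elisp
;; Sketch: looking up Unicode character properties.
(defun demo-char-properties (char props)
  "Return an alist of PROPS and their values for CHAR."
  (mapcar (lambda (prop) (cons prop (get-char-code-property char prop)))
          props))

(demo-char-properties ?A '(name general-category lowercase))
(demo-char-properties ?7 '(general-category decimal-digit-value))
```

For ?A this should give the name "LATIN CAPITAL LETTER A", the general
category Lu, and the lowercase character ?a.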
196
197 ** The new function `define-char-code-property' defines a character
198 code property.
199
200 ** The new function `char-code-property-description' returns the
201 description string of a character code property.
202
203 *** The new variable `find-word-boundary-function-table' is a
204 char-table of functions to search for a word boundary.
205
206 *** The new variable `char-script-table' is a char-table of script
207 names.
208
209 *** The new variable `char-width-table' is a char-table of character
210 widths.
211
212 *** The new variable `print-charset-text-property' controls how the
213 `charset' text property is handled when printing a string.
214
215 *** The new variable `printable-chars' is a char-table defining whether
216 a character is printable or not.
217
218 *** The new function `robin-define-package' defines a Robin package
219 which is an input method system different from Quail.
220
221 *** The new function `robin-modify-package' modifies an existing Robin
222 package.
223
224 *** The new function `robin-use-package' starts using a Robin package
225 as an input method.
226
227 ** The functions `modify-syntax-entry' and `modify-category-entry' now
228 accept a cons of characters as the first argument, and modify all
229 entries in that range of characters.
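
A minimal sketch of the range form, using a scratch syntax table so
that no real mode is affected; `demo-digits-as-symbol-syntax' is a
hypothetical helper.

```elisp
;; Sketch: changing the syntax of a whole range of characters at once.
(defun demo-digits-as-symbol-syntax ()
  "Return the syntax class of ?5 in a table where 0-9 are symbol chars."
  (let ((table (make-syntax-table)))
    ;; A cons (FROM . TO) as first argument covers every character in
    ;; the range, here the ten ASCII digits.
    (modify-syntax-entry '(?0 . ?9) "_" table)
    (with-syntax-table table
      (char-syntax ?5))))

(demo-digits-as-symbol-syntax)
```

The result is ?_, the symbol-constituent class, where the standard
table would give ?w.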
230
231 ** The function `set-fontset-font' now accepts a script name as the
232 second argument, and has an optional 5th argument controlling how the
233 font is set.
234
235 ** The functions `char-bytes', `chars-in-region', `set-coding-priority',
236 `make-coding-system', and `char-valid-p' are now obsolete.
237
238 \f
239 * Incompatible Lisp changes
240
241 ** The behavior of `map-char-table' has changed. It may call the
242 specified function with a cons (FROM . TO) as a key if all characters
243 in that range have the same value.
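
Callbacks written for the old behavior need to handle both kinds of
key. A minimal sketch of one that does; the char-table subtype
`demo-table' and the helper `demo-count-chars-with-value' are
hypothetical names used only for illustration.

```elisp
;; Sketch: a `map-char-table' callback that copes with range keys.
(defun demo-count-chars-with-value (table value)
  "Return how many characters in char-table TABLE map to VALUE."
  (let ((count 0))
    (map-char-table
     (lambda (key val)
       (when (eq val value)
         (setq count
               (+ count
                  (if (consp key)
                      ;; KEY is (FROM . TO): a run of equal values.
                      (1+ (- (cdr key) (car key)))
                    ;; KEY is a single character.
                    1)))))
     table)
    count))

(let ((table (make-char-table 'demo-table)))
  (set-char-table-range table '(?a . ?z) 'lower)
  (demo-count-chars-with-value table 'lower))
```

Counting through the cons key gives 26 here regardless of how the
range is split into calls.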
244
245 ** The value of the function `charset-id' is now always 0.
246
247 ** The functions `register-char-codings' and `coding-system-spec' are
248 deleted.
249
250 \f
251 ----------------------------------------------------------------------
252 This file is part of GNU Emacs.
253
254 GNU Emacs is free software; you can redistribute it and/or modify
255 it under the terms of the GNU General Public License as published by
256 the Free Software Foundation; either version 2, or (at your option)
257 any later version.
258
259 GNU Emacs is distributed in the hope that it will be useful,
260 but WITHOUT ANY WARRANTY; without even the implied warranty of
261 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
262 GNU General Public License for more details.
263
264 You should have received a copy of the GNU General Public License
265 along with GNU Emacs; see the file COPYING. If not, write to the
266 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
267 Boston, MA 02110-1301, USA.
268
269 \f
270 Local variables:
271 mode: outline
272 paragraph-separate: "[ \f]*$"
273 end:
274
275 arch-tag: e21801b9-0724-4cda-8c07-7d60bf3db3fd