Things useful to do for GNU Emacs:

* Primitive for random access insertion of part of a file.

* Making I/O streams for files, so that read and prin1 can
 be used on files directly.  The I/O stream itself would
 serve as a function to read or write one character.

* If a file you can't write is in a directory you can write,
 make sure that modifying and saving the file works.

* Make dired's commands handle correctly the case where
 ls has listed several subdirectories' contents.
 It needs to be able to tell which directory each file
 is really in, by searching backward for the line
 which identifies the start of a directory.

* Add more dired commands, such as sorting (use the
 sort utility through call-process-region).

* Make display.c record inverse-video-ness on
 a character by character basis.  Then make non-full-screen-width
 mode lines inverse video, and display the marked location in
 inverse video.

* Write VMS code to list a file directory, so that dired works on VMS.

Long range:

   Ideas for extending GNU Emacs to deal with arbitrary character sets.

I would like GNU Emacs to be extended to handle all the world's alphabets
and word signs.  I don't expect to have time to do such a thing in the next
few years, so here are my ideas on the best way to do it.

* Each graphic is represented by a sequence of ordinary 8-bit characters.

* All the characters that make up such a sequence have codes >= 0200.

* The first character of such a sequence is between 0200 and 0237.

* The remaining characters of such a sequence are all 0240 or higher.

* The first character of the sequence determines the number of characters
in the sequence.  Thus, 0200...0207 could start two-character sequences,
0210...0227 could start three-character sequences, and 0230 could start
four-character sequences.  (Codes 0231...0237 would be reserved.)

* Several common alphabets, and some mathematical symbols, would get
two-character sequences.  (Probably Greek, Russian, Hebrew(?), Arabic(?),
Korean, and Japanese kana.)  The remaining alphabets, and some versions of
Chinese, would get three-character sequences.  Other sets of Chinese
characters would get four-character sequences.
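
The lead-byte ranges above can be sketched in C as follows.  This is only
an illustration of the proposed scheme, not existing Emacs code; the
function name graphic_seq_length and the convention of returning 0 for a
byte that cannot start a sequence are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical helper: given the first byte of a sequence, return the
   total number of bytes in that sequence under the proposed scheme.
   Ordinary characters (below 0200) stand alone, so the length is 1.
   Returns 0 for a byte that cannot start a sequence: a trailing byte
   (0240 or higher) or a reserved lead byte (0231...0237).  */
int
graphic_seq_length (unsigned char c)
{
  if (c < 0200)
    return 1;                   /* ordinary single character */
  if (c <= 0207)
    return 2;                   /* 0200...0207: two-character sequence */
  if (c <= 0227)
    return 3;                   /* 0210...0227: three-character sequence */
  if (c == 0230)
    return 4;                   /* 0230: four-character sequence */
  return 0;                     /* reserved lead byte or trailing byte */
}
```

Since trailing characters range over 0240...0377 (96 values), these example
ranges would give 8 * 96 = 768 two-character codes, 16 * 96 * 96 = 147456
three-character codes, and 96 * 96 * 96 = 884736 four-character codes.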

Each country that uses Chinese characters has its own standard character
set, and it is not easy to correlate them to avoid overlap.  So there may
need to be several sets of Chinese characters.  That is why they need so
much code space.

True support for Hebrew and Arabic requires dealing with the problem of
writing direction for mixed text; I don't know what to do for that.

* The functions that use the syntax table would determine the
syntax of a sequence from its first character.

* Functions in indent.c for computing widths and columns would
determine the width of a sequence from its first character.
So would display routines.

* Only a few other editing routines would need any change.  In
particular, searching and regexp matching might not need any change.

* Most of the work required would be in redisplay.  The only case that
needs to be supported is with X windows, since ordinary terminals
can't display all these characters anyway.

* There might need to be code to translate files from this format
to whatever format is typically stored on disk.
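
As a rough sketch of how a routine might step through text one graphic at
a time using only the first character of each sequence, consider the C
code below.  It is an illustration under the example ranges given earlier
(0200...0207, 0210...0227, 0230), not existing Emacs code; the names
seq_len and count_graphics, and the use of -1 to report malformed text,
are assumptions.

```c
#include <stddef.h>

/* Hypothetical helper: length of the sequence whose first byte is C,
   or 0 if C cannot start a sequence.  */
static int
seq_len (unsigned char c)
{
  if (c < 0200) return 1;       /* ordinary single character */
  if (c <= 0207) return 2;
  if (c <= 0227) return 3;
  if (c == 0230) return 4;
  return 0;                     /* reserved lead byte or trailing byte */
}

/* Count the graphics in BUF[0..LEN), treating each ordinary character
   and each multi-character sequence as one graphic.  Returns -1 if the
   text is malformed: a sequence is cut short, a trailing byte is below
   0240, or a byte appears where a lead byte is required.  */
long
count_graphics (const unsigned char *buf, size_t len)
{
  long count = 0;
  size_t i = 0;

  while (i < len)
    {
      int n = seq_len (buf[i]);
      int k;

      if (n == 0 || (size_t) n > len - i)
        return -1;
      for (k = 1; k < n; k++)
        if (buf[i + k] < 0240)
          return -1;
      i += n;
      count++;
    }
  return count;
}
```

Because lead characters (0200...0237) and trailing characters (0240 and
up) occupy disjoint ranges, a search string made of whole sequences can
match only where a sequence begins, which is one reason plain
byte-by-byte searching might keep working unchanged.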


I would be very unhappy with half-measures, such as support for
Japanese only.