In that case, a sequence of one-byte codes has a slightly different
form.
- At first, all characters in eight-bit-control are represented by
+ Firstly, all characters in eight-bit-control are represented by
one-byte sequences which are their 8-bit code.
Next, character composition data are represented by the byte
METHOD is 0xF0 plus one of composition method (enum
composition_method),
- BYTES is 0x20 plus a byte length of this composition data,
+ BYTES is 0xA0 plus the byte length of these composition data,
- CHARS is 0x20 plus a number of characters composed by this
+ CHARS is 0xA0 plus the number of characters composed by these
data,
COMPONENTs are characters of multibyte form or composition
if (from < GPT && to >= GPT)
move_gap_both (to, to_byte);
+ /* If an anchor byte `\0' follows the region, we include it in
+ the detecting source. Then code detectors can handle the trailing
+ byte sequence more accurately.
+
+ Fix me: This is not a perfect solution. It is better that we
+ add one more argument, say LAST_BLOCK, to all detect_coding_XXX.
+ */
if (to == Z || (to == GPT && GAP_SIZE > 0))
include_anchor_byte = 1;
return detect_coding_system (BYTE_POS_ADDR (from_byte),
- /* "+ include_anchor_byteq" is to
- include the anchor byte `\0'. With
- this, code detectors can check if
- tailing bytes are valid. */
to_byte - from_byte + include_anchor_byte,
!NILP (highest),
!NILP (current_buffer
return detect_coding_system (XSTRING (string)->data,
/* "+ 1" is to include the anchor byte
`\0'. With this, code detectors can
- check if tailing bytes are
- valid. */
+ handle the trailing bytes more
+ accurately. */
STRING_BYTES (XSTRING (string)) + 1,
!NILP (highest),
STRING_MULTIBYTE (string));