From 083f9d74d12c68751a1c529bb0c073f489be1c63 Mon Sep 17 00:00:00 2001
From: Kevin Ryde
Date: Wed, 5 Oct 2005 01:24:12 +0000
Subject: [PATCH] (Regexp Functions): Notes on zero bytes and locale character set.

---
 doc/ref/api-data.texi | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/doc/ref/api-data.texi b/doc/ref/api-data.texi
index 4b739ff28..042af4521 100755
--- a/doc/ref/api-data.texi
+++ b/doc/ref/api-data.texi
@@ -3705,6 +3705,16 @@ This regular expression interface was modeled after that
 implemented by SCSH, the Scheme Shell. It is intended to be
 upwardly compatible with SCSH regular expressions.
 
+Zero bytes (@code{#\nul}) cannot be used in regex patterns or input
+strings, since the underlying C functions treat that as the end of
+string. If there's a zero byte an error is thrown.
+
+Patterns and input strings are treated as being in the locale
+character set if @code{setlocale} has been called (@pxref{Locales}),
+and in a multibyte locale this includes treating multi-byte sequences
+as a single character. (Guile strings are currently merely bytes,
+though this may change in the future, @xref{Conversion to/from C}.)
+
 @deffn {Scheme Procedure} string-match pattern str [start]
 Compile the string @var{pattern} into a regular expression and compare
 it with @var{str}. The optional numeric argument @var{start} specifies
-- 
2.20.1
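
Reviewer note: the zero-byte restriction documented in this patch follows from the POSIX regex interface, which takes NUL-terminated C strings. The sketch below is only an illustration of that behaviour, not Guile's actual implementation. The helper names posix_match and has_embedded_nul are hypothetical, and it assumes a POSIX system that provides <regex.h>.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `pattern` matches somewhere in `input`,
   0 if it does not, and -1 if the pattern fails to compile.
   Both arguments are plain C strings, so regcomp() and regexec() stop
   reading at the first zero byte. */
static int posix_match(const char *pattern, const char *input)
{
    regex_t re;
    int rc;

    assert(pattern != NULL && input != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, input, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* Hypothetical helper: returns 1 if the first `len` bytes of `buf` contain
   a zero byte. A caller holding length-counted strings could run a check
   like this and signal an error instead of passing a silently truncated
   string to the C functions. */
static int has_embedded_nul(const char *buf, size_t len)
{
    return memchr(buf, '\0', len) != NULL;
}
```

For instance, posix_match("ab\0cd", "xxabxx") reports a match, because regcomp() only ever sees the pattern "ab"; the same call with "abcd" does not match. Rejecting such strings up front, as the new text describes, avoids that silent truncation.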