2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 1996, 1997, 2000, 2001, 2002, 2003, 2004, 2006, 2007
4 @c Free Software Foundation, Inc.
5 @c See the file guile.texi for copying conditions.
8 @node Internationalization
9 @section Support for Internationalization
11 @cindex internationalization
14 Guile provides internationalization@footnote{For concision and style,
15 programmers often like to refer to internationalization as ``i18n''.}
16 support for Scheme programs in two ways. First, procedures to
17 manipulate text and data in a way that conforms to particular cultural
18 conventions (i.e., in a ``locale-dependent'' way) are provided in the
@code{(ice-9 i18n)} module. Second, Guile allows the use of GNU
20 @code{gettext} to translate program message strings.
23 * i18n Introduction:: Introduction to Guile's i18n support.
24 * Text Collation:: Sorting strings and characters.
25 * Character Case Mapping:: Case mapping.
26 * Number Input and Output:: Parsing and printing numbers.
27 * Accessing Locale Information:: Detailed locale information.
28 * Gettext Support:: Translating message strings.
32 @node i18n Introduction, Text Collation, Internationalization, Internationalization
33 @subsection Internationalization with Guile
In order to make use of the functions described hereafter, the
36 @code{(ice-9 i18n)} module must be imported in the usual way:
39 (use-modules (ice-9 i18n))
42 @cindex libguile-i18n-v-@value{LIBGUILE_I18N_MAJOR}
44 C programs can use the C functions corresponding to the procedures of
45 this module by including @code{<libguile/i18n.h>} and by linking
46 against @code{libguile-i18n-v-@value{LIBGUILE_I18N_MAJOR}}.
48 @cindex cultural conventions
50 The @code{(ice-9 i18n)} module provides procedures to manipulate text
51 and other data in a way that conforms to the cultural conventions
52 chosen by the user. Each region of the world or language has its own
53 customs to, for instance, represent real numbers, classify characters,
54 collate text, etc. All these aspects comprise the so-called
55 ``cultural conventions'' of that region or language.
58 @cindex locale category
60 Computer systems typically refer to a set of cultural conventions as a
@dfn{locale}. For each particular aspect that comprises those cultural
62 conventions, a @dfn{locale category} is defined. For instance, the
63 way characters are classified is defined by the @code{LC_CTYPE}
64 category, while the language in which program messages are issued to
65 the user is defined by the @code{LC_MESSAGES} category
66 (@pxref{Locales, General Locale Information} for details).
70 The procedures provided by this module allow the development of
71 programs that adapt automatically to any locale setting. As we will
72 see later, many of these procedures can optionally take a @dfn{locale
73 object} argument. This additional argument defines the locale
74 settings that must be followed by the invoked procedure. When it is
75 omitted, then the current locale settings of the process are followed
76 (@pxref{Locales, @code{setlocale}}).
The following procedures allow the manipulation of such locale
objects.
81 @deffn {Scheme Procedure} make-locale category-list locale-name [base-locale]
82 @deffnx {C Function} scm_make_locale (category_list, locale_name, base_locale)
83 Return a reference to a data structure representing a set of locale
84 datasets. @var{locale-name} should be a string denoting a particular
85 locale (e.g., @code{"aa_DJ"}) and @var{category-list} should be either
86 a list of locale categories or a single category as used with
87 @code{setlocale} (@pxref{Locales, @code{setlocale}}). Optionally, if
88 @code{base-locale} is passed, it should be a locale object denoting
89 settings for categories not listed in @var{category-list}.
91 The following invocation creates a locale object that combines the use
92 of Swedish for messages and character classification with the
93 default settings for the other categories (i.e., the settings of the
94 default @code{C} locale which usually represents conventions in use in
(make-locale (list LC_MESSAGES LC_CTYPE) "sv_SE")
101 The following example combines the use of Esperanto messages and
102 conventions with monetary conventions from Croatia:
105 (make-locale LC_MONETARY "hr_HR"
106 (make-locale LC_ALL "eo_EO"))
109 A @code{system-error} exception (@pxref{Handling Errors}) is raised by
110 @code{make-locale} when @var{locale-name} does not match any of the
111 locales compiled on the system. Note that on non-GNU systems, this
112 error may be raised later, when the locale object is actually used.
116 @deffn {Scheme Procedure} locale? obj
117 @deffnx {C Function} scm_locale_p (obj)
118 Return true if @var{obj} is a locale object.
121 @defvr {Scheme Variable} %global-locale
122 @defvrx {C Variable} scm_global_locale
123 This variable is bound to a locale object denoting the current process
124 locale as installed using @code{setlocale ()} (@pxref{Locales}). It
125 may be used like any other locale object, including as a third
126 argument to @code{make-locale}, for instance.
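
As a minimal sketch of how these procedures fit together, the
following code builds a locale object and falls back to
@code{%global-locale} when the requested locale is not installed. The
helper name @code{make-locale-or-default} is hypothetical and is not
provided by @code{(ice-9 i18n)}; the sketch also assumes that the
@code{"C"} locale is available.

```scheme
(use-modules (ice-9 i18n))

;; Hypothetical helper (not part of (ice-9 i18n)): try to build a
;; locale object for NAME, and fall back to %global-locale when
;; make-locale raises a system-error.
(define (make-locale-or-default categories name)
  (catch 'system-error
    (lambda () (make-locale categories name))
    (lambda args %global-locale)))

;; Assumption: the "C" locale exists on the system.
(define c-locale (make-locale-or-default LC_ALL "C"))

(locale? c-locale)   ; true: a locale object
(locale? "C")        ; false: a plain string is not a locale object
```

Note that, as stated above, on non-GNU systems the error may only be
raised when the locale object is used, in which case this helper does
not catch it.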
130 @node Text Collation, Character Case Mapping, i18n Introduction, Internationalization
131 @subsection Text Collation
133 The following procedures provide support for text collation, i.e.,
134 locale-dependent string and character sorting.
136 @deffn {Scheme Procedure} string-locale<? s1 s2 [locale]
137 @deffnx {C Function} scm_string_locale_lt (s1, s2, locale)
138 @deffnx {Scheme Procedure} string-locale>? s1 s2 [locale]
139 @deffnx {C Function} scm_string_locale_gt (s1, s2, locale)
140 @deffnx {Scheme Procedure} string-locale-ci<? s1 s2 [locale]
141 @deffnx {C Function} scm_string_locale_ci_lt (s1, s2, locale)
142 @deffnx {Scheme Procedure} string-locale-ci>? s1 s2 [locale]
143 @deffnx {C Function} scm_string_locale_ci_gt (s1, s2, locale)
144 Compare strings @var{s1} and @var{s2} in a locale-dependent way. If
@var{locale} is provided, it should be a locale object (as returned by
146 @code{make-locale}) and will be used to perform the comparison;
147 otherwise, the current system locale is used. For the @code{-ci}
148 variants, the comparison is made in a case-insensitive way.
151 @deffn {Scheme Procedure} string-locale-ci=? s1 s2 [locale]
152 @deffnx {C Function} scm_string_locale_ci_eq (s1, s2, locale)
Compare strings @var{s1} and @var{s2} in a case-insensitive and
154 locale-dependent way. If @var{locale} is provided, it should be
155 a locale object (as returned by @code{make-locale}) and will be used to
156 perform the comparison; otherwise, the current system locale is used.
159 @deffn {Scheme Procedure} char-locale<? c1 c2 [locale]
160 @deffnx {C Function} scm_char_locale_lt (c1, c2, locale)
161 @deffnx {Scheme Procedure} char-locale>? c1 c2 [locale]
162 @deffnx {C Function} scm_char_locale_gt (c1, c2, locale)
163 @deffnx {Scheme Procedure} char-locale-ci<? c1 c2 [locale]
164 @deffnx {C Function} scm_char_locale_ci_lt (c1, c2, locale)
165 @deffnx {Scheme Procedure} char-locale-ci>? c1 c2 [locale]
166 @deffnx {C Function} scm_char_locale_ci_gt (c1, c2, locale)
167 Compare characters @var{c1} and @var{c2} according to either
168 @var{locale} (a locale object as returned by @code{make-locale}) or
169 the current locale. For the @code{-ci} variants, the comparison is
170 made in a case-insensitive way.
173 @deffn {Scheme Procedure} char-locale-ci=? c1 c2 [locale]
174 @deffnx {C Function} scm_char_locale_ci_eq (c1, c2, locale)
Return true if character @var{c1} is equal to @var{c2}, in a
case-insensitive way according to @var{locale} or to the current locale.
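
For illustration, the comparison procedures can serve as the ordering
predicate of @code{sort}. The sketch below assumes that the
@code{"C"} locale is available; the helper name @code{sort-strings} is
hypothetical.

```scheme
(use-modules (ice-9 i18n))

;; Assumption: the "C" locale is available on the system.
(define c-locale (make-locale LC_COLLATE "C"))

;; Hypothetical helper: sort a list of strings according to the
;; collation rules of LOCALE.
(define (sort-strings strings locale)
  (sort strings
        (lambda (a b) (string-locale<? a b locale))))

(sort-strings (list "pear" "apple" "fig") c-locale)
;; => ("apple" "fig" "pear")
```

With a locale object for another language, the same helper would
order accented or non-Latin strings according to that language's
rules.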
179 @node Character Case Mapping, Number Input and Output, Text Collation, Internationalization
180 @subsection Character Case Mapping
182 The procedures below provide support for ``character case mapping'',
183 i.e., to convert characters or strings to their upper-case or
184 lower-case equivalent. Note that SRFI-13 provides procedures that
185 look similar (@pxref{Alphabetic Case Mapping}). However, the SRFI-13
186 procedures are locale-independent. Therefore, they do not take into
187 account specificities of the customs in use in a particular language
188 or region of the world. For instance, while most languages using the
189 Latin alphabet map lower-case letter ``i'' to upper-case letter ``I'',
190 Turkish maps lower-case ``i'' to ``Latin capital letter I with dot
191 above''. The following procedures allow programmers to provide
192 idiomatic character mapping.
194 @deffn {Scheme Procedure} char-locale-downcase chr [locale]
@deffnx {C Function} scm_char_locale_downcase (chr, locale)
196 Return the lowercase character that corresponds to @var{chr} according
197 to either @var{locale} or the current locale.
200 @deffn {Scheme Procedure} char-locale-upcase chr [locale]
@deffnx {C Function} scm_char_locale_upcase (chr, locale)
202 Return the uppercase character that corresponds to @var{chr} according
203 to either @var{locale} or the current locale.
206 @deffn {Scheme Procedure} string-locale-upcase str [locale]
207 @deffnx {C Function} scm_string_locale_upcase (str, locale)
208 Return a new string that is the uppercase version of @var{str}
209 according to either @var{locale} or the current locale.
212 @deffn {Scheme Procedure} string-locale-downcase str [locale]
213 @deffnx {C Function} scm_string_locale_downcase (str, locale)
Return a new string that is the lowercase version of @var{str}
215 according to either @var{locale} or the current locale.
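
The following sketch shows the case mapping procedures with an
explicit locale object. It assumes that the @code{"C"} locale is
available. The helper @code{upcase-for-locale} is hypothetical; it
falls back to the current locale when the named locale is not
installed, and its result for a given language depends on the system.

```scheme
(use-modules (ice-9 i18n))

;; Assumption: the "C" locale is available on the system.
(define c-locale (make-locale LC_CTYPE "C"))

(string-locale-upcase "guile" c-locale)   ; => "GUILE"
(char-locale-downcase #\G c-locale)       ; => #\g

;; Hypothetical helper: upcase STR according to the locale named
;; LOCALE-NAME, or according to the current locale if that locale
;; cannot be created.
(define (upcase-for-locale str locale-name)
  (catch 'system-error
    (lambda ()
      (string-locale-upcase str (make-locale LC_CTYPE locale-name)))
    (lambda args
      (string-locale-upcase str))))
```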
218 Note that in the current implementation Guile has no notion of
219 multibyte characters and in a multibyte locale characters may not be
222 @node Number Input and Output, Accessing Locale Information, Character Case Mapping, Internationalization
223 @subsection Number Input and Output
225 The following procedures allow programs to read and write numbers
226 written according to a particular locale. As an example, in English,
227 ``ten thousand and a half'' is usually written @code{10,000.5} while
228 in French it is written @code{10 000,5}. These procedures allow such
229 differences to be taken into account.
232 @deffn {Scheme Procedure} locale-string->integer str [base [locale]]
233 @deffnx {C Function} scm_locale_string_to_integer (str, base, locale)
234 Convert string @var{str} into an integer according to either
235 @var{locale} (a locale object as returned by @code{make-locale}) or
236 the current process locale. If @var{base} is specified, then it
determines the base of the integer being read (e.g., @code{16} for a
hexadecimal number, @code{10} for a decimal number); by default,
239 decimal numbers are read. Return two values (@pxref{Multiple
240 Values}): an integer (on success) or @code{#f}, and the number of
241 characters read from @var{str} (@code{0} on failure).
243 This function is based on the C library's @code{strtol} function
(@pxref{Parsing of Integers, @code{strtol},, libc, The GNU C Library
Reference Manual}).
249 @deffn {Scheme Procedure} locale-string->inexact str [locale]
250 @deffnx {C Function} scm_locale_string_to_inexact (str, locale)
251 Convert string @var{str} into an inexact number according to either
252 @var{locale} (a locale object as returned by @code{make-locale}) or
253 the current process locale. Return two values (@pxref{Multiple
254 Values}): an inexact number (on success) or @code{#f}, and the number
255 of characters read from @var{str} (@code{0} on failure).
257 This function is based on the C library's @code{strtod} function
(@pxref{Parsing of Floats, @code{strtod},, libc, The GNU C Library
Reference Manual}).
262 @deffn {Scheme Procedure} number->locale-string number [fraction-digits [locale]]
263 Convert @var{number} (an inexact) into a string according to the
264 cultural conventions of either @var{locale} (a locale object) or the
265 current locale. Optionally, @var{fraction-digits} may be bound to an
266 integer specifying the number of fractional digits to be displayed.
269 @deffn {Scheme Procedure} monetary-amount->locale-string amount intl? [locale]
270 Convert @var{amount} (an inexact denoting a monetary amount) into a
271 string according to the cultural conventions of either @var{locale} (a
272 locale object) or the current locale. If @var{intl?} is true, then
273 the international monetary format for the given locale is used
274 (@pxref{Currency Symbol, international and locale monetary formats,,
275 libc, The GNU C Library Reference Manual}).
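
Since the parsing procedures return two values, callers typically
bind them with @code{receive} or @code{call-with-values}. The
following sketch assumes that the @code{"C"} locale is available; the
helper name @code{parse-leading-integer} is hypothetical.

```scheme
(use-modules (ice-9 i18n) (ice-9 receive))

;; Assumption: the "C" locale is available on the system.
(define c-locale (make-locale LC_ALL "C"))

;; Hypothetical helper: read an integer in base BASE from the start
;; of STR.  Return a pair of the integer and the unread remainder of
;; STR, or #f if no integer could be read.
(define (parse-leading-integer str base locale)
  (receive (value count)
      (locale-string->integer str base locale)
    (and value
         (cons value (substring str count)))))

(parse-leading-integer "ff rest" 16 c-locale)   ; => (255 . " rest")
(parse-leading-integer "none" 10 c-locale)      ; => #f
```

The second returned value makes it possible to continue parsing
after the number, for instance to read a unit that follows it.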
279 @node Accessing Locale Information, Gettext Support, Number Input and Output, Internationalization
280 @subsection Accessing Locale Information
283 @cindex low-level locale information
284 It is sometimes useful to obtain very specific information about a
285 locale such as the word it uses for days or months, its format for
286 representing floating-point figures, etc. The @code{(ice-9 i18n)}
287 module provides support for this in a way that is similar to the libc
288 functions @code{nl_langinfo ()} and @code{localeconv ()}
289 (@pxref{Locale Information, accessing locale information from C,,
290 libc, The GNU C Library Reference Manual}). The available functions
293 @deffn {Scheme Procedure} locale-encoding [locale]
294 Return the name of the encoding (a string whose interpretation is
295 system-dependent) of either @var{locale} or the current locale.
298 The following functions deal with dates and times.
300 @deffn {Scheme Procedure} locale-day day [locale]
301 @deffnx {Scheme Procedure} locale-day-short day [locale]
302 @deffnx {Scheme Procedure} locale-month month [locale]
303 @deffnx {Scheme Procedure} locale-month-short month [locale]
304 Return the word (a string) used in either @var{locale} or the current
305 locale to name the day (or month) denoted by @var{day} (or
306 @var{month}), an integer between 1 and 7 (or 1 and 12). The
307 @code{-short} variants provide an abbreviation instead of a full name.
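
As a short illustration, the day and month procedures can be mapped
over their index range. The sketch assumes that the @code{"C"}
locale is available; the helper name @code{locale-week-days} is
hypothetical.

```scheme
(use-modules (ice-9 i18n) (srfi srfi-1))

;; Assumption: the "C" locale is available on the system.
(define c-locale (make-locale LC_ALL "C"))

;; Hypothetical helper: return the seven day names of LOCALE, in the
;; order of the indices 1 to 7.
(define (locale-week-days locale)
  (map (lambda (day) (locale-day day locale))
       (iota 7 1)))

(locale-day 1 c-locale)            ; => "Sunday"
(locale-month-short 12 c-locale)   ; => "Dec"
(length (locale-week-days c-locale))   ; => 7
```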
310 @deffn {Scheme Procedure} locale-am-string [locale]
311 @deffnx {Scheme Procedure} locale-pm-string [locale]
312 Return a (potentially empty) string that is used to denote @i{ante
313 meridiem} (or @i{post meridiem}) hours in 12-hour format.
316 @deffn {Scheme Procedure} locale-date+time-format [locale]
317 @deffnx {Scheme Procedure} locale-date-format [locale]
318 @deffnx {Scheme Procedure} locale-time-format [locale]
319 @deffnx {Scheme Procedure} locale-time+am/pm-format [locale]
320 @deffnx {Scheme Procedure} locale-era-date-format [locale]
321 @deffnx {Scheme Procedure} locale-era-date+time-format [locale]
322 @deffnx {Scheme Procedure} locale-era-time-format [locale]
323 These procedures return format strings suitable to @code{strftime}
324 (@pxref{Time}) that may be used to display (part of) a date/time
325 according to certain constraints and to the conventions of either
326 @var{locale} or the current locale (@pxref{The Elegant and Fast Way,
the @code{nl_langinfo ()} items,, libc, The GNU C Library Reference
Manual}).
331 @deffn {Scheme Procedure} locale-era [locale]
332 @deffnx {Scheme Procedure} locale-era-year [locale]
333 These functions return, respectively, the era and the year of the
334 relevant era used in @var{locale} or the current locale. Most locales
335 do not define this value. In this case, the empty string is returned.
336 An example of a locale that does define this value is the Japanese
340 The following procedures give information about number representation.
342 @deffn {Scheme Procedure} locale-decimal-point [locale]
343 @deffnx {Scheme Procedure} locale-thousands-separator [locale]
344 These functions return a string denoting the representation of the
decimal point or that of the thousands separator (respectively) for
346 either @var{locale} or the current locale.
349 @deffn {Scheme Procedure} locale-digit-grouping [locale]
350 Return a (potentially circular) list of integers denoting how digits
351 of the integer part of a number are to be grouped, starting at the
352 decimal point and going to the left. The list contains integers
353 indicating the size of the successive groups, from right to left. If
354 the list is non-circular, then no grouping occurs for digits beyond
357 For instance, if the returned list is a circular list that contains
only @code{3} and the thousands separator is @code{","} (as is the case
359 with English locales), then the number @code{12345678} should be
360 printed @code{12,345,678}.
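
The grouping rule described above can be sketched as follows. The
procedure @code{group-digits} is a hypothetical helper written for
this illustration, not part of @code{(ice-9 i18n)}; it takes the
integer part as a string of digits, a grouping list of the shape
returned by @code{locale-digit-grouping}, and a separator string.

```scheme
(use-modules (srfi srfi-1))   ; for circular-list

;; Hypothetical helper: insert SEPARATOR into DIGITS (a string holding
;; the integer part of a number) according to GROUPING.  Groups are
;; cut from the right.  Once a non-circular GROUPING is exhausted, the
;; remaining digits are left ungrouped.
(define (group-digits digits grouping separator)
  (let loop ((rest digits) (grouping grouping) (groups '()))
    (let ((len (string-length rest)))
      (if (or (null? grouping) (<= len (car grouping)))
          (string-join (cons rest groups) separator)
          (let ((cut (- len (car grouping))))
            (loop (substring rest 0 cut)
                  (cdr grouping)
                  (cons (substring rest cut) groups)))))))

(group-digits "12345678" (circular-list 3) ",")   ; => "12,345,678"
(group-digits "12345678" (list 3) ",")            ; => "12345,678"
(group-digits "12345678" '() ",")                 ; => "12345678"
```

A circular list needs no special handling here: taking its
@code{cdr} simply yields the same group size again.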
363 The following procedures deal with the representation of monetary
364 amounts. Some of them take an additional @var{intl?} argument (a
365 boolean) that tells whether the international or local monetary
366 conventions for the given locale are to be used.
368 @deffn {Scheme Procedure} locale-monetary-decimal-point [locale]
369 @deffnx {Scheme Procedure} locale-monetary-thousands-separator [locale]
370 @deffnx {Scheme Procedure} locale-monetary-grouping [locale]
These are the monetary counterparts of @code{locale-decimal-point},
@code{locale-thousands-separator} and @code{locale-digit-grouping},
respectively; they apply to the representation of monetary amounts.
375 @deffn {Scheme Procedure} locale-currency-symbol intl? [locale]
Return the currency symbol (a string) of either @var{locale} or the
current locale.
379 The following example illustrates the difference between the local and
380 international monetary formats:
383 (define us (make-locale LC_MONETARY "en_US"))
384 (locale-currency-symbol #f us)
386 (locale-currency-symbol #t us)
391 @deffn {Scheme Procedure} locale-monetary-fractional-digits intl? [locale]
392 Return the number of fractional digits to be used when printing
393 monetary amounts according to either @var{locale} or the current
locale. If the locale does not specify it, then @code{#f} is
returned.
398 @deffn {Scheme Procedure} locale-currency-symbol-precedes-positive? intl? [locale]
399 @deffnx {Scheme Procedure} locale-currency-symbol-precedes-negative? intl? [locale]
400 @deffnx {Scheme Procedure} locale-positive-separated-by-space? intl? [locale]
401 @deffnx {Scheme Procedure} locale-negative-separated-by-space? intl? [locale]
402 These procedures return a boolean indicating whether the currency
symbol should precede a positive/negative number, and whether a
space should be inserted between the currency symbol and a
405 positive/negative amount.
408 @deffn {Scheme Procedure} locale-monetary-positive-sign [locale]
409 @deffnx {Scheme Procedure} locale-monetary-negative-sign [locale]
410 Return a string denoting the positive (respectively negative) sign
411 that should be used when printing a monetary amount.
414 @deffn {Scheme Procedure} locale-positive-sign-position
415 @deffnx {Scheme Procedure} locale-negative-sign-position
416 These functions return a symbol telling where a sign of a
417 positive/negative monetary amount is to appear when printing it. The
422 The currency symbol and quantity should be surrounded by parentheses.
424 Print the sign string before the quantity and currency symbol.
426 Print the sign string after the quantity and currency symbol.
427 @item sign-before-currency-symbol
428 Print the sign string right before the currency symbol.
429 @item sign-after-currency-symbol
430 Print the sign string right after the currency symbol.
432 Unspecified. We recommend you print the sign after the currency
438 Finally, the two following procedures may be helpful when programming
441 @deffn {Scheme Procedure} locale-yes-regexp [locale]
442 @deffnx {Scheme Procedure} locale-no-regexp [locale]
443 Return a string that can be used as a regular expression to recognize
444 a positive (respectively, negative) response to a yes/no question.
445 For the C locale, the default values are typically @code{"^[yY]"} and
446 @code{"^[nN]"}, respectively.
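(use-modules (ice-9 i18n) (ice-9 rdelim) (ice-9 regex))
(format #t "Does Guile rock?~%")
(let ((answer (read-line)))
  ;; The two reply strings below are illustrative placeholders.
  (cond ((string-match (locale-yes-regexp) answer)
         "Yes it does.")
        ((string-match (locale-no-regexp) answer)
         "And yet it does.")
        (else
         "What do you mean?")))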
461 For an internationalized yes/no string output, @code{gettext} should
462 be used (@pxref{Gettext Support}).
465 Example uses of some of these functions are the implementation of the
466 @code{number->locale-string} and @code{monetary-amount->locale-string}
procedures (@pxref{Number Input and Output}), as well as that of the
SRFI-19 date and time conversion to/from strings (@pxref{SRFI-19}).
471 @node Gettext Support, , Accessing Locale Information, Internationalization
472 @subsection Gettext Support
474 Guile provides an interface to GNU @code{gettext} for translating
message strings (@pxref{Introduction,,, gettext, GNU @code{gettext}
utilities}).
478 Messages are collected in domains, so different libraries and programs
479 maintain different message catalogues. The @var{domain} parameter in
480 the functions below is a string (it becomes part of the message
483 When @code{gettext} is not available, or if Guile was configured
484 @samp{--without-nls}, dummy functions doing no translation are
485 provided. When @code{gettext} support is available in Guile, the
486 @code{i18n} feature is provided (@pxref{Feature Tracking}).
488 @deffn {Scheme Procedure} gettext msg [domain [category]]
489 @deffnx {C Function} scm_gettext (msg, domain, category)
490 Return the translation of @var{msg} in @var{domain}. @var{domain} is
491 optional and defaults to the domain set through @code{textdomain}
below. @var{category} is optional and defaults to @code{LC_MESSAGES}
(@pxref{Locales}).
495 Normal usage is for @var{msg} to be a literal string.
496 @command{xgettext} can extract those from the source to form a message
497 catalogue ready for translators (@pxref{xgettext Invocation,, Invoking
the @command{xgettext} Program, gettext, GNU @code{gettext}
utilities}).
502 (display (gettext "You are in a maze of twisty passages."))
@code{_} is a commonly used shorthand; an application can make it an
alias for @code{gettext}. Alternatively, a library can define it to
use its specific @var{domain} (so an application can change the
default without affecting the library).
511 (define (_ msg) (gettext msg "mylibrary"))
512 (display (_ "File not found."))
515 @code{_} is also a good place to perhaps strip disambiguating extra
516 text from the message string, as for instance in @ref{GUI program
517 problems,, How to use @code{gettext} in GUI programs, gettext, GNU
518 @code{gettext} utilities}.
521 @deffn {Scheme Procedure} ngettext msg msgplural n [domain [category]]
522 @deffnx {C Function} scm_ngettext (msg, msgplural, n, domain, category)
523 Return the translation of @var{msg}/@var{msgplural} in @var{domain},
524 with a plural form chosen appropriately for the number @var{n}.
525 @var{domain} is optional and defaults to the domain set through
526 @code{textdomain} below. @var{category} is optional and defaults to
527 @code{LC_MESSAGES} (@pxref{Locales}).
529 @var{msg} is the singular form, and @var{msgplural} the plural. When
530 no translation is available, @var{msg} is used if @math{@var{n} = 1},
531 or @var{msgplural} otherwise. When translated, the message catalogue
532 can have a different rule, and can have more than two possible forms.
534 As per @code{gettext} above, normal usage is for @var{msg} and
535 @var{msgplural} to be literal strings, since @command{xgettext} can
536 extract them from the source to build a message catalogue. For
(define (done n)
  (format #t (ngettext "~a file processed\n"
                       "~a files processed\n" n)
          n))

(done 1) @print{} 1 file processed
(done 3) @print{} 3 files processed
549 It's important to use @code{ngettext} rather than plain @code{gettext}
550 for plurals, since the rules for singular and plural forms in English
551 are not the same in other languages. Only @code{ngettext} will allow
552 translators to give correct forms (@pxref{Plural forms,, Additional
553 functions for plural forms, gettext, GNU @code{gettext} utilities}).
556 @deffn {Scheme Procedure} textdomain [domain]
557 @deffnx {C Function} scm_textdomain (domain)
558 Get or set the default gettext domain. When called with no parameter
559 the current domain is returned. When called with a parameter,
560 @var{domain} is set as the current domain, and that new value
561 returned. For example,
564 (textdomain "myprog")
569 @deffn {Scheme Procedure} bindtextdomain domain [directory]
570 @deffnx {C Function} scm_bindtextdomain (domain, directory)
571 Get or set the directory under which to find message files for
572 @var{domain}. When called without a @var{directory} the current
573 setting is returned. When called with a @var{directory},
574 @var{directory} is set for @var{domain} and that new setting returned.
578 (bindtextdomain "myprog" "/my/tree/share/locale")
579 @result{} "/my/tree/share/locale"
582 When using Autoconf/Automake, an application should arrange for the
583 configured @code{localedir} to get into the program (by substituting,
584 or by generating a config file) and set that for its domain. This
585 ensures the catalogue can be found even when installed in a
586 non-standard location.
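
Putting these procedures together, the start-up code of a program
might look like the following sketch. The domain name
@code{"myprog"} and the directory are the placeholders used in the
examples above; in a real build the directory would be the configured
@code{localedir}. The variable names @code{%my-domain} and
@code{%my-localedir} are hypothetical.

```scheme
;; Hypothetical placeholders for the domain and catalogue directory.
(define %my-domain "myprog")
(define %my-localedir "/my/tree/share/locale")

;; Adopt the locale requested by the user's environment; carry on
;; with the default locale if it is not installed.
(false-if-exception (setlocale LC_ALL ""))

;; Tell gettext where the catalogues of the domain live, then make
;; the domain the default one.
(bindtextdomain %my-domain %my-localedir)
(textdomain %my-domain)

;; Shorthand that looks MSG up in the default domain.
(define (_ msg) (gettext msg))

(display (_ "File not found."))
(newline)
```

When no catalogue is found for the domain, @code{gettext} returns its
argument unchanged, so the program still prints the untranslated
message.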
589 @deffn {Scheme Procedure} bind-textdomain-codeset domain [encoding]
590 @deffnx {C Function} scm_bind_textdomain_codeset (domain, encoding)
591 Get or set the text encoding to be used by @code{gettext} for messages
592 from @var{domain}. @var{encoding} is a string, the name of a coding
593 system, for instance @nicode{"8859_1"}. (On a Unix/POSIX system the
594 @command{iconv} program can list all available encodings.)
596 When called without an @var{encoding} the current setting is returned,
597 or @code{#f} if none yet set. When called with an @var{encoding}, it
598 is set for @var{domain} and that new setting returned. For example,
601 (bind-textdomain-codeset "myprog")
603 (bind-textdomain-codeset "myprog" "latin-9")
The encoding requested can be different from that of the translated
data file; messages will be recoded as necessary. But note that when
there is no translation, @code{gettext} returns its @var{msg}
unchanged, i.e.,
610 without any recoding. For that reason source message strings are best
613 Currently Guile has no understanding of multi-byte characters, and
614 string functions won't recognise character boundaries in multi-byte
615 strings. An application will at least be able to pass such strings
through to some output though. Perhaps this will change in the
future.
621 @c TeX-master: "guile.texi"
622 @c ispell-local-dictionary: "american"