1 \input texinfo @c -*- texinfo -*-
2 @c %**start of header
3 @setfilename ../../info/nxml-mode
4 @settitle nXML Mode
5 @c %**end of header
6
7 @dircategory Emacs
8 @direntry
9 * nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
10 @end direntry
11
12 @node Top
13 @top nXML Mode
14
15 This manual documents nxml-mode, an Emacs major mode for editing
16 XML with RELAX NG support. This manual is not yet complete.
17
18 @menu
19 * Completion::
20 * Inserting end-tags::
21 * Paragraphs::
22 * Outlining::
23 * Locating a schema::
24 * DTDs::
25 * Limitations::
26 @end menu
27
28 @node Completion
29 @chapter Completion
30
31 Apart from real-time validation, the most important feature that
32 nxml-mode provides for assisting in document creation is "completion".
33 Completion assists the user in inserting characters at point, based on
34 knowledge of the schema and on the contents of the buffer before
35 point.
36
37 The traditional GNU Emacs key combination for completion in a
38 buffer is @kbd{M-@key{TAB}}. However, many window systems
39 and window managers use this key combination themselves (typically for
40 switching between windows) and do not pass it to applications. It's
41 hard to find key combinations in GNU Emacs that are both easy to type
42 and not taken by something else. @kbd{C-@key{RET}} (i.e.
43 pressing the Enter or Return key, while the Ctrl key is held down) is
44 available. It won't be available on a traditional terminal (because
45 it is indistinguishable from Return), but it will work with a window
46 system. Therefore we adopt the following solution by default: use
47 @kbd{C-@key{RET}} when there's a window system and
48 @kbd{M-@key{TAB}} when there's not. In the following, I
49 will assume that a window system is being used and will therefore
50 refer to @kbd{C-@key{RET}}.
51
52 Completion works by examining the symbol preceding point. This
53 is the symbol to be completed. The symbol to be completed may be
54 empty. Completion considers what symbols starting with the symbol to
55 be completed would be valid replacements for the symbol to be
56 completed, given the schema and the contents of the buffer before
57 point. These symbols are the possible completions. An example may
58 make this clearer. Suppose the buffer looks like this (where @point{}
59 indicates point):
60
61 @example
62 <html xmlns="http://www.w3.org/1999/xhtml">
63 <h@point{}
64 @end example
65
66 @noindent
67 and the schema is XHTML. In this context, the symbol to be completed
68 is @samp{h}. The possible completions consist of just
69 @samp{head}. Another example is:
70
71 @example
72 <html xmlns="http://www.w3.org/1999/xhtml">
73 <head>
74 <@point{}
75 @end example
76
77 @noindent
78 In this case, the symbol to be completed is empty, and the possible
79 completions are @samp{base}, @samp{isindex},
80 @samp{link}, @samp{meta}, @samp{script},
81 @samp{style}, @samp{title}. Another example is:
82
83 @example
84 <html xmlns="@point{}
85 @end example
86
87 @noindent
88 In this case, the symbol to be completed is empty, and the possible
89 completions are just @samp{http://www.w3.org/1999/xhtml}.
90
91 When you type @kbd{C-@key{RET}}, what happens depends
92 on what the set of possible completions is.
93
94 @itemize @bullet
95 @item
96 If the set of completions is empty, nothing
97 happens.
98 @item
99 If there is one possible completion, then that completion is
100 inserted, together with any following characters that are
101 required. For example, in this case:
102
103 @example
104 <html xmlns="http://www.w3.org/1999/xhtml">
105 <@point{}
106 @end example
107
108 @noindent
109 @kbd{C-@key{RET}} will yield
110
111 @example
112 <html xmlns="http://www.w3.org/1999/xhtml">
113 <head@point{}
114 @end example
115 @item
116 If there is more than one possible completion, but all
117 possible completions share a common non-empty prefix, then that prefix
118 is inserted. For example, suppose the buffer is:
119
120 @example
121 <html x@point{}
122 @end example
123
124 @noindent
125 The symbol to be completed is @samp{x}. The possible completions
126 are @samp{xmlns} and @samp{xml:lang}. These share a
127 common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}}
128 will yield:
129
130 @example
131 <html xml@point{}
132 @end example
133
134 @noindent
135 Typically, you would do @kbd{C-@key{RET}} again, which would
136 have the result described in the next item.
137 @item
138 If there is more than one possible completion, but the
139 possible completions do not share a non-empty prefix, then Emacs will
140 prompt you to input the symbol in the minibuffer, initializing the
141 minibuffer with the symbol to be completed, and popping up a buffer
142 showing the possible completions. You can now input the symbol to be
143 inserted. The symbol you input will be inserted in the buffer instead
144 of the symbol to be completed. Emacs will then insert any required
145 characters after the symbol. For example, if the buffer contains:
146
147 @example
148 <html xml@point{}
149 @end example
150
151 @noindent
152 Emacs will prompt you in the minibuffer with
153
154 @example
155 Attribute: xml@point{}
156 @end example
157
158 @noindent
159 and the buffer showing possible completions will contain
160
161 @example
162 Possible completions are:
163 xml:lang xmlns
164 @end example
165
166 @noindent
167 If you input @kbd{xmlns}, the result will be:
168
169 @example
170 <html xmlns="@point{}
171 @end example
172
173 @noindent
174 (If you do @kbd{C-@key{RET}} again, the namespace URI will
175 be inserted. Should that happen automatically?)
176 @end itemize
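
The four cases above can be summarized in a short sketch. This is an
illustration in Python, not the code nXML mode runs; the function name
@code{completion_step} and the action labels are hypothetical. It
assumes, as the @samp{xml} example suggests, that a shared prefix only
counts when it is longer than the symbol already typed.

```python
import os


def completion_step(symbol, candidates):
    """Decide what a completion request does for SYMBOL.

    CANDIDATES lists the symbols the schema allows at point; only those
    starting with SYMBOL are possible completions.  Returns a pair
    (action, text) where action is one of the hypothetical labels
    "nothing", "insert", "extend" or "prompt".
    """
    possible = [c for c in candidates if c.startswith(symbol)]
    if not possible:
        return ("nothing", symbol)
    if len(possible) == 1:
        return ("insert", possible[0])
    # Longest string that every possible completion starts with.
    prefix = os.path.commonprefix(possible)
    if len(prefix) > len(symbol):
        return ("extend", prefix)
    # Nothing more can be inserted automatically: ask in the minibuffer.
    return ("prompt", symbol)
```

With the attribute names from the examples,
@code{completion_step("x", ["xmlns", "xml:lang"])} extends the symbol
to @samp{xml}, and a second request with @samp{xml} falls through to
the prompt.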
177
178 @node Inserting end-tags
179 @chapter Inserting end-tags
180
181 The main redundancy in XML syntax is end-tags. nxml-mode provides
182 several ways to make it easier to enter end-tags. You can use all of
183 these without a schema.
184
185 You can use @kbd{C-@key{RET}} after @samp{</}
186 to complete the rest of the end-tag.
187
188 @kbd{C-c C-f} inserts an end-tag for the element containing
189 point. This command is useful when you want to input the start-tag,
190 then input the content and finally input the end-tag. The @samp{f}
191 is mnemonic for finish.
192
193 If you want to keep tags balanced and input the end-tag at the
194 same time as the start-tag, before inputting the content, then you can
195 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
196 the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
197 is similar but more convenient for block-level elements: it puts the
198 start-tag, point and the end-tag on successive lines, appropriately
199 indented. The @samp{i} is mnemonic for inline and the
200 @samp{b} is mnemonic for block.
201
202 Finally, you can customize nxml-mode so that @kbd{/}
203 automatically inserts the rest of the end-tag when it occurs after
204 @samp{<}, by doing
205
206 @display
207 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
208 @end display
209
210 @noindent
211 and then following the instructions in the displayed buffer.
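
To see why no schema is needed for these commands, consider how the
end-tag can be derived from the text before point alone. The following
Python sketch is a simplification and not the algorithm nXML mode uses:
the name @code{finishing_end_tag} is hypothetical, and the regular
expression ignores comments, CDATA sections, processing instructions
and attribute values containing @samp{>}.

```python
import re

# Matches start-tags, end-tags and empty-element tags only.
TAG = re.compile(r"<(/?)([A-Za-z_][\w:.-]*)[^<>]*?(/?)>")


def finishing_end_tag(text_before_point):
    """Return the end-tag for the innermost element still open, or None."""
    open_elements = []
    for closing, name, empty in TAG.findall(text_before_point):
        if closing:
            # An end-tag closes the most recently opened element.
            if open_elements and open_elements[-1] == name:
                open_elements.pop()
        elif not empty:
            # A start-tag opens an element; <x/> opens nothing.
            open_elements.append(name)
    if not open_elements:
        return None
    return "</%s>" % open_elements[-1]
```

The stack of open elements is all the information required, which is
the sense in which end-tags are redundant.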
212
213 @node Paragraphs
214 @chapter Paragraphs
215
216 Emacs has several commands that operate on paragraphs, most
217 notably @kbd{M-q}. nXML mode redefines these to work in a way
218 that is useful for XML. The exact rules that are used to find the
219 beginning and end of a paragraph are complicated; they are designed
220 mainly to ensure that @kbd{M-q} does the right thing.
221
222 A paragraph consists of one or more complete, consecutive lines.
223 A group of lines is not considered a paragraph unless it contains some
224 non-whitespace characters between tags or inside comments. A blank
225 line separates paragraphs. A single tag on a line by itself also
226 separates paragraphs. More precisely, if one tag together with any
227 leading and trailing whitespace completely occupy one or more lines,
228 then those lines will not be included in any paragraph.
229
230 A start-tag at the beginning of the line (possibly indented) may
231 be treated as starting a paragraph. Similarly, an end-tag at the end
232 of the line may be treated as ending a paragraph. The following rules
233 are used to determine whether such a tag is in fact treated as a
234 paragraph boundary:
235
236 @itemize @bullet
237 @item
238 If the schema does not allow text at that point, then it
239 is a paragraph boundary.
240 @item
241 If the end-tag corresponding to the start-tag is not at
242 the end of its line, or the start-tag corresponding to the end-tag is
243 not at the beginning of its line, then it is not a paragraph
244 boundary. For example, in
245
246 @example
247 <p>This is a paragraph with an
248 <emph>emphasized</emph> phrase.
249 @end example
250
251 @noindent
252 the @samp{<emph>} start-tag would not be considered as
253 starting a paragraph, because its corresponding end-tag is not at the
254 end of the line.
255 @item
256 If there is text that is a sibling in the element tree, then
257 it is not a paragraph boundary. For example, in
258
259 @example
260 <p>This is a paragraph with an
261 <emph>emphasized phrase that takes one source line</emph>
262 @end example
263
264 @noindent
265 the @samp{<emph>} start-tag would not be considered as
266 starting a paragraph, even though its end-tag is at the end of its
267 line, because the text @samp{This is a paragraph with an}
268 is a sibling of the @samp{emph} element.
269 @item
270 Otherwise, it is a paragraph boundary.
271 @end itemize
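
Read as a decision procedure, the rules above might look like the
following Python sketch. It assumes the rules are tried in the order
listed, which the final ``otherwise'' suggests but the text does not
state outright. The function and parameter names are hypothetical;
each parameter stands for a fact nXML mode would have to work out from
the buffer and the schema.

```python
def is_paragraph_boundary(schema_allows_text,
                          partner_at_line_edge,
                          has_text_sibling):
    """Decide whether a tag at the edge of a line bounds a paragraph.

    schema_allows_text   -- the schema permits text at this point
    partner_at_line_edge -- the matching end-tag ends its line (for a
                            start-tag), or the matching start-tag
                            begins its line (for an end-tag)
    has_text_sibling     -- the element has sibling text in the tree
    """
    if not schema_allows_text:
        return True
    if not partner_at_line_edge:
        return False
    if has_text_sibling:
        return False
    return True
```

In the first @samp{<emph>} example the second test fails; in the
second example the third test fails. Either way the tag is not a
paragraph boundary.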
272
273 @node Outlining
274 @chapter Outlining
275
276 nXML mode allows you to display all or part of a buffer as an
277 outline, in a similar way to Emacs' outline mode. An outline in nXML
278 mode is based on recognizing two kinds of element: sections and
279 headings. There is one heading for every section and one section for
280 every heading. A section contains its heading as or within its first
281 child element. A section also contains its subordinate sections (its
282 subsections). The text content of a section consists of anything in a
283 section that is neither a subsection nor a heading.
284
285 Note that this is a different model from that used by XHTML.
286 nXML mode's outline support will not be useful for XHTML unless you
287 adopt a convention of adding a @code{div} to enclose each
288 section, rather than having sections implicitly delimited by different
289 @code{h@var{n}} elements. This limitation may be removed
290 in a future version.
291
292 The variable @code{nxml-section-element-name-regexp} gives
293 a regexp for the local names (i.e. the part of the name following any
294 prefix) of section elements. The variable
295 @code{nxml-heading-element-name-regexp} gives a regexp for the
296 local names of heading elements. For an element to be recognized
297 as a section
298
299 @itemize @bullet
300 @item
301 its start-tag must occur at the beginning of a line
302 (possibly indented);
303 @item
304 its local name must match
305 @code{nxml-section-element-name-regexp};
306 @item
307 either its first child element or a descendant of that
308 first child element must have a local name that matches
309 @code{nxml-heading-element-name-regexp}; the first such element
310 is treated as the section's heading.
311 @end itemize
312
313 @noindent
314 You can customize these variables using @kbd{M-x
315 customize-variable}.
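
The second and third conditions can be sketched with Python's standard
XML library. This is only an illustration: the two patterns below are
hypothetical stand-ins for the values of the variables, the sketch
assumes a regexp must match the whole local name, and it cannot check
the first condition because the parsed tree no longer records line
positions.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical patterns; the real ones come from
# nxml-section-element-name-regexp and nxml-heading-element-name-regexp.
SECTION_RE = re.compile(r"section|chapter|div")
HEADING_RE = re.compile(r"title|head")


def local_name(element):
    """Strip the '{namespace-uri}' part ElementTree puts before a name."""
    return element.tag.rsplit("}", 1)[-1]


def find_heading(element):
    """Return the heading of ELEMENT if it qualifies as a section, else None."""
    if not SECTION_RE.fullmatch(local_name(element)):
        return None
    if len(element) == 0:
        return None  # no child elements, so no heading
    first_child = element[0]
    # iter() yields first_child itself, then its descendants in
    # document order, so the first match is the section's heading.
    for candidate in first_child.iter():
        if HEADING_RE.fullmatch(local_name(candidate)):
            return candidate
    return None
```

An element whose name matches the section pattern but which has no
heading in its first child is therefore not a section.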
316
317 There are three possible outline states for a section:
318
319 @itemize @bullet
320 @item
321 normal, showing everything, including its heading, text
322 content and subsections; each subsection is displayed according to the
323 state of that subsection;
324 @item
325 showing just its heading, with both its text content and
326 its subsections hidden; all subsections are hidden regardless of their
327 state;
328 @item
329 showing its heading and its subsections, with its text
330 content hidden; each subsection is displayed according to the state of
331 that subsection.
332 @end itemize
333
334 In the last two states, where the text content is hidden, the
335 heading is displayed specially, in an abbreviated form. An element
336 like this:
337
338 @example
339 <section>
340 <title>Food</title>
341 <para>There are many kinds of food.</para>
342 </section>
343 @end example
344
345 @noindent
346 would be displayed on a single line like this:
347
348 @example
349 <-section>Food...</>
350 @end example
351
352 @noindent
353 If there are hidden subsections, then a @code{+} will be used
354 instead of a @code{-} like this:
355
356 @example
357 <+section>Food...</>
358 @end example
359
360 @noindent
361 If there are non-hidden subsections, then the section will instead be
362 displayed like this:
363
364 @example
365 <-section>Food...
366 <-section>Delicious Food...</>
367 <-section>Distasteful Food...</>
368 </-section>
369 @end example
370
371 @noindent
372 The heading is always displayed with an indent that corresponds to its
373 depth in the outline, even if it is not actually indented in the buffer.
374 The variable @code{nxml-outline-child-indent} controls how much
375 a subheading is indented with respect to its parent heading when the
376 heading is being displayed specially.
377
378 Commands to change the outline state of sections are bound to
379 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
380 mnemonic for outline). The third and final key has been chosen to be
381 consistent with outline mode. In the following descriptions,
382 @dfn{current section} means the section containing point, or, more precisely,
383 the innermost section containing the character immediately following
384 point.
385
386 @itemize @bullet
387 @item
388 @kbd{C-c C-o C-a} shows all sections in the buffer
389 normally.
390 @item
391 @kbd{C-c C-o C-t} hides the text content
392 of all sections in the buffer.
393 @item
394 @kbd{C-c C-o C-c} hides the text content
395 of the current section.
396 @item
397 @kbd{C-c C-o C-e} shows the text content
398 of the current section.
399 @item
400 @kbd{C-c C-o C-d} hides the text content
401 and subsections of the current section.
402 @item
403 @kbd{C-c C-o C-s} shows the current section
404 and all its direct and indirect subsections normally.
405 @item
406 @kbd{C-c C-o C-k} shows the headings of the
407 direct and indirect subsections of the current section.
408 @item
409 @kbd{C-c C-o C-l} hides the text content of the
410 current section and of its direct and indirect
411 subsections.
412 @item
413 @kbd{C-c C-o C-i} shows the headings of the
414 direct subsections of the current section.
415 @item
416 @kbd{C-c C-o C-o} hides as much as possible without
417 hiding the current section's text content; the headings of ancestor
418 sections of the current section and their child sections will
419 not be hidden.
420 @end itemize
421
422 When a heading is displayed specially, you can use
423 @key{RET} in that heading to show the text content of the section
424 in the same way as @kbd{C-c C-o C-e}.
425
426 You can also use the mouse to change the outline state:
427 @kbd{S-mouse-2} hides the text content of a section in the same
428 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
429 displayed heading shows the text content of the section in the same
430 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
431 displayed start-tag toggles the display of subheadings on and
432 off.
433
434 The outline state for each section is stored with the first
435 character of the section (as a text property). Every command that
436 changes the outline state of any section updates the display of the
437 buffer so that each section is displayed correctly according to its
438 outline state. If the section structure is subsequently changed, then
439 it is possible for the display to no longer correctly reflect the
440 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
441 the display so it is correct again.
442
443 @node Locating a schema
444 @chapter Locating a schema
445
446 nXML mode has a configurable set of rules to locate a schema for
447 the file being edited. The rules are contained in one or more schema
448 locating files, which are XML documents.
449
450 The variable @samp{rng-schema-locating-files} specifies
451 the list of the file-names of schema locating files that nXML mode
452 should use. The order of the list is significant: when file
453 @var{x} occurs in the list before file @var{y} then rules
454 from file @var{x} have precedence over rules from file
455 @var{y}. A filename specified in
456 @samp{rng-schema-locating-files} may be relative. If so, it will
457 be resolved relative to the document for which a schema is being
458 located. It is not an error if relative file-names in
459 @samp{rng-schema-locating-files} do not exist. You can use
460 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
461 @key{RET}} to customize the list of schema locating
462 files.
463
464 By default, the @samp{rng-schema-locating-files} list has two
465 members: @samp{schemas.xml}, and
466 @samp{@var{dist-dir}/schema/schemas.xml} where
467 @samp{@var{dist-dir}} is the directory containing the nXML
468 distribution. The first member will cause nXML mode to use a file
469 @samp{schemas.xml} in the same directory as the document being
470 edited if such a file exists. The second member contains rules for the
471 schemas that are included with the nXML distribution.
472
473 @menu
474 * Commands for locating a schema::
475 * Schema locating files::
476 @end menu
477
478 @node Commands for locating a schema
479 @section Commands for locating a schema
480
481 The command @kbd{C-c C-s C-w} will tell you what schema
482 is currently being used.
483
484 The rules for locating a schema are applied automatically when
485 you visit a file in nXML mode. However, if you have just created a new
486 file and the schema cannot be inferred from the file-name, then this
487 will not locate the right schema. In this case, you should insert the
488 start-tag of the root element and then use the command @kbd{C-c C-s
489 C-a}, which reapplies the rules based on the current content of
490 the document. It is usually not necessary to insert the complete
491 start-tag; often just @samp{<@var{name}} is
492 enough.
493
494 If you want to use a schema that has not yet been added to the
495 schema locating files, you can use the command @kbd{C-c C-s C-f}
496 to manually select the file containing the schema for the document in
497 the current buffer. Emacs will read the file-name of the schema from the
498 minibuffer. After reading the file-name, Emacs will ask whether you
499 wish to add a rule to a schema locating file that persistently
500 associates the document with the selected schema. The rule will be
501 added to the first file in the list specified by
502 @samp{rng-schema-locating-files}; it will create the file if
503 necessary, but will not create a directory. If the variable
504 @samp{rng-schema-locating-files} has not been customized, this
505 means that the rule will be added to the file @samp{schemas.xml}
506 in the same directory as the document being edited.
507
508 The command @kbd{C-c C-s C-t} allows you to select a schema by
509 specifying an identifier for the type of the document. The schema
510 locating files determine the available type identifiers and what
511 schema is used for each type identifier. This is useful when it is
512 impossible to infer the right schema from either the file-name or the
513 content of the document, even though the schema is already in the
514 schema locating file. A situation in which this can occur is when
515 there are multiple variants of a schema where all valid documents have
516 the same document element. For example, XHTML has Strict and
517 Transitional variants. In a situation like this, a schema locating file
518 can define a type identifier for each variant. As with @kbd{C-c
519 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
520 locating file that persistently associates the document with the
521 specified type identifier.
522
523 The command @kbd{C-c C-s C-l} adds a rule to a schema
524 locating file that persistently associates the document with
525 the schema that is currently being used.
526
527 @node Schema locating files
528 @section Schema locating files
529
530 Each schema locating file specifies a list of rules. The rules
531 from each file are appended in order. To locate a schema each rule is
532 applied in turn until a rule matches. The first matching rule is then
533 used to determine the schema.
534
535 Schema locating files are designed to be useful for other
536 applications that need to locate a schema for a document. In fact,
537 there is nothing specific to locating schemas in the design; it could
538 equally well be used for locating a stylesheet.
539
540 @menu
541 * Schema locating file syntax basics::
542 * Using the document's URI to locate a schema::
543 * Using the document element to locate a schema::
544 * Using type identifiers in schema locating files::
545 * Using multiple schema locating files::
546 @end menu
547
548 @node Schema locating file syntax basics
549 @subsection Schema locating file syntax basics
550
551 There is a schema for schema locating files in the file
552 @samp{locate.rnc} in the schema directory. Schema locating
553 files must be valid with respect to this schema.
554
555 The document element of a schema locating file must be
556 @samp{locatingRules} and the namespace URI must be
557 @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
558 children of the document element specify rules. The order of the
559 children is the same as the order of the rules. Here's a complete
560 example of a schema locating file:
561
562 @example
563 <?xml version="1.0"?>
564 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
565 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
566 <documentElement localName="book" uri="docbook.rnc"/>
567 </locatingRules>
568 @end example
569
570 @noindent
571 This says to use the schema @samp{xhtml.rnc} for a document whose
572 document element has the namespace URI @samp{http://www.w3.org/1999/xhtml}, and to use the
573 schema @samp{docbook.rnc} for a document whose document element has the local name
574 @samp{book}. If the document element had both a namespace URI
575 of @samp{http://www.w3.org/1999/xhtml} and a local name of
576 @samp{book}, then the matching rule that comes first will be
577 used and so the schema @samp{xhtml.rnc} would be used. There is
578 no precedence between different types of rule; the first matching rule
579 of any type is used.
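
The ``first matching rule wins'' behavior can be illustrated with a
small Python sketch. The rule list mirrors the example file above; the
tuple representation and the name @code{locate_schema} are
hypothetical, and only the two rule types used in the example are
modeled.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Rules in file order: (rule type, value to match, schema URI).
RULES = [
    ("namespace", XHTML_NS, "xhtml.rnc"),
    ("documentElement", "book", "docbook.rnc"),
]


def locate_schema(namespace_uri, local_name, rules=RULES):
    """Return the schema of the first rule matching the document element."""
    for rule_type, value, schema in rules:
        if rule_type == "namespace" and value == namespace_uri:
            return schema
        if rule_type == "documentElement" and value == local_name:
            return schema
    return None  # no rule matched
```

A document element named @samp{book} in the XHTML namespace gets
@samp{xhtml.rnc}; with the two rules swapped it would get
@samp{docbook.rnc} instead.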
580
581 As usual with XML-related technologies, resources are identified
582 by URIs. The @samp{uri} attribute identifies the schema by
583 specifying the URI. The URI may be relative. If so, it is resolved
584 relative to the URI of the schema locating file that contains the
585 attribute. This means that if the value of the @samp{uri} attribute
586 does not contain a @samp{/}, then it will refer to a filename in
587 the same directory as the schema locating file.
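
This is ordinary relative URI resolution. As an illustration, Python's
standard @code{urllib.parse.urljoin} performs the same kind of
resolution. The wrapper name @code{resolve_schema_uri} and the file
locations in the usage note are hypothetical.

```python
from urllib.parse import urljoin


def resolve_schema_uri(locating_file_uri, uri_attribute):
    """Resolve a rule's uri attribute against its schema locating file."""
    return urljoin(locating_file_uri, uri_attribute)
```

For a schema locating file at
@samp{file:///home/jjc/docs/schemas.xml}, the value @samp{xhtml.rnc}
resolves to @samp{file:///home/jjc/docs/xhtml.rnc}, and an absolute
URI is left unchanged.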
588
589 @node Using the document's URI to locate a schema
590 @subsection Using the document's URI to locate a schema
591
592 A @samp{uri} rule locates a schema based on the URI of the
593 document. The @samp{uri} attribute specifies the URI of the
594 schema. The @samp{resource} attribute can be used to specify
595 the schema for a particular document. For example,
596
597 @example
598 <uri resource="spec.xml" uri="docbook.rnc"/>
599 @end example
600
601 @noindent
602 specifies that the schema for @samp{spec.xml} is
603 @samp{docbook.rnc}.
604
605 The @samp{pattern} attribute can be used instead of the
606 @samp{resource} attribute to specify the schema for any document
607 whose URI matches a pattern. The pattern has the same syntax as an
608 absolute or relative URI except that the path component of the URI can
609 use a @samp{*} character to stand for zero or more characters
610 within a path segment (i.e. any character other than @samp{/}).
611 Typically, the URI pattern looks like a relative URI, but, whereas a
612 relative URI in the @samp{resource} attribute is resolved into a
613 particular absolute URI using the base URI of the schema locating
614 file, a relative URI pattern matches if it matches some number of
615 complete path segments of the document's URI ending with the last path
616 segment of the document's URI. For example,
617
618 @example
619 <uri pattern="*.xsl" uri="xslt.rnc"/>
620 @end example
621
622 @noindent
623 specifies that the schema for documents with a URI whose path ends
624 with @samp{.xsl} is @samp{xslt.rnc}.
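
The matching of a relative pattern against trailing path segments can
be sketched as follows. This Python illustration covers relative
patterns only, and the function names are hypothetical; it is one
reading of the description above, not the matcher nXML mode
implements.

```python
import re
from urllib.parse import urlsplit


def segment_regex(pattern_segment):
    """Compile one path segment of a pattern; '*' never crosses a '/'."""
    parts = (re.escape(p) for p in pattern_segment.split("*"))
    return re.compile("[^/]*".join(parts))


def matches_relative_pattern(pattern, document_uri):
    """Match PATTERN against the last path segments of DOCUMENT_URI."""
    wanted = pattern.split("/")
    actual = urlsplit(document_uri).path.split("/")
    if len(wanted) > len(actual):
        return False
    # Compare whole segments, ending with the document's last segment.
    tail = actual[-len(wanted):]
    return all(segment_regex(w).fullmatch(a) for w, a in zip(wanted, tail))
```

Because whole segments are compared, @samp{docs/*.xml} matches
@samp{file:///home/jjc/docs/spec.xml} but @samp{ocs/*.xml} does not.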
625
626 A @samp{transformURI} rule locates a schema by
627 transforming the URI of the document. The @samp{fromPattern}
628 attribute specifies a URI pattern with the same meaning as the
629 @samp{pattern} attribute of the @samp{uri} element. The
630 @samp{toPattern} attribute is a URI pattern that is used to
631 generate the URI of the schema. Each @samp{*} in the
632 @samp{toPattern} is replaced by the string that matched the
633 corresponding @samp{*} in the @samp{fromPattern}. The
634 resulting string is appended to the initial part of the document's URI
635 that was not explicitly matched by the @samp{fromPattern}. The
636 rule matches only if the transformed URI identifies an existing
637 resource. For example, the rule
638
639 @example
640 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
641 @end example
642
643 @noindent
644 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
645 into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
646 rule specifies that to locate a schema for a document
647 @samp{@var{foo}.xml}, Emacs should test whether a file
648 @samp{@var{foo}.rnc} exists in the same directory as
649 @samp{@var{foo}.xml}, and, if so, should use it as the
650 schema.
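
The transformation can be sketched in Python as below. The function
name @code{transform_uri} and the @code{exists} callback are
hypothetical; the callback stands in for the check that the
transformed URI identifies an existing resource, and by default the
sketch accepts every candidate.

```python
import re


def transform_uri(from_pattern, to_pattern, document_uri,
                  exists=lambda uri: True):
    """Return the schema URI a transformURI rule yields, or None."""
    # Each '*' becomes a capturing group that cannot cross a '/'.
    regex = "([^/]*)".join(re.escape(p) for p in from_pattern.split("*"))
    # The pattern must start at a path segment and end the URI.
    match = re.search("(?:^|(?<=/))" + regex + "$", document_uri)
    if match is None:
        return None
    groups = list(match.groups())
    pieces = to_pattern.split("*")
    if len(pieces) - 1 > len(groups):
        raise ValueError("toPattern has more '*' than fromPattern")
    # Replace each '*' in toPattern by the corresponding matched string.
    result = pieces[0]
    for group, piece in zip(groups, pieces[1:]):
        result += group + piece
    # Append to the part of the URI that fromPattern did not match.
    candidate = document_uri[:match.start()] + result
    return candidate if exists(candidate) else None
```

Passing a real existence test as @code{exists} makes the rule fall
through to later rules when no such schema file is present.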
651
652 @node Using the document element to locate a schema
653 @subsection Using the document element to locate a schema
654
655 A @samp{documentElement} rule locates a schema based on
656 the local name and prefix of the document element. For example, a rule
657
658 @example
659 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
660 @end example
661
662 @noindent
663 specifies that when the name of the document element is
664 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
665 as the schema. Either the @samp{prefix} or
666 @samp{localName} attribute may be omitted to allow any prefix or
667 local name.
668
669 A @samp{namespace} rule locates a schema based on the
670 namespace URI of the document element. For example, a rule
671
672 @example
673 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
674 @end example
675
676 @noindent
677 specifies that when the namespace URI of the document element is
678 @samp{http://www.w3.org/1999/XSL/Transform}, then
679 @samp{xslt.rnc} should be used as the schema.
680
681 @node Using type identifiers in schema locating files
682 @subsection Using type identifiers in schema locating files
683
684 Type identifiers allow a level of indirection in locating the
685 schema for a document. Instead of associating the document directly
686 with a schema URI, the document is associated with a type identifier,
687 which is in turn associated with a schema URI. nXML mode does not
688 constrain the format of type identifiers. They can be simply strings
689 without any formal structure or they can be public identifiers or
690 URIs. Note that these type identifiers have nothing to do with the
691 DOCTYPE declaration. When comparing type identifiers, whitespace is
692 normalized in the same way as with the @samp{xsd:token}
693 datatype: leading and trailing whitespace is stripped; other sequences
694 of whitespace are normalized to a single space character.
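
The whitespace normalization just described can be written in a few
lines of Python. The function names are hypothetical; the whitespace
characters are the four that XML treats as whitespace (space, tab,
carriage return and line feed).

```python
import re


def normalize_type_id(type_id):
    """Collapse runs of XML whitespace and strip both ends."""
    return re.sub(r"[ \t\r\n]+", " ", type_id).strip(" ")


def same_type_id(a, b):
    """Compare two type identifiers after normalization."""
    return normalize_type_id(a) == normalize_type_id(b)
```

So @samp{XHTML Strict} written across two lines in an attribute value
still compares equal to the same identifier written with a single
space.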
695
696 Each of the rules described in previous sections that uses a
697 @samp{uri} attribute to specify a schema, can instead use a
698 @samp{typeId} attribute to specify a type identifier. The type
699 identifier can be associated with a URI using a @samp{typeId}
700 element. For example,
701
702 @example
703 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
704 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
705 <typeId id="XHTML" typeId="XHTML Strict"/>
706 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
707 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
708 </locatingRules>
709 @end example
710
711 @noindent
712 declares three type identifiers @samp{XHTML} (representing the
713 default variant of XHTML to be used), @samp{XHTML Strict} and
714 @samp{XHTML Transitional}. Such a schema locating file would
715 use @samp{xhtml-strict.rnc} for a document whose namespace is
716 @samp{http://www.w3.org/1999/xhtml}. But it is considerably
717 more flexible than a schema locating file that simply specified
718
719 @example
720 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
721 @end example
722
723 @noindent
724 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
725 Strict and XHTML Transitional. Also, a user can easily add a catalog
726
727 @example
728 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
729 <typeId id="XHTML" typeId="XHTML Transitional"/>
730 </locatingRules>
731 @end example
732
733 @noindent
734 that makes the default variant of XHTML be XHTML Transitional.
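
The indirection can be pictured as a lookup that follows
@samp{typeId} rules until one of them names a schema URI. The Python
sketch below uses a hypothetical tuple form for the rules and assumes
that the user's file comes earlier in the list of schema locating
files, so that its rule is seen first.

```python
def resolve_type_id(type_id, type_id_rules):
    """Follow typeId rules until one gives a schema URI.

    TYPE_ID_RULES is a list of (id, attribute, value) tuples in rule
    order, where attribute is "uri" or "typeId".  Returns the schema
    URI, or None if the identifier is unknown or the rules loop.
    """
    seen = set()
    while type_id not in seen:
        seen.add(type_id)
        for rule_id, attribute, value in type_id_rules:
            if rule_id == type_id:
                if attribute == "uri":
                    return value
                type_id = value  # indirection: look up the new id
                break
        else:
            return None  # no rule for this identifier
    return None  # identifiers refer to each other in a cycle


# The rules from the first example, and the user's override.
BASE_RULES = [
    ("XHTML", "typeId", "XHTML Strict"),
    ("XHTML Strict", "uri", "xhtml-strict.rnc"),
    ("XHTML Transitional", "uri", "xhtml-transitional.rnc"),
]
USER_RULES = [("XHTML", "typeId", "XHTML Transitional")]
```

With @code{BASE_RULES} alone, @samp{XHTML} resolves to
@samp{xhtml-strict.rnc}; with @code{USER_RULES + BASE_RULES} it
resolves to @samp{xhtml-transitional.rnc}.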
735
736 @node Using multiple schema locating files
737 @subsection Using multiple schema locating files
738
739 The @samp{include} element includes rules from another
740 schema locating file. The behavior is exactly as if the rules from
741 that file were included in place of the @samp{include} element.
742 Relative URIs are resolved into absolute URIs before the inclusion is
743 performed. For example,
744
745 @example
746 <include rules="../rules.xml"/>
747 @end example
748
749 @noindent
750 includes the rules from @samp{rules.xml}.
751
752 The process of locating a schema takes as input a list of schema
753 locating files. The rules in all these files and in the files they
754 include are resolved into a single list of rules, which are applied
755 strictly in order. Sometimes this order is not what is needed.
756 For example, suppose you have two schema locating files, a private
757 file
758
759 @example
760 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
761 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
762 </locatingRules>
763 @end example
764
765 @noindent
766 followed by a public file
767
768 @example
769 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
770 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
771 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
772 </locatingRules>
773 @end example
774
775 @noindent
776 The effect of these two files is that the XHTML @samp{namespace}
777 rule takes precedence over the @samp{transformURI} rule, which
778 is almost certainly not what is needed. This can be solved by adding
779 an @samp{applyFollowingRules} element to the private file:
780
781 @example
782 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
783 <applyFollowingRules ruleType="transformURI"/>
784 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
785 </locatingRules>
786 @end example
787
788 @node DTDs
789 @chapter DTDs
790
791 nxml-mode is designed to support the creation of standalone XML
792 documents that do not depend on a DTD. Although it is common practice
793 to insert a DOCTYPE declaration referencing an external DTD, this has
794 undesirable side-effects. It means that the document is no longer
795 self-contained. It also means that different XML parsers may interpret
796 the document in different ways, since the XML Recommendation does not
797 require XML parsers to read the DTD. With DTDs, it was impractical to
798 get validation without using an external DTD or reference to a
799 parameter entity. With RELAX NG and other schema languages, you can
800 simultaneously get the benefits of validation and standalone XML
801 documents. Therefore, I recommend that you do not reference an
802 external DOCTYPE in your XML documents.
803
804 One problem is entities for characters. Typically, as well as
805 providing validation, DTDs also provide a set of character entities
806 for documents to use. Schemas cannot provide this functionality,
807 because schema validation happens after XML parsing. The recommended
808 solution is to either use the Unicode characters directly, or, if this
809 is impractical, use character references. nXML mode supports this by
810 providing commands for entering characters and character references
811 using the Unicode names, and can display the glyph corresponding to a
812 character reference.
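
To make the idea of a character reference entered by Unicode name
concrete, here is a small Python sketch. It only illustrates the
mapping from a name to a reference and says nothing about how the nXML
mode commands are implemented; the function name is hypothetical.

```python
import unicodedata


def character_reference(unicode_name):
    """Return the hexadecimal character reference for a Unicode name."""
    char = unicodedata.lookup(unicode_name)  # raises KeyError if unknown
    return "&#x%X;" % ord(char)
```

For example, @samp{NO-BREAK SPACE} gives @samp{&#xA0;}, which can
stand in for the @samp{&nbsp;} entity that a DTD would otherwise have
to declare.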
813
814 @node Limitations
815 @chapter Limitations
816
817 nXML mode has some limitations:
818
819 @itemize @bullet
820 @item
821 DTD support is limited. Internal parsed general entities declared
822 in the internal subset are supported provided they do not contain
823 elements. Other usage of DTDs is ignored.
824 @item
825 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
826 specification are not enforced.
827 @item
828 Unicode support has problems. This stems mostly from the fact that
829 the XML (and RELAX NG) character model is based squarely on Unicode,
830 whereas the Emacs character model is not. Emacs 22 is slated to have
831 full Unicode support, which should improve the situation here.
832 @end itemize
833
834 @bye
835
836 @ignore
837 arch-tag: 3b6e8ac2-ae8d-4f38-bd43-ce9f80be04d6
838 @end ignore