1 \input texinfo @c -*- texinfo -*-
2 @c %**start of header
3 @setfilename ../../info/nxml-mode
4 @settitle nXML Mode
5 @c %**end of header
6
7 @copying
8 This manual documents nxml-mode, an Emacs major mode for editing
9 XML with RELAX NG support.
10
11 Copyright @copyright{} 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
12
13 @quotation
14 Permission is granted to copy, distribute and/or modify this document
15 under the terms of the GNU Free Documentation License, Version 1.3 or
16 any later version published by the Free Software Foundation; with no
17 Invariant Sections, with the Front-Cover texts being ``A GNU
18 Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
19 license is included in the section entitled ``GNU Free Documentation
20 License'' in the Emacs manual.
21
22 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
23 modify this GNU manual. Buying copies from the FSF supports it in
24 developing GNU and promoting software freedom.''
25
26 This document is part of a collection distributed under the GNU Free
27 Documentation License. If you want to distribute this document
28 separately from the collection, you can do so by adding a copy of the
29 license to the document, as described in section 6 of the license.
30 @end quotation
31 @end copying
32
33 @dircategory Emacs
34 @direntry
35 * nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
36 @end direntry
37
38 @node Top
39 @top nXML Mode
40
41 @insertcopying
42
43 This manual is not yet complete.
44
45 @menu
46 * Completion::
47 * Inserting end-tags::
48 * Paragraphs::
49 * Outlining::
50 * Locating a schema::
51 * DTDs::
52 * Limitations::
53 @end menu
54
55 @node Completion
56 @chapter Completion
57
58 Apart from real-time validation, the most important feature that
59 nxml-mode provides for assisting in document creation is @dfn{completion}.
60 Completion assists the user in inserting characters at point, based on
61 knowledge of the schema and on the contents of the buffer before
62 point.
63
64 The traditional GNU Emacs key combination for completion in a
65 buffer is @kbd{M-@key{TAB}}. However, many window systems
66 and window managers use this key combination themselves (typically for
67 switching between windows) and do not pass it to applications. It's
68 hard to find key combinations in GNU Emacs that are both easy to type
69 and not taken by something else. @kbd{C-@key{RET}} (i.e.
70 pressing the Enter or Return key, while the Ctrl key is held down) is
71 available. It won't be available on a traditional terminal (because
72 it is indistinguishable from Return), but it will work with a window
73 system. Therefore we adopt the following solution by default: use
74 @kbd{C-@key{RET}} when there's a window system and
75 @kbd{M-@key{TAB}} when there's not. In the following, I
76 will assume that a window system is being used and will therefore
77 refer to @kbd{C-@key{RET}}.
78
79 Completion works by examining the symbol preceding point. This
80 is the symbol to be completed. The symbol to be completed may be
81 empty. Completion considers what symbols starting with the symbol to
82 be completed would be valid replacements for the symbol to be
83 completed, given the schema and the contents of the buffer before
84 point. These symbols are the possible completions. An example may
85 make this clearer. Suppose the buffer looks like this (where @point{}
86 indicates point):
87
88 @example
89 <html xmlns="http://www.w3.org/1999/xhtml">
90 <h@point{}
91 @end example
92
93 @noindent
94 and the schema is XHTML. In this context, the symbol to be completed
95 is @samp{h}. The possible completions consist of just
96 @samp{head}. Another example is:
97
98 @example
99 <html xmlns="http://www.w3.org/1999/xhtml">
100 <head>
101 <@point{}
102 @end example
103
104 @noindent
105 In this case, the symbol to be completed is empty, and the possible
106 completions are @samp{base}, @samp{isindex},
107 @samp{link}, @samp{meta}, @samp{script},
108 @samp{style}, @samp{title}. Another example is:
109
110 @example
111 <html xmlns="@point{}
112 @end example
113
114 @noindent
115 In this case, the symbol to be completed is empty, and the possible
116 completions are just @samp{http://www.w3.org/1999/xhtml}.
117
118 When you type @kbd{C-@key{RET}}, what happens depends
119 on what the set of possible completions is.
120
121 @itemize @bullet
122 @item
123 If the set of completions is empty, nothing
124 happens.
125 @item
126 If there is one possible completion, then that completion is
127 inserted, together with any following characters that are
128 required. For example, in this case:
129
130 @example
131 <html xmlns="http://www.w3.org/1999/xhtml">
132 <@point{}
133 @end example
134
135 @noindent
136 @kbd{C-@key{RET}} will yield
137
138 @example
139 <html xmlns="http://www.w3.org/1999/xhtml">
140 <head@point{}
141 @end example
142 @item
143 If there is more than one possible completion, but all
144 possible completions share a common non-empty prefix, then that prefix
145 is inserted. For example, suppose the buffer is:
146
147 @example
148 <html x@point{}
149 @end example
150
151 @noindent
152 The symbol to be completed is @samp{x}. The possible completions
153 are @samp{xmlns} and @samp{xml:lang}. These share a
154 common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}}
155 will yield:
156
157 @example
158 <html xml@point{}
159 @end example
160
161 @noindent
162 Typically, you would do @kbd{C-@key{RET}} again, which would
163 have the result described in the next item.
164 @item
165 If there is more than one possible completion, but the
166 possible completions do not share a non-empty prefix, then Emacs will
167 prompt you to input the symbol in the minibuffer, initializing the
168 minibuffer with the symbol to be completed, and popping up a buffer
169 showing the possible completions. You can now input the symbol to be
170 inserted. The symbol you input will be inserted in the buffer instead
171 of the symbol to be completed. Emacs will then insert any required
172 characters after the symbol. For example, if the buffer contains:
173
174 @example
175 <html xml@point{}
176 @end example
177
178 @noindent
179 Emacs will prompt you in the minibuffer with
180
181 @example
182 Attribute: xml@point{}
183 @end example
184
185 @noindent
186 and the buffer showing possible completions will contain
187
188 @example
189 Possible completions are:
190 xml:lang xmlns
191 @end example
192
193 @noindent
194 If you input @kbd{xmlns}, the result will be:
195
196 @example
197 <html xmlns="@point{}
198 @end example
199
200 @noindent
201 (If you do @kbd{C-@key{RET}} again, the namespace URI will
202 be inserted. Should that happen automatically?)
203 @end itemize
204
205 @node Inserting end-tags
206 @chapter Inserting end-tags
207
208 The main redundancy in XML syntax is end-tags. nxml-mode provides
209 several ways to make it easier to enter end-tags. You can use all of
210 these without a schema.
211
212 You can use @kbd{C-@key{RET}} after @samp{</}
213 to complete the rest of the end-tag.
214
215 @kbd{C-c C-f} inserts an end-tag for the element containing
216 point. This command is useful when you want to input the start-tag,
217 then input the content and finally input the end-tag. The @samp{f}
218 is mnemonic for finish.
219
220 If you want to keep tags balanced and input the end-tag at the
221 same time as the start-tag, before inputting the content, then you can
222 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
223 the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
224 is similar but more convenient for block-level elements: it puts the
225 start-tag, point and the end-tag on successive lines, appropriately
226 indented. The @samp{i} is mnemonic for inline and the
227 @samp{b} is mnemonic for block.
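
As a rough illustration of these commands, suppose the buffer contains
the following. The element names are only examples, and the exact
indentation produced depends on your settings.

@example
<p>Some <em@point{}
@end example

@noindent
Typing @kbd{C-c C-i} should then give

@example
<p>Some <em>@point{}</em>
@end example

@noindent
After you type the content and move point past @samp{</em>},
@kbd{C-c C-f} should insert the end-tag of the enclosing @samp{p}
element:

@example
<p>Some <em>text</em></p>
@end example

@noindent
For a block-level element, typing @kbd{C-c C-b} after @samp{<div} on
a line of its own should give something like

@example
<div>
  @point{}
</div>
@end example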
228
229 Finally, you can customize nxml-mode so that @kbd{/}
230 automatically inserts the rest of the end-tag when it occurs after
231 @samp{<}, by doing
232
233 @display
234 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
235 @end display
236
237 @noindent
238 and then following the instructions in the displayed buffer.
239
240 @node Paragraphs
241 @chapter Paragraphs
242
243 Emacs has several commands that operate on paragraphs, most
244 notably @kbd{M-q}. nXML mode redefines these to work in a way
245 that is useful for XML. The exact rules that are used to find the
246 beginning and end of a paragraph are complicated; they are designed
247 mainly to ensure that @kbd{M-q} does the right thing.
248
249 A paragraph consists of one or more complete, consecutive lines.
250 A group of lines is not considered a paragraph unless it contains some
251 non-whitespace characters between tags or inside comments. A blank
252 line separates paragraphs. A single tag on a line by itself also
253 separates paragraphs. More precisely, if one tag together with any
254 leading and trailing whitespace completely occupy one or more lines,
255 then those lines will not be included in any paragraph.
256
257 A start-tag at the beginning of the line (possibly indented) may
258 be treated as starting a paragraph. Similarly, an end-tag at the end
259 of the line may be treated as ending a paragraph. The following rules
260 are used to determine whether such a tag is in fact treated as a
261 paragraph boundary:
262
263 @itemize @bullet
264 @item
265 If the schema does not allow text at that point, then it
266 is a paragraph boundary.
267 @item
268 If the end-tag corresponding to the start-tag is not at
269 the end of its line, or the start-tag corresponding to the end-tag is
270 not at the beginning of its line, then it is not a paragraph
271 boundary. For example, in
272
273 @example
274 <p>This is a paragraph with an
275 <emph>emphasized</emph> phrase.
276 @end example
277
278 @noindent
279 the @samp{<emph>} start-tag would not be considered as
280 starting a paragraph, because its corresponding end-tag is not at the
281 end of the line.
282 @item
283 If there is text that is a sibling in the element tree, then
284 it is not a paragraph boundary. For example, in
285
286 @example
287 <p>This is a paragraph with an
288 <emph>emphasized phrase that takes one source line</emph>
289 @end example
290
291 @noindent
292 the @samp{<emph>} start-tag would not be considered as
293 starting a paragraph, even though its end-tag is at the end of its
294 line, because the text @samp{This is a paragraph with an}
295 is a sibling of the @samp{emph} element.
296 @item
297 Otherwise, it is a paragraph boundary.
298 @end itemize
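
The following sketch shows how these rules might combine. It assumes a
schema in which @samp{section} cannot contain text directly; the
element names are hypothetical.

@example
<section>
<p>This first paragraph is long enough
to be spread over two source lines.</p>
<p>This is a second paragraph.</p>
</section>
@end example

@noindent
The @samp{<section>} and @samp{</section>} tags each occupy a line by
themselves, so they are not part of any paragraph. Each @samp{<p>}
start-tag begins a line, its end-tag ends a line, and the @samp{p}
elements have no sibling text, so each @samp{p} element should be
treated as a separate paragraph. With point in the first one,
@kbd{M-q} would then be expected to refill only its two lines.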
299
300 @node Outlining
301 @chapter Outlining
302
303 nXML mode allows you to display all or part of a buffer as an
304 outline, in a similar way to Emacs' outline mode. An outline in nXML
305 mode is based on recognizing two kinds of element: sections and
306 headings. There is one heading for every section and one section for
307 every heading. A section contains its heading as or within its first
308 child element. A section also contains its subordinate sections (its
309 subsections). The text content of a section consists of anything in a
310 section that is neither a subsection nor a heading.
311
312 Note that this is a different model from that used by XHTML.
313 nXML mode's outline support will not be useful for XHTML unless you
314 adopt a convention of adding a @code{div} to enclose each
315 section, rather than having sections implicitly delimited by different
316 @code{h@var{n}} elements. This limitation may be removed
317 in a future version.
318
319 The variable @code{nxml-section-element-name-regexp} gives
320 a regexp for the local names (i.e. the part of the name following any
321 prefix) of section elements. The variable
322 @code{nxml-heading-element-name-regexp} gives a regexp for the
323 local names of heading elements. For an element to be recognized
324 as a section
325
326 @itemize @bullet
327 @item
328 its start-tag must occur at the beginning of a line
329 (possibly indented);
330 @item
331 its local name must match
332 @code{nxml-section-element-name-regexp};
333 @item
334 either its first child element or a descendant of that
335 first child element must have a local name that matches
336 @code{nxml-heading-element-name-regexp}; the first such element
337 is treated as the section's heading.
338 @end itemize
339
340 @noindent
341 You can customize these variables using @kbd{M-x
342 customize-variable}.
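
As a sketch of the @code{div} convention for XHTML mentioned above,
suppose that @code{nxml-section-element-name-regexp} matches
@samp{div} and that @code{nxml-heading-element-name-regexp} has been
customized to a value such as @samp{h[1-6]} (an illustrative value,
not necessarily the default). A document structured like this should
then be recognized as one section containing one subsection:

@example
<div>
  <h1>Food</h1>
  <p>There are many kinds of food.</p>
  <div>
    <h2>Delicious Food</h2>
    <p>Some food is delicious.</p>
  </div>
</div>
@end example

@noindent
Each @samp{div} start-tag is at the beginning of its line, and the
first child element of each @samp{div} is a heading, so both
conditions in the list above are met.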
343
344 There are three possible outline states for a section:
345
346 @itemize @bullet
347 @item
348 normal, showing everything, including its heading, text
349 content and subsections; each subsection is displayed according to the
350 state of that subsection;
351 @item
352 showing just its heading, with both its text content and
353 its subsections hidden; all subsections are hidden regardless of their
354 state;
355 @item
356 showing its heading and its subsections, with its text
357 content hidden; each subsection is displayed according to the state of
358 that subsection.
359 @end itemize
360
361 In the last two states, where the text content is hidden, the
362 heading is displayed specially, in an abbreviated form. An element
363 like this:
364
365 @example
366 <section>
367 <title>Food</title>
368 <para>There are many kinds of food.</para>
369 </section>
370 @end example
371
372 @noindent
373 would be displayed on a single line like this:
374
375 @example
376 <-section>Food...</>
377 @end example
378
379 @noindent
380 If there are hidden subsections, then a @code{+} will be used
381 instead of a @code{-} like this:
382
383 @example
384 <+section>Food...</>
385 @end example
386
387 @noindent
388 If there are non-hidden subsections, then the section will instead be
389 displayed like this:
390
391 @example
392 <-section>Food...
393 <-section>Delicious Food...</>
394 <-section>Distasteful Food...</>
395 </-section>
396 @end example
397
398 @noindent
399 The heading is always displayed with an indent that corresponds to its
400 depth in the outline, even if it is not actually indented in the buffer.
401 The variable @code{nxml-outline-child-indent} controls how much
402 a subheading is indented with respect to its parent heading when the
403 heading is being displayed specially.
404
405 Commands to change the outline state of sections are bound to
406 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
407 mnemonic for outline). The third and final key has been chosen to be
408 consistent with outline mode. In the following descriptions
409 current section means the section containing point, or, more precisely,
410 the innermost section containing the character immediately following
411 point.
412
413 @itemize @bullet
414 @item
415 @kbd{C-c C-o C-a} shows all sections in the buffer
416 normally.
417 @item
418 @kbd{C-c C-o C-t} hides the text content
419 of all sections in the buffer.
420 @item
421 @kbd{C-c C-o C-c} hides the text content
422 of the current section.
423 @item
424 @kbd{C-c C-o C-e} shows the text content
425 of the current section.
426 @item
427 @kbd{C-c C-o C-d} hides the text content
428 and subsections of the current section.
429 @item
430 @kbd{C-c C-o C-s} shows the current section
431 and all its direct and indirect subsections normally.
432 @item
433 @kbd{C-c C-o C-k} shows the headings of the
434 direct and indirect subsections of the current section.
435 @item
436 @kbd{C-c C-o C-l} hides the text content of the
437 current section and of its direct and indirect
438 subsections.
439 @item
440 @kbd{C-c C-o C-i} shows the headings of the
441 direct subsections of the current section.
442 @item
443 @kbd{C-c C-o C-o} hides as much as possible without
444 hiding the current section's text content; the headings of ancestor
445 sections of the current section and their child sections will
446 not be hidden.
447 @end itemize
448
449 When a heading is displayed specially, you can use
450 @key{RET} in that heading to show the text content of the section
451 in the same way as @kbd{C-c C-o C-e}.
452
453 You can also use the mouse to change the outline state:
454 @kbd{S-mouse-2} hides the text content of a section in the same
455 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
456 displayed heading shows the text content of the section in the same
457 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
458 displayed start-tag toggles the display of subheadings on and
459 off.
460
461 The outline state for each section is stored with the first
462 character of the section (as a text property). Every command that
463 changes the outline state of any section updates the display of the
464 buffer so that each section is displayed correctly according to its
465 outline state. If the section structure is subsequently changed, then
466 it is possible for the display to no longer correctly reflect the
467 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
468 the display so it is correct again.
469
470 @node Locating a schema
471 @chapter Locating a schema
472
473 nXML mode has a configurable set of rules to locate a schema for
474 the file being edited. The rules are contained in one or more schema
475 locating files, which are XML documents.
476
477 The variable @samp{rng-schema-locating-files} specifies
478 the list of the file-names of schema locating files that nXML mode
479 should use. The order of the list is significant: when file
480 @var{x} occurs in the list before file @var{y} then rules
481 from file @var{x} have precedence over rules from file
482 @var{y}. A filename specified in
483 @samp{rng-schema-locating-files} may be relative. If so, it will
484 be resolved relative to the document for which a schema is being
485 located. It is not an error if relative file-names in
486 @samp{rng-schema-locating-files} do not exist. You can use
487 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
488 @key{RET}} to customize the list of schema locating
489 files.
490
491 By default, the @samp{rng-schema-locating-files} list has two
492 members: @samp{schemas.xml}, and
493 @samp{@var{dist-dir}/schema/schemas.xml} where
494 @samp{@var{dist-dir}} is the directory containing the nXML
495 distribution. The first member will cause nXML mode to use a file
496 @samp{schemas.xml} in the same directory as the document being
497 edited if such a file exists. The second member contains rules for the
498 schemas that are included with the nXML distribution.
499
500 @menu
501 * Commands for locating a schema::
502 * Schema locating files::
503 @end menu
504
505 @node Commands for locating a schema
506 @section Commands for locating a schema
507
508 The command @kbd{C-c C-s C-w} will tell you what schema
509 is currently being used.
510
511 The rules for locating a schema are applied automatically when
512 you visit a file in nXML mode. However, if you have just created a new
513 file and the schema cannot be inferred from the file-name, then this
514 will not locate the right schema. In this case, you should insert the
515 start-tag of the root element and then use the command @kbd{C-c C-s
516 C-a}, which reapplies the rules based on the current content of
517 the document. It is usually not necessary to insert the complete
518 start-tag; often just @samp{<@var{name}} is
519 enough.
520
521 If you want to use a schema that has not yet been added to the
522 schema locating files, you can use the command @kbd{C-c C-s C-f}
523 to manually select the file containing the schema for the document in
524 the current buffer. Emacs will read the file-name of the schema from the
525 minibuffer. After reading the file-name, Emacs will ask whether you
526 wish to add a rule to a schema locating file that persistently
527 associates the document with the selected schema. The rule will be
528 added to the first file in the list specified by
529 @samp{rng-schema-locating-files}; it will create the file if
530 necessary, but will not create a directory. If the variable
531 @samp{rng-schema-locating-files} has not been customized, this
532 means that the rule will be added to the file @samp{schemas.xml}
533 in the same directory as the document being edited.
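
A rule that associates one particular document with one schema can be
written as in the following sketch. The file names are hypothetical,
and the exact form of the rule that Emacs writes may differ.

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
Saved as @samp{schemas.xml} in the same directory as
@samp{spec.xml}, this associates that document with a schema
@samp{docbook.rnc} in the same directory.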
534
535 The command @kbd{C-c C-s C-t} allows you to select a schema by
536 specifying an identifier for the type of the document. The schema
537 locating files determine the available type identifiers and what
538 schema is used for each type identifier. This is useful when it is
539 impossible to infer the right schema from either the file-name or the
540 content of the document, even though the schema is already in the
541 schema locating file. A situation in which this can occur is when
542 there are multiple variants of a schema where all valid documents have
543 the same document element. For example, XHTML has Strict and
544 Transitional variants. In a situation like this, a schema locating file
545 can define a type identifier for each variant. As with @kbd{C-c
546 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
547 locating file that persistently associates the document with the
548 specified type identifier.
549
550 The command @kbd{C-c C-s C-l} adds a rule to a schema
551 locating file that persistently associates the document with
552 the schema that is currently being used.
553
554 @node Schema locating files
555 @section Schema locating files
556
557 Each schema locating file specifies a list of rules. The rules
558 from each file are appended in order. To locate a schema each rule is
559 applied in turn until a rule matches. The first matching rule is then
560 used to determine the schema.
561
562 Schema locating files are designed to be useful for other
563 applications that need to locate a schema for a document. In fact,
564 there is nothing specific to locating schemas in the design; it could
565 equally well be used for locating a stylesheet.
566
567 @menu
568 * Schema locating file syntax basics::
569 * Using the document's URI to locate a schema::
570 * Using the document element to locate a schema::
571 * Using type identifiers in schema locating files::
572 * Using multiple schema locating files::
573 @end menu
574
575 @node Schema locating file syntax basics
576 @subsection Schema locating file syntax basics
577
578 There is a schema for schema locating files in the file
579 @samp{locate.rnc} in the schema directory. Schema locating
580 files must be valid with respect to this schema.
581
582 The document element of a schema locating file must be
583 @samp{locatingRules} and the namespace URI must be
584 @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
585 children of the document element specify rules. The order of the
586 children is the same as the order of the rules. Here's a complete
587 example of a schema locating file:
588
589 @example
590 <?xml version="1.0"?>
591 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
592 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
593 <documentElement localName="book" uri="docbook.rnc"/>
594 </locatingRules>
595 @end example
596
597 @noindent
598 This says to use the schema @samp{xhtml.rnc} for a document whose document
599 element has the namespace URI @samp{http://www.w3.org/1999/xhtml}, and to use the
600 schema @samp{docbook.rnc} for a document whose document element has the local name
601 @samp{book}. If the document element had both a namespace URI
602 of @samp{http://www.w3.org/1999/xhtml} and a local name of
603 @samp{book}, then the matching rule that comes first will be
604 used and so the schema @samp{xhtml.rnc} would be used. There is
605 no precedence between different types of rule; the first matching rule
606 of any type is used.
607
608 As usual with XML-related technologies, resources are identified
609 by URIs. The @samp{uri} attribute identifies the schema by
610 specifying the URI. The URI may be relative. If so, it is resolved
611 relative to the URI of the schema locating file that contains the
612 attribute. This means that if the value of the @samp{uri} attribute
613 does not contain a @samp{/}, then it will refer to a filename in
614 the same directory as the schema locating file.
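
For example, assume a schema locating file at the hypothetical
location @samp{file:///home/jjc/docs/schemas.xml} that contains these
two rules (the schema file names are also hypothetical):

@example
<documentElement localName="book" uri="docbook.rnc"/>
<documentElement localName="article" uri="../schema/article.rnc"/>
@end example

@noindent
By the usual rules for resolving relative URIs, the first rule refers
to @samp{file:///home/jjc/docs/docbook.rnc} and the second to
@samp{file:///home/jjc/schema/article.rnc}.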
615
616 @node Using the document's URI to locate a schema
617 @subsection Using the document's URI to locate a schema
618
619 A @samp{uri} rule locates a schema based on the URI of the
620 document. The @samp{uri} attribute specifies the URI of the
621 schema. The @samp{resource} attribute can be used to specify
622 the schema for a particular document. For example,
623
624 @example
625 <uri resource="spec.xml" uri="docbook.rnc"/>
626 @end example
627
628 @noindent
629 specifies that the schema for @samp{spec.xml} is
630 @samp{docbook.rnc}.
631
632 The @samp{pattern} attribute can be used instead of the
633 @samp{resource} attribute to specify the schema for any document
634 whose URI matches a pattern. The pattern has the same syntax as an
635 absolute or relative URI except that the path component of the URI can
636 use a @samp{*} character to stand for zero or more characters
637 within a path segment (i.e. any character other than @samp{/}).
638 Typically, the URI pattern looks like a relative URI, but, whereas a
639 relative URI in the @samp{resource} attribute is resolved into a
640 particular absolute URI using the base URI of the schema locating
641 file, a relative URI pattern matches if it matches some number of
642 complete path segments of the document's URI ending with the last path
643 segment of the document's URI. For example,
644
645 @example
646 <uri pattern="*.xsl" uri="xslt.rnc"/>
647 @end example
648
649 @noindent
650 specifies that the schema for documents with a URI whose path ends
651 with @samp{.xsl} is @samp{xslt.rnc}.
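
A relative pattern may also span more than one path segment. As a
sketch, with hypothetical file names, consider

@example
<uri pattern="docs/*.xml" uri="docbook.rnc"/>
@end example

@noindent
Following the matching rule described above, this pattern should match
@samp{file:///home/jjc/docs/spec.xml}. It should not match
@samp{file:///home/jjc/mydocs/spec.xml}, because @samp{docs} must
match a complete path segment, nor
@samp{file:///home/jjc/docs/old/spec.xml}, because @samp{*} does not
match a @samp{/}.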
652
653 A @samp{transformURI} rule locates a schema by
654 transforming the URI of the document. The @samp{fromPattern}
655 attribute specifies a URI pattern with the same meaning as the
656 @samp{pattern} attribute of the @samp{uri} element. The
657 @samp{toPattern} attribute is a URI pattern that is used to
658 generate the URI of the schema. Each @samp{*} in the
659 @samp{toPattern} is replaced by the string that matched the
660 corresponding @samp{*} in the @samp{fromPattern}. The
661 resulting string is appended to the initial part of the document's URI
662 that was not explicitly matched by the @samp{fromPattern}. The
663 rule matches only if the transformed URI identifies an existing
664 resource. For example, the rule
665
666 @example
667 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
668 @end example
669
670 @noindent
671 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
672 into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
673 rule specifies that to locate a schema for a document
674 @samp{@var{foo}.xml}, Emacs should test whether a file
675 @samp{@var{foo}.rnc} exists in the same directory as
676 @samp{@var{foo}.xml}, and, if so, should use it as the
677 schema.
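
The @samp{toPattern} can also place the schema in a different
directory. The following is a sketch based on the substitution rule
described above; the directory name @samp{schema} is hypothetical.

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
For the document @samp{file:///home/jjc/docs/spec.xml}, the
@samp{*} matches @samp{spec}, the @samp{toPattern} becomes
@samp{schema/spec.rnc}, and appending this to the unmatched initial
part of the URI would give
@samp{file:///home/jjc/docs/schema/spec.rnc}. The rule would apply
only if that file exists.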
678
679 @node Using the document element to locate a schema
680 @subsection Using the document element to locate a schema
681
682 A @samp{documentElement} rule locates a schema based on
683 the local name and prefix of the document element. For example, a rule
684
685 @example
686 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
687 @end example
688
689 @noindent
690 specifies that when the name of the document element is
691 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
692 as the schema. Either the @samp{prefix} or
693 @samp{localName} attribute may be omitted to allow any prefix or
694 local name.
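
For instance, each of the following rules omits one of the two
attributes:

@example
<documentElement localName="book" uri="docbook.rnc"/>
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example

@noindent
The first rule applies to any document element whose local name is
@samp{book}, whatever its prefix. The second applies to any document
element whose prefix is @samp{xsl}, such as @samp{xsl:stylesheet} or
@samp{xsl:transform}.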
695
696 A @samp{namespace} rule locates a schema based on the
697 namespace URI of the document element. For example, a rule
698
699 @example
700 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
701 @end example
702
703 @noindent
704 specifies that when the namespace URI of the document element is
705 @samp{http://www.w3.org/1999/XSL/Transform}, then
706 @samp{xslt.rnc} should be used as the schema.
707
708 @node Using type identifiers in schema locating files
709 @subsection Using type identifiers in schema locating files
710
711 Type identifiers allow a level of indirection in locating the
712 schema for a document. Instead of associating the document directly
713 with a schema URI, the document is associated with a type identifier,
714 which is in turn associated with a schema URI. nXML mode does not
715 constrain the format of type identifiers. They can be simply strings
716 without any formal structure or they can be public identifiers or
717 URIs. Note that these type identifiers have nothing to do with the
718 DOCTYPE declaration. When comparing type identifiers, whitespace is
719 normalized in the same way as with the @samp{xsd:token}
720 datatype: leading and trailing whitespace is stripped; other sequences
721 of whitespace are normalized to a single space character.
722
723 Each of the rules described in previous sections that uses a
724 @samp{uri} attribute to specify a schema, can instead use a
725 @samp{typeId} attribute to specify a type identifier. The type
726 identifier can be associated with a URI using a @samp{typeId}
727 element. For example,
728
729 @example
730 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
731 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
732 <typeId id="XHTML" typeId="XHTML Strict"/>
733 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
734 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
735 </locatingRules>
736 @end example
737
738 @noindent
739 declares three type identifiers: @samp{XHTML} (representing the
740 default variant of XHTML to be used), @samp{XHTML Strict} and
741 @samp{XHTML Transitional}. Such a schema locating file would
742 use @samp{xhtml-strict.rnc} for a document whose namespace is
743 @samp{http://www.w3.org/1999/xhtml}. But it is considerably
744 more flexible than a schema locating file that simply specified
745
746 @example
747 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
748 @end example
749
750 @noindent
751 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
752 Strict and XHTML Transitional. Also, a user can easily add a schema locating file
753
754 @example
755 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
756 <typeId id="XHTML" typeId="XHTML Transitional"/>
757 </locatingRules>
758 @end example
759
760 @noindent
761 that makes the default variant of XHTML be XHTML Transitional.
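
The @samp{typeId} attribute can be used in the same way on the other
kinds of rule. In the following sketch the pattern and the file names
are hypothetical:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri pattern="*.xhtml" typeId="XHTML"/>
  <documentElement localName="html" typeId="XHTML"/>
  <typeId id="XHTML" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
Both rules refer to the type identifier @samp{XHTML}, so changing the
schema used for XHTML documents only requires editing the single
@samp{typeId} element.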
762
763 @node Using multiple schema locating files
764 @subsection Using multiple schema locating files
765
766 The @samp{include} element includes rules from another
767 schema locating file. The behavior is exactly as if the rules from
768 that file were included in place of the @samp{include} element.
769 Relative URIs are resolved into absolute URIs before the inclusion is
770 performed. For example,
771
772 @example
773 <include rules="../rules.xml"/>
774 @end example
775
776 @noindent
777 includes the rules from @samp{rules.xml}.
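
Because the included rules take the place of the @samp{include}
element, its position determines their precedence. As a sketch, with
hypothetical file names:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
  <include rules="../rules.xml"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
Here the @samp{uri} rule is tried before any rule from
@samp{../rules.xml}, and the @samp{namespace} rule is tried only if
none of the included rules matches.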
778
779 The process of locating a schema takes as input a list of schema
780 locating files. The rules in all these files and in the files they
781 include are resolved into a single list of rules, which are applied
782 strictly in order. Sometimes this order is not what is needed.
783 For example, suppose you have two schema locating files, a private
784 file
785
786 @example
787 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
788 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
789 </locatingRules>
790 @end example
791
792 @noindent
793 followed by a public file
794
795 @example
796 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
797 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
798 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
799 </locatingRules>
800 @end example
801
802 @noindent
803 The effect of these two files is that the XHTML @samp{namespace}
804 rule takes precedence over the @samp{transformURI} rule, which
805 is almost certainly not what is needed. This can be solved by adding
806 an @samp{applyFollowingRules} element to the private file:
807
808 @example
809 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
810 <applyFollowingRules ruleType="transformURI"/>
811 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
812 </locatingRules>
813 @end example
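
Assuming that @samp{applyFollowingRules} applies, at its own position,
the rules of the named type that occur later in the combined list, the
two files together would then behave roughly as if the rules had been
written in this order:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example

@noindent
This is only a sketch of the intended effect: the
@samp{transformURI} rule is tried first, and the XHTML
@samp{namespace} rule is used only when no schema is found that way.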
814
815 @node DTDs
816 @chapter DTDs
817
818 nxml-mode is designed to support the creation of standalone XML
819 documents that do not depend on a DTD. Although it is common practice
820 to insert a DOCTYPE declaration referencing an external DTD, this has
821 undesirable side-effects. It means that the document is no longer
822 self-contained. It also means that different XML parsers may interpret
823 the document in different ways, since the XML Recommendation does not
824 require XML parsers to read the DTD. With DTDs, it was impractical to
825 get validation without using an external DTD or reference to a
826 parameter entity. With RELAX NG and other schema languages, you can
827 simultaneously get the benefits of validation and standalone XML
828 documents. Therefore, I recommend that you do not reference an
829 external DOCTYPE in your XML documents.
830
831 One problem is entities for characters. Typically, as well as
832 providing validation, DTDs also provide a set of character entities
833 for documents to use. Schemas cannot provide this functionality,
834 because schema validation happens after XML parsing. The recommended
835 solution is to either use the Unicode characters directly, or, if this
836 is impractical, use character references. nXML mode supports this by
837 providing commands for entering characters and character references
838 using the Unicode names, and can display the glyph corresponding to a
839 character reference.
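
For example, instead of relying on a DTD to declare an entity for the
copyright sign, a document can use a character reference to U+00A9,
written either in hexadecimal or in decimal:

@example
<p>Copyright &#xA9; 2010</p>
<p>Copyright &#169; 2010</p>
@end example

@noindent
Both lines have the same content as a line containing the
@copyright{} character itself, and neither depends on any
declaration outside the document.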
840
841 @node Limitations
842 @chapter Limitations
843
844 nXML mode has some limitations:
845
846 @itemize @bullet
847 @item
848 DTD support is limited. Internal parsed general entities declared
849 in the internal subset are supported provided they do not contain
850 elements. Other usage of DTDs is ignored.
851 @item
852 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
853 specification are not enforced.
854 @item
855 Unicode support has problems. This stems mostly from the fact that
856 the XML (and RELAX NG) character model is based squarely on Unicode,
857 whereas the Emacs character model is not. Emacs 22 is slated to have
858 full Unicode support, which should improve the situation here.
859 @end itemize
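
As an illustration of the first limitation, the following document
uses a DTD feature that falls within the supported subset. The element
and entity names are hypothetical.

@example
<!DOCTYPE doc [
<!ENTITY product "nXML mode">
]>
<doc>
<para>This document describes &product;.</para>
</doc>
@end example

@noindent
The entity @samp{product} is an internal parsed general entity, it is
declared in the internal subset, and its replacement text contains no
elements. An entity whose replacement text contained markup, or one
declared in an external DTD, would fall outside what is supported.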
860
861 @bye
862
863 @ignore
864 arch-tag: 3b6e8ac2-ae8d-4f38-bd43-ce9f80be04d6
865 @end ignore