1 \input texinfo @c -*- texinfo -*-
2 @c %**start of header
3 @setfilename ../../info/nxml-mode
4 @settitle nXML Mode
5 @c %**end of header
6
7 @copying
8 This manual documents nXML mode, an Emacs major mode for editing
9 XML with RELAX NG support.
10
11 Copyright @copyright{} 2007--2013 Free Software Foundation, Inc.
12
13 @quotation
14 Permission is granted to copy, distribute and/or modify this document
15 under the terms of the GNU Free Documentation License, Version 1.3 or
16 any later version published by the Free Software Foundation; with no
17 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
18 and with the Back-Cover Texts as in (a) below. A copy of the license
19 is included in the section entitled ``GNU Free Documentation License''.
20
21 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
22 modify this GNU manual.''
23 @end quotation
24 @end copying
25
26 @dircategory Emacs editing modes
27 @direntry
28 * nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
29 @end direntry
30
31 @node Top
32 @top nXML Mode
33
34 @insertcopying
35
36 This manual is not yet complete.
37
38 @menu
39 * Introduction::
40 * Completion::
41 * Inserting end-tags::
42 * Paragraphs::
43 * Outlining::
44 * Locating a schema::
45 * DTDs::
46 * Limitations::
47 * GNU Free Documentation License:: The license for this documentation.
48 @end menu
49
50 @node Introduction
51 @chapter Introduction
52
53 nXML mode is an Emacs major mode for editing XML documents. It supports
54 editing well-formed XML documents, and provides schema-sensitive editing
55 using RELAX NG Compact Syntax. To get started, visit a file containing an
56 XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
57 mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
58 put buffers in nXML mode if they have recognizable XML content or file
59 extensions. You may wish to customize the settings, for example to
60 recognize different file extensions.
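
For example, the following init-file line is a minimal sketch of such a
customization. The @samp{.xconf} extension is purely hypothetical;
substitute whatever extension your XML files actually use.

@example
;; Hypothetical extension: visit files ending in .xconf in nXML mode.
(add-to-list 'auto-mode-alist '("\\.xconf\\'" . nxml-mode))
@end example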
61
62 Once in nXML mode, you can type @kbd{C-h m} for basic information on the
63 mode.
64
65 The @file{etc/nxml} directory in the Emacs distribution contains some data
66 files used by nXML mode, and includes two files (@file{test-valid.xml} and
67 @file{test-invalid.xml}) that provide examples of valid and invalid XML
68 documents.
69
70 To get validation and schema-sensitive editing, you need a RELAX NG Compact
71 Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
72 @file{etc/schema} directory includes some schemas for popular document
73 types. See @url{http://relaxng.org/} for more information on RELAX NG@.
74 You can use the @samp{Trang} program from
75 @url{http://www.thaiopensource.com/relaxng/trang.html} to
76 automatically create RNC schemas. This program can:
77
78 @itemize @bullet
79 @item
80 infer an RNC schema from an instance document;
81 @item
82 convert a DTD to an RNC schema;
83 @item
84 convert a RELAX NG XML syntax schema to an RNC schema.
85 @end itemize
86
87 @noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
88 one, you can also use the XSLT stylesheet from
89 @url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}.
90 @ignore
91 @c Original location, now defunct.
92 @url{http://www.pantor.com/download.html}.
93 @end ignore
94
95 To convert a W3C XML Schema to an RNC schema, you need first to convert it
96 to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
97 (built on top of MSV). See @url{https://github.com/kohsuke/msv}
98 and @url{https://msv.dev.java.net/}.
99
100 For historical discussions only, see the mailing list archives at
101 @url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please make all new
102 discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
103 lists. Report any bugs with @kbd{M-x report-emacs-bug}.
104
105
106 @node Completion
107 @chapter Completion
108
109 Apart from real-time validation, the most important feature that nXML
110 mode provides for assisting in document creation is ``completion''.
111 Completion assists the user in inserting characters at point, based on
112 knowledge of the schema and on the contents of the buffer before
113 point.
114
115 nXML mode adapts the standard GNU Emacs command for completion in a
116 buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
117 @kbd{M-@key{TAB}}. Note that many window systems and window managers
118 use @kbd{M-@key{TAB}} themselves (typically for switching between
119 windows) and do not pass it to applications. In that case, you should
120 type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
121 @code{completion-at-point} to a key that is convenient for you. In
122 the following, I will assume that you type @kbd{C-M-i}.
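
As a sketch of such a binding, the following init-file form binds
@code{completion-at-point} in nXML mode buffers. The choice of
@key{F5} is only an assumption for illustration; any key that your
window system passes through to Emacs will do.

@example
;; Bind completion to F5 once nXML mode has been loaded.
;; F5 is an arbitrary choice, not an nXML mode default.
(eval-after-load 'nxml-mode
  '(define-key nxml-mode-map (kbd "<f5>") 'completion-at-point))
@end example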
123
124 nXML mode completion works by examining the symbol preceding point.
125 This is the symbol to be completed. The symbol to be completed may be
126 empty. Completion considers what symbols starting with the symbol
127 to be completed would be valid replacements for the symbol to be
128 completed, given the schema and the contents of the buffer before
129 point. These symbols are the possible completions. An example may
130 make this clearer. Suppose the buffer looks like this (where @point{}
131 indicates point):
132
133 @example
134 <html xmlns="http://www.w3.org/1999/xhtml">
135 <h@point{}
136 @end example
137
138 @noindent
139 and the schema is XHTML@. In this context, the symbol to be completed
140 is @samp{h}. The possible completions consist of just
141 @samp{head}. Another example is:
142
143 @example
144 <html xmlns="http://www.w3.org/1999/xhtml">
145 <head>
146 <@point{}
147 @end example
148
149 @noindent
150 In this case, the symbol to be completed is empty, and the possible
151 completions are @samp{base}, @samp{isindex},
152 @samp{link}, @samp{meta}, @samp{script},
153 @samp{style}, @samp{title}. Another example is:
154
155 @example
156 <html xmlns="@point{}
157 @end example
158
159 @noindent
160 In this case, the symbol to be completed is empty, and the possible
161 completions are just @samp{http://www.w3.org/1999/xhtml}.
162
163 When you type @kbd{C-M-i}, what happens depends
164 on the set of possible completions.
165
166 @itemize @bullet
167 @item
168 If the set of completions is empty, nothing
169 happens.
170 @item
171 If there is one possible completion, then that completion is
172 inserted, together with any following characters that are
173 required. For example, in this case:
174
175 @example
176 <html xmlns="http://www.w3.org/1999/xhtml">
177 <@point{}
178 @end example
179
180 @noindent
181 @kbd{C-M-i} will yield
182
183 @example
184 <html xmlns="http://www.w3.org/1999/xhtml">
185 <head@point{}
186 @end example
187 @item
188 If there is more than one possible completion, but all
189 possible completions share a common non-empty prefix, then that prefix
190 is inserted. For example, suppose the buffer is:
191
192 @example
193 <html x@point{}
194 @end example
195
196 @noindent
197 The symbol to be completed is @samp{x}. The possible completions are
198 @samp{xmlns} and @samp{xml:lang}. These share a common prefix of
199 @samp{xml}. Thus, @kbd{C-M-i} will yield:
200
201 @example
202 <html xml@point{}
203 @end example
204
205 @noindent
206 Typically, you would do @kbd{C-M-i} again, which would have the result
207 described in the next item.
208 @item
209 If there is more than one possible completion, but the
210 possible completions do not share a non-empty prefix, then Emacs will
211 prompt you to input the symbol in the minibuffer, initializing the
212 minibuffer with the symbol to be completed, and popping up a buffer
213 showing the possible completions. You can now input the symbol to be
214 inserted. The symbol you input will be inserted in the buffer instead
215 of the symbol to be completed. Emacs will then insert any required
216 characters after the symbol. For example, if the buffer contains:
217
218 @example
219 <html xml@point{}
220 @end example
221
222 @noindent
223 Emacs will prompt you in the minibuffer with
224
225 @example
226 Attribute: xml@point{}
227 @end example
228
229 @noindent
230 and the buffer showing possible completions will contain
231
232 @example
233 Possible completions are:
234 xml:lang xmlns
235 @end example
236
237 @noindent
238 If you input @kbd{xmlns}, the result will be:
239
240 @example
241 <html xmlns="@point{}
242 @end example
243
244 @noindent
245 (If you do @kbd{C-M-i} again, the namespace URI will be
246 inserted. Should that happen automatically?)
247 @end itemize
248
249 @node Inserting end-tags
250 @chapter Inserting end-tags
251
252 The main redundancy in XML syntax is end-tags. nXML mode provides
253 several ways to make it easier to enter end-tags. You can use all of
254 these without a schema.
255
256 You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
257 end-tag.
258
259 @kbd{C-c C-f} inserts an end-tag for the element containing
260 point. This command is useful when you want to input the start-tag,
261 then input the content and finally input the end-tag. The @samp{f}
262 is mnemonic for finish.
263
264 If you want to keep tags balanced and input the end-tag at the
265 same time as the start-tag, before inputting the content, then you can
266 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
267 the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
268 is similar but more convenient for block-level elements: it puts the
269 start-tag, point and the end-tag on successive lines, appropriately
270 indented. The @samp{i} is mnemonic for inline and the
271 @samp{b} is mnemonic for block.
272
273 Finally, you can customize nXML mode so that @kbd{/} automatically
274 inserts the rest of the end-tag when it occurs after @samp{<}, by
275 doing
276
277 @display
278 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
279 @end display
280
281 @noindent
282 and then following the instructions in the displayed buffer.
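
If you prefer to set the option directly in your init file, a form
along the following lines should have the same effect as enabling
the option through Customize:

@example
;; Make `/' typed after `<' insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example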
283
284 @node Paragraphs
285 @chapter Paragraphs
286
287 Emacs has several commands that operate on paragraphs, most
288 notably @kbd{M-q}. nXML mode redefines these to work in a way
289 that is useful for XML@. The exact rules that are used to find the
290 beginning and end of a paragraph are complicated; they are designed
291 mainly to ensure that @kbd{M-q} does the right thing.
292
293 A paragraph consists of one or more complete, consecutive lines.
294 A group of lines is not considered a paragraph unless it contains some
295 non-whitespace characters between tags or inside comments. A blank
296 line separates paragraphs. A single tag on a line by itself also
297 separates paragraphs. More precisely, if one tag together with any
298 leading and trailing whitespace completely occupy one or more lines,
299 then those lines will not be included in any paragraph.
300
301 A start-tag at the beginning of the line (possibly indented) may
302 be treated as starting a paragraph. Similarly, an end-tag at the end
303 of the line may be treated as ending a paragraph. The following rules
304 are used to determine whether such a tag is in fact treated as a
305 paragraph boundary:
306
307 @itemize @bullet
308 @item
309 If the schema does not allow text at that point, then it
310 is a paragraph boundary.
311 @item
312 If the end-tag corresponding to the start-tag is not at
313 the end of its line, or the start-tag corresponding to the end-tag is
314 not at the beginning of its line, then it is not a paragraph
315 boundary. For example, in
316
317 @example
318 <p>This is a paragraph with an
319 <emph>emphasized</emph> phrase.
320 @end example
321
322 @noindent
323 the @samp{<emph>} start-tag would not be considered as
324 starting a paragraph, because its corresponding end-tag is not at the
325 end of the line.
326 @item
327 If there is text that is a sibling in the element tree, then
328 it is not a paragraph boundary. For example, in
329
330 @example
331 <p>This is a paragraph with an
332 <emph>emphasized phrase that takes one source line</emph>
333 @end example
334
335 @noindent
336 the @samp{<emph>} start-tag would not be considered as
337 starting a paragraph, even though its end-tag is at the end of its
338 line, because the text @samp{This is a paragraph with an}
339 is a sibling of the @samp{emph} element.
340 @item
341 Otherwise, it is a paragraph boundary.
342 @end itemize
343
344 @node Outlining
345 @chapter Outlining
346
347 nXML mode allows you to display all or part of a buffer as an
348 outline, in a similar way to Emacs's outline mode. An outline in nXML
349 mode is based on recognizing two kinds of element: sections and
350 headings. There is one heading for every section and one section for
351 every heading. A section contains its heading as or within its first
352 child element. A section also contains its subordinate sections (its
353 subsections). The text content of a section consists of anything in a
354 section that is neither a subsection nor a heading.
355
356 Note that this is a different model from that used by XHTML@.
357 nXML mode's outline support will not be useful for XHTML unless you
358 adopt a convention of adding a @code{div} to enclose each
359 section, rather than having sections implicitly delimited by different
360 @code{h@var{n}} elements. This limitation may be removed
361 in a future version.
362
363 The variable @code{nxml-section-element-name-regexp} gives
364 a regexp for the local names (i.e., the part of the name following any
365 prefix) of section elements. The variable
366 @code{nxml-heading-element-name-regexp} gives a regexp for the
367 local names of heading elements. For an element to be recognized
368 as a section
369
370 @itemize @bullet
371 @item
372 its start-tag must occur at the beginning of a line
373 (possibly indented);
374 @item
375 its local name must match
376 @code{nxml-section-element-name-regexp};
377 @item
378 either its first child element or a descendant of that
379 first child element must have a local name that matches
380 @code{nxml-heading-element-name-regexp}; the first such element
381 is treated as the section's heading.
382 @end itemize
383
384 @noindent
385 You can customize these variables using @kbd{M-x
386 customize-variable}.
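
Alternatively, the variables can be set in your init file. The
following is only a sketch: the element names @samp{topic} and
@samp{label} are hypothetical and stand for whatever your document
type uses. Note that @code{setq} replaces the existing values rather
than adding to them.

@example
;; Hypothetical element names; adjust to your own document type.
(setq nxml-section-element-name-regexp "topic\\|section"
      nxml-heading-element-name-regexp "label\\|title")
@end example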
387
388 There are three possible outline states for a section:
389
390 @itemize @bullet
391 @item
392 normal, showing everything, including its heading, text
393 content and subsections; each subsection is displayed according to the
394 state of that subsection;
395 @item
396 showing just its heading, with both its text content and
397 its subsections hidden; all subsections are hidden regardless of their
398 state;
399 @item
400 showing its heading and its subsections, with its text
401 content hidden; each subsection is displayed according to the state of
402 that subsection.
403 @end itemize
404
405 In the last two states, where the text content is hidden, the
406 heading is displayed specially, in an abbreviated form. An element
407 like this:
408
409 @example
410 <section>
411 <title>Food</title>
412 <para>There are many kinds of food.</para>
413 </section>
414 @end example
415
416 @noindent
417 would be displayed on a single line like this:
418
419 @example
420 <-section>Food...</>
421 @end example
422
423 @noindent
424 If there are hidden subsections, then a @code{+} will be used
425 instead of a @code{-} like this:
426
427 @example
428 <+section>Food...</>
429 @end example
430
431 @noindent
432 If there are non-hidden subsections, then the section will instead be
433 displayed like this:
434
435 @example
436 <-section>Food...
437 <-section>Delicious Food...</>
438 <-section>Distasteful Food...</>
439 </-section>
440 @end example
441
442 @noindent
443 The heading is always displayed with an indent that corresponds to its
444 depth in the outline, even if it is not actually indented in the buffer.
445 The variable @code{nxml-outline-child-indent} controls how much
446 a subheading is indented with respect to its parent heading when the
447 heading is being displayed specially.
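
For instance, assuming you want each level of specially displayed
headings indented by four columns (the value 4 is just an
illustration):

@example
;; Indent each outline level by 4 columns relative to its parent.
(setq nxml-outline-child-indent 4)
@end example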
448
449 Commands to change the outline state of sections are bound to
450 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
451 mnemonic for outline). The third and final key has been chosen to be
452 consistent with outline mode. In the following descriptions,
453 @dfn{current section} means the section containing point, or, more precisely,
454 the innermost section containing the character immediately following
455 point.
456
457 @itemize @bullet
458 @item
459 @kbd{C-c C-o C-a} shows all sections in the buffer
460 normally.
461 @item
462 @kbd{C-c C-o C-t} hides the text content
463 of all sections in the buffer.
464 @item
465 @kbd{C-c C-o C-c} hides the text content
466 of the current section.
467 @item
468 @kbd{C-c C-o C-e} shows the text content
469 of the current section.
470 @item
471 @kbd{C-c C-o C-d} hides the text content
472 and subsections of the current section.
473 @item
474 @kbd{C-c C-o C-s} shows the current section
475 and all its direct and indirect subsections normally.
476 @item
477 @kbd{C-c C-o C-k} shows the headings of the
478 direct and indirect subsections of the current section.
479 @item
480 @kbd{C-c C-o C-l} hides the text content of the
481 current section and of its direct and indirect
482 subsections.
483 @item
484 @kbd{C-c C-o C-i} shows the headings of the
485 direct subsections of the current section.
486 @item
487 @kbd{C-c C-o C-o} hides as much as possible without
488 hiding the current section's text content; the headings of ancestor
489 sections of the current section and their child sections will
490 not be hidden.
491 @end itemize
492
493 When a heading is displayed specially, you can use
494 @key{RET} in that heading to show the text content of the section
495 in the same way as @kbd{C-c C-o C-e}.
496
497 You can also use the mouse to change the outline state:
498 @kbd{S-mouse-2} hides the text content of a section in the same
499 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
500 displayed heading shows the text content of the section in the same
501 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
502 displayed start-tag toggles the display of subheadings on and
503 off.
504
505 The outline state for each section is stored with the first
506 character of the section (as a text property). Every command that
507 changes the outline state of any section updates the display of the
508 buffer so that each section is displayed correctly according to its
509 outline state. If the section structure is subsequently changed, then
510 it is possible for the display to no longer correctly reflect the
511 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
512 the display so it is correct again.
513
514 @node Locating a schema
515 @chapter Locating a schema
516
517 nXML mode has a configurable set of rules to locate a schema for
518 the file being edited. The rules are contained in one or more schema
519 locating files, which are XML documents.
520
521 The variable @samp{rng-schema-locating-files} specifies
522 the list of the file-names of schema locating files that nXML mode
523 should use. The order of the list is significant: when file
524 @var{x} occurs in the list before file @var{y} then rules
525 from file @var{x} have precedence over rules from file
526 @var{y}. A filename specified in
527 @samp{rng-schema-locating-files} may be relative. If so, it will
528 be resolved relative to the document for which a schema is being
529 located. It is not an error if relative file-names in
530 @samp{rng-schema-locating-files} do not exist. You can use
531 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
532 @key{RET}} to customize the list of schema locating
533 files.
534
535 By default, the @samp{rng-schema-locating-files} list has two
536 members: @samp{schemas.xml}, and
537 @samp{@var{dist-dir}/schema/schemas.xml} where
538 @samp{@var{dist-dir}} is the directory containing the nXML
539 distribution. The first member will cause nXML mode to use a file
540 @samp{schemas.xml} in the same directory as the document being
541 edited if such a file exists. The second member contains rules for the
542 schemas that are included with the nXML distribution.
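
As an illustration, the following init-file form appends one more
schema locating file to the list, so that its rules have the lowest
precedence. The file name @file{~/schemas/schemas.xml} is hypothetical,
and the sketch assumes that the variable is defined by the
@code{rng-loc} library.

@example
;; Append a personal schema locating file (hypothetical path) after
;; the default members, giving its rules the lowest precedence.
(eval-after-load 'rng-loc
  '(add-to-list 'rng-schema-locating-files
                (expand-file-name "~/schemas/schemas.xml")
                t))
@end example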
543
544 @menu
545 * Commands for locating a schema::
546 * Schema locating files::
547 @end menu
548
549 @node Commands for locating a schema
550 @section Commands for locating a schema
551
552 The command @kbd{C-c C-s C-w} will tell you what schema
553 is currently being used.
554
555 The rules for locating a schema are applied automatically when
556 you visit a file in nXML mode. However, if you have just created a new
557 file and the schema cannot be inferred from the file-name, then this
558 will not locate the right schema. In this case, you should insert the
559 start-tag of the root element and then use the command @kbd{C-c C-s
560 C-a}, which reapplies the rules based on the current content of
561 the document. It is usually not necessary to insert the complete
562 start-tag; often just @samp{<@var{name}} is
563 enough.
564
565 If you want to use a schema that has not yet been added to the
566 schema locating files, you can use the command @kbd{C-c C-s C-f}
567 to manually select the file containing the schema for the document in
568 the current buffer. Emacs will read the file-name of the schema from the
569 minibuffer. After reading the file-name, Emacs will ask whether you
570 wish to add a rule to a schema locating file that persistently
571 associates the document with the selected schema. The rule will be
572 added to the first file in the list specified by
573 @samp{rng-schema-locating-files}; it will create the file if
574 necessary, but will not create a directory. If the variable
575 @samp{rng-schema-locating-files} has not been customized, this
576 means that the rule will be added to the file @samp{schemas.xml}
577 in the same directory as the document being edited.
578
579 The command @kbd{C-c C-s C-t} allows you to select a schema by
580 specifying an identifier for the type of the document. The schema
581 locating files determine the available type identifiers and what
582 schema is used for each type identifier. This is useful when it is
583 impossible to infer the right schema from either the file-name or the
584 content of the document, even though the schema is already in the
585 schema locating file. A situation in which this can occur is when
586 there are multiple variants of a schema where all valid documents have
587 the same document element. For example, XHTML has Strict and
588 Transitional variants. In a situation like this, a schema locating file
589 can define a type identifier for each variant. As with @kbd{C-c
590 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
591 locating file that persistently associates the document with the
592 specified type identifier.
593
594 The command @kbd{C-c C-s C-l} adds a rule to a schema
595 locating file that persistently associates the document with
596 the schema that is currently being used.
597
598 @node Schema locating files
599 @section Schema locating files
600
601 Each schema locating file specifies a list of rules. The rules
602 from each file are appended in order. To locate a schema, each rule is
603 applied in turn until a rule matches. The first matching rule is then
604 used to determine the schema.
605
606 Schema locating files are designed to be useful for other
607 applications that need to locate a schema for a document. In fact,
608 there is nothing specific to locating schemas in the design; it could
609 equally well be used for locating a stylesheet.
610
611 @menu
612 * Schema locating file syntax basics::
613 * Using the document's URI to locate a schema::
614 * Using the document element to locate a schema::
615 * Using type identifiers in schema locating files::
616 * Using multiple schema locating files::
617 @end menu
618
619 @node Schema locating file syntax basics
620 @subsection Schema locating file syntax basics
621
622 There is a schema for schema locating files in the file
623 @samp{locate.rnc} in the schema directory. Schema locating
624 files must be valid with respect to this schema.
625
626 The document element of a schema locating file must be
627 @samp{locatingRules} and the namespace URI must be
628 @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
629 children of the document element specify rules. The order of the
630 children is the same as the order of the rules. Here's a complete
631 example of a schema locating file:
632
633 @example
634 <?xml version="1.0"?>
635 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
636 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
637 <documentElement localName="book" uri="docbook.rnc"/>
638 </locatingRules>
639 @end example
640
641 @noindent
642 This says to use the schema @samp{xhtml.rnc} for a document with
643 namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
644 schema @samp{docbook.rnc} for a document whose document element has the local name
645 @samp{book}. If the document element had both a namespace URI
646 of @samp{http://www.w3.org/1999/xhtml} and a local name of
647 @samp{book}, then the matching rule that comes first will be
648 used and so the schema @samp{xhtml.rnc} would be used. There is
649 no precedence between different types of rule; the first matching rule
650 of any type is used.
651
652 As usual with XML-related technologies, resources are identified
653 by URIs. The @samp{uri} attribute identifies the schema by
654 specifying the URI@. The URI may be relative. If so, it is resolved
655 relative to the URI of the schema locating file that contains
656 the attribute. This means that if the value of the @samp{uri} attribute
657 does not contain a @samp{/}, then it will refer to a filename in
658 the same directory as the schema locating file.
659
660 @node Using the document's URI to locate a schema
661 @subsection Using the document's URI to locate a schema
662
663 A @samp{uri} rule locates a schema based on the URI of the
664 document. The @samp{uri} attribute specifies the URI of the
665 schema. The @samp{resource} attribute can be used to specify
666 the schema for a particular document. For example,
667
668 @example
669 <uri resource="spec.xml" uri="docbook.rnc"/>
670 @end example
671
672 @noindent
673 specifies that the schema for @samp{spec.xml} is
674 @samp{docbook.rnc}.
675
676 The @samp{pattern} attribute can be used instead of the
677 @samp{resource} attribute to specify the schema for any document
678 whose URI matches a pattern. The pattern has the same syntax as an
679 absolute or relative URI except that the path component of the URI can
680 use a @samp{*} character to stand for zero or more characters
681 within a path segment (i.e., any character other than @samp{/}).
682 Typically, the URI pattern looks like a relative URI, but, whereas a
683 relative URI in the @samp{resource} attribute is resolved into a
684 particular absolute URI using the base URI of the schema locating
685 file, a relative URI pattern matches if it matches some number of
686 complete path segments of the document's URI ending with the last path
687 segment of the document's URI@. For example,
688
689 @example
690 <uri pattern="*.xsl" uri="xslt.rnc"/>
691 @end example
692
693 @noindent
694 specifies that the schema for documents with a URI whose path ends
695 with @samp{.xsl} is @samp{xslt.rnc}.
696
697 A @samp{transformURI} rule locates a schema by
698 transforming the URI of the document. The @samp{fromPattern}
699 attribute specifies a URI pattern with the same meaning as the
700 @samp{pattern} attribute of the @samp{uri} element. The
701 @samp{toPattern} attribute is a URI pattern that is used to
702 generate the URI of the schema. Each @samp{*} in the
703 @samp{toPattern} is replaced by the string that matched the
704 corresponding @samp{*} in the @samp{fromPattern}. The
705 resulting string is appended to the initial part of the document's URI
706 that was not explicitly matched by the @samp{fromPattern}. The
707 rule matches only if the transformed URI identifies an existing
708 resource. For example, the rule
709
710 @example
711 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
712 @end example
713
714 @noindent
715 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
716 into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
717 rule specifies that to locate a schema for a document
718 @samp{@var{foo}.xml}, Emacs should test whether a file
719 @samp{@var{foo}.rnc} exists in the same directory as
720 @samp{@var{foo}.xml}, and, if so, should use it as the
721 schema.
722
723 @node Using the document element to locate a schema
724 @subsection Using the document element to locate a schema
725
726 A @samp{documentElement} rule locates a schema based on
727 the local name and prefix of the document element. For example, a rule
728
729 @example
730 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
731 @end example
732
733 @noindent
734 specifies that when the name of the document element is
735 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
736 as the schema. Either the @samp{prefix} or
737 @samp{localName} attribute may be omitted to allow any prefix or
738 local name.
739
740 A @samp{namespace} rule locates a schema based on the
741 namespace URI of the document element. For example, a rule
742
743 @example
744 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
745 @end example
746
747 @noindent
748 specifies that when the namespace URI of the document element is
749 @samp{http://www.w3.org/1999/XSL/Transform}, then
750 @samp{xslt.rnc} should be used as the schema.
751
752 @node Using type identifiers in schema locating files
753 @subsection Using type identifiers in schema locating files
754
755 Type identifiers allow a level of indirection in locating the
756 schema for a document. Instead of associating the document directly
757 with a schema URI, the document is associated with a type identifier,
758 which is in turn associated with a schema URI@. nXML mode does not
759 constrain the format of type identifiers. They can be simply strings
760 without any formal structure or they can be public identifiers or
761 URIs. Note that these type identifiers have nothing to do with the
762 DOCTYPE declaration. When comparing type identifiers, whitespace is
763 normalized in the same way as with the @samp{xsd:token}
764 datatype: leading and trailing whitespace is stripped; other sequences
765 of whitespace are normalized to a single space character.
766
767 Each of the rules described in previous sections that uses a
768 @samp{uri} attribute to specify a schema, can instead use a
769 @samp{typeId} attribute to specify a type identifier. The type
770 identifier can be associated with a URI using a @samp{typeId}
771 element. For example,
772
773 @example
774 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
775 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
776 <typeId id="XHTML" typeId="XHTML Strict"/>
777 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
778 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
779 </locatingRules>
780 @end example
781
782 @noindent
783 declares three type identifiers @samp{XHTML} (representing the
784 default variant of XHTML to be used), @samp{XHTML Strict} and
785 @samp{XHTML Transitional}. Such a schema locating file would
786 use @samp{xhtml-strict.rnc} for a document whose namespace is
787 @samp{http://www.w3.org/1999/xhtml}. But it is considerably
788 more flexible than a schema locating file that simply specified
789
790 @example
791 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
792 @end example
793
794 @noindent
795 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
796 Strict and XHTML Transitional. Also, a user can easily add a catalog
797
798 @example
799 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
800 <typeId id="XHTML" typeId="XHTML Transitional"/>
801 </locatingRules>
802 @end example
803
804 @noindent
805 that makes the default variant of XHTML be XHTML Transitional.
806
807 @node Using multiple schema locating files
808 @subsection Using multiple schema locating files
809
810 The @samp{include} element includes rules from another
811 schema locating file. The behavior is exactly as if the rules from
812 that file were included in place of the @samp{include} element.
813 Relative URIs are resolved into absolute URIs before the inclusion is
814 performed. For example,
815
816 @example
817 <include rules="../rules.xml"/>
818 @end example
819
820 @noindent
821 includes the rules from @samp{rules.xml}.
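
As an illustration of this in-place behavior, consider a hypothetical
including file. The two @samp{namespace} rules and the schema file
names are made up for the example. Since rules are applied in order,
the rules from @samp{../rules.xml} would be considered after the XHTML
rule and before the SVG rule:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- Rules from ../rules.xml take effect at this position. -->
  <include rules="../rules.xml"/>
  <namespace ns="http://www.w3.org/2000/svg" uri="svg.rnc"/>
</locatingRules>
@end example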
822
823 The process of locating a schema takes as input a list of schema
824 locating files. The rules in all these files and in the files they
825 include are resolved into a single list of rules, which are applied
826 strictly in order. Sometimes this order is not what is needed.
827 For example, suppose you have two schema locating files, a private
828 file
829
830 @example
831 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
832 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
833 </locatingRules>
834 @end example
835
836 @noindent
837 followed by a public file
838
839 @example
840 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
841 <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
842 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
843 </locatingRules>
844 @end example
845
846 @noindent
847 The effect of these two files is that the XHTML @samp{namespace}
848 rule takes precedence over the @samp{transformURI} rule, which
849 is almost certainly not what is needed. This can be solved by adding
850 an @samp{applyFollowingRules} element to the private file:
851
852 @example
853 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
854 <applyFollowingRules ruleType="transformURI"/>
855 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
856 </locatingRules>
857 @end example
858
859 @node DTDs
860 @chapter DTDs
861
862 nXML mode is designed to support the creation of standalone XML
863 documents that do not depend on a DTD@. Although it is common practice
864 to insert a DOCTYPE declaration referencing an external DTD, this has
865 undesirable side-effects. It means that the document is no longer
866 self-contained. It also means that different XML parsers may interpret
867 the document in different ways, since the XML Recommendation does not
868 require XML parsers to read the DTD@. With DTDs, it was impractical to
869 get validation without using an external DTD or reference to a
870 parameter entity. With RELAX NG and other schema languages, you can
871 simultaneously get the benefits of validation and standalone XML
872 documents. Therefore, I recommend that you do not reference an
873 external DTD in your XML documents.
874
875 One problem is entities for characters. Typically, as well as
876 providing validation, DTDs also provide a set of character entities
877 for documents to use. Schemas cannot provide this functionality,
878 because schema validation happens after XML parsing. The recommended
879 solution is to either use the Unicode characters directly, or, if this
880 is impractical, use character references. nXML mode supports this by
881 providing commands for entering characters and character references
882 using the Unicode names, and can display the glyph corresponding to a
883 character reference.
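
For instance, a document that would otherwise rely on a DTD-defined
entity such as @samp{&eacute;} can refer to the same character
(U+00E9, LATIN SMALL LETTER E WITH ACUTE) with a character reference
instead. The element name @samp{p} is only a placeholder:

@example
<p>caf&#xE9;</p>   <!-- hexadecimal character reference -->
<p>caf&#233;</p>   <!-- equivalent decimal character reference -->
@end example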
884
885 @node Limitations
886 @chapter Limitations
887
888 nXML mode has some limitations:
889
890 @itemize @bullet
891 @item
892 DTD support is limited. Internal parsed general entities declared
893 in the internal subset are supported provided they do not contain
894 elements. Other usage of DTDs is ignored.
895 @item
896 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
897 specification are not enforced.
898 @end itemize
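
As a sketch of the first limitation, a document along the following
lines declares an internal parsed general entity in the internal
subset, with replacement text that contains no elements. This is the
case described above as supported. The names @samp{doc} and
@samp{product} are invented for the example:

@example
<!DOCTYPE doc [
<!ENTITY product "nXML mode">
]>
<doc>&product; is an Emacs major mode.</doc>
@end example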
899
900 @node GNU Free Documentation License
901 @appendix GNU Free Documentation License
902 @include doclicense.texi
903
904 @bye