\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode
@settitle nXML Mode
@c %**end of header

@copying
This manual documents nXML mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007-2012 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU
Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
license is included in the section entitled ``GNU Free Documentation
License'' in the Emacs manual.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''

This document is part of a collection distributed under the GNU Free
Documentation License. If you want to distribute this document
separately from the collection, you can do so by adding a copy of the
license to the document, as described in section 6 of the license.
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
@end direntry

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major-mode for editing XML documents. It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax. To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions. You may wish to customize the settings, for example to
recognize different file extensions.

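As an illustration of such a customization, the following init-file
sketch associates two additional extensions with nXML mode. The
extensions @samp{.xsd} and @samp{.wsdl} are only assumed examples;
substitute whichever extensions you actually use.

@example
;; Hypothetical extensions; adjust to the files you edit.
(add-to-list 'auto-mode-alist '("\\.xsd\\'" . nxml-mode))
(add-to-list 'auto-mode-alist '("\\.wsdl\\'" . nxml-mode))
@end example
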
Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
@file{etc/schema} directory includes some schemas for popular document
types. See @url{http://relaxng.org/} for more information on RELAX NG.
You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas. This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{http://www.pantor.com/download.html}.

To convert a W3C XML Schema to an RNC schema, you need first to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please make all new
discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.


@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that nXML
mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

nXML mode adapts the standard GNU Emacs command for completion in a
buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
@kbd{M-@key{TAB}}. Note that many window systems and window managers
use @kbd{M-@key{TAB}} themselves (typically for switching between
windows) and do not pass it to applications. In that case, you should
type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
@code{completion-at-point} to a key that is convenient for you. In
the following, I will assume that you type @kbd{C-M-i}.

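If you prefer a separate key, a binding along the following lines is
one possibility. This is only a sketch: it assumes that @key{F5} is
unused in your setup, and that the mode's keymap is
@code{nxml-mode-map}.

@example
;; Assumes <f5> is free; any convenient key works.
(with-eval-after-load 'nxml-mode
  (define-key nxml-mode-map (kbd "<f5>") #'completion-at-point))
@end example
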
nXML mode completion works by examining the symbol preceding point.
This is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol
to be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are just @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-M-i}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.
@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-M-i} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example
@item
If there is more than one possible completion, but all
possible completions share a common non-empty prefix, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions are
@samp{xmlns} and @samp{xml:lang}. These share a common prefix of
@samp{xml}. Thus, @kbd{C-M-i} will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-M-i} again, which would have the result
described in the next item.
@item
If there is more than one possible completion, but the
possible completions do not share a non-empty prefix, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-M-i} again, the namespace URI will be
inserted. Should that happen automatically?)
@end itemize

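The prefix logic in the cases above can be tried out with the standard
Emacs Lisp function @code{try-completion}. This is only an
illustration of the prefix computation on hand-written candidate lists.
In nXML mode the candidates come from the schema, and nothing here is
meant to describe how nXML mode itself is implemented.

@example
;; Several candidates sharing a prefix: the prefix is returned.
(try-completion "x" '("xmlns" "xml:lang"))
     @result{} "xml"

;; A single candidate: the whole candidate is returned.
(try-completion "h" '("head"))
     @result{} "head"

;; No candidate starts with the symbol: nothing to insert.
(try-completion "q" '("xmlns" "xml:lang"))
     @result{} nil
@end example
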
@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nXML mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.

Finally, you can customize nXML mode so that @kbd{/} automatically
inserts the rest of the end-tag when it occurs after @samp{<}, by
doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.

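If you keep your settings in an init file rather than using the
Customize interface, the same option can presumably be enabled
directly, as in this minimal sketch:

@example
;; A non-nil value makes `/' after `<' complete the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example
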
@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupy one or more lines,
then those lines will not be included in any paragraph.

A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.
@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.
@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
@item
Otherwise, it is a paragraph boundary.
@end itemize

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs's outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e. the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);
@item
its local name must match
@code{nxml-section-element-name-regexp};
@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.

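As a sketch of what such a customization might look like in an init
file, the following sets both regexps for an in-house document type.
The element names @samp{chapter}, @samp{topic}, @samp{title} and
@samp{heading} are hypothetical, and replacing the values discards
whatever names the variables matched before, so consult each variable's
documentation string before adapting this.

@example
;; Hypothetical element names for an in-house document type.
(setq nxml-section-element-name-regexp "chapter\\|topic")
(setq nxml-heading-element-name-regexp "title\\|heading")
@end example
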
There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;
@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;
@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
  <-section>Delicious Food...</>
  <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
``current section'' means the section containing point, or, more precisely,
the innermost section containing the character immediately following
point.

@itemize @bullet
@item
@kbd{C-c C-o C-a} shows all sections in the buffer
normally.
@item
@kbd{C-c C-o C-t} hides the text content
of all sections in the buffer.
@item
@kbd{C-c C-o C-c} hides the text content
of the current section.
@item
@kbd{C-c C-o C-e} shows the text content
of the current section.
@item
@kbd{C-c C-o C-d} hides the text content
and subsections of the current section.
@item
@kbd{C-c C-o C-s} shows the current section
and all its direct and indirect subsections normally.
@item
@kbd{C-c C-o C-k} shows the headings of the
direct and indirect subsections of the current section.
@item
@kbd{C-c C-o C-l} hides the text content of the
current section and of its direct and indirect
subsections.
@item
@kbd{C-c C-o C-i} shows the headings of the
direct subsections of the current section.
@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
@end itemize

When a heading is displayed specially, you can use
@key{RET} in that heading to show the text content of the section
in the same way as @kbd{C-c C-o C-e}.

You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
displayed heading shows the text content of the section in the same
way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
displayed start-tag toggles the display of subheadings on and
off.

The outline state for each section is stored with the first
character of the section (as a text property). Every command that
changes the outline state of any section updates the display of the
buffer so that each section is displayed correctly according to its
outline state. If the section structure is subsequently changed, then
it is possible for the display to no longer correctly reflect the
stored outline state. @kbd{C-c C-o C-r} can be used to refresh
the display so it is correct again.

@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A filename specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.

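The list can also be extended from an init file. The following is a
minimal sketch, assuming that the variable is defined by the
@code{rng-loc} library and using a hypothetical personal file
@file{~/.emacs.d/my-schemas.xml}. Because @code{add-to-list} puts the
new entry at the front of the list, its rules take precedence over
those of the files already listed.

@example
;; "~/.emacs.d/my-schemas.xml" is a hypothetical path.
(with-eval-after-load 'rng-loc
  (add-to-list 'rng-schema-locating-files
               (expand-file-name "~/.emacs.d/my-schemas.xml")))
@end example
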
By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; it will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.

601@section Schema locating files
602
603Each schema locating file specifies a list of rules. The rules
604from each file are appended in order. To locate a schema each rule is
605applied in turn until a rule matches. The first matching rule is then
606used to determine the schema.
607
608Schema locating files are designed to be useful for other
609applications that need to locate a schema for a document. In fact,
610there is nothing specific to locating schemas in the design; it could
611equally well be used for locating a stylesheet.
612
613@menu
867d4bb3
JB
614* Schema locating file syntax basics::
615* Using the document's URI to locate a schema::
616* Using the document element to locate a schema::
617* Using type identifiers in schema locating files::
618* Using multiple schema locating files::
8cd39fb3
MH
619@end menu
620
@node Schema locating file syntax basics
@subsection Schema locating file syntax basics

There is a schema for schema locating files in the file
@samp{locate.rnc} in the schema directory. Schema locating
files must be valid with respect to this schema.

The document element of a schema locating file must be
@samp{locatingRules} and the namespace URI must be
@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
children of the document element specify rules. The order of the
children is the same as the order of the rules. Here's a complete
example of a schema locating file:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document with
namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has
the local name @samp{book}. If the document element had both a
namespace URI of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.

As usual with XML-related technologies, resources are identified
by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains
the attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a filename in
the same directory as the schema locating file.

@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the
document. The @samp{uri} attribute specifies the URI of the
schema. The @samp{resource} attribute can be used to specify
the schema for a particular document. For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is
@samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document
whose URI matches a pattern. The pattern has the same syntax as an
absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e. any character other than @samp{/}).
Typically, the URI pattern looks like a relative URI, but, whereas a
relative URI in the @samp{resource} attribute is resolved into a
particular absolute URI using the base URI of the schema locating
file, a relative URI pattern matches if it matches some number of
complete path segments of the document's URI ending with the last path
segment of the document's URI. For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.

A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.

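The effect of that particular rule can be approximated in a few lines
of Emacs Lisp. This is only a rough analogy for the single pattern
pair @samp{*.xml} and @samp{*.rnc}, working on a plain file name rather
than a URI; it is not how nXML mode implements @samp{transformURI}.

@example
;; Rough analogy for fromPattern="*.xml" toPattern="*.rnc".
(let* ((doc "/home/jjc/docs/spec.xml")
       (schema (replace-regexp-in-string "\\.xml\\'" ".rnc" doc)))
  ;; The rule matches only when the candidate schema exists,
  ;; so this yields the schema file name or nil.
  (and (file-exists-p schema) schema))
@end example
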
@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.

A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.

@node Using type identifiers in schema locating files
@subsection Using type identifiers in schema locating files

Type identifiers allow a level of indirection in locating the
schema for a document. Instead of associating the document directly
with a schema URI, the document is associated with a type identifier,
which is in turn associated with a schema URI. nXML mode does not
constrain the format of type identifiers. They can be simply strings
without any formal structure or they can be public identifiers or
URIs. Note that these type identifiers have nothing to do with the
DOCTYPE declaration. When comparing type identifiers, whitespace is
normalized in the same way as with the @samp{xsd:token}
datatype: leading and trailing whitespace is stripped; other sequences
of whitespace are normalized to a single space character.

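The normalization just described can be sketched in Emacs Lisp as
follows. The function name @code{my-normalize-type-id} is
hypothetical and is not part of nXML mode; it only illustrates the
stripping and collapsing of whitespace.

@example
(defun my-normalize-type-id (id)
  "Return ID with whitespace stripped and collapsed."
  ;; Split on runs of whitespace, dropping empty strings at the
  ;; ends, then rejoin the pieces with single spaces.
  (mapconcat #'identity (split-string id "[ \t\n\r]+" t) " "))

(my-normalize-type-id "  XHTML \n  Strict ")
     @result{} "XHTML Strict"
@end example
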
769Each of the rules described in previous sections that uses a
770@samp{uri} attribute to specify a schema, can instead use a
771@samp{typeId} attribute to specify a type identifier. The type
772identifier can be associated with a URI using a @samp{typeId}
773element. For example,
774
775@example
776<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
777 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
778 <typeId id="XHTML" typeId="XHTML Strict"/>
779 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
780 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
781</locatingRules>
782@end example
783
784@noindent
785declares three type identifiers: @samp{XHTML} (representing the
786default variant of XHTML to be used), @samp{XHTML Strict} and
787@samp{XHTML Transitional}. Such a schema locating file would
788use @samp{xhtml-strict.rnc} for a document whose namespace is
789@samp{http://www.w3.org/1999/xhtml}. However, it is considerably
790more flexible than a schema locating file that simply specifies
791
792@example
793<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
794@end example
795
796@noindent
797A user can easily use @kbd{C-c C-s C-t} to select between XHTML
798Strict and XHTML Transitional. Also, a user can easily add a schema locating file
799
800@example
801<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
802 <typeId id="XHTML" typeId="XHTML Transitional"/>
803</locatingRules>
804@end example
805
806@noindent
807that makes the default variant of XHTML be XHTML Transitional.
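
The same indirection can be used with rule types other than
@samp{namespace}. The following is a minimal sketch, assuming the
@samp{documentElement} and @samp{uri} rules with the
@samp{localName} and @samp{pattern} attributes described in the
previous sections. The file pattern and schema file name are
illustrative only. Both rules refer to the type identifier
@samp{XHTML}, so changing the schema for both kinds of match should
only require changing the single @samp{typeId} element.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- match on the document element name -->
  <documentElement localName="html" typeId="XHTML"/>
  <!-- match on the URI of the document -->
  <uri pattern="*.xhtml" typeId="XHTML"/>
  <!-- one place that maps the type identifier to a schema -->
  <typeId id="XHTML" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example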
808
809@node Using multiple schema locating files
810@subsection Using multiple schema locating files
811
812The @samp{include} element includes rules from another
813schema locating file. The behavior is exactly as if the rules from
814that file were included in place of the @samp{include} element.
815Relative URIs are resolved into absolute URIs before the inclusion is
816performed. For example,
817
818@example
819<include rules="../rules.xml"/>
820@end example
821
822@noindent
823includes the rules from @samp{rules.xml}.
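
In context, an @samp{include} element appears alongside the other
rules inside @samp{locatingRules}. The following sketch uses a
hypothetical namespace URI and schema file name. Because the included
rules behave as if they were written in place of the @samp{include}
element, the @samp{namespace} rule here is tried before any rule from
@samp{rules.xml}.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- tried first -->
  <namespace ns="http://example.com/ns/project" uri="project.rnc"/>
  <!-- rules from the other file are tried from this point on -->
  <include rules="../rules.xml"/>
</locatingRules>
@end example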
824
825The process of locating a schema takes as input a list of schema
826locating files. The rules in all these files and in the files they
827include are resolved into a single list of rules, which are applied
828strictly in order. Sometimes this order is not what is needed.
829For example, suppose you have two schema locating files, a private
830file
831
832@example
833<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
834 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
835</locatingRules>
836@end example
837
838@noindent
839followed by a public file
840
841@example
842<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
843 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
844 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
845</locatingRules>
846@end example
847
848@noindent
849The effect of these two files is that the XHTML @samp{namespace}
850rule takes precedence over the @samp{transformURI} rule, which
851is almost certainly not what is needed. This can be solved by adding
852an @samp{applyFollowingRules} element to the private file:
853
854@example
855<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
856 <applyFollowingRules ruleType="transformURI"/>
857 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
858</locatingRules>
859@end example
860
861@node DTDs
862@chapter DTDs
863
864nXML mode is designed to support the creation of standalone XML
865documents that do not depend on a DTD. Although it is common practice
866to insert a DOCTYPE declaration referencing an external DTD, this has
867undesirable side-effects. It means that the document is no longer
868self-contained. It also means that different XML parsers may interpret
869the document in different ways, since the XML Recommendation does not
870require XML parsers to read the DTD. With DTDs, it was impractical to
871get validation without using an external DTD or reference to a
872parameter entity. With RELAX NG and other schema languages, you can
873simultaneously get the benefits of validation and standalone XML
874documents. Therefore, I recommend that you do not reference an
875external DTD in your XML documents.
876
877One problem is entities for characters. Typically, as well as
878providing validation, DTDs also provide a set of character entities
879for documents to use. Schemas cannot provide this functionality,
880because schema validation happens after XML parsing. The recommended
881solution is to either use the Unicode characters directly, or, if this
882is impractical, use character references. nXML mode supports this by
883providing commands for entering characters and character references
884using the Unicode names, and can display the glyph corresponding to a
885character reference.
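
For instance, a document that would otherwise rely on a DTD-defined
entity such as @samp{&nbsp;} can use the equivalent character
reference instead. The fragment below is only an illustration, and
the element name is hypothetical. The first form depends on a DTD
that declares the entity. The second form works in a standalone
document.

@example
<!-- depends on a DTD that declares the nbsp entity -->
<p>10&nbsp;km</p>

<!-- standalone: character reference for U+00A0 NO-BREAK SPACE -->
<p>10&#xA0;km</p>
@end example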
886
887@node Limitations
888@chapter Limitations
889
890nXML mode has some limitations:
891
892@itemize @bullet
893@item
894DTD support is limited. Internal parsed general entities declared
895in the internal subset are supported provided they do not contain
896elements. Other usage of DTDs is ignored.
897@item
898The restrictions on RELAX NG schemas in section 7 of the RELAX NG
899specification are not enforced.
900@end itemize
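
To illustrate the first limitation, the following sketch declares two
internal parsed general entities in the internal subset. The element
and entity names are hypothetical. According to the description
above, an entity like the first one should be supported, whereas one
like the second falls outside the supported case because its
replacement text contains an element.

@example
<!DOCTYPE doc [
  <!-- replacement text contains no elements -->
  <!ENTITY product "nXML mode">
  <!-- replacement text contains an element -->
  <!ENTITY warning "<em>Caution</em>">
]>
<doc>&product; &warning;</doc>
@end example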
901
902@bye