\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode.info
@settitle nXML Mode
@documentencoding UTF-8
@c %**end of header

@copying
This manual documents nXML mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007--2014 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover Texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
@end direntry


@titlepage
@title nXML mode
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents


@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
* GNU Free Documentation License::  The license for this documentation.
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major mode for editing XML documents. It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax. To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions. You may wish to customize the settings, for example to
recognize different file extensions.
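
As a minimal sketch of such a customization, the following init-file
fragment associates one more file extension with nXML mode. The
@file{.xconf} extension is a made-up example, not something Emacs
recognizes by default; substitute whatever extension your files use.

@example
;; Hypothetical extension: visit files ending in ".xconf" in nXML mode.
(add-to-list 'auto-mode-alist '("\\.xconf\\'" . nxml-mode))
@end example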

Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
@file{etc/schema} directory includes some schemas for popular document
types. See @url{http://relaxng.org/} for more information on RELAX NG@.
You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas. This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}.
@ignore
@c Original location, now defunct.
@url{http://www.pantor.com/download.html}.
@end ignore

To convert a W3C XML Schema to an RNC schema, you first need to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please direct all new
discussions to the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.


@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that nXML
mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

nXML mode adapts the standard GNU Emacs command for completion in a
buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
@kbd{M-@key{TAB}}. Note that many window systems and window managers
use @kbd{M-@key{TAB}} themselves (typically for switching between
windows) and do not pass it to applications. In that case, you should
type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
@code{completion-at-point} to a key that is convenient for you. In
the following, I will assume that you type @kbd{C-M-i}.
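
If neither key is convenient, a binding along the following lines could
go in your init file. This is only a sketch: the choice of @kbd{F5}
is arbitrary, and it assumes that the keymap for nXML mode is
@code{nxml-mode-map} and that it is defined once the @code{nxml-mode}
library has been loaded.

@example
;; Bind completion to F5 in nXML buffers only.  The key is an
;; arbitrary example; pick any key that is free on your system.
(with-eval-after-load 'nxml-mode
  (define-key nxml-mode-map (kbd "<f5>") #'completion-at-point))
@end example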

nXML mode completion works by examining the symbol preceding point.
This is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol
to be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML@. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are just @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-M-i}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.
@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-M-i} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example
@item
If there is more than one possible completion, but all
possible completions share a common non-empty prefix, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions are
@samp{xmlns} and @samp{xml:lang}. These share a common prefix of
@samp{xml}. Thus, @kbd{C-M-i} will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-M-i} again, which would have the result
described in the next item.
@item
If there is more than one possible completion, but the
possible completions do not share a non-empty prefix, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang                           xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-M-i} again, the namespace URI will be
inserted. Should that happen automatically?)
@end itemize

@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nXML mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.

Finally, you can customize nXML mode so that @kbd{/} automatically
inserts the rest of the end-tag when it occurs after @samp{<}, by
doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.
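
If you prefer to keep such settings in your init file, the same effect
can presumably be had by setting the variable there. This is a sketch
that assumes a non-@code{nil} value is all that is needed to enable the
feature.

@example
;; Let "/" typed after "<" insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example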

@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML@. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupies one or more lines,
then those lines will not be included in any paragraph.

A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.
@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.
@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
@item
Otherwise, it is a paragraph boundary.
@end itemize

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs's outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML@.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e., the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section:

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);
@item
its local name must match
@code{nxml-section-element-name-regexp};
@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.
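
For a vocabulary whose section and heading elements have other names,
the two variables can also be set directly in your init file. The
following is a sketch only; the element names @samp{topic},
@samp{subtopic} and @samp{heading} are invented for illustration
and do not come from any schema shipped with Emacs.

@example
;; Treat <topic> and <subtopic> elements as sections ...
(setq nxml-section-element-name-regexp "\\(sub\\)?topic")
;; ... and <heading> elements as their headings.
(setq nxml-heading-element-name-regexp "heading")
@end example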

There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;
@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;
@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
  <-section>Delicious Food...</>
  <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.
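
The indent can likewise be set from your init file. In this sketch the
value 4 is an arbitrary choice, made only to show the form of the
setting.

@example
;; Indent each level of specially displayed headings by 4 columns
;; (example value).
(setq nxml-outline-child-indent 4)
@end example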

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
current section means the section containing point, or, more precisely,
the innermost section containing the character immediately following
point.

@itemize @bullet
@item
@kbd{C-c C-o C-a} shows all sections in the buffer
normally.
@item
@kbd{C-c C-o C-t} hides the text content
of all sections in the buffer.
@item
@kbd{C-c C-o C-c} hides the text content
of the current section.
@item
@kbd{C-c C-o C-e} shows the text content
of the current section.
@item
@kbd{C-c C-o C-d} hides the text content
and subsections of the current section.
@item
@kbd{C-c C-o C-s} shows the current section
and all its direct and indirect subsections normally.
@item
@kbd{C-c C-o C-k} shows the headings of the
direct and indirect subsections of the current section.
@item
@kbd{C-c C-o C-l} hides the text content of the
current section and of its direct and indirect
subsections.
@item
@kbd{C-c C-o C-i} shows the headings of the
direct subsections of the current section.
@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
@end itemize

When a heading is displayed specially, you can use
@key{RET} in that heading to show the text content of the section
in the same way as @kbd{C-c C-o C-e}.

You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
displayed heading shows the text content of the section in the same
way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
displayed start-tag toggles the display of subheadings on and
off.

The outline state for each section is stored with the first
character of the section (as a text property). Every command that
changes the outline state of any section updates the display of the
buffer so that each section is displayed correctly according to its
outline state. If the section structure is subsequently changed, then
it is possible for the display to no longer correctly reflect the
stored outline state. @kbd{C-c C-o C-r} can be used to refresh
the display so it is correct again.

@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A file-name specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.

By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.
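
To have a schema locating file of your own consulted before the
default ones, you could add it to the front of the list, as in the
sketch below. The file name under @code{user-emacs-directory} is
hypothetical, and the sketch assumes that the variable is defined by
the @code{rng-loc} library.

@example
;; Prepend a personal schema locating file (hypothetical path) so
;; that its rules take precedence over those in the default files.
(with-eval-after-load 'rng-loc
  (add-to-list 'rng-schema-locating-files
               (expand-file-name "schemas/schemas.xml"
                                 user-emacs-directory)))
@end example

@noindent
An absolute file-name is used here so that the entry does not depend
on the directory of the document being edited.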

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; Emacs will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.

@node Schema locating files
@section Schema locating files

Each schema locating file specifies a list of rules. The rules
from each file are appended in order. To locate a schema, each rule is
applied in turn until a rule matches. The first matching rule is then
used to determine the schema.

Schema locating files are designed to be useful for other
applications that need to locate a schema for a document. In fact,
there is nothing specific to locating schemas in the design; it could
equally well be used for locating a stylesheet.

@menu
* Schema locating file syntax basics::
* Using the document's URI to locate a schema::
* Using the document element to locate a schema::
* Using type identifiers in schema locating files::
* Using multiple schema locating files::
@end menu

@node Schema locating file syntax basics
@subsection Schema locating file syntax basics

There is a schema for schema locating files in the file
@samp{locate.rnc} in the schema directory. Schema locating
files must be valid with respect to this schema.

The document element of a schema locating file must be
@samp{locatingRules} and the namespace URI must be
@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
children of the document element specify rules. The order of the
children is the same as the order of the rules. Here's a complete
example of a schema locating file:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document whose
document element has the namespace URI
@samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has
the local name @samp{book}. If the document element had both a
namespace URI of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.

As usual with XML-related technologies, resources are identified
by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI@. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains
the attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a file-name in
the same directory as the schema locating file.

@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the
document. The @samp{uri} attribute specifies the URI of the
schema. The @samp{resource} attribute can be used to specify
the schema for a particular document. For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is
@samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document
whose URI matches a pattern. The pattern has the same syntax as an
absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e., any character other than @samp{/}).
Typically, the URI pattern looks like a relative URI, but, whereas a
relative URI in the @samp{resource} attribute is resolved into a
particular absolute URI using the base URI of the schema locating
file, a relative URI pattern matches if it matches some number of
complete path segments of the document's URI ending with the last path
segment of the document's URI@. For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.

A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.

@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.

A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.

764@node Using type identifiers in schema locating files
765@subsection Using type identifiers in schema locating files
766
767Type identifiers allow a level of indirection in locating the
768schema for a document. Instead of associating the document directly
769with a schema URI, the document is associated with a type identifier,
1df7defd 770which is in turn associated with a schema URI@. nXML mode does not
8cd39fb3
MH
771constrain the format of type identifiers. They can be simply strings
772without any formal structure or they can be public identifiers or
773URIs. Note that these type identifiers have nothing to do with the
774DOCTYPE declaration. When comparing type identifiers, whitespace is
775normalized in the same way as with the @samp{xsd:token}
776datatype: leading and trailing whitespace is stripped; other sequences
777of whitespace are normalized to a single space character.
778
779Each of the rules described in previous sections that uses a
780@samp{uri} attribute to specify a schema, can instead use a
781@samp{typeId} attribute to specify a type identifier. The type
782identifier can be associated with a URI using a @samp{typeId}
783element. For example,
784
785@example
786<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
787 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
788 <typeId id="XHTML" typeId="XHTML Strict"/>
789 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
790 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
791</locatingRules>
792@end example
793
794@noindent
795declares three type identifiers: @samp{XHTML} (representing the
796default variant of XHTML to be used), @samp{XHTML Strict} and
797@samp{XHTML Transitional}. Such a schema locating file would
798use @samp{xhtml-strict.rnc} for a document whose namespace is
799@samp{http://www.w3.org/1999/xhtml}. But it is considerably
800more flexible than a schema locating file that simply specified
801
802@example
803<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
804@end example
805
806@noindent
807A user can easily use @kbd{C-c C-s C-t} to select between XHTML
808Strict and XHTML Transitional. Also, a user can easily add a catalog
809
810@example
811<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
812 <typeId id="XHTML" typeId="XHTML Transitional"/>
813</locatingRules>
814@end example
815
816@noindent
817that makes the default variant of XHTML be XHTML Transitional.
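
Because whitespace in type identifiers is normalized before comparison,
the exact spacing used in a rule should not matter. As a small sketch,
assuming the normalization works as described above, the
@samp{typeId} attribute in the first rule below would be treated as
the identifier @samp{XHTML Strict} and so would resolve through the
second rule:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example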
818
819@node Using multiple schema locating files
820@subsection Using multiple schema locating files
821
822The @samp{include} element includes rules from another
823schema locating file. The behavior is exactly as if the rules from
824that file were included in place of the @samp{include} element.
825Relative URIs are resolved into absolute URIs before the inclusion is
826performed. For example,
827
828@example
829<include rules="../rules.xml"/>
830@end example
831
832@noindent
833includes the rules from @samp{rules.xml}.
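
As a sketch of how @samp{include} might be combined with the type
identifiers from the previous section, a private schema locating file
could first override the default XHTML variant and then pull in a
shared file. The file name @samp{shared-rules.xml} is hypothetical; it
stands for a file containing the @samp{namespace} and @samp{typeId}
rules shown earlier.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="XHTML" typeId="XHTML Transitional"/>
  <include rules="shared-rules.xml"/>
</locatingRules>
@end example

@noindent
Since the included rules take the place of the @samp{include} element,
the private @samp{typeId} rule for @samp{XHTML} should be considered
before the one in the shared file.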
834
835The process of locating a schema takes as input a list of schema
836locating files. The rules in all these files and in the files they
837include are resolved into a single list of rules, which are applied
838strictly in order. Sometimes this order is not what is needed.
839For example, suppose you have two schema locating files, a private
840file
841
842@example
843<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
844 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
845</locatingRules>
846@end example
847
848@noindent
849followed by a public file
850
851@example
852<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
853 <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
854 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
855</locatingRules>
856@end example
857
858@noindent
859The effect of these two files is that the XHTML @samp{namespace}
860rule takes precedence over the @samp{transformURI} rule, which
861is almost certainly not what is needed. This can be solved by adding
862an @samp{applyFollowingRules} element to the private file:
863
864@example
865<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
866 <applyFollowingRules ruleType="transformURI"/>
867 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
868</locatingRules>
869@end example
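
As a rough sketch of the intended effect (an illustration of the
resulting precedence, not a description of how nXML mode represents the
rules internally), the two files together would then behave much like
a single file in which the @samp{transformURI} rule is tried before
the XHTML @samp{namespace} rule:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example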
870
871@node DTDs
872@chapter DTDs
873
874nXML mode is designed to support the creation of standalone XML
875documents that do not depend on a DTD@. Although it is common practice
876to insert a DOCTYPE declaration referencing an external DTD, this has
877undesirable side-effects. It means that the document is no longer
878self-contained. It also means that different XML parsers may interpret
879the document in different ways, since the XML Recommendation does not
880require XML parsers to read the DTD@. With DTDs, it was impractical to
881get validation without using an external DTD or a reference to a
882parameter entity. With RELAX NG and other schema languages, you can
883simultaneously get the benefits of validation and standalone XML
884documents. Therefore, I recommend that you do not reference an
885external DOCTYPE in your XML documents.
886
887One problem is entities for characters. Typically, as well as
888providing validation, DTDs also provide a set of character entities
889for documents to use. Schemas cannot provide this functionality,
890because schema validation happens after XML parsing. The recommended
891solution is to either use the Unicode characters directly, or, if this
892is impractical, use character references. nXML mode supports this by
893providing commands for entering characters and character references
894using the Unicode names, and can display the glyph corresponding to a
895character reference.
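
For instance, instead of relying on a DTD to define an entity such as
@samp{&eacute;}, a standalone document can contain the character itself
or a numeric character reference. The element names in this sketch are
hypothetical; U+00E9 is LATIN SMALL LETTER E WITH ACUTE, so both items
have the same content.

@example
<?xml version="1.0" encoding="UTF-8"?>
<menu>
  <item>café</item>
  <item>caf&#xE9;</item>
</menu>
@end example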
896
897@node Limitations
898@chapter Limitations
899
900nXML mode has some limitations:
901
902@itemize @bullet
903@item
904DTD support is limited. Internal parsed general entities declared
905in the internal subset are supported provided they do not contain
906elements. Other usage of DTDs is ignored.
907@item
908The restrictions on RELAX NG schemas in section 7 of the RELAX NG
909specification are not enforced.
910@end itemize
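
As a sketch of the first limitation, the hypothetical entity
@samp{product} below is declared in the internal subset and contains
only text, so references to it fall within what is supported. The
entity @samp{note}, whose replacement text contains an element, does
not.

@example
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
  <!ENTITY note "<emph>experimental</emph>">
]>
<doc>&product; is &note;.</doc>
@end example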
911
912@node GNU Free Documentation License
913@appendix GNU Free Documentation License
914@include doclicense.texi
915
916@bye