\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode
@settitle nXML Mode
@c %**end of header

@copying
This manual documents nXML mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007--2012 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.''
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
@end direntry

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
* GNU Free Documentation License:: The license for this documentation.
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major mode for editing XML documents. It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax. To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions. You may wish to customize the settings, for example to
recognize different file extensions.

Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
@file{etc/schema} directory includes some schemas for popular document
types. See @url{http://relaxng.org/} for more information on RELAX NG@.
You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas. This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{http://www.pantor.com/download.html}.

To convert a W3C XML Schema to an RNC schema, you first need to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please hold all new
discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.


@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that nXML
mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

nXML mode adapts the standard GNU Emacs command for completion in a
buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
@kbd{M-@key{TAB}}. Note that many window systems and window managers
use @kbd{M-@key{TAB}} themselves (typically for switching between
windows) and do not pass it to applications. In that case, you should
type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
@code{completion-at-point} to a key that is convenient for you. In
the following, I will assume that you type @kbd{C-M-i}.

nXML mode completion works by examining the symbol preceding point.
This is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol
to be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML@. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are just @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-M-i}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.
@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-M-i} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example
@item
If there is more than one possible completion, but all
possible completions share a common prefix that is longer than the
symbol to be completed, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions are
@samp{xmlns} and @samp{xml:lang}. These share a common prefix of
@samp{xml}. Thus, @kbd{C-M-i} will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-M-i} again, which would have the result
described in the next item.
@item
If there is more than one possible completion, but the
possible completions share no prefix longer than the symbol to be
completed, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-M-i} again, the namespace URI will be
inserted. Should that happen automatically?)
@end itemize

@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nXML mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.

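
As a rough illustration of the two balanced-tag commands just described
(a sketch only; the element names are arbitrary, and the indentation
shown for the block form is an assumption that depends on your
indentation settings), typing @kbd{C-c C-i} with the buffer containing

@example
<em@point{}
@end example

@noindent
would be expected to give

@example
<em>@point{}</em>
@end example

@noindent
while typing @kbd{C-c C-b} with the buffer containing

@example
<div@point{}
@end example

@noindent
would be expected to give something like

@example
<div>
  @point{}
</div>
@end example
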
Finally, you can customize nXML mode so that @kbd{/} automatically
inserts the rest of the end-tag when it occurs after @samp{<}, by
doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.

@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML@. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupy one or more lines,
then those lines will not be included in any paragraph.

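
For instance, under the rules just stated (a small sketch with a
made-up @samp{note} element), the following fragment would contain
two paragraphs:

@example
<note>
First paragraph, spread
over two lines.

Second paragraph.
</note>
@end example

@noindent
The lines holding only @samp{<note>} and @samp{</note>} each consist
of a single tag, so they belong to no paragraph, and the blank line
separates the two runs of text. With point in the first run,
@kbd{M-q} would therefore be expected to refill only those two lines.
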
A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.
@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.
@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
@item
Otherwise, it is a paragraph boundary.
@end itemize

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs's outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML@.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e., the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);
@item
its local name must match
@code{nxml-section-element-name-regexp};
@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.

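
As a sketch of the @code{div} convention mentioned above for XHTML,
a document might be laid out as follows. This assumes, hypothetically,
that @code{nxml-section-element-name-regexp} has a value that matches
@samp{div} and that @code{nxml-heading-element-name-regexp} has been
customized to match @samp{h1} and @samp{h2}; neither value is
guaranteed by this manual.

@example
<div>
<h1>Food</h1>
<p>There are many kinds of food.</p>
<div>
<h2>Delicious Food</h2>
<p>Some food is delicious.</p>
</div>
</div>
@end example

@noindent
Each @samp{div} start-tag begins a line and has a heading as its first
child element, so under these assumptions both @samp{div} elements
would satisfy the three conditions listed above.
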
There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;
@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;
@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
  <-section>Delicious Food...</>
  <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
@dfn{current section} means the section containing point, or, more precisely,
the innermost section containing the character immediately following
point.

452
453@itemize @bullet
454@item
455@kbd{C-c C-o C-a} shows all sections in the buffer
456normally.
457@item
458@kbd{C-c C-o C-t} hides the text content
459of all sections in the buffer.
460@item
461@kbd{C-c C-o C-c} hides the text content
462of the current section.
463@item
464@kbd{C-c C-o C-e} shows the text content
465of the current section.
466@item
467@kbd{C-c C-o C-d} hides the text content
468and subsections of the current section.
469@item
867d4bb3 470@kbd{C-c C-o C-s} shows the current section
8cd39fb3
MH
471and all its direct and indirect subsections normally.
472@item
473@kbd{C-c C-o C-k} shows the headings of the
474direct and indirect subsections of the current section.
475@item
476@kbd{C-c C-o C-l} hides the text content of the
477current section and of its direct and indirect
478subsections.
479@item
480@kbd{C-c C-o C-i} shows the headings of the
481direct subsections of the current section.
482@item
483@kbd{C-c C-o C-o} hides as much as possible without
484hiding the current section's text content; the headings of ancestor
485sections of the current section and their child section sections will
486not be hidden.
487@end itemize
488
489When a heading is displayed specially, you can use
490@key{RET} in that heading to show the text content of the section
491in the same way as @kbd{C-c C-o C-e}.
492
493You can also use the mouse to change the outline state:
494@kbd{S-mouse-2} hides the text content of a section in the same
495way as@kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
496displayed heading shows the text content of the section in the same
497way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
498displayed start-tag toggles the display of subheadings on and
499off.
500
501The outline state for each section is stored with the first
502character of the section (as a text property). Every command that
503changes the outline state of any section updates the display of the
504buffer so that each section is displayed correctly according to its
505outline state. If the section structure is subsequently changed, then
506it is possible for the display to no longer correctly reflect the
507stored outline state. @kbd{C-c C-o C-r} can be used to refresh
508the display so it is correct again.
509
@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A filename specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if the files named by relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.

By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; Emacs will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.

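
The rules that these commands save are ordinary entries in a schema
locating file. As a hedged sketch (the exact form that Emacs writes
is an assumption here, and the file names @samp{report.xml},
@samp{page.xml} and @samp{report.rnc} are hypothetical), a
@samp{schemas.xml} file created this way might resemble:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="report.xml" uri="report.rnc"/>
  <uri resource="page.xml" typeId="XHTML Transitional"/>
</locatingRules>
@end example

@noindent
The first rule associates one document directly with a schema file, as
@kbd{C-c C-s C-f} offers to do; the second associates a document with a
type identifier, as @kbd{C-c C-s C-t} offers to do.
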
@node Schema locating files
@section Schema locating files

Each schema locating file specifies a list of rules. The rules
from each file are appended in order. To locate a schema each rule is
applied in turn until a rule matches. The first matching rule is then
used to determine the schema.

Schema locating files are designed to be useful for other
applications that need to locate a schema for a document. In fact,
there is nothing specific to locating schemas in the design; it could
equally well be used for locating a stylesheet.

@menu
* Schema locating file syntax basics::
* Using the document's URI to locate a schema::
* Using the document element to locate a schema::
* Using type identifiers in schema locating files::
* Using multiple schema locating files::
@end menu

@node Schema locating file syntax basics
@subsection Schema locating file syntax basics

There is a schema for schema locating files in the file
@samp{locate.rnc} in the schema directory. Schema locating
files must be valid with respect to this schema.

The document element of a schema locating file must be
@samp{locatingRules} and the namespace URI must be
@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
children of the document element specify rules. The order of the
children is the same as the order of the rules. Here's a complete
example of a schema locating file:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document with
namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has
the local name @samp{book}. If the document element had both a namespace URI
of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.

As usual with XML-related technologies, resources are identified
by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI@. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains the
attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a filename in
the same directory as the schema locating file.

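
To illustrate relative resolution, here is a sketch using hypothetical
locations. Suppose the following schema locating file is stored as
@samp{file:///home/jjc/docs/schemas.xml}:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="rnc/docbook.rnc"/>
</locatingRules>
@end example

@noindent
Under the resolution rule above, @samp{xhtml.rnc} would refer to
@samp{file:///home/jjc/docs/xhtml.rnc} and @samp{rnc/docbook.rnc}
would refer to @samp{file:///home/jjc/docs/rnc/docbook.rnc},
regardless of where the document being edited is stored.
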
@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the
document. The @samp{uri} attribute specifies the URI of the
schema. The @samp{resource} attribute can be used to specify
the schema for a particular document. For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is
@samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document
whose URI matches a pattern. The pattern has the same syntax as an
absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e., any character other than @samp{/}).
Typically, the URI pattern looks like a relative URI, but, whereas a
relative URI in the @samp{resource} attribute is resolved into a
particular absolute URI using the base URI of the schema locating
file, a relative URI pattern matches if it matches some number of
complete path segments of the document's URI ending with the last path
segment of the document's URI@. For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.

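
A pattern may also span more than one path segment. As a sketch based
on the matching rule just described (the directory name @samp{docs} is
only an example),

@example
<uri pattern="docs/*.xml" uri="docbook.rnc"/>
@end example

@noindent
would be expected to match a document with the URI
@samp{file:///home/jjc/docs/spec.xml}, since the pattern matches the
last two path segments, but not
@samp{file:///home/jjc/drafts/spec.xml}. Because @samp{*} cannot
stand for @samp{/}, it would not match
@samp{file:///home/jjc/docs/old/spec.xml} either.
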
A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.

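
Because a @samp{transformURI} rule matches only when the transformed
URI identifies an existing resource, it combines naturally with a
later fallback rule. A minimal sketch (the choice of
@samp{docbook.rnc} as the fallback is hypothetical):

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <uri pattern="*.xml" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
With these rules, @samp{spec.xml} would use @samp{spec.rnc} if that
file exists alongside it; otherwise the first rule does not match and
the @samp{uri} rule selects @samp{docbook.rnc}.
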
@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.

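
As a sketch of omitting one of the two attributes (reusing the schema
file name from the example above),

@example
<documentElement localName="stylesheet" uri="xslt.rnc"/>
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example

@noindent
the first rule would match a document element whose local name is
@samp{stylesheet} with any prefix, and the second would match a
document element whose prefix is @samp{xsl} with any local name.
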
A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.

748@node Using type identifiers in schema locating files
749@subsection Using type identifiers in schema locating files
750
751Type identifiers allow a level of indirection in locating the
752schema for a document. Instead of associating the document directly
753with a schema URI, the document is associated with a type identifier,
1df7defd 754which is in turn associated with a schema URI@. nXML mode does not
8cd39fb3
MH
755constrain the format of type identifiers. They can be simply strings
756without any formal structure or they can be public identifiers or
757URIs. Note that these type identifiers have nothing to do with the
758DOCTYPE declaration. When comparing type identifiers, whitespace is
759normalized in the same way as with the @samp{xsd:token}
760datatype: leading and trailing whitespace is stripped; other sequences
761of whitespace are normalized to a single space character.
762
763Each of the rules described in previous sections that uses a
764@samp{uri} attribute to specify a schema, can instead use a
765@samp{typeId} attribute to specify a type identifier. The type
766identifier can be associated with a URI using a @samp{typeId}
767element. For example,
768
769@example
770<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
771 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
772 <typeId id="XHTML" typeId="XHTML Strict"/>
773 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
774 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
775</locatingRules>
776@end example

@noindent
declares three type identifiers: @samp{XHTML} (representing the
default variant of XHTML to be used), @samp{XHTML Strict} and
@samp{XHTML Transitional}. Such a schema locating file would
use @samp{xhtml-strict.rnc} for a document whose namespace is
@samp{http://www.w3.org/1999/xhtml}. But it is considerably
more flexible than a schema locating file that simply specified

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
@end example

@noindent
A user can easily use @kbd{C-c C-s C-t} to select between XHTML
Strict and XHTML Transitional. Also, a user can easily add a schema
locating file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="XHTML" typeId="XHTML Transitional"/>
</locatingRules>
@end example

@noindent
that makes the default variant of XHTML be XHTML Transitional.
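
The same indirection can be combined with the other rule types. The
following sketch assumes a project whose documents are recognized
either by a @samp{book} document element or by a @samp{.dbk} file
name suffix. Both conditions and the schema file name are
hypothetical, chosen only for illustration. Because both rules refer
to the type identifier @samp{DocBook}, switching to a different
schema would only require changing the single @samp{typeId} element.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- Two different rules point at the same type identifier -->
  <documentElement localName="book" typeId="DocBook"/>
  <uri pattern="*.dbk" typeId="DocBook"/>
  <!-- The schema URI is stated in exactly one place -->
  <typeId id="DocBook" uri="docbook.rnc"/>
</locatingRules>
@end example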

@node Using multiple schema locating files
@subsection Using multiple schema locating files

The @samp{include} element includes rules from another
schema locating file. The behavior is exactly as if the rules from
that file were included in place of the @samp{include} element.
Relative URIs are resolved into absolute URIs before the inclusion is
performed. For example,

@example
<include rules="../rules.xml"/>
@end example

@noindent
includes the rules from @samp{rules.xml}.
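
As a fuller sketch, a project-specific schema locating file might
add one rule of its own and take everything else from a shared file
in the parent directory. The file names @samp{build.xml},
@samp{schemas/build.rnc} and @samp{../rules.xml} are assumptions
made for this example. Since relative URIs are resolved before
inclusion, a relative schema URI inside @samp{../rules.xml} would be
interpreted relative to that file rather than relative to the
including file.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- Rule specific to this project -->
  <uri resource="build.xml" uri="schemas/build.rnc"/>
  <!-- Shared rules, treated as if they were written here -->
  <include rules="../rules.xml"/>
</locatingRules>
@end example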

The process of locating a schema takes as input a list of schema
locating files. The rules in all these files and in the files they
include are resolved into a single list of rules, which are applied
strictly in order. Sometimes this order is not what is needed.
For example, suppose you have two schema locating files, a private
file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
followed by a public file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example

@noindent
The effect of these two files is that the XHTML @samp{namespace}
rule takes precedence over the @samp{transformURI} rule, which
is almost certainly not what is needed. This can be solved by adding
an @samp{applyFollowingRules} element to the private file:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <applyFollowingRules ruleType="transformURI"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example
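
To make the resulting order concrete, the private and public files
above should then behave roughly like the single file sketched below,
in which the @samp{transformURI} rule is tried before the XHTML
@samp{namespace} rule. This is only an illustration of the intended
precedence, with the rules copied unchanged from the two files. It is
not a description of how nXML mode merges the files internally.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- Pulled forward by applyFollowingRules -->
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <!-- Private rule -->
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- Remaining public rule -->
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example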

@node DTDs
@chapter DTDs

nXML mode is designed to support the creation of standalone XML
documents that do not depend on a DTD@. Although it is common practice
to insert a DOCTYPE declaration referencing an external DTD, this has
undesirable side effects. It means that the document is no longer
self-contained. It also means that different XML parsers may interpret
the document in different ways, since the XML Recommendation does not
require XML parsers to read the DTD@. With DTDs, it was impractical to
get validation without using an external DTD or a reference to a
parameter entity. With RELAX NG and other schema languages, you can
simultaneously get the benefits of validation and standalone XML
documents. Therefore, I recommend that you do not reference an
external DOCTYPE in your XML documents.

One problem is entities for characters. Typically, as well as
providing validation, DTDs also provide a set of character entities
for documents to use. Schemas cannot provide this functionality,
because schema validation happens after XML parsing. The recommended
solution is to either use the Unicode characters directly, or, if this
is impractical, use character references. nXML mode supports this by
providing commands for entering characters and character references
using the Unicode names, and can display the glyph corresponding to a
character reference.
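
As an illustration of the recommendation above, the first paragraph
below relies on the entities @samp{eacute} and @samp{nbsp}, which
only a DTD can declare. The second paragraph expresses the same text
with the character references @samp{&#xE9;} (LATIN SMALL LETTER E
WITH ACUTE) and @samp{&#xA0;} (NO-BREAK SPACE), and so needs no
DTD@. The element name @samp{p} and the sample text are arbitrary.

@example
<!-- Depends on a DTD that declares the entities -->
<p>caf&eacute;&nbsp;au lait</p>

<!-- Standalone: character references need no DTD -->
<p>caf&#xE9;&#xA0;au lait</p>
@end example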

@node Limitations
@chapter Limitations

nXML mode has some limitations:

@itemize @bullet
@item
DTD support is limited. Internal parsed general entities declared
in the internal subset are supported provided they do not contain
elements. Other usage of DTDs is ignored.
@item
The restrictions on RELAX NG schemas in section 7 of the RELAX NG
specification are not enforced.
@end itemize
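
The first limitation can be sketched with a small document. Both
entities below are declared in the internal subset, and all names are
hypothetical. Under the rule stated above, @samp{product} should be
supported because its replacement text is plain text, whereas
@samp{warning} falls outside the supported case because its
replacement text contains an element.

@example
<!DOCTYPE doc [
<!-- Text only: within the supported case -->
<!ENTITY product "nXML mode">
<!-- Contains an element: outside the supported case -->
<!ENTITY warning "<emph>unsupported</emph>">
]>
<doc>&product; is described here.</doc>
@end example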

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@bye