1\input texinfo @c -*- texinfo -*-
2@c %**start of header
@setfilename ../../info/nxml-mode
4@settitle nXML Mode
5@c %**end of header
6
@copying
This manual documents nxml-mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007-2012 Free Software Foundation, Inc.
12
13@quotation
14Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
16any later version published by the Free Software Foundation; with no
17Invariant Sections, with the Front-Cover texts being ``A GNU
18Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
19license is included in the section entitled ``GNU Free Documentation
20License'' in the Emacs manual.
21
22(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
23modify this GNU manual. Buying copies from the FSF supports it in
24developing GNU and promoting software freedom.''
25
26This document is part of a collection distributed under the GNU Free
27Documentation License. If you want to distribute this document
28separately from the collection, you can do so by adding a copy of the
29license to the document, as described in section 6 of the license.
30@end quotation
31@end copying
32
@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
36@end direntry
37
38@node Top
39@top nXML Mode
40
41@insertcopying
42
43This manual is not yet complete.
44
45@menu
* Introduction::
47* Completion::
48* Inserting end-tags::
49* Paragraphs::
50* Outlining::
51* Locating a schema::
52* DTDs::
53* Limitations::
54@end menu
55
56@node Introduction
57@chapter Introduction
58
nXML mode is an Emacs major mode for editing XML documents. It supports
60editing well-formed XML documents, and provides schema-sensitive editing
61using RELAX NG Compact Syntax. To get started, visit a file containing an
62XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
63mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
64put buffers in nXML mode if they have recognizable XML content or file
65extensions. You may wish to customize the settings, for example to
66recognize different file extensions.
67
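For instance, the following init-file sketch associates an extra file
extension with nXML mode. The @file{.xconf} extension is purely
illustrative; substitute whatever extension your XML files use.

@example
;; Open files ending in .xconf (a made-up extension) in nXML mode.
(add-to-list 'auto-mode-alist '("\\.xconf\\'" . nxml-mode))
@end example
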
68Once in nXML mode, you can type @kbd{C-h m} for basic information on the
69mode.
70
71The @file{etc/nxml} directory in the Emacs distribution contains some data
72files used by nXML mode, and includes two files (@file{test-valid.xml} and
73@file{test-invalid.xml}) that provide examples of valid and invalid XML
74documents.
75
76To get validation and schema-sensitive editing, you need a RELAX NG Compact
77Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
78@file{etc/schema} directory includes some schemas for popular document
79types. See @url{http://relaxng.org/} for more information on RELAX NG.
80You can use the @samp{Trang} program from
81@url{http://www.thaiopensource.com/relaxng/trang.html} to
82automatically create RNC schemas. This program can:
83
84@itemize @bullet
85@item
86infer an RNC schema from an instance document;
87@item
88convert a DTD to an RNC schema;
89@item
90convert a RELAX NG XML syntax schema to an RNC schema.
91@end itemize
92
@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{http://www.pantor.com/download.html}.
96
To convert a W3C XML Schema to an RNC schema, you first need to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.
101
For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please direct all new
discussions to the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.
106
107
108@node Completion
109@chapter Completion
110
Apart from real-time validation, the most important feature that
nxml-mode provides for assisting in document creation is ``completion''.
113Completion assists the user in inserting characters at point, based on
114knowledge of the schema and on the contents of the buffer before
115point.
116
117The traditional GNU Emacs key combination for completion in a
118buffer is @kbd{M-@key{TAB}}. However, many window systems
119and window managers use this key combination themselves (typically for
120switching between windows) and do not pass it to applications. It's
121hard to find key combinations in GNU Emacs that are both easy to type
122and not taken by something else. @kbd{C-@key{RET}} (i.e.
123pressing the Enter or Return key, while the Ctrl key is held down) is
124available. It won't be available on a traditional terminal (because
125it is indistinguishable from Return), but it will work with a window
126system. Therefore we adopt the following solution by default: use
127@kbd{C-@key{RET}} when there's a window system and
128@kbd{M-@key{TAB}} when there's not. In the following, I
129will assume that a window system is being used and will therefore
130refer to @kbd{C-@key{RET}}.
131
Completion works by examining the symbol preceding point. This
is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol to
135be completed would be valid replacements for the symbol to be
136completed, given the schema and the contents of the buffer before
137point. These symbols are the possible completions. An example may
138make this clearer. Suppose the buffer looks like this (where @point{}
139indicates point):
140
141@example
142<html xmlns="http://www.w3.org/1999/xhtml">
143<h@point{}
144@end example
145
146@noindent
and the schema is XHTML. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is:
150
151@example
152<html xmlns="http://www.w3.org/1999/xhtml">
153<head>
154<@point{}
155@end example
156
157@noindent
158In this case, the symbol to be completed is empty, and the possible
159completions are @samp{base}, @samp{isindex},
160@samp{link}, @samp{meta}, @samp{script},
161@samp{style}, @samp{title}. Another example is:
162
163@example
164<html xmlns="@point{}
165@end example
166
167@noindent
168In this case, the symbol to be completed is empty, and the possible
169completions are just @samp{http://www.w3.org/1999/xhtml}.
170
When you type @kbd{C-@key{RET}}, what happens depends
on what the set of possible completions is.
173
174@itemize @bullet
175@item
176If the set of completions is empty, nothing
177happens.
178@item
179If there is one possible completion, then that completion is
180inserted, together with any following characters that are
181required. For example, in this case:
182
183@example
184<html xmlns="http://www.w3.org/1999/xhtml">
185<@point{}
186@end example
187
188@noindent
189@kbd{C-@key{RET}} will yield
190
191@example
192<html xmlns="http://www.w3.org/1999/xhtml">
193<head@point{}
194@end example
195@item
196If there is more than one possible completion, but all
197possible completions share a common non-empty prefix, then that prefix
198is inserted. For example, suppose the buffer is:
199
200@example
201<html x@point{}
202@end example
203
204@noindent
205The symbol to be completed is @samp{x}. The possible completions
206are @samp{xmlns} and @samp{xml:lang}. These share a
207common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}}
208will yield:
209
210@example
211<html xml@point{}
212@end example
213
214@noindent
215Typically, you would do @kbd{C-@key{RET}} again, which would
216have the result described in the next item.
217@item
218If there is more than one possible completion, but the
219possible completions do not share a non-empty prefix, then Emacs will
220prompt you to input the symbol in the minibuffer, initializing the
221minibuffer with the symbol to be completed, and popping up a buffer
222showing the possible completions. You can now input the symbol to be
223inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:
226
227@example
228<html xml@point{}
229@end example
230
231@noindent
232Emacs will prompt you in the minibuffer with
233
234@example
235Attribute: xml@point{}
236@end example
237
238@noindent
239and the buffer showing possible completions will contain
240
241@example
242Possible completions are:
xml:lang xmlns
244@end example
245
246@noindent
247If you input @kbd{xmlns}, the result will be:
248
249@example
250<html xmlns="@point{}
251@end example
252
253@noindent
254(If you do @kbd{C-@key{RET}} again, the namespace URI will
255be inserted. Should that happen automatically?)
256@end itemize
257
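The cases above parallel the behavior of the standard Emacs Lisp
function @code{try-completion}. The following calls only illustrate
the common-prefix logic; they are not how nXML mode computes its
candidates, which come from the schema.

@example
;; Several candidates sharing a prefix: the prefix is returned.
(try-completion "x" '("xmlns" "xml:lang"))
     @result{} "xml"

;; A single candidate: it is returned in full.
(try-completion "h" '("head"))
     @result{} "head"

;; No candidates: nil.
(try-completion "q" '("xmlns" "xml:lang"))
     @result{} nil
@end example
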
258@node Inserting end-tags
259@chapter Inserting end-tags
260
261The main redundancy in XML syntax is end-tags. nxml-mode provides
262several ways to make it easier to enter end-tags. You can use all of
263these without a schema.
264
265You can use @kbd{C-@key{RET}} after @samp{</}
266to complete the rest of the end-tag.
267
268@kbd{C-c C-f} inserts an end-tag for the element containing
269point. This command is useful when you want to input the start-tag,
270then input the content and finally input the end-tag. The @samp{f}
271is mnemonic for finish.
272
273If you want to keep tags balanced and input the end-tag at the
274same time as the start-tag, before inputting the content, then you can
275use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
276the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
277is similar but more convenient for block-level elements: it puts the
278start-tag, point and the end-tag on successive lines, appropriately
279indented. The @samp{i} is mnemonic for inline and the
280@samp{b} is mnemonic for block.
281
282Finally, you can customize nxml-mode so that @kbd{/}
283automatically inserts the rest of the end-tag when it occurs after
284@samp{<}, by doing
285
286@display
287@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
288@end display
289
290@noindent
291and then following the instructions in the displayed buffer.
292
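If you would rather set the option from your init file than through the
Customize interface, a minimal equivalent (assuming a plain
@code{setq} is acceptable for this option) is:

@example
;; Make "/" typed after "<" insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example
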
293@node Paragraphs
294@chapter Paragraphs
295
296Emacs has several commands that operate on paragraphs, most
297notably @kbd{M-q}. nXML mode redefines these to work in a way
298that is useful for XML. The exact rules that are used to find the
299beginning and end of a paragraph are complicated; they are designed
300mainly to ensure that @kbd{M-q} does the right thing.
301
302A paragraph consists of one or more complete, consecutive lines.
303A group of lines is not considered a paragraph unless it contains some
304non-whitespace characters between tags or inside comments. A blank
305line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupies one or more lines,
then those lines will not be included in any paragraph.
309
310A start-tag at the beginning of the line (possibly indented) may
311be treated as starting a paragraph. Similarly, an end-tag at the end
312of the line may be treated as ending a paragraph. The following rules
313are used to determine whether such a tag is in fact treated as a
314paragraph boundary:
315
316@itemize @bullet
317@item
318If the schema does not allow text at that point, then it
319is a paragraph boundary.
320@item
321If the end-tag corresponding to the start-tag is not at
322the end of its line, or the start-tag corresponding to the end-tag is
323not at the beginning of its line, then it is not a paragraph
324boundary. For example, in
325
326@example
327<p>This is a paragraph with an
328<emph>emphasized</emph> phrase.
329@end example
330
331@noindent
332the @samp{<emph>} start-tag would not be considered as
333starting a paragraph, because its corresponding end-tag is not at the
334end of the line.
335@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in
338
339@example
340<p>This is a paragraph with an
341<emph>emphasized phrase that takes one source line</emph>
342@end example
343
344@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
349@item
350Otherwise, it is a paragraph boundary.
351@end itemize
352
353@node Outlining
354@chapter Outlining
355
356nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs's outline mode. An outline in nXML
358mode is based on recognizing two kinds of element: sections and
359headings. There is one heading for every section and one section for
360every heading. A section contains its heading as or within its first
361child element. A section also contains its subordinate sections (its
362subsections). The text content of a section consists of anything in a
363section that is neither a subsection nor a heading.
364
365Note that this is a different model from that used by XHTML.
366nXML mode's outline support will not be useful for XHTML unless you
367adopt a convention of adding a @code{div} to enclose each
368section, rather than having sections implicitly delimited by different
369@code{h@var{n}} elements. This limitation may be removed
370in a future version.
371
372The variable @code{nxml-section-element-name-regexp} gives
373a regexp for the local names (i.e. the part of the name following any
374prefix) of section elements. The variable
375@code{nxml-heading-element-name-regexp} gives a regexp for the
376local names of heading elements. For an element to be recognized
377as a section
378
379@itemize @bullet
380@item
381its start-tag must occur at the beginning of a line
382(possibly indented);
383@item
384its local name must match
385@code{nxml-section-element-name-regexp};
386@item
387either its first child element or a descendant of that
388first child element must have a local name that matches
389@code{nxml-heading-element-name-regexp}; the first such element
390is treated as the section's heading.
391@end itemize
392
393@noindent
394You can customize these variables using @kbd{M-x
395customize-variable}.
396
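As a sketch of such a customization, suppose you follow the XHTML
convention mentioned above, enclosing each section in a @code{div}
whose heading is an @code{h@var{n}} element. The values below are
assumptions chosen for that convention, not the defaults:

@example
;; Treat <div> elements as sections and <h1>..<h6> as their headings.
(setq nxml-section-element-name-regexp "div")
(setq nxml-heading-element-name-regexp "h[1-6]")
@end example
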
397There are three possible outline states for a section:
398
399@itemize @bullet
400@item
401normal, showing everything, including its heading, text
402content and subsections; each subsection is displayed according to the
403state of that subsection;
404@item
405showing just its heading, with both its text content and
406its subsections hidden; all subsections are hidden regardless of their
407state;
408@item
409showing its heading and its subsections, with its text
410content hidden; each subsection is displayed according to the state of
411that subsection.
412@end itemize
413
414In the last two states, where the text content is hidden, the
415heading is displayed specially, in an abbreviated form. An element
416like this:
417
418@example
419<section>
420<title>Food</title>
421<para>There are many kinds of food.</para>
422</section>
423@end example
424
425@noindent
426would be displayed on a single line like this:
427
428@example
429<-section>Food...</>
430@end example
431
432@noindent
433If there are hidden subsections, then a @code{+} will be used
434instead of a @code{-} like this:
435
436@example
437<+section>Food...</>
438@end example
439
440@noindent
441If there are non-hidden subsections, then the section will instead be
442displayed like this:
443
444@example
445<-section>Food...
446 <-section>Delicious Food...</>
447 <-section>Distasteful Food...</>
448</-section>
449@end example
450
451@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
454The variable @code{nxml-outline-child-indent} controls how much
455a subheading is indented with respect to its parent heading when the
456heading is being displayed specially.
457
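For example, to indent each outline level by four columns (the value 4
is an arbitrary choice for illustration):

@example
(setq nxml-outline-child-indent 4)
@end example
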
458Commands to change the outline state of sections are bound to
459key sequences that start with @kbd{C-c C-o} (@kbd{o} is
460mnemonic for outline). The third and final key has been chosen to be
461consistent with outline mode. In the following descriptions
462current section means the section containing point, or, more precisely,
463the innermost section containing the character immediately following
464point.
465
466@itemize @bullet
467@item
468@kbd{C-c C-o C-a} shows all sections in the buffer
469normally.
470@item
471@kbd{C-c C-o C-t} hides the text content
472of all sections in the buffer.
473@item
474@kbd{C-c C-o C-c} hides the text content
475of the current section.
476@item
477@kbd{C-c C-o C-e} shows the text content
478of the current section.
479@item
480@kbd{C-c C-o C-d} hides the text content
481and subsections of the current section.
482@item
@kbd{C-c C-o C-s} shows the current section
484and all its direct and indirect subsections normally.
485@item
486@kbd{C-c C-o C-k} shows the headings of the
487direct and indirect subsections of the current section.
488@item
489@kbd{C-c C-o C-l} hides the text content of the
490current section and of its direct and indirect
491subsections.
492@item
493@kbd{C-c C-o C-i} shows the headings of the
494direct subsections of the current section.
495@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
500@end itemize
501
502When a heading is displayed specially, you can use
503@key{RET} in that heading to show the text content of the section
504in the same way as @kbd{C-c C-o C-e}.
505
506You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
509displayed heading shows the text content of the section in the same
510way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
511displayed start-tag toggles the display of subheadings on and
512off.
513
514The outline state for each section is stored with the first
515character of the section (as a text property). Every command that
516changes the outline state of any section updates the display of the
517buffer so that each section is displayed correctly according to its
518outline state. If the section structure is subsequently changed, then
519it is possible for the display to no longer correctly reflect the
520stored outline state. @kbd{C-c C-o C-r} can be used to refresh
521the display so it is correct again.
522
523@node Locating a schema
524@chapter Locating a schema
525
526nXML mode has a configurable set of rules to locate a schema for
527the file being edited. The rules are contained in one or more schema
528locating files, which are XML documents.
529
530The variable @samp{rng-schema-locating-files} specifies
531the list of the file-names of schema locating files that nXML mode
532should use. The order of the list is significant: when file
533@var{x} occurs in the list before file @var{y} then rules
534from file @var{x} have precedence over rules from file
535@var{y}. A filename specified in
536@samp{rng-schema-locating-files} may be relative. If so, it will
537be resolved relative to the document for which a schema is being
538located. It is not an error if relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
540@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
541@key{RET}} to customize the list of schema locating
542files.
543
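The list can also be extended from an init file. The sketch below
assumes that the variable is defined by the @code{rng-loc} library and
uses a hypothetical file @file{~/schemas/schemas.xml}; appending keeps
the existing entries first, so their rules take precedence.

@example
(eval-after-load 'rng-loc
  '(add-to-list 'rng-schema-locating-files
                (expand-file-name "~/schemas/schemas.xml")
                t))   ; non-nil third argument appends to the list
@end example
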
By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.
552
553@menu
554* Commands for locating a schema::
555* Schema locating files::
556@end menu
557
558@node Commands for locating a schema
559@section Commands for locating a schema
560
561The command @kbd{C-c C-s C-w} will tell you what schema
562is currently being used.
563
564The rules for locating a schema are applied automatically when
565you visit a file in nXML mode. However, if you have just created a new
566file and the schema cannot be inferred from the file-name, then this
567will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
570the document. It is usually not necessary to insert the complete
571start-tag; often just @samp{<@var{name}} is
572enough.
573
574If you want to use a schema that has not yet been added to the
575schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
578minibuffer. After reading the file-name, Emacs will ask whether you
579wish to add a rule to a schema locating file that persistently
580associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; it will create the file if
583necessary, but will not create a directory. If the variable
584@samp{rng-schema-locating-files} has not been customized, this
585means that the rule will be added to the file @samp{schemas.xml}
586in the same directory as the document being edited.
587
588The command @kbd{C-c C-s C-t} allows you to select a schema by
589specifying an identifier for the type of the document. The schema
590locating files determine the available type identifiers and what
591schema is used for each type identifier. This is useful when it is
592impossible to infer the right schema from either the file-name or the
593content of the document, even though the schema is already in the
594schema locating file. A situation in which this can occur is when
595there are multiple variants of a schema where all valid documents have
596the same document element. For example, XHTML has Strict and
597Transitional variants. In a situation like this, a schema locating file
598can define a type identifier for each variant. As with @kbd{C-c
599C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
600locating file that persistently associates the document with the
601specified type identifier.
602
603The command @kbd{C-c C-s C-l} adds a rule to a schema
604locating file that persistently associates the document with
605the schema that is currently being used.
606
607@node Schema locating files
608@section Schema locating files
609
610Each schema locating file specifies a list of rules. The rules
from each file are appended in order. To locate a schema, each rule is
612applied in turn until a rule matches. The first matching rule is then
613used to determine the schema.
614
615Schema locating files are designed to be useful for other
616applications that need to locate a schema for a document. In fact,
617there is nothing specific to locating schemas in the design; it could
618equally well be used for locating a stylesheet.
619
620@menu
621* Schema locating file syntax basics::
622* Using the document's URI to locate a schema::
623* Using the document element to locate a schema::
624* Using type identifiers in schema locating files::
625* Using multiple schema locating files::
626@end menu
627
628@node Schema locating file syntax basics
629@subsection Schema locating file syntax basics
630
631There is a schema for schema locating files in the file
632@samp{locate.rnc} in the schema directory. Schema locating
633files must be valid with respect to this schema.
634
635The document element of a schema locating file must be
636@samp{locatingRules} and the namespace URI must be
637@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
638children of the document element specify rules. The order of the
639children is the same as the order of the rules. Here's a complete
640example of a schema locating file:
641
642@example
643<?xml version="1.0"?>
644<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
645 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
646 <documentElement localName="book" uri="docbook.rnc"/>
647</locatingRules>
648@end example
649
650@noindent
This says to use the schema @samp{xhtml.rnc} for a document with
namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has the
local name @samp{book}. If the document element had both a namespace URI
of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used, and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.
660
661As usual with XML-related technologies, resources are identified
662by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains the
attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a filename in
the same directory as the schema locating file.
668
669@node Using the document's URI to locate a schema
670@subsection Using the document's URI to locate a schema
671
672A @samp{uri} rule locates a schema based on the URI of the
673document. The @samp{uri} attribute specifies the URI of the
674schema. The @samp{resource} attribute can be used to specify
675the schema for a particular document. For example,
676
677@example
678<uri resource="spec.xml" uri="docbook.rnc"/>
679@end example
680
681@noindent
specifies that the schema for @samp{spec.xml} is
683@samp{docbook.rnc}.
684
685The @samp{pattern} attribute can be used instead of the
686@samp{resource} attribute to specify the schema for any document
687whose URI matches a pattern. The pattern has the same syntax as an
688absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e. any character other than @samp{/}).
691Typically, the URI pattern looks like a relative URI, but, whereas a
692relative URI in the @samp{resource} attribute is resolved into a
693particular absolute URI using the base URI of the schema locating
694file, a relative URI pattern matches if it matches some number of
695complete path segments of the document's URI ending with the last path
696segment of the document's URI. For example,
697
698@example
699<uri pattern="*.xsl" uri="xslt.rnc"/>
700@end example
701
702@noindent
703specifies that the schema for documents with a URI whose path ends
704with @samp{.xsl} is @samp{xslt.rnc}.
705
706A @samp{transformURI} rule locates a schema by
707transforming the URI of the document. The @samp{fromPattern}
708attribute specifies a URI pattern with the same meaning as the
709@samp{pattern} attribute of the @samp{uri} element. The
710@samp{toPattern} attribute is a URI pattern that is used to
711generate the URI of the schema. Each @samp{*} in the
712@samp{toPattern} is replaced by the string that matched the
713corresponding @samp{*} in the @samp{fromPattern}. The
714resulting string is appended to the initial part of the document's URI
715that was not explicitly matched by the @samp{fromPattern}. The
716rule matches only if the transformed URI identifies an existing
717resource. For example, the rule
718
719@example
720<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
721@end example
722
723@noindent
724would transform the URI @samp{file:///home/jjc/docs/spec.xml}
725into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
726rule specifies that to locate a schema for a document
727@samp{@var{foo}.xml}, Emacs should test whether a file
728@samp{@var{foo}.rnc} exists in the same directory as
729@samp{@var{foo}.xml}, and, if so, should use it as the
730schema.
731
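The effect of this rule can be mimicked in Emacs Lisp. The sketch
below is an illustration under simplifying assumptions (a local file
name rather than a general URI, and a hypothetical function name); it
is not nXML mode's implementation.

@example
(defun my-sibling-rnc-schema (file)
  "Return the .rnc file next to the XML file FILE, or nil if none."
  ;; Only names ending in ".xml" match the pattern "*.xml".
  (when (string-match "\\.xml\\'" file)
    (let ((schema (concat (substring file 0 (match-beginning 0))
                          ".rnc")))
      ;; The rule matches only if the transformed name exists.
      (and (file-exists-p schema) schema))))
@end example
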
732@node Using the document element to locate a schema
733@subsection Using the document element to locate a schema
734
735A @samp{documentElement} rule locates a schema based on
736the local name and prefix of the document element. For example, a rule
737
738@example
739<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
740@end example
741
742@noindent
743specifies that when the name of the document element is
744@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
745as the schema. Either the @samp{prefix} or
746@samp{localName} attribute may be omitted to allow any prefix or
747local name.
748
749A @samp{namespace} rule locates a schema based on the
750namespace URI of the document element. For example, a rule
751
752@example
753<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
754@end example
755
756@noindent
specifies that when the namespace URI of the document element is
758@samp{http://www.w3.org/1999/XSL/Transform}, then
759@samp{xslt.rnc} should be used as the schema.
760
761@node Using type identifiers in schema locating files
762@subsection Using type identifiers in schema locating files
763
764Type identifiers allow a level of indirection in locating the
765schema for a document. Instead of associating the document directly
766with a schema URI, the document is associated with a type identifier,
767which is in turn associated with a schema URI. nXML mode does not
768constrain the format of type identifiers. They can be simply strings
769without any formal structure or they can be public identifiers or
770URIs. Note that these type identifiers have nothing to do with the
771DOCTYPE declaration. When comparing type identifiers, whitespace is
772normalized in the same way as with the @samp{xsd:token}
773datatype: leading and trailing whitespace is stripped; other sequences
774of whitespace are normalized to a single space character.
775
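The normalization can be pictured with a small Emacs Lisp function.
This is only an illustration of the rule, not the code nXML mode uses,
and the function name is hypothetical:

@example
(defun my-normalize-type-id (id)
  "Return ID with whitespace normalized as for xsd:token."
  ;; Split on runs of whitespace, dropping empty strings at the ends,
  ;; then rejoin the pieces with single spaces.
  (mapconcat #'identity (split-string id "[ \t\n\r]+" t) " "))

(my-normalize-type-id "  XHTML   Strict ")
     @result{} "XHTML Strict"
@end example
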
776Each of the rules described in previous sections that uses a
777@samp{uri} attribute to specify a schema, can instead use a
778@samp{typeId} attribute to specify a type identifier. The type
779identifier can be associated with a URI using a @samp{typeId}
780element. For example,
781
782@example
783<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
784 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
785 <typeId id="XHTML" typeId="XHTML Strict"/>
786 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
787 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
788</locatingRules>
789@end example
790
791@noindent
declares three type identifiers: @samp{XHTML} (representing the
793default variant of XHTML to be used), @samp{XHTML Strict} and
794@samp{XHTML Transitional}. Such a schema locating file would
795use @samp{xhtml-strict.rnc} for a document whose namespace is
796@samp{http://www.w3.org/1999/xhtml}. But it is considerably
797more flexible than a schema locating file that simply specified
798
799@example
800<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
801@end example
802
803@noindent
804A user can easily use @kbd{C-c C-s C-t} to select between XHTML
805Strict and XHTML Transitional. Also, a user can easily add a catalog
806
807@example
808<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
809 <typeId id="XHTML" typeId="XHTML Transitional"/>
810</locatingRules>
811@end example
812
813@noindent
814that makes the default variant of XHTML be XHTML Transitional.
815
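The same indirection is available to the other rule types, and the
whitespace normalization described at the start of this section
applies whenever identifiers are compared. As a sketch, assuming the
@samp{uri} rule with a @samp{pattern} attribute described in an
earlier section and a hypothetical file-name pattern @samp{*.xhtml},
both rules below should resolve through the type identifier
@samp{XHTML Strict}, even though the second one is written with extra
whitespace:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri pattern="*.xhtml" typeId="XHTML Strict"/>
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example
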
816@node Using multiple schema locating files
817@subsection Using multiple schema locating files
818
819The @samp{include} element includes rules from another
820schema locating file. The behavior is exactly as if the rules from
821that file were included in place of the @samp{include} element.
822Relative URIs are resolved into absolute URIs before the inclusion is
823performed. For example,
824
825@example
826<include rules="../rules.xml"/>
827@end example
828
829@noindent
830includes the rules from @samp{rules.xml}.
831
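As a sketch of this in-place behavior, assuming a hypothetical file
@samp{../rules.xml}, a schema locating file written as

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <include rules="../rules.xml"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
</locatingRules>
@end example

@noindent
would apply its XHTML rule first, then the rules from
@samp{rules.xml}, and finally its XSLT rule. Because relative URIs
are made absolute before inclusion, any relative schema URIs inside
@samp{rules.xml} are presumably resolved against the location of
@samp{rules.xml} itself, not against the including file.
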
832The process of locating a schema takes as input a list of schema
833locating files. The rules in all these files and in the files they
834include are resolved into a single list of rules, which are applied
835strictly in order. Sometimes this order is not what is needed.
836For example, suppose you have two schema locating files, a private
837file
838
839@example
840<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
841 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
842</locatingRules>
843@end example
844
845@noindent
846followed by a public file
847
848@example
849<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
850 <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
851 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
852</locatingRules>
853@end example
854
855@noindent
856The effect of these two files is that the XHTML @samp{namespace}
857rule takes precedence over the @samp{transformURI} rule, which
858is almost certainly not what is needed. This can be solved by adding
859an @samp{applyFollowingRules} element to the private file:
860
861@example
862<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
863 <applyFollowingRules ruleType="transformURI"/>
864 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
865</locatingRules>
866@end example
867
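One informal way to read the result, offered as a sketch of the
intended effect and not as an exact description of the
implementation, is that the @samp{transformURI} rules that follow are
tried at the point where the @samp{applyFollowingRules} element
appears. Under that assumption, the effective order for the two
files above would be roughly:

@example
<!-- tried first, pulled forward from the public file -->
<transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
<!-- remaining rule of the private file -->
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
<!-- rules of the public file, in their original order -->
<transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
<namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
@end example
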
868@node DTDs
869@chapter DTDs
870
871nxml-mode is designed to support the creation of standalone XML
872documents that do not depend on a DTD. Although it is common practice
873to insert a DOCTYPE declaration referencing an external DTD, this has
874undesirable side effects. It means that the document is no longer
875self-contained. It also means that different XML parsers may interpret
876the document in different ways, since the XML Recommendation does not
877require XML parsers to read the DTD. With DTDs, it was impractical to
878get validation without using an external DTD or a reference to a
879parameter entity. With RELAX NG and other schema languages, you can
880simultaneously get the benefits of validation and standalone XML
881documents. Therefore, I recommend that you do not reference an
882external DOCTYPE in your XML documents.
883
884One problem is entities for characters. Typically, as well as
885providing validation, DTDs also provide a set of character entities
886for documents to use. Schemas cannot provide this functionality,
887because schema validation happens after XML parsing. The recommended
888solution is to either use the Unicode characters directly, or, if this
889is impractical, use character references. nXML mode supports this by
890providing commands for entering characters and character references
891using the Unicode names, and can display the glyph corresponding to a
892character reference.
893
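As a brief sketch of the recommended approach, a document that would
otherwise rely on a DTD-defined entity such as @samp{&eacute;} can
use a character reference for the same character (LATIN SMALL LETTER
E WITH ACUTE, U+00E9), and then needs no DOCTYPE declaration. The
element name @samp{p} is only illustrative:

@example
<p>caf&#xE9;</p>
@end example

@noindent
Entering the character itself in place of @samp{&#xE9;} is
equivalent, provided the document's encoding can represent it.
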
894@node Limitations
895@chapter Limitations
896
897nXML mode has some limitations:
898
899@itemize @bullet
900@item
901DTD support is limited. Internal parsed general entities declared
902in the internal subset are supported provided they do not contain
903elements. Other usage of DTDs is ignored.
904@item
905The restrictions on RELAX NG schemas in section 7 of the RELAX NG
906specification are not enforced.
907@end itemize
908
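As a sketch of the first limitation, using hypothetical entity names,
the first declaration below is of the supported kind, while the
second contains an element and so falls outside it:

@example
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
  <!ENTITY warning "<b>Warning</b>">
]>
<doc>&product;</doc>
@end example
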
909@bye