\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode
@settitle nXML Mode
@c %**end of header

@copying
This manual documents nxml-mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007-2012
Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU
Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
license is included in the section entitled ``GNU Free Documentation
License'' in the Emacs manual.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''

This document is part of a collection distributed under the GNU Free
Documentation License. If you want to distribute this document
separately from the collection, you can do so by adding a copy of the
license to the document, as described in section 6 of the license.
@end quotation
@end copying

@dircategory Emacs editing modes
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
@end direntry

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Introduction::
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
@end menu

@node Introduction
@chapter Introduction

nXML mode is an Emacs major-mode for editing XML documents. It supports
editing well-formed XML documents, and provides schema-sensitive editing
using RELAX NG Compact Syntax. To get started, visit a file containing an
XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
put buffers in nXML mode if they have recognizable XML content or file
extensions. You may wish to customize the settings, for example to
recognize different file extensions.
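
As a minimal sketch of such a customization, the following line for an
Emacs init file associates a file extension with nXML mode. The
extension @samp{.myxml} is purely hypothetical; substitute whatever
extension your documents use.

@example
;; Open files ending in ".myxml" (a made-up extension) in nXML mode.
(add-to-list 'auto-mode-alist '("\\.myxml\\'" . nxml-mode))
@end example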

Once in nXML mode, you can type @kbd{C-h m} for basic information on the
mode.

The @file{etc/nxml} directory in the Emacs distribution contains some data
files used by nXML mode, and includes two files (@file{test-valid.xml} and
@file{test-invalid.xml}) that provide examples of valid and invalid XML
documents.

To get validation and schema-sensitive editing, you need a RELAX NG Compact
Syntax (RNC) schema for your document (@pxref{Locating a schema}). The
@file{etc/schema} directory includes some schemas for popular document
types. See @url{http://relaxng.org/} for more information on RELAX NG.
You can use the @samp{Trang} program from
@url{http://www.thaiopensource.com/relaxng/trang.html} to
automatically create RNC schemas. This program can:

@itemize @bullet
@item
infer an RNC schema from an instance document;
@item
convert a DTD to an RNC schema;
@item
convert a RELAX NG XML syntax schema to an RNC schema.
@end itemize

@noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
one, you can also use the XSLT stylesheet from
@url{http://www.pantor.com/download.html}.

To convert a W3C XML Schema to an RNC schema, you need first to convert it
to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
(built on top of MSV). See @url{https://github.com/kohsuke/msv}
and @url{https://msv.dev.java.net/}.

For historical discussions only, see the mailing list archives at
@url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please direct all new
discussions to the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
lists. Report any bugs with @kbd{M-x report-emacs-bug}.


@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that
nxml-mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

The traditional GNU Emacs key combination for completion in a
buffer is @kbd{M-@key{TAB}}. However, many window systems
and window managers use this key combination themselves (typically for
switching between windows) and do not pass it to applications. It's
hard to find key combinations in GNU Emacs that are both easy to type
and not taken by something else. @kbd{C-@key{RET}} (i.e.,
pressing the Enter or Return key, while the Ctrl key is held down) is
available. It won't be available on a traditional terminal (because
it is indistinguishable from Return), but it will work with a window
system. Therefore we adopt the following solution by default: use
@kbd{C-@key{RET}} when there's a window system and
@kbd{M-@key{TAB}} when there's not. In the following, I
will assume that a window system is being used and will therefore
refer to @kbd{C-@key{RET}}.

Completion works by examining the symbol preceding point. This
is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol to
be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are just @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-@key{RET}}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.
@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-@key{RET}} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example
@item
If there is more than one possible completion, but all
possible completions share a common non-empty prefix, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions
are @samp{xmlns} and @samp{xml:lang}. These share a
common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}}
will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-@key{RET}} again, which would
have the result described in the next item.
@item
If there is more than one possible completion, but the
possible completions do not share a non-empty prefix, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-@key{RET}} again, the namespace URI will
be inserted. Should that happen automatically?)
@end itemize
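
The common-prefix step above resembles what Emacs's built-in
@code{try-completion} function computes for a list of candidates. The
following is only an illustration of that idea using the attribute
names from the example; it is not a description of how nXML mode
computes completions internally.

@example
;; Longest common prefix of the candidates that start with "x".
(try-completion "x" '("xmlns" "xml:lang"))
     @result{} "xml"

;; No candidate starts with "z", so there is nothing to insert.
(try-completion "z" '("xmlns" "xml:lang"))
     @result{} nil
@end example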

@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nxml-mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-@key{RET}} after @samp{</}
to complete the rest of the end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.

Finally, you can customize nxml-mode so that @kbd{/}
automatically inserts the rest of the end-tag when it occurs after
@samp{<}, by doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.
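
Alternatively, assuming you prefer to set the option directly in your
init file rather than through the Customize interface, a line such as
the following should have the same effect:

@example
;; Make "/" typed after "<" insert the rest of the end-tag.
(setq nxml-slash-auto-complete-flag t)
@end example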

@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupies one or more lines,
then those lines will not be included in any paragraph.

A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.
@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.
@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
@item
Otherwise, it is a paragraph boundary.
@end itemize

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs' outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e., the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);
@item
its local name must match
@code{nxml-section-element-name-regexp};
@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.
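
For instance, here is a sketch of setting both variables from an init
file. The regexp values are illustrative assumptions for a document
type whose sections are @samp{chapter}, @samp{section} or @samp{div}
elements with @samp{title} headings; adapt them to your own schema.

@example
;; Local names of elements to treat as sections (example values).
(setq nxml-section-element-name-regexp "chapter\\|section\\|div")
;; Local names of elements to treat as headings (example values).
(setq nxml-heading-element-name-regexp "title")
@end example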

There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;
@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;
@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
  <-section>Delicious Food...</>
  <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
``current section'' means the section containing point, or, more precisely,
the innermost section containing the character immediately following
point.

@itemize @bullet
@item
@kbd{C-c C-o C-a} shows all sections in the buffer
normally.
@item
@kbd{C-c C-o C-t} hides the text content
of all sections in the buffer.
@item
@kbd{C-c C-o C-c} hides the text content
of the current section.
@item
@kbd{C-c C-o C-e} shows the text content
of the current section.
@item
@kbd{C-c C-o C-d} hides the text content
and subsections of the current section.
@item
@kbd{C-c C-o C-s} shows the current section
and all its direct and indirect subsections normally.
@item
@kbd{C-c C-o C-k} shows the headings of the
direct and indirect subsections of the current section.
@item
@kbd{C-c C-o C-l} hides the text content of the
current section and of its direct and indirect
subsections.
@item
@kbd{C-c C-o C-i} shows the headings of the
direct subsections of the current section.
@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
@end itemize

When a heading is displayed specially, you can use
@key{RET} in that heading to show the text content of the section
in the same way as @kbd{C-c C-o C-e}.

You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
displayed heading shows the text content of the section in the same
way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
displayed start-tag toggles the display of subheadings on and
off.

The outline state for each section is stored with the first
character of the section (as a text property). Every command that
changes the outline state of any section updates the display of the
buffer so that each section is displayed correctly according to its
outline state. If the section structure is subsequently changed, then
it is possible for the display to no longer correctly reflect the
stored outline state. @kbd{C-c C-o C-r} can be used to refresh
the display so it is correct again.

@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A filename specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if a relative file-name in
@samp{rng-schema-locating-files} names a file that does not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.
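
The same list can be extended from an init file. The sketch below
assumes that the variable is defined by the @samp{rng-loc} library and
uses a hypothetical file @file{~/schemas/schemas.xml}; because
@code{add-to-list} adds the new element at the front of the list, rules
from that file would take precedence over the existing entries.

@example
;; Run once the library assumed to define the variable is loaded.
(eval-after-load 'rng-loc
  '(add-to-list 'rng-schema-locating-files
                (expand-file-name "~/schemas/schemas.xml")))
@end example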

By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; Emacs will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.
608@node Schema locating files
609@section Schema locating files
610
611Each schema locating file specifies a list of rules. The rules
612from each file are appended in order. To locate a schema each rule is
613applied in turn until a rule matches. The first matching rule is then
614used to determine the schema.
615
616Schema locating files are designed to be useful for other
617applications that need to locate a schema for a document. In fact,
618there is nothing specific to locating schemas in the design; it could
619equally well be used for locating a stylesheet.
620
621@menu
867d4bb3
JB
622* Schema locating file syntax basics::
623* Using the document's URI to locate a schema::
624* Using the document element to locate a schema::
625* Using type identifiers in schema locating files::
626* Using multiple schema locating files::
8cd39fb3
MH
627@end menu
628
629@node Schema locating file syntax basics
630@subsection Schema locating file syntax basics
631
632There is a schema for schema locating files in the file
633@samp{locate.rnc} in the schema directory. Schema locating
634files must be valid with respect to this schema.
635
636The document element of a schema locating file must be
637@samp{locatingRules} and the namespace URI must be
638@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
639children of the document element specify rules. The order of the
640children is the same as the order of the rules. Here's a complete
641example of a schema locating file:
642
643@example
644<?xml version="1.0"?>
645<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
646 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
647 <documentElement localName="book" uri="docbook.rnc"/>
648</locatingRules>
649@end example
650
651@noindent
652This says to use the schema @samp{xhtml.rnc} for a document with
653namespace @samp{http://www.w3.org/1999/xhtml}, and to use the
654schema @samp{docbook.rnc} for a document whose local name is
655@samp{book}. If the document element had both a namespace URI
656of @samp{http://www.w3.org/1999/xhtml} and a local name of
657@samp{book}, then the matching rule that comes first will be
658used and so the schema @samp{xhtml.rnc} would be used. There is
659no precedence between different types of rule; the first matching rule
660of any type is used.
661
662As usual with XML-related technologies, resources are identified
663by URIs. The @samp{uri} attribute identifies the schema by
664specifying the URI. The URI may be relative. If so, it is resolved
665relative to the URI of the schema locating file that contains
666attribute. This means that if the value of @samp{uri} attribute
667does not contain a @samp{/}, then it will refer to a filename in
668the same directory as the schema locating file.

@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the
document. The @samp{uri} attribute specifies the URI of the
schema. The @samp{resource} attribute can be used to specify
the schema for a particular document. For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is
@samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document
whose URI matches a pattern. The pattern has the same syntax as an
absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e., any character other than @samp{/}).
Typically, the URI pattern looks like a relative URI, but, whereas a
relative URI in the @samp{resource} attribute is resolved into a
particular absolute URI using the base URI of the schema locating
file, a relative URI pattern matches if it matches some number of
complete path segments of the document's URI ending with the last path
segment of the document's URI. For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.

A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.
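
To make the effect of this particular rule concrete, the following
Emacs Lisp sketch mimics it for local file names. The function name
@code{my-guess-rnc-schema} is hypothetical, and the code is only an
approximation for illustration, not the way nXML mode evaluates
@samp{transformURI} rules.

@example
(defun my-guess-rnc-schema (file)
  "Return the .rnc file next to FILE (a .xml file) if it exists."
  ;; Rewrite a trailing ".xml" to ".rnc", like *.xml -> *.rnc.
  (let ((candidate (replace-regexp-in-string "\\.xml\\'" ".rnc" file)))
    ;; The rule matches only if FILE ended in ".xml" and the
    ;; transformed name identifies an existing file.
    (and (not (equal candidate file))
         (file-exists-p candidate)
         candidate)))
@end example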

@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.

A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.

@node Using type identifiers in schema locating files
@subsection Using type identifiers in schema locating files

Type identifiers allow a level of indirection in locating the
schema for a document. Instead of associating the document directly
with a schema URI, the document is associated with a type identifier,
which is in turn associated with a schema URI. nXML mode does not
constrain the format of type identifiers. They can be simply strings
without any formal structure or they can be public identifiers or
URIs. Note that these type identifiers have nothing to do with the
DOCTYPE declaration. When comparing type identifiers, whitespace is
normalized in the same way as with the @samp{xsd:token}
datatype: leading and trailing whitespace is stripped; other sequences
of whitespace are normalized to a single space character.
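
The normalization rule can be sketched in Emacs Lisp as follows. The
function name @code{my-normalize-type-id} is hypothetical and the code
only illustrates the rule stated above; it is not taken from nXML mode.

@example
(defun my-normalize-type-id (id)
  "Strip leading and trailing whitespace in ID and collapse runs."
  ;; Split on whitespace runs, dropping empty strings at the ends,
  ;; then rejoin the pieces with single spaces.
  (mapconcat #'identity (split-string id "[ \t\n\r]+" t) " "))

(my-normalize-type-id "  XHTML   Strict ")
     @result{} "XHTML Strict"
@end example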
776
777Each of the rules described in previous sections that uses a
778@samp{uri} attribute to specify a schema, can instead use a
779@samp{typeId} attribute to specify a type identifier. The type
780identifier can be associated with a URI using a @samp{typeId}
781element. For example,
782
783@example
784<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
785 <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
786 <typeId id="XHTML" typeId="XHTML Strict"/>
787 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
788 <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
789</locatingRules>
790@end example
791
792@noindent
793declares three type identifiers @samp{XHTML} (representing the
794default variant of XHTML to be used), @samp{XHTML Strict} and
795@samp{XHTML Transitional}. Such a schema locating file would
796use @samp{xhtml-strict.rnc} for a document whose namespace is
797@samp{http://www.w3.org/1999/xhtml}. But it is considerably
798more flexible than a schema locating file that simply specified
799
800@example
801<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
802@end example
803
804@noindent
805With type identifiers, a user can use @kbd{C-c C-s C-t} to select between
806XHTML Strict and XHTML Transitional. A user can also add a catalog
807
808@example
809<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
810 <typeId id="XHTML" typeId="XHTML Transitional"/>
811</locatingRules>
812@end example
813
814@noindent
815that makes the default variant of XHTML be XHTML Transitional.
816
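As a sketch of the whitespace normalization described at the start of
this section, the following hypothetical rules should select
@samp{xhtml-strict.rnc} for an XHTML document, since the
@samp{typeId} value on the @samp{namespace} rule normalizes to
@samp{XHTML Strict}. The extra spaces are deliberate and serve only
to illustrate the comparison.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
 <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example
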
817@node Using multiple schema locating files
818@subsection Using multiple schema locating files
819
820The @samp{include} element includes rules from another
821schema locating file. The behavior is exactly as if the rules from
822that file were included in place of the @samp{include} element.
823Relative URIs are resolved into absolute URIs before the inclusion is
824performed. For example,
825
826@example
827<include rules="../rules.xml"/>
828@end example
829
830@noindent
831includes the rules from @samp{rules.xml}.
832
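A minimal sketch of an inclusion placed among other rules follows.
The @samp{namespace} rule is only illustrative. Because included
rules behave as if they were written in place of the @samp{include}
element, the @samp{namespace} rule here is considered before any rule
from the included file.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 <include rules="../rules.xml"/>
</locatingRules>
@end example
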
833The process of locating a schema takes as input a list of schema
834locating files. The rules in all these files and in the files they
835include are resolved into a single list of rules, which are applied
836strictly in order. Sometimes this order is not what is needed.
837For example, suppose you have two schema locating files, a private
838file
839
840@example
841<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
842 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
843</locatingRules>
844@end example
845
846@noindent
847followed by a public file
848
849@example
850<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
851 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
852 <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
853</locatingRules>
854@end example
855
856@noindent
857The effect of these two files is that the XHTML @samp{namespace}
858rule takes precedence over the @samp{transformURI} rule, which
859is almost certainly not what is needed. This can be solved by adding
860an @samp{applyFollowingRules} element to the private file:
861
862@example
863<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
864 <applyFollowingRules ruleType="transformURI"/>
865 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
866</locatingRules>
867@end example
868
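On the assumption that @samp{applyFollowingRules} causes the
@samp{transformURI} rules occurring later in the list to be applied
at that point, the rules from the two files would in effect be
considered in roughly the following order. The listing is only a
sketch written as XML comments; it is not a file that nXML mode
reads.

@example
<!-- 1. transformURI rule from the public file,
        applied early because of applyFollowingRules -->
<!-- 2. XHTML namespace rule from the private file -->
<!-- 3. XSLT namespace rule from the public file -->
@end example
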
869@node DTDs
870@chapter DTDs
871
872nXML mode is designed to support the creation of standalone XML
873documents that do not depend on a DTD. Although it is common practice
874to insert a DOCTYPE declaration referencing an external DTD, this has
875undesirable side-effects. It means that the document is no longer
876self-contained. It also means that different XML parsers may interpret
877the document in different ways, since the XML Recommendation does not
878require XML parsers to read the DTD. With DTDs, it was impractical to
879get validation without using an external DTD or reference to a
880parameter entity. With RELAX NG and other schema languages, you can
881simultaneously get the benefits of validation and standalone XML
882documents. Therefore, I recommend that you do not reference an
883external DOCTYPE in your XML documents.
884
885One problem is entities for characters. Typically, as well as
886providing validation, DTDs also provide a set of character entities
887for documents to use. Schemas cannot provide this functionality,
888because schema validation happens after XML parsing. The recommended
889solution is to either use the Unicode characters directly, or, if this
890is impractical, use character references. nXML mode supports this by
891providing commands for entering characters and character references
892using the Unicode names, and can display the glyph corresponding to a
893character reference.
894
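For instance, the two fragments below are intended to produce the
same text. The element name @samp{p} is hypothetical. The first
fragment works only if a DTD declares the @samp{eacute} entity; the
second uses a character reference for U+00E9 (LATIN SMALL LETTER E
WITH ACUTE) and so needs no DTD. Typing the character itself in
place of the reference is equivalent.

@example
<!-- Relies on a DTD that declares the eacute entity -->
<p>caf&eacute;</p>

<!-- Standalone: character reference for U+00E9 -->
<p>caf&#xE9;</p>
@end example
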
895@node Limitations
896@chapter Limitations
897
898nXML mode has some limitations:
899
900@itemize @bullet
901@item
902DTD support is limited. Internal parsed general entities declared
903in the internal subset are supported provided they do not contain
904elements. Other usage of DTDs is ignored.
905@item
906The restrictions on RELAX NG schemas in section 7 of the RELAX NG
907specification are not enforced.
908@end itemize
909
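As a sketch of the first limitation, consider the following document.
The entity and element names are hypothetical. Going by the
description above, @samp{product} should be supported because its
replacement text is plain character data, whereas @samp{note} falls
outside the supported usage because its replacement text contains an
element.

@example
<!DOCTYPE doc [
<!ENTITY product "nXML mode">
<!ENTITY note "<b>important</b>">
]>
<doc>&product; &note;</doc>
@end example
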
910@bye