\input texinfo @c -*- texinfo -*-
@c %**start of header
@setfilename ../../info/nxml-mode
@settitle nXML Mode
@c %**end of header

@copying
This manual documents nxml-mode, an Emacs major mode for editing
XML with RELAX NG support.

Copyright @copyright{} 2007, 2008, 2009, 2010 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU
Manual,'' and with the Back-Cover Texts as in (a) below. A copy of the
license is included in the section entitled ``GNU Free Documentation
License'' in the Emacs manual.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''

This document is part of a collection distributed under the GNU Free
Documentation License. If you want to distribute this document
separately from the collection, you can do so by adding a copy of the
license to the document, as described in section 6 of the license.
@end quotation
@end copying

@dircategory Emacs
@direntry
* nXML Mode: (nxml-mode). XML editing mode with RELAX NG support.
@end direntry

@node Top
@top nXML Mode

@insertcopying

This manual is not yet complete.

@menu
* Completion::
* Inserting end-tags::
* Paragraphs::
* Outlining::
* Locating a schema::
* DTDs::
* Limitations::
@end menu

@node Completion
@chapter Completion

Apart from real-time validation, the most important feature that
nxml-mode provides for assisting in document creation is ``completion''.
Completion assists the user in inserting characters at point, based on
knowledge of the schema and on the contents of the buffer before
point.

The traditional GNU Emacs key combination for completion in a
buffer is @kbd{M-@key{TAB}}. However, many window systems
and window managers use this key combination themselves (typically for
switching between windows) and do not pass it to applications. It's
hard to find key combinations in GNU Emacs that are both easy to type
and not taken by something else. @kbd{C-@key{RET}} (i.e.
pressing the Enter or Return key, while the Ctrl key is held down) is
available. It won't be available on a traditional terminal (because
it is indistinguishable from Return), but it will work with a window
system. Therefore we adopt the following solution by default: use
@kbd{C-@key{RET}} when there's a window system and
@kbd{M-@key{TAB}} when there's not. In the following, I
will assume that a window system is being used and will therefore
refer to @kbd{C-@key{RET}}.

Completion works by examining the symbol preceding point. This
is the symbol to be completed. The symbol to be completed may be
empty. Completion considers what symbols starting with the symbol to
be completed would be valid replacements for the symbol to be
completed, given the schema and the contents of the buffer before
point. These symbols are the possible completions. An example may
make this clearer. Suppose the buffer looks like this (where @point{}
indicates point):

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<h@point{}
@end example

@noindent
and the schema is XHTML. In this context, the symbol to be completed
is @samp{h}. The possible completions consist of just
@samp{head}. Another example is:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are @samp{base}, @samp{isindex},
@samp{link}, @samp{meta}, @samp{script},
@samp{style}, @samp{title}. Another example is:

@example
<html xmlns="@point{}
@end example

@noindent
In this case, the symbol to be completed is empty, and the possible
completions are just @samp{http://www.w3.org/1999/xhtml}.

When you type @kbd{C-@key{RET}}, what happens depends
on what the set of possible completions is.

@itemize @bullet
@item
If the set of completions is empty, nothing
happens.
@item
If there is one possible completion, then that completion is
inserted, together with any following characters that are
required. For example, in this case:

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<@point{}
@end example

@noindent
@kbd{C-@key{RET}} will yield

@example
<html xmlns="http://www.w3.org/1999/xhtml">
<head@point{}
@end example
@item
If there is more than one possible completion, but all
possible completions share a common non-empty prefix, then that prefix
is inserted. For example, suppose the buffer is:

@example
<html x@point{}
@end example

@noindent
The symbol to be completed is @samp{x}. The possible completions
are @samp{xmlns} and @samp{xml:lang}. These share a
common prefix of @samp{xml}. Thus, @kbd{C-@key{RET}}
will yield:

@example
<html xml@point{}
@end example

@noindent
Typically, you would do @kbd{C-@key{RET}} again, which would
have the result described in the next item.
@item
If there is more than one possible completion, but the
possible completions share no common prefix beyond the symbol to be
completed, then Emacs will
prompt you to input the symbol in the minibuffer, initializing the
minibuffer with the symbol to be completed, and popping up a buffer
showing the possible completions. You can now input the symbol to be
inserted. The symbol you input will be inserted in the buffer instead
of the symbol to be completed. Emacs will then insert any required
characters after the symbol. For example, if the buffer contains:

@example
<html xml@point{}
@end example

@noindent
Emacs will prompt you in the minibuffer with

@example
Attribute: xml@point{}
@end example

@noindent
and the buffer showing possible completions will contain

@example
Possible completions are:
xml:lang xmlns
@end example

@noindent
If you input @kbd{xmlns}, the result will be:

@example
<html xmlns="@point{}
@end example

@noindent
(If you do @kbd{C-@key{RET}} again, the namespace URI will
be inserted. Should that happen automatically?)
@end itemize

@node Inserting end-tags
@chapter Inserting end-tags

The main redundancy in XML syntax is end-tags. nxml-mode provides
several ways to make it easier to enter end-tags. You can use all of
these without a schema.

You can use @kbd{C-@key{RET}} after @samp{</}
to complete the rest of the end-tag.

@kbd{C-c C-f} inserts an end-tag for the element containing
point. This command is useful when you want to input the start-tag,
then input the content and finally input the end-tag. The @samp{f}
is mnemonic for finish.

If you want to keep tags balanced and input the end-tag at the
same time as the start-tag, before inputting the content, then you can
use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
the end-tag and leaves point before the end-tag. @kbd{C-c C-b}
is similar but more convenient for block-level elements: it puts the
start-tag, point and the end-tag on successive lines, appropriately
indented. The @samp{i} is mnemonic for inline and the
@samp{b} is mnemonic for block.
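
As a rough illustration of the two balanced-insertion commands (the
element names @samp{em} and @samp{div} are only examples, and the exact
indentation depends on your settings), typing @kbd{C-c C-i} with the
buffer containing

@example
<em@point{}
@end example

@noindent
should leave something like

@example
<em>@point{}</em>
@end example

@noindent
whereas typing @kbd{C-c C-b} with the buffer containing

@example
<div@point{}
@end example

@noindent
should leave something like

@example
<div>
  @point{}
</div>
@end example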

Finally, you can customize nxml-mode so that @kbd{/}
automatically inserts the rest of the end-tag when it occurs after
@samp{<}, by doing

@display
@kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
@end display

@noindent
and then following the instructions in the displayed buffer.
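
For instance, assuming the flag has been turned on and the buffer
contains the following (the @samp{title} element is just an
illustration):

@example
<title>Food<@point{}
@end example

@noindent
typing @kbd{/} would be expected to complete the end-tag, giving:

@example
<title>Food</title>
@end example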

@node Paragraphs
@chapter Paragraphs

Emacs has several commands that operate on paragraphs, most
notably @kbd{M-q}. nXML mode redefines these to work in a way
that is useful for XML. The exact rules that are used to find the
beginning and end of a paragraph are complicated; they are designed
mainly to ensure that @kbd{M-q} does the right thing.

A paragraph consists of one or more complete, consecutive lines.
A group of lines is not considered a paragraph unless it contains some
non-whitespace characters between tags or inside comments. A blank
line separates paragraphs. A single tag on a line by itself also
separates paragraphs. More precisely, if one tag together with any
leading and trailing whitespace completely occupies one or more lines,
then those lines will not be included in any paragraph.

A start-tag at the beginning of the line (possibly indented) may
be treated as starting a paragraph. Similarly, an end-tag at the end
of the line may be treated as ending a paragraph. The following rules
are used to determine whether such a tag is in fact treated as a
paragraph boundary:

@itemize @bullet
@item
If the schema does not allow text at that point, then it
is a paragraph boundary.
@item
If the end-tag corresponding to the start-tag is not at
the end of its line, or the start-tag corresponding to the end-tag is
not at the beginning of its line, then it is not a paragraph
boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized</emph> phrase.
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, because its corresponding end-tag is not at the
end of the line.
@item
If there is text that is a sibling in the element tree, then
it is not a paragraph boundary. For example, in

@example
<p>This is a paragraph with an
<emph>emphasized phrase that takes one source line</emph>
@end example

@noindent
the @samp{<emph>} start-tag would not be considered as
starting a paragraph, even though its end-tag is at the end of its
line, because the text @samp{This is a paragraph with an}
is a sibling of the @samp{emph} element.
@item
Otherwise, it is a paragraph boundary.
@end itemize
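
As a sketch of how these rules combine, consider the following fragment
(the element names are hypothetical, and it is assumed that the schema
does not allow text directly inside @samp{section}):

@example
<section>
<p>This is the first paragraph, which
spans two source lines.</p>
<p>This is the second paragraph.</p>
</section>
@end example

@noindent
Under the rules above, the lines holding only the @samp{<section>} and
@samp{</section>} tags would belong to no paragraph, and each
@samp{p} element would form its own paragraph, so @kbd{M-q} in the
first one should refill only its two lines.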

@node Outlining
@chapter Outlining

nXML mode allows you to display all or part of a buffer as an
outline, in a similar way to Emacs' outline mode. An outline in nXML
mode is based on recognizing two kinds of element: sections and
headings. There is one heading for every section and one section for
every heading. A section contains its heading as or within its first
child element. A section also contains its subordinate sections (its
subsections). The text content of a section consists of anything in a
section that is neither a subsection nor a heading.

Note that this is a different model from that used by XHTML.
nXML mode's outline support will not be useful for XHTML unless you
adopt a convention of adding a @code{div} to enclose each
section, rather than having sections implicitly delimited by different
@code{h@var{n}} elements. This limitation may be removed
in a future version.
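
A minimal sketch of that convention is shown here. It assumes that
the section and heading regexps described next have been set so that
@code{div} counts as a section element and @code{h1}, @code{h2}, and so
on count as heading elements; the headings and text are made up for
illustration.

@example
<div>
<h1>Food</h1>
<p>There are many kinds of food.</p>
<div>
<h2>Delicious Food</h2>
<p>Some food is delicious.</p>
</div>
</div>
@end example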

The variable @code{nxml-section-element-name-regexp} gives
a regexp for the local names (i.e. the part of the name following any
prefix) of section elements. The variable
@code{nxml-heading-element-name-regexp} gives a regexp for the
local names of heading elements. For an element to be recognized
as a section

@itemize @bullet
@item
its start-tag must occur at the beginning of a line
(possibly indented);
@item
its local name must match
@code{nxml-section-element-name-regexp};
@item
either its first child element or a descendant of that
first child element must have a local name that matches
@code{nxml-heading-element-name-regexp}; the first such element
is treated as the section's heading.
@end itemize

@noindent
You can customize these variables using @kbd{M-x
customize-variable}.

There are three possible outline states for a section:

@itemize @bullet
@item
normal, showing everything, including its heading, text
content and subsections; each subsection is displayed according to the
state of that subsection;
@item
showing just its heading, with both its text content and
its subsections hidden; all subsections are hidden regardless of their
state;
@item
showing its heading and its subsections, with its text
content hidden; each subsection is displayed according to the state of
that subsection.
@end itemize

In the last two states, where the text content is hidden, the
heading is displayed specially, in an abbreviated form. An element
like this:

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
</section>
@end example

@noindent
would be displayed on a single line like this:

@example
<-section>Food...</>
@end example

@noindent
If there are hidden subsections, then a @code{+} will be used
instead of a @code{-} like this:

@example
<+section>Food...</>
@end example

@noindent
If there are non-hidden subsections, then the section will instead be
displayed like this:

@example
<-section>Food...
 <-section>Delicious Food...</>
 <-section>Distasteful Food...</>
</-section>
@end example

@noindent
The heading is always displayed with an indent that corresponds to its
depth in the outline, even if it is not actually indented in the buffer.
The variable @code{nxml-outline-child-indent} controls how much
a subheading is indented with respect to its parent heading when the
heading is being displayed specially.
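
For reference, a buffer along the following lines could give rise to
the last display above, assuming that @samp{section} matches the
section regexp and @samp{title} matches the heading regexp (the
paragraph text is invented for illustration):

@example
<section>
<title>Food</title>
<para>There are many kinds of food.</para>
<section>
<title>Delicious Food</title>
<para>Some food is delicious.</para>
</section>
<section>
<title>Distasteful Food</title>
<para>Some food is not.</para>
</section>
</section>
@end example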

Commands to change the outline state of sections are bound to
key sequences that start with @kbd{C-c C-o} (@kbd{o} is
mnemonic for outline). The third and final key has been chosen to be
consistent with outline mode. In the following descriptions,
``current section'' means the section containing point, or, more precisely,
the innermost section containing the character immediately following
point.

@itemize @bullet
@item
@kbd{C-c C-o C-a} shows all sections in the buffer
normally.
@item
@kbd{C-c C-o C-t} hides the text content
of all sections in the buffer.
@item
@kbd{C-c C-o C-c} hides the text content
of the current section.
@item
@kbd{C-c C-o C-e} shows the text content
of the current section.
@item
@kbd{C-c C-o C-d} hides the text content
and subsections of the current section.
@item
@kbd{C-c C-o C-s} shows the current section
and all its direct and indirect subsections normally.
@item
@kbd{C-c C-o C-k} shows the headings of the
direct and indirect subsections of the current section.
@item
@kbd{C-c C-o C-l} hides the text content of the
current section and of its direct and indirect
subsections.
@item
@kbd{C-c C-o C-i} shows the headings of the
direct subsections of the current section.
@item
@kbd{C-c C-o C-o} hides as much as possible without
hiding the current section's text content; the headings of ancestor
sections of the current section and their child sections will
not be hidden.
@end itemize

When a heading is displayed specially, you can use
@key{RET} in that heading to show the text content of the section
in the same way as @kbd{C-c C-o C-e}.

You can also use the mouse to change the outline state:
@kbd{S-mouse-2} hides the text content of a section in the same
way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
displayed heading shows the text content of the section in the same
way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
displayed start-tag toggles the display of subheadings on and
off.

The outline state for each section is stored with the first
character of the section (as a text property). Every command that
changes the outline state of any section updates the display of the
buffer so that each section is displayed correctly according to its
outline state. If the section structure is subsequently changed, then
it is possible for the display to no longer correctly reflect the
stored outline state. @kbd{C-c C-o C-r} can be used to refresh
the display so it is correct again.

@node Locating a schema
@chapter Locating a schema

nXML mode has a configurable set of rules to locate a schema for
the file being edited. The rules are contained in one or more schema
locating files, which are XML documents.

The variable @samp{rng-schema-locating-files} specifies
the list of the file-names of schema locating files that nXML mode
should use. The order of the list is significant: when file
@var{x} occurs in the list before file @var{y} then rules
from file @var{x} have precedence over rules from file
@var{y}. A file-name specified in
@samp{rng-schema-locating-files} may be relative. If so, it will
be resolved relative to the document for which a schema is being
located. It is not an error if relative file-names in
@samp{rng-schema-locating-files} do not exist. You can use
@kbd{M-x customize-variable @key{RET} rng-schema-locating-files
@key{RET}} to customize the list of schema locating
files.

By default, the @samp{rng-schema-locating-files} list has two
members: @samp{schemas.xml}, and
@samp{@var{dist-dir}/schema/schemas.xml} where
@samp{@var{dist-dir}} is the directory containing the nXML
distribution. The first member will cause nXML mode to use a file
@samp{schemas.xml} in the same directory as the document being
edited if such a file exists. The second member contains rules for the
schemas that are included with the nXML distribution.

@menu
* Commands for locating a schema::
* Schema locating files::
@end menu

@node Commands for locating a schema
@section Commands for locating a schema

The command @kbd{C-c C-s C-w} will tell you what schema
is currently being used.

The rules for locating a schema are applied automatically when
you visit a file in nXML mode. However, if you have just created a new
file and the schema cannot be inferred from the file-name, then this
will not locate the right schema. In this case, you should insert the
start-tag of the root element and then use the command @kbd{C-c C-s
C-a}, which reapplies the rules based on the current content of
the document. It is usually not necessary to insert the complete
start-tag; often just @samp{<@var{name}} is
enough.

If you want to use a schema that has not yet been added to the
schema locating files, you can use the command @kbd{C-c C-s C-f}
to manually select the file containing the schema for the document in
the current buffer. Emacs will read the file-name of the schema from the
minibuffer. After reading the file-name, Emacs will ask whether you
wish to add a rule to a schema locating file that persistently
associates the document with the selected schema. The rule will be
added to the first file in the list specified by
@samp{rng-schema-locating-files}; Emacs will create the file if
necessary, but will not create a directory. If the variable
@samp{rng-schema-locating-files} has not been customized, this
means that the rule will be added to the file @samp{schemas.xml}
in the same directory as the document being edited.
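
The exact rule that Emacs writes is not specified here, but as a
sketch, a @samp{schemas.xml} file associating a hypothetical document
@samp{report.xml} with a hypothetical schema @samp{report.rnc} in the
same directory could look like this, using the @samp{uri} rule
described later in this chapter:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="report.xml" uri="report.rnc"/>
</locatingRules>
@end example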

The command @kbd{C-c C-s C-t} allows you to select a schema by
specifying an identifier for the type of the document. The schema
locating files determine the available type identifiers and what
schema is used for each type identifier. This is useful when it is
impossible to infer the right schema from either the file-name or the
content of the document, even though the schema is already in the
schema locating file. A situation in which this can occur is when
there are multiple variants of a schema where all valid documents have
the same document element. For example, XHTML has Strict and
Transitional variants. In a situation like this, a schema locating file
can define a type identifier for each variant. As with @kbd{C-c
C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
locating file that persistently associates the document with the
specified type identifier.

The command @kbd{C-c C-s C-l} adds a rule to a schema
locating file that persistently associates the document with
the schema that is currently being used.

@node Schema locating files
@section Schema locating files

Each schema locating file specifies a list of rules. The rules
from each file are appended in order. To locate a schema, each rule is
applied in turn until a rule matches. The first matching rule is then
used to determine the schema.

Schema locating files are designed to be useful for other
applications that need to locate a schema for a document. In fact,
there is nothing specific to locating schemas in the design; it could
equally well be used for locating a stylesheet.

@menu
* Schema locating file syntax basics::
* Using the document's URI to locate a schema::
* Using the document element to locate a schema::
* Using type identifiers in schema locating files::
* Using multiple schema locating files::
@end menu

@node Schema locating file syntax basics
@subsection Schema locating file syntax basics

There is a schema for schema locating files in the file
@samp{locate.rnc} in the schema directory. Schema locating
files must be valid with respect to this schema.

The document element of a schema locating file must be
@samp{locatingRules} and the namespace URI must be
@samp{http://thaiopensource.com/ns/locating-rules/1.0}. The
children of the document element specify rules. The order of the
children is the same as the order of the rules. Here's a complete
example of a schema locating file:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example

@noindent
This says to use the schema @samp{xhtml.rnc} for a document whose
document element has the namespace URI
@samp{http://www.w3.org/1999/xhtml}, and to use the
schema @samp{docbook.rnc} for a document whose document element has
the local name @samp{book}. If the document element had both a namespace URI
of @samp{http://www.w3.org/1999/xhtml} and a local name of
@samp{book}, then the matching rule that comes first would be
used and so the schema @samp{xhtml.rnc} would be used. There is
no precedence between different types of rule; the first matching rule
of any type is used.

As usual with XML-related technologies, resources are identified
by URIs. The @samp{uri} attribute identifies the schema by
specifying the URI. The URI may be relative. If so, it is resolved
relative to the URI of the schema locating file that contains the
attribute. This means that if the value of the @samp{uri} attribute
does not contain a @samp{/}, then it will refer to a filename in
the same directory as the schema locating file.
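
To make the resolution rule concrete, suppose (purely for illustration)
that the following schema locating file is located at
@samp{file:///home/jjc/docs/schemas.xml}:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- No slash: resolves to file:///home/jjc/docs/xhtml.rnc -->
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- Relative path: resolves to file:///home/jjc/schema/docbook.rnc -->
  <documentElement localName="book" uri="../schema/docbook.rnc"/>
</locatingRules>
@end example

@noindent
The locations in the comments follow from ordinary relative URI
resolution; the file and directory names are hypothetical.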

@node Using the document's URI to locate a schema
@subsection Using the document's URI to locate a schema

A @samp{uri} rule locates a schema based on the URI of the
document. The @samp{uri} attribute specifies the URI of the
schema. The @samp{resource} attribute can be used to specify
the schema for a particular document. For example,

@example
<uri resource="spec.xml" uri="docbook.rnc"/>
@end example

@noindent
specifies that the schema for @samp{spec.xml} is
@samp{docbook.rnc}.

The @samp{pattern} attribute can be used instead of the
@samp{resource} attribute to specify the schema for any document
whose URI matches a pattern. The pattern has the same syntax as an
absolute or relative URI except that the path component of the URI can
use a @samp{*} character to stand for zero or more characters
within a path segment (i.e. any character other than @samp{/}).
Typically, the URI pattern looks like a relative URI, but, whereas a
relative URI in the @samp{resource} attribute is resolved into a
particular absolute URI using the base URI of the schema locating
file, a relative URI pattern matches if it matches some number of
complete path segments of the document's URI ending with the last path
segment of the document's URI. For example,

@example
<uri pattern="*.xsl" uri="xslt.rnc"/>
@end example

@noindent
specifies that the schema for documents with a URI whose path ends
with @samp{.xsl} is @samp{xslt.rnc}.
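
A pattern may also span more than one path segment. As a sketch based
on the matching rule just described (the directory name
@samp{chapters} is hypothetical):

@example
<uri pattern="chapters/*.xml" uri="docbook.rnc"/>
@end example

@noindent
would be expected to match a document such as
@samp{file:///home/jjc/book/chapters/intro.xml}, whose last two path
segments are @samp{chapters} and @samp{intro.xml}, but not
@samp{file:///home/jjc/book/intro.xml}.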

A @samp{transformURI} rule locates a schema by
transforming the URI of the document. The @samp{fromPattern}
attribute specifies a URI pattern with the same meaning as the
@samp{pattern} attribute of the @samp{uri} element. The
@samp{toPattern} attribute is a URI pattern that is used to
generate the URI of the schema. Each @samp{*} in the
@samp{toPattern} is replaced by the string that matched the
corresponding @samp{*} in the @samp{fromPattern}. The
resulting string is appended to the initial part of the document's URI
that was not explicitly matched by the @samp{fromPattern}. The
rule matches only if the transformed URI identifies an existing
resource. For example, the rule

@example
<transformURI fromPattern="*.xml" toPattern="*.rnc"/>
@end example

@noindent
would transform the URI @samp{file:///home/jjc/docs/spec.xml}
into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this
rule specifies that to locate a schema for a document
@samp{@var{foo}.xml}, Emacs should test whether a file
@samp{@var{foo}.rnc} exists in the same directory as
@samp{@var{foo}.xml}, and, if so, should use it as the
schema.
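
Following the same description, the @samp{toPattern} can place the
schema in a different directory. In this sketch the @samp{schema}
subdirectory is hypothetical:

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
For the document @samp{file:///home/jjc/docs/spec.xml}, the
@samp{*} matches @samp{spec}, so the generated string
@samp{schema/spec.rnc} would be appended to
@samp{file:///home/jjc/docs/}, giving
@samp{file:///home/jjc/docs/schema/spec.rnc}; the rule would match
only if that file exists.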

@node Using the document element to locate a schema
@subsection Using the document element to locate a schema

A @samp{documentElement} rule locates a schema based on
the local name and prefix of the document element. For example, a rule

@example
<documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the name of the document element is
@samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
as the schema. Either the @samp{prefix} or
@samp{localName} attribute may be omitted to allow any prefix or
local name.
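
As a sketch, the following two rules (reusing the schema names from
the examples above) each omit one of the attributes:

@example
<!-- Any prefix, local name "book" -->
<documentElement localName="book" uri="docbook.rnc"/>
<!-- Prefix "xsl", any local name -->
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example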

A @samp{namespace} rule locates a schema based on the
namespace URI of the document element. For example, a rule

@example
<namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
@end example

@noindent
specifies that when the namespace URI of the document element is
@samp{http://www.w3.org/1999/XSL/Transform}, then
@samp{xslt.rnc} should be used as the schema.
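
To illustrate, a document that begins with the following start-tag has
a document element whose prefix is @samp{xsl}, whose local name is
@samp{stylesheet} and whose namespace URI is
@samp{http://www.w3.org/1999/XSL/Transform}, so it would match both
the @samp{documentElement} rule and the @samp{namespace} rule shown
above (the @samp{version} attribute is incidental):

@example
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
@end example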

@node Using type identifiers in schema locating files
@subsection Using type identifiers in schema locating files

Type identifiers allow a level of indirection in locating the
schema for a document. Instead of associating the document directly
with a schema URI, the document is associated with a type identifier,
which is in turn associated with a schema URI. nXML mode does not
constrain the format of type identifiers. They can be simply strings
without any formal structure or they can be public identifiers or
URIs. Note that these type identifiers have nothing to do with the
DOCTYPE declaration. When comparing type identifiers, whitespace is
normalized in the same way as with the @samp{xsd:token}
datatype: leading and trailing whitespace is stripped; other sequences
of whitespace are normalized to a single space character.
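
As a sketch of that normalization, using the @samp{typeId} attribute
and element introduced below, the two identifiers here differ only in
whitespace and so would be treated as the same type identifier:

@example
<namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
<typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
@end example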

Each of the rules described in previous sections that uses a
@samp{uri} attribute to specify a schema can instead use a
@samp{typeId} attribute to specify a type identifier. The type
identifier can be associated with a URI using a @samp{typeId}
element. For example,

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
  <typeId id="XHTML" typeId="XHTML Strict"/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
  <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
</locatingRules>
@end example

@noindent
declares three type identifiers: @samp{XHTML} (representing the
default variant of XHTML to be used), @samp{XHTML Strict} and
@samp{XHTML Transitional}. Such a schema locating file would
use @samp{xhtml-strict.rnc} for a document whose namespace is
@samp{http://www.w3.org/1999/xhtml}. But it is considerably
more flexible than a schema locating file that simply specified:

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
@end example

@noindent
A user can easily use @kbd{C-c C-s C-t} to select between XHTML
Strict and XHTML Transitional. Also, a user can easily add a catalog

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <typeId id="XHTML" typeId="XHTML Transitional"/>
</locatingRules>
@end example

@noindent
that makes the default variant of XHTML be XHTML Transitional.

@node Using multiple schema locating files
@subsection Using multiple schema locating files

The @samp{include} element includes rules from another
schema locating file. The behavior is exactly as if the rules from
that file were included in place of the @samp{include} element.
Relative URIs are resolved into absolute URIs before the inclusion is
performed. For example,

@example
<include rules="../rules.xml"/>
@end example

@noindent
includes the rules from @samp{rules.xml}.
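
Because included rules take the place of the @samp{include} element,
their position relative to the other rules matters. In the following
sketch (the @samp{documentElement} rule is only an illustration), every
rule from @samp{../rules.xml} would be tried before the
@samp{documentElement} rule:

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <include rules="../rules.xml"/>
  <documentElement localName="book" uri="docbook.rnc"/>
</locatingRules>
@end example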

The process of locating a schema takes as input a list of schema
locating files. The rules in all these files and in the files they
include are resolved into a single list of rules, which are applied
strictly in order. Sometimes this order is not what is needed.
For example, suppose you have two schema locating files, a private
file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
followed by a public file

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example

@noindent
The effect of these two files is that the XHTML @samp{namespace}
rule takes precedence over the @samp{transformURI} rule, which
is almost certainly not what is needed. This can be solved by adding
an @samp{applyFollowingRules} element to the private file:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <applyFollowingRules ruleType="transformURI"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example
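
Assuming that @samp{applyFollowingRules} applies, at that point, the
rules of the given type that occur later in the combined list, the
rules from the two files would then be tried in roughly this order.
This is a sketch of the effective ordering, written with the
@samp{fromPattern} and @samp{toPattern} attributes described earlier,
not a file you would write yourself:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- applied early because of applyFollowingRules -->
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- remaining rules from the public file -->
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example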

@node DTDs
@chapter DTDs

nxml-mode is designed to support the creation of standalone XML
documents that do not depend on a DTD. Although it is common practice
to insert a DOCTYPE declaration referencing an external DTD, this has
undesirable side-effects. It means that the document is no longer
self-contained. It also means that different XML parsers may interpret
the document in different ways, since the XML Recommendation does not
require XML parsers to read the DTD. With DTDs, it was impractical to
get validation without using an external DTD or reference to a
parameter entity. With RELAX NG and other schema languages, you can
simultaneously get the benefits of validation and standalone XML
documents. Therefore, I recommend that you do not reference an
external DTD in your XML documents.

One problem is entities for characters. Typically, as well as
providing validation, DTDs also provide a set of character entities
for documents to use. Schemas cannot provide this functionality,
because schema validation happens after XML parsing. The recommended
solution is to either use the Unicode characters directly, or, if this
is impractical, use character references. nXML mode supports this by
providing commands for entering characters and character references
using the Unicode names, and can display the glyph corresponding to a
character reference.
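
For example, a document that would otherwise rely on a DTD-defined
entity such as @samp{&eacute;} can use a numeric character reference
for the same character, U+00E9 (LATIN SMALL LETTER E WITH ACUTE),
which needs no DTD. The @samp{p} element is just an illustration:

@example
<p>caf&#xE9;</p>
@end example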

@node Limitations
@chapter Limitations

nXML mode has some limitations:

@itemize @bullet
@item
DTD support is limited. Internal parsed general entities declared
in the internal subset are supported provided they do not contain
elements. Other usage of DTDs is ignored.
@item
The restrictions on RELAX NG schemas in section 7 of the RELAX NG
specification are not enforced.
@end itemize
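
As an illustration of the first limitation, in the following sketch
(with hypothetical element and entity names) the @samp{product} entity
is an internal parsed general entity without elements and so falls
within the supported case, whereas the @samp{note} entity contains an
element and does not:

@example
<?xml version="1.0"?>
<!DOCTYPE doc [
<!ENTITY product "nXML mode">
<!ENTITY note "<emph>important</emph>">
]>
<doc>&product;</doc>
@end example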

@bye

@ignore
 arch-tag: 3b6e8ac2-ae8d-4f38-bd43-ce9f80be04d6
@end ignore