Commit | Line | Data |
---|---|---|
8cd39fb3 MH |
1 | \input texinfo @c -*- texinfo -*- |
2 | @c %**start of header | |
ac97a16b | 3 | @setfilename ../../info/nxml-mode |
8cd39fb3 | 4 | @settitle nXML Mode |
c6ab4664 | 5 | @documentencoding UTF-8 |
8cd39fb3 MH |
6 | @c %**end of header |
7 | ||
20234d96 | 8 | @copying |
3d439cd1 | 9 | This manual documents nXML mode, an Emacs major mode for editing |
867d4bb3 | 10 | XML with RELAX NG support. |
20234d96 | 11 | |
6bc383b1 | 12 | Copyright @copyright{} 2007--2014 Free Software Foundation, Inc. |
20234d96 GM |
13 | |
14 | @quotation | |
15 | Permission is granted to copy, distribute and/or modify this document | |
6a2c4aec | 16 | under the terms of the GNU Free Documentation License, Version 1.3 or |
20234d96 | 17 | any later version published by the Free Software Foundation; with no |
0b1af106 GM |
18 | Invariant Sections, with the Front-Cover texts being ``A GNU Manual,'' |
19 | and with the Back-Cover Texts as in (a) below. A copy of the license | |
20 | is included in the section entitled ``GNU Free Documentation License''. | |
20234d96 | 21 | |
6f093307 | 22 | (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and |
6bf430d1 | 23 | modify this GNU manual.'' |
20234d96 GM |
24 | @end quotation |
25 | @end copying | |
26 | ||
0c973505 | 27 | @dircategory Emacs editing modes |
8cd39fb3 | 28 | @direntry |
7aa579d9 | 29 | * nXML Mode: (nxml-mode). XML editing mode with RELAX NG support. |
8cd39fb3 MH |
30 | @end direntry |
31 | ||
4a970ff5 GM |
32 | |
33 | @titlepage | |
34 | @title nXML mode | |
35 | @page | |
36 | @vskip 0pt plus 1filll | |
37 | @insertcopying | |
38 | @end titlepage | |
39 | ||
40 | @contents | |
41 | ||
42 | ||
8cd39fb3 MH |
43 | @node Top |
44 | @top nXML Mode | |
45 | ||
5dc584b5 KB |
46 | @insertcopying |
47 | ||
48 | This manual is not yet complete. | |
8cd39fb3 MH |
49 | |
50 | @menu | |
d3dfb185 | 51 | * Introduction:: |
867d4bb3 JB |
52 | * Completion:: |
53 | * Inserting end-tags:: | |
54 | * Paragraphs:: | |
55 | * Outlining:: | |
56 | * Locating a schema:: | |
57 | * DTDs:: | |
58 | * Limitations:: | |
0b1af106 | 59 | * GNU Free Documentation License:: The license for this documentation. |
8cd39fb3 MH |
60 | @end menu |
61 | ||
d3dfb185 GM |
62 | @node Introduction |
63 | @chapter Introduction | |
64 | ||
65 | nXML mode is an Emacs major mode for editing XML documents. It supports | |
66 | editing well-formed XML documents, and provides schema-sensitive editing | |
67 | using RELAX NG Compact Syntax. To get started, visit a file containing an | |
68 | XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML | |
69 | mode. By default, @code{auto-mode-alist} and @code{magic-fallback-alist} | |
70 | put buffers in nXML mode if they have recognizable XML content or file | |
71 | extensions. You may wish to customize the settings, for example to | |
72 | recognize different file extensions. | |
73 | ||
74 | Once in nXML mode, you can type @kbd{C-h m} for basic information on the | |
75 | mode. | |
76 | ||
77 | The @file{etc/nxml} directory in the Emacs distribution contains some data | |
b218c6cd EW |
78 | files used by nXML mode, and includes two files (@file{test-valid.xml} and |
79 | @file{test-invalid.xml}) that provide examples of valid and invalid XML | |
d3dfb185 GM |
80 | documents. |
81 | ||
82 | To get validation and schema-sensitive editing, you need a RELAX NG Compact | |
83 | Syntax (RNC) schema for your document (@pxref{Locating a schema}). The | |
84 | @file{etc/schema} directory includes some schemas for popular document | |
1df7defd | 85 | types. See @url{http://relaxng.org/} for more information on RELAX NG@. |
d3dfb185 GM |
86 | You can use the @samp{Trang} program from |
87 | @url{http://www.thaiopensource.com/relaxng/trang.html} to | |
88 | automatically create RNC schemas. This program can: | |
89 | ||
90 | @itemize @bullet | |
91 | @item | |
92 | infer an RNC schema from an instance document; | |
93 | @item | |
94 | convert a DTD to an RNC schema; | |
95 | @item | |
96 | convert a RELAX NG XML syntax schema to an RNC schema. | |
97 | @end itemize | |
98 | ||
99 | @noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC | |
100 | one, you can also use the XSLT stylesheet from | |
583873a9 GM |
101 | @url{https://github.com/oleg-pavliv/emacs/tree/master/xsl}. |
102 | @ignore | |
103 | @c Original location, now defunct. | |
d3dfb185 | 104 | @url{http://www.pantor.com/download.html}. |
583873a9 | 105 | @end ignore |
d3dfb185 GM |
106 | |
107 | To convert a W3C XML Schema to an RNC schema, you need first to convert it | |
4d47208a | 108 | to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv} |
d3dfb185 GM |
109 | (built on top of MSV). See @url{https://github.com/kohsuke/msv} |
110 | and @url{https://msv.dev.java.net/}. | |
111 | ||
112 | For historical discussions only, see the mailing list archives at | |
113 | @url{http://groups.yahoo.com/group/emacs-nxml-mode/}. Please make all new | |
114 | discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing | |
115 | lists. Report any bugs with @kbd{M-x report-emacs-bug}. | |
116 | ||
117 | ||
8cd39fb3 MH |
118 | @node Completion |
119 | @chapter Completion | |
120 | ||
3d439cd1 CY |
121 | Apart from real-time validation, the most important feature that nXML |
122 | mode provides for assisting in document creation is ``completion''. | |
8cd39fb3 MH |
123 | Completion assists the user in inserting characters at point, based on |
124 | knowledge of the schema and on the contents of the buffer before | |
125 | point. | |
126 | ||
3d439cd1 CY |
127 | nXML mode adapts the standard GNU Emacs command for completion in a |
128 | buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and | |
129 | @kbd{M-@key{TAB}}. Note that many window systems and window managers | |
130 | use @kbd{M-@key{TAB}} themselves (typically for switching between | |
131 | windows) and do not pass it to applications. In that case, you should | |
132 | type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind | |
133 | @code{completion-at-point} to a key that is convenient for you. In | |
134 | the following, I will assume that you type @kbd{C-M-i}. | |
135 | ||
136 | nXML mode completion works by examining the symbol preceding point. | |
137 | This is the symbol to be completed. The symbol to be completed may be | |
138 | empty. Completion considers what symbols starting with the symbol | |
139 | to be completed would be valid replacements for the symbol to be | |
8cd39fb3 MH |
140 | completed, given the schema and the contents of the buffer before |
141 | point. These symbols are the possible completions. An example may | |
142 | make this clearer. Suppose the buffer looks like this (where @point{} | |
143 | indicates point): | |
144 | ||
145 | @example | |
146 | <html xmlns="http://www.w3.org/1999/xhtml"> | |
147 | <h@point{} | |
148 | @end example | |
149 | ||
150 | @noindent | |
1df7defd | 151 | and the schema is XHTML@. In this context, the symbol to be completed |
8cd39fb3 MH |
152 | is @samp{h}. The possible completions consist of just |
153 | @samp{head}. Another example is: | |
154 | ||
155 | @example | |
156 | <html xmlns="http://www.w3.org/1999/xhtml"> | |
157 | <head> | |
158 | <@point{} | |
159 | @end example | |
160 | ||
161 | @noindent | |
162 | In this case, the symbol to be completed is empty, and the possible | |
163 | completions are @samp{base}, @samp{isindex}, | |
164 | @samp{link}, @samp{meta}, @samp{script}, | |
165 | @samp{style}, @samp{title}. Another example is: | |
166 | ||
167 | @example | |
168 | <html xmlns="@point{} | |
169 | @end example | |
170 | ||
171 | @noindent | |
172 | In this case, the symbol to be completed is empty, and the possible | |
173 | completions are just @samp{http://www.w3.org/1999/xhtml}. | |
174 | ||
3d439cd1 | 175 | When you type @kbd{C-M-i}, what happens depends |
8cd39fb3 MH |
176 | on what the set of possible completions is. |
177 | ||
178 | @itemize @bullet | |
179 | @item | |
180 | If the set of completions is empty, nothing | |
181 | happens. | |
182 | @item | |
183 | If there is one possible completion, then that completion is | |
184 | inserted, together with any following characters that are | |
185 | required. For example, in this case: | |
186 | ||
187 | @example | |
188 | <html xmlns="http://www.w3.org/1999/xhtml"> | |
189 | <@point{} | |
190 | @end example | |
191 | ||
192 | @noindent | |
3d439cd1 | 193 | @kbd{C-M-i} will yield |
8cd39fb3 MH |
194 | |
195 | @example | |
196 | <html xmlns="http://www.w3.org/1999/xhtml"> | |
197 | <head@point{} | |
198 | @end example | |
199 | @item | |
200 | If there is more than one possible completion, but all | |
201 | possible completions share a common non-empty prefix, then that prefix | |
202 | is inserted. For example, suppose the buffer is: | |
203 | ||
204 | @example | |
205 | <html x@point{} | |
206 | @end example | |
207 | ||
208 | @noindent | |
3d439cd1 CY |
209 | The symbol to be completed is @samp{x}. The possible completions are |
210 | @samp{xmlns} and @samp{xml:lang}. These share a common prefix of | |
211 | @samp{xml}. Thus, @kbd{C-M-i} will yield: | |
8cd39fb3 MH |
212 | |
213 | @example | |
214 | <html xml@point{} | |
215 | @end example | |
216 | ||
217 | @noindent | |
3d439cd1 CY |
218 | Typically, you would do @kbd{C-M-i} again, which would have the result |
219 | described in the next item. | |
8cd39fb3 MH |
220 | @item |
221 | If there is more than one possible completion, but the | |
222 | possible completions do not share a non-empty prefix, then Emacs will | |
223 | prompt you to input the symbol in the minibuffer, initializing the | |
224 | minibuffer with the symbol to be completed, and popping up a buffer | |
225 | showing the possible completions. You can now input the symbol to be | |
226 | inserted. The symbol you input will be inserted in the buffer instead | |
227 | of the symbol to be completed. Emacs will then insert any required | |
228 | characters after the symbol. For example, if the buffer contains: | |
229 | ||
230 | @example | |
231 | <html xml@point{} | |
232 | @end example | |
233 | ||
234 | @noindent | |
235 | Emacs will prompt you in the minibuffer with | |
236 | ||
237 | @example | |
238 | Attribute: xml@point{} | |
239 | @end example | |
240 | ||
241 | @noindent | |
242 | and the buffer showing possible completions will contain | |
243 | ||
244 | @example | |
245 | Possible completions are: | |
b1fbbb32 | 246 | xml:lang xmlns |
8cd39fb3 MH |
247 | @end example |
248 | ||
249 | @noindent | |
250 | If you input @kbd{xmlns}, the result will be: | |
251 | ||
252 | @example | |
253 | <html xmlns="@point{} | |
254 | @end example | |
255 | ||
256 | @noindent | |
3d439cd1 CY |
257 | (If you do @kbd{C-M-i} again, the namespace URI will be |
258 | inserted. Should that happen automatically?) | |
8cd39fb3 MH |
259 | @end itemize |
260 | ||
261 | @node Inserting end-tags | |
262 | @chapter Inserting end-tags | |
263 | ||
3d439cd1 | 264 | The main redundancy in XML syntax is end-tags. nXML mode provides |
8cd39fb3 MH |
265 | several ways to make it easier to enter end-tags. You can use all of |
266 | these without a schema. | |
267 | ||
3d439cd1 CY |
268 | You can use @kbd{C-M-i} after @samp{</} to complete the rest of the |
269 | end-tag. | |
8cd39fb3 MH |
270 | |
271 | @kbd{C-c C-f} inserts an end-tag for the element containing | |
272 | point. This command is useful when you want to input the start-tag, | |
273 | then input the content and finally input the end-tag. The @samp{f} | |
274 | is mnemonic for finish. | |
275 | ||
276 | If you want to keep tags balanced and input the end-tag at the | |
277 | same time as the start-tag, before inputting the content, then you can | |
278 | use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts | |
279 | the end-tag and leaves point before the end-tag. @kbd{C-c C-b} | |
280 | is similar but more convenient for block-level elements: it puts the | |
281 | start-tag, point and the end-tag on successive lines, appropriately | |
282 | indented. The @samp{i} is mnemonic for inline and the | |
283 | @samp{b} is mnemonic for block. | |
284 | ||
3d439cd1 CY |
285 | Finally, you can customize nXML mode so that @kbd{/} automatically |
286 | inserts the rest of the end-tag when it occurs after @samp{<}, by | |
287 | doing | |
8cd39fb3 MH |
288 | |
289 | @display | |
290 | @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}} | |
291 | @end display | |
292 | ||
293 | @noindent | |
294 | and then following the instructions in the displayed buffer. | |
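| | | |
| | If you prefer to set this from your init file rather than through the | |
| | Customize interface, a minimal sketch (assuming that a plain | |
| | @code{setq} is sufficient for this option) is: | |
| | | |
| | @example | |
| | ;; Let "/" typed after "<" insert the rest of the end-tag. | |
| | (setq nxml-slash-auto-complete-flag t) | |
| | @end example | |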
295 | ||
296 | @node Paragraphs | |
297 | @chapter Paragraphs | |
298 | ||
299 | Emacs has several commands that operate on paragraphs, most | |
300 | notably @kbd{M-q}. nXML mode redefines these to work in a way | |
1df7defd | 301 | that is useful for XML@. The exact rules that are used to find the |
8cd39fb3 MH |
302 | beginning and end of a paragraph are complicated; they are designed |
303 | mainly to ensure that @kbd{M-q} does the right thing. | |
304 | ||
305 | A paragraph consists of one or more complete, consecutive lines. | |
306 | A group of lines is not considered a paragraph unless it contains some | |
307 | non-whitespace characters between tags or inside comments. A blank | |
308 | line separates paragraphs. A single tag on a line by itself also | |
309 | separates paragraphs. More precisely, if one tag together with any | |
310 | leading and trailing whitespace completely occupy one or more lines, | |
311 | then those lines will not be included in any paragraph. | |
312 | ||
313 | A start-tag at the beginning of the line (possibly indented) may | |
314 | be treated as starting a paragraph. Similarly, an end-tag at the end | |
315 | of the line may be treated as ending a paragraph. The following rules | |
316 | are used to determine whether such a tag is in fact treated as a | |
317 | paragraph boundary: | |
318 | ||
319 | @itemize @bullet | |
320 | @item | |
321 | If the schema does not allow text at that point, then it | |
322 | is a paragraph boundary. | |
323 | @item | |
324 | If the end-tag corresponding to the start-tag is not at | |
325 | the end of its line, or the start-tag corresponding to the end-tag is | |
326 | not at the beginning of its line, then it is not a paragraph | |
327 | boundary. For example, in | |
328 | ||
329 | @example | |
330 | <p>This is a paragraph with an | |
331 | <emph>emphasized</emph> phrase. | |
332 | @end example | |
333 | ||
334 | @noindent | |
335 | the @samp{<emph>} start-tag would not be considered as | |
336 | starting a paragraph, because its corresponding end-tag is not at the | |
337 | end of the line. | |
338 | @item | |
339 | If there is text that is a sibling in the element tree, then | |
340 | it is not a paragraph boundary. For example, in | |
341 | ||
342 | @example | |
343 | <p>This is a paragraph with an | |
344 | <emph>emphasized phrase that takes one source line</emph> | |
345 | @end example | |
346 | ||
347 | @noindent | |
348 | the @samp{<emph>} start-tag would not be considered as | |
349 | starting a paragraph, even though its end-tag is at the end of its | |
350 | line, because the text @samp{This is a paragraph with an} | |
351 | is a sibling of the @samp{emph} element. | |
352 | @item | |
353 | Otherwise, it is a paragraph boundary. | |
354 | @end itemize | |
355 | ||
356 | @node Outlining | |
357 | @chapter Outlining | |
358 | ||
359 | nXML mode allows you to display all or part of a buffer as an | |
44e97401 | 360 | outline, in a similar way to Emacs's outline mode. An outline in nXML |
8cd39fb3 MH |
361 | mode is based on recognizing two kinds of element: sections and |
362 | headings. There is one heading for every section and one section for | |
363 | every heading. A section contains its heading as or within its first | |
364 | child element. A section also contains its subordinate sections (its | |
365 | subsections). The text content of a section consists of anything in a | |
366 | section that is neither a subsection nor a heading. | |
367 | ||
1df7defd | 368 | Note that this is a different model from that used by XHTML@. |
8cd39fb3 MH |
369 | nXML mode's outline support will not be useful for XHTML unless you |
370 | adopt a convention of adding a @code{div} to enclose each | |
371 | section, rather than having sections implicitly delimited by different | |
372 | @code{h@var{n}} elements. This limitation may be removed | |
373 | in a future version. | |
374 | ||
375 | The variable @code{nxml-section-element-name-regexp} gives | |
1df7defd | 376 | a regexp for the local names (i.e., the part of the name following any |
8cd39fb3 MH |
377 | prefix) of section elements. The variable |
378 | @code{nxml-heading-element-name-regexp} gives a regexp for the | |
379 | local names of heading elements. For an element to be recognized | |
380 | as a section: | |
381 | ||
382 | @itemize @bullet | |
383 | @item | |
384 | its start-tag must occur at the beginning of a line | |
385 | (possibly indented); | |
386 | @item | |
387 | its local name must match | |
388 | @code{nxml-section-element-name-regexp}; | |
389 | @item | |
390 | either its first child element or a descendant of that | |
391 | first child element must have a local name that matches | |
392 | @code{nxml-heading-element-name-regexp}; the first such element | |
393 | is treated as the section's heading. | |
394 | @end itemize | |
395 | ||
396 | @noindent | |
397 | You can customize these variables using @kbd{M-x | |
398 | customize-variable}. | |
399 | ||
400 | There are three possible outline states for a section: | |
401 | ||
402 | @itemize @bullet | |
403 | @item | |
404 | normal, showing everything, including its heading, text | |
405 | content and subsections; each subsection is displayed according to the | |
406 | state of that subsection; | |
407 | @item | |
408 | showing just its heading, with both its text content and | |
409 | its subsections hidden; all subsections are hidden regardless of their | |
410 | state; | |
411 | @item | |
412 | showing its heading and its subsections, with its text | |
413 | content hidden; each subsection is displayed according to the state of | |
414 | that subsection. | |
415 | @end itemize | |
416 | ||
417 | In the last two states, where the text content is hidden, the | |
418 | heading is displayed specially, in an abbreviated form. An element | |
419 | like this: | |
420 | ||
421 | @example | |
422 | <section> | |
423 | <title>Food</title> | |
424 | <para>There are many kinds of food.</para> | |
425 | </section> | |
426 | @end example | |
427 | ||
428 | @noindent | |
429 | would be displayed on a single line like this: | |
430 | ||
431 | @example | |
432 | <-section>Food...</> | |
433 | @end example | |
434 | ||
435 | @noindent | |
436 | If there are hidden subsections, then a @code{+} will be used | |
437 | instead of a @code{-} like this: | |
438 | ||
439 | @example | |
440 | <+section>Food...</> | |
441 | @end example | |
442 | ||
443 | @noindent | |
444 | If there are non-hidden subsections, then the section will instead be | |
445 | displayed like this: | |
446 | ||
447 | @example | |
448 | <-section>Food... | |
449 | <-section>Delicious Food...</> | |
450 | <-section>Distasteful Food...</> | |
451 | </-section> | |
452 | @end example | |
453 | ||
454 | @noindent | |
455 | The heading is always displayed with an indent that corresponds to its | |
456 | depth in the outline, even if it is not actually indented in the buffer. | |
457 | The variable @code{nxml-outline-child-indent} controls how much | |
458 | a subheading is indented with respect to its parent heading when the | |
459 | heading is being displayed specially. | |
460 | ||
461 | Commands to change the outline state of sections are bound to | |
462 | key sequences that start with @kbd{C-c C-o} (@kbd{o} is | |
463 | mnemonic for outline). The third and final key has been chosen to be | |
464 | consistent with outline mode. In the following descriptions, the | |
465 | current section means the section containing point, or, more precisely, | |
466 | the innermost section containing the character immediately following | |
467 | point. | |
468 | ||
469 | @itemize @bullet | |
470 | @item | |
471 | @kbd{C-c C-o C-a} shows all sections in the buffer | |
472 | normally. | |
473 | @item | |
474 | @kbd{C-c C-o C-t} hides the text content | |
475 | of all sections in the buffer. | |
476 | @item | |
477 | @kbd{C-c C-o C-c} hides the text content | |
478 | of the current section. | |
479 | @item | |
480 | @kbd{C-c C-o C-e} shows the text content | |
481 | of the current section. | |
482 | @item | |
483 | @kbd{C-c C-o C-d} hides the text content | |
484 | and subsections of the current section. | |
485 | @item | |
867d4bb3 | 486 | @kbd{C-c C-o C-s} shows the current section |
8cd39fb3 MH |
487 | and all its direct and indirect subsections normally. |
488 | @item | |
489 | @kbd{C-c C-o C-k} shows the headings of the | |
490 | direct and indirect subsections of the current section. | |
491 | @item | |
492 | @kbd{C-c C-o C-l} hides the text content of the | |
493 | current section and of its direct and indirect | |
494 | subsections. | |
495 | @item | |
496 | @kbd{C-c C-o C-i} shows the headings of the | |
497 | direct subsections of the current section. | |
498 | @item | |
499 | @kbd{C-c C-o C-o} hides as much as possible without | |
500 | hiding the current section's text content; the headings of ancestor | |
501 | sections of the current section and their child sections will | |
502 | not be hidden. | |
503 | @end itemize | |
504 | ||
505 | When a heading is displayed specially, you can use | |
506 | @key{RET} in that heading to show the text content of the section | |
507 | in the same way as @kbd{C-c C-o C-e}. | |
508 | ||
509 | You can also use the mouse to change the outline state: | |
510 | @kbd{S-mouse-2} hides the text content of a section in the same | |
511 | way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially | |
512 | displayed heading shows the text content of the section in the same | |
513 | way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially | |
514 | displayed start-tag toggles the display of subheadings on and | |
515 | off. | |
516 | ||
517 | The outline state for each section is stored with the first | |
518 | character of the section (as a text property). Every command that | |
519 | changes the outline state of any section updates the display of the | |
520 | buffer so that each section is displayed correctly according to its | |
521 | outline state. If the section structure is subsequently changed, then | |
522 | it is possible for the display to no longer correctly reflect the | |
523 | stored outline state. @kbd{C-c C-o C-r} can be used to refresh | |
524 | the display so it is correct again. | |
525 | ||
526 | @node Locating a schema | |
527 | @chapter Locating a schema | |
528 | ||
529 | nXML mode has a configurable set of rules to locate a schema for | |
530 | the file being edited. The rules are contained in one or more schema | |
531 | locating files, which are XML documents. | |
532 | ||
533 | The variable @samp{rng-schema-locating-files} specifies | |
534 | the list of the file-names of schema locating files that nXML mode | |
535 | should use. The order of the list is significant: when file | |
536 | @var{x} occurs in the list before file @var{y} then rules | |
537 | from file @var{x} have precedence over rules from file | |
538 | @var{y}. A filename specified in | |
539 | @samp{rng-schema-locating-files} may be relative. If so, it will | |
540 | be resolved relative to the document for which a schema is being | |
541 | located. It is not an error if relative file-names in | |
867d4bb3 | 542 | @samp{rng-schema-locating-files} do not exist. You can use |
8cd39fb3 MH |
543 | @kbd{M-x customize-variable @key{RET} rng-schema-locating-files |
544 | @key{RET}} to customize the list of schema locating | |
545 | files. | |
546 | ||
547 | By default, the @samp{rng-schema-locating-files} list has two | |
548 | members: @samp{schemas.xml}, and | |
549 | @samp{@var{dist-dir}/schema/schemas.xml} where | |
550 | @samp{@var{dist-dir}} is the directory containing the nXML | |
551 | distribution. The first member will cause nXML mode to use a file | |
552 | @samp{schemas.xml} in the same directory as the document being | |
553 | edited if such a file exists. The second member contains rules for the | |
554 | schemas that are included with the nXML distribution. | |
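| | | |
| | To consult an additional schema locating file, you can add it to the | |
| | list from your init file. The sketch below assumes a hypothetical file | |
| | @file{schemas/schemas.xml} under your Emacs configuration directory, and | |
| | assumes that the variable is defined by the @file{rng-loc} library. | |
| | Because the file is added at the front of the list, its rules take | |
| | precedence over those of the default members: | |
| | | |
| | @example | |
| | (with-eval-after-load 'rng-loc | |
| |   ;; The file name is a placeholder; substitute your own. | |
| |   (add-to-list 'rng-schema-locating-files | |
| |                (expand-file-name "schemas/schemas.xml" | |
| |                                  user-emacs-directory))) | |
| | @end example | |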
555 | ||
556 | @menu | |
867d4bb3 JB |
557 | * Commands for locating a schema:: |
558 | * Schema locating files:: | |
8cd39fb3 MH |
559 | @end menu |
560 | ||
561 | @node Commands for locating a schema | |
562 | @section Commands for locating a schema | |
563 | ||
564 | The command @kbd{C-c C-s C-w} will tell you what schema | |
565 | is currently being used. | |
566 | ||
567 | The rules for locating a schema are applied automatically when | |
568 | you visit a file in nXML mode. However, if you have just created a new | |
569 | file and the schema cannot be inferred from the file-name, then this | |
570 | will not locate the right schema. In this case, you should insert the | |
40572be6 | 571 | start-tag of the root element and then use the command @kbd{C-c C-s |
8cd39fb3 MH |
572 | C-a}, which reapplies the rules based on the current content of |
573 | the document. It is usually not necessary to insert the complete | |
574 | start-tag; often just @samp{<@var{name}} is | |
575 | enough. | |
576 | ||
577 | If you want to use a schema that has not yet been added to the | |
578 | schema locating files, you can use the command @kbd{C-c C-s C-f} | |
b6f9df0f | 579 | to manually select the file containing the schema for the document in |
8cd39fb3 MH |
580 | the current buffer. Emacs will read the file-name of the schema from the |
581 | minibuffer. After reading the file-name, Emacs will ask whether you | |
582 | wish to add a rule to a schema locating file that persistently | |
583 | associates the document with the selected schema. The rule will be | |
584 | added to the first file in the list specified by | |
585 | @samp{rng-schema-locating-files}; it will create the file if | |
586 | necessary, but will not create a directory. If the variable | |
587 | @samp{rng-schema-locating-files} has not been customized, this | |
588 | means that the rule will be added to the file @samp{schemas.xml} | |
589 | in the same directory as the document being edited. | |
590 | ||
591 | The command @kbd{C-c C-s C-t} allows you to select a schema by | |
592 | specifying an identifier for the type of the document. The schema | |
593 | locating files determine the available type identifiers and what | |
594 | schema is used for each type identifier. This is useful when it is | |
595 | impossible to infer the right schema from either the file-name or the | |
596 | content of the document, even though the schema is already in the | |
597 | schema locating file. A situation in which this can occur is when | |
598 | there are multiple variants of a schema where all valid documents have | |
599 | the same document element. For example, XHTML has Strict and | |
600 | Transitional variants. In a situation like this, a schema locating file | |
601 | can define a type identifier for each variant. As with @kbd{C-c | |
602 | C-s C-f}, Emacs will ask whether you wish to add a rule to a schema | |
603 | locating file that persistently associates the document with the | |
604 | specified type identifier. | |
605 | ||
606 | The command @kbd{C-c C-s C-l} adds a rule to a schema | |
607 | locating file that persistently associates the document with | |
608 | the schema that is currently being used. | |
609 | ||
610 | @node Schema locating files | |
611 | @section Schema locating files | |
612 | ||
613 | Each schema locating file specifies a list of rules. The rules | |
614 | from each file are appended in order. To locate a schema each rule is | |
615 | applied in turn until a rule matches. The first matching rule is then | |
616 | used to determine the schema. | |
617 | ||
618 | Schema locating files are designed to be useful for other | |
619 | applications that need to locate a schema for a document. In fact, | |
620 | there is nothing specific to locating schemas in the design; it could | |
621 | equally well be used for locating a stylesheet. | |
622 | ||
623 | @menu | |
867d4bb3 JB |
624 | * Schema locating file syntax basics:: |
625 | * Using the document's URI to locate a schema:: | |
626 | * Using the document element to locate a schema:: | |
627 | * Using type identifiers in schema locating files:: | |
628 | * Using multiple schema locating files:: | |
8cd39fb3 MH |
629 | @end menu |
630 | ||
631 | @node Schema locating file syntax basics | |
632 | @subsection Schema locating file syntax basics | |
633 | ||
634 | There is a schema for schema locating files in the file | |
635 | @samp{locate.rnc} in the schema directory. Schema locating | |
636 | files must be valid with respect to this schema. | |
637 | ||
638 | The document element of a schema locating file must be | |
639 | @samp{locatingRules} and the namespace URI must be | |
640 | @samp{http://thaiopensource.com/ns/locating-rules/1.0}. The | |
641 | children of the document element specify rules. The order of the | |
642 | children is the same as the order of the rules. Here's a complete | |
643 | example of a schema locating file: | |
644 | ||
645 | @example | |
646 | <?xml version="1.0"?> | |
647 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
648 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
649 | <documentElement localName="book" uri="docbook.rnc"/> | |
650 | </locatingRules> | |
651 | @end example | |
652 | ||
653 | @noindent | |
654 | This says to use the schema @samp{xhtml.rnc} for a document with | |
655 | namespace @samp{http://www.w3.org/1999/xhtml}, and to use the | |
656 | schema @samp{docbook.rnc} for a document whose document element has the local name | |
657 | @samp{book}. If the document element had both a namespace URI | |
658 | of @samp{http://www.w3.org/1999/xhtml} and a local name of | |
659 | @samp{book}, then the matching rule that comes first will be | |
660 | used and so the schema @samp{xhtml.rnc} would be used. There is | |
661 | no precedence between different types of rule; the first matching rule | |
662 | of any type is used. | |
663 | ||
664 | As usual with XML-related technologies, resources are identified | |
665 | by URIs. The @samp{uri} attribute identifies the schema by | |
1df7defd | 666 | specifying the URI@. The URI may be relative. If so, it is resolved |
8cd39fb3 MH |
667 | relative to the URI of the schema locating file that contains the |
668 | attribute. This means that if the value of the @samp{uri} attribute | |
669 | does not contain a @samp{/}, then it will refer to a filename in | |
670 | the same directory as the schema locating file. | |
671 | ||
672 | @node Using the document's URI to locate a schema | |
673 | @subsection Using the document's URI to locate a schema | |
674 | ||
675 | A @samp{uri} rule locates a schema based on the URI of the | |
676 | document. The @samp{uri} attribute specifies the URI of the | |
677 | schema. The @samp{resource} attribute can be used to specify | |
678 | the schema for a particular document. For example, | |
679 | ||
680 | @example | |
681 | <uri resource="spec.xml" uri="docbook.rnc"/> | |
682 | @end example | |
683 | ||
684 | @noindent | |
867d4bb3 | 685 | specifies that the schema for @samp{spec.xml} is |
8cd39fb3 MH |
686 | @samp{docbook.rnc}. |
687 | ||
688 | The @samp{pattern} attribute can be used instead of the | |
689 | @samp{resource} attribute to specify the schema for any document | |
690 | whose URI matches a pattern. The pattern has the same syntax as an | |
691 | absolute or relative URI except that the path component of the URI can | |
692 | use a @samp{*} character to stand for zero or more characters | |
1df7defd | 693 | within a path segment (i.e., any character other than @samp{/}). |
8cd39fb3 MH |
694 | Typically, the URI pattern looks like a relative URI, but, whereas a |
695 | relative URI in the @samp{resource} attribute is resolved into a | |
696 | particular absolute URI using the base URI of the schema locating | |
697 | file, a relative URI pattern matches if it matches some number of | |
698 | complete path segments of the document's URI ending with the last path | |
1df7defd | 699 | segment of the document's URI@. For example, |
8cd39fb3 MH |
700 | |
701 | @example | |
702 | <uri pattern="*.xsl" uri="xslt.rnc"/> | |
703 | @end example | |
704 | ||
705 | @noindent | |
706 | specifies that the schema for documents with a URI whose path ends | |
707 | with @samp{.xsl} is @samp{xslt.rnc}. | |
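
To illustrate matching on complete path segments, consider a pattern
that names a directory as well as a file.  The directory name
@samp{docs} is a hypothetical example.

@example
<uri pattern="docs/*.xml" uri="docbook.rnc"/>
@end example

@noindent
Following the description above, this pattern should match
@samp{file:///home/jjc/docs/spec.xml}, because @samp{docs} and
@samp{spec.xml} are the last two complete path segments of that URI.
It should not match @samp{file:///home/jjc/mydocs/spec.xml}, because
@samp{mydocs} is a different segment, nor
@samp{file:///home/jjc/docs/old/spec.xml}, because @samp{*} does not
match a @samp{/}.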
708 | ||
709 | A @samp{transformURI} rule locates a schema by | |
710 | transforming the URI of the document. The @samp{fromPattern} | |
711 | attribute specifies a URI pattern with the same meaning as the | |
712 | @samp{pattern} attribute of the @samp{uri} element. The | |
713 | @samp{toPattern} attribute is a URI pattern that is used to | |
714 | generate the URI of the schema. Each @samp{*} in the | |
715 | @samp{toPattern} is replaced by the string that matched the | |
716 | corresponding @samp{*} in the @samp{fromPattern}. The | |
717 | resulting string is appended to the initial part of the document's URI | |
718 | that was not explicitly matched by the @samp{fromPattern}. The | |
719 | rule matches only if the transformed URI identifies an existing | |
720 | resource. For example, the rule | |
721 | ||
722 | @example | |
723 | <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | |
724 | @end example | |
725 | ||
726 | @noindent | |
727 | would transform the URI @samp{file:///home/jjc/docs/spec.xml} | |
728 | into the URI @samp{file:///home/jjc/docs/spec.rnc}. Thus, this | |
729 | rule specifies that to locate a schema for a document | |
730 | @samp{@var{foo}.xml}, Emacs should test whether a file | |
731 | @samp{@var{foo}.rnc} exists in the same directory as | |
732 | @samp{@var{foo}.xml}, and, if so, should use it as the | |
733 | schema. | |
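
As a further sketch, the @samp{toPattern} may contain path segments
of its own.  The subdirectory name @samp{schema} below is a
hypothetical example.

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
Applying the substitution described above to
@samp{file:///home/jjc/docs/spec.xml}, the @samp{*} matches
@samp{spec}, the @samp{toPattern} becomes @samp{schema/spec.rnc}, and
appending this to the unmatched initial part of the URI gives
@samp{file:///home/jjc/docs/schema/spec.rnc}.  The rule would then
match only if that file exists.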
734 | ||
735 | @node Using the document element to locate a schema | |
736 | @subsection Using the document element to locate a schema | |
737 | ||
738 | A @samp{documentElement} rule locates a schema based on | |
739 | the local name and prefix of the document element. For example, a rule | |
740 | ||
741 | @example | |
742 | <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/> | |
743 | @end example | |
744 | ||
745 | @noindent | |
746 | specifies that when the name of the document element is | |
747 | @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used | |
748 | as the schema. Either the @samp{prefix} or | |
749 | @samp{localName} attribute may be omitted to allow any prefix or | |
750 | local name. | |
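
For instance, a rule that omits the @samp{prefix} attribute might
look like this:

@example
<documentElement localName="stylesheet" uri="xslt.rnc"/>
@end example

@noindent
Such a rule would select @samp{xslt.rnc} for a document element named
@samp{xsl:stylesheet}, @samp{x:stylesheet} or plain
@samp{stylesheet}, since any prefix is allowed.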
751 | ||
752 | A @samp{namespace} rule locates a schema based on the | |
753 | namespace URI of the document element. For example, a rule | |
754 | ||
755 | @example | |
756 | <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/> | |
757 | @end example | |
758 | ||
759 | @noindent | |
760 | specifies that when the namespace URI of the document element is | |
761 | @samp{http://www.w3.org/1999/XSL/Transform}, then | |
762 | @samp{xslt.rnc} should be used as the schema. | |
763 | ||
764 | @node Using type identifiers in schema locating files | |
765 | @subsection Using type identifiers in schema locating files | |
766 | ||
767 | Type identifiers allow a level of indirection in locating the | |
768 | schema for a document. Instead of associating the document directly | |
769 | with a schema URI, the document is associated with a type identifier, | |
1df7defd | 770 | which is in turn associated with a schema URI@. nXML mode does not |
8cd39fb3 MH |
771 | constrain the format of type identifiers. They can be simply strings |
772 | without any formal structure or they can be public identifiers or | |
773 | URIs. Note that these type identifiers have nothing to do with the | |
774 | DOCTYPE declaration. When comparing type identifiers, whitespace is | |
775 | normalized in the same way as with the @samp{xsd:token} | |
776 | datatype: leading and trailing whitespace is stripped; other sequences | |
777 | of whitespace are normalized to a single space character. | |
778 | ||
779 | Each of the rules described in previous sections that uses a | |
780 | @samp{uri} attribute to specify a schema can instead use a | |
781 | @samp{typeId} attribute to specify a type identifier. The type | |
782 | identifier can be associated with a URI using a @samp{typeId} | |
783 | element. For example, | |
784 | ||
785 | @example | |
786 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
787 | <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/> | |
788 | <typeId id="XHTML" typeId="XHTML Strict"/> | |
789 | <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/> | |
790 | <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/> | |
791 | </locatingRules> | |
792 | @end example | |
793 | ||
794 | @noindent | |
795 | declares three type identifiers @samp{XHTML} (representing the | |
796 | default variant of XHTML to be used), @samp{XHTML Strict} and | |
797 | @samp{XHTML Transitional}. Such a schema locating file would | |
798 | use @samp{xhtml-strict.rnc} for a document whose namespace is | |
799 | @samp{http://www.w3.org/1999/xhtml}. But it is considerably | |
800 | more flexible than a schema locating file that simply specified | |
801 | ||
802 | @example | |
803 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/> | |
804 | @end example | |
805 | ||
806 | @noindent | |
807 | A user can easily use @kbd{C-c C-s C-t} to select between XHTML | |
808 | Strict and XHTML Transitional. Also, a user can easily add a catalog | |
809 | ||
810 | @example | |
811 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
812 | <typeId id="XHTML" typeId="XHTML Transitional"/> | |
813 | </locatingRules> | |
814 | @end example | |
815 | ||
816 | @noindent | |
817 | that makes the default variant of XHTML be XHTML Transitional. | |
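
The whitespace normalization described earlier in this section can
be sketched with the following pair of rules.  The extra spaces in
the @samp{typeId} attribute are deliberate and purely illustrative.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example

@noindent
After leading and trailing whitespace is stripped and the inner run
of spaces is collapsed, the first type identifier compares equal to
@samp{XHTML Strict}, so the two rules should be associated with each
other.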
818 | ||
819 | @node Using multiple schema locating files | |
820 | @subsection Using multiple schema locating files | |
821 | ||
822 | The @samp{include} element includes rules from another | |
823 | schema locating file. The behavior is exactly as if the rules from | |
824 | that file were included in place of the @samp{include} element. | |
825 | Relative URIs are resolved into absolute URIs before the inclusion is | |
826 | performed. For example, | |
827 | ||
828 | @example | |
829 | <include rules="../rules.xml"/> | |
830 | @end example | |
831 | ||
832 | @noindent | |
833 | includes the rules from @samp{rules.xml}. | |
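
As a sketch of how an @samp{include} element combines with local
rules, a schema locating file might look like this.  The file names
are hypothetical.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- Tried first: a rule specific to this directory -->
  <uri resource="spec.xml" uri="docbook.rnc"/>
  <!-- Then the shared rules, as if they were written here -->
  <include rules="../rules.xml"/>
</locatingRules>
@end example

@noindent
Because the included rules take the place of the @samp{include}
element, the position of that element determines where the shared
rules fall relative to the local ones.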
834 | ||
835 | The process of locating a schema takes as input a list of schema | |
836 | locating files. The rules in all these files and in the files they | |
837 | include are resolved into a single list of rules, which are applied | |
838 | strictly in order. Sometimes this order is not what is needed. | |
839 | For example, suppose you have two schema locating files, a private | |
840 | file | |
841 | ||
842 | @example | |
843 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
844 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
845 | </locatingRules> | |
846 | @end example | |
847 | ||
848 | @noindent | |
849 | followed by a public file | |
850 | ||
851 | @example | |
852 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
853 | <transformURI fromPattern="*.xml" toPattern="*.rnc"/> | |
854 | <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/> | |
855 | </locatingRules> | |
856 | @end example | |
857 | ||
858 | @noindent | |
859 | The effect of these two files is that the XHTML @samp{namespace} | |
860 | rule takes precedence over the @samp{transformURI} rule, which | |
861 | is almost certainly not what is needed. This can be solved by adding | |
862 | an @samp{applyFollowingRules} element to the private file: | |
863 | ||
864 | @example | |
865 | <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0"> | |
866 | <applyFollowingRules ruleType="transformURI"/> | |
867 | <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/> | |
868 | </locatingRules> | |
869 | @end example | |
870 | ||
871 | @node DTDs | |
872 | @chapter DTDs | |
873 | ||
3d439cd1 | 874 | nXML mode is designed to support the creation of standalone XML |
1df7defd | 875 | documents that do not depend on a DTD@. Although it is common practice |
8cd39fb3 MH |
876 | to insert a DOCTYPE declaration referencing an external DTD, this has |
877 | undesirable side-effects. It means that the document is no longer | |
878 | self-contained. It also means that different XML parsers may interpret | |
879 | the document in different ways, since the XML Recommendation does not | |
1df7defd | 880 | require XML parsers to read the DTD@. With DTDs, it was impractical to |
8cd39fb3 MH |
881 | get validation without using an external DTD or reference to a |
882 | parameter entity. With RELAX NG and other schema languages, you can | |
9858f6c3 | 883 | simultaneously get the benefits of validation and standalone XML |
8cd39fb3 MH |
884 | documents. Therefore, I recommend that you do not reference an |
885 | external DOCTYPE in your XML documents. | |
886 | ||
887 | One problem is entities for characters. Typically, as well as | |
888 | providing validation, DTDs also provide a set of character entities | |
889 | for documents to use. Schemas cannot provide this functionality, | |
890 | because schema validation happens after XML parsing. The recommended | |
891 | solution is to either use the Unicode characters directly, or, if this | |
892 | is impractical, use character references. nXML mode supports this by | |
893 | providing commands for entering characters and character references | |
894 | using the Unicode names, and can display the glyph corresponding to a | |
895 | character reference. | |
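
For example, a document that would otherwise depend on a DTD for the
entity @samp{eacute} can use the character itself or a character
reference instead.  The element name @samp{p} is only illustrative.

@example
<!-- Depends on a DTD that declares the entity -->
<p>caf&eacute;</p>

<!-- Standalone: the Unicode character, or a character reference -->
<p>café</p>
<p>caf&#xE9;</p>
@end example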
896 | ||
897 | @node Limitations | |
898 | @chapter Limitations | |
899 | ||
900 | nXML mode has some limitations: | |
901 | ||
902 | @itemize @bullet | |
903 | @item | |
904 | DTD support is limited. Internal parsed general entities declared | |
905 | in the internal subset are supported provided they do not contain | |
906 | elements. Other usage of DTDs is ignored. | |
907 | @item | |
908 | The restrictions on RELAX NG schemas in section 7 of the RELAX NG | |
909 | specification are not enforced. | |
8cd39fb3 MH |
910 | @end itemize |
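
As a sketch of the first limitation, the following document declares
an internal parsed general entity in the internal subset.  The
element and entity names are hypothetical.

@example
<!DOCTYPE doc [
<!-- Supported: the replacement text contains no elements -->
<!ENTITY product "nXML mode">
<!-- Not supported: the replacement text contains an element -->
<!ENTITY status "<em>draft</em>">
]>
<doc>&product;</doc>
@end example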
911 | ||
0b1af106 GM |
912 | @node GNU Free Documentation License |
913 | @appendix GNU Free Documentation License | |
914 | @include doclicense.texi | |
915 | ||
8cd39fb3 | 916 | @bye |