doc/misc/nxml-mode.texi

   1 \input texinfo @c -*- texinfo -*-
   2 @c %**start of header
   3 @setfilename ../../info/nxml-mode
   4 @settitle nXML Mode
   5 @c %**end of header
   6
   7 @copying
   8 This manual documents nXML mode, an Emacs major mode for editing
   9 XML with RELAX NG support.
  10
  11 Copyright @copyright{} 2007--2012 Free Software Foundation, Inc.
  12
  13 @quotation
  14 Permission is granted to copy, distribute and/or modify this document
  15 under the terms of the GNU Free Documentation License, Version 1.3 or
  16 any later version published by the Free Software Foundation; with no
  17 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
  18 and with the Back-Cover Texts as in (a) below.  A copy of the license
  19 is included in the section entitled ``GNU Free Documentation License''.
  20
  21 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
  22 modify this GNU manual.''
  23 @end quotation
  24 @end copying
  25
  26 @dircategory Emacs editing modes
  27 @direntry
  28 * nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
  29 @end direntry
  30
  31 @node Top
  32 @top nXML Mode
  33
  34 @insertcopying
  35
  36 This manual is not yet complete.
  37
  38 @menu
  39 * Introduction::
  40 * Completion::
  41 * Inserting end-tags::
  42 * Paragraphs::
  43 * Outlining::
  44 * Locating a schema::
  45 * DTDs::
  46 * Limitations::
  47 * GNU Free Documentation License::  The license for this documentation.
  48 @end menu
  49
  50 @node Introduction
  51 @chapter Introduction
  52
  53 nXML mode is an Emacs major mode for editing XML documents.  It supports
  54 editing well-formed XML documents, and provides schema-sensitive editing
  55 using RELAX NG Compact Syntax.  To get started, visit a file containing an
  56 XML document, and, if necessary, use @kbd{M-x nxml-mode} to switch to nXML
  57 mode.  By default, @code{auto-mode-alist} and @code{magic-fallback-alist}
  58 put buffers in nXML mode if they have recognizable XML content or file
  59 extensions.  You may wish to customize the settings, for example to
  60 recognize different file extensions.
  61
  62 Once in nXML mode, you can type @kbd{C-h m} for basic information on the
  63 mode.
  64
  65 The @file{etc/nxml} directory in the Emacs distribution contains some data
  66 files used by nXML mode, and includes two files (@file{test-valid.xml} and
  67 @file{test-invalid.xml}) that provide examples of valid and invalid XML
  68 documents.
  69
  70 To get validation and schema-sensitive editing, you need a RELAX NG Compact
  71 Syntax (RNC) schema for your document (@pxref{Locating a schema}).  The
  72 @file{etc/schema} directory includes some schemas for popular document
  73 types.  See @url{http://relaxng.org/} for more information on RELAX NG@.
  74 You can use the @samp{Trang} program from
  75 @url{http://www.thaiopensource.com/relaxng/trang.html} to
  76 automatically create RNC schemas.  This program can:
  77
  78 @itemize @bullet
  79 @item
  80 infer an RNC schema from an instance document;
  81 @item
  82 convert a DTD to an RNC schema;
  83 @item
  84 convert a RELAX NG XML syntax schema to an RNC schema.
  85 @end itemize
  86
  87 @noindent To convert a RELAX NG XML syntax (@samp{.rng}) schema to an RNC
  88 one, you can also use the XSLT stylesheet from
  89 @url{http://www.pantor.com/download.html}.
  90
  91 To convert a W3C XML Schema to an RNC schema, you need first to convert it
  92 to RELAX NG XML syntax using the RELAX NG converter tool @code{rngconv}
  93 (built on top of MSV).  See @url{https://github.com/kohsuke/msv}
  94 and @url{https://msv.dev.java.net/}.
  95
  96 For historical discussions only, see the mailing list archives at
  97 @url{http://groups.yahoo.com/group/emacs-nxml-mode/}.  Please make all new
  98 discussions on the @samp{help-gnu-emacs} and @samp{emacs-devel} mailing
  99 lists.  Report any bugs with @kbd{M-x report-emacs-bug}.
 100
 101
 102 @node Completion
 103 @chapter Completion
 104
 105 Apart from real-time validation, the most important feature that nXML
 106 mode provides for assisting in document creation is ``completion''.
 107 Completion assists the user in inserting characters at point, based on
 108 knowledge of the schema and on the contents of the buffer before
 109 point.
 110
 111 nXML mode adapts the standard GNU Emacs command for completion in a
 112 buffer: @code{completion-at-point}, which is bound to @kbd{C-M-i} and
 113 @kbd{M-@key{TAB}}.  Note that many window systems and window managers
 114 use @kbd{M-@key{TAB}} themselves (typically for switching between
 115 windows) and do not pass it to applications.  In that case, you should
 116 type @kbd{C-M-i} or @kbd{@key{ESC} @key{TAB}} for completion, or bind
 117 @code{completion-at-point} to a key that is convenient for you.  In
 118 the following, I will assume that you type @kbd{C-M-i}.
 119
 120 nXML mode completion works by examining the symbol preceding point.
 121 This is the symbol to be completed.  The symbol to be completed may be
 122 empty.  Completion considers what symbols starting with the symbol
 123 to be completed would be valid replacements for the symbol to be
 124 completed, given the schema and the contents of the buffer before
 125 point.  These symbols are the possible completions.  An example may
 126 make this clearer.  Suppose the buffer looks like this (where @point{}
 127 indicates point):
 128
 129 @example
 130 <html xmlns="http://www.w3.org/1999/xhtml">
 131 <h@point{}
 132 @end example
 133
 134 @noindent
 135 and the schema is XHTML@.  In this context, the symbol to be completed
 136 is @samp{h}.  The possible completions consist of just
 137 @samp{head}.  Another example is:
 138
 139 @example
 140 <html xmlns="http://www.w3.org/1999/xhtml">
 141 <head>
 142 <@point{}
 143 @end example
 144
 145 @noindent
 146 In this case, the symbol to be completed is empty, and the possible
 147 completions are @samp{base}, @samp{isindex},
 148 @samp{link}, @samp{meta}, @samp{script},
 149 @samp{style}, @samp{title}.  Another example is:
 150
 151 @example
 152 <html xmlns="@point{}
 153 @end example
 154
 155 @noindent
 156 In this case, the symbol to be completed is empty, and the possible
 157 completions are just @samp{http://www.w3.org/1999/xhtml}.
 158
 159 When you type @kbd{C-M-i}, what happens depends
 160 on what the set of possible completions is.
 161
 162 @itemize @bullet
 163 @item
 164 If the set of completions is empty, nothing
 165 happens.
 166 @item
 167 If there is one possible completion, then that completion is
 168 inserted, together with any following characters that are
 169 required. For example, in this case:
 170
 171 @example
 172 <html xmlns="http://www.w3.org/1999/xhtml">
 173 <@point{}
 174 @end example
 175
 176 @noindent
 177 @kbd{C-M-i} will yield
 178
 179 @example
 180 <html xmlns="http://www.w3.org/1999/xhtml">
 181 <head@point{}
 182 @end example
 183 @item
 184 If there is more than one possible completion, but all
 185 possible completions share a common non-empty prefix, then that prefix
 186 is inserted. For example, suppose the buffer is:
 187
 188 @example
 189 <html x@point{}
 190 @end example
 191
 192 @noindent
 193 The symbol to be completed is @samp{x}. The possible completions are
 194 @samp{xmlns} and @samp{xml:lang}.  These share a common prefix of
 195 @samp{xml}.  Thus, @kbd{C-M-i} will yield:
 196
 197 @example
 198 <html xml@point{}
 199 @end example
 200
 201 @noindent
 202 Typically, you would do @kbd{C-M-i} again, which would have the result
 203 described in the next item.
 204 @item
 205 If there is more than one possible completion, but the
 206 possible completions do not share a non-empty prefix, then Emacs will
 207 prompt you to input the symbol in the minibuffer, initializing the
 208 minibuffer with the symbol to be completed, and popping up a buffer
 209 showing the possible completions.  You can now input the symbol to be
 210 inserted.  The symbol you input will be inserted in the buffer instead
 211 of the symbol to be completed.  Emacs will then insert any required
 212 characters after the symbol.  For example, if the buffer contains:
 213
 214 @example
 215 <html xml@point{}
 216 @end example
 217
 218 @noindent
 219 Emacs will prompt you in the minibuffer with
 220
 221 @example
 222 Attribute: xml@point{}
 223 @end example
 224
 225 @noindent
 226 and the buffer showing possible completions will contain
 227
 228 @example
 229 Possible completions are:
 230 xml:lang                           xmlns
 231 @end example
 232
 233 @noindent
 234 If you input @kbd{xmlns}, the result will be:
 235
 236 @example
 237 <html xmlns="@point{}
 238 @end example
 239
 240 @noindent
 241 (If you do @kbd{C-M-i} again, the namespace URI will be
 242 inserted. Should that happen automatically?)
 243 @end itemize
 244
 245 @node Inserting end-tags
 246 @chapter Inserting end-tags
 247
 248 The main redundancy in XML syntax is end-tags.  nXML mode provides
 249 several ways to make it easier to enter end-tags.  You can use all of
 250 these without a schema.
 251
 252 You can use @kbd{C-M-i} after @samp{</} to complete the rest of the
 253 end-tag.
 254
 255 @kbd{C-c C-f} inserts an end-tag for the element containing
 256 point. This command is useful when you want to input the start-tag,
 257 then input the content and finally input the end-tag. The @samp{f}
 258 is mnemonic for finish.
 259
 260 If you want to keep tags balanced and input the end-tag at the
 261 same time as the start-tag, before inputting the content, then you can
 262 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
 263 the end-tag and leaves point before the end-tag.  @kbd{C-c C-b}
 264 is similar but more convenient for block-level elements: it puts the
 265 start-tag, point and the end-tag on successive lines, appropriately
 266 indented. The @samp{i} is mnemonic for inline and the
 267 @samp{b} is mnemonic for block.
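
As a rough illustration of these two commands (a sketch of the expected
result, not output captured from a session), suppose the buffer contains
an unclosed start-tag with point immediately after it:

@example
<p@point{}
@end example

@noindent
Then @kbd{C-c C-i} should leave the buffer like this:

@example
<p>@point{}</p>
@end example

@noindent
whereas @kbd{C-c C-b} should produce something like the following, where
the exact indentation depends on your indentation settings:

@example
<p>
  @point{}
</p>
@end example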
 268
 269 Finally, you can customize nXML mode so that @kbd{/} automatically
 270 inserts the rest of the end-tag when it occurs after @samp{<}, by
 271 doing
 272
 273 @display
 274 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
 275 @end display
 276
 277 @noindent
 278 and then following the instructions in the displayed buffer.
 279
 280 @node Paragraphs
 281 @chapter Paragraphs
 282
 283 Emacs has several commands that operate on paragraphs, most
 284 notably @kbd{M-q}. nXML mode redefines these to work in a way
 285 that is useful for XML@.  The exact rules that are used to find the
 286 beginning and end of a paragraph are complicated; they are designed
 287 mainly to ensure that @kbd{M-q} does the right thing.
 288
 289 A paragraph consists of one or more complete, consecutive lines.
 290 A group of lines is not considered a paragraph unless it contains some
 291 non-whitespace characters between tags or inside comments.  A blank
 292 line separates paragraphs.  A single tag on a line by itself also
 293 separates paragraphs.  More precisely, if one tag together with any
 294 leading and trailing whitespace completely occupy one or more lines,
 295 then those lines will not be included in any paragraph.
 296
 297 A start-tag at the beginning of the line (possibly indented) may
 298 be treated as starting a paragraph.  Similarly, an end-tag at the end
 299 of the line may be treated as ending a paragraph. The following rules
 300 are used to determine whether such a tag is in fact treated as a
 301 paragraph boundary:
 302
 303 @itemize @bullet
 304 @item
 305 If the schema does not allow text at that point, then it
 306 is a paragraph boundary.
 307 @item
 308 If the end-tag corresponding to the start-tag is not at
 309 the end of its line, or the start-tag corresponding to the end-tag is
 310 not at the beginning of its line, then it is not a paragraph
 311 boundary. For example, in
 312
 313 @example
 314 <p>This is a paragraph with an
 315 <emph>emphasized</emph> phrase.
 316 @end example
 317
 318 @noindent
 319 the @samp{<emph>} start-tag would not be considered as
 320 starting a paragraph, because its corresponding end-tag is not at the
 321 end of the line.
 322 @item
 323 If there is text that is a sibling in the element tree, then
 324 it is not a paragraph boundary.  For example, in
 325
 326 @example
 327 <p>This is a paragraph with an
 328 <emph>emphasized phrase that takes one source line</emph>
 329 @end example
 330
 331 @noindent
 332 the @samp{<emph>} start-tag would not be considered as
 333 starting a paragraph, even though its end-tag is at the end of its
 334 line, because the text @samp{This is a paragraph with an}
 335 is a sibling of the @samp{emph} element.
 336 @item
 337 Otherwise, it is a paragraph boundary.
 338 @end itemize
 339
 340 @node Outlining
 341 @chapter Outlining
 342
 343 nXML mode allows you to display all or part of a buffer as an
 344 outline, in a similar way to Emacs's outline mode.  An outline in nXML
 345 mode is based on recognizing two kinds of element: sections and
 346 headings.  There is one heading for every section and one section for
 347 every heading.  A section contains its heading as or within its first
 348 child element.  A section also contains its subordinate sections (its
 349 subsections).  The text content of a section consists of anything in a
 350 section that is neither a subsection nor a heading.
 351
 352 Note that this is a different model from that used by XHTML@.
 353 nXML mode's outline support will not be useful for XHTML unless you
 354 adopt a convention of adding a @code{div} to enclose each
 355 section, rather than having sections implicitly delimited by different
 356 @code{h@var{n}} elements.  This limitation may be removed
 357 in a future version.
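
As a sketch of the convention just described, an XHTML fragment in which
every section is wrapped in its own @code{div} might look like the
following.  The headings and text are invented for illustration.  Whether
the @code{div} and @code{h@var{n}} elements are actually recognized as
sections and headings depends on the values of
@code{nxml-section-element-name-regexp} and
@code{nxml-heading-element-name-regexp}.

@example
<div>
<h1>Food</h1>
<p>There are many kinds of food.</p>
<div>
<h2>Delicious Food</h2>
<p>Some food is delicious.</p>
</div>
</div>
@end example

@noindent
Here each @code{div} starts at the beginning of a line and has its
heading as its first child element, and the inner @code{div} is a
subsection of the outer one.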
 358
 359 The variable @code{nxml-section-element-name-regexp} gives
 360 a regexp for the local names (i.e., the part of the name following any
 361 prefix) of section elements. The variable
 362 @code{nxml-heading-element-name-regexp} gives a regexp for the
 363 local names of heading elements. For an element to be recognized
 364 as a section
 365
 366 @itemize @bullet
 367 @item
 368 its start-tag must occur at the beginning of a line
 369 (possibly indented);
 370 @item
 371 its local name must match
 372 @code{nxml-section-element-name-regexp};
 373 @item
 374 either its first child element or a descendant of that
 375 first child element must have a local name that matches
 376 @code{nxml-heading-element-name-regexp}; the first such element
 377 is treated as the section's heading.
 378 @end itemize
 379
 380 @noindent
 381 You can customize these variables using @kbd{M-x
 382 customize-variable}.
 383
 384 There are three possible outline states for a section:
 385
 386 @itemize @bullet
 387 @item
 388 normal, showing everything, including its heading, text
 389 content and subsections; each subsection is displayed according to the
 390 state of that subsection;
 391 @item
 392 showing just its heading, with both its text content and
 393 its subsections hidden; all subsections are hidden regardless of their
 394 state;
 395 @item
 396 showing its heading and its subsections, with its text
 397 content hidden; each subsection is displayed according to the state of
 398 that subsection.
 399 @end itemize
 400
 401 In the last two states, where the text content is hidden, the
 402 heading is displayed specially, in an abbreviated form. An element
 403 like this:
 404
 405 @example
 406 <section>
 407 <title>Food</title>
 408 <para>There are many kinds of food.</para>
 409 </section>
 410 @end example
 411
 412 @noindent
 413 would be displayed on a single line like this:
 414
 415 @example
 416 <-section>Food...</>
 417 @end example
 418
 419 @noindent
 420 If there are hidden subsections, then a @code{+} will be used
 421 instead of a @code{-} like this:
 422
 423 @example
 424 <+section>Food...</>
 425 @end example
 426
 427 @noindent
 428 If there are non-hidden subsections, then the section will instead be
 429 displayed like this:
 430
 431 @example
 432 <-section>Food...
 433   <-section>Delicious Food...</>
 434   <-section>Distasteful Food...</>
 435 </-section>
 436 @end example
 437
 438 @noindent
 439 The heading is always displayed with an indent that corresponds to its
 440 depth in the outline, even if it is not actually indented in the buffer.
 441 The variable @code{nxml-outline-child-indent} controls how much
 442 a subheading is indented with respect to its parent heading when the
 443 heading is being displayed specially.
 444
 445 Commands to change the outline state of sections are bound to
 446 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
 447 mnemonic for outline).  The third and final key has been chosen to be
 448 consistent with outline mode.  In the following descriptions
 449 current section means the section containing point, or, more precisely,
 450 the innermost section containing the character immediately following
 451 point.
 452
 453 @itemize @bullet
 454 @item
 455 @kbd{C-c C-o C-a} shows all sections in the buffer
 456 normally.
 457 @item
 458 @kbd{C-c C-o C-t} hides the text content
 459 of all sections in the buffer.
 460 @item
 461 @kbd{C-c C-o C-c} hides the text content
 462 of the current section.
 463 @item
 464 @kbd{C-c C-o C-e} shows the text content
 465 of the current section.
 466 @item
 467 @kbd{C-c C-o C-d} hides the text content
 468 and subsections of the current section.
 469 @item
 470 @kbd{C-c C-o C-s} shows the current section
 471 and all its direct and indirect subsections normally.
 472 @item
 473 @kbd{C-c C-o C-k} shows the headings of the
 474 direct and indirect subsections of the current section.
 475 @item
 476 @kbd{C-c C-o C-l} hides the text content of the
 477 current section and of its direct and indirect
 478 subsections.
 479 @item
 480 @kbd{C-c C-o C-i} shows the headings of the
 481 direct subsections of the current section.
 482 @item
 483 @kbd{C-c C-o C-o} hides as much as possible without
 484 hiding the current section's text content; the headings of ancestor
 485 sections of the current section and their child sections will
 486 not be hidden.
 487 @end itemize
 488
 489 When a heading is displayed specially, you can use
 490 @key{RET} in that heading to show the text content of the section
 491 in the same way as @kbd{C-c C-o C-e}.
 492
 493 You can also use the mouse to change the outline state:
 494 @kbd{S-mouse-2} hides the text content of a section in the same
 495 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
 496 displayed heading shows the text content of the section in the same
 497 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
 498 displayed start-tag toggles the display of subheadings on and
 499 off.
 500
 501 The outline state for each section is stored with the first
 502 character of the section (as a text property). Every command that
 503 changes the outline state of any section updates the display of the
 504 buffer so that each section is displayed correctly according to its
 505 outline state.  If the section structure is subsequently changed, then
 506 it is possible for the display to no longer correctly reflect the
 507 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
 508 the display so it is correct again.
 509
 510 @node Locating a schema
 511 @chapter Locating a schema
 512
 513 nXML mode has a configurable set of rules to locate a schema for
 514 the file being edited.  The rules are contained in one or more schema
 515 locating files, which are XML documents.
 516
 517 The variable @samp{rng-schema-locating-files} specifies
 518 the list of the file-names of schema locating files that nXML mode
 519 should use.  The order of the list is significant: when file
 520 @var{x} occurs in the list before file @var{y} then rules
 521 from file @var{x} have precedence over rules from file
 522 @var{y}.  A filename specified in
 523 @samp{rng-schema-locating-files} may be relative. If so, it will
 524 be resolved relative to the document for which a schema is being
 525 located. It is not an error if relative file-names in
 526 @samp{rng-schema-locating-files} do not exist. You can use
 527 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
 528 @key{RET}} to customize the list of schema locating
 529 files.
 530
 531 By default, the @samp{rng-schema-locating-files} list has two
 532 members: @samp{schemas.xml}, and
 533 @samp{@var{dist-dir}/schema/schemas.xml} where
 534 @samp{@var{dist-dir}} is the directory containing the nXML
 535 distribution. The first member will cause nXML mode to use a file
 536 @samp{schemas.xml} in the same directory as the document being
 537 edited if such a file exists.  The second member contains rules for the
 538 schemas that are included with the nXML distribution.
 539
 540 @menu
 541 * Commands for locating a schema::
 542 * Schema locating files::
 543 @end menu
 544
 545 @node Commands for locating a schema
 546 @section Commands for locating a schema
 547
 548 The command @kbd{C-c C-s C-w} will tell you what schema
 549 is currently being used.
 550
 551 The rules for locating a schema are applied automatically when
 552 you visit a file in nXML mode. However, if you have just created a new
 553 file and the schema cannot be inferred from the file-name, then this
 554 will not locate the right schema.  In this case, you should insert the
 555 start-tag of the root element and then use the command @kbd{C-c C-s
 556 C-a}, which reapplies the rules based on the current content of
 557 the document.  It is usually not necessary to insert the complete
 558 start-tag; often just @samp{<@var{name}} is
 559 enough.
 560
 561 If you want to use a schema that has not yet been added to the
 562 schema locating files, you can use the command @kbd{C-c C-s C-f}
 563 to manually select the file containing the schema for the document in
 564 the current buffer.  Emacs will read the file-name of the schema from the
 565 minibuffer. After reading the file-name, Emacs will ask whether you
 566 wish to add a rule to a schema locating file that persistently
 567 associates the document with the selected schema.  The rule will be
 568 added to the first file in the list specified by
 569 @samp{rng-schema-locating-files}; it will create the file if
 570 necessary, but will not create a directory. If the variable
 571 @samp{rng-schema-locating-files} has not been customized, this
 572 means that the rule will be added to the file @samp{schemas.xml}
 573 in the same directory as the document being edited.
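
For illustration, after associating a hypothetical document
@samp{spec.xml} with a schema @samp{docbook.rnc} in the same directory,
the resulting @samp{schemas.xml} might look roughly like this.  This is a
hand-written sketch with invented file names; the exact text that Emacs
writes may differ.

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
</locatingRules>
@end example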
 574
 575 The command @kbd{C-c C-s C-t} allows you to select a schema by
 576 specifying an identifier for the type of the document.  The schema
 577 locating files determine the available type identifiers and what
 578 schema is used for each type identifier. This is useful when it is
 579 impossible to infer the right schema from either the file-name or the
 580 content of the document, even though the schema is already in the
 581 schema locating file.  A situation in which this can occur is when
 582 there are multiple variants of a schema where all valid documents have
 583 the same document element.  For example, XHTML has Strict and
 584 Transitional variants.  In a situation like this, a schema locating file
 585 can define a type identifier for each variant. As with @kbd{C-c
 586 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
 587 locating file that persistently associates the document with the
 588 specified type identifier.
 589
 590 The command @kbd{C-c C-s C-l} adds a rule to a schema
 591 locating file that persistently associates the document with
 592 the schema that is currently being used.
 593
 594 @node Schema locating files
 595 @section Schema locating files
 596
 597 Each schema locating file specifies a list of rules.  The rules
 598 from each file are appended in order. To locate a schema each rule is
 599 applied in turn until a rule matches.  The first matching rule is then
 600 used to determine the schema.
 601
 602 Schema locating files are designed to be useful for other
 603 applications that need to locate a schema for a document. In fact,
 604 there is nothing specific to locating schemas in the design; it could
 605 equally well be used for locating a stylesheet.
 606
 607 @menu
 608 * Schema locating file syntax basics::
 609 * Using the document's URI to locate a schema::
 610 * Using the document element to locate a schema::
 611 * Using type identifiers in schema locating files::
 612 * Using multiple schema locating files::
 613 @end menu
 614
 615 @node Schema locating file syntax basics
 616 @subsection Schema locating file syntax basics
 617
 618 There is a schema for schema locating files in the file
 619 @samp{locate.rnc} in the schema directory.  Schema locating
 620 files must be valid with respect to this schema.
 621
 622 The document element of a schema locating file must be
 623 @samp{locatingRules} and the namespace URI must be
 624 @samp{http://thaiopensource.com/ns/locating-rules/1.0}.  The
 625 children of the document element specify rules. The order of the
 626 children is the same as the order of the rules.  Here's a complete
 627 example of a schema locating file:
 628
 629 @example
 630 <?xml version="1.0"?>
 631 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 632   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 633   <documentElement localName="book" uri="docbook.rnc"/>
 634 </locatingRules>
 635 @end example
 636
 637 @noindent
 638 This says to use the schema @samp{xhtml.rnc} for a document whose
 639 document element has the namespace URI @samp{http://www.w3.org/1999/xhtml},
 640 and the schema @samp{docbook.rnc} for a document whose document element
 641 has the local name @samp{book}.  If the document element had both a namespace URI
 642 of @samp{http://www.w3.org/1999/xhtml} and a local name of
 643 @samp{book}, then the matching rule that comes first will be
 644 used and so the schema @samp{xhtml.rnc} would be used.  There is
 645 no precedence between different types of rule; the first matching rule
 646 of any type is used.
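
To make the effect of rule order concrete, here is the same pair of
rules in the opposite order (a sketch that reuses the schema file names
from the example above):

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <documentElement localName="book" uri="docbook.rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
</locatingRules>
@end example

@noindent
With this ordering, a document element that has both the XHTML namespace
URI and the local name @samp{book} would select @samp{docbook.rnc},
because the @samp{documentElement} rule now comes first.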
 647
 648 As usual with XML-related technologies, resources are identified
 649 by URIs.  The @samp{uri} attribute identifies the schema by
 650 specifying the URI@.  The URI may be relative.  If so, it is resolved
 651 relative to the URI of the schema locating file that contains the
 652 attribute.  This means that if the value of the @samp{uri} attribute
 653 does not contain a @samp{/}, then it will refer to a filename in
 654 the same directory as the schema locating file.
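
As a sketch of how this resolution works, assume a schema locating file
at the hypothetical location @samp{file:///home/jjc/docs/schemas.xml}
that contains these two rules (all paths here are invented for
illustration):

@example
<namespace ns="http://www.w3.org/1999/xhtml" uri="rnc/xhtml.rnc"/>
<documentElement localName="book"
                 uri="file:///usr/share/xml/docbook.rnc"/>
@end example

@noindent
The relative URI in the first rule would resolve to
@samp{file:///home/jjc/docs/rnc/xhtml.rnc}, that is, to a file in the
@samp{rnc} subdirectory next to the schema locating file.  The URI in
the second rule is already absolute, so it would be used unchanged.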
 655
 656 @node Using the document's URI to locate a schema
 657 @subsection Using the document's URI to locate a schema
 658
 659 A @samp{uri} rule locates a schema based on the URI of the
 660 document.  The @samp{uri} attribute specifies the URI of the
 661 schema.  The @samp{resource} attribute can be used to specify
 662 the schema for a particular document.  For example,
 663
 664 @example
 665 <uri resource="spec.xml" uri="docbook.rnc"/>
 666 @end example
 667
 668 @noindent
 669 specifies that the schema for @samp{spec.xml} is
 670 @samp{docbook.rnc}.
 671
 672 The @samp{pattern} attribute can be used instead of the
 673 @samp{resource} attribute to specify the schema for any document
 674 whose URI matches a pattern.  The pattern has the same syntax as an
 675 absolute or relative URI except that the path component of the URI can
 676 use a @samp{*} character to stand for zero or more characters
 677 within a path segment (i.e., any character other than @samp{/}).
 678 Typically, the URI pattern looks like a relative URI, but, whereas a
 679 relative URI in the @samp{resource} attribute is resolved into a
 680 particular absolute URI using the base URI of the schema locating
 681 file, a relative URI pattern matches if it matches some number of
 682 complete path segments of the document's URI ending with the last path
 683 segment of the document's URI@. For example,
 684
 685 @example
 686 <uri pattern="*.xsl" uri="xslt.rnc"/>
 687 @end example
 688
 689 @noindent
 690 specifies that the schema for documents with a URI whose path ends
 691 with @samp{.xsl} is @samp{xslt.rnc}.
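
A pattern may also span more than one path segment.  The following rule
is a sketch with an invented directory name:

@example
<uri pattern="stylesheets/*.xml" uri="xslt.rnc"/>
@end example

@noindent
On this reading of the matching rule, the pattern would match a document
such as @samp{file:///home/jjc/project/stylesheets/main.xml}, whose last
two path segments are @samp{stylesheets} and @samp{main.xml}.  It would
not match @samp{file:///home/jjc/project/main.xml}, and, because
@samp{*} does not match @samp{/}, it would not match
@samp{file:///home/jjc/project/stylesheets/sub/main.xml} either.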
 692
 693 A @samp{transformURI} rule locates a schema by
 694 transforming the URI of the document. The @samp{fromPattern}
 695 attribute specifies a URI pattern with the same meaning as the
 696 @samp{pattern} attribute of the @samp{uri} element.  The
 697 @samp{toPattern} attribute is a URI pattern that is used to
 698 generate the URI of the schema.  Each @samp{*} in the
 699 @samp{toPattern} is replaced by the string that matched the
 700 corresponding @samp{*} in the @samp{fromPattern}.  The
 701 resulting string is appended to the initial part of the document's URI
 702 that was not explicitly matched by the @samp{fromPattern}.  The
 703 rule matches only if the transformed URI identifies an existing
 704 resource.  For example, the rule
 705
 706 @example
 707 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 708 @end example
 709
 710 @noindent
 711 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
 712 into the URI @samp{file:///home/jjc/docs/spec.rnc}.  Thus, this
 713 rule specifies that to locate a schema for a document
 714 @samp{@var{foo}.xml}, Emacs should test whether a file
 715 @samp{@var{foo}.rnc} exists in the same directory as
 716 @samp{@var{foo}.xml}, and, if so, should use it as the
 717 schema.
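
The @samp{toPattern} need not name a file in the same directory.  As a
sketch, assuming a hypothetical layout in which schemas are kept in a
@samp{schema} subdirectory, the rule

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
would, following the description above, transform
@samp{file:///home/jjc/docs/spec.xml} into
@samp{file:///home/jjc/docs/schema/spec.rnc}: the @samp{*} stands for
@samp{spec}, and the result @samp{schema/spec.rnc} is appended to the
unmatched initial part @samp{file:///home/jjc/docs/}.  The rule would
match only if that file exists.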
 718
 719 @node Using the document element to locate a schema
 720 @subsection Using the document element to locate a schema
 721
 722 A @samp{documentElement} rule locates a schema based on
 723 the local name and prefix of the document element. For example, a rule
 724
 725 @example
 726 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
 727 @end example
 728
 729 @noindent
 730 specifies that when the name of the document element is
 731 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
 732 as the schema. Either the @samp{prefix} or
 733 @samp{localName} attribute may be omitted to allow any prefix or
 734 local name.
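
For instance, the following two rules (a sketch that reuses the schema
name from the example above) each omit one of the attributes:

@example
<documentElement localName="stylesheet" uri="xslt.rnc"/>
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example

@noindent
The first rule matches a document element whose local name is
@samp{stylesheet}, whatever its prefix.  The second matches any document
element whose prefix is @samp{xsl}, whatever its local name.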
 735
 736 A @samp{namespace} rule locates a schema based on the
 737 namespace URI of the document element. For example, a rule
 738
 739 @example
 740 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
 741 @end example
 742
 743 @noindent
 744 specifies that when the namespace URI of the document element is
 745 @samp{http://www.w3.org/1999/XSL/Transform}, then
 746 @samp{xslt.rnc} should be used as the schema.
 747
 748 @node Using type identifiers in schema locating files
 749 @subsection Using type identifiers in schema locating files
 750
 751 Type identifiers allow a level of indirection in locating the
 752 schema for a document.  Instead of associating the document directly
 753 with a schema URI, the document is associated with a type identifier,
 754 which is in turn associated with a schema URI@. nXML mode does not
 755 constrain the format of type identifiers.  They can be simply strings
 756 without any formal structure or they can be public identifiers or
 757 URIs.  Note that these type identifiers have nothing to do with the
 758 DOCTYPE declaration.  When comparing type identifiers, whitespace is
 759 normalized in the same way as with the @samp{xsd:token}
 760 datatype: leading and trailing whitespace is stripped; other sequences
 761 of whitespace are normalized to a single space character.
 762
 763 Each of the rules described in previous sections that uses a
 764 @samp{uri} attribute to specify a schema, can instead use a
 765 @samp{typeId} attribute to specify a type identifier.  The type
 766 identifier can be associated with a URI using a @samp{typeId}
 767 element. For example,
 768
 769 @example
 770 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 771   <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
 772   <typeId id="XHTML" typeId="XHTML Strict"/>
 773   <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
 774   <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
 775 </locatingRules>
 776 @end example
 777
 778 @noindent
 779 declares three type identifiers: @samp{XHTML} (representing the
 780 default variant of XHTML to be used), @samp{XHTML Strict} and
 781 @samp{XHTML Transitional}.  Such a schema locating file would
 782 use @samp{xhtml-strict.rnc} for a document whose namespace is
 783 @samp{http://www.w3.org/1999/xhtml}.  But it is considerably
 784 more flexible than a schema locating file that simply specified
 785
 786 @example
 787 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
 788 @end example
 789
 790 @noindent
 791 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
 792 Strict and XHTML Transitional.  Also, a user can easily add a schema locating file
 793
 794 @example
 795 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 796   <typeId id="XHTML" typeId="XHTML Transitional"/>
 797 </locatingRules>
 798 @end example
 799
 800 @noindent
 801 that makes the default variant of XHTML be XHTML Transitional.
 802
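The whitespace normalization described at the start of this section can
be sketched as follows.  Assuming the comparison works as stated there,
the @samp{typeId} attribute on the @samp{namespace} rule below would be
treated as equal to the identifier @samp{XHTML Strict}, despite the
extra spaces.  The spacing is exaggerated purely for illustration.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example
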
 803 @node Using multiple schema locating files
 804 @subsection Using multiple schema locating files
 805
 806 The @samp{include} element includes rules from another
 807 schema locating file.  The behavior is exactly as if the rules from
 808 that file were included in place of the @samp{include} element.
 809 Relative URIs are resolved into absolute URIs before the inclusion is
 810 performed. For example,
 811
 812 @example
 813 <include rules="../rules.xml"/>
 814 @end example
 815
 816 @noindent
 817 includes the rules from @samp{rules.xml}.
 818
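Because included rules take the place of the @samp{include} element,
its position within the file matters.  The following sketch uses
hypothetical file names.  In it, the XHTML rule would be considered
first, then whatever rules @samp{../rules.xml} contains, and finally
the XSLT rule.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <include rules="../rules.xml"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
</locatingRules>
@end example
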
 819 The process of locating a schema takes as input a list of schema
 820 locating files.  The rules in all these files and in the files they
 821 include are resolved into a single list of rules, which are applied
 822 strictly in order.  Sometimes this order is not what is needed.
 823 For example, suppose you have two schema locating files, a private
 824 file
 825
 826 @example
 827 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 828   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 829 </locatingRules>
 830 @end example
 831
 832 @noindent
 833 followed by a public file
 834
 835 @example
 836 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 837   <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
 838   <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
 839 </locatingRules>
 840 @end example
 841
 842 @noindent
 843 The effect of these two files is that the XHTML @samp{namespace}
 844 rule takes precedence over the @samp{transformURI} rule, which
 845 is almost certainly not what is needed.  This can be solved by adding
 846 an @samp{applyFollowingRules} element to the private file:
 847
 848 @example
 849 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 850   <applyFollowingRules ruleType="transformURI"/>
 851   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 852 </locatingRules>
 853 @end example
 854
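As a rough sketch of the intended effect, and assuming that
@samp{applyFollowingRules} causes later rules of the named type to be
tried at the point where it appears, the private and public files
together would then behave approximately like the single list below.
This is an illustration of the resulting order, not a file that nXML
mode generates.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI pathSuffix=".xml" replacePathSuffix=".rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example
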
 855 @node DTDs
 856 @chapter DTDs
 857
 858 nXML mode is designed to support the creation of standalone XML
 859 documents that do not depend on a DTD@.  Although it is common practice
 860 to insert a DOCTYPE declaration referencing an external DTD, this has
 861 undesirable side-effects.  It means that the document is no longer
 862 self-contained. It also means that different XML parsers may interpret
 863 the document in different ways, since the XML Recommendation does not
 864 require XML parsers to read the DTD@.  With DTDs, it was impractical to
 865 get validation without using an external DTD or reference to a
 866 parameter entity.  With RELAX NG and other schema languages, you can
 867 simultaneously get the benefits of validation and standalone XML
 868 documents.  Therefore, I recommend that you do not reference an
 869 external DOCTYPE in your XML documents.
 870
 871 One problem is entities for characters. Typically, as well as
 872 providing validation, DTDs also provide a set of character entities
 873 for documents to use. Schemas cannot provide this functionality,
 874 because schema validation happens after XML parsing.  The recommended
 875 solution is to either use the Unicode characters directly, or, if this
 876 is impractical, use character references.  nXML mode supports this by
 877 providing commands for entering characters and character references
 878 using the Unicode names, and can display the glyph corresponding to a
 879 character reference.
 880
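For instance, a document that would otherwise rely on a DTD-defined
entity such as @samp{&eacute;} can use a numeric character reference
instead.  In the sketch below the element name @samp{word} is
hypothetical.  Both lines refer to the same character, U+00E9 (LATIN
SMALL LETTER E WITH ACUTE), once in decimal and once in hexadecimal.

@example
<word>caf&#233;</word>
<word>caf&#xE9;</word>
@end example
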
 881 @node Limitations
 882 @chapter Limitations
 883
 884 nXML mode has some limitations:
 885
 886 @itemize @bullet
 887 @item
 888 DTD support is limited.  Internal parsed general entities declared
 889 in the internal subset are supported provided they do not contain
 890 elements. Other usage of DTDs is ignored.
 891 @item
 892 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
 893 specification are not enforced.
 894 @end itemize
 895
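To make the first limitation concrete, here is a sketch of an entity of
the supported kind: an internal parsed general entity, declared in the
internal subset, whose replacement text contains no elements.  The
names @samp{doc} and @samp{product} are hypothetical.

@example
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
]>
<doc>This file was edited with &product;.</doc>
@end example

@noindent
By contrast, an entity whose replacement text contains markup, such as
@samp{<em>draft</em>}, would fall outside what is supported.
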
 896 @node GNU Free Documentation License
 897 @appendix GNU Free Documentation License
 898 @include doclicense.texi
 899
 900 @bye