doc/misc/nxml-mode.texi

   1 \input texinfo @c -*- texinfo -*-
   2 @c %**start of header
   3 @setfilename ../../info/nxml-mode
   4 @settitle nXML Mode
   5 @c %**end of header
   6
   7 @dircategory Emacs
   8 @direntry
   9 * nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
  10 @end direntry
  11
  12 @node Top
  13 @top nXML Mode
  14
  15 This manual documents nxml-mode, an Emacs major mode for editing
  16 XML with RELAX NG support.  This manual is not yet complete.
  17
  18 @menu
  19 * Completion::
  20 * Inserting end-tags::
  21 * Paragraphs::
  22 * Outlining::
  23 * Locating a schema::
  24 * DTDs::
  25 * Limitations::
  26 @end menu
  27
  28 @node Completion
  29 @chapter Completion
  30
  31 Apart from real-time validation, the most important feature that
  32 nxml-mode provides for assisting in document creation is "completion".
  33 Completion assists the user in inserting characters at point, based on
  34 knowledge of the schema and on the contents of the buffer before
  35 point.
  36
  37 The traditional GNU Emacs key combination for completion in a
  38 buffer is @kbd{M-@key{TAB}}. However, many window systems
  39 and window managers use this key combination themselves (typically for
  40 switching between windows) and do not pass it to applications. It's
  41 hard to find key combinations in GNU Emacs that are both easy to type
  42 and not taken by something else.  @kbd{C-@key{RET}} (i.e.
  43 pressing the Enter or Return key, while the Ctrl key is held down) is
  44 available.  It won't be available on a traditional terminal (because
  45 it is indistinguishable from Return), but it will work with a window
  46 system.  Therefore we adopt the following solution by default: use
  47 @kbd{C-@key{RET}} when there's a window system and
  48 @kbd{M-@key{TAB}} when there's not.  In the following, I
  49 will assume that a window system is being used and will therefore
  50 refer to @kbd{C-@key{RET}}.
  51
  52 Completion works by examining the symbol preceding point.  This
  53 is the symbol to be completed. The symbol to be completed may be
  54 empty. Completion considers what symbols starting with the symbol to
  55 be completed would be valid replacements for the symbol to be
  56 completed, given the schema and the contents of the buffer before
  57 point.  These symbols are the possible completions.  An example may
  58 make this clearer.  Suppose the buffer looks like this (where @point{}
  59 indicates point):
  60
  61 @example
  62 <html xmlns="http://www.w3.org/1999/xhtml">
  63 <h@point{}
  64 @end example
  65
  66 @noindent
  67 and the schema is XHTML.  In this context, the symbol to be completed
  68 is @samp{h}.  The possible completions consist of just
  69 @samp{head}.  Another example is:
  70
  71 @example
  72 <html xmlns="http://www.w3.org/1999/xhtml">
  73 <head>
  74 <@point{}
  75 @end example
  76
  77 @noindent
  78 In this case, the symbol to be completed is empty, and the possible
  79 completions are @samp{base}, @samp{isindex},
  80 @samp{link}, @samp{meta}, @samp{script},
  81 @samp{style}, @samp{title}.  Another example is:
  82
  83 @example
  84 <html xmlns="@point{}
  85 @end example
  86
  87 @noindent
  88 In this case, the symbol to be completed is empty, and the possible
  89 completions are just @samp{http://www.w3.org/1999/xhtml}.
  90
  91 When you type @kbd{C-@key{RET}}, what happens depends
  92 on what the set of possible completions is.
  93
  94 @itemize @bullet
  95 @item
  96 If the set of completions is empty, nothing
  97 happens.
  98 @item
  99 If there is one possible completion, then that completion is
 100 inserted, together with any following characters that are
 101 required. For example, in this case:
 102
 103 @example
 104 <html xmlns="http://www.w3.org/1999/xhtml">
 105 <@point{}
 106 @end example
 107
 108 @noindent
 109 @kbd{C-@key{RET}} will yield
 110
 111 @example
 112 <html xmlns="http://www.w3.org/1999/xhtml">
 113 <head@point{}
 114 @end example
 115 @item
 116 If there is more than one possible completion, but all
 117 possible completions share a common non-empty prefix, then that prefix
 118 is inserted. For example, suppose the buffer is:
 119
 120 @example
 121 <html x@point{}
 122 @end example
 123
 124 @noindent
 125 The symbol to be completed is @samp{x}. The possible completions
 126 are @samp{xmlns} and @samp{xml:lang}.  These share a
 127 common prefix of @samp{xml}.  Thus, @kbd{C-@key{RET}}
 128 will yield:
 129
 130 @example
 131 <html xml@point{}
 132 @end example
 133
 134 @noindent
 135 Typically, you would do @kbd{C-@key{RET}} again, which would
 136 have the result described in the next item.
 137 @item
 138 If there is more than one possible completion, but the
 139 possible completions do not share a non-empty prefix, then Emacs will
 140 prompt you to input the symbol in the minibuffer, initializing the
 141 minibuffer with the symbol to be completed, and popping up a buffer
 142 showing the possible completions.  You can now input the symbol to be
 143 inserted.  The symbol you input will be inserted in the buffer instead
 144 of the symbol to be completed.  Emacs will then insert any required
 145 characters after the symbol.  For example, if the buffer contains:
 146
 147 @example
 148 <html xml@point{}
 149 @end example
 150
 151 @noindent
 152 Emacs will prompt you in the minibuffer with
 153
 154 @example
 155 Attribute: xml@point{}
 156 @end example
 157
 158 @noindent
 159 and the buffer showing possible completions will contain
 160
 161 @example
 162 Possible completions are:
 163 xml:lang                           xmlns
 164 @end example
 165
 166 @noindent
 167 If you input @kbd{xmlns}, the result will be:
 168
 169 @example
 170 <html xmlns="@point{}
 171 @end example
 172
 173 @noindent
 174 (If you do @kbd{C-@key{RET}} again, the namespace URI will
 175 be inserted. Should that happen automatically?)
 176 @end itemize
 177
 178 @node Inserting end-tags
 179 @chapter Inserting end-tags
 180
 181 The main redundancy in XML syntax is end-tags.  nxml-mode provides
 182 several ways to make it easier to enter end-tags.  You can use all of
 183 these without a schema.
 184
 185 You can use @kbd{C-@key{RET}} after @samp{</}
 186 to complete the rest of the end-tag.
 187
 188 @kbd{C-c C-f} inserts an end-tag for the element containing
 189 point. This command is useful when you want to input the start-tag,
 190 then input the content and finally input the end-tag. The @samp{f}
 191 is mnemonic for finish.
 192
 193 If you want to keep tags balanced and input the end-tag at the
 194 same time as the start-tag, before inputting the content, then you can
 195 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
 196 the end-tag and leaves point before the end-tag.  @kbd{C-c C-b}
 197 is similar but more convenient for block-level elements: it puts the
 198 start-tag, point and the end-tag on successive lines, appropriately
 199 indented. The @samp{i} is mnemonic for inline and the
 200 @samp{b} is mnemonic for block.
 201
 202 Finally, you can customize nxml-mode so that @kbd{/}
 203 automatically inserts the rest of the end-tag when it occurs after
 204 @samp{<}, by doing
 205
 206 @display
 207 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
 208 @end display
 209
 210 @noindent
 211 and then following the instructions in the displayed buffer.
 212
 213 @node Paragraphs
 214 @chapter Paragraphs
 215
 216 Emacs has several commands that operate on paragraphs, most
 217 notably @kbd{M-q}. nXML mode redefines these to work in a way
 218 that is useful for XML.  The exact rules that are used to find the
 219 beginning and end of a paragraph are complicated; they are designed
 220 mainly to ensure that @kbd{M-q} does the right thing.
 221
 222 A paragraph consists of one or more complete, consecutive lines.
 223 A group of lines is not considered a paragraph unless it contains some
 224 non-whitespace characters between tags or inside comments.  A blank
 225 line separates paragraphs.  A single tag on a line by itself also
 226 separates paragraphs.  More precisely, if one tag together with any
 227 leading and trailing whitespace completely occupy one or more lines,
 228 then those lines will not be included in any paragraph.
 229
 230 A start-tag at the beginning of the line (possibly indented) may
 231 be treated as starting a paragraph.  Similarly, an end-tag at the end
 232 of the line may be treated as ending a paragraph. The following rules
 233 are used to determine whether such a tag is in fact treated as a
 234 paragraph boundary:
 235
 236 @itemize @bullet
 237 @item
 238 If the schema does not allow text at that point, then it
 239 is a paragraph boundary.
 240 @item
 241 If the end-tag corresponding to the start-tag is not at
 242 the end of its line, or the start-tag corresponding to the end-tag is
 243 not at the beginning of its line, then it is not a paragraph
 244 boundary. For example, in
 245
 246 @example
 247 <p>This is a paragraph with an
 248 <emph>emphasized</emph> phrase.
 249 @end example
 250
 251 @noindent
 252 the @samp{<emph>} start-tag would not be considered as
 253 starting a paragraph, because its corresponding end-tag is not at the
 254 end of the line.
 255 @item
 256 If there is text that is a sibling in the element tree, then
 257 it is not a paragraph boundary.  For example, in
 258
 259 @example
 260 <p>This is a paragraph with an
 261 <emph>emphasized phrase that takes one source line</emph>
 262 @end example
 263
 264 @noindent
 265 the @samp{<emph>} start-tag would not be considered as
 266 starting a paragraph, even though its end-tag is at the end of its
 267 line, because the text @samp{This is a paragraph with an}
 268 is a sibling of the @samp{emph} element.
 269 @item
 270 Otherwise, it is a paragraph boundary.
 271 @end itemize
 272
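As a sketch of the last rule, consider this hypothetical fragment (the
element names are only illustrative):

@example
<section>
<p>This is the first paragraph.</p>
<p>This is the second paragraph,
which continues on a second line.</p>
</section>
@end example

@noindent
Each @samp{<p>} start-tag begins a line, the corresponding end-tag
ends a line, and the @samp{p} elements have no sibling text.  Under
the rules above, each @samp{<p>} start-tag would therefore be expected
to start a paragraph and each @samp{</p>} end-tag to end one.  The
@samp{<section>} and @samp{</section>} tags occupy lines by themselves
and so are not included in any paragraph.
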
 273 @node Outlining
 274 @chapter Outlining
 275
 276 nXML mode allows you to display all or part of a buffer as an
 277 outline, in a similar way to Emacs' outline mode.  An outline in nXML
 278 mode is based on recognizing two kinds of element: sections and
 279 headings.  There is one heading for every section and one section for
 280 every heading.  A section contains its heading as or within its first
 281 child element.  A section also contains its subordinate sections (its
 282 subsections).  The text content of a section consists of anything in a
 283 section that is neither a subsection nor a heading.
 284
 285 Note that this is a different model from that used by XHTML.
 286 nXML mode's outline support will not be useful for XHTML unless you
 287 adopt a convention of adding a @code{div} to enclose each
 288 section, rather than having sections implicitly delimited by different
 289 @code{h@var{n}} elements.  This limitation may be removed
 290 in a future version.
 291
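For illustration, an XHTML document that follows such a convention
might be structured like the sketch below.  This assumes that
@code{nxml-section-element-name-regexp} matches @samp{div} and that
@code{nxml-heading-element-name-regexp} matches @samp{h1} and
@samp{h2}; these variables are described below and may need to be
customized for this to work.

@example
<div>
<h1>Food</h1>
<p>There are many kinds of food.</p>
<div>
<h2>Delicious Food</h2>
<p>Some food is delicious.</p>
</div>
</div>
@end example

@noindent
Each @samp{div} start-tag is at the beginning of a line and has its
heading as its first child element, so each @samp{div} could be
recognized as a section, with the inner @samp{div} as a subsection of
the outer one.
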
 292 The variable @code{nxml-section-element-name-regexp} gives
 293 a regexp for the local names (i.e. the part of the name following any
 294 prefix) of section elements. The variable
 295 @code{nxml-heading-element-name-regexp} gives a regexp for the
 296 local names of heading elements. For an element to be recognized
 297 as a section
 298
 299 @itemize @bullet
 300 @item
 301 its start-tag must occur at the beginning of a line
 302 (possibly indented);
 303 @item
 304 its local name must match
 305 @code{nxml-section-element-name-regexp};
 306 @item
 307 either its first child element or a descendant of that
 308 first child element must have a local name that matches
 309 @code{nxml-heading-element-name-regexp}; the first such element
 310 is treated as the section's heading.
 311 @end itemize
 312
 313 @noindent
 314 You can customize these variables using @kbd{M-x
 315 customize-variable}.
 316
 317 There are three possible outline states for a section:
 318
 319 @itemize @bullet
 320 @item
 321 normal, showing everything, including its heading, text
 322 content and subsections; each subsection is displayed according to the
 323 state of that subsection;
 324 @item
 325 showing just its heading, with both its text content and
 326 its subsections hidden; all subsections are hidden regardless of their
 327 state;
 328 @item
 329 showing its heading and its subsections, with its text
 330 content hidden; each subsection is displayed according to the state of
 331 that subsection.
 332 @end itemize
 333
 334 In the last two states, where the text content is hidden, the
 335 heading is displayed specially, in an abbreviated form. An element
 336 like this:
 337
 338 @example
 339 <section>
 340 <title>Food</title>
 341 <para>There are many kinds of food.</para>
 342 </section>
 343 @end example
 344
 345 @noindent
 346 would be displayed on a single line like this:
 347
 348 @example
 349 <-section>Food...</>
 350 @end example
 351
 352 @noindent
 353 If there are hidden subsections, then a @code{+} will be used
 354 instead of a @code{-} like this:
 355
 356 @example
 357 <+section>Food...</>
 358 @end example
 359
 360 @noindent
 361 If there are non-hidden subsections, then the section will instead be
 362 displayed like this:
 363
 364 @example
 365 <-section>Food...
 366   <-section>Delicious Food...</>
 367   <-section>Distasteful Food...</>
 368 </-section>
 369 @end example
 370
 371 @noindent
 372 The heading is always displayed with an indent that corresponds to its
 373 depth in the outline, even if it is not actually indented in the buffer.
 374 The variable @code{nxml-outline-child-indent} controls how much
 375 a subheading is indented with respect to its parent heading when the
 376 heading is being displayed specially.
 377
 378 Commands to change the outline state of sections are bound to
 379 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
 380 mnemonic for outline).  The third and final key has been chosen to be
 381 consistent with outline mode.  In the following descriptions
 382 current section means the section containing point, or, more precisely,
 383 the innermost section containing the character immediately following
 384 point.
 385
 386 @itemize @bullet
 387 @item
 388 @kbd{C-c C-o C-a} shows all sections in the buffer
 389 normally.
 390 @item
 391 @kbd{C-c C-o C-t} hides the text content
 392 of all sections in the buffer.
 393 @item
 394 @kbd{C-c C-o C-c} hides the text content
 395 of the current section.
 396 @item
 397 @kbd{C-c C-o C-e} shows the text content
 398 of the current section.
 399 @item
 400 @kbd{C-c C-o C-d} hides the text content
 401 and subsections of the current section.
 402 @item
 403 @kbd{C-c C-o C-s} shows the current section
 404 and all its direct and indirect subsections normally.
 405 @item
 406 @kbd{C-c C-o C-k} shows the headings of the
 407 direct and indirect subsections of the current section.
 408 @item
 409 @kbd{C-c C-o C-l} hides the text content of the
 410 current section and of its direct and indirect
 411 subsections.
 412 @item
 413 @kbd{C-c C-o C-i} shows the headings of the
 414 direct subsections of the current section.
 415 @item
 416 @kbd{C-c C-o C-o} hides as much as possible without
 417 hiding the current section's text content; the headings of ancestor
 418 sections of the current section and their child sections will
 419 not be hidden.
 420 @end itemize
 421
 422 When a heading is displayed specially, you can use
 423 @key{RET} in that heading to show the text content of the section
 424 in the same way as @kbd{C-c C-o C-e}.
 425
 426 You can also use the mouse to change the outline state:
 427 @kbd{S-mouse-2} hides the text content of a section in the same
 428 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
 429 displayed heading shows the text content of the section in the same
 430 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
 431 displayed start-tag toggles the display of subheadings on and
 432 off.
 433
 434 The outline state for each section is stored with the first
 435 character of the section (as a text property). Every command that
 436 changes the outline state of any section updates the display of the
 437 buffer so that each section is displayed correctly according to its
 438 outline state.  If the section structure is subsequently changed, then
 439 it is possible for the display to no longer correctly reflect the
 440 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
 441 the display so it is correct again.
 442
 443 @node Locating a schema
 444 @chapter Locating a schema
 445
 446 nXML mode has a configurable set of rules to locate a schema for
 447 the file being edited.  The rules are contained in one or more schema
 448 locating files, which are XML documents.
 449
 450 The variable @samp{rng-schema-locating-files} specifies
 451 the list of the file-names of schema locating files that nXML mode
 452 should use.  The order of the list is significant: when file
 453 @var{x} occurs in the list before file @var{y} then rules
 454 from file @var{x} have precedence over rules from file
 455 @var{y}.  A filename specified in
 456 @samp{rng-schema-locating-files} may be relative. If so, it will
 457 be resolved relative to the document for which a schema is being
 458 located. It is not an error if relative file-names in
 459 @samp{rng-schema-locating-files} do not exist. You can use
 460 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
 461 @key{RET}} to customize the list of schema locating
 462 files.
 463
 464 By default, the @samp{rng-schema-locating-files} list has two
 465 members: @samp{schemas.xml}, and
 466 @samp{@var{dist-dir}/schema/schemas.xml} where
 467 @samp{@var{dist-dir}} is the directory containing the nXML
 468 distribution. The first member will cause nXML mode to use a file
 469 @samp{schemas.xml} in the same directory as the document being
 470 edited if such a file exists.  The second member contains rules for the
 471 schemas that are included with the nXML distribution.
 472
 473 @menu
 474 * Commands for locating a schema::
 475 * Schema locating files::
 476 @end menu
 477
 478 @node Commands for locating a schema
 479 @section Commands for locating a schema
 480
 481 The command @kbd{C-c C-s C-w} will tell you what schema
 482 is currently being used.
 483
 484 The rules for locating a schema are applied automatically when
 485 you visit a file in nXML mode. However, if you have just created a new
 486 file and the schema cannot be inferred from the file-name, then this
 487 will not locate the right schema.  In this case, you should insert the
 488 start-tag of the root element and then use the command @kbd{C-c C-s
 489 C-a}, which reapplies the rules based on the current content of
 490 the document.  It is usually not necessary to insert the complete
 491 start-tag; often just @samp{<@var{name}} is
 492 enough.
 493
 494 If you want to use a schema that has not yet been added to the
 495 schema locating files, you can use the command @kbd{C-c C-s C-f}
 496 to manually select the file containing the schema for the document in
 497 the current buffer.  Emacs will read the file-name of the schema from the
 498 minibuffer. After reading the file-name, Emacs will ask whether you
 499 wish to add a rule to a schema locating file that persistently
 500 associates the document with the selected schema.  The rule will be
 501 added to the first file in the list specified by
 502 @samp{rng-schema-locating-files}; it will create the file if
 503 necessary, but will not create a directory. If the variable
 504 @samp{rng-schema-locating-files} has not been customized, this
 505 means that the rule will be added to the file @samp{schemas.xml}
 506 in the same directory as the document being edited.
 507
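As a rough sketch, after associating a hypothetical document
@samp{spec.xml} with a schema @samp{docbook.rnc} in this way, the
@samp{schemas.xml} file in the document's directory might contain
something like the following.  The file names are assumptions chosen
for illustration, and the exact markup that Emacs writes may differ.

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
</locatingRules>
@end example
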
 508 The command @kbd{C-c C-s C-t} allows you to select a schema by
 509 specifying an identifier for the type of the document.  The schema
 510 locating files determine the available type identifiers and what
 511 schema is used for each type identifier. This is useful when it is
 512 impossible to infer the right schema from either the file-name or the
 513 content of the document, even though the schema is already in the
 514 schema locating file.  A situation in which this can occur is when
 515 there are multiple variants of a schema where all valid documents have
 516 the same document element.  For example, XHTML has Strict and
 517 Transitional variants.  In a situation like this, a schema locating file
 518 can define a type identifier for each variant. As with @kbd{C-c
 519 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
 520 locating file that persistently associates the document with the
 521 specified type identifier.
 522
 523 The command @kbd{C-c C-s C-l} adds a rule to a schema
 524 locating file that persistently associates the document with
 525 the schema that is currently being used.
 526
 527 @node Schema locating files
 528 @section Schema locating files
 529
 530 Each schema locating file specifies a list of rules.  The rules
 531 from each file are appended in order. To locate a schema each rule is
 532 applied in turn until a rule matches.  The first matching rule is then
 533 used to determine the schema.
 534
 535 Schema locating files are designed to be useful for other
 536 applications that need to locate a schema for a document. In fact,
 537 there is nothing specific to locating schemas in the design; it could
 538 equally well be used for locating a stylesheet.
 539
 540 @menu
 541 * Schema locating file syntax basics::
 542 * Using the document's URI to locate a schema::
 543 * Using the document element to locate a schema::
 544 * Using type identifiers in schema locating files::
 545 * Using multiple schema locating files::
 546 @end menu
 547
 548 @node Schema locating file syntax basics
 549 @subsection Schema locating file syntax basics
 550
 551 There is a schema for schema locating files in the file
 552 @samp{locate.rnc} in the schema directory.  Schema locating
 553 files must be valid with respect to this schema.
 554
 555 The document element of a schema locating file must be
 556 @samp{locatingRules} and the namespace URI must be
 557 @samp{http://thaiopensource.com/ns/locating-rules/1.0}.  The
 558 children of the document element specify rules. The order of the
 559 children is the same as the order of the rules.  Here's a complete
 560 example of a schema locating file:
 561
 562 @example
 563 <?xml version="1.0"?>
 564 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 565   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 566   <documentElement localName="book" uri="docbook.rnc"/>
 567 </locatingRules>
 568 @end example
 569
 570 @noindent
 571 This says to use the schema @samp{xhtml.rnc} for a document whose document
 572 element has the namespace URI @samp{http://www.w3.org/1999/xhtml}, and to use the
 573 schema @samp{docbook.rnc} for a document whose document element has the local name
 574 @samp{book}.  If the document element had both a namespace URI
 575 of @samp{http://www.w3.org/1999/xhtml} and a local name of
 576 @samp{book}, then the matching rule that comes first will be
 577 used and so the schema @samp{xhtml.rnc} would be used.  There is
 578 no precedence between different types of rule; the first matching rule
 579 of any type is used.
 580
 581 As usual with XML-related technologies, resources are identified
 582 by URIs.  The @samp{uri} attribute identifies the schema by
 583 specifying the URI.  The URI may be relative.  If so, it is resolved
 584 relative to the URI of the schema locating file that contains the
 585 attribute. This means that if the value of the @samp{uri} attribute
 586 does not contain a @samp{/}, then it will refer to a filename in
 587 the same directory as the schema locating file.
 588
 589 @node Using the document's URI to locate a schema
 590 @subsection Using the document's URI to locate a schema
 591
 592 A @samp{uri} rule locates a schema based on the URI of the
 593 document.  The @samp{uri} attribute specifies the URI of the
 594 schema.  The @samp{resource} attribute can be used to specify
 595 the schema for a particular document.  For example,
 596
 597 @example
 598 <uri resource="spec.xml" uri="docbook.rnc"/>
 599 @end example
 600
 601 @noindent
 602 specifies that the schema for @samp{spec.xml} is
 603 @samp{docbook.rnc}.
 604
 605 The @samp{pattern} attribute can be used instead of the
 606 @samp{resource} attribute to specify the schema for any document
 607 whose URI matches a pattern.  The pattern has the same syntax as an
 608 absolute or relative URI except that the path component of the URI can
 609 use a @samp{*} character to stand for zero or more characters
 610 within a path segment (i.e. any character other than @samp{/}).
 611 Typically, the URI pattern looks like a relative URI, but, whereas a
 612 relative URI in the @samp{resource} attribute is resolved into a
 613 particular absolute URI using the base URI of the schema locating
 614 file, a relative URI pattern matches if it matches some number of
 615 complete path segments of the document's URI ending with the last path
 616 segment of the document's URI. For example,
 617
 618 @example
 619 <uri pattern="*.xsl" uri="xslt.rnc"/>
 620 @end example
 621
 622 @noindent
 623 specifies that the schema for documents with a URI whose path ends
 624 with @samp{.xsl} is @samp{xslt.rnc}.
 625
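Because the first matching rule is used, a @samp{resource} rule for one
particular document can be placed before a more general
@samp{pattern} rule.  The following complete file is a minimal sketch
of this; @samp{generic.rnc} is a hypothetical schema name used only
for illustration.

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
  <uri pattern="*.xsl" uri="xslt.rnc"/>
  <uri pattern="*.xml" uri="generic.rnc"/>
</locatingRules>
@end example

@noindent
With these rules, @samp{spec.xml} in the same directory as the schema
locating file would be expected to use @samp{docbook.rnc}, while other
documents whose URI ends with @samp{.xml} would fall through to the
last rule.
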
 626 A @samp{transformURI} rule locates a schema by
 627 transforming the URI of the document. The @samp{fromPattern}
 628 attribute specifies a URI pattern with the same meaning as the
 629 @samp{pattern} attribute of the @samp{uri} element.  The
 630 @samp{toPattern} attribute is a URI pattern that is used to
 631 generate the URI of the schema.  Each @samp{*} in the
 632 @samp{toPattern} is replaced by the string that matched the
 633 corresponding @samp{*} in the @samp{fromPattern}.  The
 634 resulting string is appended to the initial part of the document's URI
 635 that was not explicitly matched by the @samp{fromPattern}.  The
 636 rule matches only if the transformed URI identifies an existing
 637 resource.  For example, the rule
 638
 639 @example
 640 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 641 @end example
 642
 643 @noindent
 644 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
 645 into the URI @samp{file:///home/jjc/docs/spec.rnc}.  Thus, this
 646 rule specifies that to locate a schema for a document
 647 @samp{@var{foo}.xml}, Emacs should test whether a file
 648 @samp{@var{foo}.rnc} exists in the same directory as
 649 @samp{@var{foo}.xml}, and, if so, should use it as the
 650 schema.
 651
 652 @node Using the document element to locate a schema
 653 @subsection Using the document element to locate a schema
 654
 655 A @samp{documentElement} rule locates a schema based on
 656 the local name and prefix of the document element. For example, a rule
 657
 658 @example
 659 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
 660 @end example
 661
 662 @noindent
 663 specifies that when the name of the document element is
 664 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
 665 as the schema. Either the @samp{prefix} or
 666 @samp{localName} attribute may be omitted to allow any prefix or
 667 local name.
 668
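For instance, the two rules below sketch each kind of omission.  The
first would match a document element with the local name
@samp{stylesheet} whatever its prefix; the second would match any
document element with the prefix @samp{xsl}.

@example
<documentElement localName="stylesheet" uri="xslt.rnc"/>
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example
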
 669 A @samp{namespace} rule locates a schema based on the
 670 namespace URI of the document element. For example, a rule
 671
 672 @example
 673 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
 674 @end example
 675
 676 @noindent
 677 specifies that when the namespace URI of the document element is
 678 @samp{http://www.w3.org/1999/XSL/Transform}, then
 679 @samp{xslt.rnc} should be used as the schema.
 680
 681 @node Using type identifiers in schema locating files
 682 @subsection Using type identifiers in schema locating files
 683
 684 Type identifiers allow a level of indirection in locating the
 685 schema for a document.  Instead of associating the document directly
 686 with a schema URI, the document is associated with a type identifier,
 687 which is in turn associated with a schema URI. nXML mode does not
 688 constrain the format of type identifiers.  They can be simply strings
 689 without any formal structure or they can be public identifiers or
 690 URIs.  Note that these type identifiers have nothing to do with the
 691 DOCTYPE declaration.  When comparing type identifiers, whitespace is
 692 normalized in the same way as with the @samp{xsd:token}
 693 datatype: leading and trailing whitespace is stripped; other sequences
 694 of whitespace are normalized to a single space character.
 695
 696 Each of the rules described in previous sections that uses a
 697 @samp{uri} attribute to specify a schema, can instead use a
 698 @samp{typeId} attribute to specify a type identifier.  The type
 699 identifier can be associated with a URI using a @samp{typeId}
 700 element. For example,
 701
 702 @example
 703 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 704   <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
 705   <typeId id="XHTML" typeId="XHTML Strict"/>
 706   <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
 707   <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
 708 </locatingRules>
 709 @end example
 710
 711 @noindent
 712 declares three type identifiers @samp{XHTML} (representing the
 713 default variant of XHTML to be used), @samp{XHTML Strict} and
 714 @samp{XHTML Transitional}.  Such a schema locating file would
 715 use @samp{xhtml-strict.rnc} for a document whose namespace is
 716 @samp{http://www.w3.org/1999/xhtml}.  But it is considerably
 717 more flexible than a schema locating file that simply specified
 718
 719 @example
 720 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
 721 @end example
 722
 723 @noindent
 724 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
 725 Strict and XHTML Transitional. Also, a user can easily add a catalog
 726
 727 @example
 728 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 729   <typeId id="XHTML" typeId="XHTML Transitional"/>
 730 </locatingRules>
 731 @end example
 732
 733 @noindent
 734 that makes the default variant of XHTML be XHTML Transitional.
 735
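The @samp{typeId} attribute is not limited to @samp{namespace} rules.
As a sketch, the following file uses it on a @samp{uri} rule and a
@samp{documentElement} rule; the type identifier @samp{DocBook} and
the @samp{*.xhtml} pattern are assumptions chosen for illustration.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri pattern="*.xhtml" typeId="XHTML"/>
  <documentElement localName="book" typeId="DocBook"/>
  <typeId id="XHTML" uri="xhtml.rnc"/>
  <typeId id="DocBook" uri="docbook.rnc"/>
</locatingRules>
@end example
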
 736 @node Using multiple schema locating files
 737 @subsection Using multiple schema locating files
 738
 739 The @samp{include} element includes rules from another
 740 schema locating file.  The behavior is exactly as if the rules from
 741 that file were included in place of the @samp{include} element.
 742 Relative URIs are resolved into absolute URIs before the inclusion is
 743 performed. For example,
 744
 745 @example
 746 <include rules="../rules.xml"/>
 747 @end example
 748
 749 @noindent
 750 includes the rules from @samp{rules.xml}.
 751
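Since included rules take the place of the @samp{include} element, its
position among the other rules matters.  In the following sketch, the
@samp{namespace} rule is tried before any rule from the hypothetical
file @samp{../rules.xml}:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <include rules="../rules.xml"/>
</locatingRules>
@end example
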
 752 The process of locating a schema takes as input a list of schema
 753 locating files.  The rules in all these files and in the files they
 754 include are resolved into a single list of rules, which are applied
 755 strictly in order.  Sometimes this order is not what is needed.
 756 For example, suppose you have two schema locating files, a private
 757 file
 758
 759 @example
 760 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 761   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 762 </locatingRules>
 763 @end example
 764
 765 @noindent
 766 followed by a public file
 767
 768 @example
 769 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 770   <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 771   <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
 772 </locatingRules>
 773 @end example
 774
 775 @noindent
 776 The effect of these two files is that the XHTML @samp{namespace}
 777 rule takes precedence over the @samp{transformURI} rule, which
 778 is almost certainly not what is needed.  This can be solved by adding
 779 an @samp{applyFollowingRules} element to the private file:
 780
 781 @example
 782 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 783   <applyFollowingRules ruleType="transformURI"/>
 784   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 785 </locatingRules>
 786 @end example
 787
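As far as the order of rules is concerned, the intended effect can be
pictured as the single list sketched below, in which the
@samp{transformURI} rule from the public file is tried before the
XHTML @samp{namespace} rule.  This is only a conceptual illustration,
not a file you would write: the @samp{transformURI} rule is shown with
the @samp{fromPattern} and @samp{toPattern} attributes described
earlier, and relative URIs would still be resolved against the file
each rule originally came from.

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example
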
 788 @node DTDs
 789 @chapter DTDs
 790
 791 nxml-mode is designed to support the creation of standalone XML
 792 documents that do not depend on a DTD.  Although it is common practice
 793 to insert a DOCTYPE declaration referencing an external DTD, this has
 794 undesirable side-effects.  It means that the document is no longer
 795 self-contained. It also means that different XML parsers may interpret
 796 the document in different ways, since the XML Recommendation does not
 797 require XML parsers to read the DTD.  With DTDs, it was impractical to
 798 get validation without using an external DTD or reference to a
 799 parameter entity.  With RELAX NG and other schema languages, you can
 800 simultaneously get the benefits of validation and standalone XML
 801 documents.  Therefore, I recommend that you do not reference an
 802 external DOCTYPE in your XML documents.
 803
 804 One problem is entities for characters. Typically, as well as
 805 providing validation, DTDs also provide a set of character entities
 806 for documents to use. Schemas cannot provide this functionality,
 807 because schema validation happens after XML parsing.  The recommended
 808 solution is to either use the Unicode characters directly, or, if this
 809 is impractical, use character references.  nXML mode supports this by
 810 providing commands for entering characters and character references
 811 using the Unicode names, and can display the glyph corresponding to a
 812 character reference.
 813
 814 @node Limitations
 815 @chapter Limitations
 816
 817 nXML mode has some limitations:
 818
 819 @itemize @bullet
 820 @item
 821 DTD support is limited.  Internal parsed general entities declared
 822 in the internal subset are supported provided they do not contain
 823 elements. Other usage of DTDs is ignored.
 824 @item
 825 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
 826 specification are not enforced.
 827 @item
 828 Unicode support has problems. This stems mostly from the fact that
 829 the XML (and RELAX NG) character model is based squarely on Unicode,
 830 whereas the Emacs character model is not.  Emacs 22 is slated to have
 831 full Unicode support, which should improve the situation here.
 832 @end itemize
 833
 834 @bye
 835
 836 @ignore
 837    arch-tag: 3b6e8ac2-ae8d-4f38-bd43-ce9f80be04d6
 838 @end ignore