doc/misc/nxml-mode.texi

   1 \input texinfo @c -*- texinfo -*-
   2 @c %**start of header
   3 @setfilename ../../info/nxml-mode
   4 @settitle nXML Mode
   5 @c %**end of header
   6
   7 @copying
   8 This manual documents nxml-mode, an Emacs major mode for editing
   9 XML with RELAX NG support.
  10
  11 Copyright @copyright{} 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
  12
  13 @quotation
  14 Permission is granted to copy, distribute and/or modify this document
  15 under the terms of the GNU Free Documentation License, Version 1.3 or
  16 any later version published by the Free Software Foundation; with no
  17 Invariant Sections, with the Front-Cover texts being ``A GNU
  18 Manual,'' and with the Back-Cover Texts as in (a) below.  A copy of the
  19 license is included in the section entitled ``GNU Free Documentation
  20 License'' in the Emacs manual.
  21
  22 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
  23 modify this GNU manual.  Buying copies from the FSF supports it in
  24 developing GNU and promoting software freedom.''
  25
  26 This document is part of a collection distributed under the GNU Free
  27 Documentation License.  If you want to distribute this document
  28 separately from the collection, you can do so by adding a copy of the
  29 license to the document, as described in section 6 of the license.
  30 @end quotation
  31 @end copying
  32
  33 @dircategory Emacs
  34 @direntry
  35 * nXML Mode: (nxml-mode).       XML editing mode with RELAX NG support.
  36 @end direntry
  37
  38 @node Top
  39 @top nXML Mode
  40
  41 @insertcopying
  42
  43 This manual is not yet complete.
  44
  45 @menu
  46 * Completion::
  47 * Inserting end-tags::
  48 * Paragraphs::
  49 * Outlining::
  50 * Locating a schema::
  51 * DTDs::
  52 * Limitations::
  53 @end menu
  54
  55 @node Completion
  56 @chapter Completion
  57
  58 Apart from real-time validation, the most important feature that
  59 nxml-mode provides for assisting in document creation is "completion".
  60 Completion assists the user in inserting characters at point, based on
  61 knowledge of the schema and on the contents of the buffer before
  62 point.
  63
  64 The traditional GNU Emacs key combination for completion in a
  65 buffer is @kbd{M-@key{TAB}}. However, many window systems
  66 and window managers use this key combination themselves (typically for
  67 switching between windows) and do not pass it to applications. It's
  68 hard to find key combinations in GNU Emacs that are both easy to type
  69 and not taken by something else.  @kbd{C-@key{RET}} (i.e.
  70 pressing the Enter or Return key, while the Ctrl key is held down) is
  71 available.  It won't be available on a traditional terminal (because
  72 it is indistinguishable from Return), but it will work with a window
  73 system.  Therefore we adopt the following solution by default: use
  74 @kbd{C-@key{RET}} when there's a window system and
  75 @kbd{M-@key{TAB}} when there's not.  In the following, I
  76 will assume that a window system is being used and will therefore
  77 refer to @kbd{C-@key{RET}}.
  78
  79 Completion works by examining the symbol preceding point.  This
  80 is the symbol to be completed. The symbol to be completed may be
  81 empty. Completion considers what symbols starting with the symbol to
  82 be completed would be valid replacements for the symbol to be
  83 completed, given the schema and the contents of the buffer before
  84 point.  These symbols are the possible completions.  An example may
  85 make this clearer.  Suppose the buffer looks like this (where @point{}
  86 indicates point):
  87
  88 @example
  89 <html xmlns="http://www.w3.org/1999/xhtml">
  90 <h@point{}
  91 @end example
  92
  93 @noindent
  94 and the schema is XHTML.  In this context, the symbol to be completed
  95 is @samp{h}.  The possible completions consist of just
  96 @samp{head}.  Another example is:
  97
  98 @example
  99 <html xmlns="http://www.w3.org/1999/xhtml">
 100 <head>
 101 <@point{}
 102 @end example
 103
 104 @noindent
 105 In this case, the symbol to be completed is empty, and the possible
 106 completions are @samp{base}, @samp{isindex},
 107 @samp{link}, @samp{meta}, @samp{script},
 108 @samp{style}, @samp{title}.  Another example is:
 109
 110 @example
 111 <html xmlns="@point{}
 112 @end example
 113
 114 @noindent
 115 In this case, the symbol to be completed is empty, and the possible
 116 completions are just @samp{http://www.w3.org/1999/xhtml}.
 117
 118 When you type @kbd{C-@key{RET}}, what happens depends
 119 on what the set of possible completions is.
 120
 121 @itemize @bullet
 122 @item
 123 If the set of completions is empty, nothing
 124 happens.
 125 @item
 126 If there is one possible completion, then that completion is
 127 inserted, together with any following characters that are
 128 required. For example, in this case:
 129
 130 @example
 131 <html xmlns="http://www.w3.org/1999/xhtml">
 132 <@point{}
 133 @end example
 134
 135 @noindent
 136 @kbd{C-@key{RET}} will yield
 137
 138 @example
 139 <html xmlns="http://www.w3.org/1999/xhtml">
 140 <head@point{}
 141 @end example
 142 @item
 143 If there is more than one possible completion, but all
 144 possible completions share a common non-empty prefix, then that prefix
 145 is inserted. For example, suppose the buffer is:
 146
 147 @example
 148 <html x@point{}
 149 @end example
 150
 151 @noindent
 152 The symbol to be completed is @samp{x}. The possible completions
 153 are @samp{xmlns} and @samp{xml:lang}.  These share a
 154 common prefix of @samp{xml}.  Thus, @kbd{C-@key{RET}}
 155 will yield:
 156
 157 @example
 158 <html xml@point{}
 159 @end example
 160
 161 @noindent
 162 Typically, you would do @kbd{C-@key{RET}} again, which would
 163 have the result described in the next item.
 164 @item
 165 If there is more than one possible completion, but the
 166 possible completions do not share a non-empty prefix, then Emacs will
 167 prompt you to input the symbol in the minibuffer, initializing the
 168 minibuffer with the symbol to be completed, and popping up a buffer
 169 showing the possible completions.  You can now input the symbol to be
 170 inserted.  The symbol you input will be inserted in the buffer instead
 171 of the symbol to be completed.  Emacs will then insert any required
 172 characters after the symbol.  For example, if the buffer contains:
 173
 174 @example
 175 <html xml@point{}
 176 @end example
 177
 178 @noindent
 179 Emacs will prompt you in the minibuffer with
 180
 181 @example
 182 Attribute: xml@point{}
 183 @end example
 184
 185 @noindent
 186 and the buffer showing possible completions will contain
 187
 188 @example
 189 Possible completions are:
 190 xml:lang                           xmlns
 191 @end example
 192
 193 @noindent
 194 If you input @kbd{xmlns}, the result will be:
 195
 196 @example
 197 <html xmlns="@point{}
 198 @end example
 199
 200 @noindent
 201 (If you do @kbd{C-@key{RET}} again, the namespace URI will
 202 be inserted. Should that happen automatically?)
 203 @end itemize
 204
 205 @node Inserting end-tags
 206 @chapter Inserting end-tags
 207
 208 The main redundancy in XML syntax is end-tags.  nxml-mode provides
 209 several ways to make it easier to enter end-tags.  You can use all of
 210 these without a schema.
 211
 212 You can use @kbd{C-@key{RET}} after @samp{</}
 213 to complete the rest of the end-tag.
 214
 215 @kbd{C-c C-f} inserts an end-tag for the element containing
 216 point. This command is useful when you want to input the start-tag,
 217 then input the content and finally input the end-tag. The @samp{f}
 218 is mnemonic for finish.
 219
 220 If you want to keep tags balanced and input the end-tag at the
 221 same time as the start-tag, before inputting the content, then you can
 222 use @kbd{C-c C-i}. This inserts a @samp{>}, then inserts
 223 the end-tag and leaves point before the end-tag.  @kbd{C-c C-b}
 224 is similar but more convenient for block-level elements: it puts the
 225 start-tag, point and the end-tag on successive lines, appropriately
 226 indented. The @samp{i} is mnemonic for inline and the
 227 @samp{b} is mnemonic for block.
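
As an illustration of these two commands, suppose the buffer contains
the following (the element names are only examples):

@example
<em@point{}
@end example

@noindent
Typing @kbd{C-c C-i} should then leave the buffer like this:

@example
<em>@point{}</em>
@end example

@noindent
Typing @kbd{C-c C-b} after @samp{<p} instead should give a result
along these lines; the exact indentation depends on the context and on
your indentation settings:

@example
<p>
  @point{}
</p>
@end example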
 228
 229 Finally, you can customize nxml-mode so that @kbd{/}
 230 automatically inserts the rest of the end-tag when it occurs after
 231 @samp{<}, by doing
 232
 233 @display
 234 @kbd{M-x customize-variable @key{RET} nxml-slash-auto-complete-flag @key{RET}}
 235 @end display
 236
 237 @noindent
 238 and then following the instructions in the displayed buffer.
 239
 240 @node Paragraphs
 241 @chapter Paragraphs
 242
 243 Emacs has several commands that operate on paragraphs, most
 244 notably @kbd{M-q}. nXML mode redefines these to work in a way
 245 that is useful for XML.  The exact rules that are used to find the
 246 beginning and end of a paragraph are complicated; they are designed
 247 mainly to ensure that @kbd{M-q} does the right thing.
 248
 249 A paragraph consists of one or more complete, consecutive lines.
 250 A group of lines is not considered a paragraph unless it contains some
 251 non-whitespace characters between tags or inside comments.  A blank
 252 line separates paragraphs.  A single tag on a line by itself also
 253 separates paragraphs.  More precisely, if one tag together with any
 254 leading and trailing whitespace completely occupy one or more lines,
 255 then those lines will not be included in any paragraph.
 256
 257 A start-tag at the beginning of the line (possibly indented) may
 258 be treated as starting a paragraph.  Similarly, an end-tag at the end
 259 of the line may be treated as ending a paragraph. The following rules
 260 are used to determine whether such a tag is in fact treated as a
 261 paragraph boundary:
 262
 263 @itemize @bullet
 264 @item
 265 If the schema does not allow text at that point, then it
 266 is a paragraph boundary.
 267 @item
 268 If the end-tag corresponding to the start-tag is not at
 269 the end of its line, or the start-tag corresponding to the end-tag is
 270 not at the beginning of its line, then it is not a paragraph
 271 boundary. For example, in
 272
 273 @example
 274 <p>This is a paragraph with an
 275 <emph>emphasized</emph> phrase.
 276 @end example
 277
 278 @noindent
 279 the @samp{<emph>} start-tag would not be considered as
 280 starting a paragraph, because its corresponding end-tag is not at the
 281 end of the line.
 282 @item
 283 If there is text that is a sibling in the element tree, then
 284 it is not a paragraph boundary.  For example, in
 285
 286 @example
 287 <p>This is a paragraph with an
 288 <emph>emphasized phrase that takes one source line</emph>
 289 @end example
 290
 291 @noindent
 292 the @samp{<emph>} start-tag would not be considered as
 293 starting a paragraph, even though its end-tag is at the end of its
 294 line, because the text @samp{This is a paragraph with an}
 295 is a sibling of the @samp{emph} element.
 296 @item
 297 Otherwise, it is a paragraph boundary.
 298 @end itemize
 299
 300 @node Outlining
 301 @chapter Outlining
 302
 303 nXML mode allows you to display all or part of a buffer as an
 304 outline, in a similar way to Emacs' outline mode.  An outline in nXML
 305 mode is based on recognizing two kinds of element: sections and
 306 headings.  There is one heading for every section and one section for
 307 every heading.  A section contains its heading as or within its first
 308 child element.  A section also contains its subordinate sections (its
 309 subsections).  The text content of a section consists of anything in a
 310 section that is neither a subsection nor a heading.
 311
 312 Note that this is a different model from that used by XHTML.
 313 nXML mode's outline support will not be useful for XHTML unless you
 314 adopt a convention of adding a @code{div} to enclose each
 315 section, rather than having sections implicitly delimited by different
 316 @code{h@var{n}} elements.  This limitation may be removed
 317 in a future version.
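
As a sketch of the convention just described, an XHTML body could be
structured as follows.  This assumes that
@code{nxml-section-element-name-regexp} matches @samp{div} and that
@code{nxml-heading-element-name-regexp} matches @samp{h1} and
@samp{h2}, which may require customizing those variables.  The text
content is only illustrative.

@example
<body>
  <div>
    <h1>Food</h1>
    <p>There are many kinds of food.</p>
    <div>
      <h2>Delicious Food</h2>
      <p>Some food is delicious.</p>
    </div>
  </div>
</body>
@end example

@noindent
Each @samp{div} start-tag is at the beginning of its line and has a
heading as its first child element, so each @samp{div} can be
recognized as a section.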
 318
 319 The variable @code{nxml-section-element-name-regexp} gives
 320 a regexp for the local names (i.e. the part of the name following any
 321 prefix) of section elements. The variable
 322 @code{nxml-heading-element-name-regexp} gives a regexp for the
 323 local names of heading elements. For an element to be recognized
 324 as a section
 325
 326 @itemize @bullet
 327 @item
 328 its start-tag must occur at the beginning of a line
 329 (possibly indented);
 330 @item
 331 its local name must match
 332 @code{nxml-section-element-name-regexp};
 333 @item
 334 either its first child element or a descendant of that
 335 first child element must have a local name that matches
 336 @code{nxml-heading-element-name-regexp}; the first such element
 337 is treated as the section's heading.
 338 @end itemize
 339
 340 @noindent
 341 You can customize these variables using @kbd{M-x
 342 customize-variable}.
 343
 344 There are three possible outline states for a section:
 345
 346 @itemize @bullet
 347 @item
 348 normal, showing everything, including its heading, text
 349 content and subsections; each subsection is displayed according to the
 350 state of that subsection;
 351 @item
 352 showing just its heading, with both its text content and
 353 its subsections hidden; all subsections are hidden regardless of their
 354 state;
 355 @item
 356 showing its heading and its subsections, with its text
 357 content hidden; each subsection is displayed according to the state of
 358 that subsection.
 359 @end itemize
 360
 361 In the last two states, where the text content is hidden, the
 362 heading is displayed specially, in an abbreviated form. An element
 363 like this:
 364
 365 @example
 366 <section>
 367 <title>Food</title>
 368 <para>There are many kinds of food.</para>
 369 </section>
 370 @end example
 371
 372 @noindent
 373 would be displayed on a single line like this:
 374
 375 @example
 376 <-section>Food...</>
 377 @end example
 378
 379 @noindent
 380 If there are hidden subsections, then a @code{+} will be used
 381 instead of a @code{-} like this:
 382
 383 @example
 384 <+section>Food...</>
 385 @end example
 386
 387 @noindent
 388 If there are non-hidden subsections, then the section will instead be
 389 displayed like this:
 390
 391 @example
 392 <-section>Food...
 393   <-section>Delicious Food...</>
 394   <-section>Distasteful Food...</>
 395 </-section>
 396 @end example
 397
 398 @noindent
 399 The heading is always displayed with an indent that corresponds to its
 400 depth in the outline, even if it is not actually indented in the buffer.
 401 The variable @code{nxml-outline-child-indent} controls how much
 402 a subheading is indented with respect to its parent heading when the
 403 heading is being displayed specially.
 404
 405 Commands to change the outline state of sections are bound to
 406 key sequences that start with @kbd{C-c C-o} (@kbd{o} is
 407 mnemonic for outline).  The third and final key has been chosen to be
 408 consistent with outline mode.  In the following descriptions,
 409 @dfn{current section} means the section containing point, or, more precisely,
 410 the innermost section containing the character immediately following
 411 point.
 412
 413 @itemize @bullet
 414 @item
 415 @kbd{C-c C-o C-a} shows all sections in the buffer
 416 normally.
 417 @item
 418 @kbd{C-c C-o C-t} hides the text content
 419 of all sections in the buffer.
 420 @item
 421 @kbd{C-c C-o C-c} hides the text content
 422 of the current section.
 423 @item
 424 @kbd{C-c C-o C-e} shows the text content
 425 of the current section.
 426 @item
 427 @kbd{C-c C-o C-d} hides the text content
 428 and subsections of the current section.
 429 @item
 430 @kbd{C-c C-o C-s} shows the current section
 431 and all its direct and indirect subsections normally.
 432 @item
 433 @kbd{C-c C-o C-k} shows the headings of the
 434 direct and indirect subsections of the current section.
 435 @item
 436 @kbd{C-c C-o C-l} hides the text content of the
 437 current section and of its direct and indirect
 438 subsections.
 439 @item
 440 @kbd{C-c C-o C-i} shows the headings of the
 441 direct subsections of the current section.
 442 @item
 443 @kbd{C-c C-o C-o} hides as much as possible without
 444 hiding the current section's text content; the headings of ancestor
 445 sections of the current section and their child sections will
 446 not be hidden.
 447 @end itemize
 448
 449 When a heading is displayed specially, you can use
 450 @key{RET} in that heading to show the text content of the section
 451 in the same way as @kbd{C-c C-o C-e}.
 452
 453 You can also use the mouse to change the outline state:
 454 @kbd{S-mouse-2} hides the text content of a section in the same
 455 way as @kbd{C-c C-o C-c}; @kbd{mouse-2} on a specially
 456 displayed heading shows the text content of the section in the same
 457 way as @kbd{C-c C-o C-e}; @kbd{mouse-1} on a specially
 458 displayed start-tag toggles the display of subheadings on and
 459 off.
 460
 461 The outline state for each section is stored with the first
 462 character of the section (as a text property). Every command that
 463 changes the outline state of any section updates the display of the
 464 buffer so that each section is displayed correctly according to its
 465 outline state.  If the section structure is subsequently changed, then
 466 it is possible for the display to no longer correctly reflect the
 467 stored outline state. @kbd{C-c C-o C-r} can be used to refresh
 468 the display so it is correct again.
 469
 470 @node Locating a schema
 471 @chapter Locating a schema
 472
 473 nXML mode has a configurable set of rules to locate a schema for
 474 the file being edited.  The rules are contained in one or more schema
 475 locating files, which are XML documents.
 476
 477 The variable @samp{rng-schema-locating-files} specifies
 478 the list of the file-names of schema locating files that nXML mode
 479 should use.  The order of the list is significant: when file
 480 @var{x} occurs in the list before file @var{y} then rules
 481 from file @var{x} have precedence over rules from file
 482 @var{y}.  A filename specified in
 483 @samp{rng-schema-locating-files} may be relative. If so, it will
 484 be resolved relative to the document for which a schema is being
 485 located. It is not an error if relative file-names in
 486 @samp{rng-schema-locating-files} do not exist. You can use
 487 @kbd{M-x customize-variable @key{RET} rng-schema-locating-files
 488 @key{RET}} to customize the list of schema locating
 489 files.
 490
 491 By default, the @samp{rng-schema-locating-files} list has two
 492 members: @samp{schemas.xml}, and
 493 @samp{@var{dist-dir}/schema/schemas.xml} where
 494 @samp{@var{dist-dir}} is the directory containing the nXML
 495 distribution. The first member will cause nXML mode to use a file
 496 @samp{schemas.xml} in the same directory as the document being
 497 edited if such a file exists.  The second member contains rules for the
 498 schemas that are included with the nXML distribution.
 499
 500 @menu
 501 * Commands for locating a schema::
 502 * Schema locating files::
 503 @end menu
 504
 505 @node Commands for locating a schema
 506 @section Commands for locating a schema
 507
 508 The command @kbd{C-c C-s C-w} will tell you what schema
 509 is currently being used.
 510
 511 The rules for locating a schema are applied automatically when
 512 you visit a file in nXML mode. However, if you have just created a new
 513 file and the schema cannot be inferred from the file-name, then this
 514 will not locate the right schema.  In this case, you should insert the
 515 start-tag of the root element and then use the command @kbd{C-c C-s
 516 C-a}, which reapplies the rules based on the current content of
 517 the document.  It is usually not necessary to insert the complete
 518 start-tag; often just @samp{<@var{name}} is
 519 enough.
 520
 521 If you want to use a schema that has not yet been added to the
 522 schema locating files, you can use the command @kbd{C-c C-s C-f}
 523 to manually select the file containing the schema for the document in
 524 the current buffer.  Emacs will read the file-name of the schema from the
 525 minibuffer. After reading the file-name, Emacs will ask whether you
 526 wish to add a rule to a schema locating file that persistently
 527 associates the document with the selected schema.  The rule will be
 528 added to the first file in the list specified by
 529 @samp{rng-schema-locating-files}; it will create the file if
 530 necessary, but will not create a directory. If the variable
 531 @samp{rng-schema-locating-files} has not been customized, this
 532 means that the rule will be added to the file @samp{schemas.xml}
 533 in the same directory as the document being edited.
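
For illustration, after you choose a schema for a document in this
way and accept the offer to save the association, the
@samp{schemas.xml} file might contain a rule roughly like the one
below.  The file names @samp{spec.xml} and @samp{docbook.rnc} are
hypothetical, and the exact form of the rule that Emacs writes may
differ.

@example
<?xml version="1.0"?>
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <uri resource="spec.xml" uri="docbook.rnc"/>
</locatingRules>
@end example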
 534
 535 The command @kbd{C-c C-s C-t} allows you to select a schema by
 536 specifying an identifier for the type of the document.  The schema
 537 locating files determine the available type identifiers and what
 538 schema is used for each type identifier. This is useful when it is
 539 impossible to infer the right schema from either the file-name or the
 540 content of the document, even though the schema is already in the
 541 schema locating file.  A situation in which this can occur is when
 542 there are multiple variants of a schema where all valid documents have
 543 the same document element.  For example, XHTML has Strict and
 544 Transitional variants.  In a situation like this, a schema locating file
 545 can define a type identifier for each variant. As with @kbd{C-c
 546 C-s C-f}, Emacs will ask whether you wish to add a rule to a schema
 547 locating file that persistently associates the document with the
 548 specified type identifier.
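
A saved association of this kind would plausibly take the form of a
rule that names a type identifier rather than a schema URI.  In this
sketch the document name @samp{index.html} is hypothetical, and the
type identifier is assumed to be defined elsewhere in the schema
locating files:

@example
<uri resource="index.html" typeId="XHTML Transitional"/>
@end example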
 549
 550 The command @kbd{C-c C-s C-l} adds a rule to a schema
 551 locating file that persistently associates the document with
 552 the schema that is currently being used.
 553
 554 @node Schema locating files
 555 @section Schema locating files
 556
 557 Each schema locating file specifies a list of rules.  The rules
 558 from each file are appended in order. To locate a schema, each rule is
 559 applied in turn until a rule matches.  The first matching rule is then
 560 used to determine the schema.
 561
 562 Schema locating files are designed to be useful for other
 563 applications that need to locate a schema for a document. In fact,
 564 there is nothing specific to locating schemas in the design; it could
 565 equally well be used for locating a stylesheet.
 566
 567 @menu
 568 * Schema locating file syntax basics::
 569 * Using the document's URI to locate a schema::
 570 * Using the document element to locate a schema::
 571 * Using type identifiers in schema locating files::
 572 * Using multiple schema locating files::
 573 @end menu
 574
 575 @node Schema locating file syntax basics
 576 @subsection Schema locating file syntax basics
 577
 578 There is a schema for schema locating files in the file
 579 @samp{locate.rnc} in the schema directory.  Schema locating
 580 files must be valid with respect to this schema.
 581
 582 The document element of a schema locating file must be
 583 @samp{locatingRules} and the namespace URI must be
 584 @samp{http://thaiopensource.com/ns/locating-rules/1.0}.  The
 585 children of the document element specify rules. The order of the
 586 children is the same as the order of the rules.  Here's a complete
 587 example of a schema locating file:
 588
 589 @example
 590 <?xml version="1.0"?>
 591 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 592   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 593   <documentElement localName="book" uri="docbook.rnc"/>
 594 </locatingRules>
 595 @end example
 596
 597 @noindent
 598 This says to use the schema @samp{xhtml.rnc} for a document whose
 599 document element has the namespace URI @samp{http://www.w3.org/1999/xhtml},
 600 and to use the schema @samp{docbook.rnc} for a document whose document
 601 element has the local name @samp{book}.  If the document element had both
 602 a namespace URI of @samp{http://www.w3.org/1999/xhtml} and a local name of
 603 @samp{book}, then the matching rule that comes first would be used,
 604 and so the schema @samp{xhtml.rnc} would be used.  There is
 605 no precedence between different types of rule; the first matching rule
 606 of any type is used.
 607
 608 As usual with XML-related technologies, resources are identified
 609 by URIs.  The @samp{uri} attribute identifies the schema by
 610 specifying the URI.  The URI may be relative.  If so, it is resolved
 611 relative to the URI of the schema locating file that contains the
 612 attribute. This means that if the value of the @samp{uri} attribute
 613 does not contain a @samp{/}, then it will refer to a filename in
 614 the same directory as the schema locating file.
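
As a small illustration of relative URI resolution, suppose the
following schema locating file is stored at the hypothetical location
@samp{file:///home/jjc/docs/schemas.xml}:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- No slash: a file next to this schema locating file -->
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- Relative path: resolved against this file's own URI -->
  <documentElement localName="book" uri="../schema/docbook.rnc"/>
</locatingRules>
@end example

@noindent
The first rule then refers to
@samp{file:///home/jjc/docs/xhtml.rnc} and the second to
@samp{file:///home/jjc/schema/docbook.rnc}, whatever the location of
the document being edited.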
 615
 616 @node Using the document's URI to locate a schema
 617 @subsection Using the document's URI to locate a schema
 618
 619 A @samp{uri} rule locates a schema based on the URI of the
 620 document.  The @samp{uri} attribute specifies the URI of the
 621 schema.  The @samp{resource} attribute can be used to specify
 622 the schema for a particular document.  For example,
 623
 624 @example
 625 <uri resource="spec.xml" uri="docbook.rnc"/>
 626 @end example
 627
 628 @noindent
 629 specifies that the schema for @samp{spec.xml} is
 630 @samp{docbook.rnc}.
 631
 632 The @samp{pattern} attribute can be used instead of the
 633 @samp{resource} attribute to specify the schema for any document
 634 whose URI matches a pattern.  The pattern has the same syntax as an
 635 absolute or relative URI except that the path component of the URI can
 636 use a @samp{*} character to stand for zero or more characters
 637 within a path segment (i.e. any character other than @samp{/}).
 638 Typically, the URI pattern looks like a relative URI, but, whereas a
 639 relative URI in the @samp{resource} attribute is resolved into a
 640 particular absolute URI using the base URI of the schema locating
 641 file, a relative URI pattern matches if it matches some number of
 642 complete path segments of the document's URI ending with the last path
 643 segment of the document's URI. For example,
 644
 645 @example
 646 <uri pattern="*.xsl" uri="xslt.rnc"/>
 647 @end example
 648
 649 @noindent
 650 specifies that the schema for documents with a URI whose path ends
 651 with @samp{.xsl} is @samp{xslt.rnc}.
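
A relative pattern may also span more than one path segment.  As a
sketch, with hypothetical directory and file names, the rule

@example
<uri pattern="chapters/*.xml" uri="docbook.rnc"/>
@end example

@noindent
should, following the matching rule described above, apply to a
document such as @samp{file:///home/jjc/book/chapters/intro.xml}, but
not to @samp{file:///home/jjc/book/intro.xml}, because the last two
path segments of the latter do not match the pattern.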
 652
 653 A @samp{transformURI} rule locates a schema by
 654 transforming the URI of the document. The @samp{fromPattern}
 655 attribute specifies a URI pattern with the same meaning as the
 656 @samp{pattern} attribute of the @samp{uri} element.  The
 657 @samp{toPattern} attribute is a URI pattern that is used to
 658 generate the URI of the schema.  Each @samp{*} in the
 659 @samp{toPattern} is replaced by the string that matched the
 660 corresponding @samp{*} in the @samp{fromPattern}.  The
 661 resulting string is appended to the initial part of the document's URI
 662 that was not explicitly matched by the @samp{fromPattern}.  The
 663 rule matches only if the transformed URI identifies an existing
 664 resource.  For example, the rule
 665
 666 @example
 667 <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 668 @end example
 669
 670 @noindent
 671 would transform the URI @samp{file:///home/jjc/docs/spec.xml}
 672 into the URI @samp{file:///home/jjc/docs/spec.rnc}.  Thus, this
 673 rule specifies that to locate a schema for a document
 674 @samp{@var{foo}.xml}, Emacs should test whether a file
 675 @samp{@var{foo}.rnc} exists in the same directory as
 676 @samp{@var{foo}.xml}, and, if so, should use it as the
 677 schema.
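
Reading the description above literally, the @samp{toPattern} may
also add a path segment.  The following sketch, in which the
@samp{schema} directory name is an assumption made for illustration,
would look for the schema in a subdirectory of the directory
containing the document:

@example
<transformURI fromPattern="*.xml" toPattern="schema/*.rnc"/>
@end example

@noindent
Under that reading, the URI @samp{file:///home/jjc/docs/spec.xml}
would be transformed into
@samp{file:///home/jjc/docs/schema/spec.rnc}, and the rule would match
only if that file exists.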
 678
 679 @node Using the document element to locate a schema
 680 @subsection Using the document element to locate a schema
 681
 682 A @samp{documentElement} rule locates a schema based on
 683 the local name and prefix of the document element. For example, a rule
 684
 685 @example
 686 <documentElement prefix="xsl" localName="stylesheet" uri="xslt.rnc"/>
 687 @end example
 688
 689 @noindent
 690 specifies that when the name of the document element is
 691 @samp{xsl:stylesheet}, then @samp{xslt.rnc} should be used
 692 as the schema. Either the @samp{prefix} or
 693 @samp{localName} attribute may be omitted to allow any prefix or
 694 local name.
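
For instance, the following two rules each omit one of the
attributes; the schema file names are only examples:

@example
<documentElement localName="article" uri="docbook.rnc"/>
<documentElement prefix="xsl" uri="xslt.rnc"/>
@end example

@noindent
The first rule applies to a document element with the local name
@samp{article} and any prefix; the second applies to a document
element with the prefix @samp{xsl} and any local name.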
 695
 696 A @samp{namespace} rule locates a schema based on the
 697 namespace URI of the document element. For example, a rule
 698
 699 @example
 700 <namespace ns="http://www.w3.org/1999/XSL/Transform" uri="xslt.rnc"/>
 701 @end example
 702
 703 @noindent
 704 specifies that when the namespace URI of the document element is
 705 @samp{http://www.w3.org/1999/XSL/Transform}, then
 706 @samp{xslt.rnc} should be used as the schema.
 707
 708 @node Using type identifiers in schema locating files
 709 @subsection Using type identifiers in schema locating files
 710
 711 Type identifiers allow a level of indirection in locating the
 712 schema for a document.  Instead of associating the document directly
 713 with a schema URI, the document is associated with a type identifier,
 714 which is in turn associated with a schema URI. nXML mode does not
 715 constrain the format of type identifiers.  They can be simply strings
 716 without any formal structure or they can be public identifiers or
 717 URIs.  Note that these type identifiers have nothing to do with the
 718 DOCTYPE declaration.  When comparing type identifiers, whitespace is
 719 normalized in the same way as with the @samp{xsd:token}
 720 datatype: leading and trailing whitespace is stripped; other sequences
 721 of whitespace are normalized to a single space character.
 722
 723 Each of the rules described in previous sections that uses a
 724 @samp{uri} attribute to specify a schema, can instead use a
 725 @samp{typeId} attribute to specify a type identifier.  The type
 726 identifier can be associated with a URI using a @samp{typeId}
 727 element. For example,
 728
 729 @example
 730 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 731   <namespace ns="http://www.w3.org/1999/xhtml" typeId="XHTML"/>
 732   <typeId id="XHTML" typeId="XHTML Strict"/>
 733   <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
 734   <typeId id="XHTML Transitional" uri="xhtml-transitional.rnc"/>
 735 </locatingRules>
 736 @end example
 737
 738 @noindent
 739 declares three type identifiers: @samp{XHTML} (representing the
 740 default variant of XHTML to be used), @samp{XHTML Strict} and
 741 @samp{XHTML Transitional}.  Such a schema locating file would
 742 use @samp{xhtml-strict.rnc} for a document whose namespace is
 743 @samp{http://www.w3.org/1999/xhtml}.  But it is considerably
 744 more flexible than a schema locating file that simply specified
 745
 746 @example
 747 <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml-strict.rnc"/>
 748 @end example
 749
 750 @noindent
 751 A user can easily use @kbd{C-c C-s C-t} to select between XHTML
 752 Strict and XHTML Transitional.  A user can also add a schema locating file
 753
 754 @example
 755 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 756   <typeId id="XHTML" typeId="XHTML Transitional"/>
 757 </locatingRules>
 758 @end example
 759
 760 @noindent
 761 that makes the default variant of XHTML be XHTML Transitional.
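
Because type identifiers are compared after whitespace normalization,
the two spellings of the identifier in the following sketch should be
treated as the same identifier:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <namespace ns="http://www.w3.org/1999/xhtml" typeId="  XHTML   Strict "/>
  <typeId id="XHTML Strict" uri="xhtml-strict.rnc"/>
</locatingRules>
@end example

@noindent
The leading and trailing spaces are stripped, and the run of spaces
between the two words is reduced to a single space.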
 762
 763 @node Using multiple schema locating files
 764 @subsection Using multiple schema locating files
 765
 766 The @samp{include} element includes rules from another
 767 schema locating file.  The behavior is exactly as if the rules from
 768 that file were included in place of the @samp{include} element.
 769 Relative URIs are resolved into absolute URIs before the inclusion is
 770 performed. For example,
 771
 772 @example
 773 <include rules="../rules.xml"/>
 774 @end example
 775
 776 @noindent
 777 includes the rules from @samp{rules.xml}.
 778
 779 The process of locating a schema takes as input a list of schema
 780 locating files.  The rules in all these files and in the files they
 781 include are resolved into a single list of rules, which are applied
 782 strictly in order.  Sometimes this order is not what is needed.
 783 For example, suppose you have two schema locating files, a private
 784 file
 785
 786 @example
 787 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 788   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 789 </locatingRules>
 790 @end example
 791
 792 @noindent
 793 followed by a public file
 794
 795 @example
 796 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 797   <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
 798   <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
 799 </locatingRules>
 800 @end example
 801
 802 @noindent
 803 The effect of these two files is that the XHTML @samp{namespace}
 804 rule takes precedence over the @samp{transformURI} rule, which
 805 is almost certainly not what is needed.  This can be solved by adding
 806 an @samp{applyFollowingRules} to the private file.
 807
 808 @example
 809 <locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
 810   <applyFollowingRules ruleType="transformURI"/>
 811   <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
 812 </locatingRules>
 813 @end example
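
Assuming that @samp{applyFollowingRules} behaves as its name suggests,
that is, it applies at that position the rules of the given type from
the schema locating files that follow, the private and public files
together would then behave roughly like this single list of rules.
This is an illustrative sketch of the resulting order, not a file you
need to write:

@example
<locatingRules xmlns="http://thaiopensource.com/ns/locating-rules/1.0">
  <!-- pulled forward from the public file -->
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <!-- remaining rule from the private file -->
  <namespace ns="http://www.w3.org/1999/xhtml" uri="xhtml.rnc"/>
  <!-- rules from the public file, in their original order -->
  <transformURI fromPattern="*.xml" toPattern="*.rnc"/>
  <namespace ns="http://www.w3.org/1999/XSL/Transform" typeId="XSLT"/>
</locatingRules>
@end example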
 814
 815 @node DTDs
 816 @chapter DTDs
 817
 818 nxml-mode is designed to support the creation of standalone XML
 819 documents that do not depend on a DTD.  Although it is common practice
 820 to insert a DOCTYPE declaration referencing an external DTD, this has
 821 undesirable side-effects.  It means that the document is no longer
 822 self-contained. It also means that different XML parsers may interpret
 823 the document in different ways, since the XML Recommendation does not
 824 require XML parsers to read the DTD.  With DTDs, it was impractical to
 825 get validation without using an external DTD or reference to a
 826 parameter entity.  With RELAX NG and other schema languages, you can
 827 simultaneously get the benefits of validation and standalone XML
 828 documents.  Therefore, I recommend that you do not reference an
 829 external DOCTYPE in your XML documents.
 830
 831 One problem is entities for characters. Typically, as well as
 832 providing validation, DTDs also provide a set of character entities
 833 for documents to use. Schemas cannot provide this functionality,
 834 because schema validation happens after XML parsing.  The recommended
 835 solution is to either use the Unicode characters directly, or, if this
 836 is impractical, use character references.  nXML mode supports this by
 837 providing commands for entering characters and character references
 838 using the Unicode names, and can display the glyph corresponding to a
 839 character reference.
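
For example, a document that relies on a DTD might use an entity
reference such as @samp{&eacute;}, which is defined only if the DTD is
read.  A standalone document can use a character reference instead; in
this sketch the element name is only an example:

@example
<p>caf&#xE9;</p>
@end example

@noindent
Here @samp{&#xE9;} refers to the Unicode character U+00E9 (LATIN SMALL
LETTER E WITH ACUTE), and any XML parser can interpret it without
reading a DTD.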
 840
 841 @node Limitations
 842 @chapter Limitations
 843
 844 nXML mode has some limitations:
 845
 846 @itemize @bullet
 847 @item
 848 DTD support is limited.  Internal parsed general entities declared
 849 in the internal subset are supported provided they do not contain
 850 elements. Other usage of DTDs is ignored.
 851 @item
 852 The restrictions on RELAX NG schemas in section 7 of the RELAX NG
 853 specification are not enforced.
 854 @item
 855 Unicode support has problems. This stems mostly from the fact that
 856 the XML (and RELAX NG) character model is based squarely on Unicode,
 857 whereas the Emacs character model is not.  Emacs 22 is slated to have
 858 full Unicode support, which should improve the situation here.
 859 @end itemize
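
As a sketch of the kind of DTD usage covered by the first item above,
the following document declares an internal parsed general entity in
the internal subset.  The entity contains only text, no elements; the
element and entity names are hypothetical.

@example
<?xml version="1.0"?>
<!DOCTYPE doc [
  <!ENTITY product "nXML mode">
]>
<doc>This manual describes &product;.</doc>
@end example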
 860
 861 @bye
 862
 863 @ignore
 864    arch-tag: 3b6e8ac2-ae8d-4f38-bd43-ce9f80be04d6
 865 @end ignore