\input texinfo
@setfilename ../../info/url
@settitle URL Programmer's Manual

@iftex
@c @finalout
@end iftex
@c @setchapternewpage odd
@c @smallbook

@tex
\overfullrule=0pt
%\global\baselineskip 30pt % for printing in double space
@end tex
@dircategory Emacs lisp libraries
@direntry
* URL: (url). URL loading package.
@end direntry

@copying
This file documents the Emacs Lisp URL loading package.

Copyright @copyright{} 1993-1999, 2002, 2004-2011 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''
@end quotation
@end copying

@c
@titlepage
@title URL Programmer's Manual
@subtitle First Edition, URL Version 2.0
@author William M. Perry @email{wmperry@@gnu.org}
@author David Love @email{fx@@gnu.org}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@node Top
@top URL

@ifnottex
@insertcopying
@end ifnottex

@menu
* Getting Started:: Preparing your program to use URLs.
* Retrieving URLs:: How to use this package to retrieve a URL.
* Supported URL Types:: Descriptions of URL types currently supported.
* Defining New URLs:: How to define a URL loader for a new protocol.
* General Facilities:: URLs can be cached, accessed via a gateway
and tracked in a history list.
* Customization:: Variables you can alter.
* GNU Free Documentation License:: The license for this documentation.
* Function Index::
* Variable Index::
* Concept Index::
@end menu

@node Getting Started
@chapter Getting Started
@cindex URLs, definition
@cindex URIs

@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
agents.

URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
@var{scheme}s supported by this library are described below.
@xref{Supported URL Types}.

FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
IRC and gopher URLs all have the form

@example
@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
@end example
@noindent
where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
@var{userinfo} sometimes takes the form @var{username}:@var{password}
but you should beware of the security risks of sending cleartext
passwords. @var{hostname} may be a domain name or a dotted decimal
address. If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs. With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have
undesired consequences if a different service is listening on that
port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
sent). @c , but @xref{Other Variables, url-bad-port-list}.
The meaning of the @var{path} component depends on the service.

@menu
* Configuration::
* Parsed URLs:: URLs are parsed into vector structures.
@end menu

@node Configuration
@section Configuration

@defvar url-configuration-directory
@cindex @file{~/.url}
@cindex configuration files
The directory in which URL configuration files, the cache, etc.,
reside. The default is @file{~/.url}.
@end defvar

@node Parsed URLs
@section Parsed URLs
@cindex parsed URLs
The library functions typically operate on @dfn{parsed} versions of
URLs. These are actually vectors of the form:

@example
[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
@end example

@noindent where
@table @var
@item type
is the type of the URL scheme, e.g., @code{http};
@item user
is the username associated with it, or @code{nil};
@item password
is the user password associated with it, or @code{nil};
@item host
is the host name associated with it, or @code{nil};
@item port
is the port number associated with it, or @code{nil};
@item file
is the `file' part of it, or @code{nil}. This doesn't necessarily
actually refer to a file;
@item target
is the target part, or @code{nil};
@item attributes
is the attributes associated with it, or @code{nil};
@item full
is @code{t} for a fully-specified URL, with a host part indicated by
@samp{//} after the scheme part.
@end table

@findex url-type
@findex url-user
@findex url-password
@findex url-host
@findex url-port
@findex url-file
@findex url-target
@findex url-attributes
@findex url-full
@findex url-set-type
@findex url-set-user
@findex url-set-password
@findex url-set-host
@findex url-set-port
@findex url-set-file
@findex url-set-target
@findex url-set-attributes
@findex url-set-full
These attributes have accessors named @code{url-@var{part}}, where
@var{part} is the name of one of the elements above, e.g.,
@code{url-host}. Similarly, there are setters of the form
@code{url-set-@var{part}}.

There are functions for parsing and unparsing between the string and
vector forms.

@defun url-generic-parse-url url
Return a parsed version of the string @var{url}.
@end defun

@defun url-recreate-url url
@cindex unparsing URLs
Recreates a URL string from the parsed @var{url}.
@end defun
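
As a rough sketch of how the parsing functions and accessors fit
together, the following parses a URL string, reads back a few of its
parts, and rebuilds the string. The host @samp{www.example.com} is
only a placeholder, and the results shown assume the accessors behave
as described above.

@example
(require 'url-parse)

(let ((parsed (url-generic-parse-url
               "http://www.example.com:8080/foo/bar")))
  ;; Read individual parts through the accessors.
  (list (url-type parsed) (url-host parsed) (url-port parsed)))
;; @result{} ("http" "www.example.com" 8080)

;; Round trip: unparse the parsed form back into a string.
(url-recreate-url
 (url-generic-parse-url "http://www.example.com:8080/foo/bar"))
;; @result{} "http://www.example.com:8080/foo/bar"
@end example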

@node Retrieving URLs
@chapter Retrieving URLs

@defun url-retrieve-synchronously url
Retrieve @var{url} synchronously and return a buffer containing the
data. @var{url} is either a string or a parsed URL structure. Return
@code{nil} if there are no data associated with it (the case for dired,
info, or mailto URLs that need no further processing).
@end defun

@defun url-retrieve url callback &optional cbargs
Retrieve @var{url} asynchronously and call @var{callback} with args
@var{cbargs} when finished. The callback is called when the object
has been completely retrieved, with the current buffer containing the
object and any MIME headers associated with it. @var{url} is either a
string or a parsed URL structure. Returns the buffer @var{url} will
load into, or @code{nil} if the process has already completed.
@end defun
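
The two retrieval styles might be used along the following lines.
This is only a sketch: the URL is a placeholder, and the callback is
written to accept any arguments, since the exact arguments passed to
it may differ between versions of the library.

@example
(require 'url)

;; Synchronous: the call blocks and returns a buffer, or nil.
(let ((buf (url-retrieve-synchronously "http://www.example.com/")))
  (when buf
    (with-current-buffer buf
      (message "Retrieved %d characters" (buffer-size)))
    (kill-buffer buf)))

;; Asynchronous: the callback runs later, with the retrieved
;; object in the current buffer.
(url-retrieve "http://www.example.com/"
              (lambda (&rest _args)
                (message "Finished: %d characters in %s"
                         (buffer-size) (buffer-name))))
@end example

The synchronous form is simpler to reason about, but it blocks Emacs
until the transfer finishes, so the asynchronous form is usually the
better fit for interactive code.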

@node Supported URL Types
@chapter Supported URL Types

@menu
* http/https:: Hypertext Transfer Protocol.
* file/ftp:: Local files and FTP archives.
* info:: Emacs `Info' pages.
* mailto:: Sending email.
* news/nntp/snews:: Usenet news.
* rlogin/telnet/tn3270:: Remote host connectivity.
* irc:: Internet Relay Chat.
* data:: Embedded data URLs.
* nfs:: Networked File System.
@c * finger::
@c * gopher::
@c * netrek::
@c * prospero::
* cid:: Content-ID.
* about::
* ldap:: Lightweight Directory Access Protocol.
* imap:: IMAP mailboxes.
* man:: Unix man pages.
@end menu

@node http/https
@section @code{http} and @code{https}

The scheme @code{http} is Hypertext Transfer Protocol. The library
supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
defined in RFC 1945.) HTTP URLs have the following form, where most of
the parts are optional:
@example
http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
@end example
@c The @code{:@var{port}} part is optional, and @var{port} defaults to
@c 80. The @code{/@var{path}} part, if present, is a slash-separated
@c series elements. The @code{?@var{searchpart}}, if present, is the
@c query for a search or the content of a form submission. The
@c @code{#fragment} part, if present, is a location in the document.

The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL. It is defined in RFC 2818. Its default port is
443. This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used. @xref{Gateways in general}.

@defopt url-honor-refresh-requests
This controls honoring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one. @code{nil} means they will not be honored,
@code{t} (the default) means they will always be honored, and
otherwise the user will be asked on each request.
@end defopt


@menu
* Cookies::
* HTTP language/coding::
* HTTP URL Options::
* Dealing with HTTP documents::
@end menu

@node Cookies
@subsection Cookies

@defopt url-cookie-file
The file in which cookies are stored, defaulting to @file{cookies} in
the directory specified by @code{url-configuration-directory}.
@end defopt

@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
@end defopt

@defopt url-cookie-multiple-line
Specifies whether to put all cookies for the server on one line in the
HTTP request to satisfy broken servers like
@url{http://www.hotmail.com}.
@end defopt

@defopt url-cookie-trusted-urls
A list of regular expressions matching URLs from which to accept
cookies always.
@end defopt

@defopt url-cookie-untrusted-urls
A list of regular expressions matching URLs from which to reject
cookies always.
@end defopt

@defopt url-cookie-save-interval
The number of seconds between automatic saves of cookies to disk.
Default is one hour.
@end defopt
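
A cookie policy built from these options could look like the
following sketch. The domains are placeholders, and the regular
expressions are merely one plausible way of matching them.

@example
;; Ask before accepting cookies from sites not listed below.
(setq url-cookie-confirmation t)

;; Always accept cookies from one (hypothetical) site ...
(setq url-cookie-trusted-urls
      '("\\`https?://[^/]*\\.example\\.org/"))

;; ... and always reject them from another.
(setq url-cookie-untrusted-urls
      '("\\`https?://[^/]*\\.example\\.net/"))

;; Save cookies to disk every hour (the value is in seconds).
(setq url-cookie-save-interval 3600)
@end example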


@node HTTP language/coding
@subsection Language and Encoding Preferences

HTTP allows clients to express preferences for the language and
encoding of documents which servers may honor. For each of these
variables, the value is a string; it can specify a single choice, or
it can be a comma-separated list.

Normally, this list is ordered by descending preference. However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}. An element that has no @samp{;q} specification has
preference level 1.

@defopt url-mime-charset-string
@cindex character sets
@cindex coding systems
This variable specifies a preference for character sets when documents
can be served in more than one encoding.

HTTP allows specifying a series of MIME charsets which indicate your
preferred character set encodings, e.g., Latin-9 or Big5, and these
can be weighted. The default series is generated automatically from
the associated MIME types of all defined coding systems, sorted by the
coding system priority specified in Emacs. @xref{Recognize Coding, ,
Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
@end defopt

@defopt url-mime-language-string
@cindex language preferences
A string specifying the preferred language when servers can serve
files in several languages. Use RFC 1766 abbreviations, e.g.,
@samp{en} for English, @samp{de} for German.

The string can be @code{"*"} to get the first available language (as
opposed to the default).
@end defopt
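
For instance, the preferences might be set as below. The language
string is the one used as an illustration earlier in this section; the
charset string is an assumed example following the same
@samp{;q=@var{priority}} syntax.

@example
;; Prefer German, then British English, then any English.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")

;; Prefer UTF-8, but accept Latin-1 at a lower preference level.
(setq url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example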

@node HTTP URL Options
@subsection HTTP URL Options

HTTP supports an @samp{OPTIONS} method describing things supported by
the URL@.

@defun url-http-options url
Returns a property list describing options available for @var{url}. The
property list members are:

@table @code
@item methods
A list of symbols specifying what HTTP methods the resource
supports.

@item dav
@cindex DAV
A list of numbers specifying what DAV protocol/schema versions are
supported.

@item dasl
@cindex DASL
A list of supported DASL search types (string form).

@item ranges
A list of the units available for use in partial document fetches.

@item p3p
@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
Currently this is just the raw header contents.
@end table

@end defun
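
A minimal sketch of querying a resource and inspecting the result
follows. The URL is a placeholder, the call needs network access, and
which properties are present depends on what the server reports.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  ;; Pick individual members out of the property list.
  (message "Methods: %S, ranges: %S"
           (plist-get options 'methods)
           (plist-get options 'ranges)))
@end example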

@node Dealing with HTTP documents
@subsection Dealing with HTTP documents

HTTP URLs are retrieved into a buffer containing the HTTP headers
followed by the body. Since the headers are quasi-MIME, they may be
processed using the MIME library. @xref{Top,, Emacs MIME,
emacs-mime, The Emacs MIME Manual}.
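
When full MIME processing is not needed, the headers and body can be
separated by hand. The sketch below does not use the MIME library; it
simply assumes that the headers end at the first empty line of the
buffer, and the URL is a placeholder.

@example
(require 'url)

(let ((buf (url-retrieve-synchronously "http://www.example.com/")))
  (when buf
    (with-current-buffer buf
      (goto-char (point-min))
      ;; The first empty line separates headers from body.
      (when (re-search-forward "^\r?\n" nil t)
        (let ((headers (buffer-substring (point-min) (point)))
              (body (buffer-substring (point) (point-max))))
          (message "%d header characters, %d body characters"
                   (length headers) (length body)))))
    (kill-buffer buf)))
@end example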

@node file/ftp
@section file and ftp
@cindex files
@cindex FTP
@cindex File Transfer Protocol
@cindex compressed files
@cindex dired

@example
ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
@end example

These schemes are defined in RFC 1808.
@samp{ftp:} and @samp{file:} are synonymous in this library. They
allow reading arbitrary files from hosts. Either @samp{ange-ftp}
(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
hosts. Local files are accessed directly.

Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
@samp{.bz2}.

@defopt url-directory-index-file
The filename to look for when indexing a directory, default
@samp{"index.html"}. If this file exists, and is readable, then it
will be viewed instead of using @code{dired} to view the directory.
@end defopt

@node info
@section info
@cindex Info
@cindex Texinfo
@findex Info-goto-node

@example
info:@var{file}#@var{node}
@end example

Info URLs are not officially defined. They invoke
@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
@samp{#@var{node}} is optional, defaulting to @samp{Top}.

@node mailto
@section mailto

@cindex mailto
@cindex email
A mailto URL will send an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.

@defopt url-mail-command
@vindex mail-user-agent
The function called whenever url needs to send mail. This should
normally be left to default from @code{mail-user-agent}. @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
@end defopt

An @samp{X-Url-From} header field containing the URL of the document
that contained the mailto URL is added if that URL is known.

RFC 2368 extends the definition of mailto URLs in RFC 1738.
The form of a mailto URL is
@example
@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
@end example
@noindent where an arbitrary number of @var{header}s can be added. If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents. Note that the URL library does not consider any
headers `dangerous' so you should check them before sending the
message.

@c Fixme: update
Email messages are defined in RFC 822.
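
To illustrate the header syntax, a mailto URL with a subject and a
body could be assembled as follows. This is only a sketch: the
address is the placeholder used above, and it assumes
@code{url-hexify-string} from @file{url-util} is an acceptable way to
escape the header contents.

@example
(require 'url-util)

(concat "mailto:foo@@bar.com"
        "?subject=" (url-hexify-string "Hello world")
        "&body=" (url-hexify-string "First line"))
;; @result{} "mailto:foo@@bar.com?subject=Hello%20world&body=First%20line"
@end example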

@node news/nntp/snews
@section @code{news}, @code{nntp} and @code{snews}
@cindex news
@cindex network news
@cindex usenet
@cindex NNTP
@cindex snews

@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.

@table @samp
@item news:@var{newsgroup}
Retrieves a list of messages in @var{newsgroup};
@item news:@var{message-id}
Retrieves the message with the given @var{message-id};
@item news:*
Retrieves a list of all available newsgroups;
@item nntp://@var{host}:@var{port}/@var{newsgroup}
@itemx nntp://@var{host}:@var{port}/@var{message-id}
@itemx nntp://@var{host}:@var{port}/*
Similar to the @samp{news} versions.
@end table

@samp{:@var{port}} is optional and defaults to :119.

@samp{snews} is the same as @samp{nntp} except that the default port
is :563.
@cindex SSL
(It is tunneled through SSL.)

An @samp{nntp} URL is the same as a news URL, except that the URL may
specify an article by its number.

@defopt url-news-server
This variable can be used to override the default news server.
Usually this will be set by the Gnus package, which is used to fetch
news.
@cindex environment variable
@vindex NNTPSERVER
It may be set from the conventional environment variable
@code{NNTPSERVER}.
@end defopt

@node rlogin/telnet/tn3270
@section rlogin, telnet and tn3270
@cindex rlogin
@cindex telnet
@cindex tn3270
@cindex terminal emulation
@findex terminal-emulator

These URL schemes from RFC 1738 for logon via a terminal emulator have
the form
@example
telnet://@var{user}:@var{password}@@@var{host}:@var{port}
@end example
but the @code{:@var{password}} component is ignored.

To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
@code{telnet} or @code{tn3270} (the program names and arguments are
hardcoded) session is run in a @code{terminal-emulator} buffer.
Well-known ports are used if the URL does not specify a port.

@node irc
@section irc
@cindex IRC
@cindex Internet Relay Chat
@cindex ZEN IRC
@cindex ERC
@cindex rcirc
@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
session to a function named in @code{url-irc-function}.

@defopt url-irc-function
A function to actually open an IRC connection.
This function
must take five arguments, @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}. The @var{channel} argument specifies the
channel to join immediately; this can be @code{nil}. By default this is
@code{url-irc-rcirc}.
@end defopt
@defun url-irc-rcirc host port channel user password
Processes the arguments and lets @code{rcirc} handle the session.
@end defun
@defun url-irc-erc host port channel user password
Processes the arguments and lets @code{ERC} handle the session.
@end defun
@defun url-irc-zenirc host port channel user password
Processes the arguments and lets @code{zenirc} handle the session.
@end defun
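
Besides selecting one of the supplied functions, a program can
install its own handler, provided it accepts the five arguments
described above. In the sketch below, @code{my-url-irc-handler} is a
hypothetical name, and the handler only reports what it was asked to
do rather than opening a connection.

@example
(defun my-url-irc-handler (host port channel user password)
  "Hypothetical handler: report the requested IRC session."
  ;; PASSWORD is deliberately not echoed.
  (ignore password)
  (message "IRC session: %s:%s, channel %s, user %s"
           host port channel user))

;; Use the handler above for irc: URLs ...
(setq url-irc-function 'my-url-irc-handler)

;; ... or pick one of the supplied functions instead.
(setq url-irc-function 'url-irc-erc)
@end example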

@node data
@section data
@cindex data URLs

@example
data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
@end example

Data URLs contain MIME data in the URL itself. They are defined in
RFC 2397.

@var{media-type} is a MIME @samp{Content-Type} string, possibly
including parameters. It defaults to
@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
omitted but the charset parameter supplied. If @samp{;base64} is
present, the @var{data} are base64-encoded.
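
As a small worked example of the syntax, the following builds a
base64-encoded data URL for the text @samp{Hello} using the standard
@code{base64-encode-string} function.

@example
(concat "data:text/plain;base64," (base64-encode-string "Hello"))
;; @result{} "data:text/plain;base64,SGVsbG8="
@end example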

@node nfs
@section nfs
@cindex NFS
@cindex Network File System
@cindex automounter

@example
nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
@end example

The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
@samp{ftp:} except that it points to a file on a remote host that is
handled by the automounter on the local host.

@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter. Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
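
For illustration only, a specification for an automounter that is
assumed to expose remote hosts under @file{/net} could be written as
follows. The @file{/net} layout is an assumption about the local
system, not something the library requires.

@example
;; %h is replaced by the server name and %f by the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example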

@node cid
@section cid
@cindex Content-ID

RFC 2111

@node about
@section about

@node ldap
@section ldap
@cindex LDAP
@cindex Lightweight Directory Access Protocol

The LDAP scheme is defined in RFC 2255.

@node imap
@section imap
@cindex IMAP

RFC 2192

@node man
@section man
@cindex @command{man}
@cindex Unix man pages
@findex man

@example
@samp{man:@var{page-spec}}
@end example

This is a non-standard scheme. @var{page-spec} is passed directly to
the Lisp @code{man} function.

@node Defining New URLs
@chapter Defining New URLs

@menu
* Naming conventions::
* Required functions::
* Optional functions::
* Asynchronous fetching::
* Supporting file-name-handlers::
@end menu

@node Naming conventions
@section Naming conventions

@node Required functions
@section Required functions

@node Optional functions
@section Optional functions

@node Asynchronous fetching
@section Asynchronous fetching

@node Supporting file-name-handlers
@section Supporting file-name-handlers

@node General Facilities
@chapter General Facilities

@menu
* Disk Caching::
* Proxies::
* Gateways in general::
* History::
@end menu

@node Disk Caching
@section Disk Caching
@cindex Caching
@cindex Persistent Cache
@cindex Disk Cache

The disk cache stores retrieved documents locally, whence they can be
retrieved more quickly. When requesting a URL that is in the cache,
the library checks to see if the page has changed since it was last
retrieved from the remote machine. If not, the local copy is used,
saving the transmission over the network.
@cindex Cleaning the cache
@cindex Clearing the cache
@cindex Cache cleaning
Currently the cache isn't cleared automatically.
@c Running the @code{clean-cache} shell script
@c fist is recommended, to allow for future cleaning of the cache. This
@c shell script will remove all files that have not been accessed since it
@c was last run. To keep the cache pared down, it is recommended that this
@c script be run from @i{at} or @i{cron} (see the manual pages for
@c crontab(5) or at(1) for more information)

@defopt url-automatic-caching
Setting this variable non-@code{nil} causes documents to be cached
automatically.
@end defopt

@defopt url-cache-directory
This variable specifies the
directory to store the cache files. It defaults to the subdirectory
@file{cache} of @code{url-configuration-directory}.
@end defopt

@defopt url-cache-creation-function
The cache relies on a scheme for mapping URLs to files in the cache.
This variable names a function which sets the type of cache to use.
It takes a URL as argument and returns the absolute file name of the
corresponding cache file. The two supplied possibilities are
@code{url-cache-create-filename-using-md5} and
@code{url-cache-create-filename-human-readable}.
@end defopt

@defun url-cache-create-filename-using-md5 url
Creates a cache file name from @var{url} using MD5 hashing.
This creates entries with very few cache collisions and is fast.
@cindex MD5
@smallexample
(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
     @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
@end smallexample
@end defun

@defun url-cache-create-filename-human-readable url
Creates a cache file name from @var{url} more obviously connected to
@var{url} than for @code{url-cache-create-filename-using-md5}, but
more likely to conflict with other files.
@smallexample
(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
     @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
@end smallexample
@end defun

@defun url-cache-expired
This function returns non-@code{nil} if a cache entry has expired (or is absent).
The arguments are a URL and optional expiration delay in seconds
(default @code{url-cache-expire-time}).
@end defun

@defopt url-cache-expire-time
This variable is the default number of seconds to use for the
expire-time argument of the function @code{url-cache-expired}.
@end defopt

@defun url-fetch-from-cache
This function takes a URL as its argument and returns a buffer
containing the data cached for that URL.
@end defun
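
The cache functions above could be combined roughly as follows. This
is a sketch under the argument conventions just described; the URL
and the one-hour limit are arbitrary placeholders.

@example
(require 'url)
(require 'url-cache)

;; Cache retrieved documents automatically from now on.
(setq url-automatic-caching t)

(let ((url "http://www.example.com/foo/bar"))
  (if (url-cache-expired url 3600)
      ;; Absent, or older than an hour: go to the network.
      (url-retrieve-synchronously url)
    ;; Otherwise reuse the cached copy.
    (url-fetch-from-cache url)))
@end example

Either branch yields a buffer, so the caller can treat cached and
freshly retrieved documents alike.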

@c Fixme: never actually used currently?
@c @defopt url-standalone-mode
@c @cindex Relying on cache
@c @cindex Cache only mode
@c @cindex Standalone mode
@c If this variable is non-@code{nil}, the library relies solely on the
@c cache for fetching documents and avoids checking if they have changed
@c on remote servers.
@c @end defopt

@c With a large cache of documents on the local disk, it can be very handy
@c when traveling, or any other time the network connection is not active
@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
@c solely on its cache, and avoid checking to see if the page has changed
@c on the remote server. In the case of a dial-on-demand PPP connection,
@c this will keep the phone line free as long as possible, only bringing up
@c the PPP connection when asking for a page that is not located in the
@c cache. This is very useful for demonstrations as well.

@node Proxies
@section Proxies and Gatewaying

@c fixme: check/document url-ns stuff
@cindex proxy servers
@cindex proxies
@cindex environment variables
@vindex HTTP_PROXY
Proxy servers are commonly used to provide gateways through firewalls
or as caches serving some more-or-less local network. Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
conventionally configured in the same way across different programs, through
environment variables of the form @code{@var{protocol}_proxy}, where
@var{protocol} is one of the supported network protocols (@code{http},
@code{ftp} etc.). The library recognizes such variables in either
upper or lower case. Their values are of one of the forms:
@itemize @bullet
@item @code{@var{host}:@var{port}}
@item A full URL;
@item Simply a host name.
@end itemize

@vindex NO_PROXY
The @code{NO_PROXY} environment variable specifies URLs that should be
excluded from proxying (on servers that should be contacted directly).
This should be a comma-separated list of hostnames, domain names, or a
mixture of both. Asterisks can be used as wildcards, but other
clients may not support that. Domain names may be indicated by a
leading dot. For example:
@example
NO_PROXY="*.aventail.com,home.com,.seanet.com"
@end example
@noindent says to contact all machines in the @samp{aventail.com} and
@samp{seanet.com} domains directly, as well as the machine named
@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
and @code{no_proxy} are also tried, in that order.

Proxies may also be specified directly in Lisp.

@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them. The items are of the form
@w{@code{(@var{scheme} . @var{host}:@var{portnumber})}}, which says that
the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}. An
exception is the pseudo scheme @code{"no_proxy"}, which is paired with
a regexp matching host names not to be proxied. This variable is
initialized from the environment as above.

@example
(setq url-proxy-services
      '(("http" . "proxy.aventail.com:80")
        ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
@end example
@end defopt

@node Gateways in general
@section Gateways in General
@cindex gateways
@cindex firewalls

The library provides a general gateway layer through which all
networking passes. It can both control access to the network and
provide access through gateways in firewalls. This may make direct
connections in some cases and pass through some sort of gateway in
others.@footnote{Proxies (which only operate over HTTP) are
implemented using this.} The library's basic function responsible for
making connections is @code{url-open-stream}.

@defun url-open-stream name buffer host service
@cindex opening a stream
@cindex stream, opening
Open a stream to @var{host}, possibly via a gateway. The other
arguments are as for @code{open-network-stream}. This will not make a
connection if @code{url-gateway-unplugged} is non-@code{nil}.
@end defun

@defvar url-gateway-local-host-regexp
This is a regular expression that matches local hosts that do not
require the use of a gateway. If @code{nil}, all connections are made
through the gateway.
@end defvar

@defvar url-gateway-method
This variable controls which gateway method is used. It may be useful
to bind it temporarily in some applications. It has values taken from
a list of symbols. Possible values are:

@table @code
@item telnet
@cindex @command{telnet}
Use this method if you must first telnet and log into a gateway host,
and then run telnet from that host to connect to outside machines.

@item rlogin
@cindex @command{rlogin}
This method is identical to @code{telnet}, but uses @command{rlogin}
to log into the remote machine without having to send the username and
password over the wire every time.

@item socks
@cindex @sc{socks}
Use if the firewall has a @sc{socks} gateway running on it. The
@sc{socks} v5 protocol is defined in RFC 1928.

@c @item ssl
@c This probably shouldn't be documented
@c Fixme: why not? -- fx

@item native
This method uses Emacs's built-in networking directly. This is the
default. It can be used only if there is no firewall blocking access.
@end table
@end defvar
896
897 The following variables control the gateway methods.
898
899 @defopt url-gateway-telnet-host
900 The gateway host to telnet to. Once logged in there, you then telnet
901 out to the hosts you want to connect to.
902 @end defopt
903 @defopt url-gateway-telnet-parameters
904 This should be a list of parameters to pass to the @command{telnet} program.
905 @end defopt
906 @defopt url-gateway-telnet-password-prompt
907 This is a regular expression that matches the password prompt when
908 logging in.
909 @end defopt
910 @defopt url-gateway-telnet-login-prompt
911 This is a regular expression that matches the username prompt when
912 logging in.
913 @end defopt
914 @defopt url-gateway-telnet-user-name
915 The username to log in with.
916 @end defopt
917 @defopt url-gateway-telnet-password
918 The password to send when logging in.
919 @end defopt
920 @defopt url-gateway-prompt-pattern
921 This is a regular expression that matches the shell prompt.
922 @end defopt
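
A configuration for the @code{telnet} method could be sketched as
follows. The host name, user name and prompt patterns are
illustrative assumptions, not recommended values; they need to be
adapted to the prompts the gateway host actually prints.

@example
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.org"
      url-gateway-telnet-user-name "jdoe"
      ;; Placeholder patterns; adjust them to the gateway's prompts.
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      url-gateway-prompt-pattern "^[^#$%>;]*[#$%>;] *")
@end example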
923
924 @defopt url-gateway-rlogin-host
925 Host to @samp{rlogin} to before telnetting out.
926 @end defopt
927 @defopt url-gateway-rlogin-parameters
928 Parameters to pass to @samp{rsh}.
929 @end defopt
930 @defopt url-gateway-rlogin-user-name
931 User name to use when logging in to the gateway.
932 @end defopt
936
937 @defopt socks-server
938 This specifies the default server. It takes the form
939 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
940 where @var{version} can be either 4 or 5.
941 @end defopt
942 @defvar socks-password
943 If this is @code{nil} then you will be asked for the password,
944 otherwise it will be used as the password for authenticating you to
945 the @sc{socks} server.
946 @end defvar
947 @defvar socks-username
948 This is the username to use when authenticating yourself to the
949 @sc{socks} server. By default this is your login name.
950 @end defvar
951 @defvar socks-timeout
952 This controls how long, in seconds, to wait for responses from the
953 @sc{socks} server; it is 5 by default.
954 @end defvar
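
Taken together, a @sc{socks} setup might be sketched like this. The
server name, port and user name are placeholders, and the code assumes
that these variables are defined by the @code{socks} library.

@example
(require 'socks)

(setq socks-server '("Default server" "socks.example.org" 1080 5)
      socks-username "jdoe"
      socks-password nil      ; nil means ask for the password
      socks-timeout 10)       ; wait up to ten seconds for replies

;; Route the URL library's connections through the SOCKS server.
(setq url-gateway-method 'socks)
@end example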
955 @c fixme: these have been effectively commented-out in the code
956 @c @defopt socks-server-aliases
957 @c This is a list of server aliases. It is a list of aliases of the form
958 @c @var{(alias hostname port version)}.
959 @c @end defopt
960 @c @defopt socks-network-aliases
961 @c This is a list of network aliases. Each entry in the list takes the form
962 @c @var{(alias (network))} where @var{alias} is a string that names the
963 @c @var{network}. The networks can contain a pair (not a dotted pair) of
964 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
965 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
966 @c address.
967 @c @end defopt
968 @c @defopt socks-redirection-rules
969 @c This is a list of redirection rules. Each rule takes the form
970 @c @var{(Destination network Connection type)} where @var{Destination
971 @c network} is a network alias from @code{socks-network-aliases} and
972 @c @var{Connection type} can be @code{nil} in which case a direct
973 @c connection is used, or it can be an alias from
974 @c @code{socks-server-aliases} in which case that server is used as a
975 @c proxy.
976 @c @end defopt
977 @defopt socks-nslookup-program
978 @cindex @command{nslookup}
979 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
980 @end defopt
981
982 @menu
983 * Suppressing network connections::
984 @end menu
985 @c * Broken hostname resolution::
986
987 @node Suppressing network connections
988 @subsection Suppressing Network Connections
989
990 @cindex network connections, suppressing
991 @cindex suppressing network connections
992 @cindex bugs, HTML
993 @cindex HTML `bugs'
994 In some circumstances it is desirable to suppress making network
995 connections. A typical case is when rendering HTML in a mail user
996 agent, when external URLs should not be activated, particularly to
997 avoid `bugs' which `call home' by fetching single-pixel images and the
998 like. To arrange this, bind the following variable for the duration
999 of such processing.
1000
1001 @defvar url-gateway-unplugged
1002 If this variable is non-@code{nil}, new network connections are never
1003 opened by the URL library.
1004 @end defvar
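
One way to arrange such a binding is a small wrapper, sketched below.
The name @code{my-call-without-network} is hypothetical and not part
of the library.

@example
(require 'url)

(defun my-call-without-network (function)
  "Call FUNCTION with new URL library connections suppressed."
  ;; The binding lasts only while FUNCTION runs.
  (let ((url-gateway-unplugged t))
    (funcall function)))
@end example

A mail user agent could then pass its HTML rendering code to this
wrapper as a function, so that rendering never opens a connection.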
1005
1006 @c @node Broken hostname resolution
1007 @c @subsection Broken Hostname Resolution
1008
1009 @c @cindex hostname resolver
1010 @c @cindex resolver, hostname
1011 @c Some C libraries do not include the hostname resolver routines in
1012 @c their static libraries. If Emacs was linked statically, and was not
1013 @c linked with the resolver libraries, it will not be able to get to any
1014 @c machines off the local network. This is characterized by being able
1015 @c to reach someplace with a raw ip number, but not its hostname
1016 @c (@url{http://129.79.254.191/} works, but
1017 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1018 @c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
1019 @c rebuilt linked against the resolver library, it can use the external
1020 @c @command{nslookup} program instead.
1021
1022 @c @defopt url-gateway-broken-resolution
1023 @c @cindex @code{nslookup} program
1024 @c @cindex program, @code{nslookup}
1025 @c If non-@code{nil}, this variable says to use the program specified by
1026 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1027 @c @end defopt
1028
1029 @c @defopt url-gateway-nslookup-program
1030 @c The name of the program to do hostname lookup if Emacs can't do it
1031 @c directly. This program should expect a single argument on the command
1032 @c line---the hostname to resolve---and should produce output similar to
1033 @c the standard Unix @command{nslookup} program:
1034 @c @example
1035 @c Name: www.cs.indiana.edu
1036 @c Address: 129.79.254.191
1037 @c @end example
1038 @c @end defopt
1039
1040 @node History
1041 @section History
1042
1043 @findex url-do-setup
1044 The library can maintain a global history list tracking URLs accessed.
1045 URL completion can be done from it. The history mechanism is set up
1046 automatically via @code{url-do-setup} when it is configured to be on.
1047 Note that the size of the history list is currently not limited.
1048
1049 @vindex url-history-hash-table
1050 The history `list' is actually a hash table,
1051 @code{url-history-hash-table}. It contains access times keyed by URL
1052 strings. The times are in the format returned by @code{current-time}.
1053
1054 @defun url-history-update-url url time
1055 This function updates the history table with an entry for @var{url}
1056 accessed at the given @var{time}.
1057 @end defun
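
The following sketch records an access and reads the stored time back
from the table. The variable @code{my-example-url} and its value are
placeholders, and the code assumes the history functions are provided
by the @code{url-history} library.

@example
(require 'url-history)

(defvar my-example-url "http://www.example.org/")  ; placeholder URL

;; Record an access at the current time, then look it up again.
(url-history-update-url my-example-url (current-time))
(gethash my-example-url url-history-hash-table)
@end example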
1058
1059 @defopt url-history-track
1060 If non-@code{nil}, the library will keep track of all the URLs
1061 accessed. If it is @code{t}, the list is saved to disk at the end of
1062 each Emacs session. The default is @code{nil}.
1063 @end defopt
1064
1065 @defopt url-history-file
1066 The file storing the history list between sessions. It defaults to
1067 @file{history} in @code{url-configuration-directory}.
1068 @end defopt
1069
1070 @defopt url-history-save-interval
1071 @findex url-history-setup-save-timer
1072 The number of seconds between automatic saves of the history list.
1073 The default is one hour. Note that if you change this variable directly,
1074 rather than using Custom, after @code{url-do-setup} has been run, you
1075 need to run the function @code{url-history-setup-save-timer}.
1076 @end defopt
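
Changing the interval from Lisp rather than through Custom might look
like this sketch; the value of ten minutes is only an example.

@example
(require 'url-history)

(setq url-history-track t
      url-history-save-interval 600)   ; ten minutes, in seconds

;; Needed only if `url-do-setup' has already run.
(url-history-setup-save-timer)
@end example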
1077
1078 @defun url-history-parse-history &optional fname
1079 Parses the history file @var{fname} (default @code{url-history-file})
1080 and sets up the history list.
1081 @end defun
1082
1083 @defun url-history-save-history &optional fname
1084 Saves the current history to file @var{fname} (default
1085 @code{url-history-file}).
1086 @end defun
1087
1088 @defun url-completion-function string predicate function
1089 You can use this function to do completion of URLs from the history.
1090 @end defun
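
Since its arguments follow the convention for completion functions,
it can presumably be passed as the collection argument of
@code{completing-read}. A minimal sketch, in which the command name
@code{my-read-url-from-history} is hypothetical:

@example
(require 'url-history)

(defun my-read-url-from-history ()
  "Read a URL in the minibuffer, completing from the URL history."
  (completing-read "URL: " #'url-completion-function))
@end example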
1091
1092 @node Customization
1093 @chapter Customization
1094
1095 @section Environment Variables
1096
1097 @cindex environment variables
1098 The following environment variables affect the library's operation at
1099 startup.
1100
1101 @table @code
1102 @item TMPDIR
1103 @vindex TMPDIR
1104 @vindex url-temporary-directory
1105 If this is defined, @code{url-temporary-directory} is initialized from
1106 it.
1107 @end table
1108
1109 @section General User Options
1110
1111 The following user options, settable with Customize, affect the
1112 general operation of the package.
1113
1114 @defopt url-debug
1115 @cindex debugging
1116 Specifies the types of debug messages which are logged to
1117 the @code{*URL-DEBUG*} buffer.
1118 @code{t} means log all messages.
1119 A number means log all messages and show them with @code{message}.
1120 It may also be a list of the types of messages to be logged.
1121 @end defopt
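
The three kinds of value could be set as sketched below; normally only
one of these forms would be used. The type name @code{http} in the
last form is an assumption made for illustration.

@example
(setq url-debug t)        ; log every type of message
(setq url-debug 1)        ; log everything and show it with `message'
(setq url-debug '(http))  ; log only the listed types (assumed name)
@end example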
1122 @defopt url-personal-mail-address
1123 @end defopt
1124 @defopt url-privacy-level
1125 @end defopt
1126 @defopt url-uncompressor-alist
1127 @end defopt
1128 @defopt url-passwd-entry-func
1129 @end defopt
1130 @defopt url-standalone-mode
1131 @end defopt
1132 @defopt url-bad-port-list
1133 @end defopt
1134 @defopt url-max-password-attempts
1135 @end defopt
1136 @defopt url-temporary-directory
1137 @end defopt
1138 @defopt url-show-status
1139 @end defopt
1140 @defopt url-confirmation-func
1141 The function to use for asking yes or no questions. This is normally
1142 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1143 function taking a single argument (the prompt) and returning @code{t}
1144 only if an affirmative answer is given.
1145 @end defopt
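
A custom confirmation function might be sketched as follows. The name
@code{my-url-confirm} is hypothetical; the function declines
automatically when Emacs runs in batch mode and otherwise defers to
@code{y-or-n-p}.

@example
(defun my-url-confirm (prompt)
  "Answer PROMPT via `y-or-n-p', declining automatically in batch mode."
  (and (not noninteractive)
       (y-or-n-p prompt)))

(setq url-confirmation-func #'my-url-confirm)
@end example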
1146 @defopt url-gateway-method
1147 @c fixme: describe gatewaying
1148 A symbol specifying the type of gateway support to use for connections
1149 from the local machine. The supported methods are:
1150
1151 @table @code
1152 @item telnet
1153 Run telnet in a subprocess to connect;
1154 @item rlogin
1155 Rlogin to another machine to connect;
1156 @item socks
1157 Connect through a socks server;
1158 @item ssl
1159 Connect with SSL;
1160 @item native
1161 Connect directly.
1162 @end table
1163 @end defopt
1164
1165 @node GNU Free Documentation License
1166 @appendix GNU Free Documentation License
1167 @include doclicense.texi
1168
1169 @node Function Index
1170 @unnumbered Command and Function Index
1171 @printindex fn
1172
1173 @node Variable Index
1174 @unnumbered Variable Index
1175 @printindex vr
1176
1177 @node Concept Index
1178 @unnumbered Concept Index
1179 @printindex cp
1180
1181 @bye