1 \input texinfo
2 @setfilename ../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory World Wide Web
16 @dircategory GNU Emacs Lisp
17 @direntry
18 * URL: (url). URL loading package.
19 @end direntry
20
21 @ifnottex
22 This file documents the URL loading package.
23
24 Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
25 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
26
27 Permission is granted to copy, distribute and/or modify this document
28 under the terms of the GNU Free Documentation License, Version 1.2 or
29 any later version published by the Free Software Foundation; with the
30 Invariant Sections being
31 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
32 license is included in the section entitled ``GNU Free Documentation
33 License.''
34 @end ifnottex
35
36 @c
37 @titlepage
38 @sp 6
39 @center @titlefont{URL}
40 @center @titlefont{Programmer's Manual}
41 @sp 4
42 @center First Edition, URL Version 2.0
43 @sp 1
44 @c @center December 1999
45 @sp 5
46 @center William M. Perry
47 @center @email{wmperry@@gnu.org}
48 @center David Love
49 @center @email{fx@@gnu.org}
50 @page
51 @vskip 0pt plus 1filll
52 Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
53 2003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
54
55 Permission is granted to copy, distribute and/or modify this document
56 under the terms of the GNU Free Documentation License, Version 1.2 or
57 any later version published by the Free Software Foundation; with the
58 Invariant Sections being
59 ``GNU GENERAL PUBLIC LICENSE''. A copy of the
60 license is included in the section entitled ``GNU Free Documentation
61 License.''
62 @end titlepage
63 @page
64 @node Top
65 @top URL
66
67
68
69 @menu
70 * Getting Started:: Preparing your program to use URLs.
71 * Retrieving URLs:: How to use this package to retrieve a URL.
72 * Supported URL Types:: Descriptions of URL types currently supported.
73 * Defining New URLs:: How to define a URL loader for a new protocol.
74 * General Facilities:: URLs can be cached, accessed via a gateway
75 and tracked in a history list.
76 * Customization:: Variables you can alter.
77 * Function Index::
78 * Variable Index::
79 * Concept Index::
80 @end menu
81
82 @node Getting Started
83 @chapter Getting Started
84 @cindex URLs, definition
85 @cindex URIs
86
87 @dfn{Uniform Resource Locators} (URLs) are a specific form of
88 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
89 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
90 agents.
91
92 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
93 @var{scheme}s supported by this library are described below.
94 @xref{Supported URL Types}.
95
96 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
97 IRC and gopher URLs all have the form
98
99 @example
100 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
101 @end example
102 @noindent
103 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
104 @var{userinfo} sometimes takes the form @var{username}:@var{password}
105 but you should beware of the security risks of sending cleartext
106 passwords. @var{hostname} may be a domain name or a dotted decimal
107 address. If the @samp{:@var{port}} is omitted then the library will
108 use the `well known' port for that service when accessing URLs. With
109 the possible exception of @code{telnet}, it is rare for ports to be
110 specified, and using a non-standard port may have
111 undesired consequences if a different service is listening on that
112 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
113 sent). @c , but @xref{Other Variables, url-bad-port-list}.
114 The meaning of the @var{path} component depends on the service.
115
116 @menu
117 * Configuration::
118 * Parsed URLs:: URLs are parsed into vector structures.
119 @end menu
120
121 @node Configuration
122 @section Configuration
123
124 @defvar url-configuration-directory
125 @cindex @file{~/.url}
126 @cindex configuration files
127 The directory in which URL configuration files, the cache etc.,
128 reside. The default is @file{~/.url}.
129 @end defvar
130
131 @node Parsed URLs
132 @section Parsed URLs
133 @cindex parsed URLs
134 The library functions typically operate on @dfn{parsed} versions of
135 URLs. These are actually vectors of the form:
136
137 @example
138 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
139 @end example
140
141 @noindent where
142 @table @var
143 @item type
144 is the type of the URL scheme, e.g., @code{http};
145 @item user
146 is the username associated with it, or @code{nil};
147 @item password
148 is the user password associated with it, or @code{nil};
149 @item host
150 is the host name associated with it, or @code{nil};
151 @item port
152 is the port number associated with it, or @code{nil};
153 @item file
154 is the `file' part of it, or @code{nil}. This doesn't necessarily
155 actually refer to a file;
156 @item target
157 is the target part, or @code{nil};
158 @item attributes
159 is the attributes associated with it, or @code{nil};
160 @item full
161 is @code{t} for a fully-specified URL, with a host part indicated by
162 @samp{//} after the scheme part.
163 @end table
164
165 @findex url-type
166 @findex url-user
167 @findex url-password
168 @findex url-host
169 @findex url-port
170 @findex url-file
171 @findex url-target
172 @findex url-attributes
173 @findex url-full
174 @findex url-set-type
175 @findex url-set-user
176 @findex url-set-password
177 @findex url-set-host
178 @findex url-set-port
179 @findex url-set-file
180 @findex url-set-target
181 @findex url-set-attributes
182 @findex url-set-full
183 These attributes have accessors named @code{url-@var{part}}, where
184 @var{part} is the name of one of the elements above, e.g.,
185 @code{url-host}. Similarly, there are setters of the form
186 @code{url-set-@var{part}}.
187
188 There are functions for parsing and unparsing between the string and
189 vector forms.
190
191 @defun url-generic-parse-url url
192 Return a parsed version of the string @var{url}.
193 @end defun
194
195 @defun url-recreate-url url
196 @cindex unparsing URLs
197 Recreates a URL string from the parsed @var{url}.
198 @end defun
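
As a minimal sketch of a round trip between the string and parsed
forms (the URL below is only an illustration, not one the library
treats specially):

@example
(require 'url-parse)

;; Parse a string into the vector form described above.
(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  ;; Read individual parts with the accessors ...
  (message "scheme=%s host=%s" (url-type parsed) (url-host parsed))
  ;; ... and turn the parsed form back into a string.
  (url-recreate-url parsed))
@end example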
199
200 @node Retrieving URLs
201 @chapter Retrieving URLs
202
203 @defun url-retrieve-synchronously url
204 Retrieve @var{url} synchronously and return a buffer containing the
205 data. @var{url} is either a string or a parsed URL structure. Return
206 @code{nil} if there are no data associated with it (the case for dired,
207 info, or mailto URLs that need no further processing).
208 @end defun
209
210 @defun url-retrieve url callback &optional cbargs
211 Retrieve @var{url} asynchronously and call @var{callback} with args
212 @var{cbargs} when finished. The callback is called when the object
213 has been completely retrieved, with the current buffer containing the
214 object and any MIME headers associated with it. @var{url} is either a
215 string or a parsed URL structure. Returns the buffer @var{url} will
216 load into, or @code{nil} if the process has already completed.
217 @end defun
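
The two entry points might be used as follows.  This is a sketch only:
the URL is a placeholder, and the callback deliberately accepts any
arguments, on the assumption that the exact arguments passed to it may
differ between versions of the library.

@example
(require 'url)

;; Synchronous: blocks until the data arrive; may return nil.
(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (with-current-buffer buffer
      (message "Retrieved %d characters" (buffer-size)))))

;; Asynchronous: returns at once; the callback runs later with the
;; retrieved object (and its headers) in the current buffer.
(url-retrieve "http://www.example.com/"
              (lambda (&rest _ignored)
                (message "Retrieved %d characters" (buffer-size))))
@end example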
218
219 @node Supported URL Types
220 @chapter Supported URL Types
221
222 @menu
223 * http/https:: Hypertext Transfer Protocol.
224 * file/ftp:: Local files and FTP archives.
225 * info:: Emacs `Info' pages.
226 * mailto:: Sending email.
227 * news/nntp/snews:: Usenet news.
228 * rlogin/telnet/tn3270:: Remote host connectivity.
229 * irc:: Internet Relay Chat.
230 * data:: Embedded data URLs.
231 * nfs:: Network File System.
232 @c * finger::
233 @c * gopher::
234 @c * netrek::
235 @c * prospero::
236 * cid:: Content-ID.
237 * about::
238 * ldap:: Lightweight Directory Access Protocol.
239 * imap:: IMAP mailboxes.
240 * man:: Unix man pages.
241 @end menu
242
243 @node http/https
244 @section @code{http} and @code{https}
245
246 The scheme @code{http} is Hypertext Transfer Protocol. The library
247 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
248 defined in RFC 1945.) HTTP URLs have the following form, where most of
249 the parts are optional:
250 @example
251 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
252 @end example
253 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
254 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
255 @c series elements. The @code{?@var{searchpart}}, if present, is the
256 @c query for a search or the content of a form submission. The
257 @c @code{#fragment} part, if present, is a location in the document.
258
259 The scheme @code{https} is a secure version of @code{http}, with
260 transmission via SSL. It is defined in RFC 2818. Its default port is
261 443. This scheme depends on SSL support in Emacs via the
262 @file{ssl.el} library and is actually implemented by forcing the
263 @code{ssl} gateway method to be used. @xref{Gateways in general}.
264
265 @defopt url-honor-refresh-requests
266 This controls honouring of HTTP @samp{Refresh} headers by which
267 servers can direct clients to reload documents from the same URL or a
268 different one. @code{nil} means they will not be honoured,
269 @code{t} (the default) means they will always be honoured, and
270 otherwise the user will be asked on each request.
271 @end defopt
272
273
274 @menu
275 * Cookies::
276 * HTTP language/coding::
277 * HTTP URL Options::
278 * Dealing with HTTP documents::
279 @end menu
280
281 @node Cookies
282 @subsection Cookies
283
284 @defopt url-cookie-file
285 The file in which cookies are stored, defaulting to @file{cookies} in
286 the directory specified by @code{url-configuration-directory}.
287 @end defopt
288
289 @defopt url-cookie-confirmation
290 Specifies whether confirmation is required to accept cookies.
291 @end defopt
292
293 @defopt url-cookie-multiple-line
294 Specifies whether to put all cookies for the server on one line in the
295 HTTP request to satisfy broken servers like
296 @url{http://www.hotmail.com}.
297 @end defopt
298
299 @defopt url-cookie-trusted-urls
300 A list of regular expressions matching URLs from which to accept
301 cookies always.
302 @end defopt
303
304 @defopt url-cookie-untrusted-urls
305 A list of regular expressions matching URLs from which to reject
306 cookies always.
307 @end defopt
308
309 @defopt url-cookie-save-interval
310 The number of seconds between automatic saves of cookies to disk.
311 Default is one hour.
312 @end defopt
313
314
315 @node HTTP language/coding
316 @subsection Language and Encoding Preferences
317
318 HTTP allows clients to express preferences for the language and
319 encoding of documents which servers may honour. For each of these
320 variables, the value is a string; it can specify a single choice, or
321 it can be a comma-separated list.
322
323 Normally this list is ordered by descending preference. However, each
324 element can be followed by @samp{;q=@var{priority}} to specify its
325 preference level, a decimal number from 0 to 1; e.g., for
326 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
327 en;q=0.7"}}. An element that has no @samp{;q} specification has
328 preference level 1.
329
330 @defopt url-mime-charset-string
331 @cindex character sets
332 @cindex coding systems
333 This variable specifies a preference for character sets when documents
334 can be served in more than one encoding.
335
336 HTTP allows specifying a series of MIME charsets which indicate your
337 preferred character set encodings, e.g., Latin-9 or Big5, and these
338 can be weighted. The default series is generated automatically from
339 the associated MIME types of all defined coding systems, sorted by the
340 coding system priority specified in Emacs. @xref{Recognize Coding, ,
341 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
342 @end defopt
343
344 @defopt url-mime-language-string
345 @cindex language preferences
346 A string specifying the preferred language when servers can serve
347 files in several languages. Use RFC 1766 abbreviations, e.g.,
348 @samp{en} for English, @samp{de} for German.
349
350 The string can be @code{"*"} to get the first available language (as
351 opposed to the default).
352 @end defopt
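
For instance, a user who prefers German, then British English, then
any other English might use settings along these lines.  The particular
languages, charsets and weights are only illustrative assumptions.

@example
;; Entries without a ;q= weight have preference level 1.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")

;; The same weighting syntax applies to character sets.
(setq url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example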
353
354 @node HTTP URL Options
355 @subsection HTTP URL Options
356
357 HTTP supports an @samp{OPTIONS} method describing things supported by
358 the URL@.
359
360 @defun url-http-options url
361 Returns a property list describing options available for @var{url}. The
362 property list members are:
363
364 @table @code
365 @item methods
366 A list of symbols specifying what HTTP methods the resource
367 supports.
368
369 @item dav
370 @cindex DAV
371 A list of numbers specifying what DAV protocol/schema versions are
372 supported.
373
374 @item dasl
375 @cindex DASL
376 A list of supported DASL search types (string form).
377
378 @item ranges
379 A list of the units available for use in partial document fetches.
380
381 @item p3p
382 @cindex P3P
383 The @dfn{Platform for Privacy Preferences} description for the resource.
384 Currently this is just the raw header contents.
385 @end table
386
387 @end defun
388
389 @node Dealing with HTTP documents
390 @subsection Dealing with HTTP documents
391
392 HTTP URLs are retrieved into a buffer containing the HTTP headers
393 followed by the body. Since the headers are quasi-MIME, they may be
394 processed using the MIME library. @xref{Top,, Emacs MIME,
395 emacs-mime, The Emacs MIME Manual}. The URL package provides a
396 function to do this in general:
397
398 @defun url-decode-text-part handle &optional coding
399 This function decodes charset-encoded text in the current buffer. In
400 Emacs, the buffer is expected to be unibyte initially and is set to
401 multibyte after decoding.
402 @var{handle} is the MIME handle of the original part. @var{coding} is an explicit
403 coding to use, overriding what the MIME headers specify.
404 The coding system used for the decoding is returned.
405
406 Note that this function doesn't deal with @samp{http-equiv} charset
407 specifications in HTML @samp{<meta>} elements.
408 @end defun
409
410 @node file/ftp
411 @section file and ftp
412 @cindex files
413 @cindex FTP
414 @cindex File Transfer Protocol
415 @cindex compressed files
416 @cindex dired
417
418 @example
419 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
420 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
421 @end example
422
423 These schemes are defined in RFC 1738.
424 @samp{ftp:} and @samp{file:} are synonymous in this library. They
425 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
426 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
427 hosts. Local files are accessed directly.
428
429 Compressed files are handled, but support is hard-coded so that
430 @code{jka-compr-compression-info-list} and so on have no effect.
431 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
432 @samp{.bz2}.
433
434 @defopt url-directory-index-file
435 The filename to look for when indexing a directory, default
436 @samp{"index.html"}. If this file exists, and is readable, then it
437 will be viewed instead of using @code{dired} to view the directory.
438 @end defopt
439
440 @node info
441 @section info
442 @cindex Info
443 @cindex Texinfo
444 @findex Info-goto-node
445
446 @example
447 info:@var{file}#@var{node}
448 @end example
449
450 Info URLs are not officially defined. They invoke
451 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
452 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
453
454 @node mailto
455 @section mailto
456
457 @cindex mailto
458 @cindex email
459 A mailto URL will send an email message to the address in the
460 URL; for example, @samp{mailto:foo@@bar.com} would compose a
461 message to @samp{foo@@bar.com}.
462
463 @defopt url-mail-command
464 @vindex mail-user-agent
465 The function called whenever url needs to send mail. This should
466 normally be left to default from @code{mail-user-agent}. @xref{Mail
467 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
468 @end defopt
469
470 An @samp{X-Url-From} header field containing the URL of the document
471 that contained the mailto URL is added if that URL is known.
472
473 RFC 2368 extends the definition of mailto URLs in RFC 1738.
474 The form of a mailto URL is
475 @example
476 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
477 @end example
478 @noindent where an arbitrary number of @var{header}s can be added. If the
479 @var{header} is @samp{body}, then @var{contents} is put in the body;
480 otherwise a @var{header} header field is created with @var{contents}
481 as its contents. Note that the URL library does not consider any
482 headers `dangerous' so you should check them before sending the
483 message.
484
485 @c Fixme: update
486 Email messages are defined in RFC 822.
487
488 @node news/nntp/snews
489 @section @code{news}, @code{nntp} and @code{snews}
490 @cindex news
491 @cindex network news
492 @cindex usenet
493 @cindex NNTP
494 @cindex snews
495
496 @c draft-gilman-news-url-01
497 The network news URL schemes take the following forms, following RFC
498 1738, except that for compatibility with other clients, host and port
499 fields may be included in news URLs though they are properly only
500 allowed for nntp and snews.
501
502 @table @samp
503 @item news:@var{newsgroup}
504 Retrieves a list of messages in @var{newsgroup};
505 @item news:@var{message-id}
506 Retrieves the message with the given @var{message-id};
507 @item news:*
508 Retrieves a list of all available newsgroups;
509 @item nntp://@var{host}:@var{port}/@var{newsgroup}
510 @itemx nntp://@var{host}:@var{port}/@var{message-id}
511 @itemx nntp://@var{host}:@var{port}/*
512 Similar to the @samp{news} versions.
513 @end table
514
515 @samp{:@var{port}} is optional and defaults to :119.
516
517 @samp{snews} is the same as @samp{nntp} except that the default port
518 is :563.
519 @cindex SSL
520 (It is tunneled through SSL.)
521
522 An @samp{nntp} URL is the same as a news URL, except that the URL may
523 specify an article by its number.
524
525 @defopt url-news-server
526 This variable can be used to override the default news server.
527 Usually this will be set by the Gnus package, which is used to fetch
528 news.
529 @cindex environment variable
530 @vindex NNTPSERVER
531 It may be set from the conventional environment variable
532 @code{NNTPSERVER}.
533 @end defopt
534
535 @node rlogin/telnet/tn3270
536 @section rlogin, telnet and tn3270
537 @cindex rlogin
538 @cindex telnet
539 @cindex tn3270
540 @cindex terminal emulation
541 @findex terminal-emulator
542
543 These URL schemes from RFC 1738 for logon via a terminal emulator have
544 the form
545 @example
546 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
547 @end example
548 but the @code{:@var{password}} component is ignored.
549
550 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
551 @code{telnet} or @code{tn3270} (the program names and arguments are
552 hardcoded) session is run in a @code{terminal-emulator} buffer.
553 Well-known ports are used if the URL does not specify a port.
554
555 @node irc
556 @section irc
557 @cindex IRC
558 @cindex Internet Relay Chat
559 @cindex ZEN IRC
560 @cindex ERC
561 @cindex rcirc
562 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
563 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
564 session to a function named in @code{url-irc-function}.
565
566 @defopt url-irc-function
567 A function to actually open an IRC connection.
568 This function
569 must take five arguments, @var{host}, @var{port}, @var{channel},
570 @var{user} and @var{password}. The @var{channel} argument specifies the
571 channel to join immediately; this can be @code{nil}. The default
572 function is @code{url-irc-rcirc}.
573 @end defopt
574 @defun url-irc-rcirc host port channel user password
575 Processes the arguments and lets @code{rcirc} handle the session.
576 @end defun
577 @defun url-irc-erc host port channel user password
578 Processes the arguments and lets @code{ERC} handle the session.
579 @end defun
580 @defun url-irc-zenirc host port channel user password
581 Processes the arguments and lets @code{zenirc} handle the session.
582 @end defun
583
584 @node data
585 @section data
586 @cindex data URLs
587
588 @example
589 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
590 @end example
591
592 Data URLs contain MIME data in the URL itself. They are defined in
593 RFC 2397.
594
595 @var{media-type} is a MIME @samp{Content-Type} string, possibly
596 including parameters. It defaults to
597 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
598 omitted but the charset parameter supplied. If @samp{;base64} is
599 present, the @var{data} are base64-encoded.
600
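As a small illustration of the base64 form, the URL
@samp{data:text/plain;base64,SGVsbG8sIFdvcmxkIQ==} carries a short
piece of plain text.  The sketch below only decodes the payload by
hand; it does not show how the library itself processes such a URL.

@example
;; The part after the comma is ordinary base64.
(base64-decode-string "SGVsbG8sIFdvcmxkIQ==")
     @result{} "Hello, World!"
@end example
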
601 @node nfs
602 @section nfs
603 @cindex NFS
604 @cindex Network File System
605 @cindex automounter
606
607 @example
608 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
609 @end example
610
611 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
612 @samp{ftp:} except that it points to a file on a remote host that is
613 handled by the automounter on the local host.
614
615 @defvar url-nfs-automounter-directory-spec
616 A string saying how to invoke the NFS automounter. Certain @samp{%}
617 sequences are recognized:
618
619 @table @samp
620 @item %h
621 The hostname of the NFS server;
622 @item %n
623 The port number of the NFS server;
624 @item %u
625 The username to use to authenticate;
626 @item %p
627 The password to use to authenticate;
628 @item %f
629 The filename on the remote server;
630 @item %%
631 A literal @samp{%}.
632 @end table
633
634 Each can be used any number of times.
635 @end defvar
636
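A value for this variable might look like the following sketch.  The
@file{/net} prefix is an assumption about how the local automounter is
laid out, so adjust it to match the actual mount points.

@example
;; %h expands to the server's host name, %f to the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example
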
637 @node cid
638 @section cid
639 @cindex Content-ID
640
641 The @samp{cid:} scheme is defined in RFC 2111.
642
643 @node about
644 @section about
645
646 @node ldap
647 @section ldap
648 @cindex LDAP
649 @cindex Lightweight Directory Access Protocol
650
651 The LDAP scheme is defined in RFC 2255.
652
653 @node imap
654 @section imap
655 @cindex IMAP
656
657 The @samp{imap:} scheme is defined in RFC 2192.
658
659 @node man
660 @section man
661 @cindex @command{man}
662 @cindex Unix man pages
663 @findex man
664
665 @example
666 @samp{man:@var{page-spec}}
667 @end example
668
669 This is a non-standard scheme. @var{page-spec} is passed directly to
670 the Lisp @code{man} function.
671
672 @node Defining New URLs
673 @chapter Defining New URLs
674
675 @menu
676 * Naming conventions::
677 * Required functions::
678 * Optional functions::
679 * Asynchronous fetching::
680 * Supporting file-name-handlers::
681 @end menu
682
683 @node Naming conventions
684 @section Naming conventions
685
686 @node Required functions
687 @section Required functions
688
689 @node Optional functions
690 @section Optional functions
691
692 @node Asynchronous fetching
693 @section Asynchronous fetching
694
695 @node Supporting file-name-handlers
696 @section Supporting file-name-handlers
697
698 @node General Facilities
699 @chapter General Facilities
700
701 @menu
702 * Disk Caching::
703 * Proxies::
704 * Gateways in general::
705 * History::
706 @end menu
707
708 @node Disk Caching
709 @section Disk Caching
710 @cindex Caching
711 @cindex Persistent Cache
712 @cindex Disk Cache
713
714 The disk cache stores retrieved documents locally, whence they can be
715 retrieved more quickly. When requesting a URL that is in the cache,
716 the library checks to see if the page has changed since it was last
717 retrieved from the remote machine. If not, the local copy is used,
718 saving the transmission over the network.
719 @cindex Cleaning the cache
720 @cindex Clearing the cache
721 @cindex Cache cleaning
722 Currently the cache isn't cleared automatically.
723 @c Running the @code{clean-cache} shell script
724 @c fist is recommended, to allow for future cleaning of the cache. This
725 @c shell script will remove all files that have not been accessed since it
726 @c was last run. To keep the cache pared down, it is recommended that this
727 @c script be run from @i{at} or @i{cron} (see the manual pages for
728 @c crontab(5) or at(1) for more information)
729
730 @defopt url-automatic-caching
731 Setting this variable non-@code{nil} causes documents to be cached
732 automatically.
733 @end defopt
734
735 @defopt url-cache-directory
736 This variable specifies the
737 directory in which to store the cache files. It defaults to the subdirectory
738 @file{cache} of @code{url-configuration-directory}.
739 @end defopt
740
741 @c Fixme: function v. option, but neither used.
742 @c @findex url-cache-expired
743 @c @defopt url-cache-expired
744 @c This is a function to decide whether or not a cache entry has expired.
745 @c It takes two times as it parameters and returns non-@code{nil} if the
746 @c second time is ``too old'' when compared with the first time.
747 @c @end defopt
748
749 @defopt url-cache-creation-function
750 The cache relies on a scheme for mapping URLs to files in the cache.
751 This variable names a function which sets the type of cache to use.
752 It takes a URL as argument and returns the absolute file name of the
753 corresponding cache file. The two supplied possibilities are
754 @code{url-cache-create-filename-using-md5} and
755 @code{url-cache-create-filename-human-readable}.
756 @end defopt
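
Putting the caching options above together, a configuration might look
like this sketch.  The choice of the human-readable naming function is
only an example, not a recommendation.

@example
;; Cache retrieved documents automatically ...
(setq url-automatic-caching t)

;; ... and name cache files after the URL rather than an MD5 hash.
(setq url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example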
757
758 @defun url-cache-create-filename-using-md5 url
759 Creates a cache file name from @var{url} using MD5 hashing.
760 This creates entries with very few cache collisions and is fast.
761 @cindex MD5
762 @smallexample
763 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
764 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
765 @end smallexample
766 @end defun
767
768 @defun url-cache-create-filename-human-readable url
769 Creates a cache file name from @var{url} more obviously connected to
770 @var{url} than for @code{url-cache-create-filename-using-md5}, but
771 more likely to conflict with other files.
772 @smallexample
773 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
774 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
775 @end smallexample
776 @end defun
777
778 @c Fixme: never actually used currently?
779 @c @defopt url-standalone-mode
780 @c @cindex Relying on cache
781 @c @cindex Cache only mode
782 @c @cindex Standalone mode
783 @c If this variable is non-@code{nil}, the library relies solely on the
784 @c cache for fetching documents and avoids checking if they have changed
785 @c on remote servers.
786 @c @end defopt
787
788 @c With a large cache of documents on the local disk, it can be very handy
789 @c when traveling, or any other time the network connection is not active
790 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
791 @c solely on its cache, and avoid checking to see if the page has changed
792 @c on the remote server. In the case of a dial-on-demand PPP connection,
793 @c this will keep the phone line free as long as possible, only bringing up
794 @c the PPP connection when asking for a page that is not located in the
795 @c cache. This is very useful for demonstrations as well.
796
797 @node Proxies
798 @section Proxies and Gatewaying
799
800 @c fixme: check/document url-ns stuff
801 @cindex proxy servers
802 @cindex proxies
803 @cindex environment variables
804 @vindex HTTP_PROXY
805 Proxy servers are commonly used to provide gateways through firewalls
806 or as caches serving some more-or-less local network. Each protocol
807 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
808 conventionally configured in the same way for different programs, through
809 environment variables of the form @code{@var{protocol}_proxy}, where
810 @var{protocol} is one of the supported network protocols (@code{http},
811 @code{ftp} etc.). The library recognizes such variables in either
812 upper or lower case. Their values are of one of the forms:
813 @itemize @bullet
814 @item @code{@var{host}:@var{port}}
815 @item A full URL;
816 @item Simply a host name.
817 @end itemize
818
819 @vindex NO_PROXY
820 The @code{NO_PROXY} environment variable specifies URLs that should be
821 excluded from proxying (on servers that should be contacted directly).
822 This should be a comma-separated list of hostnames, domain names, or a
823 mixture of both. Asterisks can be used as wildcards, but other
824 clients may not support that. Domain names may be indicated by a
825 leading dot. For example:
826 @example
827 NO_PROXY="*.aventail.com,home.com,.seanet.com"
828 @end example
829 @noindent says to contact all machines in the @samp{aventail.com} and
830 @samp{seanet.com} domains directly, as well as the machine named
831 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
832 and @code{no_proxy} are also tried, in that order.
833
834 Proxies may also be specified directly in Lisp.
835
836 @defopt url-proxy-services
837 This variable is an alist of URL schemes and proxy servers that
838 gateway them. The items are of the form @w{@code{(@var{scheme}
839 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
840 gatewayed through @var{portnumber} on the specified @var{host}. An
841 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
842 a regexp matching host names not to be proxied. This variable is
843 initialized from the environment as above.
844
845 @example
846 (setq url-proxy-services
847 '(("http" . "proxy.aventail.com:80")
848 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
849 @end example
850 @end defopt
851
852 @node Gateways in general
853 @section Gateways in General
854 @cindex gateways
855 @cindex firewalls
856
857 The library provides a general gateway layer through which all
858 networking passes. It can both control access to the network and
859 provide access through gateways in firewalls. This may make direct
860 connections in some cases and pass through some sort of gateway in
861 others.@footnote{Proxies (which only operate over HTTP) are
862 implemented using this.} The library's basic function responsible for
863 making connections is @code{url-open-stream}.
864
865 @defun url-open-stream name buffer host service
866 @cindex opening a stream
867 @cindex stream, opening
868 Open a stream to @var{host}, possibly via a gateway. The other
869 arguments are as for @code{open-network-stream}. This will not make a
870 connection if @code{url-gateway-unplugged} is non-@code{nil}.
871 @end defun
872
873 @defvar url-gateway-local-host-regexp
874 This is a regular expression that matches local hosts that do not
875 require the use of a gateway. If @code{nil}, all connections are made
876 through the gateway.
877 @end defvar
878
879 @defvar url-gateway-method
880 This variable controls which gateway method is used. It may be useful
881 to bind it temporarily in some applications. It has values taken from
882 a list of symbols. Possible values are:
883
884 @table @code
885 @item telnet
886 @cindex @command{telnet}
887 Use this method if you must first telnet and log into a gateway host,
888 and then run telnet from that host to connect to outside machines.
889
890 @item rlogin
891 @cindex @command{rlogin}
892 This method is identical to @code{telnet}, but uses @command{rlogin}
893 to log into the remote machine without having to send the username and
894 password over the wire every time.
895
896 @item socks
897 @cindex @sc{socks}
898 Use if the firewall has a @sc{socks} gateway running on it. The
899 @sc{socks} v5 protocol is defined in RFC 1928.
900
901 @c @item ssl
902 @c This probably shouldn't be documented
903 @c Fixme: why not? -- fx
904
905 @item native
906 This method uses Emacs's built-in networking directly. This is the
907 default. It can be used only if there is no firewall blocking access.
908 @end table
909 @end defvar
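
Binding the variable temporarily might look like the following sketch.
The URL is a placeholder, and @code{url-retrieve-synchronously} is used
here only as a convenient way to trigger a connection.

@example
(require 'url)

;; Force a direct connection for this one request, regardless of the
;; global gateway setting.
(let ((url-gateway-method 'native))
  (url-retrieve-synchronously "http://www.example.org/"))
@end example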
910
911 The following variables control the gateway methods.
912
913 @defopt url-gateway-telnet-host
914 The gateway host to telnet to. Once logged in there, you then telnet
915 out to the hosts you want to connect to.
916 @end defopt
917 @defopt url-gateway-telnet-parameters
918 This should be a list of parameters to pass to the @command{telnet} program.
919 @end defopt
920 @defopt url-gateway-telnet-password-prompt
921 This is a regular expression that matches the password prompt when
922 logging in.
923 @end defopt
924 @defopt url-gateway-telnet-login-prompt
925 This is a regular expression that matches the username prompt when
926 logging in.
927 @end defopt
928 @defopt url-gateway-telnet-user-name
929 The username to log in with.
930 @end defopt
931 @defopt url-gateway-telnet-password
932 The password to send when logging in.
933 @end defopt
934 @defopt url-gateway-prompt-pattern
935 This is a regular expression that matches the shell prompt.
936 @end defopt
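
A telnet gateway setup could be sketched as follows.  Every value shown
is a placeholder for a hypothetical site; the prompt regular expressions
in particular are assumptions and must be adapted to what the gateway
host really prints.

@example
;; Placeholder host, user name and prompts for a hypothetical gateway.
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.org"
      url-gateway-telnet-user-name "jdoe"
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      url-gateway-prompt-pattern "^[$%>] *")
@end example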
937
938 @defopt url-gateway-rlogin-host
939 Host to @samp{rlogin} to before telnetting out.
940 @end defopt
941 @defopt url-gateway-rlogin-parameters
942 Parameters to pass to @samp{rsh}.
943 @end defopt
944 @defopt url-gateway-rlogin-user-name
945 User name to use when logging in to the gateway.
946 @end defopt
950
951 @defopt socks-server
952 This specifies the default server. It takes the form
953 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
954 where @var{version} can be either 4 or 5.
955 @end defopt
956 @defvar socks-password
957 If this is @code{nil} then you will be asked for the password,
958 otherwise it will be used as the password for authenticating you to
959 the @sc{socks} server.
960 @end defvar
961 @defvar socks-username
962 This is the username to use when authenticating yourself to the
963 @sc{socks} server. By default this is your login name.
964 @end defvar
965 @defvar socks-timeout
966 This controls how long, in seconds, to wait for responses from the
967 @sc{socks} server; it is 5 by default.
968 @end defvar
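
Put together, a @sc{socks} configuration might look like this sketch.
The server name and port are placeholders, and the timeout value is an
arbitrary choice for illustration.

@example
;; Placeholder server and port; the last element of `socks-server'
;; selects version 5 of the protocol.
(setq url-gateway-method 'socks
      socks-server '("Default server" "socks.example.org" 1080 5)
      socks-timeout 10)
@end example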
969 @c fixme: these have been effectively commented-out in the code
970 @c @defopt socks-server-aliases
971 @c This a list of server aliases. It is a list of aliases of the form
972 @c @var{(alias hostname port version)}.
973 @c @end defopt
974 @c @defopt socks-network-aliases
975 @c This a list of network aliases. Each entry in the list takes the form
976 @c @var{(alias (network))} where @var{alias} is a string that names the
977 @c @var{network}. The networks can contain a pair (not a dotted pair) of
978 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
979 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
980 @c address.
981 @c @end defopt
982 @c @defopt socks-redirection-rules
983 @c This a list of redirection rules. Each rule take the form
984 @c @var{(Destination network Connection type)} where @var{Destination
985 @c network} is a network alias from @code{socks-network-aliases} and
986 @c @var{Connection type} can be @code{nil} in which case a direct
987 @c connection is used, or it can be an alias from
988 @c @code{socks-server-aliases} in which case that server is used as a
989 @c proxy.
990 @c @end defopt
991 @defopt socks-nslookup-program
992 @cindex @command{nslookup}
993 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
994 @end defopt
995
996 @menu
997 * Suppressing network connections::
998 @end menu
999 @c * Broken hostname resolution::
1000
1001 @node Suppressing network connections
1002 @subsection Suppressing Network Connections
1003
1004 @cindex network connections, suppressing
1005 @cindex suppressing network connections
1006 @cindex bugs, HTML
1007 @cindex HTML `bugs'
1008 In some circumstances it is desirable to suppress making network
1009 connections. A typical case is when rendering HTML in a mail user
1010 agent, when external URLs should not be activated, particularly to
1011 avoid `bugs' which `call home' by fetching single-pixel images and the
1012 like. To arrange this, bind the following variable for the duration
1013 of such processing.
1014
1015 @defvar url-gateway-unplugged
1016 If this variable is non-@code{nil} new network connections are never
1017 opened by the URL library.
1018 @end defvar
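
One way to bind the variable around such processing is sketched below.
The function name @code{my-call-unplugged} is hypothetical and is not
part of the library.

@example
(require 'url)

(defun my-call-unplugged (function)
  "Call FUNCTION with the URL library prevented from opening connections."
  (let ((url-gateway-unplugged t))
    (funcall function)))
@end example

A mail user agent could then wrap its HTML rendering call in
@code{my-call-unplugged}.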
1019
1020 @c @node Broken hostname resolution
1021 @c @subsection Broken Hostname Resolution
1022
1023 @c @cindex hostname resolver
1024 @c @cindex resolver, hostname
1025 @c Some C libraries do not include the hostname resolver routines in
1026 @c their static libraries. If Emacs was linked statically, and was not
1027 @c linked with the resolver libraries, it will not be able to get to any
1028 @c machines off the local network. This is characterized by being able
1029 @c to reach someplace with a raw ip number, but not its hostname
1030 @c (@url{http://129.79.254.191/} works, but
1031 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1032 @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1033 @c rebuilt linked against the resolver library, it can use the external
1034 @c @command{nslookup} program instead.
1035
1036 @c @defopt url-gateway-broken-resolution
1037 @c @cindex @code{nslookup} program
1038 @c @cindex program, @code{nslookup}
1039 @c If non-@code{nil}, this variable says to use the program specified by
1040 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1041 @c @end defopt
1042
1043 @c @defopt url-gateway-nslookup-program
1044 @c The name of the program to do hostname lookup if Emacs can't do it
1045 @c directly. This program should expect a single argument on the command
1046 @c line---the hostname to resolve---and should produce output similar to
1047 @c the standard Unix @command{nslookup} program:
1048 @c @example
1049 @c Name: www.cs.indiana.edu
1050 @c Address: 129.79.254.191
1051 @c @end example
1052 @c @end defopt
1053
1054 @node History
1055 @section History
1056
1057 @findex url-do-setup
1058 The library can maintain a global history list tracking URLs accessed.
1059 URL completion can be done from it. The history mechanism is set up
1060 automatically via @code{url-do-setup} when it is configured to be on.
1061 Note that the size of the history list is currently not limited.
1062
1063 @vindex url-history-hash-table
1064 The history `list' is actually a hash table,
1065 @code{url-history-hash-table}. It contains access times keyed by URL
1066 strings. The times are in the format returned by @code{current-time}.
1067
1068 @defun url-history-update-url url time
1069 This function updates the history table with an entry for @var{url}
1070 accessed at the given @var{time}.
1071 @end defun
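
The following sketch records a visit and reads the stored time back.
The URL is a placeholder, and the sketch assumes that the history hash
table has already been initialized, for example by
@code{url-do-setup}.

@example
(require 'url-history)

;; Record a visit to a placeholder URL at the current time.
(url-history-update-url "http://www.example.org/" (current-time))

;; Look up the access time stored for that URL.
(gethash "http://www.example.org/" url-history-hash-table)
@end example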
1072
1073 @defopt url-history-track
1074 If non-@code{nil}, the library will keep track of all the URLs
1075 accessed. If it is @code{t}, the list is saved to disk at the end of
1076 each Emacs session. The default is @code{nil}.
1077 @end defopt
1078
1079 @defopt url-history-file
1080 The file storing the history list between sessions. It defaults to
1081 @file{history} in @code{url-configuration-directory}.
1082 @end defopt
1083
1084 @defopt url-history-save-interval
1085 @findex url-history-setup-save-timer
1086 The number of seconds between automatic saves of the history list.
1087 Default is one hour. Note that if you change this variable directly,
1088 rather than using Custom, after @code{url-do-setup} has been run, you
1089 need to run the function @code{url-history-setup-save-timer}.
1090 @end defopt
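
As an illustration, the options above could be set from Lisp as in this
sketch.  The thirty-minute interval is an arbitrary choice.

@example
(require 'url-history)

;; Track visited URLs, save them between sessions, and save every
;; thirty minutes instead of every hour.
(setq url-history-track t
      url-history-save-interval 1800)

;; Needed only if `url-do-setup' has already been run.
(url-history-setup-save-timer)
@end example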
1091
1092 @defun url-history-parse-history &optional fname
1093 Parses the history file @var{fname} (default @code{url-history-file})
1094 and sets up the history list.
1095 @end defun
1096
1097 @defun url-history-save-history &optional fname
1098 Saves the current history to file @var{fname} (default
1099 @code{url-history-file}).
1100 @end defun
1101
1102 @defun url-completion-function string predicate function
1103 You can use this function to do completion of URLs from the history.
1104 @end defun
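
Since the function takes the three arguments that
@code{completing-read} passes to a function used as its collection, one
plausible use is the sketch below.  The prompt string is arbitrary, and
this is an assumption about how the function is meant to be called.

@example
(require 'url-history)

;; Prompt for a URL, offering completions from the history.
(completing-read "URL: " 'url-completion-function)
@end example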
1105
1106 @node Customization
1107 @chapter Customization
1108
1109 @section Environment Variables
1110
1111 @cindex environment variables
1112 The following environment variables affect the library's operation at
1113 startup.
1114
1115 @table @code
1116 @item TMPDIR
1117 @vindex TMPDIR
1118 @vindex url-temporary-directory
1119 If this is defined, @code{url-temporary-directory} is initialized from
1120 it.
1121 @end table
1122
1123 @section General User Options
1124
1125 The following user options, settable with Customize, affect the
1126 general operation of the package.
1127
1128 @defopt url-debug
1129 @cindex debugging
1130 Specifies the types of debug messages that are logged to
1131 the @code{*URL-DEBUG*} buffer.
1132 @code{t} means log all messages.
1133 A number means log all messages and show them with @code{message}.
1134 It may also be a list of the types of messages to be logged.
1135 @end defopt
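
For instance, the kinds of value described above could be set as in this
sketch.  The message type @code{http} is an assumption about the types
the library emits.

@example
;; Log everything to the *URL-DEBUG* buffer.
(setq url-debug t)

;; Or log only selected message types.
(setq url-debug '(http))
@end example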
1136 @defopt url-personal-mail-address
1137 @end defopt
1138 @defopt url-privacy-level
1139 @end defopt
1140 @defopt url-uncompressor-alist
1141 @end defopt
1142 @defopt url-passwd-entry-func
1143 @end defopt
1144 @defopt url-standalone-mode
1145 @end defopt
1146 @defopt url-bad-port-list
1147 @end defopt
1148 @defopt url-max-password-attempts
1149 @end defopt
1150 @defopt url-temporary-directory
1151 @end defopt
1152 @defopt url-show-status
1153 @end defopt
1154 @defopt url-confirmation-func
1155 The function to use for asking yes or no questions. This is normally
1156 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1157 function taking a single argument (the prompt) and returning @code{t}
1158 only if an affirmative answer is given.
1159 @end defopt
1160 @defopt url-gateway-method
1161 @c fixme: describe gatewaying
1162 A symbol specifying the type of gateway support to use for connections
1163 from the local machine. The supported methods are:
1164
1165 @table @code
1166 @item telnet
1167 Run telnet in a subprocess to connect;
1168 @item rlogin
1169 Rlogin to another machine to connect;
1170 @item socks
1171 Connect through a socks server;
1172 @item ssl
1173 Connect with SSL;
1174 @item native
1175 Connect directly.
1176 @end table
1177 @end defopt
1178
1179 @node Function Index
1180 @unnumbered Command and Function Index
1181 @printindex fn
1182
1183 @node Variable Index
1184 @unnumbered Variable Index
1185 @printindex vr
1186
1187 @node Concept Index
1188 @unnumbered Concept Index
1189 @printindex cp
1190
1191 @setchapternewpage odd
1192 @contents
1193 @bye
1194
1195 @ignore
1196 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
1197 @end ignore