* doc/misc/url.texi: Fix quote usage in body text.
1 \input texinfo
2 @setfilename ../../info/url
3 @settitle URL Programmer's Manual
4
5 @iftex
6 @c @finalout
7 @end iftex
8 @c @setchapternewpage odd
9 @c @smallbook
10
11 @tex
12 \overfullrule=0pt
13 %\global\baselineskip 30pt % for printing in double space
14 @end tex
15 @dircategory Emacs lisp libraries
16 @direntry
17 * URL: (url). URL loading package.
18 @end direntry
19
20 @copying
21 This file documents the Emacs Lisp URL loading package.
22
23 Copyright @copyright{} 1993-1999, 2002, 2004-2012 Free Software Foundation, Inc.
24
25 @quotation
26 Permission is granted to copy, distribute and/or modify this document
27 under the terms of the GNU Free Documentation License, Version 1.3 or
28 any later version published by the Free Software Foundation; with no
29 Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
30 and with the Back-Cover Texts as in (a) below. A copy of the license
31 is included in the section entitled ``GNU Free Documentation License''.
32
33 (a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
34 modify this GNU manual. Buying copies from the FSF supports it in
35 developing GNU and promoting software freedom.''
36 @end quotation
37 @end copying
38
39 @c
40 @titlepage
41 @title URL Programmer's Manual
42 @subtitle First Edition, URL Version 2.0
43 @author William M. Perry @email{wmperry@@gnu.org}
44 @author David Love @email{fx@@gnu.org}
45 @page
46 @vskip 0pt plus 1filll
47 @insertcopying
48 @end titlepage
49
50 @contents
51
52 @node Top
53 @top URL
54
55 @ifnottex
56 @insertcopying
57 @end ifnottex
58
59 @menu
60 * Getting Started:: Preparing your program to use URLs.
61 * Retrieving URLs:: How to use this package to retrieve a URL.
62 * Supported URL Types:: Descriptions of URL types currently supported.
63 * Defining New URLs:: How to define a URL loader for a new protocol.
64 * General Facilities:: URLs can be cached, accessed via a gateway
65 and tracked in a history list.
66 * Customization:: Variables you can alter.
67 * GNU Free Documentation License:: The license for this documentation.
68 * Function Index::
69 * Variable Index::
70 * Concept Index::
71 @end menu
72
73 @node Getting Started
74 @chapter Getting Started
75 @cindex URLs, definition
76 @cindex URIs
77
78 @dfn{Uniform Resource Locators} (URLs) are a specific form of
79 @dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
80 updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
81 agents.
82
83 URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
84 @var{scheme}s supported by this library are described below.
85 @xref{Supported URL Types}.
86
87 FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
88 IRC and gopher URLs all have the form
89
90 @example
91 @var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
92 @end example
93 @noindent
94 where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
95 @var{userinfo} sometimes takes the form @var{username}:@var{password}
96 but you should beware of the security risks of sending cleartext
97 passwords. @var{hostname} may be a domain name or a dotted decimal
98 address. If the @samp{:@var{port}} is omitted then the library will
99 use the ``well known'' port for that service when accessing URLs. With
100 the possible exception of @code{telnet}, it is rare for ports to be
101 specified, and using a non-standard port may have
102 undesired consequences if a different service is listening on that
103 port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
104 sent). @c , but @xref{Other Variables, url-bad-port-list}.
105 The meaning of the @var{path} component depends on the service.
106
107 @menu
108 * Configuration::
109 * Parsed URLs:: URLs are parsed into vector structures.
110 @end menu
111
112 @node Configuration
113 @section Configuration
114
115 @defvar url-configuration-directory
116 @cindex @file{~/.url}
117 @cindex configuration files
118 The directory in which URL configuration files, the cache etc.,
119 reside. The old default was @file{~/.url}, and this directory
120 is still used if it exists. The new default is a @file{url/}
121 directory in @code{user-emacs-directory}, which is normally
122 @file{~/.emacs.d}.
123 @end defvar
124
125 @node Parsed URLs
126 @section Parsed URLs
127 @cindex parsed URLs
128 The library functions typically operate on @dfn{parsed} versions of
129 URLs. These are actually vectors of the form:
130
131 @example
132 [@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
133 @end example
134
135 @noindent where
136 @table @var
137 @item type
138 is the type of the URL scheme, e.g., @code{http};
139 @item user
140 is the username associated with it, or @code{nil};
141 @item password
142 is the user password associated with it, or @code{nil};
143 @item host
144 is the host name associated with it, or @code{nil};
145 @item port
146 is the port number associated with it, or @code{nil};
147 @item file
148 is the ``file'' part of it, or @code{nil}. This doesn't necessarily
149 actually refer to a file;
150 @item target
151 is the target part, or @code{nil};
152 @item attributes
153 is the attributes associated with it, or @code{nil};
154 @item full
155 is @code{t} for a fully-specified URL, with a host part indicated by
156 @samp{//} after the scheme part.
157 @end table
158
159 @findex url-type
160 @findex url-user
161 @findex url-password
162 @findex url-host
163 @findex url-port
164 @findex url-file
165 @findex url-target
166 @findex url-attributes
167 @findex url-full
168 @findex url-set-type
169 @findex url-set-user
170 @findex url-set-password
171 @findex url-set-host
172 @findex url-set-port
173 @findex url-set-file
174 @findex url-set-target
175 @findex url-set-attributes
176 @findex url-set-full
177 These attributes have accessors named @code{url-@var{part}}, where
178 @var{part} is the name of one of the elements above, e.g.,
179 @code{url-host}. Similarly, there are setters of the form
180 @code{url-set-@var{part}}.
181
182 There are functions for parsing and unparsing between the string and
183 vector forms.
184
185 @defun url-generic-parse-url url
186 Return a parsed version of the string @var{url}.
187 @end defun
188
189 @defun url-recreate-url url
190 @cindex unparsing URLs
191 Recreates a URL string from the parsed @var{url}.
192 @end defun
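
The parsing functions and accessors can be combined as in the
following sketch. The URL is purely illustrative, and the example
assumes that the @code{url-parse} feature provides these functions.

@example
(require 'url-parse)

;; Parse an illustrative URL string into its structured form.
(let ((parsed (url-generic-parse-url
               "http://www.example.com:8080/foo/bar")))
  (list (url-type parsed)    ; scheme, here "http"
        (url-host parsed)    ; host name, here "www.example.com"
        (url-port parsed)    ; port number, here 8080
        ;; Convert the parsed form back into a string.
        (url-recreate-url parsed)))
@end example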
193
194 @node Retrieving URLs
195 @chapter Retrieving URLs
196
197 @defun url-retrieve-synchronously url
198 Retrieve @var{url} synchronously and return a buffer containing the
199 data. @var{url} is either a string or a parsed URL structure. Return
200 @code{nil} if there are no data associated with it (the case for dired,
201 info, or mailto URLs that need no further processing).
202 @end defun
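
As a minimal sketch, a caller might fetch a document and extract its
body as follows. The URL is illustrative, and the search for a blank
line assumes an HTTP response, where the headers precede the body
(@pxref{Dealing with HTTP documents}).

@example
(require 'url)

(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (when buffer
    (unwind-protect
        (with-current-buffer buffer
          (goto-char (point-min))
          ;; Skip the headers, which end at the first blank line.
          (when (re-search-forward "^\r?\n" nil t)
            (buffer-substring (point) (point-max))))
      ;; Discard the retrieval buffer once the text is extracted.
      (kill-buffer buffer))))
@end example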
203
204 @defun url-retrieve url callback &optional cbargs silent no-cookies
205 Retrieve @var{url} asynchronously and call @var{callback} with args
206 @var{cbargs} when finished. The callback is called when the object
207 has been completely retrieved, with the current buffer containing the
208 object and any MIME headers associated with it. @var{url} is either a
209 string or a parsed URL structure. Returns the buffer @var{url} will
210 load into, or @code{nil} if the process has already completed.
211 If the optional argument @var{silent} is non-@code{nil}, suppress
212 progress messages. If the optional argument @var{no-cookies} is
213 non-@code{nil}, do not store or send cookies.
214 @end defun
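
The following sketch shows one possible asynchronous call. The
callback name @code{my-url-report} and the URL are hypothetical. It
assumes, following the docstring of @code{url-retrieve}, that the
callback receives a status property list as its first argument,
ahead of the elements of @var{cbargs}.

@example
(require 'url)

(defun my-url-report (status label)
  "Hypothetical callback: report on the retrieval named LABEL."
  (if (plist-get status :error)
      ;; An :error entry in STATUS signals a failed retrieval.
      (message "%s failed: %S" label (plist-get status :error))
    ;; Otherwise the current buffer holds the headers and the object.
    (message "%s: retrieved %d characters" label (buffer-size))))

;; Start the retrieval; "example" is passed on as LABEL.
(url-retrieve "http://www.example.com/" #'my-url-report '("example"))
@end example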
215
216 @vindex url-queue-parallel-processes
217 @vindex url-queue-timeout
218 @defun url-queue-retrieve url callback &optional cbargs silent no-cookies
219 This acts like the @code{url-retrieve} function, but with limits on
220 the degree of parallelism. The option @code{url-queue-parallel-processes}
221 controls the number of concurrent processes, and the option
222 @code{url-queue-timeout} sets a timeout in seconds.
223 @end defun
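
A sketch of queued retrieval, assuming the function lives in the
@code{url-queue} feature; the limits and URLs below are illustrative
values, not recommendations.

@example
(require 'url-queue)

(setq url-queue-parallel-processes 2   ; at most two fetches at once
      url-queue-timeout 10)            ; abandon a fetch after 10 seconds

(dolist (url '("http://www.example.com/a"
               "http://www.example.com/b"
               "http://www.example.com/c"))
  ;; Pass the URL through CBARGS so the callback can name it.
  (url-queue-retrieve
   url
   (lambda (status fetched-url)
     (message "Finished %s (error: %S)"
              fetched-url (plist-get status :error)))
   (list url)))
@end example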
224
225 @node Supported URL Types
226 @chapter Supported URL Types
227
228 @menu
229 * http/https:: Hypertext Transfer Protocol.
230 * file/ftp:: Local files and FTP archives.
231 * info:: Emacs "Info" pages.
232 * mailto:: Sending email.
233 * news/nntp/snews:: Usenet news.
234 * rlogin/telnet/tn3270:: Remote host connectivity.
235 * irc:: Internet Relay Chat.
236 * data:: Embedded data URLs.
237 * nfs:: Networked File System
238 @c * finger::
239 @c * gopher::
240 @c * netrek::
241 @c * prospero::
242 * cid:: Content-ID.
243 * about::
244 * ldap:: Lightweight Directory Access Protocol
245 * imap:: IMAP mailboxes.
246 * man:: Unix man pages.
247 @end menu
248
249 @node http/https
250 @section @code{http} and @code{https}
251
252 The scheme @code{http} is Hypertext Transfer Protocol. The library
253 supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
254 defined in RFC 1945.) HTTP URLs have the following form, where most of
255 the parts are optional:
256 @example
257 http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
258 @end example
259 @c The @code{:@var{port}} part is optional, and @var{port} defaults to
260 @c 80. The @code{/@var{path}} part, if present, is a slash-separated
261 @c series elements. The @code{?@var{searchpart}}, if present, is the
262 @c query for a search or the content of a form submission. The
263 @c @code{#fragment} part, if present, is a location in the document.
264
265 The scheme @code{https} is a secure version of @code{http}, with
266 transmission via SSL. It is defined in RFC 2818. Its default port is
267 443. This scheme depends on SSL support in Emacs via the
268 @file{ssl.el} library and is actually implemented by forcing the
269 @code{ssl} gateway method to be used. @xref{Gateways in general}.
270
271 @defopt url-honor-refresh-requests
272 This controls honoring of HTTP @samp{Refresh} headers by which
273 servers can direct clients to reload documents from the same URL or a
274 different one. @code{nil} means they will not be honored,
275 @code{t} (the default) means they will always be honored, and
276 otherwise the user will be asked on each request.
277 @end defopt
278
279
280 @menu
281 * Cookies::
282 * HTTP language/coding::
283 * HTTP URL Options::
284 * Dealing with HTTP documents::
285 @end menu
286
287 @node Cookies
288 @subsection Cookies
289
290 @defopt url-cookie-file
291 The file in which cookies are stored, defaulting to @file{cookies} in
292 the directory specified by @code{url-configuration-directory}.
293 @end defopt
294
295 @defopt url-cookie-confirmation
296 Specifies whether confirmation is required to accept cookies.
297 @end defopt
298
299 @defopt url-cookie-multiple-line
300 Specifies whether to put all cookies for the server on one line in the
301 HTTP request to satisfy broken servers like
302 @url{http://www.hotmail.com}.
303 @end defopt
304
305 @defopt url-cookie-trusted-urls
306 A list of regular expressions matching URLs from which to accept
307 cookies always.
308 @end defopt
309
310 @defopt url-cookie-untrusted-urls
311 A list of regular expressions matching URLs from which to reject
312 cookies always.
313 @end defopt
314
315 @defopt url-cookie-save-interval
316 The number of seconds between automatic saves of cookies to disk.
317 Default is one hour.
318 @end defopt
319
320
321 @node HTTP language/coding
322 @subsection Language and Encoding Preferences
323
324 HTTP allows clients to express preferences for the language and
325 encoding of documents which servers may honor. For each of these
326 variables, the value is a string; it can specify a single choice, or
327 it can be a comma-separated list.
328
329 Normally, this list is ordered by descending preference. However, each
330 element can be followed by @samp{;q=@var{priority}} to specify its
331 preference level, a decimal number from 0 to 1; e.g., for
332 @code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
333 en;q=0.7"}}. An element that has no @samp{;q} specification has
334 preference level 1.
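
For instance, the weighted list quoted above could be installed as
follows. This is only a sketch; the particular languages and weights
are illustrative.

@example
;; Prefer German; accept British English, then any English,
;; with decreasing preference levels.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")
@end example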
335
336 @defopt url-mime-charset-string
337 @cindex character sets
338 @cindex coding systems
339 This variable specifies a preference for character sets when documents
340 can be served in more than one encoding.
341
342 HTTP allows specifying a series of MIME charsets which indicate your
343 preferred character set encodings, e.g., Latin-9 or Big5, and these
344 can be weighted. The default series is generated automatically from
345 the associated MIME types of all defined coding systems, sorted by the
346 coding system priority specified in Emacs. @xref{Recognize Coding, ,
347 Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
348 @end defopt
349
350 @defopt url-mime-language-string
351 @cindex language preferences
352 A string specifying the preferred language when servers can serve
353 files in several languages. Use RFC 1766 abbreviations, e.g.,
354 @samp{en} for English, @samp{de} for German.
355
356 The string can be @code{"*"} to get the first available language (as
357 opposed to the default).
358 @end defopt
359
360 @node HTTP URL Options
361 @subsection HTTP URL Options
362
363 HTTP supports an @samp{OPTIONS} method describing things supported by
364 the URL@.
365
366 @defun url-http-options url
367 Returns a property list describing options available for @var{url}. The
368 property list members are:
369
370 @table @code
371 @item methods
372 A list of symbols specifying what HTTP methods the resource
373 supports.
374
375 @item dav
376 @cindex DAV
377 A list of numbers specifying what DAV protocol/schema versions are
378 supported.
379
380 @item dasl
381 @cindex DASL
382 A list of supported DASL search types (string form).
383
384 @item ranges
385 A list of the units available for use in partial document fetches.
386
387 @item p3p
388 @cindex P3P
389 The @dfn{Platform for Privacy Preferences} description for the resource.
390 Currently this is just the raw header contents.
391 @end table
392
393 @end defun
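
A brief sketch of reading the returned property list. It assumes
that the function is provided by the @code{url-http} feature and that
the server at the illustrative URL answers @samp{OPTIONS} requests.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  ;; Look up individual members by their symbol keys.
  (list (plist-get options 'methods)
        (plist-get options 'ranges)))
@end example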
394
395 @node Dealing with HTTP documents
396 @subsection Dealing with HTTP documents
397
398 HTTP URLs are retrieved into a buffer containing the HTTP headers
399 followed by the body. Since the headers are quasi-MIME, they may be
400 processed using the MIME library. @xref{Top,, Emacs MIME,
401 emacs-mime, The Emacs MIME Manual}.
402
403 @node file/ftp
404 @section file and ftp
405 @cindex files
406 @cindex FTP
407 @cindex File Transfer Protocol
408 @cindex compressed files
409 @cindex dired
410
411 @example
412 ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
413 file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
414 @end example
415
416 These schemes are defined in RFC 1808.
417 @samp{ftp:} and @samp{file:} are synonymous in this library. They
418 allow reading arbitrary files from hosts. Either @samp{ange-ftp}
419 (Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
420 hosts. Local files are accessed directly.
421
422 Compressed files are handled, but support is hard-coded so that
423 @code{jka-compr-compression-info-list} and so on have no effect.
424 Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
425 @samp{.bz2}.
426
427 @defopt url-directory-index-file
428 The filename to look for when indexing a directory, default
429 @samp{"index.html"}. If this file exists, and is readable, then it
430 will be viewed instead of using @code{dired} to view the directory.
431 @end defopt
432
433 @node info
434 @section info
435 @cindex Info
436 @cindex Texinfo
437 @findex Info-goto-node
438
439 @example
440 info:@var{file}#@var{node}
441 @end example
442
443 Info URLs are not officially defined. They invoke
444 @code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
445 @samp{#@var{node}} is optional, defaulting to @samp{Top}.
446
447 @node mailto
448 @section mailto
449
450 @cindex mailto
451 @cindex email
452 A mailto URL sends an email message to the address in the
453 URL; for example, @samp{mailto:foo@@bar.com} would compose a
454 message to @samp{foo@@bar.com}.
455
456 @defopt url-mail-command
457 @vindex mail-user-agent
458 The function called whenever url needs to send mail. This should
459 normally be left to default from @code{mail-user-agent}. @xref{Mail
460 Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
461 @end defopt
462
463 An @samp{X-Url-From} header field containing the URL of the document
464 that contained the mailto URL is added if that URL is known.
465
466 RFC 2368 extends the definition of mailto URLs in RFC 1738.
467 The form of a mailto URL is
468 @example
469 @samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
470 @end example
471 @noindent where an arbitrary number of @var{header}s can be added. If the
472 @var{header} is @samp{body}, then @var{contents} is put in the body;
473 otherwise a @var{header} header field is created with @var{contents}
474 as its contents. Note that the URL library does not consider any
475 headers ``dangerous'' so you should check them before sending the
476 message.
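
As an illustration, a mailto URL carrying a subject and a body could
be assembled as below. The address and texts are made up, and the
sketch assumes @code{url-hexify-string} from the @code{url-util}
feature for percent-encoding the header contents.

@example
(require 'url-util)

;; Percent-encode each value so that spaces and other reserved
;; characters survive inside the URL (a space becomes %20).
(concat "mailto:foo@@bar.com"
        "?subject=" (url-hexify-string "Meeting notes")
        "&body="    (url-hexify-string "See you at ten"))
@end example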
477
478 @c Fixme: update
479 Email messages are defined in RFC 822.
480
481 @node news/nntp/snews
482 @section @code{news}, @code{nntp} and @code{snews}
483 @cindex news
484 @cindex network news
485 @cindex usenet
486 @cindex NNTP
487 @cindex snews
488
489 @c draft-gilman-news-url-01
490 The network news URL schemes take the following forms, following RFC
491 1738, except that for compatibility with other clients, host and port
492 fields may be included in news URLs though they are properly only
493 allowed for nntp and snews.
494
495 @table @samp
496 @item news:@var{newsgroup}
497 Retrieves a list of messages in @var{newsgroup};
498 @item news:@var{message-id}
499 Retrieves the message with the given @var{message-id};
500 @item news:*
501 Retrieves a list of all available newsgroups;
502 @item nntp://@var{host}:@var{port}/@var{newsgroup}
503 @itemx nntp://@var{host}:@var{port}/@var{message-id}
504 @itemx nntp://@var{host}:@var{port}/*
505 Similar to the @samp{news} versions.
506 @end table
507
508 @samp{:@var{port}} is optional and defaults to :119.
509
510 @samp{snews} is the same as @samp{nntp} except that the default port
511 is :563.
512 @cindex SSL
513 (It is tunneled through SSL.)
514
515 An @samp{nntp} URL is the same as a news URL, except that the URL may
516 specify an article by its number.
517
518 @defopt url-news-server
519 This variable can be used to override the default news server.
520 Usually this will be set by the Gnus package, which is used to fetch
521 news.
522 @cindex environment variable
523 @vindex NNTPSERVER
524 It may be set from the conventional environment variable
525 @code{NNTPSERVER}.
526 @end defopt
527
528 @node rlogin/telnet/tn3270
529 @section rlogin, telnet and tn3270
530 @cindex rlogin
531 @cindex telnet
532 @cindex tn3270
533 @cindex terminal emulation
534 @findex terminal-emulator
535
536 These URL schemes from RFC 1738 for logon via a terminal emulator have
537 the form
538 @example
539 telnet://@var{user}:@var{password}@@@var{host}:@var{port}
540 @end example
541 but the @code{:@var{password}} component is ignored.
542
543 To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
544 @code{telnet} or @code{tn3270} (the program names and arguments are
545 hardcoded) session is run in a @code{terminal-emulator} buffer.
546 Well-known ports are used if the URL does not specify a port.
547
548 @node irc
549 @section irc
550 @cindex IRC
551 @cindex Internet Relay Chat
552 @cindex ZEN IRC
553 @cindex ERC
554 @cindex rcirc
555 @c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
556 @dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
557 session to a function named in @code{url-irc-function}.
558
559 @defopt url-irc-function
560 A function to actually open an IRC connection.
561 This function
562 must take five arguments, @var{host}, @var{port}, @var{channel},
563 @var{user} and @var{password}. The @var{channel} argument specifies the
564 channel to join immediately; it can be @code{nil}. The default function is
565 @code{url-irc-rcirc}.
566 @end defopt
567 @defun url-irc-rcirc host port channel user password
568 Processes the arguments and lets @code{rcirc} handle the session.
569 @end defun
570 @defun url-irc-erc host port channel user password
571 Processes the arguments and lets @code{ERC} handle the session.
572 @end defun
573 @defun url-irc-zenirc host port channel user password
574 Processes the arguments and lets @code{zenirc} handle the session.
575 @end defun
576
577 @node data
578 @section data
579 @cindex data URLs
580
581 @example
582 data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
583 @end example
584
585 Data URLs contain MIME data in the URL itself. They are defined in
586 RFC 2397.
587
588 @var{media-type} is a MIME @samp{Content-Type} string, possibly
589 including parameters. It defaults to
590 @samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
591 omitted but the charset parameter supplied. If @samp{;base64} is
592 present, the @var{data} are base64-encoded.
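
A small sketch of building a base64 data URL by hand, using the
built-in @code{base64-encode-string}; the payload is illustrative.

@example
;; Encode the payload and prefix it with the media type.
(concat "data:text/plain;base64,"
        (base64-encode-string "Hello"))
     @result{} "data:text/plain;base64,SGVsbG8="
@end example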
593
594 @node nfs
595 @section nfs
596 @cindex NFS
597 @cindex Network File System
598 @cindex automounter
599
600 @example
601 nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
602 @end example
603
604 The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
605 @samp{ftp:} except that it points to a file on a remote host that is
606 handled by the automounter on the local host.
607
608 @defvar url-nfs-automounter-directory-spec
609 A string saying how to invoke the NFS automounter. Certain @samp{%}
610 sequences are recognized:
611
612 @table @samp
613 @item %h
614 The hostname of the NFS server;
615 @item %n
616 The port number of the NFS server;
617 @item %u
618 The username to use to authenticate;
619 @item %p
620 The password to use to authenticate;
621 @item %f
622 The filename on the remote server;
623 @item %%
624 A literal @samp{%}.
625 @end table
626
627 Each can be used any number of times.
628 @end defvar
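
As a hypothetical setting, on a system whose automounter exposes
remote hosts under @file{/net}, the specification might look like
this. The @file{/net} layout is an assumption about the local
system, not something the library requires.

@example
;; %h expands to the server's host name, %f to the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example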
629
630 @node cid
631 @section cid
632 @cindex Content-ID
633
634 RFC 2111
635
636 @node about
637 @section about
638
639 @node ldap
640 @section ldap
641 @cindex LDAP
642 @cindex Lightweight Directory Access Protocol
643
644 The LDAP scheme is defined in RFC 2255.
645
646 @node imap
647 @section imap
648 @cindex IMAP
649
650 RFC 2192
651
652 @node man
653 @section man
654 @cindex @command{man}
655 @cindex Unix man pages
656 @findex man
657
658 @example
659 @samp{man:@var{page-spec}}
660 @end example
661
662 This is a non-standard scheme. @var{page-spec} is passed directly to
663 the Lisp @code{man} function.
664
665 @node Defining New URLs
666 @chapter Defining New URLs
667
668 @menu
669 * Naming conventions::
670 * Required functions::
671 * Optional functions::
672 * Asynchronous fetching::
673 * Supporting file-name-handlers::
674 @end menu
675
676 @node Naming conventions
677 @section Naming conventions
678
679 @node Required functions
680 @section Required functions
681
682 @node Optional functions
683 @section Optional functions
684
685 @node Asynchronous fetching
686 @section Asynchronous fetching
687
688 @node Supporting file-name-handlers
689 @section Supporting file-name-handlers
690
691 @node General Facilities
692 @chapter General Facilities
693
694 @menu
695 * Disk Caching::
696 * Proxies::
697 * Gateways in general::
698 * History::
699 @end menu
700
701 @node Disk Caching
702 @section Disk Caching
703 @cindex Caching
704 @cindex Persistent Cache
705 @cindex Disk Cache
706
707 The disk cache stores retrieved documents locally, whence they can be
708 retrieved more quickly. When requesting a URL that is in the cache,
709 the library checks to see if the page has changed since it was last
710 retrieved from the remote machine. If not, the local copy is used,
711 saving the transmission over the network.
712 @cindex Cleaning the cache
713 @cindex Clearing the cache
714 @cindex Cache cleaning
715 Currently the cache isn't cleared automatically.
716 @c Running the @code{clean-cache} shell script
717 @c fist is recommended, to allow for future cleaning of the cache. This
718 @c shell script will remove all files that have not been accessed since it
719 @c was last run. To keep the cache pared down, it is recommended that this
720 @c script be run from @i{at} or @i{cron} (see the manual pages for
721 @c crontab(5) or at(1) for more information)
722
723 @defopt url-automatic-caching
724 Setting this variable non-@code{nil} causes documents to be cached
725 automatically.
726 @end defopt
727
728 @defopt url-cache-directory
729 This variable specifies the
730 directory to store the cache files. It defaults to sub-directory
731 @file{cache} of @code{url-configuration-directory}.
732 @end defopt
733
734 @defopt url-cache-creation-function
735 The cache relies on a scheme for mapping URLs to files in the cache.
736 This variable names a function which sets the type of cache to use.
737 It takes a URL as argument and returns the absolute file name of the
738 corresponding cache file. The two supplied possibilities are
739 @code{url-cache-create-filename-using-md5} and
740 @code{url-cache-create-filename-human-readable}.
741 @end defopt
742
743 @defun url-cache-create-filename-using-md5 url
744 Creates a cache file name from @var{url} using MD5 hashing.
745 This creates entries with very few cache collisions and is fast.
746 @cindex MD5
747 @smallexample
748 (url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
749 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
750 @end smallexample
751 @end defun
752
753 @defun url-cache-create-filename-human-readable url
754 Creates a cache file name from @var{url} more obviously connected to
755 @var{url} than for @code{url-cache-create-filename-using-md5}, but
756 more likely to conflict with other files.
757 @smallexample
758 (url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
759 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
760 @end smallexample
761 @end defun
762
763 @defun url-cache-expired url &optional expire-time
764 This function returns non-@code{nil} if a cache entry has expired (or is absent).
765 The arguments are a URL and optional expiration delay in seconds
766 (default @code{url-cache-expire-time}).
767 @end defun
768
769 @defopt url-cache-expire-time
770 This variable is the default number of seconds to use for the
771 expire-time argument of the function @code{url-cache-expired}.
772 @end defopt
773
774 @defun url-fetch-from-cache url
775 This function takes a URL as its argument and returns a buffer
776 containing the data cached for that URL.
777 @end defun
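
The cache functions above might be combined as in this sketch, which
assumes they are provided by the @code{url-cache} feature. The URL
and the one-hour expiration delay are illustrative.

@example
(require 'url)
(require 'url-cache)

(let ((url "http://www.example.com/foo/bar"))
  (if (url-cache-expired url 3600)
      ;; No usable cache entry: fetch the document again.
      (url-retrieve-synchronously url)
    ;; A fresh entry exists: read it from the disk cache.
    (url-fetch-from-cache url)))
@end example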
778
779 @c Fixme: never actually used currently?
780 @c @defopt url-standalone-mode
781 @c @cindex Relying on cache
782 @c @cindex Cache only mode
783 @c @cindex Standalone mode
784 @c If this variable is non-@code{nil}, the library relies solely on the
785 @c cache for fetching documents and avoids checking if they have changed
786 @c on remote servers.
787 @c @end defopt
788
789 @c With a large cache of documents on the local disk, it can be very handy
790 @c when traveling, or any other time the network connection is not active
791 @c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
792 @c solely on its cache, and avoid checking to see if the page has changed
793 @c on the remote server. In the case of a dial-on-demand PPP connection,
794 @c this will keep the phone line free as long as possible, only bringing up
795 @c the PPP connection when asking for a page that is not located in the
796 @c cache. This is very useful for demonstrations as well.
797
798 @node Proxies
799 @section Proxies and Gatewaying
800
801 @c fixme: check/document url-ns stuff
802 @cindex proxy servers
803 @cindex proxies
804 @cindex environment variables
805 @vindex HTTP_PROXY
806 Proxy servers are commonly used to provide gateways through firewalls
807 or as caches serving some more-or-less local network. Each protocol
808 (HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
809 conventionally configured, in a way shared by many programs, through
810 environment variables of the form @code{@var{protocol}_proxy}, where
811 @var{protocol} is one of the supported network protocols (@code{http},
812 @code{ftp} etc.). The library recognizes such variables in either
813 upper or lower case. Their values are of one of the forms:
814 @itemize @bullet
815 @item @code{@var{host}:@var{port}};
816 @item A full URL;
817 @item Simply a host name.
818 @end itemize
819
820 @vindex NO_PROXY
821 The @code{NO_PROXY} environment variable specifies URLs that should be
822 excluded from proxying (on servers that should be contacted directly).
823 This should be a comma-separated list of hostnames, domain names, or a
824 mixture of both. Asterisks can be used as wildcards, but other
825 clients may not support that. Domain names may be indicated by a
826 leading dot. For example:
827 @example
828 NO_PROXY="*.aventail.com,home.com,.seanet.com"
829 @end example
830 @noindent says to contact all machines in the @samp{aventail.com} and
831 @samp{seanet.com} domains directly, as well as the machine named
832 @samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
833 and @code{no_proxy} are also tried, in that order.
834
835 Proxies may also be specified directly in Lisp.
836
837 @defopt url-proxy-services
838 This variable is an alist of URL schemes and proxy servers that
839 gateway them. The items are of the form @w{@code{(@var{scheme}
840 . @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
841 gatewayed through @var{portnumber} on the specified @var{host}. An
842 exception is the pseudo scheme @code{"no_proxy"}, which is paired with
843 a regexp matching host names not to be proxied. This variable is
844 initialized from the environment as above.
845
846 @example
847 (setq url-proxy-services
848       '(("http"     . "proxy.aventail.com:80")
849         ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
850 @end example
851 @end defopt
852
853 @node Gateways in general
854 @section Gateways in General
855 @cindex gateways
856 @cindex firewalls
857
858 The library provides a general gateway layer through which all
859 networking passes. It can both control access to the network and
860 provide access through gateways in firewalls. This may make direct
861 connections in some cases and pass through some sort of gateway in
862 others.@footnote{Proxies (which only operate over HTTP) are
863 implemented using this.} The library's basic function responsible for
864 making connections is @code{url-open-stream}.
865
866 @defun url-open-stream name buffer host service
867 @cindex opening a stream
868 @cindex stream, opening
869 Open a stream to @var{host}, possibly via a gateway. The other
870 arguments are as for @code{open-network-stream}. This will not make a
871 connection if @code{url-gateway-unplugged} is non-@code{nil}.
872 @end defun
873
874 @defvar url-gateway-local-host-regexp
875 This is a regular expression that matches local hosts that do not
876 require the use of a gateway. If @code{nil}, all connections are made
877 through the gateway.
878 @end defvar
879
880 @defvar url-gateway-method
881 This variable controls which gateway method is used. It may be useful
882 to bind it temporarily in some applications. It has values taken from
883 a list of symbols. Possible values are:
884
885 @table @code
886 @item telnet
887 @cindex @command{telnet}
888 Use this method if you must first telnet and log into a gateway host,
889 and then run telnet from that host to connect to outside machines.
890
891 @item rlogin
892 @cindex @command{rlogin}
893 This method is identical to @code{telnet}, but uses @command{rlogin}
894 to log into the remote machine without having to send the username and
895 password over the wire every time.
896
897 @item socks
898 @cindex @sc{socks}
899 Use if the firewall has a @sc{socks} gateway running on it. The
900 @sc{socks} v5 protocol is defined in RFC 1928.
901
902 @c @item ssl
903 @c This probably shouldn't be documented
904 @c Fixme: why not? -- fx
905
906 @item native
907 This method uses Emacs's built-in networking directly. This is the
908 default. It can be used only if there is no firewall blocking access.
909 @end table
910 @end defvar
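
A minimal sketch of a temporary binding follows. It assumes a
@sc{socks} server has already been configured, and the URL is only a
placeholder.

@example
(require 'url)

;; Use the SOCKS gateway for this one retrieval only; the global
;; value of `url-gateway-method' is left untouched.
(let ((url-gateway-method 'socks))
  (url-retrieve-synchronously "http://www.gnu.org/"))
@end example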
911
912 The following variables control the gateway methods.
913
914 @defopt url-gateway-telnet-host
915 The gateway host to telnet to. Once logged in there, you then telnet
916 out to the hosts you want to connect to.
917 @end defopt
918 @defopt url-gateway-telnet-parameters
919 This should be a list of parameters to pass to the @command{telnet} program.
920 @end defopt
921 @defopt url-gateway-telnet-password-prompt
922 This is a regular expression that matches the password prompt when
923 logging in.
924 @end defopt
925 @defopt url-gateway-telnet-login-prompt
926 This is a regular expression that matches the username prompt when
927 logging in.
928 @end defopt
929 @defopt url-gateway-telnet-user-name
930 The username to log in with.
931 @end defopt
932 @defopt url-gateway-telnet-password
933 The password to send when logging in.
934 @end defopt
935 @defopt url-gateway-prompt-pattern
936 This is a regular expression that matches the shell prompt.
937 @end defopt
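
Taken together, a telnet gateway setup might look like the following
sketch. The host, user name, and prompt regexps are invented for
illustration and need adjusting to the actual gateway. The password is
left out of this sketch so that it is not stored in an init file.

@example
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jdoe"
      ;; Regexps for the gateway's prompts (assumed forms).
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      url-gateway-prompt-pattern "^[^#$%>\n]*[#$%>] *")
@end example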
938
939 @defopt url-gateway-rlogin-host
940 Host to @samp{rlogin} to before telnetting out.
941 @end defopt
942 @defopt url-gateway-rlogin-parameters
943 Parameters to pass to @samp{rsh}.
944 @end defopt
945 @defopt url-gateway-rlogin-user-name
946 User name to use when logging in to the gateway.
947 @end defopt
948 @defopt url-gateway-prompt-pattern
949 This is a regular expression that matches the shell prompt.
950 @end defopt
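
An @command{rlogin} gateway might be configured as in this sketch, where
the host and user name are placeholders.

@example
(setq url-gateway-method 'rlogin
      url-gateway-rlogin-host "gateway.example.com"
      url-gateway-rlogin-user-name "jdoe")
@end example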
951
952 @defopt socks-server
953 This specifies the default server. It takes the form
954 @w{@code{("Default server" @var{server} @var{port} @var{version})}}
955 where @var{version} can be either 4 or 5.
956 @end defopt
957 @defvar socks-password
958 If this is @code{nil} then you will be asked for the password,
959 otherwise it will be used as the password for authenticating you to
960 the @sc{socks} server.
961 @end defvar
962 @defvar socks-username
963 This is the username to use when authenticating yourself to the
964 @sc{socks} server. By default this is your login name.
965 @end defvar
966 @defvar socks-timeout
967 This controls how long, in seconds, to wait for responses from the
968 @sc{socks} server; it is 5 by default.
969 @end defvar
970 @c fixme: these have been effectively commented-out in the code
971 @c @defopt socks-server-aliases
972 @c This a list of server aliases. It is a list of aliases of the form
973 @c @var{(alias hostname port version)}.
974 @c @end defopt
975 @c @defopt socks-network-aliases
976 @c This a list of network aliases. Each entry in the list takes the form
977 @c @var{(alias (network))} where @var{alias} is a string that names the
978 @c @var{network}. The networks can contain a pair (not a dotted pair) of
979 @c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
980 @c address and a netmask, a domain name or a unique hostname or @sc{ip}
981 @c address.
982 @c @end defopt
983 @c @defopt socks-redirection-rules
984 @c This a list of redirection rules. Each rule take the form
985 @c @var{(Destination network Connection type)} where @var{Destination
986 @c network} is a network alias from @code{socks-network-aliases} and
987 @c @var{Connection type} can be @code{nil} in which case a direct
988 @c connection is used, or it can be an alias from
989 @c @code{socks-server-aliases} in which case that server is used as a
990 @c proxy.
991 @c @end defopt
992 @defopt socks-nslookup-program
993 @cindex @command{nslookup}
994 This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
995 @end defopt
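
Putting the @sc{socks} variables together, a configuration might read as
follows. The server name and port are assumptions made for this sketch.

@example
(require 'socks)

(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION)
      socks-server '("Default server" "socks.example.com" 1080 5)
      ;; Wait up to ten seconds for the server to answer.
      socks-timeout 10)
@end example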
996
997 @menu
998 * Suppressing network connections::
999 @end menu
1000 @c * Broken hostname resolution::
1001
1002 @node Suppressing network connections
1003 @subsection Suppressing Network Connections
1004
1005 @cindex network connections, suppressing
1006 @cindex suppressing network connections
1007 @cindex bugs, HTML
1008 @cindex HTML `bugs'
1009 In some circumstances it is desirable to suppress making network
1010 connections. A typical case is when rendering HTML in a mail user
1011 agent, when external URLs should not be activated, particularly to
1012 avoid ``bugs'' which ``call home'' by fetching single-pixel images and the
1013 like. To arrange this, bind the following variable for the duration
1014 of such processing.
1015
1016 @defvar url-gateway-unplugged
1017 If this variable is non-@code{nil}, new network connections are never
1018 opened by the URL library.
1019 @end defvar
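
One way to arrange such a binding is sketched below. The helper
@code{my-call-unplugged} is hypothetical and not part of the library; it
simply wraps an arbitrary function call.

@example
(require 'url)

(defun my-call-unplugged (function &rest args)
  "Call FUNCTION with ARGS while URL network connections are suppressed."
  ;; The binding lasts only for the dynamic extent of the call.
  (let ((url-gateway-unplugged t))
    (apply function args)))
@end example

A mail user agent could then pass its HTML rendering function to this
helper instead of calling it directly.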
1020
1021 @c @node Broken hostname resolution
1022 @c @subsection Broken Hostname Resolution
1023
1024 @c @cindex hostname resolver
1025 @c @cindex resolver, hostname
1026 @c Some C libraries do not include the hostname resolver routines in
1027 @c their static libraries. If Emacs was linked statically, and was not
1028 @c linked with the resolver libraries, it will not be able to get to any
1029 @c machines off the local network. This is characterized by being able
1030 @c to reach someplace with a raw ip number, but not its hostname
1031 @c (@url{http://129.79.254.191/} works, but
1032 @c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1033 @c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1034 @c rebuilt linked against the resolver library, it can use the external
1035 @c @command{nslookup} program instead.
1036
1037 @c @defopt url-gateway-broken-resolution
1038 @c @cindex @code{nslookup} program
1039 @c @cindex program, @code{nslookup}
1040 @c If non-@code{nil}, this variable says to use the program specified by
1041 @c @code{url-gateway-nslookup-program} program to do hostname resolution.
1042 @c @end defopt
1043
1044 @c @defopt url-gateway-nslookup-program
1045 @c The name of the program to do hostname lookup if Emacs can't do it
1046 @c directly. This program should expect a single argument on the command
1047 @c line---the hostname to resolve---and should produce output similar to
1048 @c the standard Unix @command{nslookup} program:
1049 @c @example
1050 @c Name: www.cs.indiana.edu
1051 @c Address: 129.79.254.191
1052 @c @end example
1053 @c @end defopt
1054
1055 @node History
1056 @section History
1057
1058 @findex url-do-setup
1059 The library can maintain a global history list tracking URLs accessed.
1060 URL completion can be done from it. The history mechanism is set up
1061 automatically via @code{url-do-setup} when it is configured to be on.
1062 Note that the size of the history list is currently not limited.
1063
1064 @vindex url-history-hash-table
1065 The history ``list'' is actually a hash table,
1066 @code{url-history-hash-table}. It contains access times keyed by URL
1067 strings. The times are in the format returned by @code{current-time}.
1068
1069 @defun url-history-update-url url time
1070 This function updates the history table with an entry for @var{url}
1071 accessed at the given @var{time}.
1072 @end defun
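
As a small sketch, the following records an access and then reads the
entry back from the table. The URL is a placeholder.

@example
(require 'url-history)

;; Record an access at the current time, then look it up again.
(url-history-update-url "http://www.gnu.org/" (current-time))
(let ((time (gethash "http://www.gnu.org/" url-history-hash-table)))
  (when time
    (format-time-string "%Y-%m-%d %H:%M:%S" time)))
@end example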
1073
1074 @defopt url-history-track
1075 If non-@code{nil}, the library will keep track of all the URLs
1076 accessed. If it is @code{t}, the list is saved to disk at the end of
1077 each Emacs session. The default is @code{nil}.
1078 @end defopt
1079
1080 @defopt url-history-file
1081 The file storing the history list between sessions. It defaults to
1082 @file{history} in @code{url-configuration-directory}.
1083 @end defopt
1084
1085 @defopt url-history-save-interval
1086 @findex url-history-setup-save-timer
1087 The number of seconds between automatic saves of the history list.
1088 Default is one hour. Note that if you change this variable directly,
1089 rather than using Custom, after @code{url-do-setup} has been run, you
1090 need to run the function @code{url-history-setup-save-timer}.
1091 @end defopt
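
For example, history tracking with a shorter save interval might be set
up from Lisp roughly as follows. The interval of ten minutes is an
arbitrary choice for this sketch.

@example
(require 'url-history)

(setq url-history-track t
      url-history-save-interval 600)   ; ten minutes
;; Restart the timer, since the interval was set with `setq'
;; rather than through Custom.
(url-history-setup-save-timer)
@end example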
1092
1093 @defun url-history-parse-history &optional fname
1094 Parses the history file @var{fname} (default @code{url-history-file})
1095 and sets up the history list.
1096 @end defun
1097
1098 @defun url-history-save-history &optional fname
1099 Saves the current history to file @var{fname} (default
1100 @code{url-history-file}).
1101 @end defun
1102
1103 @defun url-completion-function string predicate function
1104 You can use this function to do completion of URLs from the history.
1105 @end defun
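
The sketch below assumes that the function follows the usual convention
for programmed completion, in which a third argument of @code{t} asks
for all completions. The URL and prefix are placeholders.

@example
(require 'url)
(require 'url-history)

(url-history-update-url "http://www.gnu.org/" (current-time))
;; Ask for every history entry that starts with the prefix.
(url-completion-function "http://www.g" nil t)
@end example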
1106
1107 @node Customization
1108 @chapter Customization
1109
1110 @section Environment Variables
1111
1112 @cindex environment variables
1113 The following environment variables affect the library's operation at
1114 startup.
1115
1116 @table @code
1117 @item TMPDIR
1118 @vindex TMPDIR
1119 @vindex url-temporary-directory
1120 If this is defined, @code{url-temporary-directory} is initialized from
1121 it.
1122 @end table
1123
1124 @section General User Options
1125
1126 The following user options, settable with Customize, affect the
1127 general operation of the package.
1128
1129 @defopt url-debug
1130 @cindex debugging
1131 Specifies the types of debug messages which are logged to
1132 the @code{*URL-DEBUG*} buffer.
1133 @code{t} means log all messages.
1134 A number means log all messages and show them with @code{message}.
1135 It may also be a list of the types of messages to be logged.
1136 @end defopt
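
The two settings below sketch typical values. The message type symbols
in the second form are assumed examples and may not match every version
of the library.

@example
;; Log every debug message to the *URL-DEBUG* buffer.
(setq url-debug t)

;; Or log only selected message types (assumed type names).
(setq url-debug '(http retrieval))
@end example
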
1137 @defopt url-personal-mail-address
1138 @end defopt
1139 @defopt url-privacy-level
1140 @end defopt
1141 @defopt url-uncompressor-alist
1142 @end defopt
1143 @defopt url-passwd-entry-func
1144 @end defopt
1145 @defopt url-standalone-mode
1146 @end defopt
1147 @defopt url-bad-port-list
1148 @end defopt
1149 @defopt url-max-password-attempts
1150 @end defopt
1151 @defopt url-temporary-directory
1152 @end defopt
1153 @defopt url-show-status
1154 @end defopt
1155 @defopt url-confirmation-func
1156 The function to use for asking yes or no questions. This is normally
1157 either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1158 function taking a single argument (the prompt) and returning @code{t}
1159 only if an affirmative answer is given.
1160 @end defopt
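
For instance, to insist on a fully typed answer, one might set:

@example
;; Require "yes" or "no" to be typed out rather than a single key.
(setq url-confirmation-func #'yes-or-no-p)
@end example
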
1161 @defopt url-gateway-method
1162 @c fixme: describe gatewaying
1163 A symbol specifying the type of gateway support to use for connections
1164 from the local machine. The supported methods are:
1165
1166 @table @code
1167 @item telnet
1168 Run telnet in a subprocess to connect;
1169 @item rlogin
1170 Rlogin to another machine to connect;
1171 @item socks
1172 Connect through a socks server;
1173 @item ssl
1174 Connect with SSL;
1175 @item native
1176 Connect directly.
1177 @end table
1178 @end defopt
1179
1180 @node GNU Free Documentation License
1181 @appendix GNU Free Documentation License
1182 @include doclicense.texi
1183
1184 @node Function Index
1185 @unnumbered Command and Function Index
1186 @printindex fn
1187
1188 @node Variable Index
1189 @unnumbered Variable Index
1190 @printindex vr
1191
1192 @node Concept Index
1193 @unnumbered Concept Index
1194 @printindex cp
1195
1196 @bye