948a35c1 1\input texinfo
89281a95 2@setfilename ../info/url
3@settitle URL Programmer's Manual
4
5@iftex
6@c @finalout
7@end iftex
8@c @setchapternewpage odd
9@c @smallbook
10
11@tex
12\overfullrule=0pt
13%\global\baselineskip 30pt % for printing in double space
14@end tex
15@dircategory World Wide Web
16@dircategory GNU Emacs Lisp
17@direntry
18* URL: (url). URL loading package.
19@end direntry
20
21@ifnottex
22This file documents the URL loading package.
23
24Copyright (C) 1996, 1997, 1998, 1999, 2002, 2004 Free Software Foundation
25Copyright (C) 1993, 1994, 1995, 1996 William M. Perry
26
27Permission is granted to copy, distribute and/or modify this document
28under the terms of the GNU Free Documentation License, Version 1.1 or
29any later version published by the Free Software Foundation; with the
a17e377e 30Invariant Sections being
31``GNU GENERAL PUBLIC LICENSE''. A copy of the
32license is included in the section entitled ``GNU Free Documentation
33License.''
34@end ifnottex
35
36@c
37@titlepage
38@sp 6
39@center @titlefont{URL}
40@center @titlefont{Programmer's Manual}
41@sp 4
42@center First Edition, URL Version 2.0
43@sp 1
44@c @center December 1999
45@sp 5
46@center William M. Perry
47@center @email{wmperry@@gnu.org}
48@center David Love
49@center @email{fx@@gnu.org}
50@page
51@vskip 0pt plus 1filll
52Copyright @copyright{} 1993, 1994, 1995, 1996 William M. Perry@*
53Copyright @copyright{} 1996, 1997, 1998, 1999, 2002 Free Software Foundation
54
55Permission is granted to copy, distribute and/or modify this document
56under the terms of the GNU Free Documentation License, Version 1.1 or
57any later version published by the Free Software Foundation; with the
58Invariant Sections being
59``GNU GENERAL PUBLIC LICENSE''. A copy of the
60license is included in the section entitled ``GNU Free Documentation
61License.''
62@end titlepage
63@page
64@node Top
65@top URL
66
67
68
69@menu
70* Getting Started:: Preparing your program to use URLs.
71* Retrieving URLs:: How to use this package to retrieve a URL.
72* Supported URL Types:: Descriptions of URL types currently supported.
73* Defining New URLs:: How to define a URL loader for a new protocol.
74* General Facilities:: URLs can be cached, accessed via a gateway
75 and tracked in a history list.
76* Customization:: Variables you can alter.
77* Function Index::
78* Variable Index::
79* Concept Index::
80@end menu
81
82@node Getting Started
83@chapter Getting Started
84@cindex URLs, definition
85@cindex URIs
86
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808.  RFC 2016 defines uniform resource
agents.
91
92URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
93@var{scheme}s supported by this library are described below.
94@xref{Supported URL Types}.
95
FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
IRC and gopher URLs all have the form
98
99@example
100@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
101@end example
102@noindent
103where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
104@var{userinfo} sometimes takes the form @var{username}:@var{password}
105but you should beware of the security risks of sending cleartext
106passwords. @var{hostname} may be a domain name or a dotted decimal
address.  If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs.  With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and it is possible that using a non-standard port may have
undesired consequences if a different service is listening on that
port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
sent).@c , but @xref{Other Variables, url-bad-port-list}.
The meaning of
the @var{path} component depends on the service.
116
948a35c1 117@menu
a17e377e 118* Configuration::
119* Parsed URLs:: URLs are parsed into vector structures.
120@end menu
121
122@node Configuration
123@section Configuration
124
125@defvar url-configuration-directory
126@cindex @file{~/.url}
127@cindex configuration files
128The directory in which URL configuration files, the cache etc.,
129reside. Default @file{~/.url}.
130@end defvar
131
132@node Parsed URLs
133@section Parsed URLs
134@cindex parsed URLs
135The library functions typically operate on @dfn{parsed} versions of
136URLs. These are actually vectors of the form:
137
138@example
139[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
140@end example
141
142@noindent where
143@table @var
144@item type
is the type of the URL scheme, e.g., @code{http};
146@item user
147is the username associated with it, or @code{nil};
148@item password
149is the user password associated with it, or @code{nil};
150@item host
151is the host name associated with it, or @code{nil};
152@item port
153is the port number associated with it, or @code{nil};
154@item file
155is the `file' part of it, or @code{nil}. This doesn't necessarily
156actually refer to a file;
157@item target
158is the target part, or @code{nil};
159@item attributes
160is the attributes associated with it, or @code{nil};
161@item full
162is @code{t} for a fully-specified URL, with a host part indicated by
163@samp{//} after the scheme part.
164@end table
165
166@findex url-type
167@findex url-user
168@findex url-password
169@findex url-host
170@findex url-port
171@findex url-file
172@findex url-target
173@findex url-attributes
174@findex url-full
175@findex url-set-type
176@findex url-set-user
177@findex url-set-password
178@findex url-set-host
179@findex url-set-port
180@findex url-set-file
181@findex url-set-target
182@findex url-set-attributes
183@findex url-set-full
184These attributes have accessors named @code{url-@var{part}}, where
0111d1e1 185@var{part} is the name of one of the elements above, e.g.,
186@code{url-host}. Similarly, there are setters of the form
187@code{url-set-@var{part}}.
188
189There are functions for parsing and unparsing between the string and
190vector forms.
191
192@defun url-generic-parse-url url
193Return a parsed version of the string @var{url}.
194@end defun
195
196@defun url-recreate-url url
197@cindex unparsing URLs
198Recreates a URL string from the parsed @var{url}.
199@end defun
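As a rough illustration of the round trip between the string and parsed
forms, the following sketch parses a string, reads two parts through the
accessors described above, and rebuilds a string.  The host
@samp{www.example.com} is a placeholder chosen for this example, and the
exact string returned by @code{url-recreate-url} may differ in detail
from the input.

@example
(require 'url-parse)

;; www.example.com is a placeholder host, not part of the library.
(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  (list (url-type parsed)            ; the scheme part
        (url-host parsed)            ; the host part
        (url-recreate-url parsed)))  ; back to a string
@end example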
200
201@node Retrieving URLs
202@chapter Retrieving URLs
203
204@defun url-retrieve-synchronously url
205Retrieve @var{url} synchronously and return a buffer containing the
206data. @var{url} is either a string or a parsed URL structure. Return
ac091f3d 207@code{nil} if there are no data associated with it (the case for dired,
208info, or mailto URLs that need no further processing).
209@end defun
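A minimal sketch of a synchronous fetch might look like the following.
It assumes a reachable placeholder host, @samp{www.example.com}, and
shows one way of handling the @code{nil} return value and of disposing
of the buffer afterwards; neither is mandated by the library.

@example
(require 'url)

;; www.example.com is a placeholder host.
(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  (if (null buffer)
      (message "No data associated with this URL")
    (unwind-protect
        (with-current-buffer buffer
          ;; The buffer holds the retrieved data; report its size.
          (message "Retrieved %d characters" (buffer-size)))
      (kill-buffer buffer))))
@end example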
210
211@defun url-retrieve url callback &optional cbargs
212Retrieve @var{url} asynchronously and call @var{callback} with args
213@var{cbargs} when finished. The callback is called when the object
214has been completely retrieved, with the current buffer containing the
215object and any MIME headers associated with it. @var{url} is either a
216string or a parsed URL structure. Returns the buffer @var{url} will
ac091f3d 217load into, or @code{nil} if the process has already completed.
218@end defun
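The asynchronous form can be sketched as below.  The callback name
@code{my-url-report}, the host and the @samp{"example"} argument are all
hypothetical.  The callback accepts @code{&rest} arguments because some
versions of the library are understood to pass an extra status argument
ahead of @var{cbargs}; treat that as an assumption to check against the
version in use.

@example
(require 'url)

(defun my-url-report (&rest args)
  "Hypothetical callback: report the size of the retrieved buffer.
ARGS holds whatever the library passes to the callback."
  (message "Retrieved %d characters (callback args: %S)"
           (buffer-size) args))

;; www.example.com is a placeholder; "example" is an arbitrary cbarg.
(url-retrieve "http://www.example.com/" 'my-url-report '("example"))
@end example

The call returns immediately; the message appears only once the
retrieval has finished.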
219
220@node Supported URL Types
221@chapter Supported URL Types
222
223@menu
224* http/https:: Hypertext Transfer Protocol.
a17e377e 225* file/ftp:: Local files and FTP archives.
226* info:: Emacs `Info' pages.
227* mailto:: Sending email.
228* news/nntp/snews:: Usenet news.
229* rlogin/telnet/tn3270:: Remote host connectivity.
230* irc:: Internet Relay Chat.
231* data:: Embedded data URLs.
232* nfs:: Networked File System
233@c * finger::
234@c * gopher::
235@c * netrek::
236@c * prospero::
237* cid:: Content-ID.
a17e377e 238* about::
239* ldap:: Lightweight Directory Access Protocol
240* imap:: IMAP mailboxes.
241* man:: Unix man pages.
242@end menu
243
244@node http/https
245@section @code{http} and @code{https}
246
The scheme @code{http} is Hypertext Transfer Protocol.  The library
supports version 1.1, specified in RFC 2616.  (This supersedes 1.0,
defined in RFC 1945.)  HTTP URLs have the following form, where most of
the parts are optional:
@example
http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
@end example
254@c The @code{:@var{port}} part is optional, and @var{port} defaults to
255@c 80. The @code{/@var{path}} part, if present, is a slash-separated
256@c series elements. The @code{?@var{searchpart}}, if present, is the
257@c query for a search or the content of a form submission. The
258@c @code{#fragment} part, if present, is a location in the document.
259
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL.  It is defined in RFC 2818.  Its default port is
443.  This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used.  @xref{Gateways in general}.
265
266@defopt url-honor-refresh-requests
This controls honouring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one.  @code{nil} means they will not be honoured,
@code{t} (the default) means they will always be honoured, and
any other value means the user will be asked on each request.
272@end defopt
273
274
275@menu
276* Cookies::
277* HTTP language/coding::
278* HTTP URL Options::
279* Dealing with HTTP documents::
280@end menu
281
282@node Cookies
283@subsection Cookies
284
285@defopt url-cookie-file
286The file in which cookies are stored, defaulting to @file{cookies} in
287the directory specified by @code{url-configuration-directory}.
288@end defopt
289
290@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
292@end defopt
293
294@defopt url-cookie-multiple-line
295Specifies whether to put all cookies for the server on one line in the
296HTTP request to satisfy broken servers like
297@url{http://www.hotmail.com}.
298@end defopt
299
300@defopt url-cookie-trusted-urls
301A list of regular expressions matching URLs from which to accept
302cookies always.
303@end defopt
304
305@defopt url-cookie-untrusted-urls
306A list of regular expressions matching URLs from which to reject
307cookies always.
308@end defopt
309
310@defopt url-cookie-save-interval
311The number of seconds between automatic saves of cookies to disk.
312Default is one hour.
313@end defopt
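As a hedged illustration, the cookie options above might be combined in
an init file as follows.  The regular expressions and the host names in
them are invented for this example, and the save interval is simply
twice the stated default.

@example
;; All host names below are placeholders.
(setq url-cookie-confirmation t          ; ask before accepting cookies
      url-cookie-trusted-urls
      '("^https?://www\\.example\\.com/")
      url-cookie-untrusted-urls
      '("^https?://ads\\.example\\.net/")
      url-cookie-save-interval 7200)     ; seconds between saves
@end example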
314
315
316@node HTTP language/coding
317@subsection Language and Encoding Preferences
318
HTTP allows clients to express preferences for the language and
encoding of documents which servers may honour.  For each of these
variables, the value is a string; it can specify a single choice, or
it can be a comma-separated list.

Normally this list is ordered by descending preference.  However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}.  An element that has no @samp{;q} specification has
preference level 1.
330
331@defopt url-mime-charset-string
332@cindex character sets
333@cindex coding systems
334This variable specifies a preference for character sets when documents
335can be served in more than one encoding.
336
337HTTP allows specifying a series of MIME charsets which indicate your
338preferred character set encodings, e.g., Latin-9 or Big5, and these
339can be weighted. The default series is generated automatically from
340the associated MIME types of all defined coding systems, sorted by the
341coding system priority specified in Emacs. @xref{Recognize Coding, ,
342Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
343@end defopt
344
345@defopt url-mime-language-string
346@cindex language preferences
347A string specifying the preferred language when servers can serve
348files in several languages. Use RFC 1766 abbreviations, e.g.,
349@samp{en} for English, @samp{de} for German.
350
351The string can be @code{"*"} to get the first available language (as
352opposed to the default).
353@end defopt
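The two preference strings might be set as in the sketch below.  The
language value is the one quoted earlier in this section; the charset
value is an invented example that follows the same
@samp{;q=@var{priority}} syntax.

@example
;; Prefer German, then British English, then any English.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")

;; Illustrative only: prefer UTF-8, accept Latin-1 at lower priority.
(setq url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example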
354
355@node HTTP URL Options
356@subsection HTTP URL Options
357
358HTTP supports an @samp{OPTIONS} method describing things supported by
359the URL@.
360
361@defun url-http-options url
Returns a property list describing options available for @var{url}.  The
property list members are:
364
365@table @code
366@item methods
367A list of symbols specifying what HTTP methods the resource
368supports.
369
370@item dav
371@cindex DAV
372A list of numbers specifying what DAV protocol/schema versions are
373supported.
374
375@item dasl
376@cindex DASL
A list of supported DASL search types (string form).
378
379@item ranges
380A list of the units available for use in partial document fetches.
381
382@item p3p
383@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
385Currently this is just the raw header contents.
386@end table
387
388@end defun
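A possible use of the returned property list is sketched below.  It
assumes that @code{url-http-options} is provided by the
@file{url-http} library and that the placeholder server answers
@samp{OPTIONS} requests; what the list actually contains depends
entirely on the server.

@example
(require 'url-http)

;; www.example.com is a placeholder host.
(let ((options (url-http-options "http://www.example.com/")))
  (message "Methods: %S, DAV versions: %S"
           (plist-get options 'methods)
           (plist-get options 'dav)))
@end example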
389
390@node Dealing with HTTP documents
391@subsection Dealing with HTTP documents
392
393HTTP URLs are retrieved into a buffer containing the HTTP headers
394followed by the body. Since the headers are quasi-MIME, they may be
395processed using the MIME library. @xref{Top,, Emacs MIME,
396emacs-mime, The Emacs MIME Manual}. The URL package provides a
397function to do this in general:
398
399@defun url-decode-text-part handle &optional coding
This function decodes charset-encoded text in the current buffer.  In
Emacs, the buffer is expected to be unibyte initially and is set to
multibyte after decoding.
@var{handle} is the MIME handle of the original part.  @var{coding} is
an explicit coding system to use, overriding what the MIME headers
specify.  The coding system used for the decoding is returned.
406
407Note that this function doesn't deal with @samp{http-equiv} charset
408specifications in HTML @samp{<meta>} elements.
409@end defun
410
411@node file/ftp
412@section file and ftp
413@cindex files
414@cindex FTP
415@cindex File Transfer Protocol
416@cindex compressed files
417@findex dired
418
419@example
420ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
421file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
422@end example
423
These schemes are defined in RFC 1738.
@samp{ftp:} and @samp{file:} are synonymous in this library.  They
allow reading arbitrary files from hosts.  Either @samp{ange-ftp}
(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
hosts.  Local files are accessed directly.
429
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
432Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
433@samp{.bz2}.
434
435@defopt url-directory-index-file
436The filename to look for when indexing a directory, default
437@samp{"index.html"}. If this file exists, and is readable, then it
438will be viewed instead of using @code{dired} to view the directory.
439@end defopt
440
441@node info
442@section info
443@cindex Info
444@cindex Texinfo
445@findex Info-goto-node
446
447@example
448info:@var{file}#@var{node}
449@end example
450
451Info URLs are not officially defined. They invoke
452@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
453@samp{#@var{node}} is optional, defaulting to @samp{Top}.
454
455@node mailto
456@section mailto
457
458@cindex mailto
459@cindex email
A mailto URL composes an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
463
464@defopt url-mail-command
465@vindex mail-user-agent
The function called whenever url needs to send mail.  This should
normally be left to default from @code{mail-user-agent}.  @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
469@end defopt
470
471An @samp{X-Url-From} header field containing the URL of the document
472that contained the mailto URL is added if that URL is known.
473
474RFC 2368 extends the definition of mailto URLs in RFC 1738.
475The form of a mailto URL is
476@example
477@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
478@end example
@noindent where an arbitrary number of @var{header}s can be added.  If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents.  Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
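
Header contents generally need to be percent-encoded before they are
placed in the URL.  The sketch below builds a mailto URL with a subject
and a body; @code{my-mailto-url} is a hypothetical helper written for
this example, and it assumes that @code{url-hexify-string} from
@file{url-util} is available for the encoding.

@example
(require 'url-util)

(defun my-mailto-url (mailbox subject body)
  "Hypothetical helper: build a mailto URL for MAILBOX.
SUBJECT and BODY are percent-encoded before being added."
  (concat "mailto:" mailbox
          "?subject=" (url-hexify-string subject)
          "&body=" (url-hexify-string body)))

(my-mailto-url "foo@@bar.com" "Hello world" "First line")
     @result{} "mailto:foo@@bar.com?subject=Hello%20world&body=First%20line"
@end example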
485
486@c Fixme: update
Email messages are defined in RFC 822.
488
489@node news/nntp/snews
490@section @code{news}, @code{nntp} and @code{snews}
491@cindex news
492@cindex network news
493@cindex usenet
494@cindex NNTP
495@cindex snews
496
497@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that, for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
502
503@table @samp
a17e377e 504@item news:@var{newsgroup}
505Retrieves a list of messages in @var{newsgroup};
506@item news:@var{message-id}
507Retrieves the message with the given @var{message-id};
a17e377e 508@item news:*
509Retrieves a list of all available newsgroups;
510@item nntp://@var{host}:@var{port}/@var{newsgroup}
511@itemx nntp://@var{host}:@var{port}/@var{message-id}
512@itemx nntp://@var{host}:@var{port}/*
513Similar to the @samp{news} versions.
514@end table
515
516@samp{:@var{port}} is optional and defaults to :119.
517
518@samp{snews} is the same as @samp{nntp} except that the default port
519is :563.
520@cindex SSL
df2f79ee 521(It is tunneled through SSL.)
522
523An @samp{nntp} URL is the same as a news URL, except that the URL may
524specify an article by its number.
525
526@defopt url-news-server
527This variable can be used to override the default news server.
528Usually this will be set by the Gnus package, which is used to fetch
529news.
530@cindex environment variable
531@vindex NNTPSERVER
532It may be set from the conventional environment variable
533@code{NNTPSERVER}.
534@end defopt
535
536@node rlogin/telnet/tn3270
537@section rlogin, telnet and tn3270
538@cindex rlogin
539@cindex telnet
540@cindex tn3270
541@cindex terminal emulation
542@findex terminal-emulator
543
544These URL schemes from RFC 1738 for logon via a terminal emulator have
545the form
546@example
547telnet://@var{user}:@var{password}@@@var{host}:@var{port}
548@end example
549but the @code{:@var{password}} component is ignored.
550
551To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
552@code{telnet} or @code{tn3270} (the program names and arguments are
553hardcoded) session is run in a @code{terminal-emulator} buffer.
554Well-known ports are used if the URL does not specify a port.
555
556@node irc
557@section irc
558@cindex IRC
559@cindex Internet Relay Chat
560@cindex ZEN IRC
a17e377e 561@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
948a35c1 562@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
a17e377e 563session to a function named in @code{url-irc-function}.
564
565@defopt url-irc-function
566A function to actually open an IRC connection.
This function
must take five arguments, @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}.  The @var{channel} argument specifies the
channel to join immediately; it can be @code{nil}.  By default this is
@code{url-irc-zenirc}.
572@end defopt
573@defun url-irc-zenirc host port channel user password
574Processes the arguments and lets @code{zenirc} handle the session.
575@end defun
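
To illustrate the five-argument calling convention, here is a sketch of
a replacement handler that only reports what it was asked to open.  The
name @code{my-url-irc-log} and the sample arguments are hypothetical; a
real handler would start an IRC client instead.

@example
(defun my-url-irc-log (host port channel user password)
  "Hypothetical IRC handler: report the request, open nothing.
PASSWORD is accepted for compatibility but deliberately not shown."
  (message "IRC: %s:%s channel=%s user=%s" host port channel user))

(setq url-irc-function 'my-url-irc-log)

(my-url-irc-log "irc.example.org" 6667 "emacs" "me" nil)
     @result{} "IRC: irc.example.org:6667 channel=emacs user=me"
@end example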
576
577@node data
578@section data
579@cindex data URLs
580
581@example
582data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
583@end example
584
585Data URLs contain MIME data in the URL itself. They are defined in
586RFC 2397.
587
588@var{media-type} is a MIME @samp{Content-Type} string, possibly
589including parameters. It defaults to
590@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
591omitted but the charset parameter supplied. If @samp{;base64} is
592present, the @var{data} are base64-encoded.
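
The syntax above can be illustrated by constructing a data URL by hand.
In this sketch @code{my-make-data-url} is a hypothetical helper, not
part of the library, and it assumes plain ASCII input so that
@code{base64-encode-string} can be applied directly.

@example
(defun my-make-data-url (text)
  "Hypothetical helper: wrap ASCII TEXT in a base64 data URL."
  (concat "data:text/plain;base64,"
          ;; The second argument suppresses line breaks in the output.
          (base64-encode-string text t)))

(my-make-data-url "Hello, world")
     @result{} "data:text/plain;base64,SGVsbG8sIHdvcmxk"
@end example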
a17e377e 593
594@node nfs
595@section nfs
596@cindex NFS
597@cindex Network File System
598@cindex automounter
599
600@example
601nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
602@end example
603
604The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
605@samp{ftp:} except that it points to a file on a remote host that is
606handled by the automounter on the local host.
607
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter.  Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
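
As an illustration only, a specification for an automounter that
exposes remote hosts under a @file{/net} directory might look like the
following.  The @file{/net} layout is an assumption about the local
automounter configuration, not something the library guarantees.

@example
;; Assumes the automounter mounts host H under /net/H.
;; %h expands to the server's host name, %f to the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example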
629
630@node cid
631@section cid
632@cindex Content-ID
633
634RFC 2111
635
636@node about
637@section about
638
639@node ldap
640@section ldap
641@cindex LDAP
642@cindex Lightweight Directory Access Protocol
643
644The LDAP scheme is defined in RFC 2255.
645
646@node imap
647@section imap
648@cindex IMAP
649
650RFC 2192
651
652@node man
653@section man
654@cindex @command{man}
655@cindex Unix man pages
656@findex man
657
658@example
659@samp{man:@var{page-spec}}
660@end example
661
662This is a non-standard scheme. @var{page-spec} is passed directly to
663the Lisp @code{man} function.
664
665@node Defining New URLs
666@chapter Defining New URLs
667
668@menu
669* Naming conventions::
670* Required functions::
671* Optional functions::
672* Asynchronous fetching::
673* Supporting file-name-handlers::
674@end menu
675
676@node Naming conventions
677@section Naming conventions
678
679@node Required functions
680@section Required functions
681
682@node Optional functions
683@section Optional functions
684
685@node Asynchronous fetching
686@section Asynchronous fetching
687
688@node Supporting file-name-handlers
689@section Supporting file-name-handlers
690
691@node General Facilities
692@chapter General Facilities
693
694@menu
695* Disk Caching::
696* Proxies::
697* Gateways in general::
698* History::
699@end menu
700
701@node Disk Caching
702@section Disk Caching
703@cindex Caching
704@cindex Persistent Cache
705@cindex Disk Cache
706
707The disk cache stores retrieved documents locally, whence they can be
708retrieved more quickly. When requesting a URL that is in the cache,
709the library checks to see if the page has changed since it was last
710retrieved from the remote machine. If not, the local copy is used,
711saving the transmission over the network.
712@cindex Cleaning the cache
713@cindex Clearing the cache
714@cindex Cache cleaning
715Currently the cache isn't cleared automatically.
716@c Running the @code{clean-cache} shell script
717@c fist is recommended, to allow for future cleaning of the cache. This
718@c shell script will remove all files that have not been accessed since it
719@c was last run. To keep the cache pared down, it is recommended that this
720@c script be run from @i{at} or @i{cron} (see the manual pages for
721@c crontab(5) or at(1) for more information)
722
723@defopt url-automatic-caching
724Setting this variable non-@code{nil} causes documents to be cached
725automatically.
726@end defopt
727
728@defopt url-cache-directory
This variable specifies the
directory in which to store the cache files.  It defaults to the
sub-directory @file{cache} of @code{url-configuration-directory}.
732@end defopt
733
734@c Fixme: function v. option, but neither used.
735@c @findex url-cache-expired
736@c @defopt url-cache-expired
737@c This is a function to decide whether or not a cache entry has expired.
738@c It takes two times as it parameters and returns non-@code{nil} if the
739@c second time is ``too old'' when compared with the first time.
740@c @end defopt
741
742@defopt url-cache-creation-function
743The cache relies on a scheme for mapping URLs to files in the cache.
744This variable names a function which sets the type of cache to use.
745It takes a URL as argument and returns the absolute file name of the
746corresponding cache file. The two supplied possibilities are
747@code{url-cache-create-filename-using-md5} and
748@code{url-cache-create-filename-human-readable}.
749@end defopt
750
751@defun url-cache-create-filename-using-md5 url
752Creates a cache file name from @var{url} using MD5 hashing.
753@findex md5
This creates entries with very few cache collisions and is fast if
you have the @code{md5} function as a primitive (Emacs 21 and XEmacs).
756@smallexample
757(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
758 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
759@end smallexample
760@end defun
761
762@defun url-cache-create-filename-human-readable url
763Creates a cache file name from @var{url} more obviously connected to
764@var{url} than for @code{url-cache-create-filename-using-md5}, but
765more likely to conflict with other files.
766@smallexample
767(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
768 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
769@end smallexample
770@end defun
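
Taken together, the caching options above might be set as in this
sketch.  The values are illustrative choices rather than
recommendations, and the cache directory shown simply restates the
documented default.

@example
(require 'url)

(setq url-automatic-caching t
      ;; Restates the documented default location.
      url-cache-directory
      (expand-file-name "cache" url-configuration-directory)
      ;; Either of the two supplied functions can be named here.
      url-cache-creation-function 'url-cache-create-filename-using-md5)
@end example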
771
a17e377e 772@c Fixme: never actually used currently?
773@c @defopt url-standalone-mode
774@c @cindex Relying on cache
775@c @cindex Cache only mode
776@c @cindex Standalone mode
777@c If this variable is non-@code{nil}, the library relies solely on the
778@c cache for fetching documents and avoids checking if they have changed
779@c on remote servers.
780@c @end defopt
781
782@c With a large cache of documents on the local disk, it can be very handy
783@c when traveling, or any other time the network connection is not active
784@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
785@c solely on its cache, and avoid checking to see if the page has changed
786@c on the remote server. In the case of a dial-on-demand PPP connection,
787@c this will keep the phone line free as long as possible, only bringing up
788@c the PPP connection when asking for a page that is not located in the
789@c cache. This is very useful for demonstrations as well.
790
791@node Proxies
792@section Proxies and Gatewaying
793
a17e377e 794@c fixme: check/document url-ns stuff
795@cindex proxy servers
796@cindex proxies
797@cindex environment variables
798@vindex HTTP_PROXY
Proxy servers are commonly used to provide gateways through firewalls
or as caches serving some more-or-less local network.  Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server.  Proxying is
conventionally configured in the same way for different programs, through
environment variables of the form @code{@var{protocol}_proxy}, where
@var{protocol} is one of the supported network protocols (@code{http},
@code{ftp} etc.).  The library recognizes such variables in either
upper or lower case.  Their values are of one of the forms:
807@itemize @bullet
808@item @code{@var{host}:@var{port}}
809@item A full URL;
810@item Simply a host name.
811@end itemize
812
813@vindex NO_PROXY
814The @code{NO_PROXY} environment variable specifies URLs that should be
815excluded from proxying (on servers that should be contacted directly).
816This should be a comma-separated list of hostnames, domain names, or a
817mixture of both. Asterisks can be used as wildcards, but other
818clients may not support that. Domain names may be indicated by a
819leading dot. For example:
820@example
821NO_PROXY="*.aventail.com,home.com,.seanet.com"
822@end example
823@noindent says to contact all machines in the @samp{aventail.com} and
824@samp{seanet.com} domains directly, as well as the machine named
825@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
a17e377e 826and @code{no_proxy} are also tried, in that order.
827
828Proxies may also be specified directly in Lisp.
829
830@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them.  The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}.  An
exception is the pseudo scheme @code{"no_proxy"}, which is paired with
a regexp matching host names not to be proxied.  This variable is
initialized from the environment as above.
838
839@example
840(setq url-proxy-services
841 '(("http" . "proxy.aventail.com:80")
842 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
843@end example
844@end defopt
845
846@node Gateways in general
847@section Gateways in General
848@cindex gateways
849@cindex firewalls
850
851The library provides a general gateway layer through which all
852networking passes. It can both control access to the network and
853provide access through gateways in firewalls. This may make direct
854connexions in some cases and pass through some sort of gateway in
855others.@footnote{Proxies (which only operate over HTTP) are
856implemented using this.} The library's basic function responsible for
857making connexions is @code{url-open-stream}.
858
859@defun url-open-stream name buffer host service
860@cindex opening a stream
861@cindex stream, opening
862Open a stream to @var{host}, possibly via a gateway. The other
863arguments are as for @code{open-network-stream}. This will not make a
864connexion if @code{url-gateway-unplugged} is non-@code{nil}.
865@end defun
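
A minimal sketch of opening a stream through the gateway layer follows.
It assumes that @code{url-open-stream} is provided by the
@file{url-gw} library; the host, port and buffer name are placeholders,
and the call may signal an error if the connexion cannot be made.

@example
(require 'url-gw)

;; Placeholder host and port; the buffer name is arbitrary.
(let* ((buffer (generate-new-buffer " *url-stream-example*"))
       (proc (url-open-stream "example" buffer "www.example.com" 80)))
  (when proc
    ;; A real caller would talk to the server here.
    (delete-process proc))
  (kill-buffer buffer))
@end example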
866
867@defvar url-gateway-local-host-regexp
868This is a regular expression that matches local hosts that do not
869require the use of a gateway. If @code{nil}, all connexions are made
870through the gateway.
871@end defvar

@defvar url-gateway-method
This variable controls which gateway method is used. It may be useful
to bind it temporarily in some applications. It has values taken from
a list of symbols. Possible values are:

@table @code
@item telnet
@cindex @command{telnet}
Use this method if you must first telnet and log into a gateway host,
and then run telnet from that host to connect to outside machines.

@item rlogin
@cindex @command{rlogin}
This method is identical to @code{telnet}, but uses @command{rlogin}
to log into the remote machine without having to send the username and
password over the wire every time.

@item socks
@cindex @sc{socks}
Use if the firewall has a @sc{socks} gateway running on it. The
@sc{socks} v5 protocol is defined in RFC 1928.

@c @item ssl
@c This probably shouldn't be documented
@c Fixme: why not? -- fx

@item native
This method uses Emacs's builtin networking directly. This is the
default. It can be used only if there is no firewall blocking access.
@end table
@end defvar
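Binding the variable with @code{let} around a call is one way to
select a method temporarily. The following is a minimal sketch; the
function name @code{my-open-stream-directly} is hypothetical.

@example
(require 'url-gw)

;; Hypothetical helper, not part of the URL library.
(defun my-open-stream-directly (name buffer host service)
  "Like `url-open-stream', but always use the `native' method."
  (let ((url-gateway-method 'native))
    (url-open-stream name buffer host service)))
@end example

The global value of @code{url-gateway-method} is left untouched once
the @code{let} form exits.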

The following variables control the gateway methods.

@defopt url-gateway-telnet-host
The gateway host to telnet to. Once logged in there, you then telnet
out to the hosts you want to connect to.
@end defopt
@defopt url-gateway-telnet-parameters
This should be a list of parameters to pass to the @command{telnet} program.
@end defopt
@defopt url-gateway-telnet-password-prompt
This is a regular expression that matches the password prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-login-prompt
This is a regular expression that matches the username prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-user-name
The username to log in with.
@end defopt
@defopt url-gateway-telnet-password
The password to send when logging in.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt
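A telnet gateway setup might look like the following sketch. The
host name, user name and prompt patterns shown are placeholders
chosen for illustration, not defaults of the library.

@example
(require 'url-gw)

(setq url-gateway-method 'telnet
      ;; Hypothetical gateway machine and account.
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jdoe"
      ;; Assumed prompts; adjust to what the gateway really prints.
      url-gateway-telnet-login-prompt "^login:"
      url-gateway-telnet-password-prompt "^Password:"
      url-gateway-prompt-pattern "[#$%>] *$")
@end example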

@defopt url-gateway-rlogin-host
Host to @samp{rlogin} to before telnetting out.
@end defopt
@defopt url-gateway-rlogin-parameters
Parameters to pass to @samp{rsh}.
@end defopt
@defopt url-gateway-rlogin-user-name
User name to use when logging in to the gateway.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt

@defopt socks-server
This specifies the default server. It takes the form
@w{@code{("Default server" @var{server} @var{port} @var{version})}},
where @var{version} can be either 4 or 5.
@end defopt
@defvar socks-password
If this is @code{nil} then you will be asked for the password;
otherwise it will be used as the password for authenticating you to
the @sc{socks} server.
@end defvar
@defvar socks-username
This is the username to use when authenticating yourself to the
@sc{socks} server. By default this is your login name.
@end defvar
@defvar socks-timeout
This controls how long, in seconds, to wait for responses from the
@sc{socks} server; it is 5 by default.
@end defvar
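Putting these together, a @sc{socks} configuration could be sketched
as follows. The server name and port are placeholders for
illustration.

@example
(require 'url-gw)
(require 'socks)

(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION); hypothetical server.
      socks-server '("Default server" "socks.example.com" 1080 5)
      ;; Wait up to ten seconds for the server instead of five.
      socks-timeout 10)
@end example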
@c fixme: these have been effectively commented-out in the code
@c @defopt socks-server-aliases
@c This is a list of server aliases. It is a list of aliases of the form
@c @var{(alias hostname port version)}.
@c @end defopt
@c @defopt socks-network-aliases
@c This is a list of network aliases. Each entry in the list takes the form
@c @var{(alias (network))} where @var{alias} is a string that names the
@c @var{network}. The networks can contain a pair (not a dotted pair) of
@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
@c address and a netmask, a domain name or a unique hostname or @sc{ip}
@c address.
@c @end defopt
@c @defopt socks-redirection-rules
@c This is a list of redirection rules. Each rule takes the form
@c @var{(Destination network Connection type)} where @var{Destination
@c network} is a network alias from @code{socks-network-aliases} and
@c @var{Connection type} can be @code{nil} in which case a direct
@c connection is used, or it can be an alias from
@c @code{socks-server-aliases} in which case that server is used as a
@c proxy.
@c @end defopt
@defopt socks-nslookup-program
@cindex @command{nslookup}
This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
@end defopt

@menu
* Suppressing network connexions::
@end menu
@c * Broken hostname resolution::

@node Suppressing network connexions
@subsection Suppressing Network Connexions

@cindex network connexions, suppressing
@cindex suppressing network connexions
@cindex bugs, HTML
@cindex HTML `bugs'
In some circumstances it is desirable to suppress making network
connexions. A typical case is when rendering HTML in a mail user
agent, when external URLs should not be activated, particularly to
avoid `bugs' which `call home' by fetching single-pixel images and the
like. To arrange this, bind the following variable for the duration
of such processing.

@defvar url-gateway-unplugged
If this variable is non-@code{nil}, new network connexions are never
opened by the URL library.
@end defvar
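A mail user agent could wrap its rendering code along these lines.
This is only a sketch: @code{my-call-unplugged} is a hypothetical
helper, and the function passed to it stands for whatever rendering
code the application uses.

@example
(require 'url-gw)

;; Hypothetical helper, not part of the URL library.
(defun my-call-unplugged (function &rest args)
  "Call FUNCTION with ARGS while network connexions are suppressed."
  (let ((url-gateway-unplugged t))
    (apply function args)))

;; No connexion is attempted here, whatever the host.
(my-call-unplugged
 #'url-open-stream "test" nil "www.example.com" 80)
@end example

Using @code{let} rather than @code{setq} means the previous value is
restored even if the wrapped function signals an error.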

@c @node Broken hostname resolution
@c @subsection Broken Hostname Resolution

@c @cindex hostname resolver
@c @cindex resolver, hostname
@c Some C libraries do not include the hostname resolver routines in
@c their static libraries. If Emacs was linked statically, and was not
@c linked with the resolver libraries, it will not be able to get to any
@c machines off the local network. This is characterized by being able
@c to reach someplace with a raw ip number, but not its hostname
@c (@url{http://129.79.254.191/} works, but
@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
@c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
@c rebuilt linked against the resolver library, it can use the external
@c @command{nslookup} program instead.

@c @defopt url-gateway-broken-resolution
@c @cindex @code{nslookup} program
@c @cindex program, @code{nslookup}
@c If non-@code{nil}, this variable says to use the program specified by
@c @code{url-gateway-nslookup-program} to do hostname resolution.
@c @end defopt

@c @defopt url-gateway-nslookup-program
@c The name of the program to do hostname lookup if Emacs can't do it
@c directly. This program should expect a single argument on the command
@c line (the hostname to resolve) and should produce output similar to
@c the standard Unix @command{nslookup} program:
@c @example
@c Name: www.cs.indiana.edu
@c Address: 129.79.254.191
@c @end example
@c @end defopt

@node History
@section History

The library can maintain a global history list tracking URLs accessed.
URL completion can be done from it. The history mechanism is set up
@findex url-do-setup
automatically via @code{url-do-setup} when it is configured to be on.
Note that the size of the history list is currently not limited.

@vindex url-history-hash-table
The history `list' is actually a hash table,
@code{url-history-hash-table}. It contains access times keyed by URL
strings. The times are in the format returned by @code{current-time}.

@defun url-history-update-url url time
This function updates the history table with an entry for @var{url}
accessed at the given @var{time}.
@end defun
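For example, an entry can be recorded and then looked up directly in
the hash table. This is a minimal sketch; the URL is arbitrary.

@example
(require 'url-history)

;; Record that the URL was accessed just now.
(url-history-update-url "http://www.gnu.org/" (current-time))

;; Look up the access time; nil means the URL is not in the history.
(gethash "http://www.gnu.org/" url-history-hash-table)
@end example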

@defopt url-history-track
If non-@code{nil}, the library will keep track of all the URLs
accessed. If it is @code{t}, the list is saved to disk at the end of
each Emacs session. The default is @code{nil}.
@end defopt

@defopt url-history-file
The file storing the history list between sessions. It defaults to
@file{history} in @code{url-configuration-directory}.
@end defopt

@defopt url-history-save-interval
@findex url-history-setup-save-timer
The number of seconds between automatic saves of the history list.
Default is one hour. Note that if you change this variable directly,
rather than using Custom, after @code{url-do-setup} has been run, you
need to run the function @code{url-history-setup-save-timer}.
@end defopt
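An init-file fragment enabling the history might look like this
sketch; the ten-minute interval is an arbitrary choice.

@example
(require 'url-history)

;; Track URLs and save the list at the end of the session.
(setq url-history-track t)

;; Changing the interval outside Custom: set it, then restart the
;; timer so that the new value takes effect.
(setq url-history-save-interval 600)
(url-history-setup-save-timer)
@end example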

@defun url-history-parse-history &optional fname
Parses the history file @var{fname} (default @code{url-history-file})
and sets up the history list.
@end defun

@defun url-history-save-history &optional fname
Saves the current history to file @var{fname} (default
@code{url-history-file}).
@end defun

@defun url-completion-function string predicate function
You can use this function to do completion of URLs from the history.
@end defun

@node Customization
@chapter Customization

@section Environment Variables

@cindex environment variables
The following environment variables affect the library's operation at
startup.

@table @code
@item TMPDIR
@vindex TMPDIR
@vindex url-temporary-directory
If this is defined, @code{url-temporary-directory} is initialized from
it.
@end table

@section General User Options

The following user options, settable with Customize, affect the
general operation of the package.

@defopt url-debug
@cindex debugging
Specifies the types of debug messages which are logged to
the @code{*URL-DEBUG*} buffer.
@code{t} means log all messages.
A number means log all messages and show them with @code{message}.
It may also be a list of the types of messages to be logged.
@end defopt
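For instance, to log only messages of one type, the variable can be
set to a list. In this sketch the tag @code{http} is assumed to be
the type used by the HTTP loader.

@example
(require 'url-util)

(setq url-debug '(http))   ; log only messages tagged `http'
;; (setq url-debug t)      ; or: log every message
@end example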
@defopt url-personal-mail-address
@end defopt
@defopt url-privacy-level
@end defopt
@defopt url-uncompressor-alist
@end defopt
@defopt url-passwd-entry-func
@end defopt
@defopt url-standalone-mode
@end defopt
@defopt url-bad-port-list
@end defopt
@defopt url-max-password-attempts
@end defopt
@defopt url-temporary-directory
@end defopt
@defopt url-show-status
@end defopt
@defopt url-confirmation-func
The function to use for asking yes or no questions. This is normally
either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
function taking a single argument (the prompt) and returning @code{t}
only if an affirmative answer is given.
@end defopt
@defopt url-gateway-method
@c fixme: describe gatewaying
A symbol specifying the type of gateway support to use for connexions
from the local machine. The supported methods are:

@table @code
@item telnet
Run telnet in a subprocess to connect;
@item rlogin
Rlogin to another machine to connect;
@item socks
Connect through a socks server;
@item ssl
Connect with SSL;
@item native
Connect directly.
@end table
@end defopt

@node Function Index
@unnumbered Command and Function Index
@printindex fn

@node Variable Index
@unnumbered Variable Index
@printindex vr

@node Concept Index
@unnumbered Concept Index
@printindex cp

@setchapternewpage odd
@contents
@bye

@ignore
 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
@end ignore