\input texinfo
@setfilename ../../info/url
@settitle URL Programmer's Manual

@iftex
@c @finalout
@end iftex
@c @setchapternewpage odd
@c @smallbook

@tex
\overfullrule=0pt
%\global\baselineskip 30pt      % for printing in double space
@end tex
@dircategory World Wide Web
@dircategory Emacs
@direntry
* URL: (url).                 URL loading package.
@end direntry

@copying
This file documents the Emacs Lisp URL loading package.

Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.  Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''
@end quotation
@end copying

@c
@titlepage
@title URL Programmer's Manual
@subtitle First Edition, URL Version 2.0
@author William M. Perry @email{wmperry@@gnu.org}
@author David Love @email{fx@@gnu.org}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@node Top
@top URL

@ifnottex
@insertcopying
@end ifnottex

@menu
* Getting Started::             Preparing your program to use URLs.
* Retrieving URLs::             How to use this package to retrieve a URL.
* Supported URL Types::         Descriptions of URL types currently supported.
* Defining New URLs::           How to define a URL loader for a new protocol.
* General Facilities::          URLs can be cached, accessed via a gateway
                                and tracked in a history list.
* Customization::               Variables you can alter.
* GNU Free Documentation License:: The license for this documentation.
* Function Index::
* Variable Index::
* Concept Index::
@end menu

75@node Getting Started
76@chapter Getting Started
77@cindex URLs, definition
78@cindex URIs
79
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808.  RFC 2016 defines uniform resource
agents.
84
85URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
86@var{scheme}s supported by this library are described below.
87@xref{Supported URL Types}.
88
89FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
90IRC and gopher URLs all have the form
91
92@example
93@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
94@end example
@noindent
where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
@var{userinfo} sometimes takes the form @var{username}:@var{password},
but you should beware of the security risks of sending cleartext
passwords.  @var{hostname} may be a domain name or a dotted decimal
address.  If the @samp{:@var{port}} is omitted, the library uses
the `well known' port for that service when accessing URLs.  With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have
undesired consequences if a different service is listening on that
port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
sent).  @c , but @xref{Other Variables, url-bad-port-list}.
The meaning of the @var{path} component depends on the service.
108
109@menu
110* Configuration::
111* Parsed URLs:: URLs are parsed into vector structures.
112@end menu
113
114@node Configuration
115@section Configuration
116
117@defvar url-configuration-directory
118@cindex @file{~/.url}
119@cindex configuration files
The directory in which URL configuration files, the cache, etc.@:
reside.  The default is @file{~/.url}.
122@end defvar
123
124@node Parsed URLs
125@section Parsed URLs
126@cindex parsed URLs
127The library functions typically operate on @dfn{parsed} versions of
128URLs. These are actually vectors of the form:
129
130@example
131[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
132@end example
133
134@noindent where
135@table @var
136@item type
is the type of the URL scheme, e.g., @code{http};
138@item user
139is the username associated with it, or @code{nil};
140@item password
141is the user password associated with it, or @code{nil};
142@item host
143is the host name associated with it, or @code{nil};
144@item port
145is the port number associated with it, or @code{nil};
146@item file
147is the `file' part of it, or @code{nil}. This doesn't necessarily
148actually refer to a file;
149@item target
150is the target part, or @code{nil};
151@item attributes
152is the attributes associated with it, or @code{nil};
153@item full
154is @code{t} for a fully-specified URL, with a host part indicated by
155@samp{//} after the scheme part.
156@end table
157
158@findex url-type
159@findex url-user
160@findex url-password
161@findex url-host
162@findex url-port
163@findex url-file
164@findex url-target
165@findex url-attributes
166@findex url-full
167@findex url-set-type
168@findex url-set-user
169@findex url-set-password
170@findex url-set-host
171@findex url-set-port
172@findex url-set-file
173@findex url-set-target
174@findex url-set-attributes
175@findex url-set-full
176These attributes have accessors named @code{url-@var{part}}, where
177@var{part} is the name of one of the elements above, e.g.,
178@code{url-host}. Similarly, there are setters of the form
179@code{url-set-@var{part}}.
180
181There are functions for parsing and unparsing between the string and
182vector forms.
183
184@defun url-generic-parse-url url
185Return a parsed version of the string @var{url}.
186@end defun
187
188@defun url-recreate-url url
189@cindex unparsing URLs
190Recreates a URL string from the parsed @var{url}.
191@end defun
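
As a minimal sketch of the round trip between the two forms (the URL
is an arbitrary example, and loading @file{url-parse} explicitly is an
assumption about how your program is set up):

@example
(require 'url-parse)

;; Parse a string, inspect two of its parts, then rebuild the string.
(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  (list (url-type parsed)            ; scheme part
        (url-host parsed)            ; host part
        (url-recreate-url parsed)))  ; back to a string
@end example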
192
193@node Retrieving URLs
194@chapter Retrieving URLs
195
196@defun url-retrieve-synchronously url
197Retrieve @var{url} synchronously and return a buffer containing the
198data. @var{url} is either a string or a parsed URL structure. Return
199@code{nil} if there are no data associated with it (the case for dired,
200info, or mailto URLs that need no further processing).
201@end defun
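
The following sketch wraps the call in a hypothetical helper,
@code{my-url-fetch-to-string} (not part of the library), that returns
the contents of the retrieved buffer as a string.  It assumes the
buffer may be killed once it has been read:

@example
(require 'url)

(defun my-url-fetch-to-string (url)
  "Return everything retrieved for URL as a string, or nil if no data."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (buffer-string))
        ;; Always discard the retrieval buffer when done.
        (kill-buffer buffer)))))
@end example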
202
203@defun url-retrieve url callback &optional cbargs
204Retrieve @var{url} asynchronously and call @var{callback} with args
205@var{cbargs} when finished. The callback is called when the object
206has been completely retrieved, with the current buffer containing the
207object and any MIME headers associated with it. @var{url} is either a
208string or a parsed URL structure. Returns the buffer @var{url} will
209load into, or @code{nil} if the process has already completed.
210@end defun
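
A sketch of asynchronous use follows.  It assumes that the library
passes a status list to the callback ahead of @var{cbargs}, as recent
versions of @code{url-retrieve} do.  The functions
@code{my-url-report-size} and @code{my-url-fetch-and-report} are
hypothetical:

@example
(require 'url)

(defun my-url-report-size (_status label)
  "Report the size of the current retrieval buffer, tagged with LABEL."
  ;; The current buffer holds the headers followed by the object.
  (message "%s: retrieved %d characters" label (buffer-size)))

(defun my-url-fetch-and-report (url)
  "Start retrieving URL in the background and report when it is done."
  (url-retrieve url #'my-url-report-size (list url)))
@end example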
211
212@node Supported URL Types
213@chapter Supported URL Types
214
215@menu
216* http/https:: Hypertext Transfer Protocol.
217* file/ftp:: Local files and FTP archives.
218* info:: Emacs `Info' pages.
219* mailto:: Sending email.
220* news/nntp/snews:: Usenet news.
221* rlogin/telnet/tn3270:: Remote host connectivity.
222* irc:: Internet Relay Chat.
223* data:: Embedded data URLs.
224* nfs:: Networked File System
225@c * finger::
226@c * gopher::
227@c * netrek::
228@c * prospero::
229* cid:: Content-ID.
230* about::
231* ldap:: Lightweight Directory Access Protocol
232* imap:: IMAP mailboxes.
233* man:: Unix man pages.
234@end menu
235
236@node http/https
237@section @code{http} and @code{https}
238
The scheme @code{http} is Hypertext Transfer Protocol.  The library
supports version 1.1, specified in RFC 2616.  (This supersedes 1.0,
defined in RFC 1945.)  HTTP URLs have the following form, where most of
the parts are optional:
243@example
244http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
245@end example
246@c The @code{:@var{port}} part is optional, and @var{port} defaults to
247@c 80. The @code{/@var{path}} part, if present, is a slash-separated
248@c series elements. The @code{?@var{searchpart}}, if present, is the
249@c query for a search or the content of a form submission. The
250@c @code{#fragment} part, if present, is a location in the document.
251
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL.  It is described in RFC 2818.  Its default port is
443.  This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used.  @xref{Gateways in general}.
257
258@defopt url-honor-refresh-requests
This controls honoring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one.  @code{nil} means they will not be honored,
@code{t} (the default) means they will always be honored, and
otherwise the user will be asked on each request.
264@end defopt
265
266
267@menu
268* Cookies::
269* HTTP language/coding::
270* HTTP URL Options::
271* Dealing with HTTP documents::
272@end menu
273
274@node Cookies
275@subsection Cookies
276
277@defopt url-cookie-file
278The file in which cookies are stored, defaulting to @file{cookies} in
279the directory specified by @code{url-configuration-directory}.
280@end defopt
281
282@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
284@end defopt
285
286@defopt url-cookie-multiple-line
287Specifies whether to put all cookies for the server on one line in the
288HTTP request to satisfy broken servers like
289@url{http://www.hotmail.com}.
290@end defopt
291
292@defopt url-cookie-trusted-urls
293A list of regular expressions matching URLs from which to accept
294cookies always.
295@end defopt
296
297@defopt url-cookie-untrusted-urls
298A list of regular expressions matching URLs from which to reject
299cookies always.
300@end defopt
301
302@defopt url-cookie-save-interval
303The number of seconds between automatic saves of cookies to disk.
304Default is one hour.
305@end defopt
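
The following fragment sets three of these options together.  The
host name and the values are illustrative assumptions, not
recommendations:

@example
(setq url-cookie-confirmation t
      ;; Regular expressions matched against URLs.
      url-cookie-trusted-urls '("^https?://www\\.example\\.org/")
      ;; Seconds between automatic saves.
      url-cookie-save-interval 600)
@end example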
306
307
308@node HTTP language/coding
309@subsection Language and Encoding Preferences
310
HTTP allows clients to express preferences for the language and
encoding of documents which servers may honor.  For each of these
variables, the value is a string; it can specify a single choice, or
it can be a comma-separated list.

Normally, this list is ordered by descending preference.  However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}.  An element that has no @samp{;q} specification has
preference level 1.
322
323@defopt url-mime-charset-string
324@cindex character sets
325@cindex coding systems
326This variable specifies a preference for character sets when documents
327can be served in more than one encoding.
328
329HTTP allows specifying a series of MIME charsets which indicate your
330preferred character set encodings, e.g., Latin-9 or Big5, and these
331can be weighted. The default series is generated automatically from
332the associated MIME types of all defined coding systems, sorted by the
333coding system priority specified in Emacs. @xref{Recognize Coding, ,
334Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
335@end defopt
336
337@defopt url-mime-language-string
338@cindex language preferences
339A string specifying the preferred language when servers can serve
340files in several languages. Use RFC 1766 abbreviations, e.g.,
341@samp{en} for English, @samp{de} for German.
342
343The string can be @code{"*"} to get the first available language (as
344opposed to the default).
345@end defopt
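
Because these are ordinary variables, a program can also bind one
around a single request.  The sketch below assumes the binding is
still in effect when the request headers are generated, which should
hold for a synchronous retrieval.  The function
@code{my-url-fetch-german} is hypothetical:

@example
(require 'url)

(defun my-url-fetch-german (url)
  "Retrieve URL synchronously, preferring German, then English."
  (let ((url-mime-language-string "de, en;q=0.5"))
    (url-retrieve-synchronously url)))
@end example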
346
347@node HTTP URL Options
348@subsection HTTP URL Options
349
350HTTP supports an @samp{OPTIONS} method describing things supported by
351the URL@.
352
353@defun url-http-options url
Returns a property list describing options available for @var{url}.  The
property list members are:
356
357@table @code
358@item methods
359A list of symbols specifying what HTTP methods the resource
360supports.
361
362@item dav
363@cindex DAV
364A list of numbers specifying what DAV protocol/schema versions are
365supported.
366
367@item dasl
368@cindex DASL
A list of supported DASL search types (string form).
370
371@item ranges
372A list of the units available for use in partial document fetches.
373
374@item p3p
375@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
377Currently this is just the raw header contents.
378@end table
379
380@end defun
381
382@node Dealing with HTTP documents
383@subsection Dealing with HTTP documents
384
385HTTP URLs are retrieved into a buffer containing the HTTP headers
386followed by the body. Since the headers are quasi-MIME, they may be
387processed using the MIME library. @xref{Top,, Emacs MIME,
388emacs-mime, The Emacs MIME Manual}. The URL package provides a
389function to do this in general:
390
391@defun url-decode-text-part handle &optional coding
392This function decodes charset-encoded text in the current buffer. In
393Emacs, the buffer is expected to be unibyte initially and is set to
394multibyte after decoding.
@var{handle} is the MIME handle of the original part.  @var{coding} is an
explicit coding to use, overriding what the MIME headers specify.
The coding system used for the decoding is returned.
398
399Note that this function doesn't deal with @samp{http-equiv} charset
400specifications in HTML @samp{<meta>} elements.
401@end defun
402
403@node file/ftp
404@section file and ftp
405@cindex files
406@cindex FTP
407@cindex File Transfer Protocol
408@cindex compressed files
409@cindex dired
410
411@example
412ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
413file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
414@end example
415
These schemes are defined in RFC 1738.
417@samp{ftp:} and @samp{file:} are synonymous in this library. They
418allow reading arbitrary files from hosts. Either @samp{ange-ftp}
419(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
420hosts. Local files are accessed directly.
421
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
@samp{.bz2}.
426
427@defopt url-directory-index-file
428The filename to look for when indexing a directory, default
429@samp{"index.html"}. If this file exists, and is readable, then it
430will be viewed instead of using @code{dired} to view the directory.
431@end defopt
432
433@node info
434@section info
435@cindex Info
436@cindex Texinfo
437@findex Info-goto-node
438
439@example
440info:@var{file}#@var{node}
441@end example
442
443Info URLs are not officially defined. They invoke
444@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
445@samp{#@var{node}} is optional, defaulting to @samp{Top}.
446
447@node mailto
448@section mailto
449
450@cindex mailto
451@cindex email
A mailto URL will send an email message to the address in the
URL.  For example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
455
456@defopt url-mail-command
457@vindex mail-user-agent
458The function called whenever url needs to send mail. This should
459normally be left to default from @var{mail-user-agent}. @xref{Mail
460Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
461@end defopt
462
463An @samp{X-Url-From} header field containing the URL of the document
464that contained the mailto URL is added if that URL is known.
465
466RFC 2368 extends the definition of mailto URLs in RFC 1738.
467The form of a mailto URL is
468@example
469@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
470@end example
@noindent where an arbitrary number of @var{header}s can be added.  If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents.  Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
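
A sketch of building such a URL in Lisp, assuming
@code{url-hexify-string} from @file{url-util} is used to escape the
header contents (the address and texts are made up):

@example
(require 'url-util)

;; Escape each CONTENTS so that spaces and other reserved
;; characters survive inside the URL.
(concat "mailto:someone@@example.com"
        "?subject=" (url-hexify-string "Bug report")
        "&body=" (url-hexify-string "Hello there"))
     @result{} "mailto:someone@@example.com?subject=Bug%20report&body=Hello%20there"
@end example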
477
478@c Fixme: update
479Email messages are defined in @sc{rfc}822.
480
481@node news/nntp/snews
482@section @code{news}, @code{nntp} and @code{snews}
483@cindex news
484@cindex network news
485@cindex usenet
486@cindex NNTP
487@cindex snews
488
489@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
494
495@table @samp
496@item news:@var{newsgroup}
497Retrieves a list of messages in @var{newsgroup};
498@item news:@var{message-id}
499Retrieves the message with the given @var{message-id};
500@item news:*
501Retrieves a list of all available newsgroups;
502@item nntp://@var{host}:@var{port}/@var{newsgroup}
503@itemx nntp://@var{host}:@var{port}/@var{message-id}
504@itemx nntp://@var{host}:@var{port}/*
505Similar to the @samp{news} versions.
506@end table
507
508@samp{:@var{port}} is optional and defaults to :119.
509
510@samp{snews} is the same as @samp{nntp} except that the default port
511is :563.
512@cindex SSL
513(It is tunneled through SSL.)
514
515An @samp{nntp} URL is the same as a news URL, except that the URL may
516specify an article by its number.
517
518@defopt url-news-server
519This variable can be used to override the default news server.
520Usually this will be set by the Gnus package, which is used to fetch
521news.
522@cindex environment variable
523@vindex NNTPSERVER
524It may be set from the conventional environment variable
525@code{NNTPSERVER}.
526@end defopt
527
528@node rlogin/telnet/tn3270
529@section rlogin, telnet and tn3270
530@cindex rlogin
531@cindex telnet
532@cindex tn3270
533@cindex terminal emulation
534@findex terminal-emulator
535
536These URL schemes from RFC 1738 for logon via a terminal emulator have
537the form
538@example
539telnet://@var{user}:@var{password}@@@var{host}:@var{port}
540@end example
541but the @code{:@var{password}} component is ignored.
542
543To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
544@code{telnet} or @code{tn3270} (the program names and arguments are
545hardcoded) session is run in a @code{terminal-emulator} buffer.
546Well-known ports are used if the URL does not specify a port.
547
548@node irc
549@section irc
550@cindex IRC
551@cindex Internet Relay Chat
552@cindex ZEN IRC
553@cindex ERC
554@cindex rcirc
555@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
556@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
557session to a function named in @code{url-irc-function}.
558
559@defopt url-irc-function
560A function to actually open an IRC connection.
This function
must take five arguments, @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}.  The @var{channel} argument specifies the
channel to join immediately; it can be @code{nil}.  By default this is
@code{url-irc-rcirc}.
566@end defopt
567@defun url-irc-rcirc host port channel user password
568Processes the arguments and lets @code{rcirc} handle the session.
569@end defun
570@defun url-irc-erc host port channel user password
571Processes the arguments and lets @code{ERC} handle the session.
572@end defun
573@defun url-irc-zenirc host port channel user password
574Processes the arguments and lets @code{zenirc} handle the session.
575@end defun
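
Any function with the same five-argument signature can be installed.
As a sketch, the hypothetical handler below only reports the
requested session instead of connecting:

@example
(defun my-url-irc-describe (host port channel user _password)
  "Describe the IRC session requested by an irc URL without opening it."
  (message "IRC session: %s:%s, channel %s, user %s"
           host port
           (or channel "(none)")    ; CHANNEL may be nil
           (or user "(default)")))

(setq url-irc-function #'my-url-irc-describe)
@end example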
576
577@node data
578@section data
579@cindex data URLs
580
581@example
582data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
583@end example
584
585Data URLs contain MIME data in the URL itself. They are defined in
586RFC 2397.
587
588@var{media-type} is a MIME @samp{Content-Type} string, possibly
589including parameters. It defaults to
590@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
591omitted but the charset parameter supplied. If @samp{;base64} is
592present, the @var{data} are base64-encoded.
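
For illustration, a base64 data URL can be assembled with the
built-in @code{base64-encode-string}; the media type and text here
are arbitrary:

@example
;; "Hello" encodes to "SGVsbG8=" in base64.
(concat "data:text/plain;base64,"
        (base64-encode-string "Hello"))
     @result{} "data:text/plain;base64,SGVsbG8="
@end example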
593
594@node nfs
595@section nfs
596@cindex NFS
597@cindex Network File System
598@cindex automounter
599
600@example
601nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
602@end example
603
604The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
605@samp{ftp:} except that it points to a file on a remote host that is
606handled by the automounter on the local host.
607
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter.  Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
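
The effect of the substitutions can be pictured with the standard
@code{format-spec} function.  This is only an analogy under an
assumed spec string and host name; the library performs the expansion
with its own code:

@example
(require 'format-spec)

;; %h is replaced by the host and %f by the file name.
(format-spec "/net/%h%f"
             '((?h . "fileserver.example.com")
               (?f . "/pub/README")))
     @result{} "/net/fileserver.example.com/pub/README"
@end example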
629
630@node cid
631@section cid
632@cindex Content-ID
633
The @samp{cid:} scheme is defined in RFC 2111.
635
636@node about
637@section about
638
639@node ldap
640@section ldap
641@cindex LDAP
642@cindex Lightweight Directory Access Protocol
643
644The LDAP scheme is defined in RFC 2255.
645
646@node imap
647@section imap
648@cindex IMAP
649
The @samp{imap:} scheme is defined in RFC 2192.
651
652@node man
653@section man
654@cindex @command{man}
655@cindex Unix man pages
656@findex man
657
658@example
659@samp{man:@var{page-spec}}
660@end example
661
662This is a non-standard scheme. @var{page-spec} is passed directly to
663the Lisp @code{man} function.
664
665@node Defining New URLs
666@chapter Defining New URLs
667
668@menu
669* Naming conventions::
670* Required functions::
671* Optional functions::
672* Asynchronous fetching::
673* Supporting file-name-handlers::
674@end menu
675
676@node Naming conventions
677@section Naming conventions
678
679@node Required functions
680@section Required functions
681
682@node Optional functions
683@section Optional functions
684
685@node Asynchronous fetching
686@section Asynchronous fetching
687
688@node Supporting file-name-handlers
689@section Supporting file-name-handlers
690
691@node General Facilities
692@chapter General Facilities
693
694@menu
695* Disk Caching::
696* Proxies::
697* Gateways in general::
698* History::
699@end menu
700
701@node Disk Caching
702@section Disk Caching
703@cindex Caching
704@cindex Persistent Cache
705@cindex Disk Cache
706
707The disk cache stores retrieved documents locally, whence they can be
708retrieved more quickly. When requesting a URL that is in the cache,
709the library checks to see if the page has changed since it was last
710retrieved from the remote machine. If not, the local copy is used,
711saving the transmission over the network.
712@cindex Cleaning the cache
713@cindex Clearing the cache
714@cindex Cache cleaning
715Currently the cache isn't cleared automatically.
716@c Running the @code{clean-cache} shell script
717@c fist is recommended, to allow for future cleaning of the cache. This
718@c shell script will remove all files that have not been accessed since it
719@c was last run. To keep the cache pared down, it is recommended that this
720@c script be run from @i{at} or @i{cron} (see the manual pages for
721@c crontab(5) or at(1) for more information)
722
723@defopt url-automatic-caching
724Setting this variable non-@code{nil} causes documents to be cached
725automatically.
726@end defopt
727
728@defopt url-cache-directory
729This variable specifies the
730directory to store the cache files. It defaults to sub-directory
731@file{cache} of @code{url-configuration-directory}.
732@end defopt
733
734@c Fixme: function v. option, but neither used.
735@c @findex url-cache-expired
736@c @defopt url-cache-expired
737@c This is a function to decide whether or not a cache entry has expired.
738@c It takes two times as it parameters and returns non-@code{nil} if the
739@c second time is ``too old'' when compared with the first time.
740@c @end defopt
741
742@defopt url-cache-creation-function
743The cache relies on a scheme for mapping URLs to files in the cache.
744This variable names a function which sets the type of cache to use.
745It takes a URL as argument and returns the absolute file name of the
746corresponding cache file. The two supplied possibilities are
747@code{url-cache-create-filename-using-md5} and
748@code{url-cache-create-filename-human-readable}.
749@end defopt
750
751@defun url-cache-create-filename-using-md5 url
Creates a cache file name from @var{url} using MD5 hashing.
This creates entries with very few cache collisions and is fast.
754@cindex MD5
755@smallexample
756(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
757 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
758@end smallexample
759@end defun
760
761@defun url-cache-create-filename-human-readable url
762Creates a cache file name from @var{url} more obviously connected to
763@var{url} than for @code{url-cache-create-filename-using-md5}, but
764more likely to conflict with other files.
765@smallexample
766(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
767 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
768@end smallexample
769@end defun
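
To turn caching on with human-readable file names, a configuration
such as the following could be used.  This is a sketch; whether this
naming scheme suits you depends on how likely conflicts are in your
cache:

@example
(setq url-automatic-caching t
      url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example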
770
771@c Fixme: never actually used currently?
772@c @defopt url-standalone-mode
773@c @cindex Relying on cache
774@c @cindex Cache only mode
775@c @cindex Standalone mode
776@c If this variable is non-@code{nil}, the library relies solely on the
777@c cache for fetching documents and avoids checking if they have changed
778@c on remote servers.
779@c @end defopt
780
781@c With a large cache of documents on the local disk, it can be very handy
782@c when traveling, or any other time the network connection is not active
783@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
784@c solely on its cache, and avoid checking to see if the page has changed
785@c on the remote server. In the case of a dial-on-demand PPP connection,
786@c this will keep the phone line free as long as possible, only bringing up
787@c the PPP connection when asking for a page that is not located in the
788@c cache. This is very useful for demonstrations as well.
789
790@node Proxies
791@section Proxies and Gatewaying
792
793@c fixme: check/document url-ns stuff
794@cindex proxy servers
795@cindex proxies
796@cindex environment variables
797@vindex HTTP_PROXY
Proxy servers are commonly used to provide gateways through firewalls
or as caches serving some more-or-less local network.  Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server.  By
convention, many programs share a common proxy configuration through
environment variables of the form @code{@var{protocol}_proxy}, where
@var{protocol} is one of the supported network protocols (@code{http},
@code{ftp} etc.).  The library recognizes such variables in either
upper or lower case.  Their values are of one of the forms:
806@itemize @bullet
807@item @code{@var{host}:@var{port}}
808@item A full URL;
809@item Simply a host name.
810@end itemize
811
812@vindex NO_PROXY
813The @code{NO_PROXY} environment variable specifies URLs that should be
814excluded from proxying (on servers that should be contacted directly).
815This should be a comma-separated list of hostnames, domain names, or a
816mixture of both. Asterisks can be used as wildcards, but other
817clients may not support that. Domain names may be indicated by a
818leading dot. For example:
819@example
820NO_PROXY="*.aventail.com,home.com,.seanet.com"
821@end example
822@noindent says to contact all machines in the @samp{aventail.com} and
823@samp{seanet.com} domains directly, as well as the machine named
824@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
825and @code{no_proxy} are also tried, in that order.
826
827Proxies may also be specified directly in Lisp.
828
829@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them.  The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}.  An
exception is the pseudo scheme @code{"no_proxy"}, which is paired with
a regexp matching host names not to be proxied.  This variable is
initialized from the environment as above.
837
838@example
839(setq url-proxy-services
840 '(("http" . "proxy.aventail.com:80")
841 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
842@end example
843@end defopt
844
845@node Gateways in general
846@section Gateways in General
847@cindex gateways
848@cindex firewalls
849
850The library provides a general gateway layer through which all
851networking passes. It can both control access to the network and
852provide access through gateways in firewalls. This may make direct
853connections in some cases and pass through some sort of gateway in
854others.@footnote{Proxies (which only operate over HTTP) are
855implemented using this.} The library's basic function responsible for
856making connections is @code{url-open-stream}.
857
858@defun url-open-stream name buffer host service
859@cindex opening a stream
860@cindex stream, opening
861Open a stream to @var{host}, possibly via a gateway. The other
862arguments are as for @code{open-network-stream}. This will not make a
863connection if @code{url-gateway-unplugged} is non-@code{nil}.
864@end defun
865
866@defvar url-gateway-local-host-regexp
867This is a regular expression that matches local hosts that do not
868require the use of a gateway. If @code{nil}, all connections are made
869through the gateway.
870@end defvar
871
872@defvar url-gateway-method
873This variable controls which gateway method is used. It may be useful
874to bind it temporarily in some applications. It has values taken from
875a list of symbols. Possible values are:
876
877@table @code
878@item telnet
879@cindex @command{telnet}
880Use this method if you must first telnet and log into a gateway host,
881and then run telnet from that host to connect to outside machines.
882
883@item rlogin
884@cindex @command{rlogin}
885This method is identical to @code{telnet}, but uses @command{rlogin}
886to log into the remote machine without having to send the username and
887password over the wire every time.
888
889@item socks
890@cindex @sc{socks}
891Use if the firewall has a @sc{socks} gateway running on it. The
892@sc{socks} v5 protocol is defined in RFC 1928.
893
894@c @item ssl
895@c This probably shouldn't be documented
896@c Fixme: why not? -- fx
897
898@item native
899This method uses Emacs's builtin networking directly. This is the
900default. It can be used only if there is no firewall blocking access.
901@end table
902@end defvar
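
Binding the variable temporarily might look like the following
sketch.  The function @code{my-url-fetch-directly} is hypothetical,
and @file{url-gw} is loaded first on the assumption that this is where
the variable is defined, so that the binding is dynamic:

@example
(require 'url)
(require 'url-gw)

(defun my-url-fetch-directly (url)
  "Retrieve URL synchronously using Emacs's built-in networking."
  (let ((url-gateway-method 'native))
    (url-retrieve-synchronously url)))
@end example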

The following variables control the gateway methods.

@defopt url-gateway-telnet-host
The gateway host to telnet to. Once logged in there, you then telnet
out to the hosts you want to connect to.
@end defopt
@defopt url-gateway-telnet-parameters
This should be a list of parameters to pass to the @command{telnet} program.
@end defopt
@defopt url-gateway-telnet-password-prompt
This is a regular expression that matches the password prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-login-prompt
This is a regular expression that matches the username prompt when
logging in.
@end defopt
@defopt url-gateway-telnet-user-name
The username to log in with.
@end defopt
@defopt url-gateway-telnet-password
The password to send when logging in.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt
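
A telnet gateway setup might look roughly like the following. The
host name, user name and prompt regular expressions are invented for
illustration and would need to match the actual gateway.

@example
(require 'url-gw)

;; All values below are hypothetical examples.
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.com"
      url-gateway-telnet-user-name "jrhacker"
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^password: *")
@end example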

@defopt url-gateway-rlogin-host
The host to @samp{rlogin} to before telnetting out.
@end defopt
@defopt url-gateway-rlogin-parameters
Parameters to pass to @samp{rsh}.
@end defopt
@defopt url-gateway-rlogin-user-name
User name to use when logging in to the gateway.
@end defopt
@defopt url-gateway-prompt-pattern
This is a regular expression that matches the shell prompt.
@end defopt

@defopt socks-server
This specifies the default server. It takes the form
@w{@code{("Default server" @var{server} @var{port} @var{version})}}
where @var{version} can be either 4 or 5.
@end defopt
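
For instance, a @sc{socks} version 5 server could be selected as
sketched below. The server name and port number are placeholders,
not defaults of the library.

@example
(require 'socks)

;; "socks.example.com" and 1080 are hypothetical values.
(setq socks-server '("Default server" "socks.example.com" 1080 5)
      url-gateway-method 'socks)
@end example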
@defvar socks-password
If this is @code{nil} then you will be asked for the password,
otherwise it will be used as the password for authenticating you to
the @sc{socks} server.
@end defvar
@defvar socks-username
This is the username to use when authenticating yourself to the
@sc{socks} server. By default this is your login name.
@end defvar
@defvar socks-timeout
This controls how long, in seconds, to wait for responses from the
@sc{socks} server; it is 5 by default.
@end defvar
@c fixme: these have been effectively commented-out in the code
@c @defopt socks-server-aliases
@c This is a list of server aliases. It is a list of aliases of the form
@c @var{(alias hostname port version)}.
@c @end defopt
@c @defopt socks-network-aliases
@c This is a list of network aliases. Each entry in the list takes the form
@c @var{(alias (network))} where @var{alias} is a string that names the
@c @var{network}. The networks can contain a pair (not a dotted pair) of
@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
@c address and a netmask, a domain name or a unique hostname or @sc{ip}
@c address.
@c @end defopt
@c @defopt socks-redirection-rules
@c This is a list of redirection rules. Each rule takes the form
@c @var{(Destination network Connection type)} where @var{Destination
@c network} is a network alias from @code{socks-network-aliases} and
@c @var{Connection type} can be @code{nil} in which case a direct
@c connection is used, or it can be an alias from
@c @code{socks-server-aliases} in which case that server is used as a
@c proxy.
@c @end defopt
@defopt socks-nslookup-program
@cindex @command{nslookup}
This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
@end defopt

@menu
* Suppressing network connections::
@end menu
@c * Broken hostname resolution::

@node Suppressing network connections
@subsection Suppressing Network Connections

@cindex network connections, suppressing
@cindex suppressing network connections
@cindex bugs, HTML
@cindex HTML `bugs'
In some circumstances it is desirable to suppress making network
connections. A typical case is when rendering HTML in a mail user
agent, when external URLs should not be activated, particularly to
avoid `bugs' which `call home' by fetching single-pixel images and the
like. To arrange this, bind the following variable for the duration
of such processing.

@defvar url-gateway-unplugged
If this variable is non-@code{nil}, new network connections are never
opened by the URL library.
@end defvar
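
A mail user agent could wrap its rendering code along the following
lines. The function name @code{my-render-html-offline} and its
argument are hypothetical and serve only to show the binding.

@example
(require 'url-gw)

;; Hypothetical helper, not part of the URL library.
(defun my-render-html-offline (render-function)
  "Call RENDER-FUNCTION with new URL connections suppressed."
  (let ((url-gateway-unplugged t))
    (funcall render-function)))
@end example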

@c @node Broken hostname resolution
@c @subsection Broken Hostname Resolution

@c @cindex hostname resolver
@c @cindex resolver, hostname
@c Some C libraries do not include the hostname resolver routines in
@c their static libraries. If Emacs was linked statically, and was not
@c linked with the resolver libraries, it will not be able to get to any
@c machines off the local network. This is characterized by being able
@c to reach someplace with a raw ip number, but not its hostname
@c (@url{http://129.79.254.191/} works, but
@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
@c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
@c rebuilt linked against the resolver library, it can use the external
@c @command{nslookup} program instead.

@c @defopt url-gateway-broken-resolution
@c @cindex @code{nslookup} program
@c @cindex program, @code{nslookup}
@c If non-@code{nil}, this variable says to use the program specified by
@c @code{url-gateway-nslookup-program} to do hostname resolution.
@c @end defopt

@c @defopt url-gateway-nslookup-program
@c The name of the program to do hostname lookup if Emacs can't do it
@c directly. This program should expect a single argument on the command
@c line---the hostname to resolve---and should produce output similar to
@c the standard Unix @command{nslookup} program:
@c @example
@c Name: www.cs.indiana.edu
@c Address: 129.79.254.191
@c @end example
@c @end defopt

@node History
@section History

@findex url-do-setup
The library can maintain a global history list tracking URLs accessed.
URL completion can be done from it. The history mechanism is set up
automatically via @code{url-do-setup} when it is configured to be on.
Note that the size of the history list is currently not limited.

@vindex url-history-hash-table
The history `list' is actually a hash table,
@code{url-history-hash-table}. It contains access times keyed by URL
strings. The times are in the format returned by @code{current-time}.

@defun url-history-update-url url time
This function updates the history table with an entry for @var{url}
accessed at the given @var{time}.
@end defun
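
Since the history is an ordinary hash table, it can be inspected with
the usual hash table functions. The following sketch records an
access and then looks it up again; the URL is a placeholder.

@example
(require 'url-history)

;; Record an access to a placeholder URL at the current time.
(url-history-update-url "http://www.gnu.org/" (current-time))

;; Look up the stored access time, which is in the format
;; returned by `current-time'.
(gethash "http://www.gnu.org/" url-history-hash-table)
@end example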

@defopt url-history-track
If non-@code{nil}, the library will keep track of all the URLs
accessed. If it is @code{t}, the list is saved to disk at the end of
each Emacs session. The default is @code{nil}.
@end defopt

@defopt url-history-file
The file storing the history list between sessions. It defaults to
@file{history} in @code{url-configuration-directory}.
@end defopt

@defopt url-history-save-interval
@findex url-history-setup-save-timer
The number of seconds between automatic saves of the history list.
The default is one hour. Note that if you change this variable directly,
rather than using Custom, after @code{url-do-setup} has been run, you
need to run the function @code{url-history-setup-save-timer}.
@end defopt
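
Changing the interval from Lisp, rather than through Custom, might
therefore look like this. The value of 600 seconds is arbitrary and
chosen only for illustration.

@example
(require 'url-history)

;; 600 seconds is an arbitrary illustrative value.
(setq url-history-save-interval 600)
;; Restart the save timer so that the new interval takes effect.
(url-history-setup-save-timer)
@end example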

@defun url-history-parse-history &optional fname
Parses the history file @var{fname} (default @code{url-history-file})
and sets up the history list.
@end defun

@defun url-history-save-history &optional fname
Saves the current history to file @var{fname} (default
@code{url-history-file}).
@end defun

@defun url-completion-function string predicate function
You can use this function to do completion of URLs from the history.
@end defun
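
One plausible use, assuming the function is passed as the completion
table argument of @code{completing-read}, is to prompt for a URL with
completion against the history. The prompt string is arbitrary.

@example
(require 'url-history)

;; Assumes `url-completion-function' is usable as a
;; programmed-completion table, as its argument list suggests.
(completing-read "URL: " 'url-completion-function)
@end example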

@node Customization
@chapter Customization

@section Environment Variables

@cindex environment variables
The following environment variables affect the library's operation at
startup.

@table @code
@item TMPDIR
@vindex TMPDIR
@vindex url-temporary-directory
If this is defined, @code{url-temporary-directory} is initialized from
it.
@end table

@section General User Options

The following user options, settable with Customize, affect the
general operation of the package.

@defopt url-debug
@cindex debugging
Specifies the types of debug messages which are logged to
the @code{*URL-DEBUG*} buffer.
@code{t} means log all messages.
A number means log all messages and show them with @code{message}.
It may also be a list of the types of messages to be logged.
@end defopt
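
The simplest way to turn on logging from Lisp is to set the option to
@code{t}. The list form shown in the comment is only a guess at the
shape of such a value: the symbols @code{http} and @code{retrieval}
are assumed message types and should be checked against the option's
Custom definition.

@example
;; Log every debug message to the *URL-DEBUG* buffer.
(setq url-debug t)

;; Assumed alternative: restrict logging to particular types.
;; (setq url-debug '(http retrieval))
@end example
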
@defopt url-personal-mail-address
@end defopt
@defopt url-privacy-level
@end defopt
@defopt url-uncompressor-alist
@end defopt
@defopt url-passwd-entry-func
@end defopt
@defopt url-standalone-mode
@end defopt
@defopt url-bad-port-list
@end defopt
@defopt url-max-password-attempts
@end defopt
@defopt url-temporary-directory
@end defopt
@defopt url-show-status
@end defopt
@defopt url-confirmation-func
The function to use for asking yes-or-no questions. This is normally
either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
function taking a single argument (the prompt) and returning @code{t}
only if an affirmative answer is given.
@end defopt
@defopt url-gateway-method
@c fixme: describe gatewaying
A symbol specifying the type of gateway support to use for connections
from the local machine. The supported methods are:

@table @code
@item telnet
Run telnet in a subprocess to connect;
@item rlogin
Rlogin to another machine to connect;
@item socks
Connect through a socks server;
@item ssl
Connect with SSL;
@item native
Connect directly.
@end table
@end defopt

@node GNU Free Documentation License
@appendix GNU Free Documentation License
@include doclicense.texi

@node Function Index
@unnumbered Command and Function Index
@printindex fn

@node Variable Index
@unnumbered Variable Index
@printindex vr

@node Concept Index
@unnumbered Concept Index
@printindex cp

@bye

@ignore
 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
@end ignore