\input texinfo
@setfilename ../../info/url
@settitle URL Programmer's Manual
4
5@iftex
6@c @finalout
7@end iftex
8@c @setchapternewpage odd
9@c @smallbook
10
11@tex
12\overfullrule=0pt
13%\global\baselineskip 30pt % for printing in double space
14@end tex
15@dircategory World Wide Web
16@dircategory GNU Emacs Lisp
17@direntry
18* URL: (url). URL loading package.
19@end direntry
20
@copying
This file documents the URL loading package.

Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.2 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below. A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual. Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''
@end quotation
@end copying
40
41@c
@titlepage
@title URL Programmer's Manual
@subtitle First Edition, URL Version 2.0
@author William M. Perry @email{wmperry@@gnu.org}
@author David Love @email{fx@@gnu.org}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@page
53@node Top
54@top URL
55
56
57@menu
58* Getting Started:: Preparing your program to use URLs.
59* Retrieving URLs:: How to use this package to retrieve a URL.
60* Supported URL Types:: Descriptions of URL types currently supported.
61* Defining New URLs:: How to define a URL loader for a new protocol.
62* General Facilities:: URLs can be cached, accessed via a gateway
63 and tracked in a history list.
64* Customization:: Variables you can alter.
65* GNU Free Documentation License:: The license for this documentation.
66* Function Index::
67* Variable Index::
68* Concept Index::
69@end menu
70
71@node Getting Started
72@chapter Getting Started
73@cindex URLs, definition
74@cindex URIs
75
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808. RFC 2016 defines uniform resource
agents.
80
81URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
82@var{scheme}s supported by this library are described below.
83@xref{Supported URL Types}.
84
85FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
86IRC and gopher URLs all have the form
87
88@example
89@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
90@end example
91@noindent
92where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
93@var{userinfo} sometimes takes the form @var{username}:@var{password}
94but you should beware of the security risks of sending cleartext
95passwords. @var{hostname} may be a domain name or a dotted decimal
address. If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs. With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have undesired
consequences if a different service is listening on that port (e.g.,
an HTTP URL specifying the SMTP port can cause mail to be sent).
@c , but @xref{Other Variables, url-bad-port-list}.
The meaning of the @var{path} component depends on the service.
104
105@menu
106* Configuration::
107* Parsed URLs:: URLs are parsed into vector structures.
108@end menu
109
110@node Configuration
111@section Configuration
112
113@defvar url-configuration-directory
114@cindex @file{~/.url}
115@cindex configuration files
116The directory in which URL configuration files, the cache etc.,
117reside. Default @file{~/.url}.
118@end defvar
119
120@node Parsed URLs
121@section Parsed URLs
122@cindex parsed URLs
123The library functions typically operate on @dfn{parsed} versions of
124URLs. These are actually vectors of the form:
125
126@example
127[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
128@end example
129
130@noindent where
131@table @var
132@item type
is the type of the URL scheme, e.g., @code{http};
134@item user
135is the username associated with it, or @code{nil};
136@item password
137is the user password associated with it, or @code{nil};
138@item host
139is the host name associated with it, or @code{nil};
140@item port
141is the port number associated with it, or @code{nil};
142@item file
143is the `file' part of it, or @code{nil}. This doesn't necessarily
144actually refer to a file;
145@item target
146is the target part, or @code{nil};
147@item attributes
148is the attributes associated with it, or @code{nil};
149@item full
150is @code{t} for a fully-specified URL, with a host part indicated by
151@samp{//} after the scheme part.
152@end table
153
154@findex url-type
155@findex url-user
156@findex url-password
157@findex url-host
158@findex url-port
159@findex url-file
160@findex url-target
161@findex url-attributes
162@findex url-full
163@findex url-set-type
164@findex url-set-user
165@findex url-set-password
166@findex url-set-host
167@findex url-set-port
168@findex url-set-file
169@findex url-set-target
170@findex url-set-attributes
171@findex url-set-full
172These attributes have accessors named @code{url-@var{part}}, where
173@var{part} is the name of one of the elements above, e.g.,
174@code{url-host}. Similarly, there are setters of the form
175@code{url-set-@var{part}}.
176
177There are functions for parsing and unparsing between the string and
178vector forms.
179
180@defun url-generic-parse-url url
181Return a parsed version of the string @var{url}.
182@end defun
183
184@defun url-recreate-url url
185@cindex unparsing URLs
186Recreates a URL string from the parsed @var{url}.
187@end defun
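
As a minimal sketch of the parsing functions above, the following
helper parses a URL string and reads two of its parts through the
accessors. The name @code{my-url-summary} is hypothetical and is not
part of the library.

@example
(require 'url-parse)

;; Hypothetical helper, for illustration only.
(defun my-url-summary (string)
  "Return a list (TYPE HOST) for the URL in STRING."
  (let ((parsed (url-generic-parse-url string)))
    (list (url-type parsed)
          (url-host parsed))))

(my-url-summary "http://www.example.com/foo/bar")
     @result{} ("http" "www.example.com")
@end example

Going through the accessors rather than indexing the parsed object
directly keeps calling code independent of the internal layout.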
188
189@node Retrieving URLs
190@chapter Retrieving URLs
191
192@defun url-retrieve-synchronously url
193Retrieve @var{url} synchronously and return a buffer containing the
194data. @var{url} is either a string or a parsed URL structure. Return
195@code{nil} if there are no data associated with it (the case for dired,
196info, or mailto URLs that need no further processing).
197@end defun
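
A synchronous fetch might be wrapped as in the sketch below. The
helper name @code{my-url-fetch-to-string} is hypothetical; the sketch
assumes the caller wants the raw buffer contents and no longer needs
the buffer afterwards.

@example
(require 'url)

;; Hypothetical helper, for illustration only.
(defun my-url-fetch-to-string (url)
  "Return the data retrieved for URL as a string, or nil if none."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (buffer-string))
        ;; The library leaves the buffer alive; clean it up here.
        (kill-buffer buffer)))))
@end example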
198
199@defun url-retrieve url callback &optional cbargs
200Retrieve @var{url} asynchronously and call @var{callback} with args
201@var{cbargs} when finished. The callback is called when the object
202has been completely retrieved, with the current buffer containing the
203object and any MIME headers associated with it. @var{url} is either a
204string or a parsed URL structure. Returns the buffer @var{url} will
205load into, or @code{nil} if the process has already completed.
206@end defun
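
The following sketch shows one way to use the asynchronous interface.
The callback name @code{my-url-report-size} and the @code{"tag"}
argument are hypothetical. The callback accepts any number of
arguments, because some versions of the library may pass extra
arguments ahead of @var{cbargs}; check the docstring of
@code{url-retrieve} in your Emacs for the exact calling convention.

@example
(require 'url)

;; Hypothetical callback, for illustration only.
(defun my-url-report-size (&rest args)
  "Report the size of the current (retrieved) buffer.
ARGS holds whatever the library passes, ending with the CBARGS."
  (message "Retrieved %d characters (callback args: %S)"
           (buffer-size) args))

(url-retrieve "http://www.example.com/" #'my-url-report-size '("tag"))
@end example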
207
208@node Supported URL Types
209@chapter Supported URL Types
210
211@menu
212* http/https:: Hypertext Transfer Protocol.
213* file/ftp:: Local files and FTP archives.
214* info:: Emacs `Info' pages.
215* mailto:: Sending email.
216* news/nntp/snews:: Usenet news.
217* rlogin/telnet/tn3270:: Remote host connectivity.
218* irc:: Internet Relay Chat.
219* data:: Embedded data URLs.
220* nfs:: Networked File System
221@c * finger::
222@c * gopher::
223@c * netrek::
224@c * prospero::
225* cid:: Content-ID.
226* about::
227* ldap:: Lightweight Directory Access Protocol
228* imap:: IMAP mailboxes.
229* man:: Unix man pages.
230@end menu
231
232@node http/https
233@section @code{http} and @code{https}
234
The scheme @code{http} is Hypertext Transfer Protocol. The library
supports version 1.1, specified in RFC 2616. (This supersedes 1.0,
defined in RFC 1945.) HTTP URLs have the following form, where most of
the parts are optional:
239@example
240http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
241@end example
242@c The @code{:@var{port}} part is optional, and @var{port} defaults to
243@c 80. The @code{/@var{path}} part, if present, is a slash-separated
244@c series elements. The @code{?@var{searchpart}}, if present, is the
245@c query for a search or the content of a form submission. The
246@c @code{#fragment} part, if present, is a location in the document.
247
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL. It is defined in RFC 2818. Its default port is
443. This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used. @xref{Gateways in general}.
253
254@defopt url-honor-refresh-requests
This controls honoring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one. @code{nil} means they will not be honored,
@code{t} (the default) means they will always be honored, and
otherwise the user will be asked on each request.
260@end defopt
261
262
263@menu
264* Cookies::
265* HTTP language/coding::
266* HTTP URL Options::
267* Dealing with HTTP documents::
268@end menu
269
270@node Cookies
271@subsection Cookies
272
273@defopt url-cookie-file
274The file in which cookies are stored, defaulting to @file{cookies} in
275the directory specified by @code{url-configuration-directory}.
276@end defopt
277
278@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
280@end defopt
281
282@defopt url-cookie-multiple-line
283Specifies whether to put all cookies for the server on one line in the
284HTTP request to satisfy broken servers like
285@url{http://www.hotmail.com}.
286@end defopt
287
288@defopt url-cookie-trusted-urls
289A list of regular expressions matching URLs from which to accept
290cookies always.
291@end defopt
292
293@defopt url-cookie-untrusted-urls
294A list of regular expressions matching URLs from which to reject
295cookies always.
296@end defopt
297
298@defopt url-cookie-save-interval
299The number of seconds between automatic saves of cookies to disk.
300Default is one hour.
301@end defopt
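
As an illustration, the cookie options above might be combined in an
init file as follows. The host names and the 30-minute interval are
made-up values chosen for the example.

@example
(setq url-cookie-confirmation t
      ;; Always accept cookies from this (hypothetical) site ...
      url-cookie-trusted-urls '("^https?://www\\.example\\.com/")
      ;; ... and always reject them from this one.
      url-cookie-untrusted-urls '("^https?://ads\\.example\\.net/")
      ;; Save cookies every 30 minutes instead of every hour.
      url-cookie-save-interval 1800)
@end example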
302
303
304@node HTTP language/coding
305@subsection Language and Encoding Preferences
306
HTTP allows clients to express preferences for the language and
encoding of documents which servers may honor. For each of these
variables, the value is a string; it can specify a single choice, or
it can be a comma-separated list.

Normally this list is ordered by descending preference. However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}. An element that has no @samp{;q} specification has
preference level 1.
318
319@defopt url-mime-charset-string
320@cindex character sets
321@cindex coding systems
322This variable specifies a preference for character sets when documents
323can be served in more than one encoding.
324
325HTTP allows specifying a series of MIME charsets which indicate your
326preferred character set encodings, e.g., Latin-9 or Big5, and these
327can be weighted. The default series is generated automatically from
328the associated MIME types of all defined coding systems, sorted by the
329coding system priority specified in Emacs. @xref{Recognize Coding, ,
330Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
331@end defopt
332
333@defopt url-mime-language-string
334@cindex language preferences
335A string specifying the preferred language when servers can serve
336files in several languages. Use RFC 1766 abbreviations, e.g.,
337@samp{en} for English, @samp{de} for German.
338
339The string can be @code{"*"} to get the first available language (as
340opposed to the default).
341@end defopt
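
For instance, a user who prefers German, then British English, then
any English, and who prefers UTF-8 over Latin-1, could use settings
along these lines. The particular weights are arbitrary example
values.

@example
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7"
      url-mime-charset-string "utf-8, iso-8859-1;q=0.5")
@end example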
342
343@node HTTP URL Options
344@subsection HTTP URL Options
345
346HTTP supports an @samp{OPTIONS} method describing things supported by
347the URL@.
348
349@defun url-http-options url
Returns a property list describing options available for @var{url}. The
property list members are:
352
353@table @code
354@item methods
355A list of symbols specifying what HTTP methods the resource
356supports.
357
358@item dav
359@cindex DAV
360A list of numbers specifying what DAV protocol/schema versions are
361supported.
362
363@item dasl
364@cindex DASL
A list of supported DASL search types (string form).
366
367@item ranges
368A list of the units available for use in partial document fetches.
369
370@item p3p
371@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
Currently this is just the raw header contents.
374@end table
375
376@end defun
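
A possible use of @code{url-http-options} is sketched below. It
assumes that the function is provided by @file{url-http.el} and that
the server answers an @samp{OPTIONS} request; the host name is a
placeholder.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  ;; Each member is looked up by the symbols listed in the table above.
  (message "Methods: %S, ranges: %S"
           (plist-get options 'methods)
           (plist-get options 'ranges)))
@end example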
377
378@node Dealing with HTTP documents
379@subsection Dealing with HTTP documents
380
381HTTP URLs are retrieved into a buffer containing the HTTP headers
382followed by the body. Since the headers are quasi-MIME, they may be
383processed using the MIME library. @xref{Top,, Emacs MIME,
384emacs-mime, The Emacs MIME Manual}. The URL package provides a
385function to do this in general:
386
387@defun url-decode-text-part handle &optional coding
This function decodes charset-encoded text in the current buffer. In
Emacs, the buffer is expected to be unibyte initially and is set to
multibyte after decoding.
@var{handle} is the MIME handle of the original part. @var{coding} is
an explicit coding to use, overriding what the MIME headers specify.
The coding system used for the decoding is returned.
394
395Note that this function doesn't deal with @samp{http-equiv} charset
396specifications in HTML @samp{<meta>} elements.
397@end defun
398
399@node file/ftp
400@section file and ftp
401@cindex files
402@cindex FTP
403@cindex File Transfer Protocol
404@cindex compressed files
405@cindex dired
406
407@example
408ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
409file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
410@end example
411
These schemes are defined in RFC 1738.
413@samp{ftp:} and @samp{file:} are synonymous in this library. They
414allow reading arbitrary files from hosts. Either @samp{ange-ftp}
415(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
416hosts. Local files are accessed directly.
417
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
@samp{.bz2}.
422
423@defopt url-directory-index-file
424The filename to look for when indexing a directory, default
425@samp{"index.html"}. If this file exists, and is readable, then it
426will be viewed instead of using @code{dired} to view the directory.
427@end defopt
428
429@node info
430@section info
431@cindex Info
432@cindex Texinfo
433@findex Info-goto-node
434
435@example
436info:@var{file}#@var{node}
437@end example
438
439Info URLs are not officially defined. They invoke
440@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
441@samp{#@var{node}} is optional, defaulting to @samp{Top}.
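
The mapping from an info URL to an @code{Info-goto-node} argument can
be sketched as follows. The function @code{my-info-url-to-node} is
hypothetical and only illustrates the rule described above; it is not
the library's own implementation.

@example
;; Hypothetical helper, for illustration only.
(defun my-info-url-to-node (url)
  "Convert an info URL to a node name usable by `Info-goto-node'."
  (when (string-match "\\`info:\\([^#]+\\)\\(?:#\\(.*\\)\\)?\\'" url)
    (format "(%s)%s"
            (match-string 1 url)
            ;; A missing node part defaults to "Top".
            (or (match-string 2 url) "Top"))))

(my-info-url-to-node "info:emacs#Buffers")
     @result{} "(emacs)Buffers"
(my-info-url-to-node "info:emacs")
     @result{} "(emacs)Top"
@end example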
442
443@node mailto
444@section mailto
445
446@cindex mailto
447@cindex email
A mailto URL will send an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
451
452@defopt url-mail-command
453@vindex mail-user-agent
The function called whenever url needs to send mail. This should
normally be left to default from @code{mail-user-agent}. @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
457@end defopt
458
459An @samp{X-Url-From} header field containing the URL of the document
460that contained the mailto URL is added if that URL is known.
461
462RFC 2368 extends the definition of mailto URLs in RFC 1738.
463The form of a mailto URL is
464@example
465@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
466@end example
@noindent where an arbitrary number of @var{header}s can be added. If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents. Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
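
A mailto URL of this form could be assembled as in the sketch below.
The helper @code{my-mailto-url} is hypothetical; it relies on
@code{url-hexify-string} from @file{url-util.el} to encode the header
contents, and it does not validate @var{mailbox}.

@example
(require 'url-util)

;; Hypothetical helper, for illustration only.
(defun my-mailto-url (mailbox subject body)
  "Return a mailto URL for MAILBOX with SUBJECT and BODY hex-encoded."
  (format "mailto:%s?subject=%s&body=%s"
          mailbox
          (url-hexify-string subject)
          (url-hexify-string body)))

(my-mailto-url "foo@@bar.com" "Hello" "Hi there")
     @result{} "mailto:foo@@bar.com?subject=Hello&body=Hi%20there"
@end example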
473
474@c Fixme: update
Email messages are defined in RFC 822.
476
477@node news/nntp/snews
478@section @code{news}, @code{nntp} and @code{snews}
479@cindex news
480@cindex network news
481@cindex usenet
482@cindex NNTP
483@cindex snews
484
485@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
490
491@table @samp
492@item news:@var{newsgroup}
493Retrieves a list of messages in @var{newsgroup};
494@item news:@var{message-id}
495Retrieves the message with the given @var{message-id};
496@item news:*
497Retrieves a list of all available newsgroups;
498@item nntp://@var{host}:@var{port}/@var{newsgroup}
499@itemx nntp://@var{host}:@var{port}/@var{message-id}
500@itemx nntp://@var{host}:@var{port}/*
501Similar to the @samp{news} versions.
502@end table
503
504@samp{:@var{port}} is optional and defaults to :119.
505
506@samp{snews} is the same as @samp{nntp} except that the default port
507is :563.
508@cindex SSL
509(It is tunneled through SSL.)
510
511An @samp{nntp} URL is the same as a news URL, except that the URL may
512specify an article by its number.
513
514@defopt url-news-server
515This variable can be used to override the default news server.
516Usually this will be set by the Gnus package, which is used to fetch
517news.
518@cindex environment variable
519@vindex NNTPSERVER
520It may be set from the conventional environment variable
521@code{NNTPSERVER}.
522@end defopt
523
524@node rlogin/telnet/tn3270
525@section rlogin, telnet and tn3270
526@cindex rlogin
527@cindex telnet
528@cindex tn3270
529@cindex terminal emulation
530@findex terminal-emulator
531
532These URL schemes from RFC 1738 for logon via a terminal emulator have
533the form
534@example
535telnet://@var{user}:@var{password}@@@var{host}:@var{port}
536@end example
537but the @code{:@var{password}} component is ignored.
538
539To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
540@code{telnet} or @code{tn3270} (the program names and arguments are
541hardcoded) session is run in a @code{terminal-emulator} buffer.
542Well-known ports are used if the URL does not specify a port.
543
544@node irc
545@section irc
546@cindex IRC
547@cindex Internet Relay Chat
548@cindex ZEN IRC
549@cindex ERC
550@cindex rcirc
551@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
552@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
553session to a function named in @code{url-irc-function}.
554
555@defopt url-irc-function
A function to actually open an IRC connection.
This function
must take five arguments, @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}. The @var{channel} argument specifies the
channel to join immediately; this can be @code{nil}. By default this is
@code{url-irc-rcirc}.
562@end defopt
563@defun url-irc-rcirc host port channel user password
564Processes the arguments and lets @code{rcirc} handle the session.
565@end defun
566@defun url-irc-erc host port channel user password
567Processes the arguments and lets @code{ERC} handle the session.
568@end defun
569@defun url-irc-zenirc host port channel user password
570Processes the arguments and lets @code{zenirc} handle the session.
571@end defun
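
A custom handler only needs to accept the five arguments described
above. The sketch below uses a hypothetical function,
@code{my-url-irc-log}, that merely reports the request instead of
opening a connection.

@example
;; Hypothetical handler, for illustration only.
(defun my-url-irc-log (host port channel user password)
  "Report the IRC connection requested by an irc URL."
  (ignore password)                     ; never echo the password
  (message "IRC request: %s@@%s:%s, channel %s"
           (or user "anonymous") host port (or channel "(none)")))

(setq url-irc-function 'my-url-irc-log)

;; Alternatively, hand sessions to ERC instead of rcirc:
;; (setq url-irc-function 'url-irc-erc)
@end example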
572
573@node data
574@section data
575@cindex data URLs
576
577@example
578data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
579@end example
580
581Data URLs contain MIME data in the URL itself. They are defined in
582RFC 2397.
583
@var{media-type} is a MIME @samp{Content-Type} string, possibly
including parameters. It defaults to
@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
omitted while the charset parameter is still supplied. If @samp{;base64} is
present, the @var{data} are base64-encoded.
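
As a small worked example, the following sketch builds a
base64-encoded data URL from an ASCII string. The helper
@code{my-data-url} is hypothetical; it uses the built-in
@code{base64-encode-string}.

@example
;; Hypothetical helper, for illustration only.
(defun my-data-url (string)
  "Return a base64 data URL holding STRING as text/plain.
STRING is assumed to contain only ASCII characters."
  (concat "data:text/plain;base64,"
          (base64-encode-string string t)))

(my-data-url "Hello")
     @result{} "data:text/plain;base64,SGVsbG8="
@end example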
589
590@node nfs
591@section nfs
592@cindex NFS
593@cindex Network File System
594@cindex automounter
595
596@example
597nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
598@end example
599
600The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
601@samp{ftp:} except that it points to a file on a remote host that is
602handled by the automounter on the local host.
603
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter. Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
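
For example, on a system where the automounter makes remote hosts
available under @file{/net}, a setting along the following lines might
be used. The @file{/net} mount point is an assumption about the local
setup, and the sketch further assumes that @samp{%f} expands to a
file name with its leading slash.

@example
;; Assumed layout: the automounter exposes hosts as /net/HOST/...
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example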
625
626@node cid
627@section cid
628@cindex Content-ID
629
The @samp{cid:} scheme is defined in RFC 2111.
631
632@node about
633@section about
634
635@node ldap
636@section ldap
637@cindex LDAP
638@cindex Lightweight Directory Access Protocol
639
640The LDAP scheme is defined in RFC 2255.
641
642@node imap
643@section imap
644@cindex IMAP
645
The @samp{imap:} scheme is defined in RFC 2192.
647
648@node man
649@section man
650@cindex @command{man}
651@cindex Unix man pages
652@findex man
653
654@example
655@samp{man:@var{page-spec}}
656@end example
657
658This is a non-standard scheme. @var{page-spec} is passed directly to
659the Lisp @code{man} function.
660
661@node Defining New URLs
662@chapter Defining New URLs
663
664@menu
665* Naming conventions::
666* Required functions::
667* Optional functions::
668* Asynchronous fetching::
669* Supporting file-name-handlers::
670@end menu
671
672@node Naming conventions
673@section Naming conventions
674
675@node Required functions
676@section Required functions
677
678@node Optional functions
679@section Optional functions
680
681@node Asynchronous fetching
682@section Asynchronous fetching
683
684@node Supporting file-name-handlers
685@section Supporting file-name-handlers
686
687@node General Facilities
688@chapter General Facilities
689
690@menu
691* Disk Caching::
692* Proxies::
693* Gateways in general::
694* History::
695@end menu
696
697@node Disk Caching
698@section Disk Caching
699@cindex Caching
700@cindex Persistent Cache
701@cindex Disk Cache
702
703The disk cache stores retrieved documents locally, whence they can be
704retrieved more quickly. When requesting a URL that is in the cache,
705the library checks to see if the page has changed since it was last
706retrieved from the remote machine. If not, the local copy is used,
707saving the transmission over the network.
708@cindex Cleaning the cache
709@cindex Clearing the cache
710@cindex Cache cleaning
711Currently the cache isn't cleared automatically.
712@c Running the @code{clean-cache} shell script
713@c fist is recommended, to allow for future cleaning of the cache. This
714@c shell script will remove all files that have not been accessed since it
715@c was last run. To keep the cache pared down, it is recommended that this
716@c script be run from @i{at} or @i{cron} (see the manual pages for
717@c crontab(5) or at(1) for more information)
718
719@defopt url-automatic-caching
720Setting this variable non-@code{nil} causes documents to be cached
721automatically.
722@end defopt
723
724@defopt url-cache-directory
This variable specifies the directory in which to store the cache
files. It defaults to the sub-directory @file{cache} of
@code{url-configuration-directory}.
728@end defopt
729
730@c Fixme: function v. option, but neither used.
731@c @findex url-cache-expired
732@c @defopt url-cache-expired
733@c This is a function to decide whether or not a cache entry has expired.
734@c It takes two times as it parameters and returns non-@code{nil} if the
735@c second time is ``too old'' when compared with the first time.
736@c @end defopt
737
738@defopt url-cache-creation-function
739The cache relies on a scheme for mapping URLs to files in the cache.
740This variable names a function which sets the type of cache to use.
741It takes a URL as argument and returns the absolute file name of the
742corresponding cache file. The two supplied possibilities are
743@code{url-cache-create-filename-using-md5} and
744@code{url-cache-create-filename-human-readable}.
745@end defopt
746
747@defun url-cache-create-filename-using-md5 url
Creates a cache file name from @var{url} using MD5 hashing.
This creates entries with very few cache collisions and is fast.
@cindex MD5
@smallexample
(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
     @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
@end smallexample
755@end defun
756
757@defun url-cache-create-filename-human-readable url
758Creates a cache file name from @var{url} more obviously connected to
759@var{url} than for @code{url-cache-create-filename-using-md5}, but
760more likely to conflict with other files.
@smallexample
(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
     @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
@end smallexample
765@end defun
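
Putting the caching options together, a configuration that caches
every retrieved document under human-readable file names might look
like the sketch below. The choice of naming function is only an
example.

@example
(setq url-automatic-caching t
      url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example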
766
767@c Fixme: never actually used currently?
768@c @defopt url-standalone-mode
769@c @cindex Relying on cache
770@c @cindex Cache only mode
771@c @cindex Standalone mode
772@c If this variable is non-@code{nil}, the library relies solely on the
773@c cache for fetching documents and avoids checking if they have changed
774@c on remote servers.
775@c @end defopt
776
777@c With a large cache of documents on the local disk, it can be very handy
778@c when traveling, or any other time the network connection is not active
779@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
780@c solely on its cache, and avoid checking to see if the page has changed
781@c on the remote server. In the case of a dial-on-demand PPP connection,
782@c this will keep the phone line free as long as possible, only bringing up
783@c the PPP connection when asking for a page that is not located in the
784@c cache. This is very useful for demonstrations as well.
785
786@node Proxies
787@section Proxies and Gatewaying
788
789@c fixme: check/document url-ns stuff
790@cindex proxy servers
791@cindex proxies
792@cindex environment variables
793@vindex HTTP_PROXY
Proxy servers are commonly used to provide gateways through firewalls
or as caches serving some more-or-less local network. Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server. Proxying is
conventionally configured in the same way across different programs,
through environment variables of the form @code{@var{protocol}_proxy}, where
@var{protocol} is one of the supported network protocols (@code{http},
@code{ftp} etc.). The library recognizes such variables in either
upper or lower case. Their values are of one of the forms:
802@itemize @bullet
803@item @code{@var{host}:@var{port}}
804@item A full URL;
805@item Simply a host name.
806@end itemize
807
808@vindex NO_PROXY
809The @code{NO_PROXY} environment variable specifies URLs that should be
810excluded from proxying (on servers that should be contacted directly).
811This should be a comma-separated list of hostnames, domain names, or a
812mixture of both. Asterisks can be used as wildcards, but other
813clients may not support that. Domain names may be indicated by a
814leading dot. For example:
@example
NO_PROXY="*.aventail.com,home.com,.seanet.com"
@end example
818@noindent says to contact all machines in the @samp{aventail.com} and
819@samp{seanet.com} domains directly, as well as the machine named
820@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
821and @code{no_proxy} are also tried, in that order.
822
823Proxies may also be specified directly in Lisp.
824
825@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them. The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}. An
exception is the pseudo scheme @code{"no_proxy"}, which is paired with
a regexp matching host names not to be proxied. This variable is
initialized from the environment as above.
833
@example
(setq url-proxy-services
      '(("http"     . "proxy.aventail.com:80")
        ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
@end example
839@end defopt
840
841@node Gateways in general
842@section Gateways in General
843@cindex gateways
844@cindex firewalls
845
846The library provides a general gateway layer through which all
847networking passes. It can both control access to the network and
848provide access through gateways in firewalls. This may make direct
849connections in some cases and pass through some sort of gateway in
850others.@footnote{Proxies (which only operate over HTTP) are
851implemented using this.} The library's basic function responsible for
852making connections is @code{url-open-stream}.
853
854@defun url-open-stream name buffer host service
855@cindex opening a stream
856@cindex stream, opening
857Open a stream to @var{host}, possibly via a gateway. The other
858arguments are as for @code{open-network-stream}. This will not make a
859connection if @code{url-gateway-unplugged} is non-@code{nil}.
860@end defun
861
862@defvar url-gateway-local-host-regexp
863This is a regular expression that matches local hosts that do not
864require the use of a gateway. If @code{nil}, all connections are made
865through the gateway.
866@end defvar
867
868@defvar url-gateway-method
869This variable controls which gateway method is used. It may be useful
870to bind it temporarily in some applications. It has values taken from
871a list of symbols. Possible values are:
872
873@table @code
874@item telnet
875@cindex @command{telnet}
876Use this method if you must first telnet and log into a gateway host,
877and then run telnet from that host to connect to outside machines.
878
879@item rlogin
880@cindex @command{rlogin}
881This method is identical to @code{telnet}, but uses @command{rlogin}
882to log into the remote machine without having to send the username and
883password over the wire every time.
884
885@item socks
886@cindex @sc{socks}
887Use if the firewall has a @sc{socks} gateway running on it. The
888@sc{socks} v5 protocol is defined in RFC 1928.
889
890@c @item ssl
891@c This probably shouldn't be documented
892@c Fixme: why not? -- fx
893
894@item native
895This method uses Emacs's builtin networking directly. This is the
896default. It can be used only if there is no firewall blocking access.
897@end table
898@end defvar
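
Because @code{url-gateway-method} can be bound temporarily, an
application that must bypass a configured gateway for one connection
could do something like the following. The function
@code{my-open-direct-stream} and the process and buffer names are
hypothetical; the sketch assumes @code{url-open-stream} is provided by
@file{url-gw.el}.

@example
(require 'url-gw)

;; Hypothetical helper, for illustration only.
(defun my-open-direct-stream (host port)
  "Open a stream to HOST on PORT using Emacs's own networking."
  (let ((url-gateway-method 'native))   ; bound only for this call
    (url-open-stream "my-direct"
                     (generate-new-buffer " *my-direct*")
                     host port)))
@end example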
899
900The following variables control the gateway methods.
901
902@defopt url-gateway-telnet-host
903The gateway host to telnet to. Once logged in there, you then telnet
904out to the hosts you want to connect to.
905@end defopt
906@defopt url-gateway-telnet-parameters
907This should be a list of parameters to pass to the @command{telnet} program.
908@end defopt
909@defopt url-gateway-telnet-password-prompt
910This is a regular expression that matches the password prompt when
911logging in.
912@end defopt
913@defopt url-gateway-telnet-login-prompt
914This is a regular expression that matches the username prompt when
915logging in.
916@end defopt
917@defopt url-gateway-telnet-user-name
918The username to log in with.
919@end defopt
920@defopt url-gateway-telnet-password
921The password to send when logging in.
922@end defopt
923@defopt url-gateway-prompt-pattern
924This is a regular expression that matches the shell prompt.
925@end defopt
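
Taken together, a telnet gateway setup might look something like the
following sketch.  Every value shown is hypothetical: the host name,
the user name and the prompt patterns all depend on the gateway
machine.  The password is deliberately left unset here, since storing
it in an init file is rarely a good idea.

@example
;; All values below are placeholders for a particular site.
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.org"
      url-gateway-telnet-user-name "jrhacker"
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      url-gateway-prompt-pattern "^[^#$%>]*[#$%>] *")
@end example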
926
927@defopt url-gateway-rlogin-host
928Host to @samp{rlogin} to before telnetting out.
929@end defopt
930@defopt url-gateway-rlogin-parameters
931Parameters to pass to @samp{rsh}.
932@end defopt
933@defopt url-gateway-rlogin-user-name
934User name to use when logging in to the gateway.
935@end defopt
939
940@defopt socks-server
941This specifies the default server. It takes the form
942@w{@code{("Default server" @var{server} @var{port} @var{version})}}
943where @var{version} can be either 4 or 5.
944@end defopt
945@defvar socks-password
946If this is @code{nil} then you will be asked for the password,
947otherwise it will be used as the password for authenticating you to
948the @sc{socks} server.
949@end defvar
950@defvar socks-username
951This is the username to use when authenticating yourself to the
952@sc{socks} server. By default this is your login name.
953@end defvar
954@defvar socks-timeout
955This controls how long, in seconds, to wait for responses from the
956@sc{socks} server; it is 5 by default.
957@end defvar
958@c fixme: these have been effectively commented-out in the code
959@c @defopt socks-server-aliases
960@c This is a list of server aliases. It is a list of aliases of the form
961@c @var{(alias hostname port version)}.
962@c @end defopt
963@c @defopt socks-network-aliases
964@c This is a list of network aliases. Each entry in the list takes the form
965@c @var{(alias (network))} where @var{alias} is a string that names the
966@c @var{network}. The networks can contain a pair (not a dotted pair) of
967@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
968@c address and a netmask, a domain name or a unique hostname or @sc{ip}
969@c address.
970@c @end defopt
971@c @defopt socks-redirection-rules
972@c This is a list of redirection rules. Each rule takes the form
973@c @var{(Destination network Connection type)} where @var{Destination
974@c network} is a network alias from @code{socks-network-aliases} and
975@c @var{Connection type} can be @code{nil} in which case a direct
976@c connection is used, or it can be an alias from
977@c @code{socks-server-aliases} in which case that server is used as a
978@c proxy.
979@c @end defopt
980@defopt socks-nslookup-program
981@cindex @command{nslookup}
982This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
983@end defopt
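
For instance, a @sc{socks} v5 setup could be sketched as follows.  The
server name is a placeholder, and 1080 is simply the port
conventionally used for @sc{socks}.  Substitute whatever your site
uses.

@example
(require 'socks)

;; Form: ("Default server" SERVER PORT VERSION)
(setq socks-server '("Default server" "socks.example.org" 1080 5))

;; Route connections made by the URL library through that server.
(setq url-gateway-method 'socks)
@end example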
984
985@menu
986* Suppressing network connections::
987@end menu
988@c * Broken hostname resolution::
989
990@node Suppressing network connections
991@subsection Suppressing Network Connections
992
993@cindex network connections, suppressing
994@cindex suppressing network connections
995@cindex bugs, HTML
996@cindex HTML `bugs'
997In some circumstances it is desirable to suppress making network
998connections. A typical case is when rendering HTML in a mail user
999agent, when external URLs should not be activated, particularly to
1000avoid `bugs' which `call home' by fetching single-pixel images and the
1001like. To arrange this, bind the following variable for the duration
1002of such processing.
1003
1004@defvar url-gateway-unplugged
1005If this variable is non-@code{nil}, new network connections are never
1006opened by the URL library.
1007@end defvar
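
A mail reader might wrap its rendering step along these lines.  This
is only a sketch: @code{my-render-without-network} and the function it
is passed are hypothetical names, not part of the library.

@example
(require 'url-vars)

(defun my-render-without-network (render-function)
  "Call RENDER-FUNCTION with URL network connections suppressed."
  ;; The binding lasts only while RENDER-FUNCTION runs.
  (let ((url-gateway-unplugged t))
    (funcall render-function)))
@end example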
1008
1009@c @node Broken hostname resolution
1010@c @subsection Broken Hostname Resolution
1011
1012@c @cindex hostname resolver
1013@c @cindex resolver, hostname
1014@c Some C libraries do not include the hostname resolver routines in
1015@c their static libraries. If Emacs was linked statically, and was not
1016@c linked with the resolver libraries, it will not be able to get to any
1017@c machines off the local network. This is characterized by being able
1018@c to reach someplace with a raw ip number, but not its hostname
1019@c (@url{http://129.79.254.191/} works, but
1020@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1021@c SunOS4 and Ultrix, but is now probably rare. If Emacs can't be
1022@c rebuilt linked against the resolver library, it can use the external
1023@c @command{nslookup} program instead.
1024
1025@c @defopt url-gateway-broken-resolution
1026@c @cindex @code{nslookup} program
1027@c @cindex program, @code{nslookup}
1028@c If non-@code{nil}, this variable says to use the program specified by
1029@c @code{url-gateway-nslookup-program} to do hostname resolution.
1030@c @end defopt
1031
1032@c @defopt url-gateway-nslookup-program
1033@c The name of the program to do hostname lookup if Emacs can't do it
1034@c directly. This program should expect a single argument on the command
1035@c line---the hostname to resolve---and should produce output similar to
1036@c the standard Unix @command{nslookup} program:
1037@c @example
1038@c Name: www.cs.indiana.edu
1039@c Address: 129.79.254.191
1040@c @end example
1041@c @end defopt
1042
1043@node History
1044@section History
1045
1046@findex url-do-setup
1047The library can maintain a global history list tracking URLs accessed.
1048URL completion can be done from it. The history mechanism is set up
1049automatically via @code{url-do-setup} when it is configured to be on.
1050Note that the size of the history list is currently not limited.
1051
1052@vindex url-history-hash-table
1053The history `list' is actually a hash table,
1054@code{url-history-hash-table}. It contains access times keyed by URL
1055strings. The times are in the format returned by @code{current-time}.
1056
1057@defun url-history-update-url url time
1058This function updates the history table with an entry for @var{url}
1059accessed at the given @var{time}.
1060@end defun
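
As an illustration, the following sketch records an access and then
looks it up again in the hash table.  The URL is a placeholder.

@example
(require 'url-history)

;; Record that the URL was accessed just now.
(url-history-update-url "http://www.example.org/" (current-time))

;; Look up the recorded access time; the value is in the
;; format returned by `current-time'.
(gethash "http://www.example.org/" url-history-hash-table)
@end example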
1061
1062@defopt url-history-track
1063If non-@code{nil}, the library will keep track of all the URLs
1064accessed. If it is @code{t}, the list is saved to disk at the end of
1065each Emacs session. The default is @code{nil}.
1066@end defopt
1067
1068@defopt url-history-file
1069The file storing the history list between sessions. It defaults to
1070@file{history} in @code{url-configuration-directory}.
1071@end defopt
1072
1073@defopt url-history-save-interval
1074@findex url-history-setup-save-timer
1075The number of seconds between automatic saves of the history list.
1076Default is one hour. Note that if you change this variable directly,
1077rather than using Custom, after @code{url-do-setup} has been run, you
1078need to run the function @code{url-history-setup-save-timer}.
1079@end defopt
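
For example, to track history and save it every ten minutes when
setting the variables from Lisp rather than through Custom, something
like the following should work.  The interval of 600 seconds is an
arbitrary choice.

@example
(require 'url-history)

(setq url-history-track t)            ; track and save to disk
(setq url-history-save-interval 600)  ; seconds between saves

;; Needed when the interval is changed outside Custom after
;; `url-do-setup' has run.
(url-history-setup-save-timer)
@end example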
1080
1081@defun url-history-parse-history &optional fname
1082Parses the history file @var{fname} (default @code{url-history-file})
1083and sets up the history list.
1084@end defun
1085
1086@defun url-history-save-history &optional fname
1087Saves the current history to file @var{fname} (default
1088@code{url-history-file}).
1089@end defun
1090
1091@defun url-completion-function string predicate function
1092You can use this function to do completion of URLs from the history.
1093@end defun
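
Since the function takes the three arguments of a programmed
completion function, one plausible use is to pass it as the collection
argument of @code{completing-read}.  The prompt string below is
arbitrary.

@example
(require 'url-history)

;; Read a URL, completing against the history table.
(completing-read "URL: " 'url-completion-function)
@end example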
1094
1095@node Customization
1096@chapter Customization
1097
1098@section Environment Variables
1099
1100@cindex environment variables
1101The following environment variables affect the library's operation at
1102startup.
1103
1104@table @code
1105@item TMPDIR
1106@vindex TMPDIR
1107@vindex url-temporary-directory
1108If this is defined, @code{url-temporary-directory} is initialized from
1109it.
1110@end table
1111
1112@section General User Options
1113
1114The following user options, settable with Customize, affect the
1115general operation of the package.
1116
1117@defopt url-debug
1118@cindex debugging
1119Specifies the types of debug messages which are logged to
1120the @code{*URL-DEBUG*} buffer.
1121@code{t} means log all messages.
1122A number means log all messages and show them with @code{message}.
1123It may also be a list of the types of messages to be logged.
1124@end defopt
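
The three kinds of value might be set as in the following sketch; use
only one of the forms.  In the last form, the symbol @code{http} is
assumed to be one of the message types the library uses.

@example
(setq url-debug t)        ; log all messages
(setq url-debug 1)        ; log all and echo with `message'
(setq url-debug '(http))  ; log only the listed types
@end example
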
1125@defopt url-personal-mail-address
1126@end defopt
1127@defopt url-privacy-level
1128@end defopt
1129@defopt url-uncompressor-alist
1130@end defopt
1131@defopt url-passwd-entry-func
1132@end defopt
1133@defopt url-standalone-mode
1134@end defopt
1135@defopt url-bad-port-list
1136@end defopt
1137@defopt url-max-password-attempts
1138@end defopt
1139@defopt url-temporary-directory
1140@end defopt
1141@defopt url-show-status
1142@end defopt
1143@defopt url-confirmation-func
1144The function to use for asking yes or no questions. This is normally
1145either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1146function taking a single argument (the prompt) and returning @code{t}
1147only if an affirmative answer is given.
1148@end defopt
1149@defopt url-gateway-method
1150@c fixme: describe gatewaying
1151A symbol specifying the type of gateway support to use for connections
1152from the local machine. The supported methods are:
1153
1154@table @code
1155@item telnet
1156Run telnet in a subprocess to connect;
1157@item rlogin
1158Rlogin to another machine to connect;
1159@item socks
1160Connect through a socks server;
1161@item ssl
1162Connect with SSL;
1163@item native
1164Connect directly.
1165@end table
1166@end defopt
1167
1168@node GNU Free Documentation License
1169@appendix GNU Free Documentation License
1170@include doclicense.texi
1171
1172@node Function Index
1173@unnumbered Command and Function Index
1174@printindex fn
1175
1176@node Variable Index
1177@unnumbered Variable Index
1178@printindex vr
1179
1180@node Concept Index
1181@unnumbered Concept Index
1182@printindex cp
1183
1184@setchapternewpage odd
1185@contents
1186@bye
1187
1188@ignore
1189 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
1190@end ignore