\input texinfo
@setfilename ../../info/url
@settitle URL Programmer's Manual

@iftex
@c @finalout
@end iftex
@c @setchapternewpage odd
@c @smallbook

@tex
\overfullrule=0pt
%\global\baselineskip 30pt % for printing in double space
@end tex
@dircategory World Wide Web
@dircategory Emacs
@direntry
* URL: (url).                 URL loading package.
@end direntry
20
@copying
This file documents the Emacs Lisp URL loading package.

Copyright @copyright{} 1993-1999, 2002, 2004-2011 Free Software Foundation, Inc.

@quotation
Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.3 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with the Front-Cover texts being ``A GNU Manual,''
and with the Back-Cover Texts as in (a) below.  A copy of the license
is included in the section entitled ``GNU Free Documentation License''.

(a) The FSF's Back-Cover Text is: ``You have the freedom to copy and
modify this GNU manual.  Buying copies from the FSF supports it in
developing GNU and promoting software freedom.''
@end quotation
@end copying
39
40@c
@titlepage
@title URL Programmer's Manual
@subtitle First Edition, URL Version 2.0
@author William M. Perry @email{wmperry@@gnu.org}
@author David Love @email{fx@@gnu.org}
@page
@vskip 0pt plus 1filll
@insertcopying
@end titlepage

@contents

@node Top
@top URL

@ifnottex
@insertcopying
@end ifnottex

60@menu
61* Getting Started:: Preparing your program to use URLs.
62* Retrieving URLs:: How to use this package to retrieve a URL.
63* Supported URL Types:: Descriptions of URL types currently supported.
64* Defining New URLs:: How to define a URL loader for a new protocol.
65* General Facilities:: URLs can be cached, accessed via a gateway
66 and tracked in a history list.
67* Customization:: Variables you can alter.
68* GNU Free Documentation License:: The license for this documentation.
69* Function Index::
70* Variable Index::
71* Concept Index::
72@end menu
73
74@node Getting Started
75@chapter Getting Started
76@cindex URLs, definition
77@cindex URIs
78
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs), described in RFC 2396, which
updates RFC 1738 and RFC 1808.  RFC 2016 defines uniform resource
agents.
83
84URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
85@var{scheme}s supported by this library are described below.
86@xref{Supported URL Types}.
87
88FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
89IRC and gopher URLs all have the form
90
91@example
92@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
93@end example
94@noindent
95where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
96@var{userinfo} sometimes takes the form @var{username}:@var{password}
97but you should beware of the security risks of sending cleartext
98passwords. @var{hostname} may be a domain name or a dotted decimal
99address. If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs.  With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have undesired
consequences if a different service is listening on that port (e.g.,
an HTTP URL specifying the SMTP port can cause mail to be
105sent). @c , but @xref{Other Variables, url-bad-port-list}.
106The meaning of the @var{path} component depends on the service.
107
108@menu
109* Configuration::
110* Parsed URLs:: URLs are parsed into vector structures.
111@end menu
112
113@node Configuration
114@section Configuration
115
116@defvar url-configuration-directory
117@cindex @file{~/.url}
118@cindex configuration files
The directory in which URL configuration files, the cache, etc.@:
reside.  The default is @file{~/.url}.
121@end defvar
122
123@node Parsed URLs
124@section Parsed URLs
125@cindex parsed URLs
126The library functions typically operate on @dfn{parsed} versions of
127URLs. These are actually vectors of the form:
128
129@example
130[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
131@end example
132
133@noindent where
134@table @var
135@item type
is the type of the URL scheme, e.g., @code{http};
137@item user
138is the username associated with it, or @code{nil};
139@item password
140is the user password associated with it, or @code{nil};
141@item host
142is the host name associated with it, or @code{nil};
143@item port
144is the port number associated with it, or @code{nil};
145@item file
146is the `file' part of it, or @code{nil}. This doesn't necessarily
147actually refer to a file;
148@item target
149is the target part, or @code{nil};
150@item attributes
151is the attributes associated with it, or @code{nil};
152@item full
153is @code{t} for a fully-specified URL, with a host part indicated by
154@samp{//} after the scheme part.
155@end table
156
157@findex url-type
158@findex url-user
159@findex url-password
160@findex url-host
161@findex url-port
162@findex url-file
163@findex url-target
164@findex url-attributes
165@findex url-full
166@findex url-set-type
167@findex url-set-user
168@findex url-set-password
169@findex url-set-host
170@findex url-set-port
171@findex url-set-file
172@findex url-set-target
173@findex url-set-attributes
174@findex url-set-full
175These attributes have accessors named @code{url-@var{part}}, where
176@var{part} is the name of one of the elements above, e.g.,
177@code{url-host}. Similarly, there are setters of the form
178@code{url-set-@var{part}}.
179
180There are functions for parsing and unparsing between the string and
181vector forms.
182
183@defun url-generic-parse-url url
184Return a parsed version of the string @var{url}.
185@end defun
186
187@defun url-recreate-url url
188@cindex unparsing URLs
189Recreates a URL string from the parsed @var{url}.
190@end defun
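
The round trip between the string and parsed forms can be sketched as
follows.  This is a minimal illustration, not a definitive recipe: the
name @code{my-url-summary} is hypothetical, and the sketch assumes that
the accessors @code{url-type} and @code{url-host} behave as described
above.

@example
(require 'url-parse)

(defun my-url-summary (string)
  "Return a list (TYPE HOST RECREATED) for the URL STRING.
Hypothetical helper, for illustration only."
  (let ((parsed (url-generic-parse-url string)))
    (list (url-type parsed)             ; the scheme, e.g. "http"
          (url-host parsed)             ; the host name, or nil
          (url-recreate-url parsed))))  ; back to string form

(my-url-summary "http://www.example.com/foo/bar")
@end example

Going through the accessors, rather than indexing the parsed
structure directly, keeps such code independent of how the parts are
stored.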
191
192@node Retrieving URLs
193@chapter Retrieving URLs
194
195@defun url-retrieve-synchronously url
196Retrieve @var{url} synchronously and return a buffer containing the
197data. @var{url} is either a string or a parsed URL structure. Return
198@code{nil} if there are no data associated with it (the case for dired,
199info, or mailto URLs that need no further processing).
200@end defun
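
A synchronous fetch might be wrapped as in the sketch below.  The
helper name @code{my-url-body} is hypothetical, and the sketch assumes
that the returned buffer holds headers, then a blank line, then the
body, which is the usual layout for HTTP but is not guaranteed for
every scheme.

@example
(require 'url)

(defun my-url-body (url)
  "Return the body of URL as a string, or nil if there is no data.
Hypothetical helper, for illustration only."
  (let ((buffer (url-retrieve-synchronously url)))
    (when buffer
      (unwind-protect
          (with-current-buffer buffer
            (goto-char (point-min))
            ;; Assumption: a blank line separates headers from body.
            (when (search-forward "\n\n" nil t)
              (buffer-substring-no-properties (point) (point-max))))
        ;; Always discard the retrieval buffer when done.
        (kill-buffer buffer)))))
@end example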
201
202@defun url-retrieve url callback &optional cbargs
Retrieve @var{url} asynchronously and call @var{callback} when
finished.  The callback is called when the object has been completely
retrieved, with the current buffer containing the object and any MIME
headers associated with it.  Its first argument is a status value
describing what happened during the retrieval, followed by the
elements of @var{cbargs}.  @var{url} is either a string or a parsed
URL structure.  Returns the buffer @var{url} will load into, or
@code{nil} if the process has already completed.
209@end defun
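
An asynchronous retrieval could look like the following sketch.  The
names @code{my-url-report} and @code{my-url-fetch-and-report} are
hypothetical.  The sketch assumes that the callback receives a status
property list as its first argument, ahead of any @var{cbargs}, and
that an @code{:error} property is present when the retrieval failed.

@example
(require 'url)

(defun my-url-report (status &rest _cbargs)
  "Report the outcome of a retrieval.  Hypothetical callback.
STATUS is assumed to be a property list of retrieval events."
  (if (plist-get status :error)
      (message "Retrieval failed: %S" (plist-get status :error))
    ;; The current buffer holds the headers and the object.
    (message "Retrieved %d characters" (buffer-size))))

(defun my-url-fetch-and-report (url)
  "Start retrieving URL; `my-url-report' runs when it completes."
  (url-retrieve url #'my-url-report))
@end example

Because the call returns immediately, any work that depends on the
retrieved data belongs in the callback.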
210
211@node Supported URL Types
212@chapter Supported URL Types
213
214@menu
215* http/https:: Hypertext Transfer Protocol.
216* file/ftp:: Local files and FTP archives.
217* info:: Emacs `Info' pages.
218* mailto:: Sending email.
219* news/nntp/snews:: Usenet news.
220* rlogin/telnet/tn3270:: Remote host connectivity.
221* irc:: Internet Relay Chat.
222* data:: Embedded data URLs.
* nfs:: Network File System.
224@c * finger::
225@c * gopher::
226@c * netrek::
227@c * prospero::
228* cid:: Content-ID.
229* about::
230* ldap:: Lightweight Directory Access Protocol
231* imap:: IMAP mailboxes.
232* man:: Unix man pages.
233@end menu
234
235@node http/https
236@section @code{http} and @code{https}
237
The scheme @code{http} is Hypertext Transfer Protocol.  The library
supports version 1.1, specified in RFC 2616.  (This supersedes 1.0,
defined in RFC 1945.)  HTTP URLs have the following form, where most of
the parts are optional:
242@example
243http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
244@end example
245@c The @code{:@var{port}} part is optional, and @var{port} defaults to
246@c 80. The @code{/@var{path}} part, if present, is a slash-separated
247@c series elements. The @code{?@var{searchpart}}, if present, is the
248@c query for a search or the content of a form submission. The
249@c @code{#fragment} part, if present, is a location in the document.
250
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL.  It is defined in RFC 2818.  Its default port is
443.  This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used.  @xref{Gateways in general}.
256
257@defopt url-honor-refresh-requests
This controls honoring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one.  @code{nil} means they will not be honored,
@code{t} (the default) means they will always be honored, and
otherwise the user will be asked on each request.
263@end defopt
264
265
266@menu
267* Cookies::
268* HTTP language/coding::
269* HTTP URL Options::
270* Dealing with HTTP documents::
271@end menu
272
273@node Cookies
274@subsection Cookies
275
276@defopt url-cookie-file
277The file in which cookies are stored, defaulting to @file{cookies} in
278the directory specified by @code{url-configuration-directory}.
279@end defopt
280
281@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
283@end defopt
284
285@defopt url-cookie-multiple-line
286Specifies whether to put all cookies for the server on one line in the
287HTTP request to satisfy broken servers like
288@url{http://www.hotmail.com}.
289@end defopt
290
291@defopt url-cookie-trusted-urls
292A list of regular expressions matching URLs from which to accept
293cookies always.
294@end defopt
295
296@defopt url-cookie-untrusted-urls
297A list of regular expressions matching URLs from which to reject
298cookies always.
299@end defopt
300
301@defopt url-cookie-save-interval
302The number of seconds between automatic saves of cookies to disk.
303Default is one hour.
304@end defopt
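
As a sketch of how these options might be combined, the following
asks for confirmation by default while always accepting cookies from
one site and always rejecting them from another.  The host names are
placeholders, and the sketch assumes the regular expressions are
matched against the full URL.

@example
(setq url-cookie-confirmation t
      ;; Placeholder hosts; substitute sites you actually use.
      url-cookie-trusted-urls '("\\`https?://www\\.example\\.com/")
      url-cookie-untrusted-urls '("\\`https?://ads\\.example\\.net/"))
@end example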
305
306
307@node HTTP language/coding
308@subsection Language and Encoding Preferences
309
HTTP allows clients to express preferences for the language and
encoding of documents which servers may honor.  For each of these
variables, the value is a string; it can specify a single choice, or
it can be a comma-separated list.

Normally, this list is ordered by descending preference.  However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}.  An element that has no @samp{;q} specification has
preference level 1.
321
322@defopt url-mime-charset-string
323@cindex character sets
324@cindex coding systems
325This variable specifies a preference for character sets when documents
326can be served in more than one encoding.
327
328HTTP allows specifying a series of MIME charsets which indicate your
329preferred character set encodings, e.g., Latin-9 or Big5, and these
330can be weighted. The default series is generated automatically from
331the associated MIME types of all defined coding systems, sorted by the
332coding system priority specified in Emacs. @xref{Recognize Coding, ,
333Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
334@end defopt
335
336@defopt url-mime-language-string
337@cindex language preferences
338A string specifying the preferred language when servers can serve
339files in several languages. Use RFC 1766 abbreviations, e.g.,
340@samp{en} for English, @samp{de} for German.
341
342The string can be @code{"*"} to get the first available language (as
343opposed to the default).
344@end defopt
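
For instance, a user who prefers German, then British English, then
any other English could set the language preference as sketched
below.  The value is the one quoted earlier in this section; whether a
given server honors it is outside the library's control.

@example
;; German first (implicit q=1), then en-gb, then generic English.
(setq url-mime-language-string "de, en-gb;q=0.8, en;q=0.7")
@end example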
345
346@node HTTP URL Options
347@subsection HTTP URL Options
348
349HTTP supports an @samp{OPTIONS} method describing things supported by
350the URL@.
351
352@defun url-http-options url
Returns a property list describing options available for @var{url}.  The
354property list members are:
355
356@table @code
357@item methods
358A list of symbols specifying what HTTP methods the resource
359supports.
360
361@item dav
362@cindex DAV
363A list of numbers specifying what DAV protocol/schema versions are
364supported.
365
366@item dasl
367@cindex DASL
A list of supported DASL search types (string form).
369
370@item ranges
371A list of the units available for use in partial document fetches.
372
373@item p3p
374@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
376Currently this is just the raw header contents.
377@end table
378
379@end defun
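
The property list can be queried with @code{plist-get}, as in this
sketch.  The helper name @code{my-url-supports-method-p} is
hypothetical, and the sketch assumes that method symbols are spelled
as the server reports them (typically upper case, e.g., @code{GET}).

@example
(require 'url-http)

(defun my-url-supports-method-p (url method)
  "Return non-nil if URL advertises support for METHOD, a symbol.
Hypothetical helper; calling it performs a network request."
  (memq method (plist-get (url-http-options url) 'methods)))
@end example

A call such as @code{(my-url-supports-method-p
"http://www.example.com/" 'GET)} would then test for one method.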
380
381@node Dealing with HTTP documents
382@subsection Dealing with HTTP documents
383
384HTTP URLs are retrieved into a buffer containing the HTTP headers
385followed by the body. Since the headers are quasi-MIME, they may be
386processed using the MIME library. @xref{Top,, Emacs MIME,
387emacs-mime, The Emacs MIME Manual}. The URL package provides a
388function to do this in general:
389
390@defun url-decode-text-part handle &optional coding
391This function decodes charset-encoded text in the current buffer. In
392Emacs, the buffer is expected to be unibyte initially and is set to
393multibyte after decoding.
@var{handle} is the MIME handle of the original part.  @var{coding} is
an explicit coding to use, overriding what the MIME headers specify.
The coding system used for the decoding is returned.
397
398Note that this function doesn't deal with @samp{http-equiv} charset
399specifications in HTML @samp{<meta>} elements.
400@end defun
401
402@node file/ftp
403@section file and ftp
404@cindex files
405@cindex FTP
406@cindex File Transfer Protocol
407@cindex compressed files
408@cindex dired
409
410@example
411ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
412file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
413@end example
414
These schemes are defined in RFC 1738.
416@samp{ftp:} and @samp{file:} are synonymous in this library. They
417allow reading arbitrary files from hosts. Either @samp{ange-ftp}
418(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
419hosts. Local files are accessed directly.
420
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
@samp{.bz2}.
425
426@defopt url-directory-index-file
427The filename to look for when indexing a directory, default
428@samp{"index.html"}. If this file exists, and is readable, then it
429will be viewed instead of using @code{dired} to view the directory.
430@end defopt
431
432@node info
433@section info
434@cindex Info
435@cindex Texinfo
436@findex Info-goto-node
437
438@example
439info:@var{file}#@var{node}
440@end example
441
442Info URLs are not officially defined. They invoke
443@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
444@samp{#@var{node}} is optional, defaulting to @samp{Top}.
445
446@node mailto
447@section mailto
448
449@cindex mailto
450@cindex email
A mailto URL will send an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
454
455@defopt url-mail-command
456@vindex mail-user-agent
The function called whenever url needs to send mail.  This should
normally be left to default from @code{mail-user-agent}.  @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
460@end defopt
461
462An @samp{X-Url-From} header field containing the URL of the document
463that contained the mailto URL is added if that URL is known.
464
465RFC 2368 extends the definition of mailto URLs in RFC 1738.
466The form of a mailto URL is
467@example
468@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
469@end example
@noindent where an arbitrary number of @var{header}s can be added.  If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents.  Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
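
A mailto URL of this form could be assembled as in the following
sketch.  The helper name @code{my-mailto-url} is hypothetical; the
sketch relies on @code{url-hexify-string} to percent-encode the header
contents, and it passes the mailbox through unchanged.

@example
(require 'url-util)

(defun my-mailto-url (mailbox subject body)
  "Return a mailto URL for MAILBOX with SUBJECT and BODY.
Hypothetical helper, for illustration only."
  (concat "mailto:" mailbox
          ;; Header contents must be percent-encoded.
          "?subject=" (url-hexify-string subject)
          "&body=" (url-hexify-string body)))
@end example

Encoding the contents keeps characters such as spaces and @samp{&}
from being mistaken for URL syntax.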
476
477@c Fixme: update
Email messages are defined in RFC 822.
479
480@node news/nntp/snews
481@section @code{news}, @code{nntp} and @code{snews}
482@cindex news
483@cindex network news
484@cindex usenet
485@cindex NNTP
486@cindex snews
487
488@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
493
494@table @samp
495@item news:@var{newsgroup}
496Retrieves a list of messages in @var{newsgroup};
497@item news:@var{message-id}
498Retrieves the message with the given @var{message-id};
499@item news:*
500Retrieves a list of all available newsgroups;
501@item nntp://@var{host}:@var{port}/@var{newsgroup}
502@itemx nntp://@var{host}:@var{port}/@var{message-id}
503@itemx nntp://@var{host}:@var{port}/*
504Similar to the @samp{news} versions.
505@end table
506
507@samp{:@var{port}} is optional and defaults to :119.
508
509@samp{snews} is the same as @samp{nntp} except that the default port
510is :563.
511@cindex SSL
512(It is tunneled through SSL.)
513
514An @samp{nntp} URL is the same as a news URL, except that the URL may
515specify an article by its number.
516
517@defopt url-news-server
518This variable can be used to override the default news server.
519Usually this will be set by the Gnus package, which is used to fetch
520news.
521@cindex environment variable
522@vindex NNTPSERVER
523It may be set from the conventional environment variable
524@code{NNTPSERVER}.
525@end defopt
526
527@node rlogin/telnet/tn3270
528@section rlogin, telnet and tn3270
529@cindex rlogin
530@cindex telnet
531@cindex tn3270
532@cindex terminal emulation
533@findex terminal-emulator
534
535These URL schemes from RFC 1738 for logon via a terminal emulator have
536the form
537@example
538telnet://@var{user}:@var{password}@@@var{host}:@var{port}
539@end example
540but the @code{:@var{password}} component is ignored.
541
542To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
543@code{telnet} or @code{tn3270} (the program names and arguments are
544hardcoded) session is run in a @code{terminal-emulator} buffer.
545Well-known ports are used if the URL does not specify a port.
546
547@node irc
548@section irc
549@cindex IRC
550@cindex Internet Relay Chat
551@cindex ZEN IRC
552@cindex ERC
553@cindex rcirc
554@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
555@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
556session to a function named in @code{url-irc-function}.
557
558@defopt url-irc-function
A function to actually open an IRC connection.
This function must take five arguments, @var{host}, @var{port},
@var{channel}, @var{user} and @var{password}.  The @var{channel}
argument specifies the channel to join immediately; this can be
@code{nil}.  By default this is @code{url-irc-rcirc}.
565@end defopt
566@defun url-irc-rcirc host port channel user password
567Processes the arguments and lets @code{rcirc} handle the session.
568@end defun
569@defun url-irc-erc host port channel user password
570Processes the arguments and lets @code{ERC} handle the session.
571@end defun
572@defun url-irc-zenirc host port channel user password
573Processes the arguments and lets @code{zenirc} handle the session.
574@end defun
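
A replacement handler only has to accept the same five arguments.
The sketch below, which uses the hypothetical name
@code{my-url-irc-announce}, merely reports what it was asked to open
instead of connecting; a real handler would hand the values to an IRC
client.

@example
(defun my-url-irc-announce (host port channel user password)
  "Announce an IRC URL instead of connecting to it.
Hypothetical handler, for illustration only; PASSWORD is ignored."
  (message "IRC host %s port %s channel %s user %s"
           host port channel user))

;; Install the handler for subsequent irc: URLs.
(setq url-irc-function #'my-url-irc-announce)
@end example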
575
576@node data
577@section data
578@cindex data URLs
579
580@example
data:@r{[}@var{media-type}@r{]}@r{[};base64@r{]},@var{data}
582@end example
583
584Data URLs contain MIME data in the URL itself. They are defined in
585RFC 2397.
586
587@var{media-type} is a MIME @samp{Content-Type} string, possibly
588including parameters. It defaults to
589@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
590omitted but the charset parameter supplied. If @samp{;base64} is
591present, the @var{data} are base64-encoded.
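
To make the syntax concrete, the sketch below builds a base64 data
URL from a string.  The helper name @code{my-data-url} is
hypothetical, and the sketch assumes the text contains only ASCII
characters.

@example
(defun my-data-url (text)
  "Return a base64 data URL holding TEXT as plain text.
Hypothetical helper; assumes TEXT contains only ASCII characters."
  (concat "data:text/plain;base64,"
          ;; The second argument suppresses line breaks in the output.
          (base64-encode-string text t)))

(my-data-url "Hello")
;; @result{} "data:text/plain;base64,SGVsbG8="
@end example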
592
593@node nfs
594@section nfs
595@cindex NFS
596@cindex Network File System
597@cindex automounter
598
599@example
600nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
601@end example
602
603The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
604@samp{ftp:} except that it points to a file on a remote host that is
605handled by the automounter on the local host.
606
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter.  Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
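
As an illustration, on a system whose automounter exposes remote
hosts under @file{/net} (an assumption about the local setup, not
something the library requires), the specification might be written
like this:

@example
;; %h is replaced by the server's hostname, %f by the remote file.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example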
628
629@node cid
630@section cid
631@cindex Content-ID
632
The cid scheme is defined in RFC 2111.
634
635@node about
636@section about
637
638@node ldap
639@section ldap
640@cindex LDAP
641@cindex Lightweight Directory Access Protocol
642
643The LDAP scheme is defined in RFC 2255.
644
645@node imap
646@section imap
647@cindex IMAP
648
The IMAP scheme is defined in RFC 2192.
650
651@node man
652@section man
653@cindex @command{man}
654@cindex Unix man pages
655@findex man
656
657@example
658@samp{man:@var{page-spec}}
659@end example
660
661This is a non-standard scheme. @var{page-spec} is passed directly to
662the Lisp @code{man} function.
663
664@node Defining New URLs
665@chapter Defining New URLs
666
667@menu
668* Naming conventions::
669* Required functions::
670* Optional functions::
671* Asynchronous fetching::
672* Supporting file-name-handlers::
673@end menu
674
675@node Naming conventions
676@section Naming conventions
677
678@node Required functions
679@section Required functions
680
681@node Optional functions
682@section Optional functions
683
684@node Asynchronous fetching
685@section Asynchronous fetching
686
687@node Supporting file-name-handlers
688@section Supporting file-name-handlers
689
690@node General Facilities
691@chapter General Facilities
692
693@menu
694* Disk Caching::
695* Proxies::
696* Gateways in general::
697* History::
698@end menu
699
700@node Disk Caching
701@section Disk Caching
702@cindex Caching
703@cindex Persistent Cache
704@cindex Disk Cache
705
706The disk cache stores retrieved documents locally, whence they can be
707retrieved more quickly. When requesting a URL that is in the cache,
708the library checks to see if the page has changed since it was last
709retrieved from the remote machine. If not, the local copy is used,
710saving the transmission over the network.
711@cindex Cleaning the cache
712@cindex Clearing the cache
713@cindex Cache cleaning
714Currently the cache isn't cleared automatically.
715@c Running the @code{clean-cache} shell script
716@c fist is recommended, to allow for future cleaning of the cache. This
717@c shell script will remove all files that have not been accessed since it
718@c was last run. To keep the cache pared down, it is recommended that this
719@c script be run from @i{at} or @i{cron} (see the manual pages for
720@c crontab(5) or at(1) for more information)
721
722@defopt url-automatic-caching
723Setting this variable non-@code{nil} causes documents to be cached
724automatically.
725@end defopt
726
727@defopt url-cache-directory
This variable specifies the
directory in which to store the cache files.  It defaults to the
subdirectory @file{cache} of @code{url-configuration-directory}.
731@end defopt
732
@defopt url-cache-creation-function
734The cache relies on a scheme for mapping URLs to files in the cache.
735This variable names a function which sets the type of cache to use.
736It takes a URL as argument and returns the absolute file name of the
737corresponding cache file. The two supplied possibilities are
738@code{url-cache-create-filename-using-md5} and
739@code{url-cache-create-filename-human-readable}.
740@end defopt
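
A configuration that turns caching on and selects the MD5-based
naming scheme might look like the following sketch.  Which naming
function suits a given setup is a judgment call; MD5 names are shown
here only as one of the two supplied possibilities.

@example
(setq url-automatic-caching t
      url-cache-creation-function 'url-cache-create-filename-using-md5)
@end example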
741
742@defun url-cache-create-filename-using-md5 url
743Creates a cache file name from @var{url} using MD5 hashing.
This creates entries with very few cache collisions and is fast.
745@cindex MD5
746@smallexample
747(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
748 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
749@end smallexample
750@end defun
751
752@defun url-cache-create-filename-human-readable url
753Creates a cache file name from @var{url} more obviously connected to
754@var{url} than for @code{url-cache-create-filename-using-md5}, but
755more likely to conflict with other files.
756@smallexample
757(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
758 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
759@end smallexample
760@end defun
761
@defun url-cache-expired url &optional expire-time
This function returns non-@code{nil} if a cache entry has expired (or is absent).
The arguments are a URL and optional expiration delay in seconds
(default @code{url-cache-expire-time}).
@end defun

@defopt url-cache-expire-time
This variable is the default number of seconds to use for the
expire-time argument of the function @code{url-cache-expired}.
@end defopt

@defun url-fetch-from-cache url
This function takes a URL as its argument and returns a buffer
containing the data cached for that URL.
@end defun
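
The two cache functions can be combined so that the cache is
consulted only while an entry is still fresh.  In this sketch the
helper name @code{my-url-cached-buffer} is hypothetical, and the
default expiration delay is assumed to be acceptable.

@example
(require 'url-cache)

(defun my-url-cached-buffer (url)
  "Return a buffer with the cached data for URL, or nil.
Hypothetical helper: nil means the entry is expired or absent."
  (unless (url-cache-expired url)
    (url-fetch-from-cache url)))
@end example

A caller that gets @code{nil} back would fall through to a normal
retrieval.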
777
778@c Fixme: never actually used currently?
779@c @defopt url-standalone-mode
780@c @cindex Relying on cache
781@c @cindex Cache only mode
782@c @cindex Standalone mode
783@c If this variable is non-@code{nil}, the library relies solely on the
784@c cache for fetching documents and avoids checking if they have changed
785@c on remote servers.
786@c @end defopt
787
788@c With a large cache of documents on the local disk, it can be very handy
789@c when traveling, or any other time the network connection is not active
790@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
791@c solely on its cache, and avoid checking to see if the page has changed
792@c on the remote server. In the case of a dial-on-demand PPP connection,
793@c this will keep the phone line free as long as possible, only bringing up
794@c the PPP connection when asking for a page that is not located in the
795@c cache. This is very useful for demonstrations as well.
796
797@node Proxies
798@section Proxies and Gatewaying
799
800@c fixme: check/document url-ns stuff
801@cindex proxy servers
802@cindex proxies
803@cindex environment variables
804@vindex HTTP_PROXY
805Proxy servers are commonly used to provide gateways through firewalls
806or as caches serving some more-or-less local network. Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server.  By convention,
many different programs configure proxying in the same way, through
809environment variables of the form @code{@var{protocol}_proxy}, where
810@var{protocol} is one of the supported network protocols (@code{http},
811@code{ftp} etc.). The library recognizes such variables in either
812upper or lower case. Their values are of one of the forms:
813@itemize @bullet
814@item @code{@var{host}:@var{port}}
815@item A full URL;
816@item Simply a host name.
817@end itemize
818
819@vindex NO_PROXY
820The @code{NO_PROXY} environment variable specifies URLs that should be
821excluded from proxying (on servers that should be contacted directly).
822This should be a comma-separated list of hostnames, domain names, or a
823mixture of both. Asterisks can be used as wildcards, but other
824clients may not support that. Domain names may be indicated by a
825leading dot. For example:
826@example
827NO_PROXY="*.aventail.com,home.com,.seanet.com"
828@end example
829@noindent says to contact all machines in the @samp{aventail.com} and
830@samp{seanet.com} domains directly, as well as the machine named
831@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
832and @code{no_proxy} are also tried, in that order.
833
834Proxies may also be specified directly in Lisp.
835
836@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them.  The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}.  An
841exception is the pseudo scheme @code{"no_proxy"}, which is paired with
842a regexp matching host names not to be proxied. This variable is
843initialized from the environment as above.
844
845@example
846(setq url-proxy-services
847 '(("http" . "proxy.aventail.com:80")
848 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
849@end example
850@end defopt
851
852@node Gateways in general
853@section Gateways in General
854@cindex gateways
855@cindex firewalls
856
857The library provides a general gateway layer through which all
858networking passes. It can both control access to the network and
859provide access through gateways in firewalls. This may make direct
860connections in some cases and pass through some sort of gateway in
861others.@footnote{Proxies (which only operate over HTTP) are
862implemented using this.} The library's basic function responsible for
863making connections is @code{url-open-stream}.
864
865@defun url-open-stream name buffer host service
866@cindex opening a stream
867@cindex stream, opening
868Open a stream to @var{host}, possibly via a gateway. The other
869arguments are as for @code{open-network-stream}. This will not make a
870connection if @code{url-gateway-unplugged} is non-@code{nil}.
871@end defun
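
A direct use of the gateway layer might look like the sketch below.
The helper name @code{my-url-open-http} and the process and buffer
names are hypothetical; the four arguments follow the order given
above.

@example
(require 'url-gw)

(defun my-url-open-http (host)
  "Open a connection to HOST on port 80, possibly via a gateway.
Hypothetical helper, for illustration only."
  (url-open-stream "my-http"
                   (generate-new-buffer " *my-http*")
                   host
                   80))
@end example

Most programs never need this level; the retrieval functions call it
on their behalf.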
872
873@defvar url-gateway-local-host-regexp
874This is a regular expression that matches local hosts that do not
875require the use of a gateway. If @code{nil}, all connections are made
876through the gateway.
877@end defvar
878
879@defvar url-gateway-method
880This variable controls which gateway method is used. It may be useful
881to bind it temporarily in some applications. It has values taken from
882a list of symbols. Possible values are:
883
884@table @code
885@item telnet
886@cindex @command{telnet}
887Use this method if you must first telnet and log into a gateway host,
888and then run telnet from that host to connect to outside machines.
889
890@item rlogin
891@cindex @command{rlogin}
892This method is identical to @code{telnet}, but uses @command{rlogin}
893to log into the remote machine without having to send the username and
894password over the wire every time.
895
896@item socks
897@cindex @sc{socks}
898Use if the firewall has a @sc{socks} gateway running on it. The
899@sc{socks} v5 protocol is defined in RFC 1928.
900
901@c @item ssl
902@c This probably shouldn't be documented
903@c Fixme: why not? -- fx
904
905@item native
This method uses Emacs's built-in networking directly.  This is the
default.  It can be used only if there is no firewall blocking access.
908@end table
909@end defvar
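
Binding the variable temporarily might look like the sketch below,
which forces a direct connection for a single call whatever gateway is
configured globally. The function name @code{my-open-direct-stream}
and the process name are hypothetical and not part of the library.

@example
(require 'url-gw)

(defun my-open-direct-stream (host port)
  "Open a stream to HOST on PORT without going through a gateway."
  ;; The binding is in effect only for the duration of the call.
  (let ((url-gateway-method 'native))
    (url-open-stream "my-direct" nil host port)))
@end example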
910
911The following variables control the gateway methods.
912
913@defopt url-gateway-telnet-host
914The gateway host to telnet to. Once logged in there, you then telnet
915out to the hosts you want to connect to.
916@end defopt
917@defopt url-gateway-telnet-parameters
918This should be a list of parameters to pass to the @command{telnet} program.
919@end defopt
920@defopt url-gateway-telnet-password-prompt
921This is a regular expression that matches the password prompt when
922logging in.
923@end defopt
924@defopt url-gateway-telnet-login-prompt
925This is a regular expression that matches the username prompt when
926logging in.
927@end defopt
928@defopt url-gateway-telnet-user-name
929The username to log in with.
930@end defopt
931@defopt url-gateway-telnet-password
932The password to send when logging in.
933@end defopt
934@defopt url-gateway-prompt-pattern
935This is a regular expression that matches the shell prompt.
936@end defopt
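
A possible configuration for the @code{telnet} method is sketched
below. The host name, user name, and regular expressions are all
illustrative assumptions, not the library's defaults; adjust them to
match what your gateway host actually prints.

@example
(require 'url-gw)

(setq url-gateway-method 'telnet
      ;; Hypothetical gateway host and account.
      url-gateway-telnet-host "gateway.example.org"
      url-gateway-telnet-user-name "jdoe"
      ;; Prompts as printed by this imaginary gateway.
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      ;; Shell prompt ending in `$' or `%'.
      url-gateway-prompt-pattern "^[^$%]*[$%] *")
@end example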
937
938@defopt url-gateway-rlogin-host
939Host to @samp{rlogin} to before telnetting out.
940@end defopt
941@defopt url-gateway-rlogin-parameters
942Parameters to pass to @samp{rsh}.
943@end defopt
944@defopt url-gateway-rlogin-user-name
945User name to use when logging in to the gateway.
946@end defopt
950
951@defopt socks-server
This specifies the default server. It takes the form
@w{@code{("Default server" @var{server} @var{port} @var{version})}}
where @var{version} can be either 4 or 5.
955@end defopt
956@defvar socks-password
957If this is @code{nil} then you will be asked for the password,
958otherwise it will be used as the password for authenticating you to
959the @sc{socks} server.
960@end defvar
961@defvar socks-username
962This is the username to use when authenticating yourself to the
963@sc{socks} server. By default this is your login name.
964@end defvar
965@defvar socks-timeout
966This controls how long, in seconds, to wait for responses from the
967@sc{socks} server; it is 5 by default.
968@end defvar
969@c fixme: these have been effectively commented-out in the code
970@c @defopt socks-server-aliases
971@c This a list of server aliases. It is a list of aliases of the form
972@c @var{(alias hostname port version)}.
973@c @end defopt
974@c @defopt socks-network-aliases
975@c This a list of network aliases. Each entry in the list takes the form
976@c @var{(alias (network))} where @var{alias} is a string that names the
977@c @var{network}. The networks can contain a pair (not a dotted pair) of
978@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
979@c address and a netmask, a domain name or a unique hostname or @sc{ip}
980@c address.
981@c @end defopt
982@c @defopt socks-redirection-rules
983@c This a list of redirection rules. Each rule take the form
984@c @var{(Destination network Connection type)} where @var{Destination
985@c network} is a network alias from @code{socks-network-aliases} and
986@c @var{Connection type} can be @code{nil} in which case a direct
987@c connection is used, or it can be an alias from
988@c @code{socks-server-aliases} in which case that server is used as a
989@c proxy.
990@c @end defopt
991@defopt socks-nslookup-program
992@cindex @command{nslookup}
This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
994@end defopt
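
Putting the @sc{socks} variables together, a configuration could look
like the following sketch. The server name @samp{socks.example.org}
and port 1080 are placeholders, and the sketch assumes the
@sc{socks} variables are defined by the @file{socks} library.

@example
(require 'socks)
(require 'url-gw)

(setq url-gateway-method 'socks
      ;; ("Default server" SERVER PORT VERSION); values are examples.
      socks-server '("Default server" "socks.example.org" 1080 5)
      ;; Wait up to 10 seconds instead of the default 5.
      socks-timeout 10)
@end example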
995
996@menu
997* Suppressing network connections::
998@end menu
999@c * Broken hostname resolution::
1000
1001@node Suppressing network connections
1002@subsection Suppressing Network Connections
1003
1004@cindex network connections, suppressing
1005@cindex suppressing network connections
1006@cindex bugs, HTML
1007@cindex HTML `bugs'
In some circumstances it is desirable to suppress making network
connections. A typical case is when rendering HTML in a mail user
agent, when external URLs should not be activated, particularly to
avoid `bugs' which `call home' by fetching single-pixel images and the
like. To arrange this, bind the following variable for the duration
of such processing.
1014
1015@defvar url-gateway-unplugged
If this variable is non-@code{nil}, new network connections are never
opened by the URL library.
1018@end defvar
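
A minimal sketch of such a binding follows. The wrapper
@code{my-call-without-network} is a hypothetical helper, not part of
the library; it assumes the variable is defined in @file{url-vars}.

@example
(require 'url-vars)

(defun my-call-without-network (function)
  "Call FUNCTION with new URL library connections suppressed."
  ;; The dynamic binding lasts only while FUNCTION runs.
  (let ((url-gateway-unplugged t))
    (funcall function)))
@end example

A mail user agent could then pass its HTML rendering function to this
helper so that embedded images are not fetched while the message is
displayed.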
1019
1020@c @node Broken hostname resolution
1021@c @subsection Broken Hostname Resolution
1022
1023@c @cindex hostname resolver
1024@c @cindex resolver, hostname
1025@c Some C libraries do not include the hostname resolver routines in
1026@c their static libraries. If Emacs was linked statically, and was not
1027@c linked with the resolver libraries, it will not be able to get to any
1028@c machines off the local network. This is characterized by being able
1029@c to reach someplace with a raw ip number, but not its hostname
1030@c (@url{http://129.79.254.191/} works, but
1031@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
@c SunOS4 and Ultrix, but is probably now rare. If Emacs can't be
1033@c rebuilt linked against the resolver library, it can use the external
1034@c @command{nslookup} program instead.
1035
1036@c @defopt url-gateway-broken-resolution
1037@c @cindex @code{nslookup} program
1038@c @cindex program, @code{nslookup}
1039@c If non-@code{nil}, this variable says to use the program specified by
1040@c @code{url-gateway-nslookup-program} program to do hostname resolution.
1041@c @end defopt
1042
1043@c @defopt url-gateway-nslookup-program
1044@c The name of the program to do hostname lookup if Emacs can't do it
1045@c directly. This program should expect a single argument on the command
1046@c line---the hostname to resolve---and should produce output similar to
1047@c the standard Unix @command{nslookup} program:
1048@c @example
1049@c Name: www.cs.indiana.edu
1050@c Address: 129.79.254.191
1051@c @end example
1052@c @end defopt
1053
1054@node History
1055@section History
1056
1057@findex url-do-setup
The library can maintain a global history list tracking URLs accessed.
URL completion can be done from it. The history mechanism is set up
automatically via @code{url-do-setup} when history tracking is turned
on. Note that the size of the history list is currently not limited.
1062
1063@vindex url-history-hash-table
1064The history `list' is actually a hash table,
1065@code{url-history-hash-table}. It contains access times keyed by URL
1066strings. The times are in the format returned by @code{current-time}.
1067
1068@defun url-history-update-url url time
1069This function updates the history table with an entry for @var{url}
1070accessed at the given @var{time}.
1071@end defun
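
As an illustration, the sketch below records an access and then looks
the entry up in the hash table again. The URL is only an example, and
the sketch assumes these definitions live in @file{url-history}.

@example
(require 'url-history)

;; Record that the example URL was accessed just now.
(url-history-update-url "http://www.gnu.org/" (current-time))

;; Look the URL up and format its access time, if there is one.
(let ((time (gethash "http://www.gnu.org/" url-history-hash-table)))
  (when time
    (format-time-string "%Y-%m-%d %H:%M" time)))
@end example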
1072
1073@defopt url-history-track
1074If non-@code{nil}, the library will keep track of all the URLs
1075accessed. If it is @code{t}, the list is saved to disk at the end of
1076each Emacs session. The default is @code{nil}.
1077@end defopt
1078
1079@defopt url-history-file
1080The file storing the history list between sessions. It defaults to
1081@file{history} in @code{url-configuration-directory}.
1082@end defopt
1083
1084@defopt url-history-save-interval
1085@findex url-history-setup-save-timer
1086The number of seconds between automatic saves of the history list.
1087Default is one hour. Note that if you change this variable directly,
1088rather than using Custom, after @code{url-do-setup} has been run, you
1089need to run the function @code{url-history-setup-save-timer}.
1090@end defopt
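
For example, history tracking with a shorter save interval could be
set up outside of Custom along these lines. The value of 600 seconds
is an arbitrary choice for illustration.

@example
(require 'url-history)

;; Track URLs and save the list at the end of the session.
(setq url-history-track t)

;; Save every ten minutes rather than every hour.
(setq url-history-save-interval 600)

;; Needed when the interval is changed after `url-do-setup' has run.
(url-history-setup-save-timer)
@end example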
1091
1092@defun url-history-parse-history &optional fname
1093Parses the history file @var{fname} (default @code{url-history-file})
1094and sets up the history list.
1095@end defun
1096
1097@defun url-history-save-history &optional fname
1098Saves the current history to file @var{fname} (default
1099@code{url-history-file}).
1100@end defun
1101
1102@defun url-completion-function string predicate function
1103You can use this function to do completion of URLs from the history.
1104@end defun
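
One way to use it, assuming that its three arguments follow the usual
programmed-completion convention, is to pass it as the collection
argument of @code{completing-read}. The command name below is
hypothetical.

@example
(require 'url-history)

(defun my-read-url-from-history ()
  "Read a URL in the minibuffer, completing from the URL history."
  (completing-read "URL: " #'url-completion-function))
@end example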
1105
1106@node Customization
1107@chapter Customization
1108
1109@section Environment Variables
1110
1111@cindex environment variables
1112The following environment variables affect the library's operation at
1113startup.
1114
1115@table @code
1116@item TMPDIR
1117@vindex TMPDIR
1118@vindex url-temporary-directory
If this is defined, @code{url-temporary-directory} is initialized from
it.
1121@end table
1122
1123@section General User Options
1124
1125The following user options, settable with Customize, affect the
1126general operation of the package.
1127
1128@defopt url-debug
1129@cindex debugging
Specifies the types of debug messages which are logged to
the @code{*URL-DEBUG*} buffer.
@code{t} means log all messages.
A number means log all messages and show them with @code{message}.
It may also be a list of the types of messages to be logged.
1135@end defopt
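
For example, to log everything while tracking down a problem (a
sketch that assumes the option is defined in @file{url-vars}):

@example
(require 'url-vars)

;; Log every debug message to the *URL-DEBUG* buffer.
(setq url-debug t)
@end example
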
1136@defopt url-personal-mail-address
1137@end defopt
1138@defopt url-privacy-level
1139@end defopt
1140@defopt url-uncompressor-alist
1141@end defopt
1142@defopt url-passwd-entry-func
1143@end defopt
1144@defopt url-standalone-mode
1145@end defopt
1146@defopt url-bad-port-list
1147@end defopt
1148@defopt url-max-password-attempts
1149@end defopt
1150@defopt url-temporary-directory
1151@end defopt
1152@defopt url-show-status
1153@end defopt
1154@defopt url-confirmation-func
The function to use for asking yes or no questions. This is normally
1156either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1157function taking a single argument (the prompt) and returning @code{t}
1158only if an affirmative answer is given.
1159@end defopt
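
For example, to be asked for a full @samp{yes} or @samp{no} answer
rather than a single keystroke (a sketch that assumes the option is
defined in @file{url-vars}):

@example
(require 'url-vars)

(setq url-confirmation-func #'yes-or-no-p)
@end example
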
1160@defopt url-gateway-method
1161@c fixme: describe gatewaying
1162A symbol specifying the type of gateway support to use for connections
1163from the local machine. The supported methods are:
1164
1165@table @code
1166@item telnet
1167Run telnet in a subprocess to connect;
1168@item rlogin
1169Rlogin to another machine to connect;
1170@item socks
1171Connect through a socks server;
1172@item ssl
1173Connect with SSL;
1174@item native
1175Connect directly.
1176@end table
1177@end defopt
1178
1179@node GNU Free Documentation License
1180@appendix GNU Free Documentation License
1181@include doclicense.texi
1182
1183@node Function Index
1184@unnumbered Command and Function Index
1185@printindex fn
1186
1187@node Variable Index
1188@unnumbered Variable Index
1189@printindex vr
1190
1191@node Concept Index
1192@unnumbered Concept Index
1193@printindex cp
1194
@bye