\input texinfo
@setfilename ../../info/url
@settitle URL Programmer's Manual
4
5@iftex
6@c @finalout
7@end iftex
8@c @setchapternewpage odd
9@c @smallbook
10
11@tex
12\overfullrule=0pt
13%\global\baselineskip 30pt % for printing in double space
14@end tex
15@dircategory World Wide Web
16@dircategory GNU Emacs Lisp
17@direntry
18* URL: (url). URL loading package.
19@end direntry
20
21@ifnottex
22This file documents the URL loading package.
23
24Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
252004, 2005, 2006, 2007 Free Software Foundation, Inc.
26
27Permission is granted to copy, distribute and/or modify this document
28under the terms of the GNU Free Documentation License, Version 1.2 or
29any later version published by the Free Software Foundation; with the
30Invariant Sections being
31``GNU GENERAL PUBLIC LICENSE''. A copy of the
32license is included in the section entitled ``GNU Free Documentation
33License.''
34@end ifnottex
35
36@c
37@titlepage
38@sp 6
39@center @titlefont{URL}
40@center @titlefont{Programmer's Manual}
41@sp 4
42@center First Edition, URL Version 2.0
43@sp 1
44@c @center December 1999
45@sp 5
46@center William M. Perry
47@center @email{wmperry@@gnu.org}
48@center David Love
49@center @email{fx@@gnu.org}
50@page
51@vskip 0pt plus 1filll
52Copyright @copyright{} 1993, 1994, 1995, 1996, 1997, 1998, 1999, 2002,
532003, 2004, 2005, 2006, 2007 Free Software Foundation, Inc.
54
55Permission is granted to copy, distribute and/or modify this document
56under the terms of the GNU Free Documentation License, Version 1.2 or
57any later version published by the Free Software Foundation; with the
58Invariant Sections being
59``GNU GENERAL PUBLIC LICENSE''. A copy of the
60license is included in the section entitled ``GNU Free Documentation
61License.''
62@end titlepage
63@page
64@node Top
65@top URL
66
67
68
69@menu
70* Getting Started:: Preparing your program to use URLs.
71* Retrieving URLs:: How to use this package to retrieve a URL.
72* Supported URL Types:: Descriptions of URL types currently supported.
73* Defining New URLs:: How to define a URL loader for a new protocol.
74* General Facilities:: URLs can be cached, accessed via a gateway
75 and tracked in a history list.
76* Customization:: Variables you can alter.
77* GNU Free Documentation License:: The license for this documentation.
78* Function Index::
79* Variable Index::
80* Concept Index::
81@end menu
82
83@node Getting Started
84@chapter Getting Started
85@cindex URLs, definition
86@cindex URIs
87
@dfn{Uniform Resource Locators} (URLs) are a specific form of
@dfn{Uniform Resource Identifiers} (URIs) described in RFC 2396, which
updates RFC 1738 and RFC 1808.  RFC 2016 defines uniform resource
agents.
92
93URIs have the form @var{scheme}:@var{scheme-specific-part}, where the
94@var{scheme}s supported by this library are described below.
95@xref{Supported URL Types}.
96
97FTP, NFS, HTTP, HTTPS, @code{rlogin}, @code{telnet}, tn3270,
98IRC and gopher URLs all have the form
99
100@example
101@var{scheme}://@r{[}@var{userinfo}@@@r{]}@var{hostname}@r{[}:@var{port}@r{]}@r{[}/@var{path}@r{]}
102@end example
103@noindent
104where @samp{@r{[}} and @samp{@r{]}} delimit optional parts.
105@var{userinfo} sometimes takes the form @var{username}:@var{password}
106but you should beware of the security risks of sending cleartext
107passwords. @var{hostname} may be a domain name or a dotted decimal
address.  If the @samp{:@var{port}} is omitted then the library will
use the `well known' port for that service when accessing URLs.  With
the possible exception of @code{telnet}, it is rare for ports to be
specified, and using a non-standard port may have
undesired consequences if a different service is listening on that
port (e.g., an HTTP URL specifying the SMTP port can cause mail to be
sent). @c , but @xref{Other Variables, url-bad-port-list}.
The meaning of the @var{path} component depends on the service.
116
117@menu
118* Configuration::
119* Parsed URLs:: URLs are parsed into vector structures.
120@end menu
121
122@node Configuration
123@section Configuration
124
125@defvar url-configuration-directory
126@cindex @file{~/.url}
127@cindex configuration files
128The directory in which URL configuration files, the cache etc.,
129reside. Default @file{~/.url}.
130@end defvar
131
132@node Parsed URLs
133@section Parsed URLs
134@cindex parsed URLs
135The library functions typically operate on @dfn{parsed} versions of
136URLs. These are actually vectors of the form:
137
138@example
139[@var{type} @var{user} @var{password} @var{host} @var{port} @var{file} @var{target} @var{attributes} @var{full}]
140@end example
141
142@noindent where
143@table @var
144@item type
is the type of the URL scheme, e.g., @code{http};
146@item user
147is the username associated with it, or @code{nil};
148@item password
149is the user password associated with it, or @code{nil};
150@item host
151is the host name associated with it, or @code{nil};
152@item port
153is the port number associated with it, or @code{nil};
154@item file
155is the `file' part of it, or @code{nil}. This doesn't necessarily
156actually refer to a file;
157@item target
158is the target part, or @code{nil};
159@item attributes
160is the attributes associated with it, or @code{nil};
161@item full
162is @code{t} for a fully-specified URL, with a host part indicated by
163@samp{//} after the scheme part.
164@end table
165
166@findex url-type
167@findex url-user
168@findex url-password
169@findex url-host
170@findex url-port
171@findex url-file
172@findex url-target
173@findex url-attributes
174@findex url-full
175@findex url-set-type
176@findex url-set-user
177@findex url-set-password
178@findex url-set-host
179@findex url-set-port
180@findex url-set-file
181@findex url-set-target
182@findex url-set-attributes
183@findex url-set-full
184These attributes have accessors named @code{url-@var{part}}, where
185@var{part} is the name of one of the elements above, e.g.,
186@code{url-host}. Similarly, there are setters of the form
187@code{url-set-@var{part}}.
188
189There are functions for parsing and unparsing between the string and
190vector forms.
191
192@defun url-generic-parse-url url
193Return a parsed version of the string @var{url}.
194@end defun
195
196@defun url-recreate-url url
197@cindex unparsing URLs
198Recreates a URL string from the parsed @var{url}.
199@end defun
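
As a minimal sketch of the round trip between the string and parsed
forms, the following parses a URL, reads two components through the
accessors described above, and rebuilds the string.  The host
@samp{www.example.com} is only a placeholder.

@example
(require 'url-parse)

;; Parse once, then read components through the accessors rather
;; than indexing into the parsed structure directly.
(let ((parsed (url-generic-parse-url "http://www.example.com/foo/bar")))
  (list (url-type parsed)
        (url-host parsed)
        (url-recreate-url parsed)))
     @result{} ("http" "www.example.com" "http://www.example.com/foo/bar")
@end example

Going through the accessors keeps calling code independent of the
exact layout of the parsed structure.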
200
201@node Retrieving URLs
202@chapter Retrieving URLs
203
204@defun url-retrieve-synchronously url
205Retrieve @var{url} synchronously and return a buffer containing the
206data. @var{url} is either a string or a parsed URL structure. Return
207@code{nil} if there are no data associated with it (the case for dired,
208info, or mailto URLs that need no further processing).
209@end defun
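
The following sketch shows one way to use the synchronous interface:
it fetches a document, returns the buffer contents as a string, and
kills the retrieval buffer afterwards.  The URL is a placeholder, and
the returned string contains whatever the library put in the buffer,
which for HTTP includes the headers.

@example
(require 'url)

(let ((buffer (url-retrieve-synchronously "http://www.example.com/")))
  ;; BUFFER is nil for URL types that carry no data.
  (when buffer
    (unwind-protect
        (with-current-buffer buffer
          (buffer-string))
      ;; Clean up even if reading the buffer signals an error.
      (kill-buffer buffer))))
@end example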
210
211@defun url-retrieve url callback &optional cbargs
212Retrieve @var{url} asynchronously and call @var{callback} with args
213@var{cbargs} when finished. The callback is called when the object
214has been completely retrieved, with the current buffer containing the
215object and any MIME headers associated with it. @var{url} is either a
216string or a parsed URL structure. Returns the buffer @var{url} will
217load into, or @code{nil} if the process has already completed.
218@end defun
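
A minimal sketch of asynchronous retrieval follows.  The callback name
@code{my-url-show-size} is hypothetical, and it is written to accept
any arguments on the assumption that the exact arguments passed may
vary between versions of the library.

@example
(require 'url)

;; Hypothetical callback: runs with the retrieval buffer current.
(defun my-url-show-size (&rest _args)
  (message "Retrieved %d characters" (buffer-size)))

;; Returns immediately; the callback runs once the retrieval finishes.
(url-retrieve "http://www.example.com/" #'my-url-show-size)
@end example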
219
220@node Supported URL Types
221@chapter Supported URL Types
222
223@menu
224* http/https:: Hypertext Transfer Protocol.
225* file/ftp:: Local files and FTP archives.
226* info:: Emacs `Info' pages.
227* mailto:: Sending email.
228* news/nntp/snews:: Usenet news.
229* rlogin/telnet/tn3270:: Remote host connectivity.
230* irc:: Internet Relay Chat.
231* data:: Embedded data URLs.
232* nfs:: Networked File System
233@c * finger::
234@c * gopher::
235@c * netrek::
236@c * prospero::
237* cid:: Content-ID.
238* about::
239* ldap:: Lightweight Directory Access Protocol
240* imap:: IMAP mailboxes.
241* man:: Unix man pages.
242@end menu
243
244@node http/https
245@section @code{http} and @code{https}
246
The scheme @code{http} is Hypertext Transfer Protocol.  The library
supports version 1.1, specified in RFC 2616.  (This supersedes 1.0,
defined in RFC 1945.)  HTTP URLs have the following form, where most of
the parts are optional:
251@example
252http://@var{user}:@var{password}@@@var{host}:@var{port}/@var{path}?@var{searchpart}#@var{fragment}
253@end example
254@c The @code{:@var{port}} part is optional, and @var{port} defaults to
255@c 80. The @code{/@var{path}} part, if present, is a slash-separated
256@c series elements. The @code{?@var{searchpart}}, if present, is the
257@c query for a search or the content of a form submission. The
258@c @code{#fragment} part, if present, is a location in the document.
259
The scheme @code{https} is a secure version of @code{http}, with
transmission via SSL.  It is defined in RFC 2818.  Its default port is
443.  This scheme depends on SSL support in Emacs via the
@file{ssl.el} library and is actually implemented by forcing the
@code{ssl} gateway method to be used.  @xref{Gateways in general}.
265
266@defopt url-honor-refresh-requests
This controls honouring of HTTP @samp{Refresh} headers by which
servers can direct clients to reload documents from the same URL or a
different one.  @code{nil} means they will not be honoured,
@code{t} (the default) means they will always be honoured, and
otherwise the user will be asked on each request.
272@end defopt
273
274
275@menu
276* Cookies::
277* HTTP language/coding::
278* HTTP URL Options::
279* Dealing with HTTP documents::
280@end menu
281
282@node Cookies
283@subsection Cookies
284
285@defopt url-cookie-file
286The file in which cookies are stored, defaulting to @file{cookies} in
287the directory specified by @code{url-configuration-directory}.
288@end defopt
289
290@defopt url-cookie-confirmation
Specifies whether confirmation is required to accept cookies.
292@end defopt
293
294@defopt url-cookie-multiple-line
295Specifies whether to put all cookies for the server on one line in the
296HTTP request to satisfy broken servers like
297@url{http://www.hotmail.com}.
298@end defopt
299
300@defopt url-cookie-trusted-urls
301A list of regular expressions matching URLs from which to accept
302cookies always.
303@end defopt
304
305@defopt url-cookie-untrusted-urls
306A list of regular expressions matching URLs from which to reject
307cookies always.
308@end defopt
309
310@defopt url-cookie-save-interval
311The number of seconds between automatic saves of cookies to disk.
312Default is one hour.
313@end defopt
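
The cookie options above might be combined as in the following sketch.
The regular expressions and host names are placeholders chosen for
illustration, not recommended values.

@example
(setq url-cookie-confirmation t        ; ask before accepting cookies
      ;; Always accept cookies from this (placeholder) site ...
      url-cookie-trusted-urls '("^https?://www\\.example\\.com/")
      ;; ... and always reject them from this one.
      url-cookie-untrusted-urls '("^https?://ads\\.example\\.net/")
      ;; Save cookies to disk every hour (3600 seconds).
      url-cookie-save-interval 3600)
@end example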
314
315
316@node HTTP language/coding
317@subsection Language and Encoding Preferences
318
319HTTP allows clients to express preferences for the language and
320encoding of documents which servers may honour. For each of these
321variables, the value is a string; it can specify a single choice, or
322it can be a comma-separated list.
323
Normally this list is ordered by descending preference.  However, each
element can be followed by @samp{;q=@var{priority}} to specify its
preference level, a decimal number from 0 to 1; e.g., for
@code{url-mime-language-string}, @w{@code{"de, en-gb;q=0.8,
en;q=0.7"}}.  An element that has no @samp{;q} specification has
preference level 1.
330
331@defopt url-mime-charset-string
332@cindex character sets
333@cindex coding systems
334This variable specifies a preference for character sets when documents
335can be served in more than one encoding.
336
337HTTP allows specifying a series of MIME charsets which indicate your
338preferred character set encodings, e.g., Latin-9 or Big5, and these
339can be weighted. The default series is generated automatically from
340the associated MIME types of all defined coding systems, sorted by the
341coding system priority specified in Emacs. @xref{Recognize Coding, ,
342Recognizing Coding Systems, emacs, The GNU Emacs Manual}.
343@end defopt
344
345@defopt url-mime-language-string
346@cindex language preferences
347A string specifying the preferred language when servers can serve
348files in several languages. Use RFC 1766 abbreviations, e.g.,
349@samp{en} for English, @samp{de} for German.
350
351The string can be @code{"*"} to get the first available language (as
352opposed to the default).
353@end defopt
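
Since these are ordinary variables, a program can bind one temporarily
around a single request instead of changing the user's setting.  A
sketch, with a placeholder URL:

@example
(require 'url)

;; Prefer German, then British English, then any English, for
;; this request only.
(let ((url-mime-language-string "de, en-gb;q=0.8, en;q=0.7"))
  (url-retrieve-synchronously "http://www.example.com/"))
@end example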
354
355@node HTTP URL Options
356@subsection HTTP URL Options
357
358HTTP supports an @samp{OPTIONS} method describing things supported by
359the URL@.
360
361@defun url-http-options url
Returns a property list describing options available for @var{url}.  The
property list members are:
364
365@table @code
366@item methods
367A list of symbols specifying what HTTP methods the resource
368supports.
369
370@item dav
371@cindex DAV
372A list of numbers specifying what DAV protocol/schema versions are
373supported.
374
375@item dasl
376@cindex DASL
A list of supported DASL search types (string form).
378
379@item ranges
380A list of the units available for use in partial document fetches.
381
382@item p3p
383@cindex P3P
The @dfn{Platform for Privacy Preferences} description for the resource.
Currently this is just the raw header contents.
386@end table
387
388@end defun
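
Because the result is a property list, individual members can be read
with @code{plist-get}.  The following sketch assumes the function is
provided by @file{url-http.el} and uses a placeholder URL; what comes
back depends entirely on the server.

@example
(require 'url-http)

(let ((options (url-http-options "http://www.example.com/")))
  ;; Extract the list of HTTP methods the resource supports.
  (plist-get options 'methods))
@end example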
389
390@node Dealing with HTTP documents
391@subsection Dealing with HTTP documents
392
393HTTP URLs are retrieved into a buffer containing the HTTP headers
394followed by the body. Since the headers are quasi-MIME, they may be
395processed using the MIME library. @xref{Top,, Emacs MIME,
396emacs-mime, The Emacs MIME Manual}. The URL package provides a
397function to do this in general:
398
399@defun url-decode-text-part handle &optional coding
This function decodes charset-encoded text in the current buffer.  In
Emacs, the buffer is expected to be unibyte initially and is set to
multibyte after decoding.
@var{handle} is the MIME handle of the original part.  @var{coding} is
an explicit coding to use, overriding what the MIME headers specify.
The coding system used for the decoding is returned.
406
407Note that this function doesn't deal with @samp{http-equiv} charset
408specifications in HTML @samp{<meta>} elements.
409@end defun
410
411@node file/ftp
412@section file and ftp
413@cindex files
414@cindex FTP
415@cindex File Transfer Protocol
416@cindex compressed files
417@cindex dired
418
419@example
420ftp://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
421file://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
422@end example
423
These schemes are defined in RFC 1738.
@samp{ftp:} and @samp{file:} are synonymous in this library.  They
allow reading arbitrary files from hosts.  Either @samp{ange-ftp}
(Emacs) or @samp{efs} (XEmacs) is used to retrieve them from remote
hosts.  Local files are accessed directly.
429
Compressed files are handled, but support is hard-coded so that
@code{jka-compr-compression-info-list} and so on have no effect.
Suffixes recognized are @samp{.z}, @samp{.gz}, @samp{.Z} and
@samp{.bz2}.
434
435@defopt url-directory-index-file
436The filename to look for when indexing a directory, default
437@samp{"index.html"}. If this file exists, and is readable, then it
438will be viewed instead of using @code{dired} to view the directory.
439@end defopt
440
441@node info
442@section info
443@cindex Info
444@cindex Texinfo
445@findex Info-goto-node
446
447@example
448info:@var{file}#@var{node}
449@end example
450
451Info URLs are not officially defined. They invoke
452@code{Info-goto-node} with argument @samp{(@var{file})@var{node}}.
453@samp{#@var{node}} is optional, defaulting to @samp{Top}.
454
455@node mailto
456@section mailto
457
458@cindex mailto
459@cindex email
A mailto URL will send an email message to the address in the
URL; for example, @samp{mailto:foo@@bar.com} would compose a
message to @samp{foo@@bar.com}.
463
464@defopt url-mail-command
465@vindex mail-user-agent
The function called whenever url needs to send mail.  This should
normally be left to default from @code{mail-user-agent}.  @xref{Mail
Methods, , Mail-Composition Methods, emacs, The GNU Emacs Manual}.
469@end defopt
470
471An @samp{X-Url-From} header field containing the URL of the document
472that contained the mailto URL is added if that URL is known.
473
474RFC 2368 extends the definition of mailto URLs in RFC 1738.
475The form of a mailto URL is
476@example
477@samp{mailto:@var{mailbox}[?@var{header}=@var{contents}[&@var{header}=@var{contents}]]}
478@end example
@noindent where an arbitrary number of @var{header}s can be added.  If the
@var{header} is @samp{body}, then @var{contents} is put in the body;
otherwise a @var{header} header field is created with @var{contents}
as its contents.  Note that the URL library does not consider any
headers `dangerous', so you should check them before sending the
message.
485
486@c Fixme: update
487Email messages are defined in @sc{rfc}822.
488
489@node news/nntp/snews
490@section @code{news}, @code{nntp} and @code{snews}
491@cindex news
492@cindex network news
493@cindex usenet
494@cindex NNTP
495@cindex snews
496
497@c draft-gilman-news-url-01
The network news URL schemes take the following forms, following RFC
1738, except that for compatibility with other clients, host and port
fields may be included in news URLs though they are properly only
allowed for nntp and snews.
502
503@table @samp
504@item news:@var{newsgroup}
505Retrieves a list of messages in @var{newsgroup};
506@item news:@var{message-id}
507Retrieves the message with the given @var{message-id};
508@item news:*
509Retrieves a list of all available newsgroups;
510@item nntp://@var{host}:@var{port}/@var{newsgroup}
511@itemx nntp://@var{host}:@var{port}/@var{message-id}
512@itemx nntp://@var{host}:@var{port}/*
513Similar to the @samp{news} versions.
514@end table
515
516@samp{:@var{port}} is optional and defaults to :119.
517
518@samp{snews} is the same as @samp{nntp} except that the default port
519is :563.
520@cindex SSL
521(It is tunneled through SSL.)
522
523An @samp{nntp} URL is the same as a news URL, except that the URL may
524specify an article by its number.
525
526@defopt url-news-server
527This variable can be used to override the default news server.
528Usually this will be set by the Gnus package, which is used to fetch
529news.
530@cindex environment variable
531@vindex NNTPSERVER
532It may be set from the conventional environment variable
533@code{NNTPSERVER}.
534@end defopt
535
536@node rlogin/telnet/tn3270
537@section rlogin, telnet and tn3270
538@cindex rlogin
539@cindex telnet
540@cindex tn3270
541@cindex terminal emulation
542@findex terminal-emulator
543
544These URL schemes from RFC 1738 for logon via a terminal emulator have
545the form
546@example
547telnet://@var{user}:@var{password}@@@var{host}:@var{port}
548@end example
549but the @code{:@var{password}} component is ignored.
550
551To handle rlogin, telnet and tn3270 URLs, a @code{rlogin},
552@code{telnet} or @code{tn3270} (the program names and arguments are
553hardcoded) session is run in a @code{terminal-emulator} buffer.
554Well-known ports are used if the URL does not specify a port.
555
556@node irc
557@section irc
558@cindex IRC
559@cindex Internet Relay Chat
560@cindex ZEN IRC
561@cindex ERC
562@cindex rcirc
563@c Fixme: reference (was http://www.w3.org/Addressing/draft-mirashi-url-irc-01.txt)
564@dfn{Internet Relay Chat} (IRC) is handled by handing off the @sc{irc}
565session to a function named in @code{url-irc-function}.
566
567@defopt url-irc-function
A function to actually open an IRC connection.
This function
must take five arguments, @var{host}, @var{port}, @var{channel},
@var{user} and @var{password}.  The @var{channel} argument specifies the
channel to join immediately; it can be @code{nil}.  By default this is
@code{url-irc-rcirc}.
574@end defopt
575@defun url-irc-rcirc host port channel user password
576Processes the arguments and lets @code{rcirc} handle the session.
577@end defun
578@defun url-irc-erc host port channel user password
579Processes the arguments and lets @code{ERC} handle the session.
580@end defun
581@defun url-irc-zenirc host port channel user password
582Processes the arguments and lets @code{zenirc} handle the session.
583@end defun
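
For instance, to hand IRC URLs to ERC rather than the default, the
option can be set to one of the functions above.  This sketch assumes
ERC is installed and that @file{url-irc.el} is the file providing
these definitions.

@example
(require 'url-irc)

;; Use ERC for irc: URLs instead of the default rcirc.
(setq url-irc-function 'url-irc-erc)
@end example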
584
585@node data
586@section data
587@cindex data URLs
588
589@example
590data:@r{[}@var{media-type}@r{]}@r{[};@var{base64}@r{]},@var{data}
591@end example
592
593Data URLs contain MIME data in the URL itself. They are defined in
594RFC 2397.
595
596@var{media-type} is a MIME @samp{Content-Type} string, possibly
597including parameters. It defaults to
598@samp{text/plain;charset=US-ASCII}. The @samp{text/plain} can be
599omitted but the charset parameter supplied. If @samp{;base64} is
600present, the @var{data} are base64-encoded.
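
As an illustration of the @samp{;base64} form, the URL
@samp{data:text/plain;base64,SGVsbG8sIHdvcmxkIQ==} carries a short
piece of text.  The sketch below decodes just the @var{data} part with
the standard @code{base64-decode-string} function; it does not go
through the URL library itself.

@example
;; The part after the comma in the data URL above.
(base64-decode-string "SGVsbG8sIHdvcmxkIQ==")
     @result{} "Hello, world!"
@end example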
601
602@node nfs
603@section nfs
604@cindex NFS
605@cindex Network File System
606@cindex automounter
607
608@example
609nfs://@var{user}:@var{password}@@@var{host}:@var{port}/@var{file}
610@end example
611
612The @samp{nfs:} scheme is defined in RFC 2224. It is similar to
613@samp{ftp:} except that it points to a file on a remote host that is
614handled by the automounter on the local host.
615
@defvar url-nfs-automounter-directory-spec
A string saying how to invoke the NFS automounter.  Certain @samp{%}
sequences are recognized:

@table @samp
@item %h
The hostname of the NFS server;
@item %n
The port number of the NFS server;
@item %u
The username to use to authenticate;
@item %p
The password to use to authenticate;
@item %f
The filename on the remote server;
@item %%
A literal @samp{%}.
@end table

Each can be used any number of times.
@end defvar
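
A sketch of a possible value, assuming an automounter that makes each
remote host available under @file{/net/@var{hostname}}; the
@file{/net} prefix is an assumption about the local setup, not
something this library requires.

@example
;; %h expands to the NFS server's host name, %f to the remote file name.
(setq url-nfs-automounter-directory-spec "file:/net/%h%f")
@end example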
637
638@node cid
639@section cid
640@cindex Content-ID
641
The @samp{cid:} scheme is defined in RFC 2111.
643
644@node about
645@section about
646
647@node ldap
648@section ldap
649@cindex LDAP
650@cindex Lightweight Directory Access Protocol
651
652The LDAP scheme is defined in RFC 2255.
653
654@node imap
655@section imap
656@cindex IMAP
657
The @samp{imap:} scheme is defined in RFC 2192.
659
660@node man
661@section man
662@cindex @command{man}
663@cindex Unix man pages
664@findex man
665
666@example
667@samp{man:@var{page-spec}}
668@end example
669
670This is a non-standard scheme. @var{page-spec} is passed directly to
671the Lisp @code{man} function.
672
673@node Defining New URLs
674@chapter Defining New URLs
675
676@menu
677* Naming conventions::
678* Required functions::
679* Optional functions::
680* Asynchronous fetching::
681* Supporting file-name-handlers::
682@end menu
683
684@node Naming conventions
685@section Naming conventions
686
687@node Required functions
688@section Required functions
689
690@node Optional functions
691@section Optional functions
692
693@node Asynchronous fetching
694@section Asynchronous fetching
695
696@node Supporting file-name-handlers
697@section Supporting file-name-handlers
698
699@node General Facilities
700@chapter General Facilities
701
702@menu
703* Disk Caching::
704* Proxies::
705* Gateways in general::
706* History::
707@end menu
708
709@node Disk Caching
710@section Disk Caching
711@cindex Caching
712@cindex Persistent Cache
713@cindex Disk Cache
714
715The disk cache stores retrieved documents locally, whence they can be
716retrieved more quickly. When requesting a URL that is in the cache,
717the library checks to see if the page has changed since it was last
718retrieved from the remote machine. If not, the local copy is used,
719saving the transmission over the network.
720@cindex Cleaning the cache
721@cindex Clearing the cache
722@cindex Cache cleaning
723Currently the cache isn't cleared automatically.
724@c Running the @code{clean-cache} shell script
725@c fist is recommended, to allow for future cleaning of the cache. This
726@c shell script will remove all files that have not been accessed since it
727@c was last run. To keep the cache pared down, it is recommended that this
728@c script be run from @i{at} or @i{cron} (see the manual pages for
729@c crontab(5) or at(1) for more information)
730
731@defopt url-automatic-caching
732Setting this variable non-@code{nil} causes documents to be cached
733automatically.
734@end defopt
735
736@defopt url-cache-directory
737This variable specifies the
738directory to store the cache files. It defaults to sub-directory
739@file{cache} of @code{url-configuration-directory}.
740@end defopt
741
742@c Fixme: function v. option, but neither used.
743@c @findex url-cache-expired
744@c @defopt url-cache-expired
745@c This is a function to decide whether or not a cache entry has expired.
746@c It takes two times as it parameters and returns non-@code{nil} if the
747@c second time is ``too old'' when compared with the first time.
748@c @end defopt
749
750@defopt url-cache-creation-function
751The cache relies on a scheme for mapping URLs to files in the cache.
752This variable names a function which sets the type of cache to use.
753It takes a URL as argument and returns the absolute file name of the
754corresponding cache file. The two supplied possibilities are
755@code{url-cache-create-filename-using-md5} and
756@code{url-cache-create-filename-human-readable}.
757@end defopt
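
A configuration along the following lines would turn on caching and
select the human-readable file naming scheme.  This is a sketch of how
the options fit together rather than a recommended setup.

@example
(setq url-automatic-caching t
      ;; Favor recognizable cache file names over collision resistance.
      url-cache-creation-function
      'url-cache-create-filename-human-readable)
@end example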
758
759@defun url-cache-create-filename-using-md5 url
Creates a cache file name from @var{url} using MD5 hashing.
This creates entries with very few cache collisions and is fast.
762@cindex MD5
763@smallexample
764(url-cache-create-filename-using-md5 "http://www.example.com/foo/bar")
765 @result{} "/home/fx/.url/cache/fx/http/com/example/www/b8a35774ad20db71c7c3409a5410e74f"
766@end smallexample
767@end defun
768
769@defun url-cache-create-filename-human-readable url
770Creates a cache file name from @var{url} more obviously connected to
771@var{url} than for @code{url-cache-create-filename-using-md5}, but
772more likely to conflict with other files.
773@smallexample
774(url-cache-create-filename-human-readable "http://www.example.com/foo/bar")
775 @result{} "/home/fx/.url/cache/fx/http/com/example/www/foo/bar"
776@end smallexample
777@end defun
778
779@c Fixme: never actually used currently?
780@c @defopt url-standalone-mode
781@c @cindex Relying on cache
782@c @cindex Cache only mode
783@c @cindex Standalone mode
784@c If this variable is non-@code{nil}, the library relies solely on the
785@c cache for fetching documents and avoids checking if they have changed
786@c on remote servers.
787@c @end defopt
788
789@c With a large cache of documents on the local disk, it can be very handy
790@c when traveling, or any other time the network connection is not active
791@c (a laptop with a dial-on-demand PPP connection, etc). Emacs/W3 can rely
792@c solely on its cache, and avoid checking to see if the page has changed
793@c on the remote server. In the case of a dial-on-demand PPP connection,
794@c this will keep the phone line free as long as possible, only bringing up
795@c the PPP connection when asking for a page that is not located in the
796@c cache. This is very useful for demonstrations as well.
797
798@node Proxies
799@section Proxies and Gatewaying
800
801@c fixme: check/document url-ns stuff
802@cindex proxy servers
803@cindex proxies
804@cindex environment variables
805@vindex HTTP_PROXY
Proxy servers are commonly used to provide gateways through firewalls
or as caches serving some more-or-less local network.  Each protocol
(HTTP, FTP, etc.)@: can have a different gateway server.  By convention,
proxying is configured in the same way for different programs, through
environment variables of the form @code{@var{protocol}_proxy}, where
@var{protocol} is one of the supported network protocols (@code{http},
@code{ftp} etc.).  The library recognizes such variables in either
upper or lower case.  Their values are of one of the forms:
814@itemize @bullet
815@item @code{@var{host}:@var{port}}
816@item A full URL;
817@item Simply a host name.
818@end itemize
819
820@vindex NO_PROXY
821The @code{NO_PROXY} environment variable specifies URLs that should be
822excluded from proxying (on servers that should be contacted directly).
823This should be a comma-separated list of hostnames, domain names, or a
824mixture of both. Asterisks can be used as wildcards, but other
825clients may not support that. Domain names may be indicated by a
826leading dot. For example:
827@example
828NO_PROXY="*.aventail.com,home.com,.seanet.com"
829@end example
830@noindent says to contact all machines in the @samp{aventail.com} and
831@samp{seanet.com} domains directly, as well as the machine named
832@samp{home.com}. If @code{NO_PROXY} isn't defined, @code{no_PROXY}
833and @code{no_proxy} are also tried, in that order.
834
835Proxies may also be specified directly in Lisp.
836
837@defopt url-proxy-services
This variable is an alist of URL schemes and proxy servers that
gateway them.  The items are of the form @w{@code{(@var{scheme}
. @var{host}:@var{portnumber})}}, which says that the URL @var{scheme} is
gatewayed through @var{portnumber} on the specified @var{host}.  An
exception is the pseudo scheme @code{"no_proxy"}, which is paired with
a regexp matching host names not to be proxied.  This variable is
initialized from the environment as above.
845
846@example
847(setq url-proxy-services
848 '(("http" . "proxy.aventail.com:80")
849 ("no_proxy" . "^.*\\(aventail\\|seanet\\)\\.com")))
850@end example
851@end defopt
852
853@node Gateways in general
854@section Gateways in General
855@cindex gateways
856@cindex firewalls
857
858The library provides a general gateway layer through which all
859networking passes. It can both control access to the network and
860provide access through gateways in firewalls. This may make direct
861connections in some cases and pass through some sort of gateway in
862others.@footnote{Proxies (which only operate over HTTP) are
863implemented using this.} The library's basic function responsible for
864making connections is @code{url-open-stream}.
865
866@defun url-open-stream name buffer host service
867@cindex opening a stream
868@cindex stream, opening
869Open a stream to @var{host}, possibly via a gateway. The other
870arguments are as for @code{open-network-stream}. This will not make a
871connection if @code{url-gateway-unplugged} is non-@code{nil}.
872@end defun
873
874@defvar url-gateway-local-host-regexp
875This is a regular expression that matches local hosts that do not
876require the use of a gateway. If @code{nil}, all connections are made
877through the gateway.
878@end defvar
879
880@defvar url-gateway-method
881This variable controls which gateway method is used. It may be useful
882to bind it temporarily in some applications. It has values taken from
883a list of symbols. Possible values are:
884
885@table @code
886@item telnet
887@cindex @command{telnet}
888Use this method if you must first telnet and log into a gateway host,
889and then run telnet from that host to connect to outside machines.
890
891@item rlogin
892@cindex @command{rlogin}
893This method is identical to @code{telnet}, but uses @command{rlogin}
894to log into the remote machine without having to send the username and
895password over the wire every time.
896
897@item socks
898@cindex @sc{socks}
899Use if the firewall has a @sc{socks} gateway running on it. The
900@sc{socks} v5 protocol is defined in RFC 1928.
901
902@c @item ssl
903@c This probably shouldn't be documented
904@c Fixme: why not? -- fx
905
906@item native
907This method uses Emacs's builtin networking directly. This is the
908default. It can be used only if there is no firewall blocking access.
909@end table
910@end defvar
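
Binding the variable temporarily, as suggested above, might look like
the following sketch, which forces a direct connection for a single
retrieval.  The URL is a placeholder.

@example
(require 'url)

;; Bypass any configured gateway for this one request.
(let ((url-gateway-method 'native))
  (url-retrieve-synchronously "http://www.example.com/"))
@end example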
911
912The following variables control the gateway methods.
913
914@defopt url-gateway-telnet-host
915The gateway host to telnet to. Once logged in there, you then telnet
916out to the hosts you want to connect to.
917@end defopt
918@defopt url-gateway-telnet-parameters
919This should be a list of parameters to pass to the @command{telnet} program.
920@end defopt
921@defopt url-gateway-telnet-password-prompt
922This is a regular expression that matches the password prompt when
923logging in.
924@end defopt
925@defopt url-gateway-telnet-login-prompt
926This is a regular expression that matches the username prompt when
927logging in.
928@end defopt
929@defopt url-gateway-telnet-user-name
930The username to log in with.
931@end defopt
932@defopt url-gateway-telnet-password
933The password to send when logging in.
934@end defopt
935@defopt url-gateway-prompt-pattern
936This is a regular expression that matches the shell prompt.
937@end defopt
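
As an illustration, a telnet gateway might be configured as follows.
Every value here (host name, user name, prompt patterns) is a
placeholder chosen for the example, not a recommended setting.

@example
(setq url-gateway-method 'telnet
      url-gateway-telnet-host "gateway.example.org"
      url-gateway-telnet-user-name "jrhacker"
      ;; The password is deliberately not set in this sketch.
      url-gateway-telnet-login-prompt "^login: *"
      url-gateway-telnet-password-prompt "^Password: *"
      url-gateway-prompt-pattern "^[$%#>] *")
@end example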
938
939@defopt url-gateway-rlogin-host
940Host to @samp{rlogin} to before telnetting out.
941@end defopt
942@defopt url-gateway-rlogin-parameters
943Parameters to pass to @samp{rsh}.
944@end defopt
945@defopt url-gateway-rlogin-user-name
946User name to use when logging in to the gateway.
947@end defopt
951
952@defopt socks-server
This specifies the default server. It takes the form
954@w{@code{("Default server" @var{server} @var{port} @var{version})}}
955where @var{version} can be either 4 or 5.
956@end defopt
957@defvar socks-password
958If this is @code{nil} then you will be asked for the password,
959otherwise it will be used as the password for authenticating you to
960the @sc{socks} server.
961@end defvar
962@defvar socks-username
963This is the username to use when authenticating yourself to the
964@sc{socks} server. By default this is your login name.
965@end defvar
966@defvar socks-timeout
967This controls how long, in seconds, to wait for responses from the
968@sc{socks} server; it is 5 by default.
969@end defvar
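
Putting these together, a @sc{socks} setup might look like the
following sketch. The server name and port are placeholders, and the
@code{socks} feature is assumed to be the one that defines these
variables.

@example
(require 'socks)

(setq socks-server '("Default server" "socks.example.org" 1080 5)
      socks-timeout 10              ; seconds to wait for replies
      url-gateway-method 'socks)
@end example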
970@c fixme: these have been effectively commented-out in the code
971@c @defopt socks-server-aliases
972@c This a list of server aliases. It is a list of aliases of the form
973@c @var{(alias hostname port version)}.
974@c @end defopt
975@c @defopt socks-network-aliases
976@c This a list of network aliases. Each entry in the list takes the form
977@c @var{(alias (network))} where @var{alias} is a string that names the
978@c @var{network}. The networks can contain a pair (not a dotted pair) of
979@c @sc{ip} addresses which specify a range of @sc{ip} addresses, an @sc{ip}
980@c address and a netmask, a domain name or a unique hostname or @sc{ip}
981@c address.
982@c @end defopt
983@c @defopt socks-redirection-rules
984@c This a list of redirection rules. Each rule take the form
985@c @var{(Destination network Connection type)} where @var{Destination
986@c network} is a network alias from @code{socks-network-aliases} and
987@c @var{Connection type} can be @code{nil} in which case a direct
988@c connection is used, or it can be an alias from
989@c @code{socks-server-aliases} in which case that server is used as a
990@c proxy.
991@c @end defopt
992@defopt socks-nslookup-program
993@cindex @command{nslookup}
This is the @samp{nslookup} program. It is @code{"nslookup"} by default.
995@end defopt
996
997@menu
998* Suppressing network connections::
999@end menu
1000@c * Broken hostname resolution::
1001
1002@node Suppressing network connections
1003@subsection Suppressing Network Connections
1004
1005@cindex network connections, suppressing
1006@cindex suppressing network connections
1007@cindex bugs, HTML
1008@cindex HTML `bugs'
1009In some circumstances it is desirable to suppress making network
1010connections. A typical case is when rendering HTML in a mail user
1011agent, when external URLs should not be activated, particularly to
avoid `bugs' which `call home' by fetching single-pixel images and the
1013like. To arrange this, bind the following variable for the duration
1014of such processing.
1015
1016@defvar url-gateway-unplugged
If this variable is non-@code{nil}, new network connections are never
1018opened by the URL library.
1019@end defvar
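
A mail user agent could arrange this along the following lines. The
function @code{my-render-html-part} is hypothetical and stands for
whatever code does the rendering.

@example
;; Load the library first so that the variable is known to be
;; special before it is let-bound.
(require 'url)

(defun my-render-html-part-offline (part)
  "Render PART with `my-render-html-part', opening no connections."
  (let ((url-gateway-unplugged t))
    (my-render-html-part part)))
@end example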
1020
1021@c @node Broken hostname resolution
1022@c @subsection Broken Hostname Resolution
1023
1024@c @cindex hostname resolver
1025@c @cindex resolver, hostname
1026@c Some C libraries do not include the hostname resolver routines in
1027@c their static libraries. If Emacs was linked statically, and was not
1028@c linked with the resolver libraries, it will not be able to get to any
1029@c machines off the local network. This is characterized by being able
1030@c to reach someplace with a raw ip number, but not its hostname
1031@c (@url{http://129.79.254.191/} works, but
1032@c @url{http://www.cs.indiana.edu/} doesn't). This used to happen on
1033@c SunOS4 and Ultrix, but is now probably now rare. If Emacs can't be
1034@c rebuilt linked against the resolver library, it can use the external
1035@c @command{nslookup} program instead.
1036
1037@c @defopt url-gateway-broken-resolution
1038@c @cindex @code{nslookup} program
1039@c @cindex program, @code{nslookup}
1040@c If non-@code{nil}, this variable says to use the program specified by
1041@c @code{url-gateway-nslookup-program} program to do hostname resolution.
1042@c @end defopt
1043
1044@c @defopt url-gateway-nslookup-program
1045@c The name of the program to do hostname lookup if Emacs can't do it
1046@c directly. This program should expect a single argument on the command
1047@c line---the hostname to resolve---and should produce output similar to
1048@c the standard Unix @command{nslookup} program:
1049@c @example
1050@c Name: www.cs.indiana.edu
1051@c Address: 129.79.254.191
1052@c @end example
1053@c @end defopt
1054
1055@node History
1056@section History
1057
1058@findex url-do-setup
1059The library can maintain a global history list tracking URLs accessed.
1060URL completion can be done from it. The history mechanism is set up
1061automatically via @code{url-do-setup} when it is configured to be on.
1062Note that the size of the history list is currently not limited.
1063
1064@vindex url-history-hash-table
1065The history `list' is actually a hash table,
1066@code{url-history-hash-table}. It contains access times keyed by URL
1067strings. The times are in the format returned by @code{current-time}.
1068
1069@defun url-history-update-url url time
1070This function updates the history table with an entry for @var{url}
1071accessed at the given @var{time}.
1072@end defun
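
For example, the following sketch records an access to a placeholder
URL and then looks up its access time. It assumes that the
@code{url-history} feature provides these definitions.

@example
(require 'url-history)

(url-history-update-url "http://www.example.org/" (current-time))

;; The value is a time in the format returned by `current-time',
;; or nil if the URL has not been recorded.
(gethash "http://www.example.org/" url-history-hash-table)
@end example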
1073
1074@defopt url-history-track
1075If non-@code{nil}, the library will keep track of all the URLs
1076accessed. If it is @code{t}, the list is saved to disk at the end of
1077each Emacs session. The default is @code{nil}.
1078@end defopt
1079
1080@defopt url-history-file
1081The file storing the history list between sessions. It defaults to
1082@file{history} in @code{url-configuration-directory}.
1083@end defopt
1084
1085@defopt url-history-save-interval
1086@findex url-history-setup-save-timer
1087The number of seconds between automatic saves of the history list.
1088Default is one hour. Note that if you change this variable directly,
1089rather than using Custom, after @code{url-do-setup} has been run, you
1090need to run the function @code{url-history-setup-save-timer}.
1091@end defopt
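
As a sketch, history tracking with a shorter save interval could be
set up from Lisp like this. The interval of ten minutes is an
arbitrary value chosen for the example.

@example
(require 'url-history)

(setq url-history-track t             ; also save at end of session
      url-history-save-interval 600)  ; ten minutes, in seconds

;; Needed only if `url-do-setup' has already been run.
(url-history-setup-save-timer)
@end example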
1092
1093@defun url-history-parse-history &optional fname
1094Parses the history file @var{fname} (default @code{url-history-file})
1095and sets up the history list.
1096@end defun
1097
1098@defun url-history-save-history &optional fname
1099Saves the current history to file @var{fname} (default
1100@code{url-history-file}).
1101@end defun
1102
1103@defun url-completion-function string predicate function
1104You can use this function to do completion of URLs from the history.
1105@end defun
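
Since the function takes the three arguments that the completion
machinery passes to a completion function, one plausible use is as the
collection argument of @code{completing-read}. The prompt string here
is arbitrary.

@example
(require 'url-history)

;; Read a URL, completing against the URLs in the history table.
(completing-read "URL: " #'url-completion-function)
@end example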
1106
1107@node Customization
1108@chapter Customization
1109
1110@section Environment Variables
1111
1112@cindex environment variables
1113The following environment variables affect the library's operation at
1114startup.
1115
1116@table @code
1117@item TMPDIR
1118@vindex TMPDIR
1119@vindex url-temporary-directory
If this is defined, @code{url-temporary-directory} is initialized from
1121it.
1122@end table
1123
1124@section General User Options
1125
1126The following user options, settable with Customize, affect the
1127general operation of the package.
1128
1129@defopt url-debug
1130@cindex debugging
Specifies the types of debug messages which are logged to
the @code{*URL-DEBUG*} buffer.
@code{t} means log all messages.
A number means log all messages and show them with @code{message}.
It may also be a list of the types of messages to be logged.
1136@end defopt
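
For instance, to log everything while tracking down a problem and then
look at the log (a sketch; the buffer is created here in case nothing
has been logged yet):

@example
(setq url-debug t)

;; After reproducing the problem, inspect the log buffer.
(display-buffer (get-buffer-create "*URL-DEBUG*"))
@end example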
1137@defopt url-personal-mail-address
1138@end defopt
1139@defopt url-privacy-level
1140@end defopt
1141@defopt url-uncompressor-alist
1142@end defopt
1143@defopt url-passwd-entry-func
1144@end defopt
1145@defopt url-standalone-mode
1146@end defopt
1147@defopt url-bad-port-list
1148@end defopt
1149@defopt url-max-password-attempts
1150@end defopt
1151@defopt url-temporary-directory
1152@end defopt
1153@defopt url-show-status
1154@end defopt
1155@defopt url-confirmation-func
The function to use for asking yes or no questions. This is normally
1157either @code{y-or-n-p} or @code{yes-or-no-p}, but could be another
1158function taking a single argument (the prompt) and returning @code{t}
1159only if an affirmative answer is given.
1160@end defopt
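
For example, to require a full @samp{yes} or @samp{no} answer, the
option could be set from Lisp as in this sketch rather than through
Customize:

@example
(setq url-confirmation-func #'yes-or-no-p)
@end example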
1161@defopt url-gateway-method
1162@c fixme: describe gatewaying
1163A symbol specifying the type of gateway support to use for connections
1164from the local machine. The supported methods are:
1165
1166@table @code
1167@item telnet
1168Run telnet in a subprocess to connect;
1169@item rlogin
1170Rlogin to another machine to connect;
1171@item socks
1172Connect through a socks server;
1173@item ssl
1174Connect with SSL;
1175@item native
1176Connect directly.
1177@end table
1178@end defopt
1179
1180@node GNU Free Documentation License
1181@appendix GNU Free Documentation License
1182@include doclicense.texi
1183
1184@node Function Index
1185@unnumbered Command and Function Index
1186@printindex fn
1187
1188@node Variable Index
1189@unnumbered Variable Index
1190@printindex vr
1191
1192@node Concept Index
1193@unnumbered Concept Index
1194@printindex cp
1195
1196@setchapternewpage odd
1197@contents
1198@bye
1199
1200@ignore
1201 arch-tag: c96be356-7e2d-4196-bcda-b13246c5c3f0
1202@end ignore