@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Web
@section @acronym{HTTP}, the Web, and All That
@cindex Web
@cindex WWW
@cindex HTTP

It has always been possible to connect computers together and share
information between them, but the rise of the World-Wide Web over the
last couple of decades has made it much easier to do so. The result is
a richly connected network of computation, in which Guile forms a part.

By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
servers, clients, proxies, caches, and the various kinds of messages and
message components that can be sent and received by that protocol,
notably HTML.

On one level, the web is text in motion: the protocols themselves are
textual (though the payload may be binary), and it's possible to create
a socket and speak text to the web. But such an approach is obviously
primitive. This section details the higher-level data types and
operations provided by Guile: URIs, HTTP request and response records,
and a conventional web server implementation.

The material in this section is arranged in ascending order, in which
later concepts build on previous ones. If you prefer to start with the
highest-level perspective, @pxref{Web Examples}, and work your way
back.

@menu
* Types and the Web::           Types prevent bugs and security problems.
* URIs::                        Uniform Resource Identifiers.
* HTTP::                        The Hyper-Text Transfer Protocol.
* HTTP Headers::                How Guile represents specific header values.
* Requests::                    HTTP requests.
* Responses::                   HTTP responses.
* Web Client::                  Accessing web resources over HTTP.
* Web Server::                  Serving HTTP to the internet.
* Web Examples::                How to use this thing.
@end menu

@node Types and the Web
@subsection Types and the Web

It is a truth universally acknowledged, that a program with good use of
data types, will be free from many common bugs. Unfortunately, the
common practice in web programming seems to ignore this maxim. This
subsection makes the case for expressive data types in web programming.

By ``expressive data types'', we mean that the data types @emph{say}
something about how a program solves a problem. For example, if we
choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
this indicates that there is a part of the program that will always have
valid dates. Error handling for a number of basic cases, like invalid
dates, occurs on the boundary in which we produce a SRFI 19 date record
from other types, like strings.

With regard to the web, data types are helpful in the two broad phases
of HTTP messages: parsing and generation.

Consider a server, which has to parse a request, and produce a response.
Guile will parse the request into an HTTP request object
(@pxref{Requests}), with each header parsed into an appropriate Scheme
data type. This transition from an incoming stream of characters to
typed data is a state change in a program---the strings might parse, or
they might not, and something has to happen if they do not. (Guile
throws an error in this case.) But after you have the parsed request,
``client'' code (code built on top of the Guile web framework) will not
have to check for syntactic validity. The types already make this
information manifest.

This state change on the parsing boundary makes programs more robust,
as they themselves are freed from the need to do a number of common
error checks, and they can use normal Scheme procedures to handle a
request instead of ad-hoc string parsers.

The need for types on the response generation side (in a server) is more
subtle, though not less important. Consider the example of a POST
handler, which prints out the text that a user submits from a form.
Such a handler might include a procedure like this:

@example
;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))

(display (you-said "Hi!"))
@print{} <p>You said: Hi!</p>
@end example

This is a perfectly valid implementation, provided that the incoming
text does not contain the special HTML characters @samp{<}, @samp{>}, or
@samp{&}. But this provision of a restricted character set is not
reflected anywhere in the program itself: we must @emph{assume} that the
programmer understands this, and performs the check elsewhere.

Unfortunately, the short history of the practice of programming does not
bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
vulnerability is just such a common error in which unfiltered user input
is allowed into the output. A user could submit a crafted comment to
your web site which results in visitors running malicious JavaScript,
within the security context of your domain:

@example
(display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
@print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
@end example

The fundamental problem here is that both user data and the program
template are represented using strings. This identity means that types
can't help the programmer to make a distinction between these two, so
they get confused.

There are a number of possible solutions, but perhaps the best is to
treat HTML not as strings, but as native s-expressions: as SXML. The
basic idea is that HTML is either text, represented by a string, or an
element, represented as a tagged list. So @samp{foo} becomes
@samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
Attributes, if present, go in a tagged list headed by @samp{@@}, like
@samp{(img (@@ (src "http://example.com/foo.png")))}.
@xref{sxml simple}, for more information.

The good thing about SXML is that HTML elements cannot be confused with
text. Let's make a new definition of @code{para}:

@example
(define (para . contents)
  `(p ,@@contents))

(use-modules (sxml simple))
(sxml->xml (you-said "Hi!"))
@print{} <p>You said: Hi!</p>

(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
@print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
@end example

So we see in the second example that HTML elements cannot be unwittingly
introduced into the output. However it is now perfectly acceptable to
pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
over everything-as-a-string.

@example
(sxml->xml (you-said (you-said "<Hi!>")))
@print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
@end example

The SXML types allow procedures to @emph{compose}. The types make
manifest which parts are HTML elements, and which are text. So you
needn't worry about escaping user input; the type transition back to a
string handles that for you. @acronym{XSS} vulnerabilities are a thing
of the past.

Well. That's all very nice and opinionated and such, but how do I use
the thing? Read on!

@node URIs
@subsection Uniform Resource Identifiers

Guile provides a standard data type for Uniform Resource Identifiers
(URIs), as defined in RFC 3986.

The generic URI syntax is as follows:

@example
URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
       [ "?" query ] [ "#" fragment ]
@end example

For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
scheme is @code{http}, the host is @code{www.gnu.org}, the path is
@code{/help/}, and there is no userinfo, port, query, or fragment. All
URIs have a scheme and a path (though the path might be empty). Some
URIs have a host, and some of those have ports and userinfo. Any URI
might have a query part or a fragment.

Userinfo is something of an abstraction, as some legacy URI schemes
allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
since passwords do not belong in URIs, the RFC does not want to condone
this practice, so it calls anything before the @code{@@} sign
@dfn{userinfo}.

Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to @indicateurl{http://example.com/#foo}, it
sends a request for @indicateurl{http://example.com/}, then looks in the
resulting page for the fragment identified by @code{foo}. A
fragment identifies a part of a resource, not the resource itself. But
it is useful to have a fragment field in the URI record itself, so we
hope you will forgive the inconsistency.

@example
(use-modules (web uri))
@end example

The following procedures can be found in the @code{(web uri)}
module. Load it into your Guile, using a form like the above, to have
access to them.

@deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
       [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
       [#:fragment=@code{#f}] [#:validate?=@code{#t}]
Construct a URI object. @var{scheme} should be a symbol, @var{port}
either an exact integer or @code{#f}, and the rest of the fields are
either strings or @code{#f}. If @var{validate?} is true, also run some
consistency checks to make sure that the constructed URI is valid.
@end deffn

@deffn {Scheme Procedure} uri? x
@deffnx {Scheme Procedure} uri-scheme uri
@deffnx {Scheme Procedure} uri-userinfo uri
@deffnx {Scheme Procedure} uri-host uri
@deffnx {Scheme Procedure} uri-port uri
@deffnx {Scheme Procedure} uri-path uri
@deffnx {Scheme Procedure} uri-query uri
@deffnx {Scheme Procedure} uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme
will be a symbol, the port either an exact integer or @code{#f}, and the
rest either strings or @code{#f} if not present.
@end deffn

@deffn {Scheme Procedure} string->uri string
Parse @var{string} into a URI object. Return @code{#f} if the string
could not be parsed.
@end deffn
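
As a brief illustration of the parser and the accessors, the URI from
the start of this subsection can be taken apart as follows. This is
only a sketch; the results shown assume the behaviour described above,
with absent fields returned as @code{#f}.

@example
(use-modules (web uri))

(define uri (string->uri "http://www.gnu.org/help/"))

(uri? uri)           ; @result{} #t
(uri-scheme uri)     ; @result{} http
(uri-host uri)       ; @result{} "www.gnu.org"
(uri-path uri)       ; @result{} "/help/"
(uri-port uri)       ; @result{} #f
(uri-fragment uri)   ; @result{} #f
@end example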

@deffn {Scheme Procedure} uri->string uri
Serialize @var{uri} to a string. If the URI has a port that is the
default port for its scheme, the port is not included in the
serialization.
@end deffn
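
The following sketch builds two URIs and serializes them again. It
assumes that 80 is the default port registered for the @code{http}
scheme, so the first result omits the port while the second keeps it.
The host, port, and query values are made up for the example.

@example
(use-modules (web uri))

;; Port 80 is assumed to be the default for http, so it is dropped.
(uri->string (build-uri 'http #:host "www.gnu.org" #:port 80
                        #:path "/help/"))
;; @result{} "http://www.gnu.org/help/"

;; A non-default port is kept in the serialization.
(uri->string (build-uri 'http #:host "www.gnu.org" #:port 8080
                        #:path "/help/" #:query "lang=en"))
;; @result{} "http://www.gnu.org:8080/help/?lang=en"
@end example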

@deffn {Scheme Procedure} declare-default-port! scheme port
Declare a default port for the given URI scheme.
@end deffn

@deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
Percent-decode the given @var{str}, according to @var{encoding}, which
should be the name of a character encoding.

Note that this function should not generally be applied to a full URI
string. For paths, use @code{split-and-decode-uri-path} instead. For
query strings, split the query on @code{&} and @code{=} boundaries, and
decode the components separately.

Note also that percent-encoded strings encode @emph{bytes}, not
characters. There is no guarantee that a given byte sequence is a valid
string encoding. Therefore this routine may signal an error if the
decoded bytes are not valid for the given encoding. Pass @code{#f} for
@var{encoding} if you want decoded bytes as a bytevector directly.
@xref{Ports, @code{set-port-encoding!}}, for more information on
character encodings.

Returns a string of the decoded characters, or a bytevector if
@var{encoding} was @code{#f}.
@end deffn
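
The advice about query strings can be sketched as follows. The
procedure @code{query->alist} is a hypothetical helper written for this
example; it is not part of @code{(web uri)}. It splits on @code{&}
first and on @code{=} second, and only then decodes each component.

@example
(use-modules (web uri))

;; Hypothetical helper, not provided by Guile: turn a raw query
;; string into an alist of decoded keys and values.
(define (query->alist query)
  (map (lambda (pair)
         (let ((parts (string-split pair #\=)))
           (cons (uri-decode (car parts))
                 (if (null? (cdr parts))
                     ""                    ; key without a value
                     (uri-decode (string-join (cdr parts) "="))))))
       (string-split query #\&)))

(query->alist "q=scrambled%20eggs&lang=en")
;; @result{} (("q" . "scrambled eggs") ("lang" . "en"))
@end example

Decoding after splitting means that an encoded @code{&} or @code{=}
inside a value cannot be mistaken for a delimiter.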

Fixme: clarify return type. indicate default values. type of
unescaped-chars.

@deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
Percent-encode any character not in the character set,
@var{unescaped-chars}.

The default character set includes alphanumerics from ASCII, as well as
the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
other character will be percent-encoded, by writing out the character to
a bytevector within the given @var{encoding}, then encoding each byte as
@code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
the byte.
@end deffn

@deffn {Scheme Procedure} split-and-decode-uri-path path
Split @var{path} into its components, and decode each component,
removing empty components.

For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
@code{("foo" "bar baz")}.
@end deffn

@deffn {Scheme Procedure} encode-and-join-uri-path parts
URI-encode each element of @var{parts}, which should be a list of
strings, and join the parts together with @code{/} as a delimiter.

For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
as @code{"scrambled%20eggs/biscuits%26gravy"}.
@end deffn

@node HTTP
@subsection The Hyper-Text Transfer Protocol

The initial motivation for including web functionality in Guile, rather
than rely on an external package, was to establish a standard base on
which people can share code. To that end, we continue the focus on data
types by providing a number of low-level parsers and unparsers for
elements of the HTTP protocol.

If you want to skip the low-level details for now and move on to web
pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
HTTP module, and read on.

@example
(use-modules (web http))
@end example

The focus of the @code{(web http)} module is to parse and unparse
standard HTTP headers, representing them to Guile as native data
structures. For example, a @code{Date:} header will be represented as a
SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.

Guile tries to follow RFCs fairly strictly---the road to perdition being
paved with compatibility hacks---though some allowances are made for
not-too-divergent texts.

Header names are represented as lower-case symbols.

@deffn {Scheme Procedure} string->header name
Parse @var{name} to a symbolic header name.
@end deffn

@deffn {Scheme Procedure} header->string sym
Return the string form for the header named @var{sym}.
@end deffn

For example:

@example
(string->header "Content-Length")
@result{} content-length
(header->string 'content-length)
@result{} "Content-Length"

(string->header "FOO")
@result{} foo
(header->string 'foo)
@result{} "Foo"
@end example

Guile keeps a registry of known headers, their string names, and some
parsing and serialization procedures. If a header is unknown, its
string name is simply its symbol name in title-case.

@deffn {Scheme Procedure} known-header? sym
Return @code{#t} iff @var{sym} is a known header, with associated
parsers and serialization procedures.
@end deffn

@deffn {Scheme Procedure} header-parser sym
Return the value parser for headers named @var{sym}. The result is a
procedure that takes one argument, a string, and returns the parsed
value. If the header isn't known to Guile, a default parser is returned
that passes through the string unchanged.
@end deffn

@deffn {Scheme Procedure} header-validator sym
Return a predicate which returns @code{#t} if the given value is valid
for headers named @var{sym}. The default validator for unknown headers
is @code{string?}.
@end deffn

@deffn {Scheme Procedure} header-writer sym
Return a procedure that writes values for headers named @var{sym} to a
port. The resulting procedure takes two arguments: a value and a port.
The default writer is @code{display}.
@end deffn
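
As a rough sketch of how the registry procedures fit together, the
following looks up the parser, validator, and writer for one known
header and for one unknown header. The name @code{x-made-up} is
invented for the example and is assumed not to be registered.

@example
(use-modules (web http))

(known-header? 'content-length)               ; true
((header-parser 'content-length) "300")       ; @result{} 300
((header-validator 'content-length) 300)      ; true
((header-validator 'content-length) "300")    ; false

;; An unknown header falls back to the defaults described above.
((header-parser 'x-made-up) "some value")     ; @result{} "some value"
(call-with-output-string
  (lambda (port)
    ((header-writer 'x-made-up) "some value" port)))
;; @result{} "some value"
@end example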

For more on the set of headers that Guile knows about out of the box,
@pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
procedure:

@deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
Declare a parser, validator, and writer for a given header.
@end deffn

For example, let's say you are running a web server behind some sort of
proxy, and your proxy adds an @code{X-Client-Address} header, indicating
the IPv4 address of the original client. You would like for the HTTP
request record to parse out this header to a Scheme value, instead of
leaving it as a string. You could register this header with Guile's
HTTP stack like this:

@example
(declare-header! "X-Client-Address"
  (lambda (str)
    (inet-aton str))
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))
@end example

@deffn {Scheme Procedure} valid-header? sym val
Return a true value iff @var{val} is a valid Scheme value for the header
with name @var{sym}.
@end deffn
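
To see a declaration and @code{valid-header?} working together, here is
a small sketch using a second, purely hypothetical header,
@code{X-Retry-Count}, whose value is a non-negative integer. Neither
the header nor its meaning comes from any standard; it only serves to
show the three procedures in use.

@example
(use-modules (web http))

;; Hypothetical header, declared only for this illustration.
(declare-header! "X-Retry-Count"
  string->number                            ; parser: string to number
  (lambda (val)                             ; validator
    (and (integer? val) (exact? val) (>= val 0)))
  (lambda (val port)                        ; writer
    (display val port)))

(parse-header 'x-retry-count "3")     ; @result{} 3
(valid-header? 'x-retry-count 3)      ; true
(valid-header? 'x-retry-count "3")    ; false
@end example

Note that the header is declared by its string name, but looked up
afterwards by its lower-case symbol.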

Now that we have a generic interface for reading and writing headers, we
do just that.

@deffn {Scheme Procedure} read-header port
Read one HTTP header from @var{port}. Return two values: the header
name and the parsed Scheme value. May raise an exception if the header
was known but the value was invalid.

Returns the end-of-file object for both values if the end of the message
headers was reached (i.e., a blank line).
@end deffn

@deffn {Scheme Procedure} parse-header name val
Parse @var{val}, a string, with the parser for the header named
@var{name}. Returns the parsed value.
@end deffn

@deffn {Scheme Procedure} write-header name val port
Write the given header name and value to @var{port}, using the writer
from @code{header-writer}.
@end deffn

@deffn {Scheme Procedure} read-headers port
Read the headers of an HTTP message from @var{port}, returning the
headers as an ordered alist.
@end deffn

@deffn {Scheme Procedure} write-headers headers port
Write the given header alist to @var{port}. Doesn't write the final
@samp{\r\n}, as the user might want to add another header.
@end deffn
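
A round trip through @code{read-headers} and @code{write-headers} might
look like the sketch below, which uses string ports in place of a
network connection. The header text is made up for the example, and
the outputs shown assume the standard parsers and writers for these two
headers.

@example
(use-modules (web http))

;; Parse a header block that ends with a blank line.
(define headers
  (call-with-input-string
      "Content-Length: 300\r\nContent-Type: text/plain\r\n\r\n"
    read-headers))

headers
;; @result{} ((content-length . 300) (content-type text/plain))

;; Write the same alist back out; the final blank line is not written.
(call-with-output-string
  (lambda (port)
    (write-headers headers port)))
;; @result{} "Content-Length: 300\r\nContent-Type: text/plain\r\n"
@end example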

The @code{(web http)} module also has some utility procedures to read
and write request and response lines.

@deffn {Scheme Procedure} parse-http-method str [start] [end]
Parse an HTTP method from @var{str}. The result is an upper-case symbol,
like @code{GET}.
@end deffn

@deffn {Scheme Procedure} parse-http-version str [start] [end]
Parse an HTTP version from @var{str}, returning it as a major-minor
pair. For example, @code{HTTP/1.1} parses as the pair of integers,
@code{(1 . 1)}.
@end deffn

@deffn {Scheme Procedure} parse-request-uri str [start] [end]
Parse a URI from an HTTP request line. Note that URIs in requests do not
have to have a scheme or host name. The result is a URI object.
@end deffn

@deffn {Scheme Procedure} read-request-line port
Read the first line of an HTTP request from @var{port}, returning three
values: the method, the URI, and the version.
@end deffn

@deffn {Scheme Procedure} write-request-line method uri version port
Write the first line of an HTTP request to @var{port}.
@end deffn

@deffn {Scheme Procedure} read-response-line port
Read the first line of an HTTP response from @var{port}, returning three
values: the HTTP version, the response code, and the ``reason phrase''.
@end deffn

@deffn {Scheme Procedure} write-response-line version code reason-phrase port
Write the first line of an HTTP response to @var{port}.
@end deffn
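
Since the readers return multiple values, a caller needs a form such as
@code{receive} to bind them. The following sketch reads one request
line and one response line from string ports; the request and response
text are invented for the example.

@example
(use-modules (web http) (web uri) (ice-9 receive))

(receive (method uri version)
    (call-with-input-string "GET /index.html HTTP/1.1\r\n"
      read-request-line)
  (list method (uri-path uri) version))
;; @result{} (GET "/index.html" (1 . 1))

(receive (version code reason-phrase)
    (call-with-input-string "HTTP/1.1 200 OK\r\n"
      read-response-line)
  (list version code reason-phrase))
;; @result{} ((1 . 1) 200 "OK")
@end example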


@node HTTP Headers
@subsection HTTP Headers

In addition to defining the infrastructure to parse headers, the
@code{(web http)} module defines specific parsers and unparsers for all
headers defined in the HTTP/1.1 standard.

For example, if you receive a header named @samp{Accept-Language} with a
value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
below):

@example
(parse-header 'accept-language "en, es;q=0.8")
@result{} ((1000 . "en") (800 . "es"))
@end example

The format of the value for @samp{Accept-Language} headers is defined
below, along with all other headers defined in the HTTP standard. (If
the header were unknown, the value would have been returned as a
string.)

For brevity, the header definitions below are given in the form,
@var{Type} @code{@var{name}}, indicating that values for the header
@code{@var{name}} will be of the given @var{Type}. Since Guile
internally treats header names in lower case, in this document we give
types title-cased names. A short description of each header's
purpose and an example follow.

For full details on the meanings of all of these headers, see the HTTP
1.1 standard, RFC 2616.

@subsubsection HTTP Header Types

Here we define the types that are used below, when defining headers.

@deftp {HTTP Header Type} Date
A SRFI-19 date.
@end deftp

@deftp {HTTP Header Type} KVList
A list whose elements are keys or key-value pairs. Keys are parsed to
symbols. Values are strings by default. Non-string values are the
exception, and are mentioned explicitly below, as appropriate.
@end deftp

@deftp {HTTP Header Type} SList
A list of strings.
@end deftp

@deftp {HTTP Header Type} Quality
An exact integer between 0 and 1000. Qualities are used to express
preference, given multiple options. An option with a quality of 870,
for example, is preferred over an option with quality 500.

(Qualities are written out over the wire as numbers between 0.0 and
1.0, but since the standard only allows three digits after the decimal,
it's equivalent to integers between 0 and 1000, so that's what Guile
uses.)
@end deftp

@deftp {HTTP Header Type} QList
A quality list: a list of pairs, the car of which is a quality, and the
cdr a string. Used to express a list of options, along with their
qualities.
@end deftp
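
Because a quality list is an ordinary list of pairs, ranking the
options by preference needs nothing more than @code{sort}. In the
sketch below, @code{preferred-options} is a hypothetical helper written
for this example and is not part of @code{(web http)}.

@example
(use-modules (web http))

;; Hypothetical helper: return the options of a quality list,
;; most preferred first.
(define (preferred-options qlist)
  (map cdr
       (sort qlist (lambda (a b) (> (car a) (car b))))))

(preferred-options (parse-header 'accept-language "es;q=0.8, en"))
;; @result{} ("en" "es")
@end example

An option given without an explicit @samp{q} parameter gets the maximum
quality of 1000, which is why @samp{en} comes first here.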

@deftp {HTTP Header Type} ETag
An entity tag, represented as a pair. The car of the pair is an opaque
string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
tag, and @code{#f} otherwise.
@end deftp

@subsubsection General Headers

General HTTP headers may be present in any HTTP message.

@deftypevr {HTTP Header} KVList cache-control
A key-value list of cache-control directives. See RFC 2616, for more
details.

If present, parameters to @code{max-age}, @code{max-stale},
@code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
integers.

If present, parameters to @code{private} and @code{no-cache} are parsed
as lists of header names, as symbols.

@example
(parse-header 'cache-control "no-cache,no-store")
@result{} (no-cache no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
@result{} ((no-cache . (authorization date)) no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
@result{} ((no-cache . (authorization date)) (max-age . 10))
@end example
@end deftypevr

@deftypevr {HTTP Header} List connection
A list of header names that apply only to this HTTP connection, as
symbols. Additionally, the symbol @samp{close} may be present, to
indicate that the server should close the connection after responding to
the request.
@example
(parse-header 'connection "close")
@result{} (close)
@end example
@end deftypevr

@deftypevr {HTTP Header} Date date
The date that a given HTTP message was originated.
@example
(parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
@end deftypevr

@deftypevr {HTTP Header} KVList pragma
A key-value list of implementation-specific directives.
@example
(parse-header 'pragma "no-cache, broccoli=tasty")
@result{} (no-cache (broccoli . "tasty"))
@end example
@end deftypevr

@deftypevr {HTTP Header} List trailer
A list of header names which will appear after the message body, instead
of with the message headers.
@example
(parse-header 'trailer "ETag")
@result{} (etag)
@end example
@end deftypevr

@deftypevr {HTTP Header} List transfer-encoding
A list of transfer codings, expressed as key-value lists. The only
transfer coding defined by the specification is @code{chunked}.
@example
(parse-header 'transfer-encoding "chunked")
@result{} ((chunked))
@end example
@end deftypevr

@deftypevr {HTTP Header} List upgrade
A list of strings, indicating additional protocols that a server could use
in response to a request.
@example
(parse-header 'upgrade "WebSocket")
@result{} ("WebSocket")
@end example
@end deftypevr

FIXME: parse out more fully?
@deftypevr {HTTP Header} List via
A list of strings, indicating the protocol versions and hosts of
intermediate servers and proxies. There may be multiple @code{via}
headers in one message.
@example
(parse-header 'via "1.0 venus, 1.1 mars")
@result{} ("1.0 venus" "1.1 mars")
@end example
@end deftypevr

@deftypevr {HTTP Header} List warning
A list of warnings given by a server or intermediate proxy. Each
warning is itself a list of four elements: a code, as an exact integer
between 0 and 1000, a host as a string, the warning text as a string,
and either @code{#f} or a SRFI-19 date.

There may be multiple @code{warning} headers in one message.
@example
(parse-header 'warning "123 foo \"core breach imminent\"")
@result{} ((123 "foo" "core breach imminent" #f))
@end example
@end deftypevr


@subsubsection Entity Headers

Entity headers may be present in any HTTP message, and refer to the
resource referenced in the HTTP request or response.

@deftypevr {HTTP Header} List allow
A list of allowed methods on a given resource, as symbols.
@example
(parse-header 'allow "GET, HEAD")
@result{} (GET HEAD)
@end example
@end deftypevr

@deftypevr {HTTP Header} List content-encoding
A list of content codings, as symbols.
@example
(parse-header 'content-encoding "gzip")
@result{} (gzip)
@end example
@end deftypevr

@deftypevr {HTTP Header} List content-language
The languages that a resource is in, as strings.
@example
(parse-header 'content-language "en")
@result{} ("en")
@end example
@end deftypevr

@deftypevr {HTTP Header} UInt content-length
The number of bytes in a resource, as an exact, non-negative integer.
@example
(parse-header 'content-length "300")
@result{} 300
@end example
@end deftypevr

@deftypevr {HTTP Header} URI content-location
The canonical URI for a resource, in the case that it is also accessible
from a different URI.
@example
(parse-header 'content-location "http://example.com/foo")
@result{} #<<uri> ...>
@end example
@end deftypevr

@deftypevr {HTTP Header} String content-md5
The MD5 digest of a resource.
@example
(parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
@result{} "ffaea1a79810785575e29e2bd45e2fa5"
@end example
@end deftypevr

@deftypevr {HTTP Header} List content-range
A range specification, as a list of three elements: the symbol
@code{bytes}, either the symbol @code{*} or a pair of integers,
indicating the byte range, and either @code{*} or an integer, for the
instance length. Used to indicate that a response only includes part of
a resource.
@example
(parse-header 'content-range "bytes 10-20/*")
@result{} (bytes (10 . 20) *)
@end example
@end deftypevr

@deftypevr {HTTP Header} List content-type
The MIME type of a resource, as a symbol, along with any parameters.
@example
(parse-header 'content-type "text/plain")
@result{} (text/plain)
(parse-header 'content-type "text/plain;charset=utf-8")
@result{} (text/plain (charset . "utf-8"))
@end example
Note that the @code{charset} parameter is something of a misnomer, and
the HTTP specification admits this. It specifies the @emph{encoding} of
the characters, not the character set.
@end deftypevr

@deftypevr {HTTP Header} Date expires
The date/time after which the resource given in a response is considered
stale.
@example
(parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
@end deftypevr

@deftypevr {HTTP Header} Date last-modified
The date/time on which the resource given in a response was last
modified.
@example
(parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
@end deftypevr


749@subsubsection Request Headers
750
ff8339db 751Request headers may only appear in an HTTP request, not in a response.
1148d029 752
ff8339db
AW
753@deftypevr {HTTP Header} List accept
754A list of preferred media types for a response. Each element of the
755list is itself a list, in the same format as @code{content-type}.
756@example
757(parse-header 'accept "text/html,text/plain;charset=utf-8")
758@result{} ((text/html) (text/plain (charset . "utf-8")))
759@end example
ecb87335 760Preference is expressed with quality values:
761@example
762(parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
763@result{} ((text/html (q . 800)) (text/plain (q . 600)))
764@end example
765@end deftypevr
1148d029 766
767@deftypevr {HTTP Header} QList accept-charset
768A quality list of acceptable charsets. Note again that what HTTP calls
769a ``charset'' is what Guile calls a ``character encoding''.
770@example
771(parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
772@result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
773@end example
774@end deftypevr
1148d029 775
776@deftypevr {HTTP Header} QList accept-encoding
777A quality list of acceptable content codings.
@example
(parse-header 'accept-encoding "gzip,identity;q=0.8")
@result{} ((1000 . "gzip") (800 . "identity"))
@end example
782@end deftypevr
1148d029 783
784@deftypevr {HTTP Header} QList accept-language
785A quality list of acceptable languages.
@example
(parse-header 'accept-language "cn,en;q=0.75")
@result{} ((1000 . "cn") (750 . "en"))
@end example
790@end deftypevr
791
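Since a quality list is a list of pairs with the quality in the car, a
server can choose among the client's preferences with ordinary list
procedures.  As a minimal sketch, the hypothetical helper
@code{most-preferred} below (not part of Guile) returns the value with
the highest quality, preferring the earlier entry on a tie.

@example
(use-modules (srfi srfi-1))

;; Hypothetical helper: return the value with the highest quality in
;; QLIST, a list of (quality . value) pairs, or #f if QLIST is empty.
(define (most-preferred qlist)
  (and (pair? qlist)
       (cdr (reduce (lambda (entry best)
                      (if (> (car entry) (car best)) entry best))
                    #f
                    qlist))))

(most-preferred '((750 . "en") (1000 . "cn")))   ; "cn"
(most-preferred '())                             ; #f
@end example

The same helper applies to the parsed values of @code{accept-charset},
@code{accept-encoding} and @code{accept-language}.
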
792@deftypevr {HTTP Header} Pair authorization
793Authorization credentials. The car of the pair indicates the
794authentication scheme, like @code{basic}. For basic authentication, the
795cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
796string. For other authentication schemes, like @code{digest}, the cdr
797will be a key-value list of credentials.
@example
(parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
@result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
@end example
802@end deftypevr
1148d029 803
804@deftypevr {HTTP Header} List expect
805A list of expectations that a client has of a server. The expectations
806are key-value lists.
807@example
808(parse-header 'expect "100-continue")
809@result{} ((100-continue))
810@end example
811@end deftypevr
1148d029 812
813@deftypevr {HTTP Header} String from
814The email address of a user making an HTTP request.
815@example
816(parse-header 'from "bob@@example.com")
817@result{} "bob@@example.com"
818@end example
819@end deftypevr
1148d029 820
821@deftypevr {HTTP Header} Pair host
822The host for the resource being requested, as a hostname-port pair. If
823no port is given, the port is @code{#f}.
824@example
825(parse-header 'host "gnu.org:80")
826@result{} ("gnu.org" . 80)
827(parse-header 'host "gnu.org")
828@result{} ("gnu.org" . #f)
829@end example
830@end deftypevr
1148d029 831
832@deftypevr {HTTP Header} *|List if-match
833A set of etags, indicating that the request should proceed if and only
834if the etag of the resource is in that set. Either the symbol @code{*},
835indicating any etag, or a list of entity tags.
@example
(parse-header 'if-match "*")
@result{} *
(parse-header 'if-match "asdfadf")
@result{} (("asdfadf" . #t))
(parse-header 'if-match "W/\"asdfadf\"")
@result{} (("asdfadf" . #f))
@end example
844@end deftypevr
1148d029 845
846@deftypevr {HTTP Header} Date if-modified-since
847Indicates that a response should proceed if and only if the resource has
848been modified since the given date.
849@example
654ef4cf 850(parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
851@result{} #<date ...>
852@end example
853@end deftypevr
1148d029 854
855@deftypevr {HTTP Header} *|List if-none-match
856A set of etags, indicating that the request should proceed if and only
857if the etag of the resource is not in the set. Either the symbol
858@code{*}, indicating any etag, or a list of entity tags.
859@example
860(parse-header 'if-none-match "*")
861@result{} *
862@end example
863@end deftypevr
1148d029 864
865@deftypevr {HTTP Header} ETag|Date if-range
866Indicates that the range request should proceed if and only if the
867resource matches a modification date or an etag. Either an entity tag,
868or a SRFI-19 date.
869@example
870(parse-header 'if-range "\"original-etag\"")
871@result{} ("original-etag" . #t)
872@end example
873@end deftypevr
1148d029 874
875@deftypevr {HTTP Header} Date if-unmodified-since
876Indicates that a response should proceed if and only if the resource has
877not been modified since the given date.
@example
(parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
882@end deftypevr
1148d029 883
884@deftypevr {HTTP Header} UInt max-forwards
885The maximum number of proxy or gateway hops that a request should be
886subject to.
887@example
888(parse-header 'max-forwards "10")
889@result{} 10
890@end example
891@end deftypevr
1148d029 892
893@deftypevr {HTTP Header} Pair proxy-authorization
894Authorization credentials for a proxy connection. See the documentation
895for @code{authorization} above for more information on the format.
@example
(parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
@result{} (digest (foo . "bar") (baz . "qux"))
@end example
900@end deftypevr
901
902@deftypevr {HTTP Header} Pair range
903A range request, indicating that the client wants only part of a
904resource. The car of the pair is the symbol @code{bytes}, and the cdr
905is a list of pairs. Each element of the cdr indicates a range; the car
906is the first byte position and the cdr is the last byte position, as
907integers, or @code{#f} if not given.
908@example
909(parse-header 'range "bytes=10-30,50-")
910@result{} (bytes (10 . 30) (50 . #f))
911@end example
912@end deftypevr
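
To make the range representation concrete, here is a small sketch that
computes how many bytes each range selects from a resource of known
size.  The helper @code{range-byte-count} is hypothetical, and it
assumes that the first byte position is present and lies within the
resource.

@example
;; Hypothetical helper: number of bytes selected by RANGE, a
;; (first . last) pair, from a resource of TOTAL bytes.  A last
;; position of #f, or one past the end, means "up to the final byte".
(define (range-byte-count range total)
  (let* ((start (car range))
         (end   (min (or (cdr range) (- total 1))
                     (- total 1))))
    (+ 1 (- end start))))

;; The parsed value of "bytes=10-30,50-", for a 100-byte resource.
(map (lambda (range) (range-byte-count range 100))
     (cdr '(bytes (10 . 30) (50 . #f))))          ; (21 50)
@end example

Byte positions are inclusive, which is why the range from 10 to 30
covers 21 bytes rather than 20.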
1148d029 913
914@deftypevr {HTTP Header} URI referer
915The URI of the resource that referred the user to this resource. The
916name of the header is a misspelling, but we are stuck with it.
917@example
918(parse-header 'referer "http://www.gnu.org/")
919@result{} #<uri ...>
920@end example
921@end deftypevr
1148d029 922
923@deftypevr {HTTP Header} List te
924A list of transfer codings, expressed as key-value lists. A common
925transfer coding is @code{trailers}.
926@example
927(parse-header 'te "trailers")
928@result{} ((trailers))
929@end example
930@end deftypevr
1148d029 931
932@deftypevr {HTTP Header} String user-agent
933A string indicating the user agent making the request. The
934specification defines a structured format for this header, but it is
935widely disregarded, so Guile does not attempt to parse strictly.
936@example
937(parse-header 'user-agent "Mozilla/5.0")
938@result{} "Mozilla/5.0"
939@end example
940@end deftypevr
941
942
943@subsubsection Response Headers
944
945@deftypevr {HTTP Header} List accept-ranges
946A list of range units that the server supports, as symbols.
947@example
948(parse-header 'accept-ranges "bytes")
949@result{} (bytes)
950@end example
951@end deftypevr
1148d029 952
953@deftypevr {HTTP Header} UInt age
954The age of a cached response, in seconds.
955@example
956(parse-header 'age "3600")
957@result{} 3600
958@end example
959@end deftypevr
1148d029 960
961@deftypevr {HTTP Header} ETag etag
962The entity-tag of the resource.
963@example
964(parse-header 'etag "\"foo\"")
965@result{} ("foo" . #t)
966@end example
967@end deftypevr
1148d029 968
969@deftypevr {HTTP Header} URI location
970A URI on which a request may be completed. Used in combination with a
971redirecting status code to perform client-side redirection.
972@example
973(parse-header 'location "http://example.com/other")
974@result{} #<uri ...>
975@end example
976@end deftypevr
1148d029 977
978@deftypevr {HTTP Header} List proxy-authenticate
979A list of challenges to a proxy, indicating the need for authentication.
980@example
981(parse-header 'proxy-authenticate "Basic realm=\"foo\"")
982@result{} ((basic (realm . "foo")))
983@end example
984@end deftypevr
1148d029 985
986@deftypevr {HTTP Header} UInt|Date retry-after
987Used in combination with a server-busy status code, like 503, to
988indicate that a client should retry later. Either a number of seconds,
989or a date.
990@example
991(parse-header 'retry-after "60")
992@result{} 60
993@end example
994@end deftypevr
1148d029 995
996@deftypevr {HTTP Header} String server
997A string identifying the server.
998@example
999(parse-header 'server "My first web server")
1000@result{} "My first web server"
1001@end example
1002@end deftypevr
1148d029 1003
1004@deftypevr {HTTP Header} *|List vary
1005A set of request headers that were used in computing this response.
ecb87335 1006Used to indicate that server-side content negotiation was performed, for
1007example in response to the @code{accept-language} header. Can also be
1008the symbol @code{*}, indicating that all headers were considered.
1009@example
1010(parse-header 'vary "Accept-Language, Accept")
1011@result{} (accept-language accept)
1012@end example
1013@end deftypevr
1148d029 1014
1015@deftypevr {HTTP Header} List www-authenticate
1016A list of challenges to a user, indicating the need for authentication.
1017@example
1018(parse-header 'www-authenticate "Basic realm=\"foo\"")
1019@result{} ((basic (realm . "foo")))
1020@end example
1021@end deftypevr
1022
1023
1024@node Requests
1025@subsection HTTP Requests
1026
1027@example
1028(use-modules (web request))
1029@end example
1030
de54fb6d 1031The request module contains a data type for HTTP requests.
8db7e094 1032
1033@subsubsection An Important Note on Character Sets
1034
1035HTTP requests consist of two parts: the request proper, consisting of a
1036request line and a set of headers, and (optionally) a body. The body
1037might have a binary content-type, and even in the textual case its
1038length is specified in bytes, not characters.
1039
Therefore, HTTP is a fundamentally binary protocol.  However, the request
line and headers are specified to be in a subset of ASCII, so they can
be treated as text, provided that the port's encoding is set to an
ASCII-compatible one-byte-per-character encoding.  ISO-8859-1 (latin-1)
is just such an encoding, and happens to be very efficient for Guile.

So what Guile does when reading requests from the wire, or writing them
out, is to set the port's encoding to latin-1 and treat the request
headers as text.
1049
1050The request body is another issue. For binary data, the data is
1051probably in a bytevector, so we use the R6RS binary output procedures to
1052write out the binary payload. Textual data usually has to be written
1053out to some character encoding, usually UTF-8, and then the resulting
1054bytevector is written out to the port.
1055
1056In summary, Guile reads and writes HTTP over latin-1 sockets, without
1057any loss of generality.
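
To make the distinction between bytes and characters concrete, here is a
small sketch using @code{string->utf8} from @code{(rnrs bytevectors)}.
The variable names are only illustrative.

@example
(use-modules (rnrs bytevectors))

;; "cafe" with an acute accent on the final letter: four characters,
;; the last of which lies outside ASCII.
(define text (string-append "caf" (string (integer->char 233))))

;; The textual body, encoded for the wire.
(define payload (string->utf8 text))

(string-length text)          ; 4, counted in characters
(bytevector-length payload)   ; 5, counted in bytes
@end example

A @code{content-length} header for such a body would have to carry the
length of the bytevector, not that of the string.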
1058
1059@subsubsection Request API
8db7e094 1060
1061@deffn {Scheme Procedure} request?
1062@deffnx {Scheme Procedure} request-method
1063@deffnx {Scheme Procedure} request-uri
1064@deffnx {Scheme Procedure} request-version
1065@deffnx {Scheme Procedure} request-headers
1066@deffnx {Scheme Procedure} request-meta
1067@deffnx {Scheme Procedure} request-port
1068A predicate and field accessors for the request type. The fields are as
1069follows:
1070@table @code
1071@item method
1072The HTTP method, for example, @code{GET}.
1073@item uri
1074The URI as a URI record.
1075@item version
1076The HTTP version pair, like @code{(1 . 1)}.
1077@item headers
1078The request headers, as an alist of parsed values.
1079@item meta
1080An arbitrary alist of other data, for example information returned in
1081the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1082Communication}).
1083@item port
1084The port on which to read or write a request body, if any.
1085@end table
2e6f5ea4 1086@end deffn
8db7e094 1087
2e6f5ea4 1088@deffn {Scheme Procedure} read-request port [meta='()]
1089Read an HTTP request from @var{port}, optionally attaching the given
1090metadata, @var{meta}.
1091
1092As a side effect, sets the encoding on @var{port} to ISO-8859-1
1093(latin-1), so that reading one character reads one byte. See the
1094discussion of character sets above, for more information.
1095
1096Note that the body is not part of the request. Once you have read a
1097request, you may read the body separately, and likewise for writing
1098requests.
2e6f5ea4 1099@end deffn
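
As an illustration, a request can be read from a string port just as
from a socket, which is convenient for experimenting at the REPL.  This
is a sketch only; the request text and host name are made up.

@example
(use-modules (web request) (web uri))

;; Parse a minimal HTTP/1.1 request held in a string.
(define req
  (call-with-input-string
      "GET /index.html HTTP/1.1\r\nHost: example.com\r\n\r\n"
    read-request))

(request-method req)            ; the symbol GET
(request-version req)           ; (1 . 1)
(request-host req)              ; ("example.com" . #f)
(uri-path (request-uri req))    ; "/index.html"
@end example

Any body would still be waiting on the port after @code{read-request}
returns.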
de54fb6d 1100
2e6f5ea4 1101@deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1102Construct an HTTP request object. If @var{validate-headers?} is true,
1103the headers are each run through their respective validators.
2e6f5ea4 1104@end deffn
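
The following sketch builds a request by hand and shows header
validation at work.  The URI and header values are illustrative.  Note
that header values are given in parsed form (@pxref{HTTP Headers}), not
as strings from the wire.

@example
(use-modules (web request) (web uri))

(define uri (string->uri "http://www.gnu.org/software/guile/"))

(define req
  (build-request uri #:headers '((accept . ((text/html))))))

(request-method req)              ; GET, the default method
(uri-path (request-uri req))      ; "/software/guile/"
(request-accept req)              ; ((text/html))

;; A content-length must be an integer, so validation rejects a
;; string; the resulting exception is turned into #f here.
(false-if-exception
 (build-request uri #:headers '((content-length . "12"))))
@end example
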
8db7e094 1105
2e6f5ea4 1106@deffn {Scheme Procedure} write-request r port
1107Write the given HTTP request to @var{port}.
1108
de54fb6d 1109Return a new request, whose @code{request-port} will continue writing
8db7e094 1110on @var{port}, perhaps using some transfer encoding.
2e6f5ea4 1111@end deffn
8db7e094 1112
2e6f5ea4 1113@deffn {Scheme Procedure} read-request-body r
Read the request body from @var{r}, as a bytevector.  Return @code{#f}
if there was no request body.
2e6f5ea4 1116@end deffn
8db7e094 1117
2e6f5ea4 1118@deffn {Scheme Procedure} write-request-body r bv
64de6db5 1119Write @var{bv}, a bytevector, to the port corresponding to the HTTP
8db7e094 1120request @var{r}.
2e6f5ea4 1121@end deffn
8db7e094 1122
1123The various headers that are typically associated with HTTP requests may
1124be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1125more information on the format of parsed headers.
1126
1127@deffn {Scheme Procedure} request-accept request [default='()]
1128@deffnx {Scheme Procedure} request-accept-charset request [default='()]
1129@deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1130@deffnx {Scheme Procedure} request-accept-language request [default='()]
1131@deffnx {Scheme Procedure} request-allow request [default='()]
1132@deffnx {Scheme Procedure} request-authorization request [default=#f]
1133@deffnx {Scheme Procedure} request-cache-control request [default='()]
1134@deffnx {Scheme Procedure} request-connection request [default='()]
1135@deffnx {Scheme Procedure} request-content-encoding request [default='()]
1136@deffnx {Scheme Procedure} request-content-language request [default='()]
1137@deffnx {Scheme Procedure} request-content-length request [default=#f]
1138@deffnx {Scheme Procedure} request-content-location request [default=#f]
1139@deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1140@deffnx {Scheme Procedure} request-content-range request [default=#f]
1141@deffnx {Scheme Procedure} request-content-type request [default=#f]
1142@deffnx {Scheme Procedure} request-date request [default=#f]
1143@deffnx {Scheme Procedure} request-expect request [default='()]
1144@deffnx {Scheme Procedure} request-expires request [default=#f]
1145@deffnx {Scheme Procedure} request-from request [default=#f]
1146@deffnx {Scheme Procedure} request-host request [default=#f]
1147@deffnx {Scheme Procedure} request-if-match request [default=#f]
1148@deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1149@deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1150@deffnx {Scheme Procedure} request-if-range request [default=#f]
1151@deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1152@deffnx {Scheme Procedure} request-last-modified request [default=#f]
1153@deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1154@deffnx {Scheme Procedure} request-pragma request [default='()]
1155@deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1156@deffnx {Scheme Procedure} request-range request [default=#f]
1157@deffnx {Scheme Procedure} request-referer request [default=#f]
1158@deffnx {Scheme Procedure} request-te request [default=#f]
1159@deffnx {Scheme Procedure} request-trailer request [default='()]
1160@deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1161@deffnx {Scheme Procedure} request-upgrade request [default='()]
1162@deffnx {Scheme Procedure} request-user-agent request [default=#f]
1163@deffnx {Scheme Procedure} request-via request [default='()]
1164@deffnx {Scheme Procedure} request-warning request [default='()]
e471a3ee 1165Return the given request header, or @var{default} if none was present.
2e6f5ea4 1166@end deffn
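
For instance, the sketch below shows how the @var{default} argument
behaves for headers that are absent from a request.  The request itself
is made up for the example.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://example.com/")
                 #:headers '((user-agent . "Example/1.0"))))

(request-user-agent req)          ; "Example/1.0"
(request-referer req)             ; #f, the documented default
(request-accept req)              ; (), the documented default
(request-content-length req 0)    ; 0, the caller's own default
@end example
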
8db7e094 1167
2e6f5ea4 1168@deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1169A helper routine to determine the absolute URI of a request, using the
1170@code{host} header and the default host and port.
2e6f5ea4 1171@end deffn
1172
1173
1174@node Responses
1175@subsection HTTP Responses
1176
1177@example
1178(use-modules (web response))
1179@end example
1180
As with requests (@pxref{Requests}), Guile offers a data type for HTTP
responses.  Again, the body is represented separately from the response.
8db7e094 1183
1184@deffn {Scheme Procedure} response?
1185@deffnx {Scheme Procedure} response-version
1186@deffnx {Scheme Procedure} response-code
1187@deffnx {Scheme Procedure} response-reason-phrase response
1188@deffnx {Scheme Procedure} response-headers
1189@deffnx {Scheme Procedure} response-port
1190A predicate and field accessors for the response type. The fields are as
1191follows:
1192@table @code
1193@item version
1194The HTTP version pair, like @code{(1 . 1)}.
1195@item code
1196The HTTP response code, like @code{200}.
1197@item reason-phrase
1198The reason phrase, or the standard reason phrase for the response's
1199code.
1200@item headers
1201The response headers, as an alist of parsed values.
1202@item port
1203The port on which to read or write a response body, if any.
1204@end table
2e6f5ea4 1205@end deffn
8db7e094 1206
2e6f5ea4 1207@deffn {Scheme Procedure} read-response port
de54fb6d 1208Read an HTTP response from @var{port}.
1209
As a side effect, sets the encoding on @var{port} to ISO-8859-1
(latin-1), so that reading one character reads one byte.  See the
discussion of character sets in @ref{Requests}, for more information.
2e6f5ea4 1213@end deffn
8db7e094 1214
64de6db5 1215@deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1216Construct an HTTP response object. If @var{validate-headers?} is true,
1217the headers are each run through their respective validators.
2e6f5ea4 1218@end deffn
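
As a brief sketch, the following builds a response without a reason
phrase and reads its fields back.  The status code and header are chosen
only for illustration.

@example
(use-modules (web response))

(define resp
  (build-response
   #:code 404
   #:headers '((content-type . (text/plain (charset . "utf-8"))))))

(response-code resp)            ; 404
(response-reason-phrase resp)   ; "Not Found", the standard phrase
(response-content-type resp)    ; (text/plain (charset . "utf-8"))
@end example
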
8db7e094 1219
2e6f5ea4 1220@deffn {Scheme Procedure} adapt-response-version response version
de54fb6d 1221Adapt the given response to a different HTTP version. Return a new HTTP
1222response.
1223
1224The idea is that many applications might just build a response for the
1225default HTTP version, and this method could handle a number of
1226programmatic transformations to respond to older HTTP versions (0.9 and
12271.0). But currently this function is a bit heavy-handed, just updating
1228the version field.
2e6f5ea4 1229@end deffn
8db7e094 1230
2e6f5ea4 1231@deffn {Scheme Procedure} write-response r port
1232Write the given HTTP response to @var{port}.
1233
de54fb6d 1234Return a new response, whose @code{response-port} will continue writing
8db7e094 1235on @var{port}, perhaps using some transfer encoding.
2e6f5ea4 1236@end deffn
8db7e094 1237
1238@deffn {Scheme Procedure} response-must-not-include-body? r
1239Some responses, like those with status code 304, are specified as never
1240having bodies. This predicate returns @code{#t} for those responses.
1241
1242Note also, though, that responses to @code{HEAD} requests must also not
1243have a body.
1244@end deffn
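
Since the predicate looks only at the response, a server that wants to
cover the @code{HEAD} case as well has to consult the request too.  One
way to combine the two checks is sketched below; @code{body-allowed?} is
a hypothetical helper, not part of Guile.

@example
(use-modules (web request) (web response) (web uri))

;; Hypothetical helper: may a body accompany RESPONSE to REQUEST?
(define (body-allowed? request response)
  (not (or (eq? (request-method request) 'HEAD)
           (response-must-not-include-body? response))))

(define uri (string->uri "http://example.com/"))

(body-allowed? (build-request uri)
               (build-response #:code 200))    ; #t
(body-allowed? (build-request uri)
               (build-response #:code 304))    ; #f
(body-allowed? (build-request uri #:method 'HEAD)
               (build-response #:code 200))    ; #f
@end example
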
1245
2e6f5ea4 1246@deffn {Scheme Procedure} read-response-body r
Read the response body from @var{r}, as a bytevector.  Return @code{#f}
if there was no response body.
2e6f5ea4 1249@end deffn
8db7e094 1250
2e6f5ea4 1251@deffn {Scheme Procedure} write-response-body r bv
64de6db5 1252Write @var{bv}, a bytevector, to the port corresponding to the HTTP
8db7e094 1253response @var{r}.
2e6f5ea4 1254@end deffn
8db7e094 1255
1256As with requests, the various headers that are typically associated with
1257HTTP responses may be accessed with these dedicated accessors.
1258@xref{HTTP Headers}, for more information on the format of parsed
1259headers.
1260
1261@deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1262@deffnx {Scheme Procedure} response-age response [default='()]
1263@deffnx {Scheme Procedure} response-allow response [default='()]
1264@deffnx {Scheme Procedure} response-cache-control response [default='()]
1265@deffnx {Scheme Procedure} response-connection response [default='()]
1266@deffnx {Scheme Procedure} response-content-encoding response [default='()]
1267@deffnx {Scheme Procedure} response-content-language response [default='()]
1268@deffnx {Scheme Procedure} response-content-length response [default=#f]
1269@deffnx {Scheme Procedure} response-content-location response [default=#f]
1270@deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1271@deffnx {Scheme Procedure} response-content-range response [default=#f]
1272@deffnx {Scheme Procedure} response-content-type response [default=#f]
1273@deffnx {Scheme Procedure} response-date response [default=#f]
1274@deffnx {Scheme Procedure} response-etag response [default=#f]
1275@deffnx {Scheme Procedure} response-expires response [default=#f]
1276@deffnx {Scheme Procedure} response-last-modified response [default=#f]
1277@deffnx {Scheme Procedure} response-location response [default=#f]
1278@deffnx {Scheme Procedure} response-pragma response [default='()]
1279@deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1280@deffnx {Scheme Procedure} response-retry-after response [default=#f]
1281@deffnx {Scheme Procedure} response-server response [default=#f]
1282@deffnx {Scheme Procedure} response-trailer response [default='()]
1283@deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1284@deffnx {Scheme Procedure} response-upgrade response [default='()]
1285@deffnx {Scheme Procedure} response-vary response [default='()]
1286@deffnx {Scheme Procedure} response-via response [default='()]
1287@deffnx {Scheme Procedure} response-warning response [default='()]
1288@deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
de54fb6d 1289Return the given response header, or @var{default} if none was present.
2e6f5ea4 1290@end deffn
1291
1292
1293@node Web Client
1294@subsection Web Client
1295
1296@code{(web client)} provides a simple, synchronous HTTP client, built on
1297the lower-level HTTP, request, and response modules.
1298
@deffn {Scheme Procedure} open-socket-for-uri uri
Return an open input/output port for a connection to @var{uri}.
@end deffn
1301
64de6db5 1302@deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1303Connect to the server corresponding to @var{uri} and ask for the
1304resource, using the @code{GET} method. If you already have a port open,
1305pass it as @var{port}. The port will be closed at the end of the
1306request unless @var{keep-alive?} is true. Any extra headers in the
1307alist @var{extra-headers} will be added to the request.
1308
If @var{decode-body?} is true, as is the default, the body of the
response will be decoded to a string if it has a textual content-type.
Otherwise it will be returned as a bytevector.
1312@end deffn
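
A one-off fetch might look like the sketch below.  It assumes that
@code{http-get} returns two values, the response and the body, which is
not spelled out above; @code{fetch} is a hypothetical wrapper, and the
URL in the comment is only an example.

@example
(use-modules (web client) (web response) (web uri))

;; Hypothetical helper: fetch URL-STRING and return two values, the
;; response code and the body.
(define (fetch url-string)
  (call-with-values
      (lambda () (http-get (string->uri url-string)))
    (lambda (response body)
      (values (response-code response) body))))

;; Example call; it needs network access, so it is left commented out.
;; (fetch "http://www.gnu.org/")
@end example
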
1313
1314@code{http-get} is useful for making one-off requests to web sites. If
1315you are writing a web spider or some other client that needs to handle a
1316number of requests in parallel, it's better to build an event-driven URL
1317fetcher, similar in structure to the web server (@pxref{Web Server}).
1318
Another option, good but not as performant, would be to use threads,
possibly via @code{par-map} or futures.
1321
1322More helper procedures for the other common HTTP verbs would be a good
1323addition to this module. Send your code to
1324@email{guile-user@@gnu.org}.
1325
1326
1327@node Web Server
1328@subsection Web Server
1329
1330@code{(web server)} is a generic web server interface, along with a main
1331loop implementation for web servers controlled by Guile.
1332
1333@example
1334(use-modules (web server))
1335@end example
1336
1337The lowest layer is the @code{<server-impl>} object, which defines a set
1338of hooks to open a server, read a request from a client, write a
1339response to a client, and close a server. These hooks -- @code{open},
1340@code{read}, @code{write}, and @code{close}, respectively -- are bound
1341together in a @code{<server-impl>} object. Procedures in this module take a
1342@code{<server-impl>} object, if needed.
1343
1344A @code{<server-impl>} may also be looked up by name. If you pass the
1345@code{http} symbol to @code{run-server}, Guile looks for a variable
1346named @code{http} in the @code{(web server http)} module, which should
1347be bound to a @code{<server-impl>} object. Such a binding is made by
1348instantiation of the @code{define-server-impl} syntax. In this way the
1349run-server loop can automatically load other backends if available.
1350
1351The life cycle of a server goes as follows:
1352
1353@enumerate
1354@item
1355The @code{open} hook is called, to open the server. @code{open} takes 0 or
1356more arguments, depending on the backend, and returns an opaque
1357server socket object, or signals an error.
1358
1359@item
1360The @code{read} hook is called, to read a request from a new client.
1361The @code{read} hook takes one argument, the server socket. It should
1362return three values: an opaque client socket, the request, and the
1363request body. The request should be a @code{<request>} object, from
1364@code{(web request)}. The body should be a string or a bytevector, or
1365@code{#f} if there is no body.
1366
If the read failed, the @code{read} hook may return @code{#f} for the
client socket, request, and body.
1369
1370@item
A user-provided handler procedure is called, with the request and body
as its arguments.  The handler should return two values: the response,
as a @code{<response>} record from @code{(web response)}, and the
response body as a bytevector, or @code{#f} if not present.

The response and response body are run through @code{sanitize-response},
documented below.  This allows the handler writer to take some
convenient shortcuts: for example, instead of a @code{<response>}, the
handler can simply return an alist of headers, in which case a default
response object is constructed with those headers.  Instead of a
bytevector for the body, the handler can return a string, which will be
serialized into an appropriate encoding; or it can return a procedure,
which will be called on a port to write out the data.  See the
@code{sanitize-response} documentation, for more.
1385
1386@item
1387The @code{write} hook is called with three arguments: the client
1388socket, the response, and the body. The @code{write} hook returns no
1389values.
1390
1391@item
1392At this point the request handling is complete. For a loop, we
1393loop back and try to read a new request.
1394
1395@item
1396If the user interrupts the loop, the @code{close} hook is called on
1397the server socket.
1398@end enumerate
1399
1400A user may define a server implementation with the following form:
1401
@deffn {Scheme Syntax} define-server-impl name open read write close
1403Make a @code{<server-impl>} object with the hooks @var{open},
1404@var{read}, @var{write}, and @var{close}, and bind it to the symbol
1405@var{name} in the current module.
2e6f5ea4 1406@end deffn
8db7e094 1407
2e6f5ea4 1408@deffn {Scheme Procedure} lookup-server-impl impl
1409Look up a server implementation. If @var{impl} is a server
1410implementation already, it is returned directly. If it is a symbol, the
1411binding named @var{impl} in the @code{(web server @var{impl})} module is
1412looked up. Otherwise an error is signaled.
1413
1414Currently a server implementation is a somewhat opaque type, useful only
1415for passing to other procedures in this module, like @code{read-client}.
2e6f5ea4 1416@end deffn
8db7e094 1417
1418The @code{(web server)} module defines a number of routines that use
1419@code{<server-impl>} objects to implement parts of a web server. Given
1420that we don't expose the accessors for the various fields of a
1421@code{<server-impl>}, indeed these routines are the only procedures with
1422any access to the impl objects.
1423
2e6f5ea4 1424@deffn {Scheme Procedure} open-server impl open-params
f4ec6877 1425Open a server for the given implementation. Return one value, the new
1426server object. The implementation's @code{open} procedure is applied to
1427@var{open-params}, which should be a list.
2e6f5ea4 1428@end deffn
8db7e094 1429
2e6f5ea4 1430@deffn {Scheme Procedure} read-client impl server
8db7e094 1431Read a new client from @var{server}, by applying the implementation's
f4ec6877 1432@code{read} procedure to the server. If successful, return three
8db7e094 1433values: an object corresponding to the client, a request object, and the
f4ec6877 1434request body. If any exception occurs, return @code{#f} for all three
8db7e094 1435values.
2e6f5ea4 1436@end deffn
8db7e094 1437
2e6f5ea4 1438@deffn {Scheme Procedure} handle-request handler request body state
1439Handle a given request, returning the response and body.
1440
1441The response and response body are produced by calling the given
1442@var{handler} with @var{request} and @var{body} as arguments.
1443
1444The elements of @var{state} are also passed to @var{handler} as
1445arguments, and may be returned as additional values. The new
1446@var{state}, collected from the @var{handler}'s return values, is then
1447returned as a list. The idea is that a server loop receives a handler
1448from the user, along with whatever state values the user is interested
1449in, allowing the user's handler to explicitly manage its state.
2e6f5ea4 1450@end deffn
8db7e094 1451
2e6f5ea4 1452@deffn {Scheme Procedure} sanitize-response request response body
``Sanitize'' the given response and body, making them appropriate for the
given request.
1455
1456As a convenience to web handler authors, @var{response} may be given as
1457an alist of headers, in which case it is used to construct a default
1458response. Ensures that the response version corresponds to the request
1459version. If @var{body} is a string, encodes the string to a bytevector,
1460in an encoding appropriate for @var{response}. Adds a
1461@code{content-length} and @code{content-type} header, as necessary.
1462
If @var{body} is a procedure, it is called with a port as an argument,
and the output collected as a bytevector.  In the future we might try to
instead use a compressing, chunk-encoded port, and call this procedure
later, in the @code{write-client} procedure.  Authors are advised not to
rely on the procedure being called at any particular time.
2e6f5ea4 1468@end deffn
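
Although @code{sanitize-response} is normally called for you by the
server loop, it can be tried directly.  The sketch below passes an alist
of headers and a string body, the two shortcuts described above; the
request is made up for the example, and the exact headers added may
differ between Guile versions.

@example
(use-modules (web server) (web request) (web response) (web uri)
             (rnrs bytevectors))

(define req (build-request (string->uri "http://example.com/")))

;; Returns a list describing the sanitized response and body.
(define (sanitized-summary)
  (call-with-values
      (lambda ()
        (sanitize-response req
                           '((content-type . (text/plain)))
                           "hello"))
    (lambda (response body)
      (list (response? response)
            (bytevector? body)
            (response-content-length response)))))

;; We would expect a response record, a bytevector body, and a
;; content-length of 5 for the five ASCII characters of "hello".
(sanitized-summary)
@end example
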
8db7e094 1469
2e6f5ea4 1470@deffn {Scheme Procedure} write-client impl server client response body
1471Write an HTTP response and body to @var{client}. If the server and
1472client support persistent connections, it is the implementation's
1473responsibility to keep track of the client thereafter, presumably by
1474attaching it to the @var{server} argument somehow.
2e6f5ea4 1475@end deffn
8db7e094 1476
2e6f5ea4 1477@deffn {Scheme Procedure} close-server impl server
8db7e094
AW
1478Release resources allocated by a previous invocation of
1479@code{open-server}.
2e6f5ea4 1480@end deffn

Given the procedures above, it is a small matter to make a web server:

@deffn {Scheme Procedure} serve-one-client handler impl server state
Read one request from @var{server}, call @var{handler} on the request
and body, and write the response to the client. Return the new state
produced by the handler procedure.
@end deffn
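
A server loop built from these pieces might look roughly like the
following sketch. The helper name @code{serve-n-times} is made up for
this example. It also assumes that @code{lookup-server-impl}, which is
not described in this passage, maps an implementation name such as
@code{'http} to an implementation object, and that @code{open-server}
takes that object together with a list of open parameters.

@example
(use-modules (web server))

;; Hypothetical helper: call `serve-one-client' N times, threading
;; the handler state through, then release the server's resources.
(define (serve-n-times handler n)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '())))
    (let lp ((remaining n) (state '()))
      (if (zero? remaining)
          (close-server impl server)
          (lp (- remaining 1)
              (serve-one-client handler impl server state))))))
@end example

Unlike this sketch, @code{run-server} loops indefinitely.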

@deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
Run Guile's built-in web server.

@var{handler} should be a procedure that takes two or more arguments,
the HTTP request and request body, and returns two or more values, the
response and response body.

For examples, skip ahead to the next section, @ref{Web Examples}.

The response and body will be run through @code{sanitize-response}
before being sent back to the client.

Additional arguments to @var{handler} are taken from @var{state}.
Additional return values are accumulated into a new @var{state}, which
will be used for subsequent requests. In this way a handler can
explicitly manage its state.
@end deffn
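
As a minimal sketch of explicit state management, here is a handler that
counts the requests it has served. The names @code{counting-handler}
and @code{start} are invented for this example.

@example
(use-modules (web server))

;; Hypothetical handler: takes one state value, the number of
;; requests served so far, and returns the incremented count as a
;; third value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Requests so far: "
                         (number->string count)
                         "\n")
          (+ count 1)))

;; Call (start) to serve on the default localhost:8080.  The
;; trailing 0 is the initial state.
(define (start)
  (run-server counting-handler 'http '() 0))
@end example

Note that a browser may issue extra requests of its own, for example for
a favicon, so the count can advance by more than one per page load.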

The default web server implementation is @code{http}, which binds to a
socket, listening for requests on that port.

@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket]
The default HTTP implementation. We document it as a function with
keyword arguments, because that is precisely what it is: all of the
@var{open-params} to @code{run-server} get passed to the
implementation's open function.

@example
;; The defaults: localhost:8080
(run-server handler)
;; Same thing
(run-server handler 'http '())
;; On a different port
(run-server handler 'http '(#:port 8081))
;; IPv6
(run-server handler 'http '(#:family AF_INET6 #:port 8081))
;; Custom socket
(run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
@end example
@end deffn
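
The custom socket case could be filled in along the following lines.
This is only a sketch: @code{make-loopback-socket}, @code{tiny-handler}
and @code{start} are hypothetical names, port 8082 is an arbitrary
choice, and the code assumes that the @code{http} implementation calls
@code{listen} on the socket it is given, so that the socket only needs
to be created and bound.

@example
(use-modules (web server))

;; Hypothetical helper: a TCP socket bound to the loopback
;; interface.  It is not put into listening state here, on the
;; assumption that the server implementation does that itself.
(define (make-loopback-socket port)
  (let ((sock (socket PF_INET SOCK_STREAM 0)))
    (setsockopt sock SOL_SOCKET SO_REUSEADDR 1)
    (bind sock AF_INET INADDR_LOOPBACK port)
    sock))

(define (tiny-handler request body)
  (values '((content-type . (text/plain))) "hi\n"))

;; Call (start) to serve on localhost:8082 using our own socket.
(define (start)
  (run-server tiny-handler 'http
              `(#:socket ,(make-loopback-socket 8082))))
@end example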

@node Web Examples
@subsection Web Examples

Well, enough about the tedious internals. Let's make a web application!

@subsubsection Hello, World!

The first program we have to write, of course, is ``Hello, World!''.
This means that we have to implement a web handler that does what we
want.

In its simplest form, a handler is a procedure of two arguments that
returns two values:

@example
(define (handler request request-body)
  (values @var{response} @var{response-body}))
@end example

In this first example, we take advantage of a short-cut, returning an
alist of headers instead of a proper response object. The response body
is our payload:

@example
(define (hello-world-handler request request-body)
  (values '((content-type . (text/plain)))
          "Hello World!"))
@end example
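
For comparison, the same handler can be sketched without the short-cut,
by building the response object explicitly with @code{build-response}.
The name @code{hello-world-handler/explicit} is invented here, and the
explicit status code 200 is written out only for clarity.

@example
(use-modules (web response))

;; Same payload as above, but with an explicit response object
;; instead of a bare alist of headers.
(define (hello-world-handler/explicit request request-body)
  (values (build-response
           #:code 200
           #:headers '((content-type . (text/plain))))
          "Hello World!"))
@end example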

Now let's test it. Load up the web server module if you haven't yet
done so, and run a server with this handler:

@example
(use-modules (web server))
(run-server hello-world-handler)
@end example

By default, the web server listens for requests on
@code{localhost:8080}. Visit that address in your web browser to
test. If you see the string, @code{Hello World!}, sweet!

@subsubsection Inspecting the Request

The Hello World program above is a general greeter, responding to all
URIs. To make a more exclusive greeter, we need to inspect the request
object, and conditionally produce different results. So let's load up
the request, response, and URI modules, and do just that.

@example
(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . (text/plain)))
              "Hello hacker!")
      (not-found request)))

(run-server hello-hacker-handler)
@end example

Here we see that we have defined a helper to return the components of
the URI path as a list of strings, and used that to check for a request
to @code{/hacker/}. Then the success case is just as before -- visit
@code{http://localhost:8080/hacker/} in your browser to check.

You should always match against URI path components as decoded by
@code{split-and-decode-uri-path}. The above example will work for
@code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
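
A quick way to convince yourself of this is to run the three paths
through @code{split-and-decode-uri-path} at the REPL:

@example
(use-modules (web uri))

;; Empty components are dropped and percent-escapes are decoded, so
;; all three spellings yield the same list of components.
(for-each (lambda (path)
            (write (split-and-decode-uri-path path))
            (newline))
          '("/hacker/" "//hacker///" "/h%61ck%65r"))
;; Each line of output is the list ("hacker").
@end example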

But we forgot to define @code{not-found}! If you are pasting these
examples into a REPL, accessing any other URI in your web browser will
drop your Guile console into the debugger:

@example
<unnamed port>:38:7: In procedure module-lookup:
<unnamed port>:38:7: Unbound variable: not-found

Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@@(guile-user) [1]>
@end example

So let's define the function, right there in the debugger. As you
probably know, we'll want to return a 404 response.

@example
;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))

;; Now paste this to let the web server keep going:
,continue
@end example

Now if you access @code{http://localhost:8080/foo/}, you get this error
message. (Note that some popular web browsers won't show
server-generated 404 messages, showing their own instead, unless the 404
message body is long enough.)

@subsubsection Higher-Level Interfaces

The web handler interface is a common baseline that all kinds of Guile
web applications can use. You will usually want to build something on
top of it, however, especially when producing HTML. Here is a simple
example that builds up HTML output using SXML (@pxref{sxml simple}).

First, load up the modules:

@example
(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))
@end example

Now we define a simple templating function that takes a list of HTML
body elements, as SXML, and puts them in our super template:

@example
(define (templatize title body)
  `(html (head (title ,title))
         (body ,@@body)))
@end example

For example, the simplest Hello HTML can be produced like this:

@example
(sxml->xml (templatize "Hello!" '((b "Hi!"))))
@print{}
<html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
@end example

Much better to work with Scheme data types than to work with HTML as
strings. Now we define a little response helper:

@example
(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '((charset . "utf-8")))
                  (content-type 'text/html)
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@@content-type-params))
                       ,@@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))
@end example

Here we see the power of keyword arguments with default initializers. By
the time the arguments are fully parsed, the @code{sxml} local variable
will hold the templated SXML, ready for sending out to the client.

Also, instead of returning the body as a string, @code{respond} gives a
procedure, which will be called by the web server to write out the
response to the client.
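
The procedure-as-body convention does not depend on @code{respond}. As
a minimal sketch, a handler can return such a procedure directly. The
names @code{procedure-body-handler} and @code{start} are invented for
this example.

@example
(use-modules (web server))

;; The body is a procedure of one argument, a port.  Whatever it
;; writes to the port becomes the response body.
(define (procedure-body-handler request body)
  (values '((content-type . (text/plain)))
          (lambda (port)
            (display "Hello from a procedure body!\n" port))))

;; Call (start) to serve on the default localhost:8080.
(define (start)
  (run-server procedure-body-handler))
@end example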

Now for a simple example using this responder, which lays out the
incoming headers in an HTML table:

@example
(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))

(run-server debug-page)
@end example

Now if you visit any path on @code{localhost:8080} in your web browser,
you will finally see some HTML.

@subsubsection Conclusion

Well, this is about as far as Guile's built-in web support goes, for
now. There are many ways to make a web application, but hopefully by
standardizing the most fundamental data types, users will be able to
choose the approach that suits them best, while also being able to
switch between implementations of the server. This is a relatively new
part of Guile, so if you have feedback, let us know, and we can take it
into account. Happy hacking on the web!

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: