1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
5
6 @node Web
7 @section @acronym{HTTP}, the Web, and All That
8 @cindex Web
9 @cindex WWW
10 @cindex HTTP
11
12 It has always been possible to connect computers together and share
13 information between them, but the rise of the World-Wide Web over the
14 last couple of decades has made it much easier to do so. The result is
15 a richly connected network of computation, in which Guile forms a part.
16
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
20 message components that can be sent and received by that protocol,
21 notably HTML.
22
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
29
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
32 highest-level perspective, @pxref{Web Examples}, and work your way
33 back.
34
35 @menu
36 * Types and the Web:: Types prevent bugs and security problems.
37 * URIs:: Uniform Resource Identifiers.
38 * HTTP:: The Hypertext Transfer Protocol.
39 * HTTP Headers:: How Guile represents specific header values.
40 * Transfer Codings:: HTTP Transfer Codings.
41 * Requests:: HTTP requests.
42 * Responses:: HTTP responses.
43 * Web Client:: Accessing web resources over HTTP.
44 * Web Server:: Serving HTTP to the internet.
45 * Web Examples:: How to use this thing.
46 @end menu
47
48 @node Types and the Web
49 @subsection Types and the Web
50
51 It is a truth universally acknowledged, that a program with good use of
52 data types, will be free from many common bugs. Unfortunately, the
53 common practice in web programming seems to ignore this maxim. This
54 subsection makes the case for expressive data types in web programming.
55
56 By ``expressive data types'', we mean that the data types @emph{say}
57 something about how a program solves a problem. For example, if we
58 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
59 this indicates that there is a part of the program that will always have
60 valid dates. Error handling for a number of basic cases, like invalid
61 dates, occurs on the boundary in which we produce a SRFI 19 date record
62 from other types, like strings.
63
64 With regard to the web, data types are helpful in the two broad phases
65 of HTTP messages: parsing and generation.
66
67 Consider a server, which has to parse a request, and produce a response.
68 Guile will parse the request into an HTTP request object
69 (@pxref{Requests}), with each header parsed into an appropriate Scheme
70 data type. This transition from an incoming stream of characters to
71 typed data is a state change in a program---the strings might parse, or
72 they might not, and something has to happen if they do not. (Guile
73 throws an error in this case.) But after you have the parsed request,
74 ``client'' code (code built on top of the Guile web framework) will not
75 have to check for syntactic validity. The types already make this
76 information manifest.
77
78 This state change on the parsing boundary makes programs more robust,
79 as they themselves are freed from the need to do a number of common
80 error checks, and they can use normal Scheme procedures to handle a
81 request instead of ad-hoc string parsers.
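
As a minimal sketch of this parsing boundary, the following wraps
@code{parse-header} from @code{(web http)} so that a malformed value
yields @code{#f} instead of an error.  The name @code{try-parse-header}
is hypothetical and not part of Guile, and catching every key with
@code{#t} is a simplification chosen for the illustration.

@example
(use-modules (web http))

;; Hypothetical helper: parse a header value, returning #f instead of
;; propagating the error that Guile throws on malformed input.
(define (try-parse-header name str)
  (catch #t
    (lambda () (parse-header name str))
    (lambda (key . args) #f)))

(try-parse-header 'content-length "300")    ; @result{} 300
(try-parse-header 'content-length "three")  ; @result{} #f
@end example

Code that receives the integer no longer needs to ask whether the
header was well formed; that question was settled once, at the boundary.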
82
83 The need for types on the response generation side (in a server) is more
84 subtle, though not less important. Consider the example of a POST
85 handler, which prints out the text that a user submits from a form.
86 Such a handler might include a procedure like this:
87
88 @example
89 ;; First, a helper procedure
90 (define (para . contents)
91 (string-append "<p>" (string-concatenate contents) "</p>"))
92
93 ;; Now the meat of our simple web application
94 (define (you-said text)
95 (para "You said: " text))
96
97 (display (you-said "Hi!"))
98 @print{} <p>You said: Hi!</p>
99 @end example
100
101 This is a perfectly valid implementation, provided that the incoming
102 text does not contain the special HTML characters @samp{<}, @samp{>}, or
103 @samp{&}. But this provision of a restricted character set is not
104 reflected anywhere in the program itself: we must @emph{assume} that the
105 programmer understands this, and performs the check elsewhere.
106
107 Unfortunately, the short history of the practice of programming does not
108 bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
109 vulnerability is just such a common error in which unfiltered user input
110 is allowed into the output. A user could submit a crafted comment to
111 your web site which results in visitors running malicious JavaScript,
112 within the security context of your domain:
113
114 @example
115 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
116 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
117 @end example
118
119 The fundamental problem here is that both user data and the program
120 template are represented using strings. This identity means that types
121 can't help the programmer to make a distinction between these two, so
122 they get confused.
123
124 There are a number of possible solutions, but perhaps the best is to
125 treat HTML not as strings, but as native s-expressions: as SXML. The
126 basic idea is that HTML is either text, represented by a string, or an
127 element, represented as a tagged list. So @samp{foo} becomes
128 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
129 Attributes, if present, go in a tagged list headed by @samp{@@}, like
130 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml
131 simple}, for more information.
132
133 The good thing about SXML is that HTML elements cannot be confused with
134 text. Let's make a new definition of @code{para}:
135
136 @example
137 (define (para . contents)
138 `(p ,@@contents))
139
140 (use-modules (sxml simple))
141 (sxml->xml (you-said "Hi!"))
142 @print{} <p>You said: Hi!</p>
143
144 (sxml->xml (you-said "<i>Rats, foiled again!</i>"))
145 @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
146 @end example
147
148 So we see in the second example that HTML elements cannot be unwittingly
149 introduced into the output. However it is now perfectly acceptable to
150 pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
151 over everything-as-a-string.
152
153 @example
154 (sxml->xml (you-said (you-said "<Hi!>")))
155 @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
156 @end example
157
158 The SXML types allow procedures to @emph{compose}. The types make
159 manifest which parts are HTML elements, and which are text. So you
160 needn't worry about escaping user input; the type transition back to a
161 string handles that for you. This removes the most common source of
162 @acronym{XSS} vulnerabilities.
163
164 Well. That's all very nice and opinionated and such, but how do I use
165 the thing? Read on!
166
167 @node URIs
168 @subsection Uniform Resource Identifiers
169
170 Guile provides a standard data type for Uniform Resource Identifiers
171 (URIs), as defined in RFC 3986.
172
173 The generic URI syntax is as follows:
174
175 @example
176 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
177 [ "?" query ] [ "#" fragment ]
178 @end example
179
180 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
181 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
182 @code{/help/}, and there is no userinfo, port, query, or fragment. All
183 URIs have a scheme and a path (though the path might be empty). Some
184 URIs have a host, and some of those have ports and userinfo. Any URI
185 might have a query part or a fragment.
186
187 Userinfo is something of an abstraction, as some legacy URI schemes
188 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
189 since passwords do not belong in URIs, the RFC does not want to condone
190 this practice, so it calls anything before the @code{@@} sign
191 @dfn{userinfo}.
192
193 Properly speaking, a fragment is not part of a URI. For example, when a
194 web browser follows a link to @indicateurl{http://example.com/#foo}, it
195 sends a request for @indicateurl{http://example.com/}, then looks in the
196 resulting page for the fragment identified by @code{foo}. A
197 fragment identifies a part of a resource, not the resource itself. But
198 it is useful to have a fragment field in the URI record itself, so we
199 hope you will forgive the inconsistency.
200
201 @example
202 (use-modules (web uri))
203 @end example
204
205 The following procedures can be found in the @code{(web uri)}
206 module. Load it into your Guile, using a form like the above, to have
207 access to them.
208
209 @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
210 [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
211 [#:fragment=@code{#f}] [#:validate?=@code{#t}]
212 Construct a URI object. @var{scheme} should be a symbol, @var{port}
213 either a non-negative exact integer or @code{#f}, and the rest of the
214 fields either strings or @code{#f}. If @var{validate?} is true, also run
215 some consistency checks to make sure that the constructed URI is valid.
216 @end deffn
217
218 @deffn {Scheme Procedure} uri? x
219 @deffnx {Scheme Procedure} uri-scheme uri
220 @deffnx {Scheme Procedure} uri-userinfo uri
221 @deffnx {Scheme Procedure} uri-host uri
222 @deffnx {Scheme Procedure} uri-port uri
223 @deffnx {Scheme Procedure} uri-path uri
224 @deffnx {Scheme Procedure} uri-query uri
225 @deffnx {Scheme Procedure} uri-fragment uri
226 A predicate and field accessors for the URI record type. The URI scheme
227 will be a symbol, the port either an exact integer or @code{#f}, and the
228 rest either strings or @code{#f} if not present.
229 @end deffn
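
As a short illustration, the URI from the beginning of this subsection
can be built with @code{build-uri} and taken apart again with the
accessors.  The variable name @code{u} is only for the example.

@example
(use-modules (web uri))

;; Build the URI for http://www.gnu.org/help/ from its parts.
(define u (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

(uri-scheme u)   ; @result{} http
(uri-host u)     ; @result{} "www.gnu.org"
(uri-path u)     ; @result{} "/help/"
;; Fields that were not supplied are #f.
(uri-port u)     ; @result{} #f
(uri-query u)    ; @result{} #f
@end example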
230
231 @deffn {Scheme Procedure} string->uri string
232 Parse @var{string} into a URI object. Return @code{#f} if the string
233 could not be parsed.
234 @end deffn
235
236 @deffn {Scheme Procedure} uri->string uri
237 Serialize @var{uri} to a string. If the URI has a port that is the
238 default port for its scheme, the port is not included in the
239 serialization.
240 @end deffn
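
The following sketch parses a string, inspects the result, and
serializes it again.  The URI itself is made up for the example.  It
assumes that port 80 is declared as the default port for @code{http},
which is why the port does not reappear in the serialized form.

@example
(use-modules (web uri))

(define u (string->uri "http://www.gnu.org:80/help/?lang=en#top"))

(uri-port u)      ; @result{} 80
(uri-query u)     ; @result{} "lang=en"
(uri-fragment u)  ; @result{} "top"

;; The default port for the scheme is dropped on output.
(uri->string u)   ; @result{} "http://www.gnu.org/help/?lang=en#top"

;; A string with no scheme does not parse as a URI.
(string->uri "no scheme here")  ; @result{} #f
@end example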
241
242 @deffn {Scheme Procedure} declare-default-port! scheme port
243 Declare a default port for the given URI scheme.
244 @end deffn
245
246 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
247 Percent-decode the given @var{str}, according to @var{encoding}, which
248 should be the name of a character encoding.
249
250 Note that this function should not generally be applied to a full URI
251 string. For paths, use @code{split-and-decode-uri-path} instead. For query
252 strings, split the query on @code{&} and @code{=} boundaries, and decode
253 the components separately.
254
255 Note also that percent-encoded strings encode @emph{bytes}, not
256 characters. There is no guarantee that a given byte sequence is a valid
257 string encoding. Therefore this routine may signal an error if the
258 decoded bytes are not valid for the given encoding. Pass @code{#f} for
259 @var{encoding} if you want decoded bytes as a bytevector directly.
260 @xref{Ports, @code{set-port-encoding!}}, for more information on
261 character encodings.
262
263 Returns a string of the decoded characters, or a bytevector if
264 @var{encoding} was @code{#f}.
265 @end deffn
266
267 Fixme: clarify return type. indicate default values. type of
268 unescaped-chars.
269
270 @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
271 Percent-encode any character not in the character set,
272 @var{unescaped-chars}.
273
274 The default character set includes alphanumerics from ASCII, as well as
275 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
276 other character will be percent-encoded, by writing out the character to
277 a bytevector within the given @var{encoding}, then encoding each byte as
278 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
279 the byte.
280 @end deffn
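
As a sketch of the advice given above for query strings, the following
splits a query on @code{&} and @code{=} and decodes each component
separately.  The procedure @code{decode-query} is a hypothetical helper,
not part of @code{(web uri)}, and it makes the simplifying assumption
that a field without @code{=} has an empty value.

@example
(use-modules (web uri))

(uri-encode "scrambled eggs")   ; @result{} "scrambled%20eggs"
(uri-decode "bar%20baz")        ; @result{} "bar baz"

;; Hypothetical helper: turn a query string into an alist of decoded
;; keys and values.  Splitting happens before decoding, so an encoded
;; "&" or "=" inside a component is not mistaken for a delimiter.
(define (decode-query query)
  (map (lambda (field)
         (let ((i (string-index field #\=)))
           (if i
               (cons (uri-decode (substring field 0 i))
                     (uri-decode (substring field (+ i 1))))
               (cons (uri-decode field) ""))))
       (string-split query #\&)))

(decode-query "q=guile%20scheme&lang=en")
;; @result{} (("q" . "guile scheme") ("lang" . "en"))
@end example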
281
282 @deffn {Scheme Procedure} split-and-decode-uri-path path
283 Split @var{path} into its components, and decode each component,
284 removing empty components.
285
286 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
287 @code{("foo" "bar baz")}.
288 @end deffn
289
290 @deffn {Scheme Procedure} encode-and-join-uri-path parts
291 URI-encode each element of @var{parts}, which should be a list of
292 strings, and join the parts together with @code{/} as a delimiter.
293
294 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
295 as @code{"scrambled%20eggs/biscuits%26gravy"}.
296 @end deffn
297
298 @node HTTP
299 @subsection The Hypertext Transfer Protocol
300
301 The initial motivation for including web functionality in Guile, rather
302 than rely on an external package, was to establish a standard base on
303 which people can share code. To that end, we continue the focus on data
304 types by providing a number of low-level parsers and unparsers for
305 elements of the HTTP protocol.
306
307 If you want to skip the low-level details for now and move on to web
308 pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
309 HTTP module, and read on.
310
311 @example
312 (use-modules (web http))
313 @end example
314
315 The focus of the @code{(web http)} module is to parse and unparse
316 standard HTTP headers, representing them to Guile as native data
317 structures. For example, a @code{Date:} header will be represented as a
318 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
319
320 Guile tries to follow RFCs fairly strictly---the road to perdition being
321 paved with compatibility hacks---though some allowances are made for
322 not-too-divergent texts.
323
324 Header names are represented as lower-case symbols.
325
326 @deffn {Scheme Procedure} string->header name
327 Parse @var{name} to a symbolic header name.
328 @end deffn
329
330 @deffn {Scheme Procedure} header->string sym
331 Return the string form for the header named @var{sym}.
332 @end deffn
333
334 For example:
335
336 @example
337 (string->header "Content-Length")
338 @result{} content-length
339 (header->string 'content-length)
340 @result{} "Content-Length"
341
342 (string->header "FOO")
343 @result{} foo
344 (header->string 'foo)
345 @result{} "Foo"
346 @end example
347
348 Guile keeps a registry of known headers, their string names, and some
349 parsing and serialization procedures. If a header is unknown, its
350 string name is simply its symbol name in title-case.
351
352 @deffn {Scheme Procedure} known-header? sym
353 Return @code{#t} iff @var{sym} is a known header, with associated
354 parsers and serialization procedures.
355 @end deffn
356
357 @deffn {Scheme Procedure} header-parser sym
358 Return the value parser for headers named @var{sym}. The result is a
359 procedure that takes one argument, a string, and returns the parsed
360 value. If the header isn't known to Guile, a default parser is returned
361 that passes through the string unchanged.
362 @end deffn
363
364 @deffn {Scheme Procedure} header-validator sym
365 Return a predicate which returns @code{#t} if the given value is valid
366 for headers named @var{sym}. The default validator for unknown headers
367 is @code{string?}.
368 @end deffn
369
370 @deffn {Scheme Procedure} header-writer sym
371 Return a procedure that writes values for headers named @var{sym} to a
372 port. The resulting procedure takes two arguments: a value and a port.
373 The default writer is @code{display}.
374 @end deffn
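
To illustrate how the registry procedures fit together, here is a small
sketch that looks up the parser, validator, and writer for
@code{content-length}.  The header name @code{x-made-up} is an arbitrary
example of a header that Guile is assumed not to know about.

@example
(use-modules (web http))

(known-header? 'content-length)   ; @result{} #t
(known-header? 'x-made-up)        ; @result{} #f

;; The parser maps the string from the wire to a Scheme value.
((header-parser 'content-length) "300")     ; @result{} 300

;; The validator accepts the parsed value, but not the raw string.
((header-validator 'content-length) 300)    ; @result{} #t
((header-validator 'content-length) "300")  ; @result{} #f

;; The writer sends a value to a port; here, a string port.
(call-with-output-string
  (lambda (port)
    ((header-writer 'content-length) 300 port)))
;; @result{} "300"
@end example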
375
376 For more on the set of headers that Guile knows about out of the box,
377 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
378 procedure:
379
380 @deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
381 Declare a parser, validator, and writer for a given header.
382 @end deffn
383
384 For example, let's say you are running a web server behind some sort of
385 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
386 the IPv4 address of the original client. You would like for the HTTP
387 request record to parse out this header to a Scheme value, instead of
388 leaving it as a string. You could register this header with Guile's
389 HTTP stack like this:
390
391 @example
392 (declare-header! "X-Client-Address"
393 (lambda (str)
394 (inet-aton str))
395 (lambda (ip)
396 (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
397 (lambda (ip port)
398 (display (inet-ntoa ip) port)))
399 @end example
400
401 @deffn {Scheme Procedure} declare-opaque-header! name
402 A specialised version of @code{declare-header!} for the case in which
403 you want a header's value to be returned/written ``as-is''.
404 @end deffn
405
406 @deffn {Scheme Procedure} valid-header? sym val
407 Return a true value iff @var{val} is a valid Scheme value for the header
408 with name @var{sym}.
409 @end deffn
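
The following sketch declares a second, simpler header and then checks
it with @code{parse-header} and @code{valid-header?}.  The header
@code{X-Retry-Count} is invented for this example; it is assumed to
carry a non-negative integer.

@example
(use-modules (web http))

;; Hypothetical header carrying a non-negative integer.
(declare-header! "X-Retry-Count"
                 ;; parser: string -> Scheme value
                 (lambda (str) (string->number str))
                 ;; validator: is this a value we know how to write?
                 (lambda (val)
                   (and (integer? val) (exact? val) (>= val 0)))
                 ;; writer: Scheme value -> port
                 (lambda (val port) (display val port)))

(parse-header 'x-retry-count "3")     ; @result{} 3
(valid-header? 'x-retry-count 3)      ; @result{} #t
(valid-header? 'x-retry-count "3")    ; @result{} #f
@end example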
410
411 Now that we have a generic interface for reading and writing headers, we
412 do just that.
413
414 @deffn {Scheme Procedure} read-header port
415 Read one HTTP header from @var{port}. Return two values: the header
416 name and the parsed Scheme value. May raise an exception if the header
417 was known but the value was invalid.
418
419 Returns the end-of-file object for both values if the end of the message
420 headers was reached (i.e., a blank line).
421 @end deffn
422
423 @deffn {Scheme Procedure} parse-header name val
424 Parse @var{val}, a string, with the parser for the header named
425 @var{name}. Returns the parsed value.
426 @end deffn
427
428 @deffn {Scheme Procedure} write-header name val port
429 Write the given header name and value to @var{port}, using the writer
430 from @code{header-writer}.
431 @end deffn
432
433 @deffn {Scheme Procedure} read-headers port
434 Read the headers of an HTTP message from @var{port}, returning the
435 headers as an ordered alist.
436 @end deffn
437
438 @deffn {Scheme Procedure} write-headers headers port
439 Write the given header alist to @var{port}. Doesn't write the final
440 @samp{\r\n}, as the user might want to add another header.
441 @end deffn
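
As a sketch of how these procedures combine, the following writes a
header alist to a string port and reads it back.  The header values are
arbitrary examples.  Because @code{write-headers} leaves out the final
@samp{\r\n}, the example adds it by hand so that @code{read-headers}
sees the blank line that ends the headers.

@example
(use-modules (web http))

(define headers
  '((content-type . (text/plain (charset . "utf-8")))
    (content-length . 300)))

;; Serialize the headers, then terminate them with a blank line.
(define text
  (call-with-output-string
    (lambda (port)
      (write-headers headers port)
      (display "\r\n" port))))

;; Parse them back into typed values.
(define parsed (read-headers (open-input-string text)))

(assq-ref parsed 'content-length)
;; @result{} 300
(assq-ref parsed 'content-type)
;; @result{} (text/plain (charset . "utf-8"))
@end example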
442
443 The @code{(web http)} module also has some utility procedures to read
444 and write request and response lines.
445
446 @deffn {Scheme Procedure} parse-http-method str [start] [end]
447 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
448 like @code{GET}.
449 @end deffn
450
451 @deffn {Scheme Procedure} parse-http-version str [start] [end]
452 Parse an HTTP version from @var{str}, returning it as a major-minor
453 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
454 @code{(1 . 1)}.
455 @end deffn
456
457 @deffn {Scheme Procedure} parse-request-uri str [start] [end]
458 Parse a URI from an HTTP request line. Note that URIs in requests do not
459 have to have a scheme or host name. The result is a URI object.
460 @end deffn
461
462 @deffn {Scheme Procedure} read-request-line port
463 Read the first line of an HTTP request from @var{port}, returning three
464 values: the method, the URI, and the version.
465 @end deffn
466
467 @deffn {Scheme Procedure} write-request-line method uri version port
468 Write the first line of an HTTP request to @var{port}.
469 @end deffn
470
471 @deffn {Scheme Procedure} read-response-line port
472 Read the first line of an HTTP response from @var{port}, returning three
473 values: the HTTP version, the response code, and the ``reason phrase''.
474 @end deffn
475
476 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
477 Write the first line of an HTTP response to @var{port}.
478 @end deffn
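
Here is a short sketch of the request-line and response-line
procedures, using string ports in place of a network connection.  The
request and response lines are made up for the example.

@example
(use-modules (web http) (web uri) (ice-9 receive))

(parse-http-method "GET")         ; @result{} GET
(parse-http-version "HTTP/1.1")   ; @result{} (1 . 1)

;; read-request-line returns three values.
(define request-parts
  (receive (method uri version)
      (read-request-line
       (open-input-string "GET /index.html HTTP/1.1\r\n"))
    (list method (uri-path uri) version)))
;; request-parts @result{} (GET "/index.html" (1 . 1))

;; So does read-response-line.
(define response-parts
  (receive (version code reason)
      (read-response-line (open-input-string "HTTP/1.1 200 OK\r\n"))
    (list version code reason)))
;; response-parts @result{} ((1 . 1) 200 "OK")

(define status-line
  (call-with-output-string
    (lambda (port)
      (write-response-line '(1 . 1) 404 "Not Found" port))))
;; status-line @result{} "HTTP/1.1 404 Not Found\r\n"
@end example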
479
480
481 @node HTTP Headers
482 @subsection HTTP Headers
483
484 In addition to defining the infrastructure to parse headers, the
485 @code{(web http)} module defines specific parsers and unparsers for all
486 headers defined in the HTTP/1.1 standard.
487
488 For example, if you receive a header named @samp{Accept-Language} with a
489 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
490 below):
491
492 @example
493 (parse-header 'accept-language "en, es;q=0.8")
494 @result{} ((1000 . "en") (800 . "es"))
495 @end example
496
497 The format of the value for @samp{Accept-Language} headers is defined
498 below, along with all other headers defined in the HTTP standard. (If
499 the header were unknown, the value would have been returned as a
500 string.)
501
502 For brevity, the header definitions below are given in the form,
503 @var{Type} @code{@var{name}}, indicating that values for the header
504 @code{@var{name}} will be of the given @var{Type}. Since Guile
505 internally treats header names in lower case, in this document we give
506 types title-cased names. A short description of each header's
507 purpose and an example follow.
508
509 For full details on the meanings of all of these headers, see the HTTP
510 1.1 standard, RFC 2616.
511
512 @subsubsection HTTP Header Types
513
514 Here we define the types that are used below, when defining headers.
515
516 @deftp {HTTP Header Type} Date
517 A SRFI-19 date.
518 @end deftp
519
520 @deftp {HTTP Header Type} KVList
521 A list whose elements are keys or key-value pairs. Keys are parsed to
522 symbols. Values are strings by default. Non-string values are the
523 exception, and are mentioned explicitly below, as appropriate.
524 @end deftp
525
526 @deftp {HTTP Header Type} SList
527 A list of strings.
528 @end deftp
529
530 @deftp {HTTP Header Type} Quality
531 An exact integer between 0 and 1000. Qualities are used to express
532 preference, given multiple options. An option with a quality of 870,
533 for example, is preferred over an option with quality 500.
534
535 (Qualities are written out over the wire as numbers between 0.0 and
536 1.0, but since the standard only allows three digits after the decimal,
537 it's equivalent to integers between 0 and 1000, so that's what Guile
538 uses.)
539 @end deftp
540
541 @deftp {HTTP Header Type} QList
542 A quality list: a list of pairs, the car of which is a quality, and the
543 cdr a string. Used to express a list of options, along with their
544 qualities.
545 @end deftp
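
Since a quality list is an ordinary list of pairs, normal list
procedures apply to it.  As a sketch, the hypothetical helper
@code{preferred-option} below (not part of Guile) picks the option with
the highest quality.

@example
(use-modules (web http))

;; Hypothetical helper: return the option with the highest quality,
;; or #f if the list is empty.
(define (preferred-option qlist)
  (and (pair? qlist)
       (cdr (car (sort qlist
                       (lambda (a b) (> (car a) (car b))))))))

(preferred-option
 (parse-header 'accept-language "en;q=0.5, es;q=0.8"))
;; @result{} "es"
@end example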
546
547 @deftp {HTTP Header Type} ETag
548 An entity tag, represented as a pair. The car of the pair is an opaque
549 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
550 tag, and @code{#f} otherwise.
551 @end deftp
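
For illustration, this is how strong and weak entity tags are expected
to parse when they arrive in an @code{ETag} header.  The tag
@samp{xyzzy} is an arbitrary example value; a weak tag is marked on the
wire with a @samp{W/} prefix.

@example
(use-modules (web http))

;; A strong entity tag: the cdr is #t.
(parse-header 'etag "\"xyzzy\"")
;; @result{} ("xyzzy" . #t)

;; A weak entity tag: the cdr is #f.
(parse-header 'etag "W/\"xyzzy\"")
;; @result{} ("xyzzy" . #f)
@end example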
552
553 @subsubsection General Headers
554
555 General HTTP headers may be present in any HTTP message.
556
557 @deftypevr {HTTP Header} KVList cache-control
558 A key-value list of cache-control directives. See RFC 2616 for more
559 details.
560
561 If present, parameters to @code{max-age}, @code{max-stale},
562 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
563 integers.
564
565 If present, parameters to @code{private} and @code{no-cache} are parsed
566 as lists of header names, as symbols.
567
568 @example
569 (parse-header 'cache-control "no-cache,no-store")
570 @result{} (no-cache no-store)
571 (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
572 @result{} ((no-cache . (authorization date)) no-store)
573 (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
574 @result{} ((no-cache . (authorization date)) (max-age . 10))
575 @end example
576 @end deftypevr
577
578 @deftypevr {HTTP Header} List connection
579 A list of header names that apply only to this HTTP connection, as
580 symbols. Additionally, the symbol @samp{close} may be present, to
581 indicate that the server should close the connection after responding to
582 the request.
583 @example
584 (parse-header 'connection "close")
585 @result{} (close)
586 @end example
587 @end deftypevr
588
589 @deftypevr {HTTP Header} Date date
590 The date that a given HTTP message was originated.
591 @example
592 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
593 @result{} #<date ...>
594 @end example
595 @end deftypevr
596
597 @deftypevr {HTTP Header} KVList pragma
598 A key-value list of implementation-specific directives.
599 @example
600 (parse-header 'pragma "no-cache, broccoli=tasty")
601 @result{} (no-cache (broccoli . "tasty"))
602 @end example
603 @end deftypevr
604
605 @deftypevr {HTTP Header} List trailer
606 A list of header names which will appear after the message body, instead
607 of with the message headers.
608 @example
609 (parse-header 'trailer "ETag")
610 @result{} (etag)
611 @end example
612 @end deftypevr
613
614 @deftypevr {HTTP Header} List transfer-encoding
615 A list of transfer codings, expressed as key-value lists. The only
616 transfer coding defined by the specification is @code{chunked}.
617 @example
618 (parse-header 'transfer-encoding "chunked")
619 @result{} ((chunked))
620 @end example
621 @end deftypevr
622
623 @deftypevr {HTTP Header} List upgrade
624 A list of strings, indicating additional protocols that a server could use
625 in response to a request.
626 @example
627 (parse-header 'upgrade "WebSocket")
628 @result{} ("WebSocket")
629 @end example
630 @end deftypevr
631
632 FIXME: parse out more fully?
633 @deftypevr {HTTP Header} List via
634 A list of strings, indicating the protocol versions and hosts of
635 intermediate servers and proxies. There may be multiple @code{via}
636 headers in one message.
637 @example
638 (parse-header 'via "1.0 venus, 1.1 mars")
639 @result{} ("1.0 venus" "1.1 mars")
640 @end example
641 @end deftypevr
642
643 @deftypevr {HTTP Header} List warning
644 A list of warnings given by a server or intermediate proxy. Each
645 warning is itself a list of four elements: a code, as an exact integer
646 between 0 and 1000, a host as a string, the warning text as a string,
647 and either @code{#f} or a SRFI-19 date.
648
649 There may be multiple @code{warning} headers in one message.
650 @example
651 (parse-header 'warning "123 foo \"core breach imminent\"")
652 @result{} ((123 "foo" "core breach imminent" #f))
653 @end example
654 @end deftypevr
655
656
657 @subsubsection Entity Headers
658
659 Entity headers may be present in any HTTP message, and refer to the
660 resource referenced in the HTTP request or response.
661
662 @deftypevr {HTTP Header} List allow
663 A list of allowed methods on a given resource, as symbols.
664 @example
665 (parse-header 'allow "GET, HEAD")
666 @result{} (GET HEAD)
667 @end example
668 @end deftypevr
669
670 @deftypevr {HTTP Header} List content-encoding
671 A list of content codings, as symbols.
672 @example
673 (parse-header 'content-encoding "gzip")
674 @result{} (gzip)
675 @end example
676 @end deftypevr
677
678 @deftypevr {HTTP Header} List content-language
679 The languages that a resource is in, as strings.
680 @example
681 (parse-header 'content-language "en")
682 @result{} ("en")
683 @end example
684 @end deftypevr
685
686 @deftypevr {HTTP Header} UInt content-length
687 The number of bytes in a resource, as an exact, non-negative integer.
688 @example
689 (parse-header 'content-length "300")
690 @result{} 300
691 @end example
692 @end deftypevr
693
694 @deftypevr {HTTP Header} URI content-location
695 The canonical URI for a resource, in the case that it is also accessible
696 from a different URI.
697 @example
698 (parse-header 'content-location "http://example.com/foo")
699 @result{} #<<uri> ...>
700 @end example
701 @end deftypevr
702
703 @deftypevr {HTTP Header} String content-md5
704 The MD5 digest of a resource.
705 @example
706 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
707 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
708 @end example
709 @end deftypevr
710
711 @deftypevr {HTTP Header} List content-range
712 A range specification, as a list of three elements: the symbol
713 @code{bytes}, either the symbol @code{*} or a pair of integers,
714 indicating the byte range, and either @code{*} or an integer, for the
715 instance length. Used to indicate that a response only includes part of
716 a resource.
717 @example
718 (parse-header 'content-range "bytes 10-20/*")
719 @result{} (bytes (10 . 20) *)
720 @end example
721 @end deftypevr
722
723 @deftypevr {HTTP Header} List content-type
724 The MIME type of a resource, as a symbol, along with any parameters.
725 @example
726 (parse-header 'content-type "text/plain")
727 @result{} (text/plain)
728 (parse-header 'content-type "text/plain;charset=utf-8")
729 @result{} (text/plain (charset . "utf-8"))
730 @end example
731 Note that the @code{charset} parameter is something of a misnomer, and
732 the HTTP specification admits this. It specifies the @emph{encoding} of
733 the characters, not the character set.
734 @end deftypevr
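
As a brief sketch, a parsed @code{content-type} value can be taken
apart with ordinary list operations: the car is the media type, and the
cdr is an alist of parameters.  The variable name @code{ct} is only for
the example.

@example
(use-modules (web http))

(define ct (parse-header 'content-type "text/plain;charset=utf-8"))

(car ct)                       ; @result{} text/plain
(assq-ref (cdr ct) 'charset)   ; @result{} "utf-8"

;; With no parameters, the lookup simply returns #f.
(assq-ref (cdr (parse-header 'content-type "text/plain")) 'charset)
;; @result{} #f
@end example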
735
736 @deftypevr {HTTP Header} Date expires
737 The date/time after which the resource given in a response is considered
738 stale.
739 @example
740 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
741 @result{} #<date ...>
742 @end example
743 @end deftypevr
744
745 @deftypevr {HTTP Header} Date last-modified
746 The date/time on which the resource given in a response was last
747 modified.
748 @example
749 (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
750 @result{} #<date ...>
751 @end example
752 @end deftypevr
753
754
755 @subsubsection Request Headers
756
757 Request headers may only appear in an HTTP request, not in a response.
758
759 @deftypevr {HTTP Header} List accept
760 A list of preferred media types for a response. Each element of the
761 list is itself a list, in the same format as @code{content-type}.
762 @example
763 (parse-header 'accept "text/html,text/plain;charset=utf-8")
764 @result{} ((text/html) (text/plain (charset . "utf-8")))
765 @end example
766 Preference is expressed with quality values:
767 @example
768 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
769 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
770 @end example
771 @end deftypevr
772
773 @deftypevr {HTTP Header} QList accept-charset
774 A quality list of acceptable charsets. Note again that what HTTP calls
775 a ``charset'' is what Guile calls a ``character encoding''.
776 @example
777 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
778 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
779 @end example
780 @end deftypevr
781
782 @deftypevr {HTTP Header} QList accept-encoding
783 A quality list of acceptable content codings.
784 @example
785 (parse-header 'accept-encoding "gzip,identity;q=0.8")
786 @result{} ((1000 . "gzip") (800 . "identity"))
787 @end example
788 @end deftypevr
789
790 @deftypevr {HTTP Header} QList accept-language
791 A quality list of acceptable languages.
792 @example
793 (parse-header 'accept-language "cn,en;q=0.75")
794 @result{} ((1000 . "cn") (750 . "en"))
795 @end example
796 @end deftypevr
797
798 @deftypevr {HTTP Header} Pair authorization
799 Authorization credentials. The car of the pair indicates the
800 authentication scheme, like @code{basic}. For basic authentication, the
801 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
802 string. For other authentication schemes, like @code{digest}, the cdr
803 will be a key-value list of credentials.
804 @example
805 (parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
806 @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
807 @end example
808 @end deftypevr
809
810 @deftypevr {HTTP Header} List expect
811 A list of expectations that a client has of a server. The expectations
812 are key-value lists.
813 @example
814 (parse-header 'expect "100-continue")
815 @result{} ((100-continue))
816 @end example
817 @end deftypevr
818
819 @deftypevr {HTTP Header} String from
820 The email address of a user making an HTTP request.
821 @example
822 (parse-header 'from "bob@@example.com")
823 @result{} "bob@@example.com"
824 @end example
825 @end deftypevr
826
827 @deftypevr {HTTP Header} Pair host
828 The host for the resource being requested, as a hostname-port pair. If
829 no port is given, the port is @code{#f}.
830 @example
831 (parse-header 'host "gnu.org:80")
832 @result{} ("gnu.org" . 80)
833 (parse-header 'host "gnu.org")
834 @result{} ("gnu.org" . #f)
835 @end example
836 @end deftypevr
837
838 @deftypevr {HTTP Header} *|List if-match
839 A set of etags, indicating that the request should proceed if and only
840 if the etag of the resource is in that set. Either the symbol @code{*},
841 indicating any etag, or a list of entity tags.
842 @example
843 (parse-header 'if-match "*")
844 @result{} *
845 (parse-header 'if-match "asdfadf")
846 @result{} (("asdfadf" . #t))
847 (parse-header 'if-match "W/\"asdfadf\"")
848 @result{} (("asdfadf" . #f))
849 @end example
850 @end deftypevr
851
852 @deftypevr {HTTP Header} Date if-modified-since
853 Indicates that a response should proceed if and only if the resource has
854 been modified since the given date.
855 @example
856 (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
857 @result{} #<date ...>
858 @end example
859 @end deftypevr
860
861 @deftypevr {HTTP Header} *|List if-none-match
862 A set of etags, indicating that the request should proceed if and only
863 if the etag of the resource is not in the set. Either the symbol
864 @code{*}, indicating any etag, or a list of entity tags.
865 @example
866 (parse-header 'if-none-match "*")
867 @result{} *
868 @end example
869 @end deftypevr
870
871 @deftypevr {HTTP Header} ETag|Date if-range
872 Indicates that the range request should proceed if and only if the
873 resource matches a modification date or an etag. Either an entity tag,
874 or a SRFI-19 date.
875 @example
876 (parse-header 'if-range "\"original-etag\"")
877 @result{} ("original-etag" . #t)
878 @end example
879 @end deftypevr
880
881 @deftypevr {HTTP Header} Date if-unmodified-since
882 Indicates that a response should proceed if and only if the resource has
883 not been modified since the given date.
884 @example
885 (parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
886 @result{} #<date ...>
887 @end example
888 @end deftypevr
889
890 @deftypevr {HTTP Header} UInt max-forwards
891 The maximum number of proxy or gateway hops that a request should be
892 subject to.
893 @example
894 (parse-header 'max-forwards "10")
895 @result{} 10
896 @end example
897 @end deftypevr
898
899 @deftypevr {HTTP Header} Pair proxy-authorization
900 Authorization credentials for a proxy connection. See the documentation
901 for @code{authorization} above for more information on the format.
902 @example
903 (parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
904 @result{} (digest (foo . "bar") (baz . "qux"))
905 @end example
906 @end deftypevr
907
908 @deftypevr {HTTP Header} Pair range
909 A range request, indicating that the client wants only part of a
910 resource. The car of the pair is the symbol @code{bytes}, and the cdr
911 is a list of pairs. Each element of the cdr indicates a range; the car
912 is the first byte position and the cdr is the last byte position, as
913 integers, or @code{#f} if not given.
914 @example
915 (parse-header 'range "bytes=10-30,50-")
916 @result{} (bytes (10 . 30) (50 . #f))
917 @end example
918 @end deftypevr
919
920 @deftypevr {HTTP Header} URI referer
921 The URI of the resource that referred the user to this resource. The
922 name of the header is a misspelling, but we are stuck with it.
923 @example
924 (parse-header 'referer "http://www.gnu.org/")
925 @result{} #<uri ...>
926 @end example
927 @end deftypevr
928
929 @deftypevr {HTTP Header} List te
930 A list of transfer codings, expressed as key-value lists. A common
931 transfer coding is @code{trailers}.
932 @example
933 (parse-header 'te "trailers")
934 @result{} ((trailers))
935 @end example
936 @end deftypevr
937
938 @deftypevr {HTTP Header} String user-agent
939 A string indicating the user agent making the request. The
940 specification defines a structured format for this header, but it is
941 widely disregarded, so Guile does not attempt to parse strictly.
942 @example
943 (parse-header 'user-agent "Mozilla/5.0")
944 @result{} "Mozilla/5.0"
945 @end example
946 @end deftypevr
947
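As a small sketch of how these parsed forms combine, the following
parses a handful of raw request header values into an alist, using only
@code{parse-header} and the formats shown above. The names
@code{raw-headers} and @code{parsed-headers} are illustrative and not
part of Guile.

@example
(use-modules (web http))

;; Raw header values as they might appear on the wire (sample data).
(define raw-headers
  '((host . "gnu.org:80")
    (max-forwards . "10")
    (range . "bytes=10-30,50-")))

;; Parse each value according to the header it belongs to.
(define parsed-headers
  (map (lambda (entry)
         (cons (car entry)
               (parse-header (car entry) (cdr entry))))
       raw-headers))

(assq-ref parsed-headers 'host)   ; @result{} ("gnu.org" . 80)
(assq-ref parsed-headers 'range)  ; @result{} (bytes (10 . 30) (50 . #f))
@end example
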
948
949 @subsubsection Response Headers
950
951 @deftypevr {HTTP Header} List accept-ranges
952 A list of range units that the server supports, as symbols.
953 @example
954 (parse-header 'accept-ranges "bytes")
955 @result{} (bytes)
956 @end example
957 @end deftypevr
958
959 @deftypevr {HTTP Header} UInt age
960 The age of a cached response, in seconds.
961 @example
962 (parse-header 'age "3600")
963 @result{} 3600
964 @end example
965 @end deftypevr
966
967 @deftypevr {HTTP Header} ETag etag
968 The entity-tag of the resource.
969 @example
970 (parse-header 'etag "\"foo\"")
971 @result{} ("foo" . #t)
972 @end example
973 @end deftypevr
974
975 @deftypevr {HTTP Header} URI location
976 A URI on which a request may be completed. Used in combination with a
977 redirecting status code to perform client-side redirection.
978 @example
979 (parse-header 'location "http://example.com/other")
980 @result{} #<uri ...>
981 @end example
982 @end deftypevr
983
984 @deftypevr {HTTP Header} List proxy-authenticate
985 A list of challenges to a proxy, indicating the need for authentication.
986 @example
987 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
988 @result{} ((basic (realm . "foo")))
989 @end example
990 @end deftypevr
991
992 @deftypevr {HTTP Header} UInt|Date retry-after
993 Used in combination with a server-busy status code, like 503, to
994 indicate that a client should retry later. Either a number of seconds,
995 or a date.
996 @example
997 (parse-header 'retry-after "60")
998 @result{} 60
999 @end example
1000 @end deftypevr
1001
1002 @deftypevr {HTTP Header} String server
1003 A string identifying the server.
1004 @example
1005 (parse-header 'server "My first web server")
1006 @result{} "My first web server"
1007 @end example
1008 @end deftypevr
1009
1010 @deftypevr {HTTP Header} *|List vary
1011 A set of request headers that were used in computing this response.
1012 Used to indicate that server-side content negotiation was performed, for
1013 example in response to the @code{accept-language} header. Can also be
1014 the symbol @code{*}, indicating that all headers were considered.
1015 @example
1016 (parse-header 'vary "Accept-Language, Accept")
1017 @result{} (accept-language accept)
1018 @end example
1019 @end deftypevr
1020
1021 @deftypevr {HTTP Header} List www-authenticate
1022 A list of challenges to a user, indicating the need for authentication.
1023 @example
1024 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1025 @result{} ((basic (realm . "foo")))
1026 @end example
1027 @end deftypevr
1028
1029 @node Transfer Codings
1030 @subsection Transfer Codings
1031
1032 HTTP 1.1 allows for various transfer codings to be applied to message
1033 bodies. These include various types of compression, and HTTP chunked
1034 encoding. Currently, only chunked encoding is supported by Guile.
1035
1036 Chunked coding is an optional coding that may be applied to message
1037 bodies, to allow messages whose length is not known beforehand to be
1038 returned. Such messages can be split into chunks, terminated by a final
1039 zero length chunk.
1040
1041 To make dealing with encodings simpler, Guile provides
1042 procedures to create ports that ``wrap'' existing ports, applying
1043 transformations transparently under the hood.
1044
1045 These procedures are in the @code{(web http)} module.
1046
1047 @example
1048 (use-modules (web http))
1049 @end example
1050
1051 @deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f]
1052 Returns a new port that transparently reads and decodes chunk-encoded
1053 data from @var{port}. If no more chunk-encoded data is available, it
1054 returns the end-of-file object. When the port is closed, @var{port} will
1055 also be closed, unless @var{keep-alive?} is true.
1056 @end deffn
1057
1058 @example
1059 (use-modules (ice-9 rdelim))
1060
1061 (define s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n")
1062 (define p (make-chunked-input-port (open-input-string s)))
1063 (read-line p)
1064 @result{} "First line"
1065 (read-line p)
1066 @result{} " Second line"
1067 @end example
1068
1069 @deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f]
1070 Returns a new port, which transparently encodes data as chunk-encoded
1071 before writing it to @var{port}. Whenever a write occurs on this port,
1072 it buffers it, until the port is flushed, at which point it writes a
1073 chunk containing all the data written so far. When the port is closed,
1074 the data remaining is written to @var{port}, as is the terminating zero
1075 chunk. It also causes @var{port} to be closed, unless @var{keep-alive?}
1076 is true.
1077
1078 Note: forcing a chunked output port when no data is buffered
1079 does not write a zero chunk, as this would cause the data to be
1080 interpreted incorrectly by the client.
1081 @end deffn
1082
1083 @example
1084 (call-with-output-string
1085 (lambda (out)
1086 (define out* (make-chunked-output-port out #:keep-alive? #t))
1087 (display "first chunk" out*)
1088 (force-output out*)
1089 (force-output out*) ; note this does not write a zero chunk
1090 (display "second chunk" out*)
1091 (close-port out*)))
1092 @result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n"
1093 @end example
1094
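The two kinds of port can be combined into a round trip, which may be a
convenient way to check code that produces chunked bodies. This is only
a sketch: @code{chunk-encode} and @code{chunk-decode} are hypothetical
helpers, not part of @code{(web http)}, and the sketch assumes that
@code{read-delimited} with an empty delimiter set reads up to the end of
the decoded data.

@example
(use-modules (web http) (ice-9 rdelim))

(define (chunk-encode . pieces)
  ;; Write each piece as its own chunk, then the terminating zero chunk.
  (call-with-output-string
    (lambda (out)
      (let ((chunked (make-chunked-output-port out #:keep-alive? #t)))
        (for-each (lambda (piece)
                    (display piece chunked)
                    (force-output chunked))
                  pieces)
        (close-port chunked)))))

(define (chunk-decode str)
  ;; Read everything up to end-of-file from the decoding port.
  (read-delimited ""
                  (make-chunked-input-port (open-input-string str))))

(chunk-decode (chunk-encode "first chunk" "second chunk"))
@end example

If both ports behave as documented above, the last expression should
give back the concatenation of the pieces that were written.
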
1095 @node Requests
1096 @subsection HTTP Requests
1097
1098 @example
1099 (use-modules (web request))
1100 @end example
1101
1102 The request module contains a data type for HTTP requests.
1103
1104 @subsubsection An Important Note on Character Sets
1105
1106 HTTP requests consist of two parts: the request proper, consisting of a
1107 request line and a set of headers, and (optionally) a body. The body
1108 might have a binary content-type, and even in the textual case its
1109 length is specified in bytes, not characters.
1110
1111 Therefore, HTTP is a fundamentally binary protocol. However the request
1112 line and headers are specified to be in a subset of ASCII, so they can
1113 be treated as text, provided that the port's encoding is set to an
1114 ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
1115 is just such an encoding, and happens to be very efficient for Guile.
1116
1117 So what Guile does when reading requests from the wire, or writing them
1118 out, is to set the port's encoding to latin-1, and treat the request
1119 headers as text.
1120
1121 The request body is another issue. For binary data, the data is
1122 probably in a bytevector, so we use the R6RS binary output procedures to
1123 write out the binary payload. Textual data usually has to be written
1124 out to some character encoding, usually UTF-8, and then the resulting
1125 bytevector is written out to the port.
1126
1127 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1128 any loss of generality.
1129
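The difference between characters and bytes is easy to see for a short
textual body. The sketch below encodes a four-character string as UTF-8
with @code{string->utf8} from @code{(rnrs bytevectors)}; the names
@code{text} and @code{payload} are illustrative. It is the length of the
bytevector, not of the string, that belongs in a @code{content-length}
header.

@example
(use-modules (rnrs bytevectors))

;; The word "cafe" with an acute accent on the final letter, built with
;; integer->char so that this source stays ASCII-only.
(define text (string #\c #\a #\f (integer->char #xe9)))

;; The body as it would travel over the wire.
(define payload (string->utf8 text))

(string-length text)         ; @result{} 4, counted in characters
(bytevector-length payload)  ; @result{} 5, the accented letter takes 2 bytes
@end example
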
1130 @subsubsection Request API
1131
1132 @deffn {Scheme Procedure} request?
1133 @deffnx {Scheme Procedure} request-method
1134 @deffnx {Scheme Procedure} request-uri
1135 @deffnx {Scheme Procedure} request-version
1136 @deffnx {Scheme Procedure} request-headers
1137 @deffnx {Scheme Procedure} request-meta
1138 @deffnx {Scheme Procedure} request-port
1139 A predicate and field accessors for the request type. The fields are as
1140 follows:
1141 @table @code
1142 @item method
1143 The HTTP method, for example, @code{GET}.
1144 @item uri
1145 The URI as a URI record.
1146 @item version
1147 The HTTP version pair, like @code{(1 . 1)}.
1148 @item headers
1149 The request headers, as an alist of parsed values.
1150 @item meta
1151 An arbitrary alist of other data, for example information returned in
1152 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1153 Communication}).
1154 @item port
1155 The port on which to read or write a request body, if any.
1156 @end table
1157 @end deffn
1158
1159 @deffn {Scheme Procedure} read-request port [meta='()]
1160 Read an HTTP request from @var{port}, optionally attaching the given
1161 metadata, @var{meta}.
1162
1163 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1164 (latin-1), so that reading one character reads one byte. See the
1165 discussion of character sets above, for more information.
1166
1167 Note that the body is not part of the request. Once you have read a
1168 request, you may read the body separately, and likewise for writing
1169 requests.
1170 @end deffn
1171
1172 @deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1173 Construct an HTTP request object. If @var{validate-headers?} is true,
1174 the headers are each run through their respective validators.
1175 @end deffn
1176
1177 @deffn {Scheme Procedure} write-request r port
1178 Write the given HTTP request to @var{port}.
1179
1180 Return a new request, whose @code{request-port} will continue writing
1181 on @var{port}, perhaps using some transfer encoding.
1182 @end deffn
1183
1184 @deffn {Scheme Procedure} read-request-body r
1185 Reads the request body from @var{r}, as a bytevector. Return @code{#f}
1186 if there was no request body.
1187 @end deffn
1188
1189 @deffn {Scheme Procedure} write-request-body r bv
1190 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1191 request @var{r}.
1192 @end deffn
1193
1194 The various headers that are typically associated with HTTP requests may
1195 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1196 more information on the format of parsed headers.
1197
1198 @deffn {Scheme Procedure} request-accept request [default='()]
1199 @deffnx {Scheme Procedure} request-accept-charset request [default='()]
1200 @deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1201 @deffnx {Scheme Procedure} request-accept-language request [default='()]
1202 @deffnx {Scheme Procedure} request-allow request [default='()]
1203 @deffnx {Scheme Procedure} request-authorization request [default=#f]
1204 @deffnx {Scheme Procedure} request-cache-control request [default='()]
1205 @deffnx {Scheme Procedure} request-connection request [default='()]
1206 @deffnx {Scheme Procedure} request-content-encoding request [default='()]
1207 @deffnx {Scheme Procedure} request-content-language request [default='()]
1208 @deffnx {Scheme Procedure} request-content-length request [default=#f]
1209 @deffnx {Scheme Procedure} request-content-location request [default=#f]
1210 @deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1211 @deffnx {Scheme Procedure} request-content-range request [default=#f]
1212 @deffnx {Scheme Procedure} request-content-type request [default=#f]
1213 @deffnx {Scheme Procedure} request-date request [default=#f]
1214 @deffnx {Scheme Procedure} request-expect request [default='()]
1215 @deffnx {Scheme Procedure} request-expires request [default=#f]
1216 @deffnx {Scheme Procedure} request-from request [default=#f]
1217 @deffnx {Scheme Procedure} request-host request [default=#f]
1218 @deffnx {Scheme Procedure} request-if-match request [default=#f]
1219 @deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1220 @deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1221 @deffnx {Scheme Procedure} request-if-range request [default=#f]
1222 @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1223 @deffnx {Scheme Procedure} request-last-modified request [default=#f]
1224 @deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1225 @deffnx {Scheme Procedure} request-pragma request [default='()]
1226 @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1227 @deffnx {Scheme Procedure} request-range request [default=#f]
1228 @deffnx {Scheme Procedure} request-referer request [default=#f]
1229 @deffnx {Scheme Procedure} request-te request [default=#f]
1230 @deffnx {Scheme Procedure} request-trailer request [default='()]
1231 @deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1232 @deffnx {Scheme Procedure} request-upgrade request [default='()]
1233 @deffnx {Scheme Procedure} request-user-agent request [default=#f]
1234 @deffnx {Scheme Procedure} request-via request [default='()]
1235 @deffnx {Scheme Procedure} request-warning request [default='()]
1236 Return the given request header, or @var{default} if none was present.
1237 @end deffn
1238
1239 @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1240 A helper routine to determine the absolute URI of a request, using the
1241 @code{host} header and the default host and port.
1242 @end deffn
1243
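As a brief sketch of how these procedures fit together, the following
builds a request by hand and inspects it with the accessors. The URI and
header values are sample data, and @code{req} is just an illustrative
name.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((host . ("www.gnu.org" . #f))
                             (max-forwards . 10))))

(request-method req)         ; @result{} GET
(request-version req)        ; @result{} (1 . 1)
(request-host req)           ; @result{} ("www.gnu.org" . #f)
(request-max-forwards req)   ; @result{} 10
;; No referer header was given, so the default is returned.
(request-referer req 'none)  ; @result{} none
@end example

Note that the header values are given in their parsed form, as described
in @ref{HTTP Headers}, since that is what the validators check.
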
1244
1245 @node Responses
1246 @subsection HTTP Responses
1247
1248 @example
1249 (use-modules (web response))
1250 @end example
1251
1252 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
1253 responses. Again, the body is represented separately from the response.
1254
1255 @deffn {Scheme Procedure} response?
1256 @deffnx {Scheme Procedure} response-version
1257 @deffnx {Scheme Procedure} response-code
1258 @deffnx {Scheme Procedure} response-reason-phrase response
1259 @deffnx {Scheme Procedure} response-headers
1260 @deffnx {Scheme Procedure} response-port
1261 A predicate and field accessors for the response type. The fields are as
1262 follows:
1263 @table @code
1264 @item version
1265 The HTTP version pair, like @code{(1 . 1)}.
1266 @item code
1267 The HTTP response code, like @code{200}.
1268 @item reason-phrase
1269 The reason phrase, or the standard reason phrase for the response's
1270 code.
1271 @item headers
1272 The response headers, as an alist of parsed values.
1273 @item port
1274 The port on which to read or write a response body, if any.
1275 @end table
1276 @end deffn
1277
1278 @deffn {Scheme Procedure} read-response port
1279 Read an HTTP response from @var{port}.
1280
1281 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1282 (latin-1), so that reading one character reads one byte. See the
1283 discussion of character sets in @ref{Requests}, for more information.
1284 @end deffn
1285
1286 @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1287 Construct an HTTP response object. If @var{validate-headers?} is true,
1288 the headers are each run through their respective validators.
1289 @end deffn
1290
1291 @deffn {Scheme Procedure} adapt-response-version response version
1292 Adapt the given response to a different HTTP version. Return a new HTTP
1293 response.
1294
1295 The idea is that many applications might just build a response for the
1296 default HTTP version, and this method could handle a number of
1297 programmatic transformations to respond to older HTTP versions (0.9 and
1298 1.0). But currently this function is a bit heavy-handed, just updating
1299 the version field.
1300 @end deffn
1301
1302 @deffn {Scheme Procedure} write-response r port
1303 Write the given HTTP response to @var{port}.
1304
1305 Return a new response, whose @code{response-port} will continue writing
1306 on @var{port}, perhaps using some transfer encoding.
1307 @end deffn
1308
1309 @deffn {Scheme Procedure} response-must-not-include-body? r
1310 Some responses, like those with status code 304, are specified as never
1311 having bodies. This predicate returns @code{#t} for those responses.
1312
1313 Note also, though, that responses to @code{HEAD} requests must also not
1314 have a body.
1315 @end deffn
1316
1317 @deffn {Scheme Procedure} read-response-body r
1318 Read the response body from @var{r}, as a bytevector. Returns @code{#f}
1319 if there was no response body.
1320 @end deffn
1321
1322 @deffn {Scheme Procedure} write-response-body r bv
1323 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1324 response @var{r}.
1325 @end deffn
1326
1327 As with requests, the various headers that are typically associated with
1328 HTTP responses may be accessed with these dedicated accessors.
1329 @xref{HTTP Headers}, for more information on the format of parsed
1330 headers.
1331
1332 @deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1333 @deffnx {Scheme Procedure} response-age response [default='()]
1334 @deffnx {Scheme Procedure} response-allow response [default='()]
1335 @deffnx {Scheme Procedure} response-cache-control response [default='()]
1336 @deffnx {Scheme Procedure} response-connection response [default='()]
1337 @deffnx {Scheme Procedure} response-content-encoding response [default='()]
1338 @deffnx {Scheme Procedure} response-content-language response [default='()]
1339 @deffnx {Scheme Procedure} response-content-length response [default=#f]
1340 @deffnx {Scheme Procedure} response-content-location response [default=#f]
1341 @deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1342 @deffnx {Scheme Procedure} response-content-range response [default=#f]
1343 @deffnx {Scheme Procedure} response-content-type response [default=#f]
1344 @deffnx {Scheme Procedure} response-date response [default=#f]
1345 @deffnx {Scheme Procedure} response-etag response [default=#f]
1346 @deffnx {Scheme Procedure} response-expires response [default=#f]
1347 @deffnx {Scheme Procedure} response-last-modified response [default=#f]
1348 @deffnx {Scheme Procedure} response-location response [default=#f]
1349 @deffnx {Scheme Procedure} response-pragma response [default='()]
1350 @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1351 @deffnx {Scheme Procedure} response-retry-after response [default=#f]
1352 @deffnx {Scheme Procedure} response-server response [default=#f]
1353 @deffnx {Scheme Procedure} response-trailer response [default='()]
1354 @deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1355 @deffnx {Scheme Procedure} response-upgrade response [default='()]
1356 @deffnx {Scheme Procedure} response-vary response [default='()]
1357 @deffnx {Scheme Procedure} response-via response [default='()]
1358 @deffnx {Scheme Procedure} response-warning response [default='()]
1359 @deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
1360 Return the given response header, or @var{default} if none was present.
1361 @end deffn
1362
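A response can be built and inspected in the same way as a request. The
following is a minimal sketch with sample values; @code{resp} is an
illustrative name. It relies on the behavior described above, namely
that a missing reason phrase falls back to the standard phrase for the
code.

@example
(use-modules (web response))

(define resp
  (build-response
   #:code 404
   #:headers '((content-type . (text/plain (charset . "utf-8"))))))

(response-code resp)           ; @result{} 404
(response-reason-phrase resp)  ; @result{} "Not Found"
(response-content-type resp)   ; @result{} (text/plain (charset . "utf-8"))
;; No etag header was given, so the default is returned.
(response-etag resp 'no-etag)  ; @result{} no-etag

;; A 304 response is specified as never having a body.
(response-must-not-include-body? (build-response #:code 304))
@end example
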
1363
1364 @node Web Client
1365 @subsection Web Client
1366
1367 @code{(web client)} provides a simple, synchronous HTTP client, built on
1368 the lower-level HTTP, request, and response modules.
1369
1370 @deffn {Scheme Procedure} open-socket-for-uri uri
Return an open input/output port for a connection to @var{uri}.
1371 @end deffn
1372
1373 @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1374 Connect to the server corresponding to @var{uri} and ask for the
1375 resource, using the @code{GET} method. If you already have a port open,
1376 pass it as @var{port}. The port will be closed at the end of the
1377 request unless @var{keep-alive?} is true. Any extra headers in the
1378 alist @var{extra-headers} will be added to the request.
1379
1380 If @var{decode-body?} is true, as is the default, the body of the
1381 response will be decoded to string, if it is a textual content-type.
1382 Otherwise it will be returned as a bytevector.
1383 @end deffn
1384
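A one-off request might look like the sketch below. It assumes that
@code{http-get} returns two values, the response object and the body,
which the description above does not spell out. @code{fetch} is a
hypothetical helper and the URL is only an example.

@example
(use-modules (web client) (web uri) (web response))

(define (fetch url)
  ;; Return two values for URL, given as a string: the status code
  ;; and the body (a string or a bytevector, depending on the
  ;; content-type of the response).
  (call-with-values
      (lambda () (http-get (string->uri url)))
    (lambda (response body)
      (values (response-code response) body))))

;; Needs network access, so it is left commented out here:
;; (fetch "http://www.gnu.org/")
@end example
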
1385 @code{http-get} is useful for making one-off requests to web sites. If
1386 you are writing a web spider or some other client that needs to handle a
1387 number of requests in parallel, it's better to build an event-driven URL
1388 fetcher, similar in structure to the web server (@pxref{Web Server}).
1389
1390 Another option, good but not as performant, would be to use threads,
1391 possibly via @code{par-map} or futures.
1392
1393 More helper procedures for the other common HTTP verbs would be a good
1394 addition to this module. Send your code to
1395 @email{guile-user@@gnu.org}.
1396
1397
1398 @node Web Server
1399 @subsection Web Server
1400
1401 @code{(web server)} is a generic web server interface, along with a main
1402 loop implementation for web servers controlled by Guile.
1403
1404 @example
1405 (use-modules (web server))
1406 @end example
1407
1408 The lowest layer is the @code{<server-impl>} object, which defines a set
1409 of hooks to open a server, read a request from a client, write a
1410 response to a client, and close a server. These hooks -- @code{open},
1411 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1412 together in a @code{<server-impl>} object. Procedures in this module take a
1413 @code{<server-impl>} object, if needed.
1414
1415 A @code{<server-impl>} may also be looked up by name. If you pass the
1416 @code{http} symbol to @code{run-server}, Guile looks for a variable
1417 named @code{http} in the @code{(web server http)} module, which should
1418 be bound to a @code{<server-impl>} object. Such a binding is made by
1419 instantiation of the @code{define-server-impl} syntax. In this way the
1420 run-server loop can automatically load other backends if available.
1421
1422 The life cycle of a server goes as follows:
1423
1424 @enumerate
1425 @item
1426 The @code{open} hook is called, to open the server. @code{open} takes 0 or
1427 more arguments, depending on the backend, and returns an opaque
1428 server socket object, or signals an error.
1429
1430 @item
1431 The @code{read} hook is called, to read a request from a new client.
1432 The @code{read} hook takes one argument, the server socket. It should
1433 return three values: an opaque client socket, the request, and the
1434 request body. The request should be a @code{<request>} object, from
1435 @code{(web request)}. The body should be a string or a bytevector, or
1436 @code{#f} if there is no body.
1437
1438 If the read failed, the @code{read} hook may return @code{#f} for the client
1439 socket, request, and body.
1440
1441 @item
1442 A user-provided handler procedure is called, with the request and body
1443 as its arguments. The handler should return two values: the response,
1444 as a @code{<response>} record from @code{(web response)}, and the
1445 response body as bytevector, or @code{#f} if not present.
1446
1447 The respose and response body are run through @code{sanitize-response},
1448 documented below. This allows the handler writer to take some
1449 convenient shortcuts: for example, instead of a @code{<response>}, the
1450 handler can simply return an alist of headers, in which case a default
1451 response object is constructed with those headers. Instead of a
1452 bytevector for the body, the handler can return a string, which will be
1453 serialized into an appropriate encoding; or it can return a procedure,
1454 which will be called on a port to write out the data. See the
1455 @code{sanitize-response} documentation, for more.
1456
1457 @item
1458 The @code{write} hook is called with three arguments: the client
1459 socket, the response, and the body. The @code{write} hook returns no
1460 values.
1461
1462 @item
1463 At this point the request handling is complete. In a server loop, we
1464 go back and try to read a new request.
1465
1466 @item
1467 If the user interrupts the loop, the @code{close} hook is called on
1468 the server socket.
1469 @end enumerate
1470
1471 A user may define a server implementation with the following form:
1472
1473 @deffn {Scheme Procedure} define-server-impl name open read write close
1474 Make a @code{<server-impl>} object with the hooks @var{open},
1475 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1476 @var{name} in the current module.
1477 @end deffn
1478
1479 @deffn {Scheme Procedure} lookup-server-impl impl
1480 Look up a server implementation. If @var{impl} is a server
1481 implementation already, it is returned directly. If it is a symbol, the
1482 binding named @var{impl} in the @code{(web server @var{impl})} module is
1483 looked up. Otherwise an error is signaled.
1484
1485 Currently a server implementation is a somewhat opaque type, useful only
1486 for passing to other procedures in this module, like @code{read-client}.
1487 @end deffn
1488
1489 The @code{(web server)} module defines a number of routines that use
1490 @code{<server-impl>} objects to implement parts of a web server. Given
1491 that we don't expose the accessors for the various fields of a
1492 @code{<server-impl>}, indeed these routines are the only procedures with
1493 any access to the impl objects.
1494
1495 @deffn {Scheme Procedure} open-server impl open-params
1496 Open a server for the given implementation. Return one value, the new
1497 server object. The implementation's @code{open} procedure is applied to
1498 @var{open-params}, which should be a list.
1499 @end deffn
1500
1501 @deffn {Scheme Procedure} read-client impl server
1502 Read a new client from @var{server}, by applying the implementation's
1503 @code{read} procedure to the server. If successful, return three
1504 values: an object corresponding to the client, a request object, and the
1505 request body. If any exception occurs, return @code{#f} for all three
1506 values.
1507 @end deffn
1508
1509 @deffn {Scheme Procedure} handle-request handler request body state
1510 Handle a given request, returning the response and body.
1511
1512 The response and response body are produced by calling the given
1513 @var{handler} with @var{request} and @var{body} as arguments.
1514
1515 The elements of @var{state} are also passed to @var{handler} as
1516 arguments, and may be returned as additional values. The new
1517 @var{state}, collected from the @var{handler}'s return values, is then
1518 returned as a list. The idea is that a server loop receives a handler
1519 from the user, along with whatever state values the user is interested
1520 in, allowing the user's handler to explicitly manage its state.
1521 @end deffn
1522
1523 @deffn {Scheme Procedure} sanitize-response request response body
1524 ``Sanitize'' the given response and body, making them appropriate for the
1525 given request.
1526
1527 As a convenience to web handler authors, @var{response} may be given as
1528 an alist of headers, in which case it is used to construct a default
1529 response. Ensures that the response version corresponds to the request
1530 version. If @var{body} is a string, encodes the string to a bytevector,
1531 in an encoding appropriate for @var{response}. Adds a
1532 @code{content-length} and @code{content-type} header, as necessary.
1533
1534 If @var{body} is a procedure, it is called with a port as an argument,
1535 and the output collected as a bytevector. In the future we might try to
1536 instead use a compressing, chunk-encoded port, and call this procedure
1537 later, in the write-client procedure. Authors are advised not to rely on
1538 the procedure being called at any particular time.
1539 @end deffn
1540
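To see the shortcuts in action, @code{sanitize-response} can be called
directly on a hand-built request. This is a sketch under the assumption
that the procedure behaves as described above; @code{req} and
@code{sanitized} are illustrative names and the values are sample data.

@example
(use-modules (web server) (web request) (web response) (web uri)
             (rnrs bytevectors))

(define req
  (build-request (string->uri "http://localhost/")
                 #:headers '((host . ("localhost" . #f)))))

;; Pass an alist of headers instead of a response object, and a
;; string instead of a bytevector.  Collect the interesting parts of
;; the two return values into a list.
(define sanitized
  (call-with-values
      (lambda ()
        (sanitize-response req '((content-type . (text/plain))) "hello"))
    (lambda (response body)
      (list (response-code response)
            (response-content-length response)
            (utf8->string body)))))
@end example

One would expect @code{sanitized} to hold the default code 200, a
content length of 5, and the original text, now recovered from the
bytevector body.
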
1541 @deffn {Scheme Procedure} write-client impl server client response body
1542 Write an HTTP response and body to @var{client}. If the server and
1543 client support persistent connections, it is the implementation's
1544 responsibility to keep track of the client thereafter, presumably by
1545 attaching it to the @var{server} argument somehow.
1546 @end deffn
1547
1548 @deffn {Scheme Procedure} close-server impl server
1549 Release resources allocated by a previous invocation of
1550 @code{open-server}.
1551 @end deffn
1552
1553 Given the procedures above, it is a small matter to make a web server:
1554
1555 @deffn {Scheme Procedure} serve-one-client handler impl server state
1556 Read one request from @var{server}, call @var{handler} on the request
1557 and body, and write the response to the client. Return the new state
1558 produced by the handler procedure.
1559 @end deffn
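
As a rough sketch of how these pieces fit together, the loop below opens
a server, calls @code{serve-one-client} a fixed number of times, and
then closes the server. The name @code{serve-n-clients} is hypothetical,
and the sketch assumes that @code{lookup-server-impl} maps the symbol
@code{http} to the default implementation. It omits the interrupt
handling that a real server loop would want.

@example
(use-modules (web server))

;; Hypothetical helper: run COUNT iterations of serve-one-client with
;; the default `http' implementation, then release the server.
;; Returns the final state list.
(define (serve-n-clients handler count open-params state)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl open-params)))
    (let loop ((remaining count) (state state))
      (if (zero? remaining)
          (begin
            (close-server impl server)
            state)
          (loop (- remaining 1)
                (serve-one-client handler impl server state))))))
@end example

Each iteration blocks until the implementation has a request to hand
over, so a call with a positive @var{count} only returns after clients
have connected.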

@deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
Run Guile's built-in web server.

@var{handler} should be a procedure that takes two or more arguments,
the HTTP request and request body, and returns two or more values, the
response and response body.

For examples, skip ahead to the next section, @ref{Web Examples}.

The response and body will be run through @code{sanitize-response}
before being sent back to the client.

Additional arguments to @var{handler} are taken from @var{state}.
Additional return values are accumulated into a new @var{state}, which
will be used for subsequent requests. In this way a handler can
explicitly manage its state.
@end deffn

The default web server implementation is @code{http}, which binds to a
socket and listens for requests on it.

@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket]
The default HTTP implementation. We document it as a function with
keyword arguments because that is precisely what it is: all of the
@var{open-params} to @code{run-server} get passed to the
implementation's open function.

@example
;; The defaults: localhost:8080
(run-server handler)
;; Same thing
(run-server handler 'http '())
;; On a different port
(run-server handler 'http '(#:port 8081))
;; IPv6
(run-server handler 'http '(#:family AF_INET6 #:port 8081))
;; Custom socket
(run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
@end example
@end deffn
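
The @code{sudo-make-me-a-socket} call in the last example is a
placeholder. One possible way to build a socket for the @code{#:socket}
parameter is sketched below; @code{make-loopback-socket} is a
hypothetical name. The sketch does not call @code{listen}, on the
assumption that the @code{http} implementation does so when it opens
the server.

@example
;; Hypothetical helper: create a TCP socket bound to PORT on the
;; loopback interface, suitable for passing as #:socket.
(define (make-loopback-socket port)
  (let ((sock (socket PF_INET SOCK_STREAM 0)))
    ;; Allow quick restarts on the same port.
    (setsockopt sock SOL_SOCKET SO_REUSEADDR 1)
    (bind sock AF_INET INADDR_LOOPBACK port)
    sock))

;; Possible use, with `handler' being your own handler procedure:
;; (run-server handler 'http `(#:socket ,(make-loopback-socket 8082)))
@end example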

@node Web Examples
@subsection Web Examples

Well, enough about the tedious internals. Let's make a web application!

@subsubsection Hello, World!

The first program we have to write, of course, is ``Hello, World!''.
This means that we have to implement a web handler that does what we
want.

A handler is a function of two arguments and two return values:

@example
(define (handler request request-body)
  (values @var{response} @var{response-body}))
@end example

In this first example, we take advantage of a short-cut, returning an
alist of headers instead of a proper response object. The response body
is our payload:

@example
(define (hello-world-handler request request-body)
  (values '((content-type . (text/plain)))
          "Hello World!"))
@end example

Now let's test it. Load up the web server module if you haven't yet
done so, and run a server with this handler:

@example
(use-modules (web server))
(run-server hello-world-handler)
@end example

By default, the web server listens for requests on
@code{localhost:8080}. Visit that address in your web browser to
test. If you see the string, @code{Hello World!}, sweet!
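
A handler is an ordinary procedure, so it can also be exercised from the
REPL without a browser or a running server. The following sketch
assumes a hand-built @code{GET} request is an acceptable stand-in for a
real one; @code{try-handler} is a hypothetical name.

@example
(use-modules (web request)
             (web uri))

;; Hypothetical helper: call HANDLER with a GET request for
;; URI-STRING and no request body, returning the handler's values
;; as a list.
(define (try-handler handler uri-string)
  (call-with-values
      (lambda ()
        (handler (build-request (string->uri uri-string)) #f))
    list))
@end example

For instance, @code{(try-handler hello-world-handler
"http://localhost:8080/")} returns a two-element list holding the
header alist and the string @code{"Hello World!"}. Note that this
bypasses @code{sanitize-response}, so what you see is exactly what the
handler returned.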

@subsubsection Inspecting the Request

The Hello World program above is a general greeter, responding to all
URIs. To make a more exclusive greeter, we need to inspect the request
object, and conditionally produce different results. So let's load up
the request, response, and URI modules, and do just that.

@example
(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . (text/plain)))
              "Hello hacker!")
      (not-found request)))

(run-server hello-hacker-handler)
@end example

Here we see that we have defined a helper to return the components of
the URI path as a list of strings, and used that to check for a request
to @code{/hacker/}. The success case is just as before: visit
@code{http://localhost:8080/hacker/} in your browser to check.

You should always match against URI path components as decoded by
@code{split-and-decode-uri-path}. The above example will work for
@code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
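
The equivalence of those three paths is easy to check at the REPL. In
this small sketch, @code{equivalent-paths} and @code{decoded} are names
introduced only for the illustration:

@example
(use-modules (web uri))

;; Three spellings of the same resource.
(define equivalent-paths
  '("/hacker/" "//hacker///" "/h%61ck%65r"))

;; Empty components are dropped and percent-escapes are decoded, so
;; each path yields the same list of components.
(define decoded
  (map split-and-decode-uri-path equivalent-paths))

decoded
@result{} (("hacker") ("hacker") ("hacker"))
@end example

Comparing the raw path string against @code{"/hacker/"} would match
only the first of the three.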

But we forgot to define @code{not-found}! If you are pasting these
examples into a REPL, accessing any other URI in your web browser will
drop your Guile console into the debugger:

@example
<unnamed port>:38:7: In procedure module-lookup:
<unnamed port>:38:7: Unbound variable: not-found

Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@@(guile-user) [1]>
@end example

So let's define the function, right there in the debugger. As you
probably know, we'll want to return a 404 response.

@example
;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))

;; Now paste this to let the web server keep going:
,continue
@end example

Now if you access @code{http://localhost:8080/foo/}, you get this error
message. (Note that some popular web browsers won't show
server-generated 404 messages, showing their own instead, unless the 404
message body is long enough.)

@subsubsection Higher-Level Interfaces

The web handler interface is a common baseline that all kinds of Guile
web applications can use. You will usually want to build something on
top of it, however, especially when producing HTML. Here is a simple
example that builds up HTML output using SXML (@pxref{sxml simple}).

First, load up the modules:

@example
(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))
@end example

Now we define a simple templating function that takes a list of HTML
body elements, as SXML, and puts them in our super template:

@example
(define (templatize title body)
  `(html (head (title ,title))
         (body ,@@body)))
@end example

For example, the simplest Hello HTML can be produced like this:

@example
(sxml->xml (templatize "Hello!" '((b "Hi!"))))
@print{}
<html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
@end example

It is much better to work with Scheme data types than with HTML as
strings. Now we define a little response helper:

@example
(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '((charset . "utf-8")))
                  (content-type 'text/html)
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@@content-type-params))
                       ,@@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))
@end example

Here we see the power of keyword arguments with default initializers. By
the time the arguments are fully parsed, the @code{sxml} local variable
will hold the templated SXML, ready for sending out to the client.

Also, instead of returning the body as a string, @code{respond} gives a
procedure, which will be called by the web server to write out the
response to the client.
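
To inspect what such a body procedure would write, you can call it
yourself with a string port. The sketch below uses a hand-written
procedure of the same shape as the one @code{respond} returns;
@code{render-body} and @code{rendered} are hypothetical names, and
collecting the output into a string is only a debugging aid, not a
description of how the server consumes the procedure.

@example
(use-modules (sxml simple))

;; Hypothetical helper: run BODY-PROC against a string port and
;; return everything it wrote.
(define (render-body body-proc)
  (call-with-output-string body-proc))

;; A body procedure that writes a doctype followed by some SXML.
(define rendered
  (render-body
   (lambda (port)
     (display "<!DOCTYPE html>\n" port)
     (sxml->xml '(p "Hi!") port))))
@end example

Here @code{rendered} holds the doctype line followed by
@code{<p>Hi!</p>}.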

Now, a simple example using this responder, which lays out the incoming
headers in an HTML table.

@example
(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))

(run-server debug-page)
@end example

Now if you visit any local address in your web browser, you will
actually see some HTML, finally.

@subsubsection Conclusion

Well, this is about as far as Guile's built-in web support goes, for
now. There are many ways to make a web application, but we hope that
standardizing the most fundamental data types will let users choose the
approach that suits them best, while also being able to switch between
implementations of the server. This is a relatively new part of Guile,
so if you have feedback, let us know, and we can take it into account.
Happy hacking on the web!

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: