1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
5
6 @node Web
7 @section @acronym{HTTP}, the Web, and All That
8 @cindex Web
9 @cindex WWW
10 @cindex HTTP
11
12 It has always been possible to connect computers together and share
13 information between them, but the rise of the World-Wide Web over the
14 last couple of decades has made it much easier to do so. The result is
15 a richly connected network of computation, of which Guile forms a part.
16
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
20 message components that can be sent and received by that protocol,
21 notably HTML.
22
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
29
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
32 highest-level perspective, @pxref{Web Examples}, and work your way
33 back.
34
35 @menu
36 * Types and the Web:: Types prevent bugs and security problems.
37 * URIs:: Uniform Resource Identifiers.
38 * HTTP:: The Hypertext Transfer Protocol.
39 * HTTP Headers:: How Guile represents specific header values.
40 * Requests:: HTTP requests.
41 * Responses:: HTTP responses.
42 * Web Client:: Accessing web resources over HTTP.
43 * Web Server:: Serving HTTP to the internet.
44 * Web Examples:: How to use this thing.
45 @end menu
46
47 @node Types and the Web
48 @subsection Types and the Web
49
50 It is a truth universally acknowledged, that a program with good use of
51 data types, will be free from many common bugs. Unfortunately, the
52 common practice in web programming seems to ignore this maxim. This
53 subsection makes the case for expressive data types in web programming.
54
55 By ``expressive data types'', we mean that the data types @emph{say}
56 something about how a program solves a problem. For example, if we
57 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
58 this indicates that there is a part of the program that will always have
59 valid dates. Error handling for a number of basic cases, like invalid
60 dates, occurs on the boundary in which we produce a SRFI 19 date record
61 from other types, like strings.
62
63 With regard to the web, data types are helpful in the two broad phases
64 of HTTP messages: parsing and generation.
65
66 Consider a server, which has to parse a request, and produce a response.
67 Guile will parse the request into an HTTP request object
68 (@pxref{Requests}), with each header parsed into an appropriate Scheme
69 data type. This transition from an incoming stream of characters to
70 typed data is a state change in a program---the strings might parse, or
71 they might not, and something has to happen if they do not. (Guile
72 throws an error in this case.) But after you have the parsed request,
73 ``client'' code (code built on top of the Guile web framework) will not
74 have to check for syntactic validity. The types already make this
75 information manifest.
76
77 This state change on the parsing boundary makes programs more robust,
78 as they themselves are freed from the need to do a number of common
79 error checks, and they can use normal Scheme procedures to handle a
80 request instead of ad-hoc string parsers.
81
82 The need for types on the response generation side (in a server) is more
83 subtle, though not less important. Consider the example of a POST
84 handler, which prints out the text that a user submits from a form.
85 Such a handler might include a procedure like this:
86
87 @example
88 ;; First, a helper procedure
89 (define (para . contents)
90 (string-append "<p>" (string-concatenate contents) "</p>"))
91
92 ;; Now the meat of our simple web application
93 (define (you-said text)
94 (para "You said: " text))
95
96 (display (you-said "Hi!"))
97 @print{} <p>You said: Hi!</p>
98 @end example
99
100 This is a perfectly valid implementation, provided that the incoming
101 text does not contain the special HTML characters @samp{<}, @samp{>}, or
102 @samp{&}. But this provision of a restricted character set is not
103 reflected anywhere in the program itself: we must @emph{assume} that the
104 programmer understands this, and performs the check elsewhere.
105
106 Unfortunately, the short history of the practice of programming does not
107 bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
108 vulnerability is just such a common error in which unfiltered user input
109 is allowed into the output. A user could submit a crafted comment to
110 your web site which results in visitors running malicious JavaScript,
111 within the security context of your domain:
112
113 @example
114 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
115 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
116 @end example
117
118 The fundamental problem here is that both user data and the program
119 template are represented using strings. This identity means that types
120 can't help the programmer to make a distinction between these two, so
121 they get confused.
122
123 There are a number of possible solutions, but perhaps the best is to
124 treat HTML not as strings, but as native s-expressions: as SXML. The
125 basic idea is that HTML is either text, represented by a string, or an
126 element, represented as a tagged list. So @samp{foo} becomes
127 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
128 Attributes, if present, go in a tagged list headed by @samp{@@}, like
129 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml
130 simple}, for more information.
131
132 The good thing about SXML is that HTML elements cannot be confused with
133 text. Let's make a new definition of @code{para}:
134
135 @example
136 (define (para . contents)
137 `(p ,@@contents))
138
139 (use-modules (sxml simple))
140 (sxml->xml (you-said "Hi!"))
141 @print{} <p>You said: Hi!</p>
142
143 (sxml->xml (you-said "<i>Rats, foiled again!</i>"))
144 @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
145 @end example
146
147 So we see in the second example that HTML elements cannot be unwittingly
148 introduced into the output. However it is now perfectly acceptable to
149 pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
150 over everything-as-a-string.
151
152 @example
153 (sxml->xml (you-said (you-said "<Hi!>")))
154 @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
155 @end example
156
157 The SXML types allow procedures to @emph{compose}. The types make
158 manifest which parts are HTML elements, and which are text. So you
159 needn't worry about escaping user input; the type transition back to a
160 string handles that for you. @acronym{XSS} vulnerabilities are a thing
161 of the past.
162
163 Well. That's all very nice and opinionated and such, but how do I use
164 the thing? Read on!
165
166 @node URIs
167 @subsection Uniform Resource Identifiers
168
169 Guile provides a standard data type for Uniform Resource Identifiers
170 (URIs), as defined in RFC 3986.
171
172 The generic URI syntax is as follows:
173
174 @example
175 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
176 [ "?" query ] [ "#" fragment ]
177 @end example
178
179 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
180 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
181 @code{/help/}, and there is no userinfo, port, query, or fragment. All
182 URIs have a scheme and a path (though the path might be empty). Some
183 URIs have a host, and some of those have ports and userinfo. Any URI
184 might have a query part or a fragment.
185
186 Userinfo is something of an abstraction, as some legacy URI schemes
187 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
188 since passwords do not belong in URIs, the RFC does not want to condone
189 this practice, so it calls anything before the @code{@@} sign
190 @dfn{userinfo}.
191
192 Properly speaking, a fragment is not part of a URI. For example, when a
193 web browser follows a link to @indicateurl{http://example.com/#foo}, it
194 sends a request for @indicateurl{http://example.com/}, then looks in the
195 resulting page for the part identified by the fragment @code{foo}. A
196 fragment identifies a part of a resource, not the resource itself. But
197 it is useful to have a fragment field in the URI record itself, so we
198 hope you will forgive the inconsistency.
199
200 @example
201 (use-modules (web uri))
202 @end example
203
204 The following procedures can be found in the @code{(web uri)}
205 module. Load it into your Guile, using a form like the above, to have
206 access to them.
207
208 @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
209 [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
210 [#:fragment=@code{#f}] [#:validate?=@code{#t}]
211 Construct a URI object. @var{scheme} should be a symbol, and the rest
212 of the fields are either strings or @code{#f}. If @var{validate?} is
213 true, also run some consistency checks to make sure that the constructed
214 URI is valid.
215 @end deffn
216
217 @deffn {Scheme Procedure} uri? x
218 @deffnx {Scheme Procedure} uri-scheme uri
219 @deffnx {Scheme Procedure} uri-userinfo uri
220 @deffnx {Scheme Procedure} uri-host uri
221 @deffnx {Scheme Procedure} uri-port uri
222 @deffnx {Scheme Procedure} uri-path uri
223 @deffnx {Scheme Procedure} uri-query uri
224 @deffnx {Scheme Procedure} uri-fragment uri
225 A predicate and field accessors for the URI record type. The URI scheme
226 will be a symbol, and the rest either strings or @code{#f} if not
227 present.
228 @end deffn
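
As a minimal sketch of the constructor and accessors described above,
the following rebuilds the @code{http://www.gnu.org/help/} example from
its components. The variable name @code{gnu-help} is only illustrative;
the comments restate what the field descriptions above lead us to
expect.

```scheme
(use-modules (web uri))

;; Build the URI from the earlier example out of its components.
;; The scheme is a symbol; the other fields are strings or #f.
(define gnu-help
  (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

(uri? gnu-help)        ; true: this is a URI record
(uri-scheme gnu-help)  ; the symbol http
(uri-host gnu-help)    ; "www.gnu.org"
(uri-path gnu-help)    ; "/help/"
(uri-port gnu-help)    ; #f, since no port was given
(uri-query gnu-help)   ; #f, since no query was given
```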
229
230 @deffn {Scheme Procedure} string->uri string
231 Parse @var{string} into a URI object. Return @code{#f} if the string
232 could not be parsed.
233 @end deffn
234
235 @deffn {Scheme Procedure} uri->string uri
236 Serialize @var{uri} to a string. If the URI has a port that is the
237 default port for its scheme, the port is not included in the
238 serialization.
239 @end deffn
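
Because @code{string->uri} returns @code{#f} rather than signalling an
error, it makes a convenient parsing boundary in the sense discussed
earlier in this section. The sketch below assumes a hypothetical helper
named @code{normalize-uri-string}, which is not part of Guile; it
parses a string and serializes it again, or returns @code{#f} if the
string is not a URI.

```scheme
(use-modules (web uri))

;; Hypothetical helper, not part of (web uri): parse STR and write it
;; back out, or return #f if STR could not be parsed as a URI.
(define (normalize-uri-string str)
  (let ((uri (string->uri str)))
    (and uri (uri->string uri))))

(normalize-uri-string "http://www.gnu.org/help/")
;; a string such as "http://www.gnu.org/help/"

(normalize-uri-string "not a uri")
;; #f: there is no scheme, so the string does not parse
```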
240
241 @deffn {Scheme Procedure} declare-default-port! scheme port
242 Declare a default port for the given URI scheme.
243 @end deffn
244
245 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
246 Percent-decode the given @var{str}, according to @var{encoding}, which
247 should be the name of a character encoding.
248
249 Note that this function should not generally be applied to a full URI
250 string. For paths, use @code{split-and-decode-uri-path} instead. For query
251 strings, split the query on @code{&} and @code{=} boundaries, and decode
252 the components separately.
253
254 Note also that percent-encoded strings encode @emph{bytes}, not
255 characters. There is no guarantee that a given byte sequence is a valid
256 string encoding. Therefore this routine may signal an error if the
257 decoded bytes are not valid for the given encoding. Pass @code{#f} for
258 @var{encoding} if you want decoded bytes as a bytevector directly.
259 @xref{Ports, @code{set-port-encoding!}}, for more information on
260 character encodings.
261
262 Returns a string of the decoded characters, or a bytevector if
263 @var{encoding} was @code{#f}.
264 @end deffn
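
The advice above about query strings can be sketched as follows. The
procedure @code{parse-query} is a hypothetical helper, not something
provided by @code{(web uri)}; it splits on @code{&} and on the first
@code{=} of each component, and only then decodes the pieces.

```scheme
(use-modules (web uri))

;; Hypothetical helper: turn a raw query string into an alist of
;; decoded (key . value) pairs.  A component with no "=" gets #f as
;; its value.  Decoding happens only after splitting, so an encoded
;; "&" or "=" inside a key or value is not mistaken for a delimiter.
(define (parse-query query)
  (map (lambda (component)
         (let ((idx (string-index component #\=)))
           (if idx
               (cons (uri-decode (substring component 0 idx))
                     (uri-decode (substring component (+ idx 1))))
               (cons (uri-decode component) #f))))
       (string-split query #\&)))

(parse-query "q=scrambled%20eggs&lang=en")
;; (("q" . "scrambled eggs") ("lang" . "en"))
```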
265
266 Fixme: clarify return type. indicate default values. type of
267 unescaped-chars.
268
269 @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
270 Percent-encode any character not in the character set,
271 @var{unescaped-chars}.
272
273 The default character set includes alphanumerics from ASCII, as well as
274 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
275 other character will be percent-encoded, by writing out the character to
276 a bytevector within the given @var{encoding}, then encoding each byte as
277 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
278 the byte.
279 @end deffn
280
281 @deffn {Scheme Procedure} split-and-decode-uri-path path
282 Split @var{path} into its components, and decode each component,
283 removing empty components.
284
285 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
286 @code{("foo" "bar baz")}.
287 @end deffn
288
289 @deffn {Scheme Procedure} encode-and-join-uri-path parts
290 URI-encode each element of @var{parts}, which should be a list of
291 strings, and join the parts together with @code{/} as a delimiter.
292
293 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
294 as @code{"scrambled%20eggs/biscuits%26gravy"}.
295 @end deffn
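
The two path procedures are inverses of one another for lists of
non-empty strings. The following sketch, which assumes the illustrative
host @code{example.com}, builds a path from a list of parts, checks that
it decodes back to the same list, and places it in a URI. Note that
@code{encode-and-join-uri-path} does not add a leading slash, so we add
one ourselves.

```scheme
(use-modules (web uri))

(define parts '("scrambled eggs" "biscuits&gravy"))

;; Encode each part and join with "/", then add the leading slash
;; that an absolute path needs.
(define path
  (string-append "/" (encode-and-join-uri-path parts)))
;; path is "/scrambled%20eggs/biscuits%26gravy"

;; Splitting and decoding recovers the original parts.
(equal? (split-and-decode-uri-path path) parts)

;; The encoded path can be used directly as a URI field.
(uri->string (build-uri 'http #:host "example.com" #:path path))
```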
296
297 @node HTTP
298 @subsection The Hypertext Transfer Protocol
299
300 The initial motivation for including web functionality in Guile, rather
301 than rely on an external package, was to establish a standard base on
302 which people can share code. To that end, we continue the focus on data
303 types by providing a number of low-level parsers and unparsers for
304 elements of the HTTP protocol.
305
306 If you want to skip the low-level details for now and move on to web
307 pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
308 HTTP module, and read on.
309
310 @example
311 (use-modules (web http))
312 @end example
313
314 The focus of the @code{(web http)} module is to parse and unparse
315 standard HTTP headers, representing them to Guile as native data
316 structures. For example, a @code{Date:} header will be represented as a
317 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
318
319 Guile tries to follow RFCs fairly strictly---the road to perdition being
320 paved with compatibility hacks---though some allowances are made for
321 not-too-divergent texts.
322
323 Header names are represented as lower-case symbols.
324
325 @deffn {Scheme Procedure} string->header name
326 Parse @var{name} to a symbolic header name.
327 @end deffn
328
329 @deffn {Scheme Procedure} header->string sym
330 Return the string form for the header named @var{sym}.
331 @end deffn
332
333 For example:
334
335 @example
336 (string->header "Content-Length")
337 @result{} content-length
338 (header->string 'content-length)
339 @result{} "Content-Length"
340
341 (string->header "FOO")
342 @result{} foo
343 (header->string 'foo)
344 @result{} "Foo"
345 @end example
346
347 Guile keeps a registry of known headers, their string names, and some
348 parsing and serialization procedures. If a header is unknown, its
349 string name is simply its symbol name in title-case.
350
351 @deffn {Scheme Procedure} known-header? sym
352 Return @code{#t} iff @var{sym} is a known header, with associated
353 parsers and serialization procedures.
354 @end deffn
355
356 @deffn {Scheme Procedure} header-parser sym
357 Return the value parser for headers named @var{sym}. The result is a
358 procedure that takes one argument, a string, and returns the parsed
359 value. If the header isn't known to Guile, a default parser is returned
360 that passes through the string unchanged.
361 @end deffn
362
363 @deffn {Scheme Procedure} header-validator sym
364 Return a predicate which returns @code{#t} if the given value is valid
365 for headers named @var{sym}. The default validator for unknown headers
366 is @code{string?}.
367 @end deffn
368
369 @deffn {Scheme Procedure} header-writer sym
370 Return a procedure that writes values for headers named @var{sym} to a
371 port. The resulting procedure takes two arguments: a value and a port.
372 The default writer is @code{display}.
373 @end deffn
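
As a rough illustration of the registry, the sketch below fetches the
parser, validator, and writer for the @code{content-length} header and
applies each one. The names @code{parse-length}, @code{valid-length?},
and @code{write-length} are our own; the symbol @code{x-made-up} stands
for any header that Guile has no declaration for.

```scheme
(use-modules (web http))

(define parse-length  (header-parser 'content-length))
(define valid-length? (header-validator 'content-length))
(define write-length  (header-writer 'content-length))

(parse-length "300")     ; the exact integer 300
(valid-length? 300)      ; true: a parsed value
(valid-length? "300")    ; #f: a string is not a valid parsed value

;; The writer takes a value and a port; here we capture its output.
(call-with-output-string
  (lambda (port) (write-length 300 port)))

(known-header? 'content-length)  ; #t
(known-header? 'x-made-up)       ; #f
```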
374
375 For more on the set of headers that Guile knows about out of the box,
376 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
377 procedure:
378
379 @deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
380 Declare a parser, validator, and writer for a given header.
381 @end deffn
382
383 For example, let's say you are running a web server behind some sort of
384 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
385 the IPv4 address of the original client. You would like for the HTTP
386 request record to parse out this header to a Scheme value, instead of
387 leaving it as a string. You could register this header with Guile's
388 HTTP stack like this:
389
390 @example
391 (declare-header! "X-Client-Address"
392 (lambda (str)
393 (inet-aton str))
394 (lambda (ip)
395 (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
396 (lambda (ip port)
397 (display (inet-ntoa ip) port)))
398 @end example
399
400 @deffn {Scheme Procedure} valid-header? sym val
401 Return a true value iff @var{val} is a valid Scheme value for the header
402 with name @var{sym}.
403 @end deffn
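
To see the effect of such a declaration, the following sketch repeats
the @code{X-Client-Address} registration from the example above and
then parses and validates a value. It assumes, as that example does,
that the address is IPv4 and that @code{inet-aton} and @code{inet-ntoa}
are available in your Guile.

```scheme
(use-modules (web http))

;; The same declaration as in the example above.
(declare-header! "X-Client-Address"
  (lambda (str) (inet-aton str))
  (lambda (ip) (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port) (display (inet-ntoa ip) port)))

;; The header is now parsed to an integer rather than left as a string.
(parse-header 'x-client-address "127.0.0.1")
;; 2130706433

(valid-header? 'x-client-address 2130706433)   ; true
(valid-header? 'x-client-address "127.0.0.1")  ; #f: not yet parsed
```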
404
405 Now that we have a generic interface for reading and writing headers, we
406 do just that.
407
408 @deffn {Scheme Procedure} read-header port
409 Read one HTTP header from @var{port}. Return two values: the header
410 name and the parsed Scheme value. May raise an exception if the header
411 was known but the value was invalid.
412
413 Returns the end-of-file object for both values if the end of the message
414 headers was reached (i.e., a blank line).
415 @end deffn
416
417 @deffn {Scheme Procedure} parse-header name val
418 Parse @var{val}, a string, with the parser for the header named
419 @var{name}. Returns the parsed value.
420 @end deffn
421
422 @deffn {Scheme Procedure} write-header name val port
423 Write the given header name and value to @var{port}, using the writer
424 from @code{header-writer}.
425 @end deffn
426
427 @deffn {Scheme Procedure} read-headers port
428 Read the headers of an HTTP message from @var{port}, returning the
429 headers as an ordered alist.
430 @end deffn
431
432 @deffn {Scheme Procedure} write-headers headers port
433 Write the given header alist to @var{port}. Doesn't write the final
434 @samp{\r\n}, as the user might want to add another header.
435 @end deffn
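
These procedures work on any port, so string ports are a convenient way
to experiment. The sketch below reads a small, made-up header block
into an alist and writes it back out; the header text is purely
illustrative.

```scheme
(use-modules (web http))

;; Read two headers, terminated by the blank line that ends the
;; header section of a message.
(define headers
  (call-with-input-string
      "Content-Type: text/plain\r\nContent-Length: 300\r\n\r\n"
    read-headers))

;; Header names are symbols, and values are already parsed.
(assq-ref headers 'content-length)  ; 300
(assq-ref headers 'content-type)    ; (text/plain)

;; Serialize the alist again.  The terminating blank line is not
;; written, so we would have to add "\r\n" ourselves.
(call-with-output-string
  (lambda (port) (write-headers headers port)))
```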
436
437 The @code{(web http)} module also has some utility procedures to read
438 and write request and response lines.
439
440 @deffn {Scheme Procedure} parse-http-method str [start] [end]
441 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
442 like @code{GET}.
443 @end deffn
444
445 @deffn {Scheme Procedure} parse-http-version str [start] [end]
446 Parse an HTTP version from @var{str}, returning it as a major-minor
447 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
448 @code{(1 . 1)}.
449 @end deffn
450
451 @deffn {Scheme Procedure} parse-request-uri str [start] [end]
452 Parse a URI from an HTTP request line. Note that URIs in requests do not
453 have to have a scheme or host name. The result is a URI object.
454 @end deffn
455
456 @deffn {Scheme Procedure} read-request-line port
457 Read the first line of an HTTP request from @var{port}, returning three
458 values: the method, the URI, and the version.
459 @end deffn
460
461 @deffn {Scheme Procedure} write-request-line method uri version port
462 Write the first line of an HTTP request to @var{port}.
463 @end deffn
464
465 @deffn {Scheme Procedure} read-response-line port
466 Read the first line of an HTTP response from @var{port}, returning three
467 values: the HTTP version, the response code, and the ``reason phrase''.
468 @end deffn
469
470 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
471 Write the first line of an HTTP response to @var{port}.
472 @end deffn
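
A short sketch of the request-line and response-line procedures
follows, again using string ports. The request text is made up for the
example, and @code{request-summary} is just an illustrative name.

```scheme
(use-modules (web http) (web uri))

(parse-http-method "GET")        ; the symbol GET
(parse-http-version "HTTP/1.1")  ; (1 . 1)

;; read-request-line returns three values: method, URI, and version.
(define request-summary
  (call-with-values
      (lambda ()
        (call-with-input-string "GET /help/ HTTP/1.1\r\n"
          read-request-line))
    (lambda (method uri version)
      (list method (uri-path uri) version))))
;; request-summary is (GET "/help/" (1 . 1))

;; Write the first line of a response into a string.
(call-with-output-string
  (lambda (port)
    (write-response-line '(1 . 1) 200 "OK" port)))
```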
473
474
475 @node HTTP Headers
476 @subsection HTTP Headers
477
478 In addition to defining the infrastructure to parse headers, the
479 @code{(web http)} module defines specific parsers and unparsers for all
480 headers defined in the HTTP/1.1 standard.
481
482 For example, if you receive a header named @samp{Accept-Language} with a
483 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
484 below):
485
486 @example
487 (parse-header 'accept-language "en, es;q=0.8")
488 @result{} ((1000 . "en") (800 . "es"))
489 @end example
490
491 The format of the value for @samp{Accept-Language} headers is defined
492 below, along with all other headers defined in the HTTP standard. (If
493 the header were unknown, the value would have been returned as a
494 string.)
495
496 For brevity, the header definitions below are given in the form,
497 @var{Type} @code{@var{name}}, indicating that values for the header
498 @code{@var{name}} will be of the given @var{Type}. Since Guile
499 internally treats header names in lower case, in this document we give
500 types title-cased names. A short description of each header's
501 purpose and an example follow.
502
503 For full details on the meanings of all of these headers, see the HTTP
504 1.1 standard, RFC 2616.
505
506 @subsubsection HTTP Header Types
507
508 Here we define the types that are used below, when defining headers.
509
510 @deftp {HTTP Header Type} Date
511 A SRFI-19 date.
512 @end deftp
513
514 @deftp {HTTP Header Type} KVList
515 A list whose elements are keys or key-value pairs. Keys are parsed to
516 symbols. Values are strings by default. Non-string values are the
517 exception, and are mentioned explicitly below, as appropriate.
518 @end deftp
519
520 @deftp {HTTP Header Type} SList
521 A list of strings.
522 @end deftp
523
524 @deftp {HTTP Header Type} Quality
525 An exact integer between 0 and 1000. Qualities are used to express
526 preference, given multiple options. An option with a quality of 870,
527 for example, is preferred over an option with quality 500.
528
529 (Qualities are written out over the wire as numbers between 0.0 and
530 1.0, but since the standard only allows three digits after the decimal,
531 it's equivalent to integers between 0 and 1000, so that's what Guile
532 uses.)
533 @end deftp
534
535 @deftp {HTTP Header Type} QList
536 A quality list: a list of pairs, the car of which is a quality, and the
537 cdr a string. Used to express a list of options, along with their
538 qualities.
539 @end deftp
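
Since qualities are plain integers, choosing among the options in a
quality list needs nothing more than numeric comparison. The following
is one possible sketch; @code{preferred-option} is a hypothetical
helper, not part of @code{(web http)}, and it ignores finer points of
content negotiation such as wildcards.

```scheme
(use-modules (web http))

;; Hypothetical helper: return the option with the highest quality in
;; QLIST, or #f if the list is empty.  On a tie, the earlier option
;; wins.
(define (preferred-option qlist)
  (let lp ((rest qlist) (best #f))
    (cond ((null? rest) (and best (cdr best)))
          ((or (not best) (> (caar rest) (car best)))
           (lp (cdr rest) (car rest)))
          (else (lp (cdr rest) best)))))

(preferred-option (parse-header 'accept-language "en;q=0.5, es;q=0.8"))
;; "es"

(preferred-option '((1000 . "en") (800 . "es")))
;; "en"
```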
540
541 @deftp {HTTP Header Type} ETag
542 An entity tag, represented as a pair. The car of the pair is an opaque
543 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
544 tag, and @code{#f} otherwise.
545 @end deftp
546
547 @subsubsection General Headers
548
549 General HTTP headers may be present in any HTTP message.
550
551 @deftypevr {HTTP Header} KVList cache-control
552 A key-value list of cache-control directives. See RFC 2616, for more
553 details.
554
555 If present, parameters to @code{max-age}, @code{max-stale},
556 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
557 integers.
558
559 If present, parameters to @code{private} and @code{no-cache} are parsed
560 as lists of header names, as symbols.
561
562 @example
563 (parse-header 'cache-control "no-cache,no-store")
564 @result{} (no-cache no-store)
565 (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
566 @result{} ((no-cache . (authorization date)) no-store)
567 (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
568 @result{} ((no-cache . (authorization date)) (max-age . 10))
569 @end example
570 @end deftypevr
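
Because a key-value list mixes bare keys with key-value pairs,
@code{assq} alone is not enough to query one. A lookup along the
following lines may be useful; @code{directive-ref} is a hypothetical
helper written for this example, not a procedure exported by Guile.

```scheme
(use-modules (web http))

;; Hypothetical helper: look up KEY in a key-value list.  Returns #t
;; for a bare key, the associated value for a key-value pair, and #f
;; if KEY is absent.
(define (directive-ref kvlist key)
  (let lp ((items kvlist))
    (cond ((null? items) #f)
          ((eq? (car items) key) #t)
          ((and (pair? (car items)) (eq? (caar items) key))
           (cdar items))
          (else (lp (cdr items))))))

(define directives
  (parse-header 'cache-control "no-store,max-age=10"))

(directive-ref directives 'max-age)   ; 10
(directive-ref directives 'no-store)  ; #t
(directive-ref directives 'private)   ; #f
```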
571
572 @deftypevr {HTTP Header} List connection
573 A list of header names that apply only to this HTTP connection, as
574 symbols. Additionally, the symbol @samp{close} may be present, to
575 indicate that the server should close the connection after responding to
576 the request.
577 @example
578 (parse-header 'connection "close")
579 @result{} (close)
580 @end example
581 @end deftypevr
582
583 @deftypevr {HTTP Header} Date date
584 The date that a given HTTP message was originated.
585 @example
586 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
587 @result{} #<date ...>
588 @end example
589 @end deftypevr
590
591 @deftypevr {HTTP Header} KVList pragma
592 A key-value list of implementation-specific directives.
593 @example
594 (parse-header 'pragma "no-cache, broccoli=tasty")
595 @result{} (no-cache (broccoli . "tasty"))
596 @end example
597 @end deftypevr
598
599 @deftypevr {HTTP Header} List trailer
600 A list of header names which will appear after the message body, instead
601 of with the message headers.
602 @example
603 (parse-header 'trailer "ETag")
604 @result{} (etag)
605 @end example
606 @end deftypevr
607
608 @deftypevr {HTTP Header} List transfer-encoding
609 A list of transfer codings, expressed as key-value lists. The only
610 transfer coding defined by the specification is @code{chunked}.
611 @example
612 (parse-header 'transfer-encoding "chunked")
613 @result{} ((chunked))
614 @end example
615 @end deftypevr
616
617 @deftypevr {HTTP Header} List upgrade
618 A list of strings, indicating additional protocols that a server could use
619 in response to a request.
620 @example
621 (parse-header 'upgrade "WebSocket")
622 @result{} ("WebSocket")
623 @end example
624 @end deftypevr
625
626 FIXME: parse out more fully?
627 @deftypevr {HTTP Header} List via
628 A list of strings, indicating the protocol versions and hosts of
629 intermediate servers and proxies. There may be multiple @code{via}
630 headers in one message.
631 @example
632 (parse-header 'via "1.0 venus, 1.1 mars")
633 @result{} ("1.0 venus" "1.1 mars")
634 @end example
635 @end deftypevr
636
637 @deftypevr {HTTP Header} List warning
638 A list of warnings given by a server or intermediate proxy. Each
639 warning is itself a list of four elements: a code, as an exact integer
640 between 0 and 1000, a host as a string, the warning text as a string,
641 and either @code{#f} or a SRFI-19 date.
642
643 There may be multiple @code{warning} headers in one message.
644 @example
645 (parse-header 'warning "123 foo \"core breach imminent\"")
646 @result{} ((123 "foo" "core breach imminent" #f))
647 @end example
648 @end deftypevr
649
650
651 @subsubsection Entity Headers
652
653 Entity headers may be present in any HTTP message, and refer to the
654 resource referenced in the HTTP request or response.
655
656 @deftypevr {HTTP Header} List allow
657 A list of allowed methods on a given resource, as symbols.
658 @example
659 (parse-header 'allow "GET, HEAD")
660 @result{} (GET HEAD)
661 @end example
662 @end deftypevr
663
664 @deftypevr {HTTP Header} List content-encoding
665 A list of content codings, as symbols.
666 @example
667 (parse-header 'content-encoding "gzip")
668 @result{} (gzip)
669 @end example
670 @end deftypevr
671
672 @deftypevr {HTTP Header} List content-language
673 The languages that a resource is in, as strings.
674 @example
675 (parse-header 'content-language "en")
676 @result{} ("en")
677 @end example
678 @end deftypevr
679
680 @deftypevr {HTTP Header} UInt content-length
681 The number of bytes in a resource, as an exact, non-negative integer.
682 @example
683 (parse-header 'content-length "300")
684 @result{} 300
685 @end example
686 @end deftypevr
687
688 @deftypevr {HTTP Header} URI content-location
689 The canonical URI for a resource, in the case that it is also accessible
690 from a different URI.
691 @example
692 (parse-header 'content-location "http://example.com/foo")
693 @result{} #<<uri> ...>
694 @end example
695 @end deftypevr
696
697 @deftypevr {HTTP Header} String content-md5
698 The MD5 digest of a resource.
699 @example
700 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
701 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
702 @end example
703 @end deftypevr
704
705 @deftypevr {HTTP Header} List content-range
706 A range specification, as a list of three elements: the symbol
707 @code{bytes}, either the symbol @code{*} or a pair of integers,
708 indicating the byte range, and either @code{*} or an integer, for the
709 instance length. Used to indicate that a response only includes part of
710 a resource.
711 @example
712 (parse-header 'content-range "bytes 10-20/*")
713 @result{} (bytes (10 . 20) *)
714 @end example
715 @end deftypevr
716
717 @deftypevr {HTTP Header} List content-type
718 The MIME type of a resource, as a symbol, along with any parameters.
719 @example
720 (parse-header 'content-type "text/plain")
721 @result{} (text/plain)
722 (parse-header 'content-type "text/plain;charset=utf-8")
723 @result{} (text/plain (charset . "utf-8"))
724 @end example
725 Note that the @code{charset} parameter is something of a misnomer, and
726 the HTTP specification admits this. It specifies the @emph{encoding} of
727 the characters, not the character set.
728 @end deftypevr
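
Since a parsed @code{content-type} value is a list whose car is the
media type and whose cdr is an alist of parameters, ordinary list
operations suffice to take it apart. A minimal sketch, in which the
variable name @code{ct} is our own:

```scheme
(use-modules (web http))

(define ct
  (parse-header 'content-type "text/plain;charset=utf-8"))

;; The media type is the first element, as a symbol.
(car ct)                       ; text/plain

;; The parameters follow as an alist keyed by symbols.
(assq-ref (cdr ct) 'charset)   ; "utf-8"

;; A missing parameter simply yields #f.
(assq-ref (cdr ct) 'boundary)  ; #f
```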
729
730 @deftypevr {HTTP Header} Date expires
731 The date/time after which the resource given in a response is considered
732 stale.
733 @example
734 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
735 @result{} #<date ...>
736 @end example
737 @end deftypevr
738
739 @deftypevr {HTTP Header} Date last-modified
740 The date/time on which the resource given in a response was last
741 modified.
742 @example
743 (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
744 @result{} #<date ...>
745 @end example
746 @end deftypevr
747
748
749 @subsubsection Request Headers
750
751 Request headers may only appear in an HTTP request, not in a response.
752
753 @deftypevr {HTTP Header} List accept
754 A list of preferred media types for a response. Each element of the
755 list is itself a list, in the same format as @code{content-type}.
756 @example
757 (parse-header 'accept "text/html,text/plain;charset=utf-8")
758 @result{} ((text/html) (text/plain (charset . "utf-8")))
759 @end example
760 Preference is expressed with quality values:
761 @example
762 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
763 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
764 @end example
765 @end deftypevr
766
767 @deftypevr {HTTP Header} QList accept-charset
768 A quality list of acceptable charsets. Note again that what HTTP calls
769 a ``charset'' is what Guile calls a ``character encoding''.
770 @example
771 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
772 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
773 @end example
774 @end deftypevr
775
776 @deftypevr {HTTP Header} QList accept-encoding
777 A quality list of acceptable content codings.
778 @example
779 (parse-header 'accept-encoding "gzip,identity;q=0.8")
780 @result{} ((1000 . "gzip") (800 . "identity"))
781 @end example
782 @end deftypevr
783
784 @deftypevr {HTTP Header} QList accept-language
785 A quality list of acceptable languages.
786 @example
787 (parse-header 'accept-language "cn,en;q=0.75")
788 @result{} ((1000 . "cn") (750 . "en"))
789 @end example
790 @end deftypevr
791
792 @deftypevr {HTTP Header} Pair authorization
793 Authorization credentials. The car of the pair indicates the
794 authentication scheme, like @code{basic}. For basic authentication, the
795 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
796 string. For other authentication schemes, like @code{digest}, the cdr
797 will be a key-value list of credentials.
798 @example
799 (parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
800 @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
801 @end example
802 @end deftypevr
803
804 @deftypevr {HTTP Header} List expect
805 A list of expectations that a client has of a server. The expectations
806 are key-value lists.
807 @example
808 (parse-header 'expect "100-continue")
809 @result{} ((100-continue))
810 @end example
811 @end deftypevr
812
813 @deftypevr {HTTP Header} String from
814 The email address of a user making an HTTP request.
815 @example
816 (parse-header 'from "bob@@example.com")
817 @result{} "bob@@example.com"
818 @end example
819 @end deftypevr
820
821 @deftypevr {HTTP Header} Pair host
822 The host for the resource being requested, as a hostname-port pair. If
823 no port is given, the port is @code{#f}.
824 @example
825 (parse-header 'host "gnu.org:80")
826 @result{} ("gnu.org" . 80)
827 (parse-header 'host "gnu.org")
828 @result{} ("gnu.org" . #f)
829 @end example
830 @end deftypevr
831
832 @deftypevr {HTTP Header} *|List if-match
833 A set of etags, indicating that the request should proceed if and only
834 if the etag of the resource is in that set. Either the symbol @code{*},
835 indicating any etag, or a list of entity tags.
836 @example
837 (parse-header 'if-match "*")
838 @result{} *
839 (parse-header 'if-match "\"asdfadf\"")
840 @result{} (("asdfadf" . #t))
841 (parse-header 'if-match "W/\"asdfadf\"")
842 @result{} (("asdfadf" . #f))
843 @end example
844 @end deftypevr
845
846 @deftypevr {HTTP Header} Date if-modified-since
847 Indicates that a response should proceed if and only if the resource has
848 been modified since the given date.
849 @example
850 (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
851 @result{} #<date ...>
852 @end example
853 @end deftypevr
854
855 @deftypevr {HTTP Header} *|List if-none-match
856 A set of etags, indicating that the request should proceed if and only
857 if the etag of the resource is not in the set. Either the symbol
858 @code{*}, indicating any etag, or a list of entity tags.
859 @example
860 (parse-header 'if-none-match "*")
861 @result{} *
862 @end example
863 @end deftypevr
864
865 @deftypevr {HTTP Header} ETag|Date if-range
866 Indicates that the range request should proceed if and only if the
867 resource matches a modification date or an etag. Either an entity tag,
868 or a SRFI-19 date.
869 @example
870 (parse-header 'if-range "\"original-etag\"")
871 @result{} ("original-etag" . #t)
872 @end example
873 @end deftypevr
874
875 @deftypevr {HTTP Header} Date if-unmodified-since
876 Indicates that a response should proceed if and only if the resource has
877 not been modified since the given date.
878 @example
879 (parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
880 @result{} #<date ...>
881 @end example
882 @end deftypevr
883
884 @deftypevr {HTTP Header} UInt max-forwards
885 The maximum number of proxy or gateway hops that a request should be
886 subject to.
887 @example
888 (parse-header 'max-forwards "10")
889 @result{} 10
890 @end example
891 @end deftypevr
892
893 @deftypevr {HTTP Header} Pair proxy-authorization
894 Authorization credentials for a proxy connection. See the documentation
895 for @code{authorization} above for more information on the format.
896 @example
897 (parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
898 @result{} (digest (foo . "bar") (baz . "qux"))
899 @end example
900 @end deftypevr
901
902 @deftypevr {HTTP Header} Pair range
903 A range request, indicating that the client wants only part of a
904 resource. The car of the pair is the symbol @code{bytes}, and the cdr
905 is a list of pairs. Each element of the cdr indicates a range; the car
906 is the first byte position and the cdr is the last byte position, as
907 integers, or @code{#f} if not given.
908 @example
909 (parse-header 'range "bytes=10-30,50-")
910 @result{} (bytes (10 . 30) (50 . #f))
911 @end example
912 @end deftypevr
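
To illustrate how the parsed value might be used, here is a minimal
sketch that turns each range into concrete offsets for a resource of a
known length. The helper @code{range->offsets} is hypothetical, and it
assumes that the first byte position of each range is given.

@example
;; Hypothetical helper: turn one (FIRST . LAST) element of a parsed
;; range header into concrete offsets for a resource of LENGTH bytes.
;; A missing last position means "up to the end of the resource".
(define (range->offsets range length)
  (let ((start (car range))
        (end (or (cdr range) (- length 1))))
    (cons start (min end (- length 1)))))

(map (lambda (r) (range->offsets r 100))
     (cdr '(bytes (10 . 30) (50 . #f))))
@result{} ((10 . 30) (50 . 99))
@end example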
913
914 @deftypevr {HTTP Header} URI referer
915 The URI of the resource that referred the user to this resource. The
916 name of the header is a misspelling, but we are stuck with it.
917 @example
918 (parse-header 'referer "http://www.gnu.org/")
919 @result{} #<uri ...>
920 @end example
921 @end deftypevr
922
923 @deftypevr {HTTP Header} List te
924 A list of transfer codings, expressed as key-value lists. A common
925 transfer coding is @code{trailers}.
926 @example
927 (parse-header 'te "trailers")
928 @result{} ((trailers))
929 @end example
930 @end deftypevr
931
932 @deftypevr {HTTP Header} String user-agent
933 A string indicating the user agent making the request. The
934 specification defines a structured format for this header, but it is
935 widely disregarded, so Guile does not attempt to parse strictly.
936 @example
937 (parse-header 'user-agent "Mozilla/5.0")
938 @result{} "Mozilla/5.0"
939 @end example
940 @end deftypevr
941
942
943 @subsubsection Response Headers
944
945 @deftypevr {HTTP Header} List accept-ranges
946 A list of range units that the server supports, as symbols.
947 @example
948 (parse-header 'accept-ranges "bytes")
949 @result{} (bytes)
950 @end example
951 @end deftypevr
952
953 @deftypevr {HTTP Header} UInt age
954 The age of a cached response, in seconds.
955 @example
956 (parse-header 'age "3600")
957 @result{} 3600
958 @end example
959 @end deftypevr
960
961 @deftypevr {HTTP Header} ETag etag
962 The entity-tag of the resource.
963 @example
964 (parse-header 'etag "\"foo\"")
965 @result{} ("foo" . #t)
966 @end example
967 @end deftypevr
968
969 @deftypevr {HTTP Header} URI location
970 A URI on which a request may be completed. Used in combination with a
971 redirecting status code to perform client-side redirection.
972 @example
973 (parse-header 'location "http://example.com/other")
974 @result{} #<uri ...>
975 @end example
976 @end deftypevr
977
978 @deftypevr {HTTP Header} List proxy-authenticate
979 A list of challenges to a proxy, indicating the need for authentication.
980 @example
981 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
982 @result{} ((basic (realm . "foo")))
983 @end example
984 @end deftypevr
985
986 @deftypevr {HTTP Header} UInt|Date retry-after
987 Used in combination with a server-busy status code, like 503, to
988 indicate that a client should retry later. Either a number of seconds,
989 or a date.
990 @example
991 (parse-header 'retry-after "60")
992 @result{} 60
993 @end example
994 @end deftypevr
995
996 @deftypevr {HTTP Header} String server
997 A string identifying the server.
998 @example
999 (parse-header 'server "My first web server")
1000 @result{} "My first web server"
1001 @end example
1002 @end deftypevr
1003
1004 @deftypevr {HTTP Header} *|List vary
1005 A set of request headers that were used in computing this response.
1006 Used to indicate that server-side content negotiation was performed, for
1007 example in response to the @code{accept-language} header. Can also be
1008 the symbol @code{*}, indicating that all headers were considered.
1009 @example
1010 (parse-header 'vary "Accept-Language, Accept")
1011 @result{} (accept-language accept)
1012 @end example
1013 @end deftypevr
1014
1015 @deftypevr {HTTP Header} List www-authenticate
1016 A list of challenges to a user, indicating the need for authentication.
1017 @example
1018 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1019 @result{} ((basic (realm . "foo")))
1020 @end example
1021 @end deftypevr
1022
1023
1024 @node Requests
1025 @subsection HTTP Requests
1026
1027 @example
1028 (use-modules (web request))
1029 @end example
1030
1031 The request module contains a data type for HTTP requests.
1032
1033 @subsubsection An Important Note on Character Sets
1034
1035 HTTP requests consist of two parts: the request proper, consisting of a
1036 request line and a set of headers, and (optionally) a body. The body
1037 might have a binary content-type, and even in the textual case its
1038 length is specified in bytes, not characters.
1039
1040 Therefore, HTTP is a fundamentally binary protocol. However, the request
1041 line and headers are specified to be in a subset of ASCII, so they can
1042 be treated as text, provided that the port's encoding is set to an
1043 ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
1044 is just such an encoding, and happens to be very efficient for Guile.
1045
1046 So what Guile does when reading requests from the wire, or writing them
1047 out, is to set the port's encoding to latin-1 and treat the request
1048 headers as text.
1049
1050 The request body is another issue. For binary data, the data is
1051 probably in a bytevector, so we use the R6RS binary output procedures to
1052 write out the binary payload. Textual data usually has to be written
1053 out to some character encoding, usually UTF-8, and then the resulting
1054 bytevector is written out to the port.
1055
1056 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1057 any loss of generality.
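
The difference between characters and bytes can be seen in a small
sketch, which encodes a one-character string as UTF-8 using the R6RS
bytevector procedures. The variable names are only illustrative.

@example
(use-modules (rnrs bytevectors))

;; One character (a Greek lambda), but two bytes once encoded as UTF-8.
(define text (string #\x3bb))
(define payload (string->utf8 text))

(string-length text)
@result{} 1
(bytevector-length payload)
@result{} 2
@end example

It is the second number, the length in bytes, that belongs in a
@code{content-length} header.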
1058
1059 @subsubsection Request API
1060
1061 @deffn {Scheme Procedure} request?
1062 @deffnx {Scheme Procedure} request-method
1063 @deffnx {Scheme Procedure} request-uri
1064 @deffnx {Scheme Procedure} request-version
1065 @deffnx {Scheme Procedure} request-headers
1066 @deffnx {Scheme Procedure} request-meta
1067 @deffnx {Scheme Procedure} request-port
1068 A predicate and field accessors for the request type. The fields are as
1069 follows:
1070 @table @code
1071 @item method
1072 The HTTP method, for example, @code{GET}.
1073 @item uri
1074 The URI as a URI record.
1075 @item version
1076 The HTTP version pair, like @code{(1 . 1)}.
1077 @item headers
1078 The request headers, as an alist of parsed values.
1079 @item meta
1080 An arbitrary alist of other data, for example information returned in
1081 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1082 Communication}).
1083 @item port
1084 The port on which to read or write a request body, if any.
1085 @end table
1086 @end deffn
1087
1088 @deffn {Scheme Procedure} read-request port [meta='()]
1089 Read an HTTP request from @var{port}, optionally attaching the given
1090 metadata, @var{meta}.
1091
1092 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1093 (latin-1), so that reading one character reads one byte. See the
1094 discussion of character sets above, for more information.
1095
1096 Note that the body is not part of the request. Once you have read a
1097 request, you may read the body separately, and likewise for writing
1098 requests.
1099 @end deffn
1100
1101 @deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1102 Construct an HTTP request object. If @var{validate-headers?} is true,
1103 the headers are each run through their respective validators.
1104 @end deffn
1105
1106 @deffn {Scheme Procedure} write-request r port
1107 Write the given HTTP request to @var{port}.
1108
1109 Return a new request, whose @code{request-port} will continue writing
1110 on @var{port}, perhaps using some transfer encoding.
1111 @end deffn
1112
1113 @deffn {Scheme Procedure} read-request-body r
1114 Read the request body from @var{r}, as a bytevector. Return @code{#f}
1115 if there was no request body.
1116 @end deffn
1117
1118 @deffn {Scheme Procedure} write-request-body r bv
1119 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1120 request @var{r}.
1121 @end deffn
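
As a rough sketch of how these procedures fit together, the following
builds a request, writes it to a string port, and reads it back. String
ports stand in for a socket here, and the names @code{req},
@code{wire-text} and @code{parsed} are only illustrative.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/")
                 #:headers '((host . ("www.gnu.org" . #f)))))

;; Serialize the request line and headers into a string.
(define wire-text
  (call-with-output-string
    (lambda (port) (write-request req port))))

;; Parse the text back into a request record.
(define parsed (call-with-input-string wire-text read-request))

(request-method parsed)   ; the method symbol
(request-host parsed)     ; the parsed host pair
@end example

If the round trip works as expected, the method and host of
@code{parsed} should match those of @code{req}.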
1122
1123 The various headers that are typically associated with HTTP requests may
1124 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1125 more information on the format of parsed headers.
1126
1127 @deffn {Scheme Procedure} request-accept request [default='()]
1128 @deffnx {Scheme Procedure} request-accept-charset request [default='()]
1129 @deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1130 @deffnx {Scheme Procedure} request-accept-language request [default='()]
1131 @deffnx {Scheme Procedure} request-allow request [default='()]
1132 @deffnx {Scheme Procedure} request-authorization request [default=#f]
1133 @deffnx {Scheme Procedure} request-cache-control request [default='()]
1134 @deffnx {Scheme Procedure} request-connection request [default='()]
1135 @deffnx {Scheme Procedure} request-content-encoding request [default='()]
1136 @deffnx {Scheme Procedure} request-content-language request [default='()]
1137 @deffnx {Scheme Procedure} request-content-length request [default=#f]
1138 @deffnx {Scheme Procedure} request-content-location request [default=#f]
1139 @deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1140 @deffnx {Scheme Procedure} request-content-range request [default=#f]
1141 @deffnx {Scheme Procedure} request-content-type request [default=#f]
1142 @deffnx {Scheme Procedure} request-date request [default=#f]
1143 @deffnx {Scheme Procedure} request-expect request [default='()]
1144 @deffnx {Scheme Procedure} request-expires request [default=#f]
1145 @deffnx {Scheme Procedure} request-from request [default=#f]
1146 @deffnx {Scheme Procedure} request-host request [default=#f]
1147 @deffnx {Scheme Procedure} request-if-match request [default=#f]
1148 @deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1149 @deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1150 @deffnx {Scheme Procedure} request-if-range request [default=#f]
1151 @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1152 @deffnx {Scheme Procedure} request-last-modified request [default=#f]
1153 @deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1154 @deffnx {Scheme Procedure} request-pragma request [default='()]
1155 @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1156 @deffnx {Scheme Procedure} request-range request [default=#f]
1157 @deffnx {Scheme Procedure} request-referer request [default=#f]
1158 @deffnx {Scheme Procedure} request-te request [default=#f]
1159 @deffnx {Scheme Procedure} request-trailer request [default='()]
1160 @deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1161 @deffnx {Scheme Procedure} request-upgrade request [default='()]
1162 @deffnx {Scheme Procedure} request-user-agent request [default=#f]
1163 @deffnx {Scheme Procedure} request-via request [default='()]
1164 @deffnx {Scheme Procedure} request-warning request [default='()]
1165 Return the given request header, or @var{default} if none was present.
1166 @end deffn
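
For instance, assuming a request that carries only a @code{host} header,
the accessors might be used as in this small sketch:

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://example.com/")
                 #:headers '((host . ("example.com" . #f)))))

(request-host req)                  ; the parsed host pair
(request-user-agent req)            ; header absent: the default, #f
(request-user-agent req "unknown")  ; header absent: "unknown"
@end example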
1167
1168 @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1169 A helper routine to determine the absolute URI of a request, using the
1170 @code{host} header and the default host and port.
1171 @end deffn
1172
1173
1174 @node Responses
1175 @subsection HTTP Responses
1176
1177 @example
1178 (use-modules (web response))
1179 @end example
1180
1181 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
1182 responses. Again, the body is represented separately from the response.
1183
1184 @deffn {Scheme Procedure} response?
1185 @deffnx {Scheme Procedure} response-version
1186 @deffnx {Scheme Procedure} response-code
1187 @deffnx {Scheme Procedure} response-reason-phrase response
1188 @deffnx {Scheme Procedure} response-headers
1189 @deffnx {Scheme Procedure} response-port
1190 A predicate and field accessors for the response type. The fields are as
1191 follows:
1192 @table @code
1193 @item version
1194 The HTTP version pair, like @code{(1 . 1)}.
1195 @item code
1196 The HTTP response code, like @code{200}.
1197 @item reason-phrase
1198 The reason phrase, or the standard reason phrase for the response's
1199 code.
1200 @item headers
1201 The response headers, as an alist of parsed values.
1202 @item port
1203 The port on which to read or write a response body, if any.
1204 @end table
1205 @end deffn
1206
1207 @deffn {Scheme Procedure} read-response port
1208 Read an HTTP response from @var{port}.
1209
1210 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1211 (latin-1), so that reading one character reads one byte. See the
1212 discussion of character sets in @ref{Requests}, for more information.
1213 @end deffn
1214
1215 @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1216 Construct an HTTP response object. If @var{validate-headers?} is true,
1217 the headers are each run through their respective validators.
1218 @end deffn
1219
1220 @deffn {Scheme Procedure} adapt-response-version response version
1221 Adapt the given response to a different HTTP version. Return a new HTTP
1222 response.
1223
1224 The idea is that many applications might just build a response for the
1225 default HTTP version, and this method could handle a number of
1226 programmatic transformations to respond to older HTTP versions (0.9 and
1227 1.0). But currently this function is a bit heavy-handed, just updating
1228 the version field.
1229 @end deffn
1230
1231 @deffn {Scheme Procedure} write-response r port
1232 Write the given HTTP response to @var{port}.
1233
1234 Return a new response, whose @code{response-port} will continue writing
1235 on @var{port}, perhaps using some transfer encoding.
1236 @end deffn
1237
1238 @deffn {Scheme Procedure} response-must-not-include-body? r
1239 Some responses, like those with status code 304, are specified as never
1240 having bodies. This predicate returns @code{#t} for those responses.
1241
1242 Note, though, that responses to @code{HEAD} requests must not have a
1243 body either.
1244 @end deffn
1245
1246 @deffn {Scheme Procedure} read-response-body r
1247 Read the response body from @var{r}, as a bytevector. Return @code{#f}
1248 if there was no response body.
1249 @end deffn
1250
1251 @deffn {Scheme Procedure} write-response-body r bv
1252 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1253 response @var{r}.
1254 @end deffn
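
The response procedures can be exercised in the same way as the request
ones. The following sketch, which again uses string ports in place of a
socket and illustrative variable names, writes a response and reads it
back:

@example
(use-modules (web response))

(define res
  (build-response #:code 404
                  #:headers '((content-type . (text/plain)))))

;; Serialize the status line and headers into a string.
(define wire-text
  (call-with-output-string
    (lambda (port) (write-response res port))))

;; Parse the text back into a response record.
(define parsed (call-with-input-string wire-text read-response))

(response-code parsed)
@result{} 404
(response-must-not-include-body? (build-response #:code 304))
@result{} #t
@end example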
1255
1256 As with requests, the various headers that are typically associated with
1257 HTTP responses may be accessed with these dedicated accessors.
1258 @xref{HTTP Headers}, for more information on the format of parsed
1259 headers.
1260
1261 @deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1262 @deffnx {Scheme Procedure} response-age response [default=#f]
1263 @deffnx {Scheme Procedure} response-allow response [default='()]
1264 @deffnx {Scheme Procedure} response-cache-control response [default='()]
1265 @deffnx {Scheme Procedure} response-connection response [default='()]
1266 @deffnx {Scheme Procedure} response-content-encoding response [default='()]
1267 @deffnx {Scheme Procedure} response-content-language response [default='()]
1268 @deffnx {Scheme Procedure} response-content-length response [default=#f]
1269 @deffnx {Scheme Procedure} response-content-location response [default=#f]
1270 @deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1271 @deffnx {Scheme Procedure} response-content-range response [default=#f]
1272 @deffnx {Scheme Procedure} response-content-type response [default=#f]
1273 @deffnx {Scheme Procedure} response-date response [default=#f]
1274 @deffnx {Scheme Procedure} response-etag response [default=#f]
1275 @deffnx {Scheme Procedure} response-expires response [default=#f]
1276 @deffnx {Scheme Procedure} response-last-modified response [default=#f]
1277 @deffnx {Scheme Procedure} response-location response [default=#f]
1278 @deffnx {Scheme Procedure} response-pragma response [default='()]
1279 @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1280 @deffnx {Scheme Procedure} response-retry-after response [default=#f]
1281 @deffnx {Scheme Procedure} response-server response [default=#f]
1282 @deffnx {Scheme Procedure} response-trailer response [default='()]
1283 @deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1284 @deffnx {Scheme Procedure} response-upgrade response [default='()]
1285 @deffnx {Scheme Procedure} response-vary response [default='()]
1286 @deffnx {Scheme Procedure} response-via response [default='()]
1287 @deffnx {Scheme Procedure} response-warning response [default='()]
1288 @deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
1289 Return the given response header, or @var{default} if none was present.
1290 @end deffn
1291
1292
1293 @node Web Client
1294 @subsection Web Client
1295
1296 @code{(web client)} provides a simple, synchronous HTTP client, built on
1297 the lower-level HTTP, request, and response modules.
1298
1299 @deffn {Scheme Procedure} open-socket-for-uri uri
Return an open input/output port for a connection to @var{uri}.
1300 @end deffn
1301
1302 @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1303 Connect to the server corresponding to @var{uri} and ask for the
1304 resource, using the @code{GET} method. If you already have a port open,
1305 pass it as @var{port}. The port will be closed at the end of the
1306 request unless @var{keep-alive?} is true. Any extra headers in the
1307 alist @var{extra-headers} will be added to the request.
1308
1309 If @var{decode-body?} is true, as is the default, the body of the
1310 response will be decoded to a string, if it is a textual content-type.
1311 Otherwise it will be returned as a bytevector.
1312 @end deffn
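
A minimal usage sketch follows. It assumes that @code{http-get} returns
two values, the response and the body, that @var{uri} is given as a URI
record, and that the host is reachable from your machine.

@example
(use-modules (web client) (web uri) (web response))

;; Fetch a page and print the status code of the response.
(call-with-values
    (lambda () (http-get (string->uri "http://www.gnu.org/")))
  (lambda (response body)
    (display (response-code response))
    (newline)))
@end example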
1313
1314 @code{http-get} is useful for making one-off requests to web sites. If
1315 you are writing a web spider or some other client that needs to handle a
1316 number of requests in parallel, it's better to build an event-driven URL
1317 fetcher, similar in structure to the web server (@pxref{Web Server}).
1318
1319 Another option, good but not as performant, would be to use threads,
1320 possibly via @code{par-map} or futures.
1321
1322 More helper procedures for the other common HTTP verbs would be a good
1323 addition to this module. Send your code to
1324 @email{guile-user@@gnu.org}.
1325
1326
1327 @node Web Server
1328 @subsection Web Server
1329
1330 @code{(web server)} is a generic web server interface, along with a main
1331 loop implementation for web servers controlled by Guile.
1332
1333 @example
1334 (use-modules (web server))
1335 @end example
1336
1337 The lowest layer is the @code{<server-impl>} object, which defines a set
1338 of hooks to open a server, read a request from a client, write a
1339 response to a client, and close a server. These hooks -- @code{open},
1340 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1341 together in a @code{<server-impl>} object. Procedures in this module take a
1342 @code{<server-impl>} object, if needed.
1343
1344 A @code{<server-impl>} may also be looked up by name. If you pass the
1345 @code{http} symbol to @code{run-server}, Guile looks for a variable
1346 named @code{http} in the @code{(web server http)} module, which should
1347 be bound to a @code{<server-impl>} object. Such a binding is made by
1348 instantiation of the @code{define-server-impl} syntax. In this way the
1349 run-server loop can automatically load other backends if available.
1350
1351 The life cycle of a server goes as follows:
1352
1353 @enumerate
1354 @item
1355 The @code{open} hook is called, to open the server. @code{open} takes 0 or
1356 more arguments, depending on the backend, and returns an opaque
1357 server socket object, or signals an error.
1358
1359 @item
1360 The @code{read} hook is called, to read a request from a new client.
1361 The @code{read} hook takes one argument, the server socket. It should
1362 return three values: an opaque client socket, the request, and the
1363 request body. The request should be a @code{<request>} object, from
1364 @code{(web request)}. The body should be a string or a bytevector, or
1365 @code{#f} if there is no body.
1366
1367 If the read failed, the @code{read} hook may return @code{#f} for the client
1368 socket, request, and body.
1369
1370 @item
1371 A user-provided handler procedure is called, with the request and body
1372 as its arguments. The handler should return two values: the response,
1373 as a @code{<response>} record from @code{(web response)}, and the
1374 response body as bytevector, or @code{#f} if not present.
1375
1376 The response and response body are run through @code{sanitize-response},
1377 documented below. This allows the handler writer to take some
1378 convenient shortcuts: for example, instead of a @code{<response>}, the
1379 handler can simply return an alist of headers, in which case a default
1380 response object is constructed with those headers. Instead of a
1381 bytevector for the body, the handler can return a string, which will be
1382 serialized into an appropriate encoding; or it can return a procedure,
1383 which will be called on a port to write out the data. See the
1384 @code{sanitize-response} documentation, for more.
1385
1386 @item
1387 The @code{write} hook is called with three arguments: the client
1388 socket, the response, and the body. The @code{write} hook returns no
1389 values.
1390
1391 @item
1392 At this point the request handling is complete. In a server loop, we
1393 go back and try to read a new request.
1394
1395 @item
1396 If the user interrupts the loop, the @code{close} hook is called on
1397 the server socket.
1398 @end enumerate
1399
1400 A user may define a server implementation with the following form:
1401
1402 @deffn {Scheme Syntax} define-server-impl name open read write close
1403 Make a @code{<server-impl>} object with the hooks @var{open},
1404 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1405 @var{name} in the current module.
1406 @end deffn
1407
1408 @deffn {Scheme Procedure} lookup-server-impl impl
1409 Look up a server implementation. If @var{impl} is a server
1410 implementation already, it is returned directly. If it is a symbol, the
1411 binding named @var{impl} in the @code{(web server @var{impl})} module is
1412 looked up. Otherwise an error is signaled.
1413
1414 Currently a server implementation is a somewhat opaque type, useful only
1415 for passing to other procedures in this module, like @code{read-client}.
1416 @end deffn
1417
1418 The @code{(web server)} module defines a number of routines that use
1419 @code{<server-impl>} objects to implement parts of a web server. Given
1420 that we don't expose the accessors for the various fields of a
1421 @code{<server-impl>}, indeed these routines are the only procedures with
1422 any access to the impl objects.
1423
1424 @deffn {Scheme Procedure} open-server impl open-params
1425 Open a server for the given implementation. Return one value, the new
1426 server object. The implementation's @code{open} procedure is applied to
1427 @var{open-params}, which should be a list.
1428 @end deffn
1429
1430 @deffn {Scheme Procedure} read-client impl server
1431 Read a new client from @var{server}, by applying the implementation's
1432 @code{read} procedure to the server. If successful, return three
1433 values: an object corresponding to the client, a request object, and the
1434 request body. If any exception occurs, return @code{#f} for all three
1435 values.
1436 @end deffn
1437
1438 @deffn {Scheme Procedure} handle-request handler request body state
1439 Handle a given request, returning the response and body.
1440
1441 The response and response body are produced by calling the given
1442 @var{handler} with @var{request} and @var{body} as arguments.
1443
1444 The elements of @var{state} are also passed to @var{handler} as
1445 arguments, and may be returned as additional values. The new
1446 @var{state}, collected from the @var{handler}'s return values, is then
1447 returned as a list. The idea is that a server loop receives a handler
1448 from the user, along with whatever state values the user is interested
1449 in, allowing the user's handler to explicitly manage its state.
1450 @end deffn
1451
1452 @deffn {Scheme Procedure} sanitize-response request response body
1453 ``Sanitize'' the given response and body, making them appropriate for the
1454 given request.
1455
1456 As a convenience to web handler authors, @var{response} may be given as
1457 an alist of headers, in which case it is used to construct a default
1458 response. Ensures that the response version corresponds to the request
1459 version. If @var{body} is a string, encodes the string to a bytevector,
1460 in an encoding appropriate for @var{response}. Adds a
1461 @code{content-length} and @code{content-type} header, as necessary.
1462
1463 If @var{body} is a procedure, it is called with a port as an argument,
1464 and the output collected as a bytevector. In the future we might try to
1465 instead use a compressing, chunk-encoded port, and call this procedure
1466 later, in the @code{write-client} procedure. Authors are advised not to rely on
1467 the procedure being called at any particular time.
1468 @end deffn
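
As a sketch of the shortcuts described above, the following calls
@code{sanitize-response} directly with an alist of headers and a string
body. The request is built by hand for illustration only; in a running
server it would come from the @code{read} hook.

@example
(use-modules (web server) (web request) (web response) (web uri)
             (rnrs bytevectors))

(define req
  (build-request (string->uri "http://localhost/")
                 #:headers '((host . ("localhost" . #f)))))

(call-with-values
    (lambda ()
      (sanitize-response req
                         '((content-type . (text/plain)))
                         "Hello"))
  (lambda (response body)
    ;; Inspect what came back: a response record, a bytevector body,
    ;; and the content length that was added to the headers.
    (list (response? response)
          (bytevector? body)
          (response-content-length response))))
@end example

With a five-character ASCII body, the resulting list should be
@code{(#t #t 5)}.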
1469
1470 @deffn {Scheme Procedure} write-client impl server client response body
1471 Write an HTTP response and body to @var{client}. If the server and
1472 client support persistent connections, it is the implementation's
1473 responsibility to keep track of the client thereafter, presumably by
1474 attaching it to the @var{server} argument somehow.
1475 @end deffn
1476
1477 @deffn {Scheme Procedure} close-server impl server
1478 Release resources allocated by a previous invocation of
1479 @code{open-server}.
1480 @end deffn
1481
1482 Given the procedures above, it is a small matter to make a web server:
1483
1484 @deffn {Scheme Procedure} serve-one-client handler impl server state
1485 Read one request from @var{server}, call @var{handler} on the request
1486 and body, and write the response to the client. Return the new state
1487 produced by the handler procedure.
1488 @end deffn
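
For example, a server that handles a fixed number of requests and then
shuts down could be sketched as follows. The procedure name
@code{serve-n-requests} and the port number are hypothetical.

@example
(use-modules (web server))

;; Hypothetical helper: serve N requests with HANDLER, then close the
;; server.  No state values are threaded through the handler.
(define (serve-n-requests handler n)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '(#:port 8081))))
    (let loop ((i 0) (state '()))
      (if (< i n)
          (loop (+ i 1)
                (serve-one-client handler impl server state))
          (close-server impl server)))))
@end example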
1489
1490 @deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
1491 Run Guile's built-in web server.
1492
1493 @var{handler} should be a procedure that takes two or more arguments,
1494 the HTTP request and request body, and returns two or more values, the
1495 response and response body.
1496
1497 For examples, skip ahead to the next section, @ref{Web Examples}.
1498
1499 The response and body will be run through @code{sanitize-response}
1500 before sending back to the client.
1501
1502 Additional arguments to @var{handler} are taken from @var{state}.
1503 Additional return values are accumulated into a new @var{state}, which
1504 will be used for subsequent requests. In this way a handler can
1505 explicitly manage its state.
1506 @end deffn
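
As an illustration of explicit state management, here is a sketch of a
handler that counts the requests it has served. The handler name
@code{counting-handler} is hypothetical; the initial count, @code{0}, is
passed as the single state value.

@example
(use-modules (web server))

;; COUNT is the state: it arrives as an extra argument and the updated
;; value is returned as an extra value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Requests so far: "
                         (number->string count))
          (+ count 1)))

(run-server counting-handler 'http '() 0)
@end example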
1507
1508 The default web server implementation is @code{http}, which binds to a
1509 socket, listening for requests on that socket.
1510
1511 @deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket]
1512 The default HTTP implementation. We document it as a function with
1513 keyword arguments, because that is precisely the way that it is -- all
1514 of the @var{open-params} to @code{run-server} get passed to the
1515 implementation's open function.
1516
1517 @example
1518 ;; The defaults: localhost:8080
1519 (run-server handler)
1520 ;; Same thing
1521 (run-server handler 'http '())
1522 ;; On a different port
1523 (run-server handler 'http '(#:port 8081))
1524 ;; IPv6
1525 (run-server handler 'http '(#:family AF_INET6 #:port 8081))
1526 ;; Custom socket
1527 (run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
1528 @end example
1529 @end deffn
1530
1531 @node Web Examples
1532 @subsection Web Examples
1533
1534 Well, enough about the tedious internals. Let's make a web application!
1535
1536 @subsubsection Hello, World!
1537
1538 The first program we have to write, of course, is ``Hello, World!''.
1539 This means that we have to implement a web handler that does what we
1540 want.
1541
1542 Now we define a handler, a function of two arguments and two return
1543 values:
1544
1545 @example
1546 (define (handler request request-body)
1547 (values @var{response} @var{response-body}))
1548 @end example
1549
1550 In this first example, we take advantage of a short-cut, returning an
1551 alist of headers instead of a proper response object. The response body
1552 is our payload:
1553
1554 @example
1555 (define (hello-world-handler request request-body)
1556   (values '((content-type . (text/plain)))
1557           "Hello World!"))
1558 @end example
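
For comparison, here is a sketch of the same handler without the
short-cut, returning a response object built with @code{build-response}
from @code{(web response)}.  The name
@code{hello-world-handler/explicit} is made up for this illustration.

@example
(use-modules (web response))

(define (hello-world-handler/explicit request request-body)
  (values (build-response
           #:code 200
           #:headers '((content-type . (text/plain))))
          "Hello World!"))
@end example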
1559
1560 Now let's test it. Load up the web server module if you haven't yet
1561 done so, then pass the handler to @code{run-server}, which blocks
1562 while it serves requests:
1563
1564 @example
1565 (use-modules (web server))
1566 (run-server hello-world-handler)
1567 @end example
1568
1569 By default, the web server listens for requests on
1570 @code{localhost:8080}. Visit that address in your web browser to
1571 test. If you see the string, @code{Hello World!}, sweet!
1572
1573 @subsubsection Inspecting the Request
1574
1575 The Hello World program above is a general greeter, responding to all
1576 URIs. To make a more exclusive greeter, we need to inspect the request
1577 object, and conditionally produce different results. So let's load up
1578 the request, response, and URI modules, and do just that.
1579
1580 @example
1581 (use-modules (web server)) ; you probably did this already
1582 (use-modules (web request)
1583              (web response)
1584              (web uri))
1585
1586 (define (request-path-components request)
1587   (split-and-decode-uri-path (uri-path (request-uri request))))
1588
1589 (define (hello-hacker-handler request body)
1590   (if (equal? (request-path-components request)
1591               '("hacker"))
1592       (values '((content-type . (text/plain)))
1593               "Hello hacker!")
1594       (not-found request)))
1595
1596 (run-server hello-hacker-handler)
1597 @end example
1598
1599 Here we see that we have defined a helper to return the components of
1600 the URI path as a list of strings, and used that to check for a request
1601 to @code{/hacker/}. Then the success case is just as before -- visit
1602 @code{http://localhost:8080/hacker/} in your browser to check.
1603
1604 You should always match against URI path components as decoded by
1605 @code{split-and-decode-uri-path}. The above example will work for
1606 @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
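
You can check this at the REPL by calling
@code{split-and-decode-uri-path} directly on each of those paths.  Empty
components are dropped and percent-escapes are decoded, so all three
should yield the same list:

@example
(use-modules (web uri))

(split-and-decode-uri-path "/hacker/")
@result{} ("hacker")
(split-and-decode-uri-path "//hacker///")
@result{} ("hacker")
(split-and-decode-uri-path "/h%61ck%65r")
@result{} ("hacker")
@end example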
1607
1608 But we forgot to define @code{not-found}! If you are pasting these
1609 examples into a REPL, accessing any other URI in your web browser will
1610 drop your Guile console into the debugger:
1611
1612 @example
1613 <unnamed port>:38:7: In procedure module-lookup:
1614 <unnamed port>:38:7: Unbound variable: not-found
1615
1616 Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
1617 scheme@@(guile-user) [1]>
1618 @end example
1619
1620 So let's define the function, right there in the debugger. As you
1621 probably know, we'll want to return a 404 response.
1622
1623 @example
1624 ;; Paste this in your REPL
1625 (define (not-found request)
1626   (values (build-response #:code 404)
1627           (string-append "Resource not found: "
1628                          (uri->string (request-uri request)))))
1629
1630 ;; Now paste this to let the web server keep going:
1631 ,continue
1632 @end example
1633
1634 Now if you access @code{http://localhost:8080/foo/}, you get this error
1635 message. (Note that some popular web browsers won't show
1636 server-generated 404 messages, showing their own instead, unless the 404
1637 message body is long enough.)
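
If your browser hides the server's 404 body, you can also exercise the
handler directly, without a running server.  The following sketch
repeats the definitions from this section so that it stands alone, and
feeds the handler synthetic requests made with @code{build-request}.
The helper @code{body-for} is hypothetical, and a request built from a
full URI like this may differ in detail from one parsed off the wire.

@example
(use-modules (web request) (web response) (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request) '("hacker"))
      (values '((content-type . (text/plain)))
              "Hello hacker!")
      (not-found request)))

;; Hypothetical helper: run the handler on a synthetic GET request
;; for URL and return only the response body.
(define (body-for url)
  (call-with-values
      (lambda ()
        (hello-hacker-handler (build-request (string->uri url)) #f))
    (lambda (response body) body)))

(body-for "http://localhost:8080/hacker/")
@result{} "Hello hacker!"
(body-for "http://localhost:8080/foo/")
@result{} "Resource not found: http://localhost:8080/foo/"
@end example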
1638
1639 @subsubsection Higher-Level Interfaces
1640
1641 The web handler interface is a common baseline that all kinds of Guile
1642 web applications can use. You will usually want to build something on
1643 top of it, however, especially when producing HTML. Here is a simple
1644 example that builds up HTML output using SXML (@pxref{sxml simple}).
1645
1646 First, load up the modules:
1647
1648 @example
1649 (use-modules (web server)
1650              (web request)
1651              (web response)
1652              (sxml simple))
1653 @end example
1654
1655 Now we define a simple templating function that takes a list of HTML
1656 body elements, as SXML, and puts them in our super template:
1657
1658 @example
1659 (define (templatize title body)
1660   `(html (head (title ,title))
1661          (body ,@@body)))
1662 @end example
1663
1664 For example, the simplest Hello HTML can be produced like this:
1665
1666 @example
1667 (sxml->xml (templatize "Hello!" '((b "Hi!"))))
1668 @print{}
1669 <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
1670 @end example
1671
1672 Much better to work with Scheme data types than to work with HTML as
1673 strings. Now we define a little response helper:
1674
1675 @example
1676 (define* (respond #:optional body #:key
1677                   (status 200)
1678                   (title "Hello hello!")
1679                   (doctype "<!DOCTYPE html>\n")
1680                   (content-type-params '((charset . "utf-8")))
1681                   (content-type 'text/html)
1682                   (extra-headers '())
1683                   (sxml (and body (templatize title body))))
1684   (values (build-response
1685            #:code status
1686            #:headers `((content-type
1687                         . (,content-type ,@@content-type-params))
1688                        ,@@extra-headers))
1689           (lambda (port)
1690             (if sxml
1691                 (begin
1692                   (if doctype (display doctype port))
1693                   (sxml->xml sxml port))))))
1694 @end example
1695
1696 Here we see the power of keyword arguments with default initializers. By
1697 the time the arguments are fully parsed, the @code{sxml} local variable
1698 will hold the templated SXML, ready for sending out to the client.
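
The same mechanism can be seen in isolation in this small sketch, where
the default for one keyword argument is computed from the arguments
that precede it.  The procedure @code{greet} is a made-up example, not
part of Guile.

@example
;; TEXT defaults to an expression that uses the earlier arguments
;; NAME and GREETING, just as SXML uses TITLE and BODY in respond.
(define* (greet #:optional name #:key
                (greeting "Hello")
                (text (and name
                           (string-append greeting ", " name "!"))))
  text)

(greet "hacker")
@result{} "Hello, hacker!"
(greet "hacker" #:greeting "Hi")
@result{} "Hi, hacker!"
(greet)
@result{} #f
@end example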
1699
1700 Also, instead of returning the body as a string, @code{respond} gives a
1701 procedure, which will be called by the web server to write out the
1702 response to the client.
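
To see what such a procedure body produces without starting a server,
you can call it yourself on a string port.  This is only a sketch: the
name @code{two-part-handler} is hypothetical, and the real server does
more work, such as choosing an encoding, before calling the procedure.

@example
(define (two-part-handler request body)
  (values '((content-type . (text/plain)))
          ;; The body is a procedure that writes to the port it is given.
          (lambda (port)
            (display "Hello " port)
            (display "World!" port))))

(call-with-values
    (lambda () (two-part-handler #f #f))
  (lambda (headers write-body)
    (call-with-output-string write-body)))
@result{} "Hello World!"
@end example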
1703
1704 Now, a simple example using this responder, which lays out the incoming
1705 headers in an HTML table.
1706
1707 @example
1708 (define (debug-page request body)
1709   (respond
1710    `((h1 "hello world!")
1711      (table
1712       (tr (th "header") (th "value"))
1713       ,@@(map (lambda (pair)
1714                `(tr (td (tt ,(with-output-to-string
1715                                (lambda () (display (car pair))))))
1716                     (td (tt ,(with-output-to-string
1717                                (lambda ()
1718                                  (write (cdr pair))))))))
1719              (request-headers request))))))
1720
1721 (run-server debug-page)
1722 @end example
1723
1724 Now if you visit any local address in your web browser, you will
1725 actually see some HTML, finally.
1726
1727 @subsubsection Conclusion
1728
1729 Well, this is about as far as Guile's built-in web support goes, for
1730 now. There are many ways to make a web application, but hopefully by
1731 standardizing the most fundamental data types, users will be able to
1732 choose the approach that suits them best, while also being able to
1733 switch between implementations of the server. This is a relatively new
1734 part of Guile, so if you have feedback, let us know, and we can take it
1735 into account. Happy hacking on the web!
1736
1737 @c Local Variables:
1738 @c TeX-master: "guile.texi"
1739 @c End: