1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
5
6 @node Web
7 @section @acronym{HTTP}, the Web, and All That
8 @cindex Web
9 @cindex WWW
10 @cindex HTTP
11
It has always been possible to connect computers together and share
information between them, but the rise of the World-Wide Web over the
last couple of decades has made it much easier to do so. The result is
a richly connected network of computation, of which Guile forms a part.
16
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
20 message components that can be sent and received by that protocol,
21 notably HTML.
22
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
29
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
32 highest-level perspective, @pxref{Web Examples}, and work your way
33 back.
34
@menu
* Types and the Web::           Types prevent bugs and security problems.
* URIs::                        Uniform Resource Identifiers.
* HTTP::                        The Hyper-Text Transfer Protocol.
* HTTP Headers::                How Guile represents specific header values.
* Requests::                    HTTP requests.
* Responses::                   HTTP responses.
* Web Server::                  Serving HTTP to the internet.
* Web Examples::                How to use this thing.
@end menu
45
46 @node Types and the Web
47 @subsection Types and the Web
48
49 It is a truth universally acknowledged, that a program with good use of
50 data types, will be free from many common bugs. Unfortunately, the
51 common practice in web programming seems to ignore this maxim. This
52 subsection makes the case for expressive data types in web programming.
53
54 By ``expressive data types'', we mean that the data types @emph{say}
55 something about how a program solves a problem. For example, if we
56 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
57 this indicates that there is a part of the program that will always have
58 valid dates. Error handling for a number of basic cases, like invalid
59 dates, occurs on the boundary in which we produce a SRFI 19 date record
60 from other types, like strings.
61
With regard to the web, data types are helpful in the two broad phases
of HTTP messages: parsing and generation.
64
65 Consider a server, which has to parse a request, and produce a response.
66 Guile will parse the request into an HTTP request object
67 (@pxref{Requests}), with each header parsed into an appropriate Scheme
68 data type. This transition from an incoming stream of characters to
69 typed data is a state change in a program---the strings might parse, or
70 they might not, and something has to happen if they do not. (Guile
71 throws an error in this case.) But after you have the parsed request,
72 ``client'' code (code built on top of the Guile web framework) will not
73 have to check for syntactic validity. The types already make this
74 information manifest.
75
76 This state change on the parsing boundary makes programs more robust,
77 as they themselves are freed from the need to do a number of common
78 error checks, and they can use normal Scheme procedures to handle a
79 request instead of ad-hoc string parsers.
80
81 The need for types on the response generation side (in a server) is more
82 subtle, though not less important. Consider the example of a POST
83 handler, which prints out the text that a user submits from a form.
84 Such a handler might include a procedure like this:
85
@example
;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))

(display (you-said "Hi!"))
@print{} <p>You said: Hi!</p>
@end example
98
99 This is a perfectly valid implementation, provided that the incoming
100 text does not contain the special HTML characters @samp{<}, @samp{>}, or
101 @samp{&}. But this provision of a restricted character set is not
102 reflected anywhere in the program itself: we must @emph{assume} that the
103 programmer understands this, and performs the check elsewhere.
104
Unfortunately, the short history of the practice of programming does not
bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
vulnerability is just such a common error, in which unfiltered user
input is allowed into the output. A user could submit a crafted comment
to your web site which results in visitors running malicious JavaScript,
within the security context of your domain:
111
112 @example
113 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
114 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
115 @end example
116
117 The fundamental problem here is that both user data and the program
118 template are represented using strings. This identity means that types
119 can't help the programmer to make a distinction between these two, so
120 they get confused.
121
122 There are a number of possible solutions, but perhaps the best is to
123 treat HTML not as strings, but as native s-expressions: as SXML. The
124 basic idea is that HTML is either text, represented by a string, or an
125 element, represented as a tagged list. So @samp{foo} becomes
126 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
127 Attributes, if present, go in a tagged list headed by @samp{@@}, like
128 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml
129 simple}, for more information.
130
131 The good thing about SXML is that HTML elements cannot be confused with
132 text. Let's make a new definition of @code{para}:
133
@example
(define (para . contents)
  `(p ,@@contents))

(use-modules (sxml simple))
(sxml->xml (you-said "Hi!"))
@print{} <p>You said: Hi!</p>

(sxml->xml (you-said "<i>Rats, foiled again!</i>"))
@print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
@end example
145
So we see in the second example that HTML elements cannot be unwittingly
introduced into the output. However, it is now perfectly acceptable to
pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
over everything-as-a-string.
150
151 @example
152 (sxml->xml (you-said (you-said "<Hi!>")))
153 @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
154 @end example
155
156 The SXML types allow procedures to @emph{compose}. The types make
157 manifest which parts are HTML elements, and which are text. So you
158 needn't worry about escaping user input; the type transition back to a
159 string handles that for you. @acronym{XSS} vulnerabilities are a thing
160 of the past.
161
162 Well. That's all very nice and opinionated and such, but how do I use
163 the thing? Read on!
164
165 @node URIs
@subsection Uniform Resource Identifiers

Guile provides a standard data type for Uniform Resource Identifiers
(URIs), as defined in RFC 3986.
170
171 The generic URI syntax is as follows:
172
173 @example
174 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
175 [ "?" query ] [ "#" fragment ]
176 @end example
177
For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
scheme is @code{http}, the host is @code{www.gnu.org}, the path is
@code{/help/}, and there is no userinfo, port, query, or fragment. All
URIs have a scheme and a path (though the path might be empty). Some
URIs have a host, and some of those have ports and userinfo. Any URI
might have a query part or a fragment.
184
185 Userinfo is something of an abstraction, as some legacy URI schemes
186 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
187 since passwords do not belong in URIs, the RFC does not want to condone
188 this practice, so it calls anything before the @code{@@} sign
189 @dfn{userinfo}.
190
Properly speaking, a fragment is not part of a URI. For example, when a
web browser follows a link to @indicateurl{http://example.com/#foo}, it
sends a request for @indicateurl{http://example.com/}, then looks in the
resulting page for the fragment identified by @code{foo}. A fragment
identifies a part of a resource, not the resource itself. But it is
useful to have a fragment field in the URI record itself, so we hope you
will forgive the inconsistency.
198
199 @example
200 (use-modules (web uri))
201 @end example
202
203 The following procedures can be found in the @code{(web uri)}
204 module. Load it into your Guile, using a form like the above, to have
205 access to them.
206
@defun build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
       [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
       [#:fragment=@code{#f}] [#:validate?=@code{#t}]
Construct a URI object. @var{scheme} should be a symbol, @var{port}
either an exact integer or @code{#f}, and the rest of the fields either
strings or @code{#f}. If @var{validate?} is true, also run some
consistency checks to make sure that the constructed URI is valid.
@end defun
215
@defun uri? x
@defunx uri-scheme uri
@defunx uri-userinfo uri
@defunx uri-host uri
@defunx uri-port uri
@defunx uri-path uri
@defunx uri-query uri
@defunx uri-fragment uri
A predicate and field accessors for the URI record type. The URI scheme
will be a symbol, the port either an exact integer or @code{#f}, and the
rest either strings or @code{#f} if not present.
@end defun
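
As a brief sketch of how the constructor and accessors fit together, the
following builds the URI from the example earlier in this subsection and
takes it apart again. The variable name @code{u} is ours, chosen only
for illustration.

@example
(use-modules (web uri))

;; Build http://www.gnu.org/help/ field by field.
(define u
  (build-uri 'http #:host "www.gnu.org" #:path "/help/"))

(uri? u)
@result{} #t
(uri-scheme u)
@result{} http
(uri-host u)
@result{} "www.gnu.org"
(uri-path u)
@result{} "/help/"
;; Fields that were not given are #f.
(uri-query u)
@result{} #f
@end example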
228
229 @defun string->uri string
230 Parse @var{string} into a URI object. Return @code{#f} if the string
231 could not be parsed.
232 @end defun
233
234 @defun uri->string uri
235 Serialize @var{uri} to a string. If the URI has a port that is the
236 default port for its scheme, the port is not included in the
237 serialization.
238 @end defun
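
For instance, a URI can be parsed from a string and serialized back
again. This is a minimal sketch that assumes nothing beyond the two
procedures just described; the URI itself is made up for the example.

@example
(use-modules (web uri))

(define u (string->uri "http://www.gnu.org/help/?lang=en#top"))
(uri-query u)
@result{} "lang=en"
(uri-fragment u)
@result{} "top"
(uri->string u)
@result{} "http://www.gnu.org/help/?lang=en#top"

;; A string without a scheme does not parse as a URI.
(string->uri "no scheme here")
@result{} #f
@end example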
239
240 @defun declare-default-port! scheme port
241 Declare a default port for the given URI scheme.
242 @end defun
243
244 @defun uri-decode str [#:encoding=@code{"utf-8"}]
245 Percent-decode the given @var{str}, according to @var{encoding}, which
246 should be the name of a character encoding.
247
Note that this function should not generally be applied to a full URI
string. For paths, use @code{split-and-decode-uri-path} instead. For
query strings, split the query on @code{&} and @code{=} boundaries, and
decode the components separately.
252
253 Note also that percent-encoded strings encode @emph{bytes}, not
254 characters. There is no guarantee that a given byte sequence is a valid
255 string encoding. Therefore this routine may signal an error if the
256 decoded bytes are not valid for the given encoding. Pass @code{#f} for
257 @var{encoding} if you want decoded bytes as a bytevector directly.
258 @xref{Ports, @code{set-port-encoding!}}, for more information on
259 character encodings.
260
261 Returns a string of the decoded characters, or a bytevector if
262 @var{encoding} was @code{#f}.
263 @end defun
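
The advice about query strings can be sketched as follows. The
procedure @code{decode-query} is a hypothetical helper written for this
illustration; it is not part of @code{(web uri)}.

@example
(use-modules (web uri))

;; Hypothetical helper: split QUERY on & and =, then decode each
;; key and value separately.  A key with no = gets the value #f.
(define (decode-query query)
  (map (lambda (field)
         (let ((i (string-index field #\=)))
           (if i
               (cons (uri-decode (substring field 0 i))
                     (uri-decode (substring field (+ i 1))))
               (cons (uri-decode field) #f))))
       (string-split query #\&)))

(decode-query "q=scrambled%20eggs&side=biscuits%26gravy&raw")
@result{} (("q" . "scrambled eggs") ("side" . "biscuits&gravy") ("raw" . #f))
@end example

Decoding after splitting matters here: had the whole query been decoded
first, the @samp{&} inside @samp{biscuits&gravy} would have been
mistaken for a field separator.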
264
265 Fixme: clarify return type. indicate default values. type of
266 unescaped-chars.
267
268 @defun uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
269 Percent-encode any character not in the character set,
270 @var{unescaped-chars}.
271
272 The default character set includes alphanumerics from ASCII, as well as
273 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
274 other character will be percent-encoded, by writing out the character to
275 a bytevector within the given @var{encoding}, then encoding each byte as
276 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
277 the byte.
278 @end defun
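
A few calls may make the default behaviour concrete. This is only a
sketch using the default encoding and the default unescaped character
set; the input strings are arbitrary.

@example
(use-modules (web uri))

(uri-encode "scrambled eggs")
@result{} "scrambled%20eggs"
;; Characters in the default unescaped set pass through untouched.
(uri-encode "guile-2.0_docs~")
@result{} "guile-2.0_docs~"
;; Encoding and then decoding gives back the original string.
(uri-decode (uri-encode "biscuits&gravy"))
@result{} "biscuits&gravy"
@end example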
279
280 @defun split-and-decode-uri-path path
281 Split @var{path} into its components, and decode each component,
282 removing empty components.
283
284 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
285 @code{("foo" "bar baz")}.
286 @end defun
287
288 @defun encode-and-join-uri-path parts
289 URI-encode each element of @var{parts}, which should be a list of
290 strings, and join the parts together with @code{/} as a delimiter.
291
292 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
293 as @code{"scrambled%20eggs/biscuits%26gravy"}.
294 @end defun
295
296 @node HTTP
297 @subsection The Hyper-Text Transfer Protocol
298
299 The initial motivation for including web functionality in Guile, rather
300 than rely on an external package, was to establish a standard base on
301 which people can share code. To that end, we continue the focus on data
302 types by providing a number of low-level parsers and unparsers for
303 elements of the HTTP protocol.
304
If you want to skip the low-level details for now and move on to web
pages, @pxref{Web Server}. Otherwise, load the HTTP module, and read
on.
308
309 @example
310 (use-modules (web http))
311 @end example
312
313 The focus of the @code{(web http)} module is to parse and unparse
314 standard HTTP headers, representing them to Guile as native data
315 structures. For example, a @code{Date:} header will be represented as a
316 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
317
318 Guile tries to follow RFCs fairly strictly---the road to perdition being
319 paved with compatibility hacks---though some allowances are made for
320 not-too-divergent texts.
321
322 Header names are represented as lower-case symbols.
323
324 @defun string->header name
325 Parse @var{name} to a symbolic header name.
326 @end defun
327
328 @defun header->string sym
329 Return the string form for the header named @var{sym}.
330 @end defun
331
332 For example:
333
@example
(string->header "Content-Length")
@result{} content-length
(header->string 'content-length)
@result{} "Content-Length"

(string->header "FOO")
@result{} foo
(header->string 'foo)
@result{} "Foo"
@end example
345
346 Guile keeps a registry of known headers, their string names, and some
347 parsing and serialization procedures. If a header is unknown, its
348 string name is simply its symbol name in title-case.
349
350 @defun known-header? sym
351 Return @code{#t} iff @var{sym} is a known header, with associated
352 parsers and serialization procedures.
353 @end defun
354
355 @defun header-parser sym
356 Return the value parser for headers named @var{sym}. The result is a
357 procedure that takes one argument, a string, and returns the parsed
358 value. If the header isn't known to Guile, a default parser is returned
359 that passes through the string unchanged.
360 @end defun
361
362 @defun header-validator sym
363 Return a predicate which returns @code{#t} if the given value is valid
364 for headers named @var{sym}. The default validator for unknown headers
365 is @code{string?}.
366 @end defun
367
368 @defun header-writer sym
369 Return a procedure that writes values for headers named @var{sym} to a
370 port. The resulting procedure takes two arguments: a value and a port.
371 The default writer is @code{display}.
372 @end defun
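
The three registry lookups can be tried directly. The sketch below uses
@code{content-length}, a header Guile knows about, and
@code{x-made-up}, a hypothetical header name that we assume has not
been declared.

@example
(use-modules (web http))

;; Content-Length is a known header, parsed to an integer.
((header-parser 'content-length) "300")
@result{} 300
((header-validator 'content-length) 300)
@result{} #t
(call-with-output-string
  (lambda (port)
    ((header-writer 'content-length) 300 port)))
@result{} "300"

;; x-made-up is undeclared: the string passes through unchanged.
((header-parser 'x-made-up) "anything at all")
@result{} "anything at all"
@end example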
373
374 For more on the set of headers that Guile knows about out of the box,
375 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
376 procedure:
377
378 @defun declare-header! name parser validator writer [#:multiple?=@code{#f}]
379 Declare a parser, validator, and writer for a given header.
380 @end defun
381
382 For example, let's say you are running a web server behind some sort of
383 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
384 the IPv4 address of the original client. You would like for the HTTP
385 request record to parse out this header to a Scheme value, instead of
386 leaving it as a string. You could register this header with Guile's
387 HTTP stack like this:
388
@example
;; Parse a dotted-quad string into an integer address.
(define (parse-ip str)
  (inet-aton str))
;; An IPv4 address is an exact integer that fits in 32 bits.
(define (validate-ip ip)
  (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
;; Write the address back out in dotted-quad form.
(define (write-ip ip port)
  (display (inet-ntoa ip) port))

(declare-header! "X-Client-Address"
  parse-ip
  validate-ip
  write-ip)
@end example
404
405 @defun valid-header? sym val
406 Return a true value iff @var{val} is a valid Scheme value for the header
407 with name @var{sym}.
408 @end defun
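
To see what such a declaration buys you, here is a sketch that repeats
the @code{X-Client-Address} declaration from the text above (so that it
can be run on its own) and then parses and validates a value. The
address is an arbitrary example.

@example
(use-modules (web http))

;; Same idea as the declaration above, repeated to be self-contained.
(declare-header! "X-Client-Address"
  inet-aton
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))

;; The header now parses to an integer instead of a string.
(parse-header 'x-client-address "127.0.0.1")
@result{} 2130706433
(valid-header? 'x-client-address 2130706433)
@result{} #t
;; The unparsed string is not a valid value for this header.
(valid-header? 'x-client-address "127.0.0.1")
@result{} #f
@end example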
409
410 Now that we have a generic interface for reading and writing headers, we
411 do just that.
412
@defun read-header port
Read one HTTP header from @var{port}. Return two values: the header
name and the parsed Scheme value. May raise an exception if the header
was known but the value was invalid.

Returns the end-of-file object for both values if the end of the message
headers was reached (i.e., a blank line).
@end defun
421
422 @defun parse-header name val
423 Parse @var{val}, a string, with the parser for the header named
424 @var{name}. Returns the parsed value.
425 @end defun
426
427 @defun write-header name val port
428 Write the given header name and value to @var{port}, using the writer
429 from @code{header-writer}.
430 @end defun
431
432 @defun read-headers port
433 Read the headers of an HTTP message from @var{port}, returning the
434 headers as an ordered alist.
435 @end defun
436
437 @defun write-headers headers port
438 Write the given header alist to @var{port}. Doesn't write the final
439 @samp{\r\n}, as the user might want to add another header.
440 @end defun
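
String ports make it easy to experiment with these procedures without a
network connection. The following is a small sketch; the header block
is made up, and the variable name @code{headers} is ours.

@example
(use-modules (web http))

;; Read a header block from a string port.  The blank line
;; (a bare \r\n) marks the end of the headers.
(define headers
  (call-with-input-string
   "Content-Length: 300\r\nContent-Language: en\r\n\r\n"
   read-headers))

(assq-ref headers 'content-length)
@result{} 300
(assq-ref headers 'content-language)
@result{} ("en")

;; Write a header alist back out.
(call-with-output-string
  (lambda (port)
    (write-headers '((content-length . 300)) port)))
@result{} "Content-Length: 300\r\n"
@end example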
441
442 The @code{(web http)} module also has some utility procedures to read
443 and write request and response lines.
444
445 @defun parse-http-method str [start] [end]
446 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
447 like @code{GET}.
448 @end defun
449
450 @defun parse-http-version str [start] [end]
451 Parse an HTTP version from @var{str}, returning it as a major-minor
452 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
453 @code{(1 . 1)}.
454 @end defun
455
456 @defun parse-request-uri str [start] [end]
457 Parse a URI from an HTTP request line. Note that URIs in requests do not
458 have to have a scheme or host name. The result is a URI object.
459 @end defun
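
The three parsers can be exercised on the pieces of a request line. This
sketch assumes that @code{(web uri)} is loaded as well, for the
accessors; the request target is an arbitrary example.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")
@result{} GET
;; The optional start and end arguments select a substring.
(parse-http-method "GET /help/ HTTP/1.1" 0 3)
@result{} GET
(parse-http-version "HTTP/1.1")
@result{} (1 . 1)

;; A typical request URI has only a path and perhaps a query.
(define u (parse-request-uri "/help/?lang=en"))
(uri-path u)
@result{} "/help/"
(uri-query u)
@result{} "lang=en"
@end example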
460
461 @defun read-request-line port
462 Read the first line of an HTTP request from @var{port}, returning three
463 values: the method, the URI, and the version.
464 @end defun
465
466 @defun write-request-line method uri version port
467 Write the first line of an HTTP request to @var{port}.
468 @end defun
469
@defun read-response-line port
Read the first line of an HTTP response from @var{port}, returning three
values: the HTTP version, the response code, and the ``reason phrase''.
@end defun
474
475 @defun write-response-line version code reason-phrase port
476 Write the first line of an HTTP response to @var{port}.
477 @end defun
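
As a sketch of how the line readers and writers behave, the following
reads a request line from a string port and writes a response line to
another. The variable name @code{port} and the sample request are ours.

@example
(use-modules (web http) (web uri))

(define port (open-input-string "GET /help/ HTTP/1.1\r\n"))

;; read-request-line returns three values; collect them in a list.
(call-with-values
    (lambda () (read-request-line port))
  (lambda (method uri version)
    (list method (uri-path uri) version)))
@result{} (GET "/help/" (1 . 1))

(call-with-output-string
  (lambda (out)
    (write-response-line '(1 . 1) 200 "OK" out)))
@result{} "HTTP/1.1 200 OK\r\n"
@end example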
478
479
480 @node HTTP Headers
481 @subsection HTTP Headers
482
483 In addition to defining the infrastructure to parse headers, the
484 @code{(web http)} module defines specific parsers and unparsers for all
485 headers defined in the HTTP/1.1 standard.
486
487 For example, if you receive a header named @samp{Accept-Language} with a
488 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
489 below):
490
491 @example
492 (parse-header 'accept-language "en, es;q=0.8")
493 @result{} ((1000 . "en") (800 . "es"))
494 @end example
495
496 The format of the value for @samp{Accept-Language} headers is defined
497 below, along with all other headers defined in the HTTP standard. (If
498 the header were unknown, the value would have been returned as a
499 string.)
500
For brevity, the header definitions below are given in the form,
@var{Type} @code{@var{name}}, indicating that values for the header
@code{@var{name}} will be of the given @var{Type}. Since Guile
internally treats header names in lower case, in this document we give
types title-cased names. A short description of each header's purpose
and an example follow.
507
508 For full details on the meanings of all of these headers, see the HTTP
509 1.1 standard, RFC 2616.
510
511 @subsubsection HTTP Header Types
512
513 Here we define the types that are used below, when defining headers.
514
515 @deftp {HTTP Header Type} Date
516 A SRFI-19 date.
517 @end deftp
518
519 @deftp {HTTP Header Type} KVList
520 A list whose elements are keys or key-value pairs. Keys are parsed to
521 symbols. Values are strings by default. Non-string values are the
522 exception, and are mentioned explicitly below, as appropriate.
523 @end deftp
524
525 @deftp {HTTP Header Type} SList
526 A list of strings.
527 @end deftp
528
529 @deftp {HTTP Header Type} Quality
530 An exact integer between 0 and 1000. Qualities are used to express
531 preference, given multiple options. An option with a quality of 870,
532 for example, is preferred over an option with quality 500.
533
534 (Qualities are written out over the wire as numbers between 0.0 and
535 1.0, but since the standard only allows three digits after the decimal,
536 it's equivalent to integers between 0 and 1000, so that's what Guile
537 uses.)
538 @end deftp
539
540 @deftp {HTTP Header Type} QList
541 A quality list: a list of pairs, the car of which is a quality, and the
542 cdr a string. Used to express a list of options, along with their
543 qualities.
544 @end deftp
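
Since a quality list is an ordinary list of pairs, the usual list tools
apply. As a sketch, the hypothetical helper @code{sort-by-quality}
below (not part of @code{(web http)}) orders the options from most to
least preferred.

@example
;; Hypothetical helper: order a quality list from most to least
;; preferred.  stable-sort keeps options of equal quality in the
;; order in which they were given.
(define (sort-by-quality qlist)
  (stable-sort qlist (lambda (a b) (> (car a) (car b)))))

(sort-by-quality '((800 . "es") (1000 . "en") (800 . "fr")))
@result{} ((1000 . "en") (800 . "es") (800 . "fr"))

;; Keep only the option names, best first.
(map cdr (sort-by-quality '((800 . "es") (1000 . "en"))))
@result{} ("en" "es")
@end example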
545
546 @deftp {HTTP Header Type} ETag
547 An entity tag, represented as a pair. The car of the pair is an opaque
548 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
549 tag, and @code{#f} otherwise.
550 @end deftp
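
For example, assuming the @code{etag} response header is declared with
this type, strong and weak entity tags would parse as follows. The tag
value @samp{xyzzy} is arbitrary.

@example
(use-modules (web http))

;; A plain quoted entity tag is a strong one.
(parse-header 'etag "\"xyzzy\"")
@result{} ("xyzzy" . #t)
;; The W/ prefix marks a weak entity tag.
(parse-header 'etag "W/\"xyzzy\"")
@result{} ("xyzzy" . #f)
@end example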
551
552 @subsubsection General Headers
553
554 General HTTP headers may be present in any HTTP message.
555
556 @deftypevr {HTTP Header} KVList cache-control
557 A key-value list of cache-control directives. See RFC 2616, for more
558 details.
559
560 If present, parameters to @code{max-age}, @code{max-stale},
561 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
562 integers.
563
564 If present, parameters to @code{private} and @code{no-cache} are parsed
565 as lists of header names, as symbols.
566
@example
(parse-header 'cache-control "no-cache,no-store")
@result{} (no-cache no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
@result{} ((no-cache . (authorization date)) no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
@result{} ((no-cache . (authorization date)) (max-age . 10))
@end example
575 @end deftypevr
576
577 @deftypevr {HTTP Header} List connection
578 A list of header names that apply only to this HTTP connection, as
579 symbols. Additionally, the symbol @samp{close} may be present, to
580 indicate that the server should close the connection after responding to
581 the request.
582 @example
583 (parse-header 'connection "close")
584 @result{} (close)
585 @end example
586 @end deftypevr
587
588 @deftypevr {HTTP Header} Date date
589 The date that a given HTTP message was originated.
590 @example
591 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
592 @result{} #<date ...>
593 @end example
594 @end deftypevr
595
596 @deftypevr {HTTP Header} KVList pragma
597 A key-value list of implementation-specific directives.
598 @example
599 (parse-header 'pragma "no-cache, broccoli=tasty")
600 @result{} (no-cache (broccoli . "tasty"))
601 @end example
602 @end deftypevr
603
604 @deftypevr {HTTP Header} List trailer
605 A list of header names which will appear after the message body, instead
606 of with the message headers.
607 @example
608 (parse-header 'trailer "ETag")
609 @result{} (etag)
610 @end example
611 @end deftypevr
612
613 @deftypevr {HTTP Header} List transfer-encoding
614 A list of transfer codings, expressed as key-value lists. The only
615 transfer coding defined by the specification is @code{chunked}.
@example
(parse-header 'transfer-encoding "chunked")
@result{} ((chunked))
@end example
620 @end deftypevr
621
622 @deftypevr {HTTP Header} List upgrade
623 A list of strings, indicating additional protocols that a server could use
624 in response to a request.
625 @example
626 (parse-header 'upgrade "WebSocket")
627 @result{} ("WebSocket")
628 @end example
629 @end deftypevr
630
631 FIXME: parse out more fully?
632 @deftypevr {HTTP Header} List via
633 A list of strings, indicating the protocol versions and hosts of
634 intermediate servers and proxies. There may be multiple @code{via}
635 headers in one message.
636 @example
637 (parse-header 'via "1.0 venus, 1.1 mars")
638 @result{} ("1.0 venus" "1.1 mars")
639 @end example
640 @end deftypevr
641
642 @deftypevr {HTTP Header} List warning
A list of warnings given by a server or intermediate proxy. Each
warning is itself a list of four elements: a code, as an exact integer
between 0 and 1000, a host as a string, the warning text as a string,
and either @code{#f} or a SRFI-19 date.

There may be multiple @code{warning} headers in one message.
@example
(parse-header 'warning "123 foo \"core breach imminent\"")
@result{} ((123 "foo" "core breach imminent" #f))
@end example
653 @end deftypevr
654
655
656 @subsubsection Entity Headers
657
658 Entity headers may be present in any HTTP message, and refer to the
659 resource referenced in the HTTP request or response.
660
661 @deftypevr {HTTP Header} List allow
662 A list of allowed methods on a given resource, as symbols.
663 @example
664 (parse-header 'allow "GET, HEAD")
665 @result{} (GET HEAD)
666 @end example
667 @end deftypevr
668
669 @deftypevr {HTTP Header} List content-encoding
670 A list of content codings, as symbols.
@example
(parse-header 'content-encoding "gzip")
@result{} (gzip)
@end example
675 @end deftypevr
676
677 @deftypevr {HTTP Header} List content-language
678 The languages that a resource is in, as strings.
679 @example
680 (parse-header 'content-language "en")
681 @result{} ("en")
682 @end example
683 @end deftypevr
684
685 @deftypevr {HTTP Header} UInt content-length
686 The number of bytes in a resource, as an exact, non-negative integer.
687 @example
688 (parse-header 'content-length "300")
689 @result{} 300
690 @end example
691 @end deftypevr
692
693 @deftypevr {HTTP Header} URI content-location
694 The canonical URI for a resource, in the case that it is also accessible
695 from a different URI.
696 @example
697 (parse-header 'content-location "http://example.com/foo")
698 @result{} #<<uri> ...>
699 @end example
700 @end deftypevr
701
702 @deftypevr {HTTP Header} String content-md5
703 The MD5 digest of a resource.
704 @example
705 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
706 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
707 @end example
708 @end deftypevr
709
710 @deftypevr {HTTP Header} List content-range
A range specification, as a list of three elements: the symbol
@code{bytes}, either the symbol @code{*} or a pair of integers,
indicating the byte range, and either @code{*} or an integer, for the
instance length. Used to indicate that a response only includes part of
a resource.
716 @example
717 (parse-header 'content-range "bytes 10-20/*")
718 @result{} (bytes (10 . 20) *)
719 @end example
720 @end deftypevr
721
722 @deftypevr {HTTP Header} List content-type
723 The MIME type of a resource, as a symbol, along with any parameters.
@example
(parse-header 'content-type "text/plain")
@result{} (text/plain)
(parse-header 'content-type "text/plain;charset=utf-8")
@result{} (text/plain (charset . "utf-8"))
@end example
Note that the @code{charset} parameter is something of a misnomer, and
the HTTP specification admits this. It specifies the @emph{encoding} of
the characters, not the character set.
733 @end deftypevr
734
735 @deftypevr {HTTP Header} Date expires
736 The date/time after which the resource given in a response is considered
737 stale.
738 @example
739 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
740 @result{} #<date ...>
741 @end example
742 @end deftypevr
743
744 @deftypevr {HTTP Header} Date last-modified
745 The date/time on which the resource given in a response was last
746 modified.
@example
(parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
751 @end deftypevr
752
753
754 @subsubsection Request Headers
755
756 Request headers may only appear in an HTTP request, not in a response.
757
758 @deftypevr {HTTP Header} List accept
759 A list of preferred media types for a response. Each element of the
760 list is itself a list, in the same format as @code{content-type}.
761 @example
762 (parse-header 'accept "text/html,text/plain;charset=utf-8")
763 @result{} ((text/html) (text/plain (charset . "utf-8")))
764 @end example
Preference is expressed with quality values:
766 @example
767 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
768 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
769 @end example
770 @end deftypevr
771
772 @deftypevr {HTTP Header} QList accept-charset
773 A quality list of acceptable charsets. Note again that what HTTP calls
774 a ``charset'' is what Guile calls a ``character encoding''.
775 @example
776 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
777 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
778 @end example
779 @end deftypevr
780
781 @deftypevr {HTTP Header} QList accept-encoding
782 A quality list of acceptable content codings.
@example
(parse-header 'accept-encoding "gzip,identity;q=0.8")
@result{} ((1000 . "gzip") (800 . "identity"))
@end example
787 @end deftypevr
788
789 @deftypevr {HTTP Header} QList accept-language
790 A quality list of acceptable languages.
@example
(parse-header 'accept-language "cn,en;q=0.75")
@result{} ((1000 . "cn") (750 . "en"))
@end example
795 @end deftypevr
796
797 @deftypevr {HTTP Header} Pair authorization
798 Authorization credentials. The car of the pair indicates the
799 authentication scheme, like @code{basic}. For basic authentication, the
800 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
801 string. For other authentication schemes, like @code{digest}, the cdr
802 will be a key-value list of credentials.
@example
(parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
@result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
@end example
807 @end deftypevr
808
809 @deftypevr {HTTP Header} List expect
810 A list of expectations that a client has of a server. The expectations
811 are key-value lists.
812 @example
813 (parse-header 'expect "100-continue")
814 @result{} ((100-continue))
815 @end example
816 @end deftypevr
817
818 @deftypevr {HTTP Header} String from
819 The email address of a user making an HTTP request.
820 @example
821 (parse-header 'from "bob@@example.com")
822 @result{} "bob@@example.com"
823 @end example
824 @end deftypevr
825
826 @deftypevr {HTTP Header} Pair host
827 The host for the resource being requested, as a hostname-port pair. If
828 no port is given, the port is @code{#f}.
829 @example
830 (parse-header 'host "gnu.org:80")
831 @result{} ("gnu.org" . 80)
832 (parse-header 'host "gnu.org")
833 @result{} ("gnu.org" . #f)
834 @end example
835 @end deftypevr
836
837 @deftypevr {HTTP Header} *|List if-match
838 A set of etags, indicating that the request should proceed if and only
839 if the etag of the resource is in that set. Either the symbol @code{*},
840 indicating any etag, or a list of entity tags.
@example
(parse-header 'if-match "*")
@result{} *
(parse-header 'if-match "\"asdfadf\"")
@result{} (("asdfadf" . #t))
(parse-header 'if-match "W/\"asdfadf\"")
@result{} (("asdfadf" . #f))
@end example
849 @end deftypevr
850
851 @deftypevr {HTTP Header} Date if-modified-since
852 Indicates that a response should proceed if and only if the resource has
853 been modified since the given date.
@example
(parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
858 @end deftypevr
859
860 @deftypevr {HTTP Header} *|List if-none-match
861 A set of etags, indicating that the request should proceed if and only
862 if the etag of the resource is not in the set. Either the symbol
863 @code{*}, indicating any etag, or a list of entity tags.
864 @example
865 (parse-header 'if-none-match "*")
866 @result{} *
867 @end example
868 @end deftypevr
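
A handler that honors @code{if-match} or @code{if-none-match} has to
compare the parsed value against the entity tag of the resource. The
following is only a sketch under stated assumptions:
@code{etag-listed?} is a hypothetical helper, not part of Guile, and it
compares tag strings only, ignoring the strong/weak flag.

@example
;; Hypothetical helper, not provided by Guile.  PARSED is a parsed
;; if-match or if-none-match value: the symbol *, or a list of
;; (TAG . STRONG?) pairs.  ETAG is the resource's entity tag string.
(define (etag-listed? parsed etag)
  (or (eq? parsed '*)
      (and (assoc etag parsed) #t)))

(etag-listed? '* "asdfadf")
@result{} #t
(etag-listed? '(("asdfadf" . #t)) "asdfadf")
@result{} #t
(etag-listed? '(("asdfadf" . #t)) "other")
@result{} #f
@end example

For @code{if-match} the request would proceed when the helper returns
true; for @code{if-none-match}, when it returns false.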
869
870 @deftypevr {HTTP Header} ETag|Date if-range
871 Indicates that the range request should proceed if and only if the
872 resource matches a modification date or an etag. Either an entity tag,
873 or a SRFI-19 date.
874 @example
875 (parse-header 'if-range "\"original-etag\"")
876 @result{} ("original-etag" . #t)
877 @end example
878 @end deftypevr
879
880 @deftypevr {HTTP Header} Date if-unmodified-since
881 Indicates that a response should proceed if and only if the resource has
882 not been modified since the given date.
@example
(parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
@result{} #<date ...>
@end example
887 @end deftypevr
888
889 @deftypevr {HTTP Header} UInt max-forwards
890 The maximum number of proxy or gateway hops that a request should be
891 subject to.
892 @example
893 (parse-header 'max-forwards "10")
894 @result{} 10
895 @end example
896 @end deftypevr
897
898 @deftypevr {HTTP Header} Pair proxy-authorization
899 Authorization credentials for a proxy connection. See the documentation
900 for @code{authorization} above for more information on the format.
@example
(parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
@result{} (digest (foo . "bar") (baz . "qux"))
@end example
905 @end deftypevr
906
907 @deftypevr {HTTP Header} Pair range
908 A range request, indicating that the client wants only part of a
909 resource. The car of the pair is the symbol @code{bytes}, and the cdr
910 is a list of pairs. Each element of the cdr indicates a range; the car
911 is the first byte position and the cdr is the last byte position, as
912 integers, or @code{#f} if not given.
913 @example
914 (parse-header 'range "bytes=10-30,50-")
915 @result{} (bytes (10 . 30) (50 . #f))
916 @end example
917 @end deftypevr
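
To serve a range, the open-ended positions have to be resolved against
the length of the resource. The sketch below shows one way to do that.
@code{resolve-range} is a hypothetical helper, not part of Guile, and
the treatment of a missing first position as ``the last @var{n} bytes''
is an assumption based on HTTP's suffix ranges rather than something
stated above.

@example
;; Hypothetical helper, not provided by Guile.  RANGE is one
;; (FIRST . LAST) pair from a parsed range header; LEN is the length
;; of the resource in bytes.  Return a pair of concrete positions.
(define (resolve-range range len)
  (let ((start (car range))
        (end (cdr range)))
    (cond
     ;; Assumed suffix range: the last END bytes of the resource.
     ((not start) (cons (max 0 (- len end)) (- len 1)))
     ;; Open-ended range: from START to the final byte.
     ((not end) (cons start (- len 1)))
     ;; Fully specified range, clamped to the resource.
     (else (cons start (min end (- len 1)))))))

(map (lambda (r) (resolve-range r 100))
     (cdr '(bytes (10 . 30) (50 . #f))))
@result{} ((10 . 30) (50 . 99))
@end example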
918
919 @deftypevr {HTTP Header} URI referer
920 The URI of the resource that referred the user to this resource. The
921 name of the header is a misspelling, but we are stuck with it.
922 @example
923 (parse-header 'referer "http://www.gnu.org/")
924 @result{} #<uri ...>
925 @end example
926 @end deftypevr
927
928 @deftypevr {HTTP Header} List te
929 A list of transfer codings, expressed as key-value lists. A common
930 transfer coding is @code{trailers}.
931 @example
932 (parse-header 'te "trailers")
933 @result{} ((trailers))
934 @end example
935 @end deftypevr
936
937 @deftypevr {HTTP Header} String user-agent
938 A string indicating the user agent making the request. The
939 specification defines a structured format for this header, but it is
940 widely disregarded, so Guile does not attempt to parse strictly.
941 @example
942 (parse-header 'user-agent "Mozilla/5.0")
943 @result{} "Mozilla/5.0"
944 @end example
945 @end deftypevr
946
947
948 @subsubsection Response Headers
949
950 @deftypevr {HTTP Header} List accept-ranges
951 A list of range units that the server supports, as symbols.
952 @example
953 (parse-header 'accept-ranges "bytes")
954 @result{} (bytes)
955 @end example
956 @end deftypevr
957
958 @deftypevr {HTTP Header} UInt age
959 The age of a cached response, in seconds.
960 @example
961 (parse-header 'age "3600")
962 @result{} 3600
963 @end example
964 @end deftypevr
965
966 @deftypevr {HTTP Header} ETag etag
967 The entity-tag of the resource.
968 @example
969 (parse-header 'etag "\"foo\"")
970 @result{} ("foo" . #t)
971 @end example
972 @end deftypevr
973
974 @deftypevr {HTTP Header} URI location
975 A URI on which a request may be completed. Used in combination with a
976 redirecting status code to perform client-side redirection.
977 @example
978 (parse-header 'location "http://example.com/other")
979 @result{} #<uri ...>
980 @end example
981 @end deftypevr
982
983 @deftypevr {HTTP Header} List proxy-authenticate
984 A list of challenges to a proxy, indicating the need for authentication.
985 @example
986 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
987 @result{} ((basic (realm . "foo")))
988 @end example
989 @end deftypevr
990
991 @deftypevr {HTTP Header} UInt|Date retry-after
992 Used in combination with a server-busy status code, like 503, to
993 indicate that a client should retry later. Either a number of seconds,
994 or a date.
995 @example
996 (parse-header 'retry-after "60")
997 @result{} 60
998 @end example
999 @end deftypevr
1000
1001 @deftypevr {HTTP Header} String server
1002 A string identifying the server.
1003 @example
1004 (parse-header 'server "My first web server")
1005 @result{} "My first web server"
1006 @end example
1007 @end deftypevr
1008
1009 @deftypevr {HTTP Header} *|List vary
A set of request headers that were used in computing this response.
Used to indicate that server-side content negotiation was performed, for
example in response to the @code{accept-language} header. Can also be
the symbol @code{*}, indicating that all headers were considered.
1014 @example
1015 (parse-header 'vary "Accept-Language, Accept")
1016 @result{} (accept-language accept)
1017 @end example
1018 @end deftypevr
1019
1020 @deftypevr {HTTP Header} List www-authenticate
1021 A list of challenges to a user, indicating the need for authentication.
1022 @example
1023 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1024 @result{} ((basic (realm . "foo")))
1025 @end example
1026 @end deftypevr
1027
1028
1029 @node Requests
1030 @subsection HTTP Requests
1031
1032 @example
1033 (use-modules (web request))
1034 @end example
1035
1036 The request module contains a data type for HTTP requests.
1037
1038 @subsubsection An Important Note on Character Sets
1039
1040 HTTP requests consist of two parts: the request proper, consisting of a
1041 request line and a set of headers, and (optionally) a body. The body
1042 might have a binary content-type, and even in the textual case its
1043 length is specified in bytes, not characters.
1044
Therefore, HTTP is a fundamentally binary protocol. However, the request
line and headers are specified to be in a subset of ASCII, so they can
be treated as text, provided that the port's encoding is set to an
ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
is just such an encoding, and happens to be very efficient for Guile.

So what Guile does when reading requests from the wire, or writing them
out, is to set the port's encoding to latin-1, and treat the request
headers as text.
1054
1055 The request body is another issue. For binary data, the data is
1056 probably in a bytevector, so we use the R6RS binary output procedures to
1057 write out the binary payload. Textual data usually has to be written
1058 out to some character encoding, usually UTF-8, and then the resulting
1059 bytevector is written out to the port.
1060
1061 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1062 any loss of generality.
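
The difference between characters and bytes can be made concrete with a
short sketch. @code{write-text-body} is a hypothetical helper, not part
of Guile; it only uses the standard R6RS procedures
@code{string->utf8} and @code{put-bytevector}, and it writes to an
in-memory bytevector port rather than a socket.

@example
(use-modules (rnrs bytevectors)
             (rnrs io ports))

;; Hypothetical helper, not provided by Guile: encode TEXT as UTF-8
;; and write the bytes to PORT.  Return the number of bytes written,
;; which is the value a content-length header would need.
(define (write-text-body port text)
  (let ((bv (string->utf8 text)))
    (put-bytevector port bv)
    (bytevector-length bv)))

;; "cafe" with an acute accent on the final letter: 4 characters,
;; but 5 bytes once encoded as UTF-8.
(define text (string #\c #\a #\f (integer->char 233)))

(string-length text)
@result{} 4
(call-with-values open-bytevector-output-port
  (lambda (port get-bytes)
    (write-text-body port text)))
@result{} 5
@end example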
1063
1064 @subsubsection Request API
1065
1066 @defun request?
1067 @defunx request-method
1068 @defunx request-uri
1069 @defunx request-version
1070 @defunx request-headers
1071 @defunx request-meta
1072 @defunx request-port
1073 A predicate and field accessors for the request type. The fields are as
1074 follows:
1075 @table @code
1076 @item method
1077 The HTTP method, for example, @code{GET}.
1078 @item uri
1079 The URI as a URI record.
1080 @item version
1081 The HTTP version pair, like @code{(1 . 1)}.
1082 @item headers
1083 The request headers, as an alist of parsed values.
1084 @item meta
1085 An arbitrary alist of other data, for example information returned in
1086 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1087 Communication}).
1088 @item port
1089 The port on which to read or write a request body, if any.
1090 @end table
1091 @end defun
1092
1093 @defun read-request port [meta='()]
1094 Read an HTTP request from @var{port}, optionally attaching the given
1095 metadata, @var{meta}.
1096
1097 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1098 (latin-1), so that reading one character reads one byte. See the
1099 discussion of character sets above, for more information.
1100
1101 Note that the body is not part of the request. Once you have read a
1102 request, you may read the body separately, and likewise for writing
1103 requests.
1104 @end defun
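
As an illustration, a request can be read from an in-memory string port
instead of a socket. This is a sketch only; the request text and the
@code{example.com} host are made up for the example, and the printed
results assume the parsed header formats described in @ref{HTTP
Headers}.

@example
(use-modules (web request))

;; Parse a request from a string port.  There is no body here, only
;; the request line and one header.
(define req
  (call-with-input-string
      "GET /hello HTTP/1.1\r\nHost: example.com\r\n\r\n"
    read-request))

(request-method req)
@result{} GET
(request-version req)
@result{} (1 . 1)
(request-host req)
@result{} ("example.com" . #f)
@end example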
1105
1106 @defun build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1107 Construct an HTTP request object. If @var{validate-headers?} is true,
1108 the headers are each run through their respective validators.
1109 @end defun
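
For instance, the following sketch builds a request by hand and inspects
it. It assumes @code{build-uri} from @code{(web uri)} for constructing
the URI, and the header values follow the parsed formats described in
@ref{HTTP Headers}.

@example
(use-modules (web request)
             (web uri))

(define req
  (build-request (build-uri 'http #:host "www.gnu.org" #:path "/")
                 #:headers '((host . ("www.gnu.org" . #f))
                             (max-forwards . 10))))

(request? req)
@result{} #t
(request-method req)
@result{} GET
(request-max-forwards req)
@result{} 10
@end example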
1110
1111 @defun write-request r port
1112 Write the given HTTP request to @var{port}.
1113
1114 Return a new request, whose @code{request-port} will continue writing
1115 on @var{port}, perhaps using some transfer encoding.
1116 @end defun
1117
@defun read-request-body r
Read the request body from @var{r}, as a bytevector. Return @code{#f}
if there was no request body.
@end defun

@defun write-request-body r bv
Write @var{bv}, a bytevector, to the port corresponding to the HTTP
request @var{r}.
@end defun
1127
1128 The various headers that are typically associated with HTTP requests may
1129 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1130 more information on the format of parsed headers.
1131
1132 @defun request-accept request [default='()]
1133 @defunx request-accept-charset request [default='()]
1134 @defunx request-accept-encoding request [default='()]
1135 @defunx request-accept-language request [default='()]
1136 @defunx request-allow request [default='()]
1137 @defunx request-authorization request [default=#f]
1138 @defunx request-cache-control request [default='()]
1139 @defunx request-connection request [default='()]
1140 @defunx request-content-encoding request [default='()]
1141 @defunx request-content-language request [default='()]
1142 @defunx request-content-length request [default=#f]
1143 @defunx request-content-location request [default=#f]
1144 @defunx request-content-md5 request [default=#f]
1145 @defunx request-content-range request [default=#f]
1146 @defunx request-content-type request [default=#f]
1147 @defunx request-date request [default=#f]
1148 @defunx request-expect request [default='()]
1149 @defunx request-expires request [default=#f]
1150 @defunx request-from request [default=#f]
1151 @defunx request-host request [default=#f]
1152 @defunx request-if-match request [default=#f]
1153 @defunx request-if-modified-since request [default=#f]
1154 @defunx request-if-none-match request [default=#f]
1155 @defunx request-if-range request [default=#f]
1156 @defunx request-if-unmodified-since request [default=#f]
1157 @defunx request-last-modified request [default=#f]
1158 @defunx request-max-forwards request [default=#f]
1159 @defunx request-pragma request [default='()]
1160 @defunx request-proxy-authorization request [default=#f]
1161 @defunx request-range request [default=#f]
1162 @defunx request-referer request [default=#f]
1163 @defunx request-te request [default=#f]
1164 @defunx request-trailer request [default='()]
1165 @defunx request-transfer-encoding request [default='()]
1166 @defunx request-upgrade request [default='()]
1167 @defunx request-user-agent request [default=#f]
1168 @defunx request-via request [default='()]
1169 @defunx request-warning request [default='()]
1170 Return the given request header, or @var{default} if none was present.
1171 @end defun
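
A brief sketch of how the defaults behave, using a hand-built request
that carries only a @code{host} header. The request itself is made up
for the example, and @code{build-uri} from @code{(web uri)} is assumed
for constructing the URI.

@example
(use-modules (web request)
             (web uri))

(define req
  (build-request (build-uri 'http #:host "www.gnu.org" #:path "/")
                 #:headers '((host . ("www.gnu.org" . #f)))))

(request-host req)
@result{} ("www.gnu.org" . #f)
(request-accept req)             ; absent, list-valued header
@result{} ()
(request-content-length req)     ; absent, default is #f
@result{} #f
(request-content-length req 0)   ; absent, explicit default
@result{} 0
@end example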
1172
1173 @defun request-absolute-uri r [default-host=#f] [default-port=#f]
1174 A helper routine to determine the absolute URI of a request, using the
1175 @code{host} header and the default host and port.
1176 @end defun
1177
1178
1179 @node Responses
1180 @subsection HTTP Responses
1181
1182 @example
1183 (use-modules (web response))
1184 @end example
1185
As with requests (@pxref{Requests}), Guile offers a data type for HTTP
responses. Again, the body is represented separately from the response.
1188
1189 @defun response?
1190 @defunx response-version
1191 @defunx response-code
1192 @defunx response-reason-phrase response
1193 @defunx response-headers
1194 @defunx response-port
1195 A predicate and field accessors for the response type. The fields are as
1196 follows:
1197 @table @code
1198 @item version
1199 The HTTP version pair, like @code{(1 . 1)}.
1200 @item code
1201 The HTTP response code, like @code{200}.
1202 @item reason-phrase
1203 The reason phrase, or the standard reason phrase for the response's
1204 code.
1205 @item headers
1206 The response headers, as an alist of parsed values.
1207 @item port
1208 The port on which to read or write a response body, if any.
1209 @end table
1210 @end defun
1211
@defun read-response port
Read an HTTP response from @var{port}.

As a side effect, sets the encoding on @var{port} to ISO-8859-1
(latin-1), so that reading one character reads one byte. See the
discussion of character sets in @ref{Requests}, for more information.
@end defun

@defun build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
Construct an HTTP response object. If @var{validate-headers?} is true,
the headers are each run through their respective validators.
@end defun
1224
1225 @defun adapt-response-version response version
1226 Adapt the given response to a different HTTP version. Return a new HTTP
1227 response.
1228
1229 The idea is that many applications might just build a response for the
1230 default HTTP version, and this method could handle a number of
1231 programmatic transformations to respond to older HTTP versions (0.9 and
1232 1.0). But currently this function is a bit heavy-handed, just updating
1233 the version field.
1234 @end defun
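
A short sketch of building a response and adapting its version. The
variable name @code{not-found-response} is made up for the example; the
reason phrase shown is the standard one for a 404 code.

@example
(use-modules (web response))

(define not-found-response (build-response #:code 404))

(response-code not-found-response)
@result{} 404
(response-reason-phrase not-found-response)
@result{} "Not Found"

;; Adapting returns a new response; the original is unchanged.
(response-version
 (adapt-response-version not-found-response '(1 . 0)))
@result{} (1 . 0)
(response-version not-found-response)
@result{} (1 . 1)
@end example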
1235
1236 @defun write-response r port
1237 Write the given HTTP response to @var{port}.
1238
1239 Return a new response, whose @code{response-port} will continue writing
1240 on @var{port}, perhaps using some transfer encoding.
1241 @end defun
1242
@defun read-response-body r
Read the response body from @var{r}, as a bytevector. Return @code{#f}
if there was no response body.
@end defun

@defun write-response-body r bv
Write @var{bv}, a bytevector, to the port corresponding to the HTTP
response @var{r}.
@end defun
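
Putting these together, the following sketch writes a response and its
body to an in-memory bytevector port and returns the bytes as a string.
It is an illustration under assumptions rather than a recommended
pattern: the name @code{wire} and the five-byte body are made up, and a
real server would write to a client socket instead.

@example
(use-modules (web response)
             (rnrs bytevectors)
             (rnrs io ports))

(define wire
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      (let* ((body (string->utf8 "Hello"))
             (r (build-response
                 #:headers (list (cons 'content-length
                                       (bytevector-length body)))))
             ;; write-response returns a response bound to PORT.
             (r (write-response r port)))
        (write-response-body r body)
        ;; Status line, headers, blank line, then the body bytes.
        (utf8->string (get-bytes))))))
@end example

Note that the body is written with its own call, after the response
proper, which mirrors the separation between a response and its body.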
1252
1253 As with requests, the various headers that are typically associated with
1254 HTTP responses may be accessed with these dedicated accessors.
1255 @xref{HTTP Headers}, for more information on the format of parsed
1256 headers.
1257
1258 @defun response-accept-ranges response [default=#f]
1259 @defunx response-age response [default='()]
1260 @defunx response-allow response [default='()]
1261 @defunx response-cache-control response [default='()]
1262 @defunx response-connection response [default='()]
1263 @defunx response-content-encoding response [default='()]
1264 @defunx response-content-language response [default='()]
1265 @defunx response-content-length response [default=#f]
1266 @defunx response-content-location response [default=#f]
1267 @defunx response-content-md5 response [default=#f]
1268 @defunx response-content-range response [default=#f]
1269 @defunx response-content-type response [default=#f]
1270 @defunx response-date response [default=#f]
1271 @defunx response-etag response [default=#f]
1272 @defunx response-expires response [default=#f]
1273 @defunx response-last-modified response [default=#f]
1274 @defunx response-location response [default=#f]
1275 @defunx response-pragma response [default='()]
1276 @defunx response-proxy-authenticate response [default=#f]
1277 @defunx response-retry-after response [default=#f]
1278 @defunx response-server response [default=#f]
1279 @defunx response-trailer response [default='()]
1280 @defunx response-transfer-encoding response [default='()]
1281 @defunx response-upgrade response [default='()]
1282 @defunx response-vary response [default='()]
1283 @defunx response-via response [default='()]
1284 @defunx response-warning response [default='()]
1285 @defunx response-www-authenticate response [default=#f]
1286 Return the given response header, or @var{default} if none was present.
1287 @end defun
1288
1289
1290 @node Web Server
1291 @subsection Web Server
1292
1293 @code{(web server)} is a generic web server interface, along with a main
1294 loop implementation for web servers controlled by Guile.
1295
1296 @example
1297 (use-modules (web server))
1298 @end example
1299
1300 The lowest layer is the @code{<server-impl>} object, which defines a set
1301 of hooks to open a server, read a request from a client, write a
1302 response to a client, and close a server. These hooks -- @code{open},
1303 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1304 together in a @code{<server-impl>} object. Procedures in this module take a
1305 @code{<server-impl>} object, if needed.
1306
A @code{<server-impl>} may also be looked up by name. If you pass the
@code{http} symbol to @code{run-server}, Guile looks for a variable
named @code{http} in the @code{(web server http)} module, which should
be bound to a @code{<server-impl>} object. Such a binding is made by
instantiation of the @code{define-server-impl} syntax. In this way the
@code{run-server} loop can automatically load other backends if available.
1313
1314 The life cycle of a server goes as follows:
1315
1316 @enumerate
1317 @item
1318 The @code{open} hook is called, to open the server. @code{open} takes 0 or
1319 more arguments, depending on the backend, and returns an opaque
1320 server socket object, or signals an error.
1321
1322 @item
1323 The @code{read} hook is called, to read a request from a new client.
1324 The @code{read} hook takes one argument, the server socket. It should
1325 return three values: an opaque client socket, the request, and the
1326 request body. The request should be a @code{<request>} object, from
1327 @code{(web request)}. The body should be a string or a bytevector, or
1328 @code{#f} if there is no body.
1329
If the read failed, the @code{read} hook may return @code{#f} for the
client socket, request, and body.
1332
1333 @item
A user-provided handler procedure is called, with the request
and body as its arguments. The handler should return two
values: the response, as a @code{<response>} record from @code{(web
response)}, and the response body as a string, bytevector, or
@code{#f} if not present. We also allow the response to be simply an
alist of headers, in which case a default response object is
constructed with those headers.
1341
1342 @item
1343 The @code{write} hook is called with three arguments: the client
1344 socket, the response, and the body. The @code{write} hook returns no
1345 values.
1346
1347 @item
At this point the request handling is complete. In a server loop, we
go back and try to read a new request.
1350
1351 @item
1352 If the user interrupts the loop, the @code{close} hook is called on
1353 the server socket.
1354 @end enumerate
1355
1356 A user may define a server implementation with the following form:
1357
1358 @defun define-server-impl name open read write close
1359 Make a @code{<server-impl>} object with the hooks @var{open},
1360 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1361 @var{name} in the current module.
1362 @end defun
1363
1364 @defun lookup-server-impl impl
1365 Look up a server implementation. If @var{impl} is a server
1366 implementation already, it is returned directly. If it is a symbol, the
1367 binding named @var{impl} in the @code{(web server @var{impl})} module is
1368 looked up. Otherwise an error is signaled.
1369
1370 Currently a server implementation is a somewhat opaque type, useful only
1371 for passing to other procedures in this module, like @code{read-client}.
1372 @end defun
1373
The @code{(web server)} module defines a number of routines that use
@code{<server-impl>} objects to implement parts of a web server. Because
the accessors for the various fields of a @code{<server-impl>} are not
exposed, these routines are the only procedures with any access to the
impl objects.
1379
1380 @defun open-server impl open-params
1381 Open a server for the given implementation. Returns one value, the new
1382 server object. The implementation's @code{open} procedure is applied to
1383 @var{open-params}, which should be a list.
1384 @end defun
1385
1386 @defun read-client impl server
1387 Read a new client from @var{server}, by applying the implementation's
1388 @code{read} procedure to the server. If successful, returns three
1389 values: an object corresponding to the client, a request object, and the
1390 request body. If any exception occurs, returns @code{#f} for all three
1391 values.
1392 @end defun
1393
1394 @defun handle-request handler request body state
1395 Handle a given request, returning the response and body.
1396
1397 The response and response body are produced by calling the given
1398 @var{handler} with @var{request} and @var{body} as arguments.
1399
1400 The elements of @var{state} are also passed to @var{handler} as
1401 arguments, and may be returned as additional values. The new
1402 @var{state}, collected from the @var{handler}'s return values, is then
1403 returned as a list. The idea is that a server loop receives a handler
1404 from the user, along with whatever state values the user is interested
1405 in, allowing the user's handler to explicitly manage its state.
1406 @end defun
1407
1408 @defun sanitize-response request response body
``Sanitize'' the given response and body, making them appropriate for the
given request.
1411
1412 As a convenience to web handler authors, @var{response} may be given as
1413 an alist of headers, in which case it is used to construct a default
1414 response. Ensures that the response version corresponds to the request
1415 version. If @var{body} is a string, encodes the string to a bytevector,
1416 in an encoding appropriate for @var{response}. Adds a
1417 @code{content-length} and @code{content-type} header, as necessary.
1418
If @var{body} is a procedure, it is called with a port as an argument,
and the output collected as a bytevector. In the future we might try to
instead use a compressing, chunk-encoded port, and call this procedure
later, in the @code{write-client} procedure. Authors are advised not to
rely on the procedure being called at any particular time.
1424 @end defun
1425
1426 @defun write-client impl server client response body
1427 Write an HTTP response and body to @var{client}. If the server and
1428 client support persistent connections, it is the implementation's
1429 responsibility to keep track of the client thereafter, presumably by
1430 attaching it to the @var{server} argument somehow.
1431 @end defun
1432
1433 @defun close-server impl server
1434 Release resources allocated by a previous invocation of
1435 @code{open-server}.
1436 @end defun
1437
1438 Given the procedures above, it is a small matter to make a web server:
1439
1440 @defun serve-one-client handler impl server state
1441 Read one request from @var{server}, call @var{handler} on the request
1442 and body, and write the response to the client. Returns the new state
1443 produced by the handler procedure.
1444 @end defun
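
As a rough illustration of how the procedures above fit together, here
is a sketch of a server loop built on @code{serve-one-client}. It is not
Guile's actual @code{run-server}: the name @code{run-server-sketch} is
hypothetical, the arguments are all required here, and the use of
@code{dynamic-wind} to close the server on exit is an assumption made
for the example.

@example
(use-modules (web server))

;; Hypothetical sketch, not Guile's implementation.  IMPL is a
;; <server-impl> or a symbol such as 'http; OPEN-PARAMS is a list;
;; STATE is the list of extra values threaded through HANDLER.
(define (run-server-sketch handler impl open-params . state)
  (let* ((impl (lookup-server-impl impl))
         (server (open-server impl open-params)))
    (dynamic-wind
      (lambda () #t)
      (lambda ()
        ;; Serve clients one at a time, feeding each new state
        ;; into the next iteration.
        (let loop ((state state))
          (loop (serve-one-client handler impl server state))))
      (lambda ()
        ;; Runs when the loop is left, for example on interrupt.
        (close-server impl server)))))
@end example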
1445
1446 @defun run-server handler [impl] [open-params] . state
1447 Run Guile's built-in web server.
1448
1449 @var{handler} should be a procedure that takes two or more arguments,
1450 the HTTP request and request body, and returns two or more values, the
1451 response and response body.
1452
For example, here is a simple ``Hello, World!'' server:

@example
(define (handler request body)
  (values '((content-type . ("text/plain")))
          "Hello, World!"))
(run-server handler)
@end example
1461
1462 The response and body will be run through @code{sanitize-response}
1463 before sending back to the client.
1464
1465 Additional arguments to @var{handler} are taken from @var{state}.
1466 Additional return values are accumulated into a new @var{state}, which
1467 will be used for subsequent requests. In this way a handler can
1468 explicitly manage its state.
1469
The default server implementation is @code{http}, which accepts
@var{open-params} like @code{(#:port 8081)}, among others.
1473 @end defun
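
To make the state-passing convention concrete, here is a sketch of a
handler that counts the requests it has served. The name
@code{counting-handler} is made up for the example, and the
@code{run-server} call is left commented out because it would start a
server and block.

@example
;; COUNT is the single piece of state.  It arrives as an extra
;; argument and the updated value is returned as an extra value.
(define (counting-handler request body count)
  (values '((content-type . ("text/plain")))
          (string-append "Request number "
                         (number->string count)
                         "\n")
          (+ count 1)))

;; Start with a count of 1, using the default http implementation
;; and no open parameters:
;; (run-server counting-handler 'http '() 1)
@end example

The handler never touches a global variable; the server loop is what
carries the count from one request to the next.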
1474
1475
1476 @node Web Examples
1477 @subsection Web Examples
1478
1479 Well, enough about the tedious internals. Let's make a web application!
1480
1481 @subsubsection Hello, World!
1482
1483 The first program we have to write, of course, is ``Hello, World!''.
1484 This means that we have to implement a web handler that does what we
1485 want.
1486
1487 Now we define a handler, a function of two arguments and two return
1488 values:
1489
@example
(define (handler request request-body)
  (values @var{response} @var{response-body}))
@end example
1494
1495 In this first example, we take advantage of a short-cut, returning an
1496 alist of headers instead of a proper response object. The response body
1497 is our payload:
1498
@example
(define (hello-world-handler request request-body)
  (values '((content-type . ("text/plain")))
          "Hello World!"))
@end example
1504
1505 Now let's test it, by running a server with this handler. Load up the
1506 web server module if you haven't yet done so, and run a server with this
1507 handler:
1508
1509 @example
1510 (use-modules (web server))
1511 (run-server hello-world-handler)
1512 @end example
1513
1514 By default, the web server listens for requests on
1515 @code{localhost:8080}. Visit that address in your web browser to
1516 test. If you see the string, @code{Hello World!}, sweet!
1517
1518 @subsubsection Inspecting the Request
1519
1520 The Hello World program above is a general greeter, responding to all
1521 URIs. To make a more exclusive greeter, we need to inspect the request
1522 object, and conditionally produce different results. So let's load up
1523 the request, response, and URI modules, and do just that.
1524
@example
(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . ("text/plain")))
              "Hello hacker!")
      (not-found request)))

(run-server hello-hacker-handler)
@end example
1543
1544 Here we see that we have defined a helper to return the components of
1545 the URI path as a list of strings, and used that to check for a request
1546 to @code{/hacker/}. Then the success case is just as before -- visit
1547 @code{http://localhost:8080/hacker/} in your browser to check.
1548
1549 You should always match against URI path components as decoded by
1550 @code{split-and-decode-uri-path}. The above example will work for
1551 @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
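
The decoding can be checked directly at the REPL. This small sketch
simply applies @code{split-and-decode-uri-path} to the three paths
mentioned above:

@example
(use-modules (web uri))

(split-and-decode-uri-path "/hacker/")
@result{} ("hacker")
(split-and-decode-uri-path "//hacker///")
@result{} ("hacker")
(split-and-decode-uri-path "/h%61ck%65r")
@result{} ("hacker")
@end example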
1552
1553 But we forgot to define @code{not-found}! If you are pasting these
1554 examples into a REPL, accessing any other URI in your web browser will
1555 drop your Guile console into the debugger:
1556
1557 @example
1558 <unnamed port>:38:7: In procedure module-lookup:
1559 <unnamed port>:38:7: Unbound variable: not-found
1560
1561 Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
1562 scheme@@(guile-user) [1]>
1563 @end example
1564
1565 So let's define the function, right there in the debugger. As you
1566 probably know, we'll want to return a 404 response.
1567
@example
;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (unparse-uri (request-uri request)))))

;; Now paste this to let the web server keep going:
,continue
@end example
1578
Now if you access @code{http://localhost:8080/foo/}, you get this error
message. (Note that some popular web browsers won't show
server-generated 404 messages, showing their own instead, unless the 404
message body is long enough.)
1583
1584 @subsubsection Higher-Level Interfaces
1585
1586 The web handler interface is a common baseline that all kinds of Guile
1587 web applications can use. You will usually want to build something on
1588 top of it, however, especially when producing HTML. Here is a simple
1589 example that builds up HTML output using SXML (@pxref{sxml simple}).
1590
1591 First, load up the modules:
1592
@example
(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))
@end example
1599
1600 Now we define a simple templating function that takes a list of HTML
1601 body elements, as SXML, and puts them in our super template:
1602
@example
(define (templatize title body)
  `(html (head (title ,title))
         (body ,@@body)))
@end example
1608
1609 For example, the simplest Hello HTML can be produced like this:
1610
1611 @example
1612 (sxml->xml (templatize "Hello!" '((b "Hi!"))))
1613 @print{}
1614 <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
1615 @end example
1616
1617 Much better to work with Scheme data types than to work with HTML as
1618 strings. Now we define a little response helper:
1619
@example
(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '(("charset" . "utf-8")))
                  (content-type "text/html")
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@@content-type-params))
                       ,@@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))
@end example
1640
1641 Here we see the power of keyword arguments with default initializers. By
1642 the time the arguments are fully parsed, the @code{sxml} local variable
1643 will hold the templated SXML, ready for sending out to the client.
1644
1645 Instead of returning the body as a string, here we give a procedure,
1646 which will be called by the web server to write out the response to the
1647 client.
1648
1649 Now, a simple example using this responder, which lays out the incoming
1650 headers in an HTML table.
1651
@example
(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))

(run-server debug-page)
@end example
1668
Now if you visit any local address in your web browser, you actually see
some HTML, finally.
1671
1672 @subsubsection Conclusion
1673
1674 Well, this is about as far as Guile's built-in web support goes, for
1675 now. There are many ways to make a web application, but hopefully by
1676 standardizing the most fundamental data types, users will be able to
1677 choose the approach that suits them best, while also being able to
1678 switch between implementations of the server. This is a relatively new
1679 part of Guile, so if you have feedback, let us know, and we can take it
1680 into account. Happy hacking on the web!
1681
1682 @c Local Variables:
1683 @c TeX-master: "guile.texi"
1684 @c End: