2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
7 @section @acronym{HTTP}, the Web, and All That
12 It has always been possible to connect computers together and share
13 information between them, but the rise of the World-Wide Web over the
14 last couple of decades has made it much easier to do so. The result is
15 a richly connected network of computation, in which Guile forms a part.
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
message components that can be sent and received by that protocol.
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
highest-level perspective, @pxref{Web Examples}, and work your way back.
36 * Types and the Web:: Types prevent bugs and security problems.
* URIs:: Uniform Resource Identifiers.
38 * HTTP:: The Hyper-Text Transfer Protocol.
39 * HTTP Headers:: How Guile represents specific header values.
40 * Transfer Codings:: HTTP Transfer Codings.
41 * Requests:: HTTP requests.
42 * Responses:: HTTP responses.
43 * Web Client:: Accessing web resources over HTTP.
44 * Web Server:: Serving HTTP to the internet.
45 * Web Examples:: How to use this thing.
48 @node Types and the Web
49 @subsection Types and the Web
51 It is a truth universally acknowledged, that a program with good use of
52 data types, will be free from many common bugs. Unfortunately, the
53 common practice in web programming seems to ignore this maxim. This
54 subsection makes the case for expressive data types in web programming.
56 By ``expressive data types'', we mean that the data types @emph{say}
57 something about how a program solves a problem. For example, if we
58 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
59 this indicates that there is a part of the program that will always have
60 valid dates. Error handling for a number of basic cases, like invalid
61 dates, occurs on the boundary in which we produce a SRFI 19 date record
62 from other types, like strings.
With regard to the web, data types are helpful in the two broad phases
65 of HTTP messages: parsing and generation.
67 Consider a server, which has to parse a request, and produce a response.
68 Guile will parse the request into an HTTP request object
69 (@pxref{Requests}), with each header parsed into an appropriate Scheme
70 data type. This transition from an incoming stream of characters to
71 typed data is a state change in a program---the strings might parse, or
72 they might not, and something has to happen if they do not. (Guile
73 throws an error in this case.) But after you have the parsed request,
74 ``client'' code (code built on top of the Guile web framework) will not
have to check for syntactic validity. The types already make this
information manifest.
78 This state change on the parsing boundary makes programs more robust,
79 as they themselves are freed from the need to do a number of common
80 error checks, and they can use normal Scheme procedures to handle a
81 request instead of ad-hoc string parsers.
83 The need for types on the response generation side (in a server) is more
84 subtle, though not less important. Consider the example of a POST
85 handler, which prints out the text that a user submits from a form.
86 Such a handler might include a procedure like this:
;; First, a helper procedure
(define (para . contents)
  (string-append "<p>" (string-concatenate contents) "</p>"))

;; Now the meat of our simple web application
(define (you-said text)
  (para "You said: " text))
97 (display (you-said "Hi!"))
98 @print{} <p>You said: Hi!</p>
101 This is a perfectly valid implementation, provided that the incoming
102 text does not contain the special HTML characters @samp{<}, @samp{>}, or
103 @samp{&}. But this provision of a restricted character set is not
104 reflected anywhere in the program itself: we must @emph{assume} that the
105 programmer understands this, and performs the check elsewhere.
107 Unfortunately, the short history of the practice of programming does not
108 bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
109 vulnerability is just such a common error in which unfiltered user input
110 is allowed into the output. A user could submit a crafted comment to
111 your web site which results in visitors running malicious Javascript,
112 within the security context of your domain:
115 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
116 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
119 The fundamental problem here is that both user data and the program
template are represented using strings. This identity means that types
can't help the programmer to make a distinction between these two, so
the two are easily confused.
124 There are a number of possible solutions, but perhaps the best is to
125 treat HTML not as strings, but as native s-expressions: as SXML. The
126 basic idea is that HTML is either text, represented by a string, or an
127 element, represented as a tagged list. So @samp{foo} becomes
128 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
129 Attributes, if present, go in a tagged list headed by @samp{@@}, like
130 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml
131 simple}, for more information.
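
As a small illustration of the mapping just described, the following
sketch serializes a text node, an element, and an element with an
attribute. It assumes only @code{sxml->xml} from @code{(sxml simple)},
which writes to the current output port when no port is given.

@example
(use-modules (sxml simple))

;; A bare string is text, so special characters are escaped.
(sxml->xml "a < b")
@print{} a &lt; b

;; A tagged list is an element.
(sxml->xml '(b "foo"))
@print{} <b>foo</b>

;; Attributes live in a sublist headed by @@.
(sxml->xml '(img (@@ (src "http://example.com/foo.png"))))
@print{} <img src="http://example.com/foo.png" />
@end example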
133 The good thing about SXML is that HTML elements cannot be confused with
134 text. Let's make a new definition of @code{para}:
(define (para . contents)
  `(p ,@@contents))
140 (use-modules (sxml simple))
141 (sxml->xml (you-said "Hi!"))
142 @print{} <p>You said: Hi!</p>
144 (sxml->xml (you-said "<i>Rats, foiled again!</i>"))
@print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
148 So we see in the second example that HTML elements cannot be unwittingly
149 introduced into the output. However it is now perfectly acceptable to
150 pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
151 over everything-as-a-string.
154 (sxml->xml (you-said (you-said "<Hi!>")))
@print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
158 The SXML types allow procedures to @emph{compose}. The types make
159 manifest which parts are HTML elements, and which are text. So you
160 needn't worry about escaping user input; the type transition back to a
string handles that for you. @acronym{XSS} vulnerabilities are a thing
of the past.
164 Well. That's all very nice and opinionated and such, but how do I use
@subsection Uniform Resource Identifiers
Guile provides a standard data type for Uniform Resource Identifiers
171 (URIs), as defined in RFC 3986.
173 The generic URI syntax is as follows:
176 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
177 [ "?" query ] [ "#" fragment ]
180 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
181 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
182 @code{/help/}, and there is no userinfo, port, query, or fragment. All
183 URIs have a scheme and a path (though the path might be empty). Some
184 URIs have a host, and some of those have ports and userinfo. Any URI
185 might have a query part or a fragment.
187 Userinfo is something of an abstraction, as some legacy URI schemes
188 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
189 since passwords do not belong in URIs, the RFC does not want to condone
this practice, so it calls anything before the @code{@@} sign
@dfn{userinfo}.
193 Properly speaking, a fragment is not part of a URI. For example, when a
194 web browser follows a link to @indicateurl{http://example.com/#foo}, it
195 sends a request for @indicateurl{http://example.com/}, then looks in the
resulting page for the fragment identified by @code{foo}. A
197 fragment identifies a part of a resource, not the resource itself. But
198 it is useful to have a fragment field in the URI record itself, so we
199 hope you will forgive the inconsistency.
202 (use-modules (web uri))
205 The following procedures can be found in the @code{(web uri)}
module. Load it into your Guile, using a form like the above, to have
access to them.
209 @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
210 [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
211 [#:fragment=@code{#f}] [#:validate?=@code{#t}]
212 Construct a URI object. @var{scheme} should be a symbol, and the rest
213 of the fields are either strings or @code{#f}. If @var{validate?} is
true, also run some consistency checks to make sure that the constructed
URI is valid.
218 @deffn {Scheme Procedure} uri? x
219 @deffnx {Scheme Procedure} uri-scheme uri
220 @deffnx {Scheme Procedure} uri-userinfo uri
221 @deffnx {Scheme Procedure} uri-host uri
222 @deffnx {Scheme Procedure} uri-port uri
223 @deffnx {Scheme Procedure} uri-path uri
224 @deffnx {Scheme Procedure} uri-query uri
225 @deffnx {Scheme Procedure} uri-fragment uri
226 A predicate and field accessors for the URI record type. The URI scheme
will be a symbol, and the rest either strings or @code{#f} if not
present.
231 @deffn {Scheme Procedure} string->uri string
Parse @var{string} into a URI object. Return @code{#f} if the string
could not be parsed.
236 @deffn {Scheme Procedure} uri->string uri
237 Serialize @var{uri} to a string. If the URI has a port that is the
default port for its scheme, the port is not included in the
serialization.
242 @deffn {Scheme Procedure} declare-default-port! scheme port
243 Declare a default port for the given URI scheme.
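
To tie these procedures together, here is a brief sketch that parses the
URI from the example earlier in this subsection, inspects its fields, and
then builds the same URI from parts. The explicit port 80 is included
only to show that a default port is left out of the serialization.

@example
(use-modules (web uri))

(define uri (string->uri "http://www.gnu.org/help/"))

(uri-scheme uri)
@result{} http
(uri-host uri)
@result{} "www.gnu.org"
(uri-path uri)
@result{} "/help/"
(uri-query uri)
@result{} #f

;; 80 is the default port for http, so it is not written out.
(uri->string (build-uri 'http #:host "www.gnu.org" #:port 80
                        #:path "/help/"))
@result{} "http://www.gnu.org/help/"

;; Input without a scheme does not parse; the result is #f, not an error.
(string->uri "not a uri")
@result{} #f
@end example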
246 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
247 Percent-decode the given @var{str}, according to @var{encoding}, which
248 should be the name of a character encoding.
250 Note that this function should not generally be applied to a full URI
string. For paths, use @code{split-and-decode-uri-path} instead. For query
252 strings, split the query on @code{&} and @code{=} boundaries, and decode
253 the components separately.
255 Note also that percent-encoded strings encode @emph{bytes}, not
256 characters. There is no guarantee that a given byte sequence is a valid
257 string encoding. Therefore this routine may signal an error if the
258 decoded bytes are not valid for the given encoding. Pass @code{#f} for
259 @var{encoding} if you want decoded bytes as a bytevector directly.
260 @xref{Ports, @code{set-port-encoding!}}, for more information on
263 Returns a string of the decoded characters, or a bytevector if
264 @var{encoding} was @code{#f}.
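
The advice above about query strings can be sketched as follows. The
helper @code{decode-query} is hypothetical and not part of
@code{(web uri)}; it splits first and decodes afterwards, so that an
encoded @samp{&} or @samp{=} inside a value is not mistaken for a
delimiter.

@example
(use-modules (web uri))

;; Hypothetical helper: turn "k1=v1&k2=v2" into an alist of decoded
;; strings.  A key without "=" is paired with #f.
(define (decode-query query)
  (map (lambda (pair)
         (let ((idx (string-index pair #\=)))
           (if idx
               (cons (uri-decode (substring pair 0 idx))
                     (uri-decode (substring pair (+ idx 1))))
               (cons (uri-decode pair) #f))))
       (string-split query #\&)))

(decode-query "q=fish%26chips&lang=en")
@result{} (("q" . "fish&chips") ("lang" . "en"))
@end example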
267 Fixme: clarify return type. indicate default values. type of
270 @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
271 Percent-encode any character not in the character set,
272 @var{unescaped-chars}.
274 The default character set includes alphanumerics from ASCII, as well as
275 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
276 other character will be percent-encoded, by writing out the character to
277 a bytevector within the given @var{encoding}, then encoding each byte as
278 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
282 @deffn {Scheme Procedure} split-and-decode-uri-path path
283 Split @var{path} into its components, and decode each component,
284 removing empty components.
286 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
287 @code{("foo" "bar baz")}.
290 @deffn {Scheme Procedure} encode-and-join-uri-path parts
291 URI-encode each element of @var{parts}, which should be a list of
292 strings, and join the parts together with @code{/} as a delimiter.
294 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
295 as @code{"scrambled%20eggs/biscuits%26gravy"}.
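
As a quick check, the two path procedures round-trip. This sketch reuses
the values from the examples above; the variable @code{parts} is only
for illustration.

@example
(use-modules (web uri))

(define parts '("scrambled eggs" "biscuits&gravy"))

(encode-and-join-uri-path parts)
@result{} "scrambled%20eggs/biscuits%26gravy"

;; Splitting and decoding the joined path gives back the original list.
(equal? parts
        (split-and-decode-uri-path (encode-and-join-uri-path parts)))
@result{} #t
@end example

Note that the joined string has no leading @samp{/}; prepend one if you
need an absolute path.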
299 @subsection The Hyper-Text Transfer Protocol
301 The initial motivation for including web functionality in Guile, rather
302 than rely on an external package, was to establish a standard base on
303 which people can share code. To that end, we continue the focus on data
304 types by providing a number of low-level parsers and unparsers for
305 elements of the HTTP protocol.
If you want to skip the low-level details for now and move on to web
308 pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
309 HTTP module, and read on.
312 (use-modules (web http))
315 The focus of the @code{(web http)} module is to parse and unparse
316 standard HTTP headers, representing them to Guile as native data
317 structures. For example, a @code{Date:} header will be represented as a
318 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
320 Guile tries to follow RFCs fairly strictly---the road to perdition being
321 paved with compatibility hacks---though some allowances are made for
322 not-too-divergent texts.
324 Header names are represented as lower-case symbols.
326 @deffn {Scheme Procedure} string->header name
327 Parse @var{name} to a symbolic header name.
330 @deffn {Scheme Procedure} header->string sym
331 Return the string form for the header named @var{sym}.
337 (string->header "Content-Length")
338 @result{} content-length
339 (header->string 'content-length)
340 @result{} "Content-Length"
(string->header "FOO")
@result{} foo
(header->string 'foo)
@result{} "Foo"
348 Guile keeps a registry of known headers, their string names, and some
349 parsing and serialization procedures. If a header is unknown, its
350 string name is simply its symbol name in title-case.
352 @deffn {Scheme Procedure} known-header? sym
353 Return @code{#t} iff @var{sym} is a known header, with associated
354 parsers and serialization procedures.
357 @deffn {Scheme Procedure} header-parser sym
358 Return the value parser for headers named @var{sym}. The result is a
359 procedure that takes one argument, a string, and returns the parsed
360 value. If the header isn't known to Guile, a default parser is returned
361 that passes through the string unchanged.
364 @deffn {Scheme Procedure} header-validator sym
365 Return a predicate which returns @code{#t} if the given value is valid
366 for headers named @var{sym}. The default validator for unknown headers
370 @deffn {Scheme Procedure} header-writer sym
371 Return a procedure that writes values for headers named @var{sym} to a
372 port. The resulting procedure takes two arguments: a value and a port.
373 The default writer is @code{display}.
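
The registry procedures above can be exercised directly. The following
is a minimal sketch using @code{content-length}, which Guile knows about
out of the box; the header name @code{x-frobnicate} is made up to stand
in for an unknown header.

@example
(use-modules (web http))

(known-header? 'content-length)
@result{} #t

;; Fetch the parser, validator, and writer for Content-Length.
(define parse (header-parser 'content-length))
(define valid? (header-validator 'content-length))
(define write-value (header-writer 'content-length))

(parse "300")
@result{} 300
(valid? 300)
@result{} #t
(valid? "300")
@result{} #f
(write-value 300 (current-output-port))
@print{} 300

;; An unknown header falls back to the pass-through parser.
((header-parser 'x-frobnicate) "anything")
@result{} "anything"
@end example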
376 For more on the set of headers that Guile knows about out of the box,
377 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
380 @deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
381 Declare a parser, validator, and writer for a given header.
384 For example, let's say you are running a web server behind some sort of
385 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
386 the IPv4 address of the original client. You would like for the HTTP
387 request record to parse out this header to a Scheme value, instead of
388 leaving it as a string. You could register this header with Guile's
389 HTTP stack like this:
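(declare-header! "X-Client-Address"
  ;; Parser; inet-aton is assumed here as the inverse of the writer.
  (lambda (str) (inet-aton str))
  (lambda (ip) (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port) (display (inet-ntoa ip) port)))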
401 @deffn {Scheme Procedure} valid-header? sym val
402 Return a true value iff @var{val} is a valid Scheme value for the header
406 Now that we have a generic interface for reading and writing headers, we
409 @deffn {Scheme Procedure} read-header port
410 Read one HTTP header from @var{port}. Return two values: the header
411 name and the parsed Scheme value. May raise an exception if the header
412 was known but the value was invalid.
Returns the end-of-file object for both values if the end of the headers
was reached (i.e., a blank line).
418 @deffn {Scheme Procedure} parse-header name val
419 Parse @var{val}, a string, with the parser for the header named
420 @var{name}. Returns the parsed value.
423 @deffn {Scheme Procedure} write-header name val port
424 Write the given header name and value to @var{port}, using the writer
425 from @code{header-writer}.
428 @deffn {Scheme Procedure} read-headers port
429 Read the headers of an HTTP message from @var{port}, returning the
430 headers as an ordered alist.
433 @deffn {Scheme Procedure} write-headers headers port
434 Write the given header alist to @var{port}. Doesn't write the final
435 @samp{\r\n}, as the user might want to add another header.
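
Here is a short sketch of reading a header block from a string port and
writing it back out. The sample text in @code{raw} is made up for the
example; in a real program the port would usually be a socket.

@example
(use-modules (web http))

;; Two headers followed by the blank line that ends the header block.
(define raw "Content-Length: 300\r\nHost: gnu.org\r\n\r\n")

(define headers (call-with-input-string raw read-headers))

headers
@result{} ((content-length . 300) (host "gnu.org" . #f))

(assq-ref headers 'content-length)
@result{} 300

;; Write the alist back out; the caller supplies the final blank line.
(call-with-output-string
  (lambda (port)
    (write-headers headers port)
    (display "\r\n" port)))
@result{} "Content-Length: 300\r\nHost: gnu.org\r\n\r\n"
@end example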
438 The @code{(web http)} module also has some utility procedures to read
439 and write request and response lines.
441 @deffn {Scheme Procedure} parse-http-method str [start] [end]
Parse an HTTP method from @var{str}. The result is an upper-case symbol,
like @code{GET}.
446 @deffn {Scheme Procedure} parse-http-version str [start] [end]
447 Parse an HTTP version from @var{str}, returning it as a major-minor
pair. For example, @code{HTTP/1.1} parses as the pair of integers,
@code{(1 . 1)}.
452 @deffn {Scheme Procedure} parse-request-uri str [start] [end]
453 Parse a URI from an HTTP request line. Note that URIs in requests do not
454 have to have a scheme or host name. The result is a URI object.
457 @deffn {Scheme Procedure} read-request-line port
458 Read the first line of an HTTP request from @var{port}, returning three
459 values: the method, the URI, and the version.
462 @deffn {Scheme Procedure} write-request-line method uri version port
463 Write the first line of an HTTP request to @var{port}.
466 @deffn {Scheme Procedure} read-response-line port
467 Read the first line of an HTTP response from @var{port}, returning three
values: the HTTP version, the response code, and the ``reason phrase''.
471 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
472 Write the first line of an HTTP response to @var{port}.
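
The line-level procedures can be tried out on string ports. The sketch
below uses a made-up request line; only the path of the parsed URI is
inspected, since a request URI need not carry a scheme or host.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")
@result{} GET
(parse-http-version "HTTP/1.1")
@result{} (1 . 1)

;; Read a request line from a string port.
(define in (open-input-string "GET /index.html HTTP/1.1\r\n"))

(call-with-values
    (lambda () (read-request-line in))
  (lambda (method uri version)
    (list method (uri-path uri) version)))
@result{} (GET "/index.html" (1 . 1))

;; Write a response line to a string port.
(call-with-output-string
  (lambda (port)
    (write-response-line '(1 . 1) 200 "OK" port)))
@result{} "HTTP/1.1 200 OK\r\n"
@end example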
477 @subsection HTTP Headers
479 In addition to defining the infrastructure to parse headers, the
480 @code{(web http)} module defines specific parsers and unparsers for all
481 headers defined in the HTTP/1.1 standard.
483 For example, if you receive a header named @samp{Accept-Language} with a
484 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
488 (parse-header 'accept-language "en, es;q=0.8")
489 @result{} ((1000 . "en") (800 . "es"))
492 The format of the value for @samp{Accept-Language} headers is defined
493 below, along with all other headers defined in the HTTP standard. (If
494 the header were unknown, the value would have been returned as a
497 For brevity, the header definitions below are given in the form,
498 @var{Type} @code{@var{name}}, indicating that values for the header
499 @code{@var{name}} will be of the given @var{Type}. Since Guile
500 internally treats header names in lower case, in this document we give
types title-cased names. A short description of each header's
502 purpose and an example follow.
504 For full details on the meanings of all of these headers, see the HTTP
505 1.1 standard, RFC 2616.
507 @subsubsection HTTP Header Types
509 Here we define the types that are used below, when defining headers.
511 @deftp {HTTP Header Type} Date
515 @deftp {HTTP Header Type} KVList
516 A list whose elements are keys or key-value pairs. Keys are parsed to
517 symbols. Values are strings by default. Non-string values are the
518 exception, and are mentioned explicitly below, as appropriate.
521 @deftp {HTTP Header Type} SList
525 @deftp {HTTP Header Type} Quality
526 An exact integer between 0 and 1000. Qualities are used to express
527 preference, given multiple options. An option with a quality of 870,
528 for example, is preferred over an option with quality 500.
530 (Qualities are written out over the wire as numbers between 0.0 and
531 1.0, but since the standard only allows three digits after the decimal,
532 it's equivalent to integers between 0 and 1000, so that's what Guile
536 @deftp {HTTP Header Type} QList
537 A quality list: a list of pairs, the car of which is a quality, and the
538 cdr a string. Used to express a list of options, along with their
542 @deftp {HTTP Header Type} ETag
543 An entity tag, represented as a pair. The car of the pair is an opaque
544 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
545 tag, and @code{#f} otherwise.
548 @subsubsection General Headers
550 General HTTP headers may be present in any HTTP message.
552 @deftypevr {HTTP Header} KVList cache-control
553 A key-value list of cache-control directives. See RFC 2616, for more
556 If present, parameters to @code{max-age}, @code{max-stale},
557 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
560 If present, parameters to @code{private} and @code{no-cache} are parsed
561 as lists of header names, as symbols.
(parse-header 'cache-control "no-cache,no-store")
@result{} (no-cache no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
@result{} ((no-cache . (authorization date)) no-store)
(parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
@result{} ((no-cache . (authorization date)) (max-age . 10))
573 @deftypevr {HTTP Header} List connection
574 A list of header names that apply only to this HTTP connection, as
575 symbols. Additionally, the symbol @samp{close} may be present, to
indicate that the server should close the connection after responding to
the request.
579 (parse-header 'connection "close")
584 @deftypevr {HTTP Header} Date date
585 The date that a given HTTP message was originated.
587 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
588 @result{} #<date ...>
592 @deftypevr {HTTP Header} KVList pragma
593 A key-value list of implementation-specific directives.
595 (parse-header 'pragma "no-cache, broccoli=tasty")
596 @result{} (no-cache (broccoli . "tasty"))
600 @deftypevr {HTTP Header} List trailer
601 A list of header names which will appear after the message body, instead
602 of with the message headers.
604 (parse-header 'trailer "ETag")
609 @deftypevr {HTTP Header} List transfer-encoding
610 A list of transfer codings, expressed as key-value lists. The only
611 transfer coding defined by the specification is @code{chunked}.
613 (parse-header 'transfer-encoding "chunked")
614 @result{} ((chunked))
618 @deftypevr {HTTP Header} List upgrade
619 A list of strings, indicating additional protocols that a server could use
620 in response to a request.
622 (parse-header 'upgrade "WebSocket")
623 @result{} ("WebSocket")
627 FIXME: parse out more fully?
628 @deftypevr {HTTP Header} List via
629 A list of strings, indicating the protocol versions and hosts of
630 intermediate servers and proxies. There may be multiple @code{via}
631 headers in one message.
633 (parse-header 'via "1.0 venus, 1.1 mars")
634 @result{} ("1.0 venus" "1.1 mars")
638 @deftypevr {HTTP Header} List warning
639 A list of warnings given by a server or intermediate proxy. Each
warning is itself a list of four elements: a code, as an exact integer
641 between 0 and 1000, a host as a string, the warning text as a string,
642 and either @code{#f} or a SRFI-19 date.
644 There may be multiple @code{warning} headers in one message.
646 (parse-header 'warning "123 foo \"core breach imminent\"")
@result{} ((123 "foo" "core breach imminent" #f))
652 @subsubsection Entity Headers
654 Entity headers may be present in any HTTP message, and refer to the
655 resource referenced in the HTTP request or response.
657 @deftypevr {HTTP Header} List allow
658 A list of allowed methods on a given resource, as symbols.
660 (parse-header 'allow "GET, HEAD")
665 @deftypevr {HTTP Header} List content-encoding
666 A list of content codings, as symbols.
668 (parse-header 'content-encoding "gzip")
673 @deftypevr {HTTP Header} List content-language
674 The languages that a resource is in, as strings.
676 (parse-header 'content-language "en")
681 @deftypevr {HTTP Header} UInt content-length
682 The number of bytes in a resource, as an exact, non-negative integer.
684 (parse-header 'content-length "300")
689 @deftypevr {HTTP Header} URI content-location
690 The canonical URI for a resource, in the case that it is also accessible
691 from a different URI.
693 (parse-header 'content-location "http://example.com/foo")
694 @result{} #<<uri> ...>
698 @deftypevr {HTTP Header} String content-md5
699 The MD5 digest of a resource.
701 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
702 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
706 @deftypevr {HTTP Header} List content-range
707 A range specification, as a list of three elements: the symbol
708 @code{bytes}, either the symbol @code{*} or a pair of integers,
indicating the byte range, and either @code{*} or an integer, for the
710 instance length. Used to indicate that a response only includes part of
713 (parse-header 'content-range "bytes 10-20/*")
714 @result{} (bytes (10 . 20) *)
718 @deftypevr {HTTP Header} List content-type
719 The MIME type of a resource, as a symbol, along with any parameters.
(parse-header 'content-type "text/plain")
@result{} (text/plain)
(parse-header 'content-type "text/plain;charset=utf-8")
@result{} (text/plain (charset . "utf-8"))
Note that the @code{charset} parameter is something of a misnomer, and
727 the HTTP specification admits this. It specifies the @emph{encoding} of
728 the characters, not the character set.
731 @deftypevr {HTTP Header} Date expires
732 The date/time after which the resource given in a response is considered
735 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
736 @result{} #<date ...>
740 @deftypevr {HTTP Header} Date last-modified
741 The date/time on which the resource given in a response was last
(parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
745 @result{} #<date ...>
750 @subsubsection Request Headers
752 Request headers may only appear in an HTTP request, not in a response.
754 @deftypevr {HTTP Header} List accept
755 A list of preferred media types for a response. Each element of the
756 list is itself a list, in the same format as @code{content-type}.
758 (parse-header 'accept "text/html,text/plain;charset=utf-8")
759 @result{} ((text/html) (text/plain (charset . "utf-8")))
761 Preference is expressed with quality values:
763 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
764 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
768 @deftypevr {HTTP Header} QList accept-charset
769 A quality list of acceptable charsets. Note again that what HTTP calls
770 a ``charset'' is what Guile calls a ``character encoding''.
772 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
773 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
777 @deftypevr {HTTP Header} QList accept-encoding
778 A quality list of acceptable content codings.
(parse-header 'accept-encoding "gzip,identity;q=0.8")
781 @result{} ((1000 . "gzip") (800 . "identity"))
785 @deftypevr {HTTP Header} QList accept-language
786 A quality list of acceptable languages.
(parse-header 'accept-language "cn,en;q=0.75")
789 @result{} ((1000 . "cn") (750 . "en"))
793 @deftypevr {HTTP Header} Pair authorization
794 Authorization credentials. The car of the pair indicates the
795 authentication scheme, like @code{basic}. For basic authentication, the
796 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
797 string. For other authentication schemes, like @code{digest}, the cdr
798 will be a key-value list of credentials.
(parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
801 @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
805 @deftypevr {HTTP Header} List expect
806 A list of expectations that a client has of a server. The expectations
809 (parse-header 'expect "100-continue")
810 @result{} ((100-continue))
814 @deftypevr {HTTP Header} String from
815 The email address of a user making an HTTP request.
817 (parse-header 'from "bob@@example.com")
818 @result{} "bob@@example.com"
822 @deftypevr {HTTP Header} Pair host
823 The host for the resource being requested, as a hostname-port pair. If
824 no port is given, the port is @code{#f}.
826 (parse-header 'host "gnu.org:80")
827 @result{} ("gnu.org" . 80)
828 (parse-header 'host "gnu.org")
829 @result{} ("gnu.org" . #f)
833 @deftypevr {HTTP Header} *|List if-match
834 A set of etags, indicating that the request should proceed if and only
835 if the etag of the resource is in that set. Either the symbol @code{*},
836 indicating any etag, or a list of entity tags.
838 (parse-header 'if-match "*")
(parse-header 'if-match "\"asdfadf\"")
@result{} (("asdfadf" . #t))
(parse-header 'if-match "W/\"asdfadf\"")
@result{} (("asdfadf" . #f))
847 @deftypevr {HTTP Header} Date if-modified-since
848 Indicates that a response should proceed if and only if the resource has
849 been modified since the given date.
851 (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
852 @result{} #<date ...>
856 @deftypevr {HTTP Header} *|List if-none-match
857 A set of etags, indicating that the request should proceed if and only
858 if the etag of the resource is not in the set. Either the symbol
859 @code{*}, indicating any etag, or a list of entity tags.
861 (parse-header 'if-none-match "*")
866 @deftypevr {HTTP Header} ETag|Date if-range
867 Indicates that the range request should proceed if and only if the
resource matches a modification date or an etag. Either an entity tag,
or a date.
871 (parse-header 'if-range "\"original-etag\"")
872 @result{} ("original-etag" . #t)
876 @deftypevr {HTTP Header} Date if-unmodified-since
877 Indicates that a response should proceed if and only if the resource has
878 not been modified since the given date.
(parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
881 @result{} #<date ...>
885 @deftypevr {HTTP Header} UInt max-forwards
886 The maximum number of proxy or gateway hops that a request should be
889 (parse-header 'max-forwards "10")
894 @deftypevr {HTTP Header} Pair proxy-authorization
895 Authorization credentials for a proxy connection. See the documentation
896 for @code{authorization} above for more information on the format.
(parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
899 @result{} (digest (foo . "bar") (baz . "qux"))
903 @deftypevr {HTTP Header} Pair range
904 A range request, indicating that the client wants only part of a
905 resource. The car of the pair is the symbol @code{bytes}, and the cdr
906 is a list of pairs. Each element of the cdr indicates a range; the car
907 is the first byte position and the cdr is the last byte position, as
908 integers, or @code{#f} if not given.
910 (parse-header 'range "bytes=10-30,50-")
911 @result{} (bytes (10 . 30) (50 . #f))
915 @deftypevr {HTTP Header} URI referer
916 The URI of the resource that referred the user to this resource. The
917 name of the header is a misspelling, but we are stuck with it.
919 (parse-header 'referer "http://www.gnu.org/")
924 @deftypevr {HTTP Header} List te
925 A list of transfer codings, expressed as key-value lists. A common
926 transfer coding is @code{trailers}.
928 (parse-header 'te "trailers")
929 @result{} ((trailers))
933 @deftypevr {HTTP Header} String user-agent
934 A string indicating the user agent making the request. The
935 specification defines a structured format for this header, but it is
936 widely disregarded, so Guile does not attempt to parse strictly.
938 (parse-header 'user-agent "Mozilla/5.0")
939 @result{} "Mozilla/5.0"
944 @subsubsection Response Headers
946 @deftypevr {HTTP Header} List accept-ranges
947 A list of range units that the server supports, as symbols.
949 (parse-header 'accept-ranges "bytes")
954 @deftypevr {HTTP Header} UInt age
955 The age of a cached response, in seconds.
957 (parse-header 'age "3600")
962 @deftypevr {HTTP Header} ETag etag
963 The entity-tag of the resource.
965 (parse-header 'etag "\"foo\"")
966 @result{} ("foo" . #t)
970 @deftypevr {HTTP Header} URI location
971 A URI on which a request may be completed. Used in combination with a
972 redirecting status code to perform client-side redirection.
974 (parse-header 'location "http://example.com/other")
979 @deftypevr {HTTP Header} List proxy-authenticate
980 A list of challenges to a proxy, indicating the need for authentication.
982 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
983 @result{} ((basic (realm . "foo")))
987 @deftypevr {HTTP Header} UInt|Date retry-after
988 Used in combination with a server-busy status code, like 503, to
989 indicate that a client should retry later. Either a number of seconds,
992 (parse-header 'retry-after "60")
997 @deftypevr {HTTP Header} String server
998 A string identifying the server.
1000 (parse-header 'server "My first web server")
1001 @result{} "My first web server"
1005 @deftypevr {HTTP Header} *|List vary
1006 A set of request headers that were used in computing this response.
1007 Used to indicate that server-side content negotiation was performed, for
1008 example in response to the @code{accept-language} header. Can also be
1009 the symbol @code{*}, indicating that all headers were considered.
1011 (parse-header 'vary "Accept-Language, Accept")
1012 @result{} (accept-language accept)
1016 @deftypevr {HTTP Header} List www-authenticate
1017 A list of challenges to a user, indicating the need for authentication.
1019 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1020 @result{} ((basic (realm . "foo")))
1024 @node Transfer Codings
1025 @subsection Transfer Codings
1027 HTTP 1.1 allows for various transfer codings to be applied to message
1028 bodies. These include various types of compression, and HTTP chunked
1029 encoding. Currently, only chunked encoding is supported by Guile.
1031 Chunked coding is an optional coding that may be applied to message
1032 bodies, to allow messages whose length is not known beforehand to be
1033 returned. Such messages can be split into chunks, terminated by a final
1036 To make dealing with encodings simpler, Guile provides
1037 procedures to create ports that ``wrap'' existing ports, applying
1038 transformations transparently under the hood.
1040 @deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f]
1041 Returns a new port that transparently reads and decodes chunk-encoded
1042 data from @var{port}. If no more chunk-encoded data is available, it
1043 returns the end-of-file object. When the port is closed, @var{port} will
1044 also be closed, unless @var{keep-alive?} is true.
1048 (use-modules (ice-9 rdelim))
1050 (define s "5\r\nFirst\r\nA\r\n line\nSeco\r\n7\r\nnd line\r\n0\r\n")
1051 (define p (make-chunked-input-port (open-input-string s)))
1053 @result{} "First line"
1055 @result{} "Second line"
1058 @deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f]
1059 Returns a new port, which transparently encodes data as chunk-encoded
1060 before writing it to @var{port}. Whenever a write occurs on this port,
1061 it buffers it, until the port is flushed, at which point it writes a
1062 chunk containing all the data written so far. When the port is closed,
1063 the data remaining is written to @var{port}, as is the terminating zero
1064 chunk. It also causes @var{port} to be closed, unless @var{keep-alive?}
1067 Note. Forcing a chunked output port when no data is buffered
1068 does not write a zero chunk, as this would cause the data to be
1069 interpreted incorrectly by the client.
1073 (call-with-output-string
1075 (define out* (make-chunked-output-port out #:keep-alive? #t))
1076 (display "first chunk" out*)
1078 (force-output out*) ; note this does not write a zero chunk
1079 (display "second chunk" out*)
1081 @result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n"
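The two kinds of port can be combined into a small round trip, which may be a handy way to convince yourself that the encoding is transparent. This is only a sketch: the payload is made up, and the exact bytes of the encoded string are deliberately not shown.

@example
(use-modules (web http) (ice-9 rdelim))

;; Encode a short message as chunks, flushing once in the middle so
;; that it is split over two chunks.
(define encoded
  (call-with-output-string
    (lambda (out)
      (let ((chunked (make-chunked-output-port out #:keep-alive? #t)))
        (display "hello, " chunked)
        (force-output chunked)
        (display "world" chunked)
        (close-port chunked)))))

;; Decoding the chunks gives back the original text.
(read-line (make-chunked-input-port (open-input-string encoded)))
@result{} "hello, world"
@end example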
1085 @subsection HTTP Requests
1088 (use-modules (web request))
1091 The request module contains a data type for HTTP requests.
1093 @subsubsection An Important Note on Character Sets
1095 HTTP requests consist of two parts: the request proper, consisting of a
1096 request line and a set of headers, and (optionally) a body. The body
1097 might have a binary content-type, and even in the textual case its
1098 length is specified in bytes, not characters.
1100 Therefore, HTTP is a fundamentally binary protocol. However, the request
1101 line and headers are specified to be in a subset of ASCII, so they can
1102 be treated as text, provided that the port's encoding is set to an
1103 ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
1104 is just such an encoding, and happens to be very efficient for Guile.
1106 So what Guile does when reading requests from the wire, or writing them
1107 out, is to set the port's encoding to latin-1, and treat the request
1110 The request body is another issue. For binary data, the data is
1111 probably in a bytevector, so we use the R6RS binary output procedures to
1112 write out the binary payload. Textual data usually has to be written
1113 out to some character encoding, usually UTF-8, and then the resulting
1114 bytevector is written out to the port.
1116 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1117 any loss of generality.
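To see why body lengths are counted in bytes rather than characters, consider this small sketch. The sample text is our own; it is built from a code point so that the example itself stays in ASCII.

@example
(use-modules (rnrs bytevectors))

;; Four characters: c, a, f, and U+00E9 (e with an acute accent).
(define text (string #\c #\a #\f (integer->char #xe9)))

;; The body as it would travel over the wire, encoded as UTF-8.
(define body (string->utf8 text))

(string-length text)
@result{} 4
(bytevector-length body)
@result{} 5
@end example

A @code{content-length} header for this body would therefore have to say 5, not 4.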
1119 @subsubsection Request API
1121 @deffn {Scheme Procedure} request?
1122 @deffnx {Scheme Procedure} request-method
1123 @deffnx {Scheme Procedure} request-uri
1124 @deffnx {Scheme Procedure} request-version
1125 @deffnx {Scheme Procedure} request-headers
1126 @deffnx {Scheme Procedure} request-meta
1127 @deffnx {Scheme Procedure} request-port
1128 A predicate and field accessors for the request type. The fields are as
1132 The HTTP method, for example, @code{GET}.
1134 The URI as a URI record.
1136 The HTTP version pair, like @code{(1 . 1)}.
1138 The request headers, as an alist of parsed values.
1140 An arbitrary alist of other data, for example information returned in
1141 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1144 The port on which to read or write a request body, if any.
1148 @deffn {Scheme Procedure} read-request port [meta='()]
1149 Read an HTTP request from @var{port}, optionally attaching the given
1150 metadata, @var{meta}.
1152 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1153 (latin-1), so that reading one character reads one byte. See the
1154 discussion of character sets above, for more information.
1156 Note that the body is not part of the request. Once you have read a
1157 request, you may read the body separately, and likewise for writing
1161 @deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1162 Construct an HTTP request object. If @var{validate-headers?} is true,
1163 the headers are each run through their respective validators.
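For instance, a request might be built and inspected as follows. This is a minimal sketch; the URI and the user agent string are invented for illustration.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((user-agent . "example/0.1"))))

(request-method req)
@result{} GET
(request-version req)
@result{} (1 . 1)
(request-user-agent req)
@result{} "example/0.1"
;; Headers that are absent yield the accessor's default.
(request-referer req)
@result{} #f
@end example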
1166 @deffn {Scheme Procedure} write-request r port
1167 Write the given HTTP request to @var{port}.
1169 Return a new request, whose @code{request-port} will continue writing
1170 on @var{port}, perhaps using some transfer encoding.
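One way to experiment with this without a network connection is to write a request to a string port and read it back. The following is a sketch under that assumption; the URI is illustrative, and the exact text written to the port is not shown because it may differ between Guile versions.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")))

;; Serialize the request line and headers to a string.
(define wire
  (call-with-output-string
    (lambda (port) (write-request req port))))

;; Parse them back into a new request record.
(define parsed (read-request (open-input-string wire)))

(request-method parsed)
@result{} GET
(uri-path (request-uri parsed))
@result{} "/software/guile/"
@end example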
1173 @deffn {Scheme Procedure} read-request-body r
1174 Read the request body from @var{r}, as a bytevector. Return @code{#f}
1175 if there was no request body.
1178 @deffn {Scheme Procedure} write-request-body r bv
1179 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1183 The various headers that are typically associated with HTTP requests may
1184 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1185 more information on the format of parsed headers.
1187 @deffn {Scheme Procedure} request-accept request [default='()]
1188 @deffnx {Scheme Procedure} request-accept-charset request [default='()]
1189 @deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1190 @deffnx {Scheme Procedure} request-accept-language request [default='()]
1191 @deffnx {Scheme Procedure} request-allow request [default='()]
1192 @deffnx {Scheme Procedure} request-authorization request [default=#f]
1193 @deffnx {Scheme Procedure} request-cache-control request [default='()]
1194 @deffnx {Scheme Procedure} request-connection request [default='()]
1195 @deffnx {Scheme Procedure} request-content-encoding request [default='()]
1196 @deffnx {Scheme Procedure} request-content-language request [default='()]
1197 @deffnx {Scheme Procedure} request-content-length request [default=#f]
1198 @deffnx {Scheme Procedure} request-content-location request [default=#f]
1199 @deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1200 @deffnx {Scheme Procedure} request-content-range request [default=#f]
1201 @deffnx {Scheme Procedure} request-content-type request [default=#f]
1202 @deffnx {Scheme Procedure} request-date request [default=#f]
1203 @deffnx {Scheme Procedure} request-expect request [default='()]
1204 @deffnx {Scheme Procedure} request-expires request [default=#f]
1205 @deffnx {Scheme Procedure} request-from request [default=#f]
1206 @deffnx {Scheme Procedure} request-host request [default=#f]
1207 @deffnx {Scheme Procedure} request-if-match request [default=#f]
1208 @deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1209 @deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1210 @deffnx {Scheme Procedure} request-if-range request [default=#f]
1211 @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1212 @deffnx {Scheme Procedure} request-last-modified request [default=#f]
1213 @deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1214 @deffnx {Scheme Procedure} request-pragma request [default='()]
1215 @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1216 @deffnx {Scheme Procedure} request-range request [default=#f]
1217 @deffnx {Scheme Procedure} request-referer request [default=#f]
1218 @deffnx {Scheme Procedure} request-te request [default=#f]
1219 @deffnx {Scheme Procedure} request-trailer request [default='()]
1220 @deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1221 @deffnx {Scheme Procedure} request-upgrade request [default='()]
1222 @deffnx {Scheme Procedure} request-user-agent request [default=#f]
1223 @deffnx {Scheme Procedure} request-via request [default='()]
1224 @deffnx {Scheme Procedure} request-warning request [default='()]
1225 Return the given request header, or @var{default} if none was present.
1228 @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1229 A helper routine to determine the absolute URI of a request, using the
1230 @code{host} header and the default host and port.
1235 @subsection HTTP Responses
1238 (use-modules (web response))
1241 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
1242 responses. Again, the body is represented separately from the response.
1244 @deffn {Scheme Procedure} response?
1245 @deffnx {Scheme Procedure} response-version
1246 @deffnx {Scheme Procedure} response-code
1247 @deffnx {Scheme Procedure} response-reason-phrase response
1248 @deffnx {Scheme Procedure} response-headers
1249 @deffnx {Scheme Procedure} response-port
1250 A predicate and field accessors for the response type. The fields are as
1254 The HTTP version pair, like @code{(1 . 1)}.
1256 The HTTP response code, like @code{200}.
1258 The reason phrase, or the standard reason phrase for the response's
1261 The response headers, as an alist of parsed values.
1263 The port on which to read or write a response body, if any.
1267 @deffn {Scheme Procedure} read-response port
1268 Read an HTTP response from @var{port}.
1270 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1271 (latin-1), so that reading one character reads one byte. See the
1272 discussion of character sets in @ref{Requests}, for more information.
1275 @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1276 Construct an HTTP response object. If @var{validate-headers?} is true,
1277 the headers are each run through their respective validators.
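As a small sketch, here is a response built with an explicit status code and one header, and then taken apart again with the accessors. The header value is only an example.

@example
(use-modules (web response))

(define resp
  (build-response
   #:code 404
   #:headers '((content-type . (text/plain (charset . "utf-8"))))))

(response-code resp)
@result{} 404
;; No reason phrase was given, so the standard one is used.
(response-reason-phrase resp)
@result{} "Not Found"
(response-content-type resp)
@result{} (text/plain (charset . "utf-8"))
@end example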
1280 @deffn {Scheme Procedure} adapt-response-version response version
1281 Adapt the given response to a different HTTP version. Return a new HTTP
1284 The idea is that many applications might just build a response for the
1285 default HTTP version, and this method could handle a number of
1286 programmatic transformations to respond to older HTTP versions (0.9 and
1287 1.0). But currently this function is a bit heavy-handed, just updating
1291 @deffn {Scheme Procedure} write-response r port
1292 Write the given HTTP response to @var{port}.
1294 Return a new response, whose @code{response-port} will continue writing
1295 on @var{port}, perhaps using some transfer encoding.
1298 @deffn {Scheme Procedure} response-must-not-include-body? r
1299 Some responses, like those with status code 304, are specified as never
1300 having bodies. This predicate returns @code{#t} for those responses.
1302 Note also, though, that responses to @code{HEAD} requests must also not
1306 @deffn {Scheme Procedure} read-response-body r
1307 Read the response body from @var{r}, as a bytevector. Return @code{#f}
1308 if there was no response body.
1311 @deffn {Scheme Procedure} write-response-body r bv
1312 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1316 As with requests, the various headers that are typically associated with
1317 HTTP responses may be accessed with these dedicated accessors.
1318 @xref{HTTP Headers}, for more information on the format of parsed
1321 @deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1322 @deffnx {Scheme Procedure} response-age response [default='()]
1323 @deffnx {Scheme Procedure} response-allow response [default='()]
1324 @deffnx {Scheme Procedure} response-cache-control response [default='()]
1325 @deffnx {Scheme Procedure} response-connection response [default='()]
1326 @deffnx {Scheme Procedure} response-content-encoding response [default='()]
1327 @deffnx {Scheme Procedure} response-content-language response [default='()]
1328 @deffnx {Scheme Procedure} response-content-length response [default=#f]
1329 @deffnx {Scheme Procedure} response-content-location response [default=#f]
1330 @deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1331 @deffnx {Scheme Procedure} response-content-range response [default=#f]
1332 @deffnx {Scheme Procedure} response-content-type response [default=#f]
1333 @deffnx {Scheme Procedure} response-date response [default=#f]
1334 @deffnx {Scheme Procedure} response-etag response [default=#f]
1335 @deffnx {Scheme Procedure} response-expires response [default=#f]
1336 @deffnx {Scheme Procedure} response-last-modified response [default=#f]
1337 @deffnx {Scheme Procedure} response-location response [default=#f]
1338 @deffnx {Scheme Procedure} response-pragma response [default='()]
1339 @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1340 @deffnx {Scheme Procedure} response-retry-after response [default=#f]
1341 @deffnx {Scheme Procedure} response-server response [default=#f]
1342 @deffnx {Scheme Procedure} response-trailer response [default='()]
1343 @deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1344 @deffnx {Scheme Procedure} response-upgrade response [default='()]
1345 @deffnx {Scheme Procedure} response-vary response [default='()]
1346 @deffnx {Scheme Procedure} response-via response [default='()]
1347 @deffnx {Scheme Procedure} response-warning response [default='()]
1348 @deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
1349 Return the given response header, or @var{default} if none was present.
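The following sketch ties several of these procedures together: it writes a response to a string port, reads it back, and queries the result. The entity tag is made up, and string ports stand in for a real socket.

@example
(use-modules (web response))

;; Serialize a bodiless 304 response carrying an entity tag.
(define wire
  (call-with-output-string
    (lambda (port)
      (write-response
       (build-response #:code 304
                       #:headers '((etag . ("foo" . #t))))
       port))))

(define parsed (read-response (open-input-string wire)))

(response-code parsed)
@result{} 304
(response-etag parsed)
@result{} ("foo" . #t)
(response-must-not-include-body? parsed)
@result{} #t
(response-content-length parsed)
@result{} #f
@end example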
1354 @subsection Web Client
1356 @code{(web client)} provides a simple, synchronous HTTP client, built on
1357 the lower-level HTTP, request, and response modules.
1359 @deffn {Scheme Procedure} open-socket-for-uri uri
1362 @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1363 Connect to the server corresponding to @var{uri} and ask for the
1364 resource, using the @code{GET} method. If you already have a port open,
1365 pass it as @var{port}. The port will be closed at the end of the
1366 request unless @var{keep-alive?} is true. Any extra headers in the
1367 alist @var{extra-headers} will be added to the request.
1369 If @var{decode-body?} is true, as is the default, the body of the
1370 response will be decoded to a string, if it is a textual content-type.
1371 Otherwise it will be returned as a bytevector.
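A typical call might look like the sketch below, which assumes that @code{http-get} returns two values, the response and its body. The helper name @code{fetch-status} and the URL are ours, not part of the module, and the final call is left commented out because it needs network access.

@example
(use-modules (web client) (web uri) (web response))

;; Hypothetical helper: fetch URL-STRING, report the size of a
;; textual body, and return the numeric status code.
(define (fetch-status url-string)
  (call-with-values
      (lambda () (http-get (string->uri url-string)))
    (lambda (response body)
      (when (string? body)
        (format #t "~a characters of text~%" (string-length body)))
      (response-code response))))

;; (fetch-status "http://www.gnu.org/")
@end example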
1374 @code{http-get} is useful for making one-off requests to web sites. If
1375 you are writing a web spider or some other client that needs to handle a
1376 number of requests in parallel, it's better to build an event-driven URL
1377 fetcher, similar in structure to the web server (@pxref{Web Server}).
1379 Another option, good but not as performant, would be to use threads,
1380 possibly via par-map or futures.
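A threaded variant can be very short. In this sketch, @code{fetch-all} is a hypothetical helper that takes the fetching procedure as an argument, so that it can be tried out with a harmless stand-in before being pointed at the network. It assumes a Guile built with thread support.

@example
(use-modules (ice-9 threads))

;; Hypothetical helper: apply FETCH to every element of URIS in
;; parallel, returning the results in the same order.
(define (fetch-all fetch uris)
  (par-map fetch uris))

;; Tried out with string-length standing in for a real fetcher:
(fetch-all string-length '("a" "bb" "ccc"))
@result{} (1 2 3)
@end example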
1382 More helper procedures for the other common HTTP verbs would be a good
1383 addition to this module. Send your code to
1384 @email{guile-user@@gnu.org}.
1388 @subsection Web Server
1390 @code{(web server)} is a generic web server interface, along with a main
1391 loop implementation for web servers controlled by Guile.
1394 (use-modules (web server))
1397 The lowest layer is the @code{<server-impl>} object, which defines a set
1398 of hooks to open a server, read a request from a client, write a
1399 response to a client, and close a server. These hooks -- @code{open},
1400 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1401 together in a @code{<server-impl>} object. Procedures in this module take a
1402 @code{<server-impl>} object, if needed.
1404 A @code{<server-impl>} may also be looked up by name. If you pass the
1405 @code{http} symbol to @code{run-server}, Guile looks for a variable
1406 named @code{http} in the @code{(web server http)} module, which should
1407 be bound to a @code{<server-impl>} object. Such a binding is made by
1408 instantiation of the @code{define-server-impl} syntax. In this way the
1409 run-server loop can automatically load other backends if available.
1411 The life cycle of a server goes as follows:
1415 The @code{open} hook is called, to open the server. @code{open} takes 0 or
1416 more arguments, depending on the backend, and returns an opaque
1417 server socket object, or signals an error.
1420 The @code{read} hook is called, to read a request from a new client.
1421 The @code{read} hook takes one argument, the server socket. It should
1422 return three values: an opaque client socket, the request, and the
1423 request body. The request should be a @code{<request>} object, from
1424 @code{(web request)}. The body should be a string or a bytevector, or
1425 @code{#f} if there is no body.
1427 If the read failed, the @code{read} hook may return @code{#f} for the client
1428 socket, request, and body.
1431 A user-provided handler procedure is called, with the request and body
1432 as its arguments. The handler should return two values: the response,
1433 as a @code{<response>} record from @code{(web response)}, and the
1434 response body as a bytevector, or @code{#f} if not present.
1436 The response and response body are run through @code{sanitize-response},
1437 documented below. This allows the handler writer to take some
1438 convenient shortcuts: for example, instead of a @code{<response>}, the
1439 handler can simply return an alist of headers, in which case a default
1440 response object is constructed with those headers. Instead of a
1441 bytevector for the body, the handler can return a string, which will be
1442 serialized into an appropriate encoding; or it can return a procedure,
1443 which will be called on a port to write out the data. See the
1444 @code{sanitize-response} documentation, for more.
1447 The @code{write} hook is called with three arguments: the client
1448 socket, the response, and the body. The @code{write} hook returns no
1452 At this point the request handling is complete. For a loop, we
1453 loop back and try to read a new request.
1456 If the user interrupts the loop, the @code{close} hook is called on
1460 A user may define a server implementation with the following form:
1462 @deffn {Scheme Procedure} define-server-impl name open read write close
1463 Make a @code{<server-impl>} object with the hooks @var{open},
1464 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1465 @var{name} in the current module.
1468 @deffn {Scheme Procedure} lookup-server-impl impl
1469 Look up a server implementation. If @var{impl} is a server
1470 implementation already, it is returned directly. If it is a symbol, the
1471 binding named @var{impl} in the @code{(web server @var{impl})} module is
1472 looked up. Otherwise an error is signaled.
1474 Currently a server implementation is a somewhat opaque type, useful only
1475 for passing to other procedures in this module, like @code{read-client}.
1478 The @code{(web server)} module defines a number of routines that use
1479 @code{<server-impl>} objects to implement parts of a web server. Given
1480 that we don't expose the accessors for the various fields of a
1481 @code{<server-impl>}, indeed these routines are the only procedures with
1482 any access to the impl objects.
1484 @deffn {Scheme Procedure} open-server impl open-params
1485 Open a server for the given implementation. Return one value, the new
1486 server object. The implementation's @code{open} procedure is applied to
1487 @var{open-params}, which should be a list.
1490 @deffn {Scheme Procedure} read-client impl server
1491 Read a new client from @var{server}, by applying the implementation's
1492 @code{read} procedure to the server. If successful, return three
1493 values: an object corresponding to the client, a request object, and the
1494 request body. If any exception occurs, return @code{#f} for all three
1498 @deffn {Scheme Procedure} handle-request handler request body state
1499 Handle a given request, returning the response and body.
1501 The response and response body are produced by calling the given
1502 @var{handler} with @var{request} and @var{body} as arguments.
1504 The elements of @var{state} are also passed to @var{handler} as
1505 arguments, and may be returned as additional values. The new
1506 @var{state}, collected from the @var{handler}'s return values, is then
1507 returned as a list. The idea is that a server loop receives a handler
1508 from the user, along with whatever state values the user is interested
1509 in, allowing the user's handler to explicitly manage its state.
1512 @deffn {Scheme Procedure} sanitize-response request response body
1513 ``Sanitize'' the given response and body, making them appropriate for the
1516 As a convenience to web handler authors, @var{response} may be given as
1517 an alist of headers, in which case it is used to construct a default
1518 response. Ensures that the response version corresponds to the request
1519 version. If @var{body} is a string, encodes the string to a bytevector,
1520 in an encoding appropriate for @var{response}. Adds a
1521 @code{content-length} and @code{content-type} header, as necessary.
1523 If @var{body} is a procedure, it is called with a port as an argument,
1524 and the output collected as a bytevector. In the future we might try to
1525 instead use a compressing, chunk-encoded port, and call this procedure
1526 later, in the @code{write-client} procedure. Authors are advised not to rely on
1527 the procedure being called at any particular time.
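To get a feel for what this does, @code{sanitize-response} can be called by hand. The sketch below passes an alist of headers and a string body, and assumes the default UTF-8 encoding is chosen for the string; the request is built only to have something to pass in.

@example
(use-modules (web server) (web request) (web response) (web uri))

(define req (build-request (string->uri "http://localhost:8080/")))

;; Collect the computed content-length and the encoded body.
(call-with-values
    (lambda ()
      (sanitize-response req '((content-type . (text/plain))) "Hello"))
  (lambda (response body)
    (list (response-content-length response) body)))
@result{} (5 #vu8(72 101 108 108 111))
@end example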
1530 @deffn {Scheme Procedure} write-client impl server client response body
1531 Write an HTTP response and body to @var{client}. If the server and
1532 client support persistent connections, it is the implementation's
1533 responsibility to keep track of the client thereafter, presumably by
1534 attaching it to the @var{server} argument somehow.
1537 @deffn {Scheme Procedure} close-server impl server
1538 Release resources allocated by a previous invocation of
1542 Given the procedures above, it is a small matter to make a web server:
1544 @deffn {Scheme Procedure} serve-one-client handler impl server state
1545 Read one request from @var{server}, call @var{handler} on the request
1546 and body, and write the response to the client. Return the new state
1547 produced by the handler procedure.
1550 @deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
1551 Run Guile's built-in web server.
1553 @var{handler} should be a procedure that takes two or more arguments,
1554 the HTTP request and request body, and returns two or more values, the
1555 response and response body.
1557 For examples, skip ahead to the next section, @ref{Web Examples}.
1559 The response and body will be run through @code{sanitize-response}
1560 before sending back to the client.
1562 Additional arguments to @var{handler} are taken from @var{state}.
1563 Additional return values are accumulated into a new @var{state}, which
1564 will be used for subsequent requests. In this way a handler can
1565 explicitly manage its state.
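As an illustration of this state threading, here is a sketch of a handler that counts the requests it has served. The handler name is ours, and the call that starts the server is left commented out because it blocks.

@example
(use-modules (web server))

;; COUNT is the single piece of state threaded through the loop.  It
;; arrives as a third argument and goes back as a third return value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Requests so far: "
                         (number->string count) "\n")
          (+ count 1)))

;; Start counting at zero:
;; (run-server counting-handler 'http '() 0)
@end example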
1568 The default web server implementation is @code{http}, which binds to a
1569 socket, listening for requests on that port.
1571 @deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket]
1572 The default HTTP implementation. We document it as a function with
1573 keyword arguments, because that is precisely the way that it is -- all
1574 of the @var{open-params} to @code{run-server} get passed to the
1575 implementation's open function.
1578 ;; The defaults: localhost:8080
1579 (run-server handler)
1581 (run-server handler 'http '())
1582 ;; On a different port
1583 (run-server handler 'http '(#:port 8081))
1585 (run-server handler 'http '(#:family AF_INET6 #:port 8081))
1587 (run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
1592 @subsection Web Examples
1594 Well, enough about the tedious internals. Let's make a web application!
1596 @subsubsection Hello, World!
1598 The first program we have to write, of course, is ``Hello, World!''.
1599 This means that we have to implement a web handler that does what we
1602 Now we define a handler, a function of two arguments and two return
1606 (define (handler request request-body)
1607 (values @var{response} @var{response-body}))
1610 In this first example, we take advantage of a short-cut, returning an
1611 alist of headers instead of a proper response object. The response body
1615 (define (hello-world-handler request request-body)
1616 (values '((content-type . (text/plain)))
1620 Now let's test it, by running a server with this handler. Load up the
1621 web server module if you haven't yet done so, and run a server with this
1625 (use-modules (web server))
1626 (run-server hello-world-handler)
1629 By default, the web server listens for requests on
1630 @code{localhost:8080}. Visit that address in your web browser to
1631 test. If you see the string, @code{Hello World!}, sweet!
1633 @subsubsection Inspecting the Request
1635 The Hello World program above is a general greeter, responding to all
1636 URIs. To make a more exclusive greeter, we need to inspect the request
1637 object, and conditionally produce different results. So let's load up
1638 the request, response, and URI modules, and do just that.
1641 (use-modules (web server)) ; you probably did this already
1642 (use-modules (web request)
1646 (define (request-path-components request)
1647 (split-and-decode-uri-path (uri-path (request-uri request))))
1649 (define (hello-hacker-handler request body)
1650 (if (equal? (request-path-components request)
1652 (values '((content-type . (text/plain)))
1654 (not-found request)))
1656 (run-server hello-hacker-handler)
1659 Here we see that we have defined a helper to return the components of
1660 the URI path as a list of strings, and used that to check for a request
1661 to @code{/hacker/}. Then the success case is just as before -- visit
1662 @code{http://localhost:8080/hacker/} in your browser to check.
1664 You should always match against URI path components as decoded by
1665 @code{split-and-decode-uri-path}. The above example will work for
1666 @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
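You can check this at the REPL. All three spellings decode to the same list of components:

@example
(use-modules (web uri))

(map split-and-decode-uri-path
     '("/hacker/" "//hacker///" "/h%61ck%65r"))
@result{} (("hacker") ("hacker") ("hacker"))
@end example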
1668 But we forgot to define @code{not-found}! If you are pasting these
1669 examples into a REPL, accessing any other URI in your web browser will
1670 drop your Guile console into the debugger:
1673 <unnamed port>:38:7: In procedure module-lookup:
1674 <unnamed port>:38:7: Unbound variable: not-found
1676 Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
1677 scheme@@(guile-user) [1]>
1680 So let's define the function, right there in the debugger. As you
1681 probably know, we'll want to return a 404 response.
1684 ;; Paste this in your REPL
1685 (define (not-found request)
1686 (values (build-response #:code 404)
1687 (string-append "Resource not found: "
1688 (uri->string (request-uri request)))))
1690 ;; Now paste this to let the web server keep going:
1694 Now if you access @code{http://localhost:8080/foo/}, you get this error
1695 message. (Note that some popular web browsers won't show
1696 server-generated 404 messages, showing their own instead, unless the 404
1697 message body is long enough.)
1699 @subsubsection Higher-Level Interfaces
1701 The web handler interface is a common baseline that all kinds of Guile
1702 web applications can use. You will usually want to build something on
1703 top of it, however, especially when producing HTML. Here is a simple
1704 example that builds up HTML output using SXML (@pxref{sxml simple}).
1706 First, load up the modules:
1709 (use-modules (web server)
1715 Now we define a simple templating function that takes a list of HTML
1716 body elements, as SXML, and puts them in our super template:
1719 (define (templatize title body)
1720 `(html (head (title ,title))
1724 For example, the simplest Hello HTML can be produced like this:
1727 (sxml->xml (templatize "Hello!" '((b "Hi!"))))
1729 <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
1732 Much better to work with Scheme data types than to work with HTML as
1733 strings. Now we define a little response helper:
1736 (define* (respond #:optional body #:key
1738 (title "Hello hello!")
1739 (doctype "<!DOCTYPE html>\n")
1740 (content-type-params '((charset . "utf-8")))
1741 (content-type 'text/html)
1743 (sxml (and body (templatize title body))))
1744 (values (build-response
1746 #:headers `((content-type
1747 . (,content-type ,@@content-type-params))
1752 (if doctype (display doctype port))
1753 (sxml->xml sxml port))))))
1756 Here we see the power of keyword arguments with default initializers. By
1757 the time the arguments are fully parsed, the @code{sxml} local variable
1758 will hold the templated SXML, ready for sending out to the client.
1760 Also, instead of returning the body as a string, @code{respond} gives a
1761 procedure, which will be called by the web server to write out the
1762 response to the client.
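Stripped of the templating, the idea of a procedure body can be sketched as follows. The handler name is hypothetical, and the second form simply calls the body procedure by hand on a string port to show what the server would end up writing.

@example
(define (procedure-body-handler request body)
  (values '((content-type . (text/plain)))
          ;; The server calls this with a port to write to.
          (lambda (port)
            (display "Hello from a procedure!\n" port))))

(call-with-values
    (lambda () (procedure-body-handler #f #f))
  (lambda (headers write-body)
    (call-with-output-string write-body)))
@result{} "Hello from a procedure!\n"
@end example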
1764 Now, a simple example using this responder, which lays out the incoming
1765 headers in an HTML table.
1768 (define (debug-page request body)
1770 `((h1 "hello world!")
1772 (tr (th "header") (th "value"))
1773 ,@@(map (lambda (pair)
1774 `(tr (td (tt ,(with-output-to-string
1775 (lambda () (display (car pair))))))
1776 (td (tt ,(with-output-to-string
1778 (write (cdr pair))))))))
1779 (request-headers request))))))
1781 (run-server debug-page)
1784 Now if you visit any local address in your web browser, you actually see
1787 @subsubsection Conclusion
1789 Well, this is about as far as Guile's built-in web support goes, for
1790 now. There are many ways to make a web application, but hopefully by
1791 standardizing the most fundamental data types, users will be able to
1792 choose the approach that suits them best, while also being able to
1793 switch between implementations of the server. This is a relatively new
1794 part of Guile, so if you have feedback, let us know, and we can take it
1795 into account. Happy hacking on the web!
1798 @c TeX-master: "guile.texi"