2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
7 @section @acronym{HTTP}, the Web, and All That
12 When Guile started back in the mid-nineties, the GNU system was still
13 focused on producing a good POSIX implementation. This is why Guile's
14 POSIX support is good, and has been so for a while.
16 But times change, and in a way these days the web is the new POSIX: a
17 standard and a motley set of implementations on which much computing is
18 done. So today's Guile also supports the web at the programming
19 language level, by defining common data types and operations for the
20 technologies underpinning the web: URIs, HTTP, and XML.
22 It is particularly important to define native web data types. Though
23 the web is text in motion, programming the web in text is like
24 programming with @code{goto}: muddy, and error-prone. Most current
25 security problems on the web are due to treating the web as text instead
26 of as instances of the proper data types.
28 In addition, common web data types help programmers to share code.
30 Well. That's all very nice and opinionated and such, but how do I use
34 * URIs:: Uniform Resource Identifiers.
35 * HTTP:: The Hyper-Text Transfer Protocol.
36 * Requests:: HTTP requests.
37 * Responses:: HTTP responses.
38 * Web Handlers:: A simple web application interface.
39 * Web Server:: Serving HTTP to the internet.
43 @subsection Uniform Resource Identifiers
45 Guile provides a standard data type for Uniform Resource Identifiers
46 (URIs), as defined in RFC 3986.
48 The generic URI syntax is as follows:
51 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
52 [ "?" query ] [ "#" fragment ]
55 So, all URIs have a scheme and a path. Some URIs have a host, and some
56 of those have ports and userinfo. Any URI might have a query part or a
59 Userinfo is something of an abstraction, as some legacy URI schemes
60 allowed userinfo of the form @code{@var{username}:@var{passwd}}.
61 Passwords don't belong in URIs, so the RFC does not want to condone
62 this, but neither can it say that what is before the @code{@@} sign is
63 just a username, so the RFC punts on the issue and calls it
66 Also, strictly speaking, a URI with a fragment is a @dfn{URI
67 reference}. A fragment is typically not serialized when sending a URI
68 over the wire; that is, it is not part of the identifier of a resource.
69 It only identifies a part of a given resource. But it's useful to have
70 a field for it in the URI record itself, so we hope you will forgive the
74 (use-modules (web uri))
77 The following procedures can be found in the @code{(web uri)}
78 module. Load it into your Guile, using a form like the above, to have
81 @defun build-uri scheme [#:userinfo] [#:host] [#:port] [#:path] [#:query] [#:fragment] [#:validate?]
82 Construct a URI object. If @var{validate?} is true, also run some
83 consistency checks to make sure that the constructed URI is valid.
88 @defunx uri-userinfo uri
93 @defunx uri-fragment uri
94 A predicate and field accessors for the URI record type.
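
As a minimal sketch of the constructor and accessors described above,
the following builds a URI and reads back some of its fields.  The host
and path are illustrative values only, and the accessors
@code{uri-scheme}, @code{uri-host}, and @code{uri-path} are assumed to
follow the same naming pattern as @code{uri-userinfo} and
@code{uri-fragment}.

@example
(use-modules (web uri))

;; Illustrative values; any valid host and path would do.
(define u
  (build-uri 'http #:host "www.gnu.org" #:path "/software/guile/"))

(uri? u)          ; true: u is a URI record
(uri-scheme u)    ; the scheme symbol given to build-uri
(uri-host u)      ; the host string
(uri-path u)      ; the path string
(uri-fragment u)  ; #f, since no fragment was given
@end example

Fields that were not supplied to @code{build-uri}, such as the fragment
here, read back as @code{#f}.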
97 @defun declare-default-port! scheme port
98 Declare a default port for the given URI scheme.
100 Default ports are for printing URI objects: a default port is not
104 @defun parse-uri string
105 Parse @var{string} into a URI object. Returns @code{#f} if the string
109 @defun unparse-uri uri
110 Serialize @var{uri} to a string.
113 @defun uri-decode str [#:charset]
114 Percent-decode the given @var{str}, according to @var{charset}.
116 Note that this function should not generally be applied to a full URI
117 string. For paths, use @code{split-and-decode-uri-path} instead. For query
118 strings, split the query on @code{&} and @code{=} boundaries, and decode
119 the components separately.
121 Note that percent-encoded strings encode @emph{bytes}, not characters.
122 There is no guarantee that a given byte sequence is a valid string
123 encoding. Therefore this routine may signal an error if the decoded
124 bytes are not valid for the given encoding. Pass @code{#f} for
125 @var{charset} if you want decoded bytes as a bytevector directly.
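
The advice on query strings can be sketched as follows.  The helper
@code{decode-query} is hypothetical and not part of @code{(web uri)}; it
assumes a query made of @code{key=value} pairs joined by @code{&}, and
relies on the default charset of @code{uri-decode}.

@example
(use-modules (web uri))

;; Hypothetical helper, not provided by (web uri).
;; Split on & first, then on the first = of each pair, and only then
;; percent-decode each side, so that an encoded & or = inside a key or
;; value cannot be mistaken for a delimiter.
(define (decode-query query)
  (map (lambda (pair)
         (let ((i (string-index pair #\=)))
           (if i
               (cons (uri-decode (substring pair 0 i))
                     (uri-decode (substring pair (+ i 1))))
               (cons (uri-decode pair) ""))))
       (string-split query #\&)))

(decode-query "q=guile%20scheme&lang=en")
;; => (("q" . "guile scheme") ("lang" . "en"))
@end example

Decoding the whole query before splitting would lose the distinction
between a literal @code{&} delimiter and an encoded @code{%26} inside a
value, which is why the components are decoded separately.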
128 @defun uri-encode str [#:charset] [#:unescaped-chars]
129 Percent-encode any character not in @var{unescaped-chars}.
131 Percent-encoding first writes out the given character to a bytevector
132 within the given @var{charset}, then encodes each byte as
133 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
137 @defun split-and-decode-uri-path path
138 Split @var{path} into its components, and decode each component,
139 removing empty components.
141 For example, @code{"/foo/bar/"} decodes to the two-element list,
142 @code{("foo" "bar")}.
145 @defun encode-and-join-uri-path parts
146 URI-encode each element of @var{parts}, which should be a list of
147 strings, and join the parts together with @code{/} as a delimiter.
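
A short sketch of the two path procedures working together; the path
itself is an illustrative value.

@example
(use-modules (web uri))

(define parts (split-and-decode-uri-path "/docs/hello%20world/"))
;; parts is ("docs" "hello world"): empty components are dropped
;; and %20 is decoded to a space.

(define path (encode-and-join-uri-path parts))
;; path is "docs/hello%20world": the space is encoded again, and
;; there is no leading slash.
@end example

Because empty components are removed when splitting, the round trip does
not restore the leading or trailing @code{/} of the original path.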
151 @subsection The Hyper-Text Transfer Protocol
153 The initial motivation for including web functionality in Guile, rather
154 than relying on an external package, was to establish a standard base on
155 which people can share code. To that end, we continue the focus on data
156 types by providing a number of low-level parsers and unparsers for
157 elements of the HTTP protocol.
159 If you want to skip the low-level details for now and move on to web
160 pages, @pxref{Web Server}. Otherwise, load the HTTP module, and read
164 (use-modules (web http))
167 The focus of the @code{(web http)} module is to parse and unparse
168 standard HTTP headers, representing them to Guile as native data
169 structures. For example, a @code{Date:} header will be represented as a
170 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
172 Guile tries to follow RFCs fairly strictly---the road to perdition being
173 paved with compatibility hacks---though some allowances are made for
174 not-too-divergent texts.
176 The first bit is to define a registry of parsers, validators, and
177 unparsers, keyed by header name. That is the function of the
178 @code{<header-decl>} object.
180 @defun make-header-decl sym name multiple? parser validator writer
181 @defunx header-decl? x
182 @defunx header-decl-sym decl
183 @defunx header-decl-name decl
184 @defunx header-decl-multiple? decl
185 @defunx header-decl-parser decl
186 @defunx header-decl-validator decl
187 @defunx header-decl-writer decl
188 A constructor, predicate, and field accessors for the
189 @code{<header-decl>} type. The fields are as follows:
193 The symbol name for this header field, always in lower-case. For
194 example, @code{"Content-Length"} has a symbolic name of
195 @code{content-length}.
197 The string name of the header, in its preferred capitalization.
199 @code{#t} iff this header may appear multiple times in a message.
201 A procedure which takes a string and returns a parsed value.
203 A predicate, returning @code{#t} iff the value is valid for this header.
205 A writer, which writes a value to the port given in the second argument.
209 @defun declare-header! sym name [#:multiple?] [#:parser] [#:validator] [#:writer]
210 Make a header declaration, as above, and register it by symbol and by
214 @defun lookup-header-decl name
215 Return the @code{<header-decl>} object registered for the given @var{name}.
217 @var{name} may be a symbol or a string. Strings are mapped to headers in
218 a case-insensitive fashion.
221 @defun valid-header? sym val
222 Returns a true value iff @var{val} is a valid Scheme value for the
223 header with name @var{sym}.
226 Now that we have a generic interface for reading and writing headers, we
229 @defun read-header port
230 Reads one HTTP header from @var{port}. Returns two values: the header
231 name and the parsed Scheme value. May raise an exception if the header
232 was known but the value was invalid.
234 Returns @code{#f} for both values if the end of the header section was
235 reached (i.e., a blank line).
238 @defun parse-header name val
239 Parse @var{val}, a string, with the parser for the header named
242 Returns two values, the header name and parsed value. If a parser was
243 found, the header name will be returned as a symbol. If a parser was not
244 found, both the header name and the value are returned as strings.
247 @defun write-header name val port
248 Writes the given header name and value to @var{port}. If @var{name} is a
249 symbol, looks up a declared header and uses that writer. Otherwise the
250 value is written using @code{display}.
253 @defun read-headers port
254 Read an HTTP message from @var{port}, returning the headers as an
258 @defun write-headers headers port
259 Write the given header alist to @var{port}. Doesn't write the final
260 @code{\r\n}, as the user might want to add another header.
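
As a rough sketch of reading and writing headers, the following parses
a header block held in a string and writes it back out.  The message
text is made up for illustration, and string ports stand in for a real
socket.

@example
(use-modules (web http))

;; Read the headers of a message held in a string.  The blank line
;; (a bare \r\n) marks the end of the headers.
(define headers
  (call-with-input-string "Content-Length: 42\r\n\r\n" read-headers))

;; headers is an alist keyed by header symbol, with parsed values,
;; so the length is the integer 42 rather than the string "42".
(assq-ref headers 'content-length)

;; Write the alist back out as text.  The terminating blank line is
;; not written, so more headers could still be appended.
(define text
  (call-with-output-string
    (lambda (port) (write-headers headers port))))
@end example

Here @code{text} holds the single line @code{Content-Length: 42}
followed by a carriage return and line feed.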
263 The @code{(web http)} module also has some utility procedures to read
264 and write request and response lines.
266 @defun parse-http-method str [start] [end]
267 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
271 @defun parse-http-version str [start] [end]
272 Parse an HTTP version from @var{str}, returning it as a major-minor
273 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
277 @defun parse-request-uri str [start] [end]
278 Parse a URI from an HTTP request line. Note that URIs in requests do not
279 have to have a scheme or host name. The result is a URI object.
282 @defun read-request-line port
283 Read the first line of an HTTP request from @var{port}, returning three
284 values: the method, the URI, and the version.
287 @defun write-request-line method uri version port
288 Write the first line of an HTTP request to @var{port}.
291 @defun read-response-line port
292 Read the first line of an HTTP response from @var{port}, returning three
293 values: the HTTP version, the response code, and the "reason phrase".
296 @defun write-response-line version code reason-phrase port
297 Write the first line of an HTTP response to @var{port}.
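
The request-line procedures can be tried out on strings, as in this
small sketch.  The request line is an illustrative value, and
@code{uri-path} comes from @code{(web uri)}.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")         ; the symbol GET
(parse-http-version "HTTP/1.1")   ; the pair (1 . 1)

;; Read a whole request line from a string port and collect the
;; three return values into a list.
(define parsed
  (call-with-input-string "GET /index.html HTTP/1.1\r\n"
    (lambda (port)
      (call-with-values (lambda () (read-request-line port))
        (lambda (method uri version)
          (list method (uri-path uri) version))))))
;; parsed is (GET "/index.html" (1 . 1))
@end example

Note that the URI parsed from the request line has a path but no scheme
or host.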
302 @subsection HTTP Requests
305 (use-modules (web request))
311 @defun request-method
317 @defun request-version
320 @defun request-headers
329 @defun read-request port [meta]
330 Read an HTTP request from @var{port}, optionally attaching the given
331 metadata, @var{meta}.
333 As a side effect, sets the encoding on @var{port} to ISO-8859-1
334 (latin-1), so that reading one character reads one byte. See the
335 discussion of character sets in "HTTP Requests" in the manual, for more
339 @defun build-request [#:method] [#:uri] [#:version] [#:headers] [#:port] [#:meta] [#:validate-headers?]
340 Construct an HTTP request object. If @var{validate-headers?} is true,
341 the headers are each run through their respective validators.
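
The request procedures described above can be tried on a string port,
as in this minimal sketch.  The request text is made up for
illustration; a real server would read from a client socket instead.

@example
(use-modules (web request))

;; A tiny request with no body, as it might arrive on a socket.
(define raw "GET /hello HTTP/1.1\r\nContent-Length: 0\r\n\r\n")

(define r (call-with-input-string raw read-request))

(request-method r)          ; the method, as a symbol
(request-version r)         ; the version, as a major-minor pair
(request-content-length r)  ; the parsed Content-Length, an integer
@end example

The header accessors return parsed values, so the content length comes
back as a number rather than as a string.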
344 @defun write-request r port
345 Write the given HTTP request to @var{port}.
347 Returns a new request, whose @code{request-port} will continue writing
348 on @var{port}, perhaps using some transfer encoding.
351 @defun read-request-body/latin-1 r
352 Reads the request body from @var{r}, as a string.
354 Assumes that the request port has ISO-8859-1 encoding, so that the
355 number of characters to read is the same as the
356 @code{request-content-length}. Returns @code{#f} if there was no request
360 @defun write-request-body/latin-1 r body
361 Write @var{body}, a string encodable in ISO-8859-1, to the port
362 corresponding to the HTTP request @var{r}.
365 @defun read-request-body/bytevector r
366 Reads the request body from @var{r}, as a bytevector. Returns @code{#f}
367 if there was no request body.
370 @defun write-request-body/bytevector r bv
370 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
375 @defun request-accept request [default='()]
376 @defunx request-accept-charset request [default='()]
377 @defunx request-accept-encoding request [default='()]
378 @defunx request-accept-language request [default='()]
379 @defunx request-allow request [default='()]
380 @defunx request-authorization request [default=#f]
381 @defunx request-cache-control request [default='()]
382 @defunx request-connection request [default='()]
383 @defunx request-content-encoding request [default='()]
384 @defunx request-content-language request [default='()]
385 @defunx request-content-length request [default=#f]
386 @defunx request-content-location request [default=#f]
387 @defunx request-content-md5 request [default=#f]
388 @defunx request-content-range request [default=#f]
389 @defunx request-content-type request [default=#f]
390 @defunx request-date request [default=#f]
391 @defunx request-expect request [default='()]
392 @defunx request-expires request [default=#f]
393 @defunx request-from request [default=#f]
394 @defunx request-host request [default=#f]
395 @defunx request-if-match request [default=#f]
396 @defunx request-if-modified-since request [default=#f]
397 @defunx request-if-none-match request [default=#f]
398 @defunx request-if-range request [default=#f]
399 @defunx request-if-unmodified-since request [default=#f]
400 @defunx request-last-modified request [default=#f]
401 @defunx request-max-forwards request [default=#f]
402 @defunx request-pragma request [default='()]
403 @defunx request-proxy-authorization request [default=#f]
404 @defunx request-range request [default=#f]
405 @defunx request-referer request [default=#f]
406 @defunx request-te request [default=#f]
407 @defunx request-trailer request [default='()]
408 @defunx request-transfer-encoding request [default='()]
409 @defunx request-upgrade request [default='()]
410 @defunx request-user-agent request [default=#f]
411 @defunx request-via request [default='()]
412 @defunx request-warning request [default='()]
415 @defun request-absolute-uri r [default-host] [default-port]
421 @subsection HTTP Responses
424 (use-modules (web response))
431 @defun response-version
437 @defun response-reason-phrase response
438 Return the reason phrase given in @var{response}, or the standard reason
439 phrase for the response's code.
442 @defun response-headers
448 @defun read-response port
449 Read an HTTP response from @var{port}, optionally attaching the given
450 metadata, @var{meta}.
452 As a side effect, sets the encoding on @var{port} to ISO-8859-1
453 (latin-1), so that reading one character reads one byte. See the
454 discussion of character sets in "HTTP Responses" in the manual, for more
458 @defun build-response [#:version] [#:code] [#:reason-phrase] [#:headers] [#:port] [#:validate-headers?]
459 Construct an HTTP response object. If @var{validate-headers?} is true,
460 the headers are each run through their respective validators.
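
A minimal sketch of building a response and inspecting it.  The status
code and header are illustrative, and @code{response-code} is assumed
to be the accessor for the code field.

@example
(use-modules (web response))

(define r
  (build-response #:code 404
                  #:headers '((content-length . 0))))

(response-code r)            ; 404
(response-reason-phrase r)   ; no phrase was given, so the standard
                             ; phrase for code 404 is returned
(response-content-length r)  ; 0
(response-content-type r)    ; #f, the default for a missing header
@end example

Leaving out @code{#:reason-phrase} is usually fine, since
@code{response-reason-phrase} falls back to the standard phrase for the
response's code.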
463 @defun extend-response r k v . additional
464 Extend an HTTP response by setting additional HTTP headers @var{k},
465 @var{v}. Returns a new HTTP response.
468 @defun adapt-response-version response version
469 Adapt the given response to a different HTTP version. Returns a new HTTP
472 The idea is that many applications might just build a response for the
473 default HTTP version, and this method could handle a number of
474 programmatic transformations to respond to older HTTP versions (0.9 and
475 1.0). But currently this function is a bit heavy-handed, just updating
479 @defun write-response r port
480 Write the given HTTP response to @var{port}.
482 Returns a new response, whose @code{response-port} will continue writing
483 on @var{port}, perhaps using some transfer encoding.
486 @defun read-response-body/latin-1 r
487 Reads the response body from @var{r}, as a string.
489 Assumes that the response port has ISO-8859-1 encoding, so that the
490 number of characters to read is the same as the
491 @code{response-content-length}. Returns @code{#f} if there was no
495 @defun write-response-body/latin-1 r body
496 Write @var{body}, a string encodable in ISO-8859-1, to the port
497 corresponding to the HTTP response @var{r}.
500 @defun read-response-body/bytevector r
501 Reads the response body from @var{r}, as a bytevector. Returns @code{#f}
502 if there was no response body.
505 @defun write-response-body/bytevector r bv
506 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
510 @defun response-accept-ranges response [default=#f]
511 @defunx response-age response [default='()]
512 @defunx response-allow response [default='()]
513 @defunx response-cache-control response [default='()]
514 @defunx response-connection response [default='()]
515 @defunx response-content-encoding response [default='()]
516 @defunx response-content-language response [default='()]
517 @defunx response-content-length response [default=#f]
518 @defunx response-content-location response [default=#f]
519 @defunx response-content-md5 response [default=#f]
520 @defunx response-content-range response [default=#f]
521 @defunx response-content-type response [default=#f]
522 @defunx response-date response [default=#f]
523 @defunx response-etag response [default=#f]
524 @defunx response-expires response [default=#f]
525 @defunx response-last-modified response [default=#f]
526 @defunx response-location response [default=#f]
527 @defunx response-pragma response [default='()]
528 @defunx response-proxy-authenticate response [default=#f]
529 @defunx response-retry-after response [default=#f]
530 @defunx response-server response [default=#f]
531 @defunx response-trailer response [default='()]
532 @defunx response-transfer-encoding response [default='()]
533 @defunx response-upgrade response [default='()]
534 @defunx response-vary response [default='()]
535 @defunx response-via response [default='()]
536 @defunx response-warning response [default='()]
537 @defunx response-www-authenticate response [default=#f]
542 @subsection Web Handlers
544 from request to response
547 @subsection Web Server
549 @code{(web server)} is a generic web server interface, along with a main
550 loop implementation for web servers controlled by Guile.
552 The lowest layer is the @code{<server-impl>} object, which defines a set of
553 hooks to open a server, read a request from a client, write a
554 response to a client, and close a server. These hooks (@code{open},
555 @code{read}, @code{write}, and @code{close}, respectively) are bound together in a
556 @code{<server-impl>} object. Procedures in this module take a
557 @code{<server-impl>} object, if needed.
559 A @code{<server-impl>} may also be looked up by name. If you pass the
560 @code{http} symbol to @code{run-server}, Guile looks for a variable named
561 @code{http} in the @code{(web server http)} module, which should be bound to a
562 @code{<server-impl>} object. Such a binding is made by instantiation of
563 the @code{define-server-impl} syntax. In this way the @code{run-server} loop can
564 automatically load other backends if available.
566 The life cycle of a server goes as follows:
570 The @code{open} hook is called, to open the server. @code{open} takes 0 or
571 more arguments, depending on the backend, and returns an opaque
572 server socket object, or signals an error.
575 The @code{read} hook is called, to read a request from a new client.
576 The @code{read} hook takes one argument, the server socket. It
577 should return three values: an opaque client socket, the
578 request, and the request body. The request should be a
579 @code{<request>} object, from @code{(web request)}. The body should be a
580 string or a bytevector, or @code{#f} if there is no body.
582 If the read failed, the @code{read} hook may return @code{#f} for the client
583 socket, request, and body.
586 A user-provided handler procedure is called, with the request
587 and body as its arguments. The handler should return two
588 values: the response, as a @code{<response>} record from @code{(web
589 response)}, and the response body as a string, bytevector, or
590 @code{#f} if not present. We also allow the response to be simply an
591 alist of headers, in which case a default response object is
592 constructed with those headers.
595 The @code{write} hook is called with three arguments: the client
596 socket, the response, and the body. The @code{write} hook returns no
600 At this point the request handling is complete. For a loop, we
601 loop back and try to read a new request.
604 If the user interrupts the loop, the @code{close} hook is called on
608 @defun define-server-impl name open read write close
611 @defun lookup-server-impl impl
612 Look up a server implementation. If @var{impl} is a server
613 implementation already, it is returned directly. If it is a symbol, the
614 binding named @var{impl} in the @code{(web server @var{impl})} module is
615 looked up. Otherwise an error is signaled.
617 Currently a server implementation is a somewhat opaque type, useful only
618 for passing to other procedures in this module, like @code{read-client}.
621 @defun open-server impl open-params
622 Open a server for the given implementation. Returns one value, the new
623 server object. The implementation's @code{open} procedure is applied to
624 @var{open-params}, which should be a list.
627 @defun read-client impl server
628 Read a new client from @var{server}, by applying the implementation's
629 @code{read} procedure to the server. If successful, returns three
630 values: an object corresponding to the client, a request object, and the
631 request body. If any exception occurs, returns @code{#f} for all three
635 @defun handle-request handler request body state
636 Handle a given request, returning the response and body.
638 The response and response body are produced by calling the given
639 @var{handler} with @var{request} and @var{body} as arguments.
641 The elements of @var{state} are also passed to @var{handler} as
642 arguments, and may be returned as additional values. The new
643 @var{state}, collected from the @var{handler}'s return values, is then
644 returned as a list. The idea is that a server loop receives a handler
645 from the user, along with whatever state values the user is interested
646 in, allowing the user's handler to explicitly manage its state.
649 @defun sanitize-response request response body
650 "Sanitize" the given response and body, making them appropriate for the
653 As a convenience to web handler authors, @var{response} may be given as
654 an alist of headers, in which case it is used to construct a default
655 response. Ensures that the response version corresponds to the request
656 version. If @var{body} is a string, encodes the string to a bytevector,
657 in an encoding appropriate for @var{response}. Adds a
658 @code{content-length} and @code{content-type} header, as necessary.
660 If @var{body} is a procedure, it is called with a port as an argument,
661 and the output collected as a bytevector. In the future we might try to
662 instead use a compressing, chunk-encoded port, and call this procedure
663 later, in the @code{write-client} procedure. Authors are advised not to rely on
664 the procedure being called at any particular time.
667 @defun write-client impl server client response body
668 Write an HTTP response and body to @var{client}. If the server and
669 client support persistent connections, it is the implementation's
670 responsibility to keep track of the client thereafter, presumably by
671 attaching it to the @var{server} argument somehow.
674 @defun close-server impl server
675 Release resources allocated by a previous invocation of
679 @defun serve-one-client handler impl server state
680 Read one request from @var{server}, call @var{handler} on the request
681 and body, and write the response to the client. Returns the new state
682 produced by the handler procedure.
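
The life cycle described earlier can be sketched with the procedures in
this module.  The procedure @code{serve-n} is hypothetical: it serves a
fixed number of requests and then closes the server, whereas
@code{run-server} loops until interrupted.  The port number is an
illustrative choice.

@example
(use-modules (web server))

;; Hypothetical helper, not part of (web server).
;; Open the http backend, serve exactly n requests while threading
;; the handler's state through each call, then close the server and
;; return the final state.
(define (serve-n handler n . state)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '(#:port 8081))))
    (let lp ((i 0) (state state))
      (if (< i n)
          (lp (+ i 1)
              (serve-one-client handler impl server state))
          (begin
            (close-server impl server)
            state)))))
@end example

Each call to @code{serve-one-client} performs the read, handle, and
write steps for one request, so the loop only needs to carry the state
forward.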
685 @defun run-server handler [impl] [open-params] . state
686 Run Guile's built-in web server.
688 @var{handler} should be a procedure that takes two or more arguments,
689 the HTTP request and request body, and returns two or more values, the
690 response and response body.
692 For example, here is a simple "Hello, World!" server:
695 (define (handler request body)
696   (values '((content-type . ("text/plain")))
697           "Hello, World!"))
698 (run-server handler)
701 The response and body will be run through @code{sanitize-response}
702 before sending back to the client.
704 Additional arguments to @var{handler} are taken from @var{state}.
705 Additional return values are accumulated into a new @var{state}, which
706 will be used for subsequent requests. In this way a handler can
707 explicitly manage its state.
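
As a sketch of explicit state management, the handler below keeps a
request counter as its single state value.  The name
@code{counting-handler} is hypothetical, and the header alist follows
the format used in the example above.

@example
(use-modules (web server))

;; count is the one state value: it arrives as the third argument
;; and the updated value is returned as the third return value.
(define (counting-handler request body count)
  (values '((content-type . ("text/plain")))
          (string-append "Request number "
                         (number->string count)
                         "\n")
          (+ count 1)))

;; Start the server with an initial count of 0.  Nothing is served
;; until this procedure is called.
(define (start-counting-server)
  (run-server counting-handler 'http '() 0))
@end example

Because @var{state} is a rest argument of @code{run-server}, the
implementation and open parameters have to be given explicitly whenever
an initial state is passed.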
709 The default server implementation is @code{http}, which accepts
710 @var{open-params} like @code{(#:port 8081)}, among others.
715 (use-modules (web server))
720 @c TeX-master: "guile.texi"