1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011, 2012, 2013 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
5
6 @node Web
7 @section @acronym{HTTP}, the Web, and All That
8 @cindex Web
9 @cindex WWW
10 @cindex HTTP
11
12 It has always been possible to connect computers together and share
13 information between them, but the rise of the World Wide Web over the
14 last couple of decades has made it much easier to do so. The result is
15 a richly connected network of computation, in which Guile forms a part.
16
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
20 message components that can be sent and received by that protocol,
21 notably HTML.
22
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
29
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
32 highest-level perspective, @pxref{Web Examples}, and work your way
33 back.
34
35 @menu
36 * Types and the Web:: Types prevent bugs and security problems.
37 * URIs:: Uniform Resource Identifiers.
38 * HTTP:: The Hyper-Text Transfer Protocol.
39 * HTTP Headers:: How Guile represents specific header values.
40 * Transfer Codings:: HTTP Transfer Codings.
41 * Requests:: HTTP requests.
42 * Responses:: HTTP responses.
43 * Web Client:: Accessing web resources over HTTP.
44 * Web Server:: Serving HTTP to the internet.
45 * Web Examples:: How to use this thing.
46 @end menu
47
48 @node Types and the Web
49 @subsection Types and the Web
50
51 It is a truth universally acknowledged, that a program with good use of
52 data types, will be free from many common bugs. Unfortunately, the
53 common practice in web programming seems to ignore this maxim. This
54 subsection makes the case for expressive data types in web programming.
55
56 By ``expressive data types'', we mean that the data types @emph{say}
57 something about how a program solves a problem. For example, if we
58 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
59 this indicates that there is a part of the program that will always have
60 valid dates. Error handling for a number of basic cases, like invalid
61 dates, occurs on the boundary in which we produce a SRFI 19 date record
62 from other types, like strings.
63
64 With regard to the web, data types are helpful in the two broad phases
65 of HTTP messages: parsing and generation.
66
67 Consider a server, which has to parse a request, and produce a response.
68 Guile will parse the request into an HTTP request object
69 (@pxref{Requests}), with each header parsed into an appropriate Scheme
70 data type. This transition from an incoming stream of characters to
71 typed data is a state change in a program---the strings might parse, or
72 they might not, and something has to happen if they do not. (Guile
73 throws an error in this case.) But after you have the parsed request,
74 ``client'' code (code built on top of the Guile web framework) will not
75 have to check for syntactic validity. The types already make this
76 information manifest.
77
78 This state change on the parsing boundary makes programs more robust,
79 as they themselves are freed from the need to do a number of common
80 error checks, and they can use normal Scheme procedures to handle a
81 request instead of ad-hoc string parsers.
82
83 The need for types on the response generation side (in a server) is more
84 subtle, though not less important. Consider the example of a POST
85 handler, which prints out the text that a user submits from a form.
86 Such a handler might include a procedure like this:
87
88 @example
89 ;; First, a helper procedure
90 (define (para . contents)
91 (string-append "<p>" (string-concatenate contents) "</p>"))
92
93 ;; Now the meat of our simple web application
94 (define (you-said text)
95 (para "You said: " text))
96
97 (display (you-said "Hi!"))
98 @print{} <p>You said: Hi!</p>
99 @end example
100
101 This is a perfectly valid implementation, provided that the incoming
102 text does not contain the special HTML characters @samp{<}, @samp{>}, or
103 @samp{&}. But this provision of a restricted character set is not
104 reflected anywhere in the program itself: we must @emph{assume} that the
105 programmer understands this, and performs the check elsewhere.
106
107 Unfortunately, the short history of the practice of programming does not
108 bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
109 vulnerability is just such a common error in which unfiltered user input
110 is allowed into the output. A user could submit a crafted comment to
111 your web site which results in visitors running malicious JavaScript,
112 within the security context of your domain:
113
114 @example
115 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
116 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
117 @end example
118
119 The fundamental problem here is that both user data and the program
120 template are represented using strings. This identity means that types
121 can't help the programmer to make a distinction between these two, so
122 they get confused.
123
124 There are a number of possible solutions, but perhaps the best is to
125 treat HTML not as strings, but as native s-expressions: as SXML. The
126 basic idea is that HTML is either text, represented by a string, or an
127 element, represented as a tagged list. So @samp{foo} becomes
128 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
129 Attributes, if present, go in a tagged list headed by @samp{@@}, like
130 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{SXML}, for
131 more information.
132
133 The good thing about SXML is that HTML elements cannot be confused with
134 text. Let's make a new definition of @code{para}:
135
136 @example
137 (define (para . contents)
138 `(p ,@@contents))
139
140 (use-modules (sxml simple))
141 (sxml->xml (you-said "Hi!"))
142 @print{} <p>You said: Hi!</p>
143
144 (sxml->xml (you-said "<i>Rats, foiled again!</i>"))
145 @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
146 @end example
147
148 So we see in the second example that HTML elements cannot be unwittingly
149 introduced into the output. However it is now perfectly acceptable to
150 pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
151 over everything-as-a-string.
152
153 @example
154 (sxml->xml (you-said (you-said "<Hi!>")))
155 @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
156 @end example
157
158 The SXML types allow procedures to @emph{compose}. The types make
159 manifest which parts are HTML elements, and which are text. So you
160 needn't worry about escaping user input; the type transition back to a
161 string handles that for you. @acronym{XSS} vulnerabilities are a thing
162 of the past.
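
The escaping behaviour can be checked by capturing the serialized output
in a string.  The following is a minimal sketch rather than part of
Guile itself: @code{sxml->string*} is a hypothetical helper name chosen
for this illustration, and @code{para} and @code{you-said} simply repeat
the SXML-based definitions used above.

@example
(use-modules (sxml simple))

;; Hypothetical helper: serialize an SXML tree to a string instead of
;; printing it to the current output port.
(define (sxml->string* tree)
  (call-with-output-string
    (lambda (port) (sxml->xml tree port))))

(define (para . contents)
  `(p ,@@contents))

(define (you-said text)
  (para "You said: " text))

;; The markup in the user's text comes out escaped, as plain text.
(sxml->string* (you-said "<script>alert(1)</script>"))
@result{} "<p>You said: &lt;script&gt;alert(1)&lt;/script&gt;</p>"
@end example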
163
164 Well. That's all very nice and opinionated and such, but how do I use
165 the thing? Read on!
166
167 @node URIs
168 @subsection Uniform Resource Identifiers
169
170 Guile provides a standard data type for Uniform Resource Identifiers
171 (URIs), as defined in RFC 3986.
172
173 The generic URI syntax is as follows:
174
175 @example
176 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
177 [ "?" query ] [ "#" fragment ]
178 @end example
179
180 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
181 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
182 @code{/help/}, and there is no userinfo, port, query, or fragment. All
183 URIs have a scheme and a path (though the path might be empty). Some
184 URIs have a host, and some of those have ports and userinfo. Any URI
185 might have a query part or a fragment.
186
187 Userinfo is something of an abstraction, as some legacy URI schemes
188 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
189 since passwords do not belong in URIs, the RFC does not want to condone
190 this practice, so it calls anything before the @code{@@} sign
191 @dfn{userinfo}.
192
193 Properly speaking, a fragment is not part of a URI. For example, when a
194 web browser follows a link to @indicateurl{http://example.com/#foo}, it
195 sends a request for @indicateurl{http://example.com/}, then looks in the
196 resulting page for the fragment identified by @code{foo}. A
197 fragment identifies a part of a resource, not the resource itself. But
198 it is useful to have a fragment field in the URI record itself, so we
199 hope you will forgive the inconsistency.
200
201 @example
202 (use-modules (web uri))
203 @end example
204
205 The following procedures can be found in the @code{(web uri)}
206 module. Load it into your Guile, using a form like the above, to have
207 access to them.
208
209 @deffn {Scheme Procedure} build-uri scheme @
210 [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @
211 [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @
212 [#:validate?=@code{#t}]
213 Construct a URI object. @var{scheme} should be a symbol, @var{port}
214 either a positive, exact integer or @code{#f}, and the rest of the
215 fields are either strings or @code{#f}. If @var{validate?} is true,
216 also run some consistency checks to make sure that the constructed URI
217 is valid.
218 @end deffn
219
220 @deffn {Scheme Procedure} uri? obj
221 @deffnx {Scheme Procedure} uri-scheme uri
222 @deffnx {Scheme Procedure} uri-userinfo uri
223 @deffnx {Scheme Procedure} uri-host uri
224 @deffnx {Scheme Procedure} uri-port uri
225 @deffnx {Scheme Procedure} uri-path uri
226 @deffnx {Scheme Procedure} uri-query uri
227 @deffnx {Scheme Procedure} uri-fragment uri
228 A predicate and field accessors for the URI record type. The URI scheme
229 will be a symbol, the port either a positive, exact integer or @code{#f},
230 and the rest either strings or @code{#f} if not present.
231 @end deffn
232
233 @deffn {Scheme Procedure} string->uri string
234 Parse @var{string} into a URI object. Return @code{#f} if the string
235 could not be parsed.
236 @end deffn
237
238 @deffn {Scheme Procedure} uri->string uri
239 Serialize @var{uri} to a string. If the URI has a port that is the
240 default port for its scheme, the port is not included in the
241 serialization.
242 @end deffn
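
As a brief illustration of the procedures described so far, the
following sketch parses the example URI from the start of this
subsection, inspects its fields, and then builds an equivalent URI from
its parts.  The variable name @code{u} is used only for this example.

@example
(use-modules (web uri))

(define u (string->uri "http://www.gnu.org/help/"))

(uri-scheme u)  @result{} http
(uri-host u)    @result{} "www.gnu.org"
(uri-path u)    @result{} "/help/"
(uri-port u)    @result{} #f

;; Build the same URI from its components and serialize it.
(uri->string (build-uri 'http #:host "www.gnu.org" #:path "/help/"))
@result{} "http://www.gnu.org/help/"
@end example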
243
244 @deffn {Scheme Procedure} declare-default-port! scheme port
245 Declare a default port for the given URI scheme.
246 @end deffn
247
248 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
249 Percent-decode the given @var{str}, according to @var{encoding}, which
250 should be the name of a character encoding.
251
252 Note that this function should not generally be applied to a full URI
253 string. For paths, use @code{split-and-decode-uri-path} instead. For
254 query strings, split the query on @code{&} and @code{=} boundaries, and
255 decode the components separately.
256
257 Note also that percent-encoded strings encode @emph{bytes}, not
258 characters. There is no guarantee that a given byte sequence is a valid
259 string encoding. Therefore this routine may signal an error if the
260 decoded bytes are not valid for the given encoding. Pass @code{#f} for
261 @var{encoding} if you want decoded bytes as a bytevector directly.
262 @xref{Ports, @code{set-port-encoding!}}, for more information on
263 character encodings.
264
265 Returns a string of the decoded characters, or a bytevector if
266 @var{encoding} was @code{#f}.
267 @end deffn
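
The advice about query strings can be sketched as follows.  Note that
@code{parse-query-string} is a hypothetical helper written for this
illustration, not a procedure exported by @code{(web uri)}, and that
mapping a key without an @samp{=} to the empty string is simply one
possible convention.

@example
(use-modules (web uri))

;; Hypothetical helper: split QUERY on `&', then split each piece on
;; the first `=', and only then percent-decode each side.
(define (parse-query-string query)
  (map (lambda (pair)
         (let ((idx (string-index pair #\=)))
           (if idx
               (cons (uri-decode (substring pair 0 idx))
                     (uri-decode (substring pair (1+ idx))))
               (cons (uri-decode pair) ""))))
       (string-split query #\&)))

(parse-query-string "q=guile%20scheme&lang=en")
@result{} (("q" . "guile scheme") ("lang" . "en"))
@end example

Decoding after splitting matters: a percent-encoded @samp{&} or
@samp{=} inside a value must not be mistaken for a delimiter.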
268
269 @c FIXME: clarify return type, indicate default values, and document
270 @c the type of unescaped-chars.
271
272 @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
273 Percent-encode any character not in the character set,
274 @var{unescaped-chars}.
275
276 The default character set includes alphanumerics from ASCII, as well as
277 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
278 other character will be percent-encoded, by writing out the character to
279 a bytevector within the given @var{encoding}, then encoding each byte as
280 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
281 the byte.
282 @end deffn
283
284 @deffn {Scheme Procedure} split-and-decode-uri-path path
285 Split @var{path} into its components, and decode each component,
286 removing empty components.
287
288 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
289 @code{("foo" "bar baz")}.
290 @end deffn
291
292 @deffn {Scheme Procedure} encode-and-join-uri-path parts
293 URI-encode each element of @var{parts}, which should be a list of
294 strings, and join the parts together with @code{/} as a delimiter.
295
296 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
297 as @code{"scrambled%20eggs/biscuits%26gravy"}.
298 @end deffn
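
The two path procedures are near inverses of each other.  The sketch
below round-trips the path from the earlier example; the variable name
@code{parts} is only for illustration.

@example
(use-modules (web uri))

(define parts (split-and-decode-uri-path "/foo/bar%20baz/"))

parts
@result{} ("foo" "bar baz")

;; Joining does not add a leading slash, so prepend one by hand.
(string-append "/" (encode-and-join-uri-path parts))
@result{} "/foo/bar%20baz"
@end example

Because empty components are dropped when splitting, the trailing slash
of the original path is not preserved by this round trip.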
299
300 @node HTTP
301 @subsection The Hyper-Text Transfer Protocol
302
303 The initial motivation for including web functionality in Guile, rather
304 than rely on an external package, was to establish a standard base on
305 which people can share code. To that end, we continue the focus on data
306 types by providing a number of low-level parsers and unparsers for
307 elements of the HTTP protocol.
308
309 If you want to skip the low-level details for now and move on to web
310 pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
311 HTTP module, and read on.
312
313 @example
314 (use-modules (web http))
315 @end example
316
317 The focus of the @code{(web http)} module is to parse and unparse
318 standard HTTP headers, representing them to Guile as native data
319 structures. For example, a @code{Date:} header will be represented as a
320 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
321
322 Guile tries to follow RFCs fairly strictly---the road to perdition being
323 paved with compatibility hacks---though some allowances are made for
324 not-too-divergent texts.
325
326 Header names are represented as lower-case symbols.
327
328 @deffn {Scheme Procedure} string->header name
329 Parse @var{name} to a symbolic header name.
330 @end deffn
331
332 @deffn {Scheme Procedure} header->string sym
333 Return the string form for the header named @var{sym}.
334 @end deffn
335
336 For example:
337
338 @example
339 (string->header "Content-Length")
340 @result{} content-length
341 (header->string 'content-length)
342 @result{} "Content-Length"
343
344 (string->header "FOO")
345 @result{} foo
346 (header->string 'foo)
347 @result{} "Foo"
348 @end example
349
350 Guile keeps a registry of known headers, their string names, and some
351 parsing and serialization procedures. If a header is unknown, its
352 string name is simply its symbol name in title-case.
353
354 @deffn {Scheme Procedure} known-header? sym
355 Return @code{#t} if @var{sym} is a known header, with associated
356 parsers and serialization procedures, or @code{#f} otherwise.
357 @end deffn
358
359 @deffn {Scheme Procedure} header-parser sym
360 Return the value parser for headers named @var{sym}. The result is a
361 procedure that takes one argument, a string, and returns the parsed
362 value. If the header isn't known to Guile, a default parser is returned
363 that passes through the string unchanged.
364 @end deffn
365
366 @deffn {Scheme Procedure} header-validator sym
367 Return a predicate which returns @code{#t} if the given value is valid
368 for headers named @var{sym}. The default validator for unknown headers
369 is @code{string?}.
370 @end deffn
371
372 @deffn {Scheme Procedure} header-writer sym
373 Return a procedure that writes values for headers named @var{sym} to a
374 port. The resulting procedure takes two arguments: a value and a port.
375 The default writer is @code{display}.
376 @end deffn
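
To make the registry more concrete, here is a small sketch that looks up
the parser, validator, and writer for a known header, and contrasts them
with the defaults used for an unknown one.  The header name
@code{x-made-up} is an invented name, assumed not to be registered.

@example
(use-modules (web http))

(known-header? 'content-length)            @result{} #t
(known-header? 'x-made-up)                 @result{} #f

;; The parser turns the wire text into a Scheme value.
((header-parser 'content-length) "300")    @result{} 300

;; The validator accepts the parsed value, not the raw string.
((header-validator 'content-length) 300)   @result{} #t
((header-validator 'content-length) "300") @result{} #f

;; Unknown headers pass through unchanged.
((header-parser 'x-made-up) "anything")    @result{} "anything"

;; The writer takes a value and a port.
(call-with-output-string
  (lambda (port)
    ((header-writer 'content-length) 300 port)))
@result{} "300"
@end example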
377
378 For more on the set of headers that Guile knows about out of the box,
379 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
380 procedure:
381
382 @deffn {Scheme Procedure} declare-header! name parser validator writer @
383 [#:multiple?=@code{#f}]
384 Declare a parser, validator, and writer for a given header.
385 @end deffn
386
387 For example, let's say you are running a web server behind some sort of
388 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
389 the IPv4 address of the original client. You would like for the HTTP
390 request record to parse out this header to a Scheme value, instead of
391 leaving it as a string. You could register this header with Guile's
392 HTTP stack like this:
393
394 @example
395 (declare-header! "X-Client-Address"
396 (lambda (str)
397 (inet-aton str))
398 (lambda (ip)
399 (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
400 (lambda (ip port)
401 (display (inet-ntoa ip) port)))
402 @end example
403
404 @deffn {Scheme Procedure} declare-opaque-header! name
405 A specialised version of @code{declare-header!} for the case in which
406 you want a header's value to be returned/written ``as-is''.
407 @end deffn
408
409 @deffn {Scheme Procedure} valid-header? sym val
410 Return a true value if @var{val} is a valid Scheme value for the header
411 with name @var{sym}, or @code{#f} otherwise.
412 @end deffn
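
Continuing the @code{X-Client-Address} example, the following sketch
shows what such a declaration buys you once it is in place.  It repeats
the declaration so that it can be evaluated on its own; the header
itself remains a made-up example rather than a standard one.

@example
(use-modules (web http))

(declare-header! "X-Client-Address"
  (lambda (str) (inet-aton str))
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))

;; Parsing now yields an integer rather than a string.
(parse-header 'x-client-address "127.0.0.1")
@result{} 2130706433

(valid-header? 'x-client-address 2130706433)
@result{} #t

;; Writing converts the integer back to dotted-quad form.
(call-with-output-string
  (lambda (port)
    (write-header 'x-client-address 2130706433 port)))
@result{} "X-Client-Address: 127.0.0.1\r\n"
@end example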
413
414 Now that we have a generic interface for reading and writing headers, we
415 do just that.
416
417 @deffn {Scheme Procedure} read-header port
418 Read one HTTP header from @var{port}. Return two values: the header
419 name and the parsed Scheme value. May raise an exception if the header
420 was known but the value was invalid.
421
422 Returns the end-of-file object for both values if the end of the message
423 headers was reached (i.e., a blank line).
424 @end deffn
425
426 @deffn {Scheme Procedure} parse-header name val
427 Parse @var{val}, a string, with the parser for the header named
428 @var{name}. Returns the parsed value.
429 @end deffn
430
431 @deffn {Scheme Procedure} write-header name val port
432 Write the given header name and value to @var{port}, using the writer
433 from @code{header-writer}.
434 @end deffn
435
436 @deffn {Scheme Procedure} read-headers port
437 Read the headers of an HTTP message from @var{port}, returning them
438 as an ordered alist.
439 @end deffn
440
441 @deffn {Scheme Procedure} write-headers headers port
442 Write the given header alist to @var{port}. Doesn't write the final
443 @samp{\r\n}, as the user might want to add another header.
444 @end deffn
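
String ports make it easy to try these procedures without a network
connection.  The sketch below reads a small, hand-written header block
and writes one back out; the variable name @code{raw} and the sample
header text are made up for this illustration.

@example
(use-modules (web http))

;; Two headers, followed by the blank line that ends the header block.
(define raw "Content-Length: 300\r\nHost: gnu.org:80\r\n\r\n")

(call-with-input-string raw read-headers)
@result{} ((content-length . 300) (host "gnu.org" . 80))

;; Writing emits one line per header, without the final blank line.
(call-with-output-string
  (lambda (port)
    (write-headers '((content-length . 300)) port)))
@result{} "Content-Length: 300\r\n"
@end example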
445
446 The @code{(web http)} module also has some utility procedures to read
447 and write request and response lines.
448
449 @deffn {Scheme Procedure} parse-http-method str [start] [end]
450 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
451 like @code{GET}.
452 @end deffn
453
454 @deffn {Scheme Procedure} parse-http-version str [start] [end]
455 Parse an HTTP version from @var{str}, returning it as a major--minor
456 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
457 @code{(1 . 1)}.
458 @end deffn
459
460 @deffn {Scheme Procedure} parse-request-uri str [start] [end]
461 Parse a URI from an HTTP request line. Note that URIs in requests do not
462 have to have a scheme or host name. The result is a URI object.
463 @end deffn
464
465 @deffn {Scheme Procedure} read-request-line port
466 Read the first line of an HTTP request from @var{port}, returning three
467 values: the method, the URI, and the version.
468 @end deffn
469
470 @deffn {Scheme Procedure} write-request-line method uri version port
471 Write the first line of an HTTP request to @var{port}.
472 @end deffn
473
474 @deffn {Scheme Procedure} read-response-line port
475 Read the first line of an HTTP response from @var{port}, returning three
476 values: the HTTP version, the response code, and the ``reason phrase''.
477 @end deffn
478
479 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
480 Write the first line of an HTTP response to @var{port}.
481 @end deffn
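
As a sketch of how the line-level procedures fit together, the following
reads a request line and a response line from string ports and collects
the multiple return values into lists.  The sample lines are made up for
this illustration, and only the path of the request URI is shown.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")        @result{} GET
(parse-http-version "HTTP/1.1")  @result{} (1 . 1)

;; read-request-line returns three values: method, URI, version.
(call-with-values
  (lambda ()
    (call-with-input-string "GET /help/ HTTP/1.1\r\n"
      read-request-line))
  (lambda (method uri version)
    (list method (uri-path uri) version)))
@result{} (GET "/help/" (1 . 1))

;; read-response-line returns version, code, and reason phrase.
(call-with-values
  (lambda ()
    (call-with-input-string "HTTP/1.1 200 OK\r\n"
      read-response-line))
  list)
@result{} ((1 . 1) 200 "OK")
@end example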
482
483
484 @node HTTP Headers
485 @subsection HTTP Headers
486
487 In addition to defining the infrastructure to parse headers, the
488 @code{(web http)} module defines specific parsers and unparsers for all
489 headers defined in the HTTP/1.1 standard.
490
491 For example, if you receive a header named @samp{Accept-Language} with a
492 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
493 below):
494
495 @example
496 (parse-header 'accept-language "en, es;q=0.8")
497 @result{} ((1000 . "en") (800 . "es"))
498 @end example
499
500 The format of the value for @samp{Accept-Language} headers is defined
501 below, along with all other headers defined in the HTTP standard. (If
502 the header were unknown, the value would have been returned as a
503 string.)
504
505 For brevity, the header definitions below are given in the form,
506 @var{Type} @code{@var{name}}, indicating that values for the header
507 @code{@var{name}} will be of the given @var{Type}. Since Guile
508 internally treats header names in lower case, in this document we give
509 types title-cased names. A short description of each header's
510 purpose and an example follow.
511
512 For full details on the meanings of all of these headers, see the HTTP
513 1.1 standard, RFC 2616.
514
515 @subsubsection HTTP Header Types
516
517 Here we define the types that are used below, when defining headers.
518
519 @deftp {HTTP Header Type} Date
520 A SRFI-19 date.
521 @end deftp
522
523 @deftp {HTTP Header Type} KVList
524 A list whose elements are keys or key-value pairs. Keys are parsed to
525 symbols. Values are strings by default. Non-string values are the
526 exception, and are mentioned explicitly below, as appropriate.
527 @end deftp
528
529 @deftp {HTTP Header Type} SList
530 A list of strings.
531 @end deftp
532
533 @deftp {HTTP Header Type} Quality
534 An exact integer between 0 and 1000. Qualities are used to express
535 preference, given multiple options. An option with a quality of 870,
536 for example, is preferred over an option with quality 500.
537
538 (Qualities are written out over the wire as numbers between 0.0 and
539 1.0, but since the standard only allows three digits after the decimal,
540 it's equivalent to integers between 0 and 1000, so that's what Guile
541 uses.)
542 @end deftp
543
544 @deftp {HTTP Header Type} QList
545 A quality list: a list of pairs, the car of which is a quality, and the
546 cdr a string. Used to express a list of options, along with their
547 qualities.
548 @end deftp
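
Since a quality list is an ordinary list of pairs, ordinary list
procedures apply to it.  As a minimal sketch, the hypothetical helper
@code{options-by-preference} below (not part of @code{(web http)})
orders the options from most to least preferred.

@example
(use-modules (web http))

;; Hypothetical helper: sort a QList by descending quality and
;; return just the option strings.
(define (options-by-preference qlist)
  (map cdr
       (stable-sort qlist
                    (lambda (a b) (> (car a) (car b))))))

(parse-header 'accept-language "es;q=0.8, en")
@result{} ((800 . "es") (1000 . "en"))

(options-by-preference
 (parse-header 'accept-language "es;q=0.8, en"))
@result{} ("en" "es")
@end example

Using @code{stable-sort} keeps options of equal quality in the order in
which the client listed them.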
549
550 @deftp {HTTP Header Type} ETag
551 An entity tag, represented as a pair. The car of the pair is an opaque
552 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
553 tag, and @code{#f} otherwise.
554 @end deftp
555
556 @subsubsection General Headers
557
558 General HTTP headers may be present in any HTTP message.
559
560 @deftypevr {HTTP Header} KVList cache-control
561 A key-value list of cache-control directives. See RFC 2616, for more
562 details.
563
564 If present, parameters to @code{max-age}, @code{max-stale},
565 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
566 integers.
567
568 If present, parameters to @code{private} and @code{no-cache} are parsed
569 as lists of header names, as symbols.
570
571 @example
572 (parse-header 'cache-control "no-cache,no-store")
573 @result{} (no-cache no-store)
574 (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
575 @result{} ((no-cache . (authorization date)) no-store)
576 (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
577 @result{} ((no-cache . (authorization date)) (max-age . 10))
578 @end example
579 @end deftypevr
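
Because a key-value list mixes bare keys with key-value pairs, a plain
@code{assq} lookup does not cover every case.  The following sketch
defines a hypothetical helper, @code{kvlist-ref}, which is not part of
@code{(web http)}; returning @code{#t} for a bare key is a convention
chosen for this example.

@example
(use-modules (web http))

;; Hypothetical helper: return the value for KEY, #t if KEY is
;; present without a value, or #f if it is absent.
(define (kvlist-ref kvlist key)
  (let lp ((items kvlist))
    (cond
     ((null? items) #f)
     ((eq? (car items) key) #t)
     ((and (pair? (car items)) (eq? (caar items) key))
      (cdar items))
     (else (lp (cdr items))))))

(define cc (parse-header 'cache-control "no-store,max-age=10"))

(kvlist-ref cc 'max-age)   @result{} 10
(kvlist-ref cc 'no-store)  @result{} #t
(kvlist-ref cc 'private)   @result{} #f
@end example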
580
581 @deftypevr {HTTP Header} List connection
582 A list of header names that apply only to this HTTP connection, as
583 symbols. Additionally, the symbol @samp{close} may be present, to
584 indicate that the server should close the connection after responding to
585 the request.
586 @example
587 (parse-header 'connection "close")
588 @result{} (close)
589 @end example
590 @end deftypevr
591
592 @deftypevr {HTTP Header} Date date
593 The date that a given HTTP message was originated.
594 @example
595 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
596 @result{} #<date ...>
597 @end example
598 @end deftypevr
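
Because the parsed value is a SRFI-19 date record, its fields can be
read with the usual SRFI-19 accessors.  A short sketch follows; the
variable name @code{d} is only for illustration.

@example
(use-modules (web http) (srfi srfi-19))

(define d (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT"))

(date-year d)         @result{} 1994
(date-month d)        @result{} 11
(date-hour d)         @result{} 8
(date-zone-offset d)  @result{} 0
@end example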
599
600 @deftypevr {HTTP Header} KVList pragma
601 A key-value list of implementation-specific directives.
602 @example
603 (parse-header 'pragma "no-cache, broccoli=tasty")
604 @result{} (no-cache (broccoli . "tasty"))
605 @end example
606 @end deftypevr
607
608 @deftypevr {HTTP Header} List trailer
609 A list of header names which will appear after the message body, instead
610 of with the message headers.
611 @example
612 (parse-header 'trailer "ETag")
613 @result{} (etag)
614 @end example
615 @end deftypevr
616
617 @deftypevr {HTTP Header} List transfer-encoding
618 A list of transfer codings, expressed as key-value lists. The only
619 transfer coding defined by the specification is @code{chunked}.
620 @example
621 (parse-header 'transfer-encoding "chunked")
622 @result{} ((chunked))
623 @end example
624 @end deftypevr
625
626 @deftypevr {HTTP Header} List upgrade
627 A list of strings, indicating additional protocols that a server could use
628 in response to a request.
629 @example
630 (parse-header 'upgrade "WebSocket")
631 @result{} ("WebSocket")
632 @end example
633 @end deftypevr
634
635 @c FIXME: parse out more fully?
636 @deftypevr {HTTP Header} List via
637 A list of strings, indicating the protocol versions and hosts of
638 intermediate servers and proxies. There may be multiple @code{via}
639 headers in one message.
640 @example
641 (parse-header 'via "1.0 venus, 1.1 mars")
642 @result{} ("1.0 venus" "1.1 mars")
643 @end example
644 @end deftypevr
645
646 @deftypevr {HTTP Header} List warning
647 A list of warnings given by a server or intermediate proxy. Each
648 warning is itself a list of four elements: a code, as an exact integer
649 between 0 and 1000, a host as a string, the warning text as a string,
650 and either @code{#f} or a SRFI-19 date.
651
652 There may be multiple @code{warning} headers in one message.
653 @example
654 (parse-header 'warning "123 foo \"core breach imminent\"")
655 @result{} ((123 "foo" "core breach imminent" #f))
656 @end example
657 @end deftypevr
658
659
660 @subsubsection Entity Headers
661
662 Entity headers may be present in any HTTP message, and refer to the
663 resource referenced in the HTTP request or response.
664
665 @deftypevr {HTTP Header} List allow
666 A list of allowed methods on a given resource, as symbols.
667 @example
668 (parse-header 'allow "GET, HEAD")
669 @result{} (GET HEAD)
670 @end example
671 @end deftypevr
672
673 @deftypevr {HTTP Header} List content-encoding
674 A list of content codings, as symbols.
675 @example
676 (parse-header 'content-encoding "gzip")
677 @result{} (gzip)
678 @end example
679 @end deftypevr
680
681 @deftypevr {HTTP Header} List content-language
682 The languages that a resource is in, as strings.
683 @example
684 (parse-header 'content-language "en")
685 @result{} ("en")
686 @end example
687 @end deftypevr
688
689 @deftypevr {HTTP Header} UInt content-length
690 The number of bytes in a resource, as an exact, non-negative integer.
691 @example
692 (parse-header 'content-length "300")
693 @result{} 300
694 @end example
695 @end deftypevr
696
697 @deftypevr {HTTP Header} URI content-location
698 The canonical URI for a resource, in the case that it is also accessible
699 from a different URI.
700 @example
701 (parse-header 'content-location "http://example.com/foo")
702 @result{} #<<uri> ...>
703 @end example
704 @end deftypevr
705
706 @deftypevr {HTTP Header} String content-md5
707 The MD5 digest of a resource.
708 @example
709 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
710 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
711 @end example
712 @end deftypevr
713
714 @deftypevr {HTTP Header} List content-range
715 A range specification, as a list of three elements: the symbol
716 @code{bytes}, either the symbol @code{*} or a pair of integers,
717 indicating the byte range, and either @code{*} or an integer, for the
718 instance length. Used to indicate that a response only includes part of
719 a resource.
720 @example
721 (parse-header 'content-range "bytes 10-20/*")
722 @result{} (bytes (10 . 20) *)
723 @end example
724 @end deftypevr
725
726 @deftypevr {HTTP Header} List content-type
727 The MIME type of a resource, as a symbol, along with any parameters.
728 @example
729 (parse-header 'content-type "text/plain")
730 @result{} (text/plain)
731 (parse-header 'content-type "text/plain;charset=utf-8")
732 @result{} (text/plain (charset . "utf-8"))
733 @end example
734 Note that the @code{charset} parameter is something of a misnomer, and
735 the HTTP specification admits this. It specifies the @emph{encoding} of
736 the characters, not the character set.
737 @end deftypevr
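
Since the parameters of a parsed @code{content-type} value form an
association list, they can be looked up with @code{assq-ref}.  The
helper name @code{content-type-charset} below is hypothetical, chosen
for this sketch only.

@example
(use-modules (web http))

;; Hypothetical helper: the car of the parsed value is the media
;; type; the cdr is an alist of parameters.
(define (content-type-charset content-type)
  (assq-ref (cdr content-type) 'charset))

(content-type-charset
 (parse-header 'content-type "text/plain;charset=utf-8"))
@result{} "utf-8"

(content-type-charset
 (parse-header 'content-type "text/plain"))
@result{} #f
@end example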
738
739 @deftypevr {HTTP Header} Date expires
740 The date/time after which the resource given in a response is considered
741 stale.
742 @example
743 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
744 @result{} #<date ...>
745 @end example
746 @end deftypevr
747
748 @deftypevr {HTTP Header} Date last-modified
749 The date/time on which the resource given in a response was last
750 modified.
751 @example
752 (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
753 @result{} #<date ...>
754 @end example
755 @end deftypevr
756
757
758 @subsubsection Request Headers
759
760 Request headers may only appear in an HTTP request, not in a response.
761
762 @deftypevr {HTTP Header} List accept
763 A list of preferred media types for a response. Each element of the
764 list is itself a list, in the same format as @code{content-type}.
765 @example
766 (parse-header 'accept "text/html,text/plain;charset=utf-8")
767 @result{} ((text/html) (text/plain (charset . "utf-8")))
768 @end example
769 Preference is expressed with quality values:
770 @example
771 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
772 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
773 @end example
774 @end deftypevr
775
776 @deftypevr {HTTP Header} QList accept-charset
777 A quality list of acceptable charsets. Note again that what HTTP calls
778 a ``charset'' is what Guile calls a ``character encoding''.
779 @example
780 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
781 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
782 @end example
783 @end deftypevr
784
785 @deftypevr {HTTP Header} QList accept-encoding
786 A quality list of acceptable content codings.
787 @example
788 (parse-header 'accept-encoding "gzip,identity;q=0.8")
789 @result{} ((1000 . "gzip") (800 . "identity"))
790 @end example
791 @end deftypevr
792
793 @deftypevr {HTTP Header} QList accept-language
794 A quality list of acceptable languages.
795 @example
796 (parse-header 'accept-language "cn,en;q=0.75")
797 @result{} ((1000 . "cn") (750 . "en"))
798 @end example
799 @end deftypevr
800
801 @deftypevr {HTTP Header} Pair authorization
802 Authorization credentials. The car of the pair indicates the
803 authentication scheme, like @code{basic}. For basic authentication, the
804 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
805 string. For other authentication schemes, like @code{digest}, the cdr
806 will be a key-value list of credentials.
807 @example
(parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
809 @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
810 @end example
811 @end deftypevr
812
813 @deftypevr {HTTP Header} List expect
814 A list of expectations that a client has of a server. The expectations
815 are key-value lists.
816 @example
817 (parse-header 'expect "100-continue")
818 @result{} ((100-continue))
819 @end example
820 @end deftypevr
821
822 @deftypevr {HTTP Header} String from
823 The email address of a user making an HTTP request.
824 @example
825 (parse-header 'from "bob@@example.com")
826 @result{} "bob@@example.com"
827 @end example
828 @end deftypevr
829
830 @deftypevr {HTTP Header} Pair host
831 The host for the resource being requested, as a hostname-port pair. If
832 no port is given, the port is @code{#f}.
833 @example
834 (parse-header 'host "gnu.org:80")
835 @result{} ("gnu.org" . 80)
836 (parse-header 'host "gnu.org")
837 @result{} ("gnu.org" . #f)
838 @end example
839 @end deftypevr
840
841 @deftypevr {HTTP Header} *|List if-match
842 A set of etags, indicating that the request should proceed if and only
843 if the etag of the resource is in that set. Either the symbol @code{*},
844 indicating any etag, or a list of entity tags.
845 @example
846 (parse-header 'if-match "*")
847 @result{} *
(parse-header 'if-match "\"asdfadf\"")
849 @result{} (("asdfadf" . #t))
(parse-header 'if-match "W/\"asdfadf\"")
851 @result{} (("asdfadf" . #f))
852 @end example
853 @end deftypevr
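
A server might test a resource's current entity tag against a parsed
@code{if-match} value along the following lines. This is only a sketch:
@code{if-match-satisfied?} is a hypothetical helper, and it assumes the
strong comparison that HTTP prescribes for this header, so a weak tag
never matches.

@example
(use-modules (web http))

;; Hypothetical helper: does ETAG, a (string . strong?) pair,
;; satisfy the parsed IF-MATCH value?
(define (if-match-satisfied? if-match etag)
  (or (eq? if-match '*)
      (and (cdr etag)               ; the resource's tag must be strong
           (member etag if-match)   ; and equal to a strong tag in the set
           #t)))

(if-match-satisfied? (parse-header 'if-match "*") '("v1" . #t))
@result{} #t
(if-match-satisfied? '(("v1" . #t) ("v2" . #t)) '("v3" . #t))
@result{} #f
@end example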
854
855 @deftypevr {HTTP Header} Date if-modified-since
856 Indicates that a response should proceed if and only if the resource has
857 been modified since the given date.
858 @example
859 (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
860 @result{} #<date ...>
861 @end example
862 @end deftypevr
863
864 @deftypevr {HTTP Header} *|List if-none-match
865 A set of etags, indicating that the request should proceed if and only
866 if the etag of the resource is not in the set. Either the symbol
867 @code{*}, indicating any etag, or a list of entity tags.
868 @example
869 (parse-header 'if-none-match "*")
870 @result{} *
871 @end example
872 @end deftypevr
873
874 @deftypevr {HTTP Header} ETag|Date if-range
875 Indicates that the range request should proceed if and only if the
876 resource matches a modification date or an etag. Either an entity tag,
877 or a SRFI-19 date.
878 @example
879 (parse-header 'if-range "\"original-etag\"")
880 @result{} ("original-etag" . #t)
881 @end example
882 @end deftypevr
883
884 @deftypevr {HTTP Header} Date if-unmodified-since
885 Indicates that a response should proceed if and only if the resource has
886 not been modified since the given date.
887 @example
(parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
889 @result{} #<date ...>
890 @end example
891 @end deftypevr
892
893 @deftypevr {HTTP Header} UInt max-forwards
894 The maximum number of proxy or gateway hops that a request should be
895 subject to.
896 @example
897 (parse-header 'max-forwards "10")
898 @result{} 10
899 @end example
900 @end deftypevr
901
902 @deftypevr {HTTP Header} Pair proxy-authorization
903 Authorization credentials for a proxy connection. See the documentation
904 for @code{authorization} above for more information on the format.
905 @example
(parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
907 @result{} (digest (foo . "bar") (baz . "qux"))
908 @end example
909 @end deftypevr
910
911 @deftypevr {HTTP Header} Pair range
912 A range request, indicating that the client wants only part of a
913 resource. The car of the pair is the symbol @code{bytes}, and the cdr
914 is a list of pairs. Each element of the cdr indicates a range; the car
915 is the first byte position and the cdr is the last byte position, as
916 integers, or @code{#f} if not given.
917 @example
918 (parse-header 'range "bytes=10-30,50-")
919 @result{} (bytes (10 . 30) (50 . #f))
920 @end example
921 @end deftypevr
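
To serve such a request, each range has to be resolved against the
actual length of the resource. The following sketch assumes that the
first byte position is always present (it does not handle suffix ranges)
and that the range starts within the resource; @code{resolve-range} is
a hypothetical helper.

@example
(use-modules (web http))

;; Hypothetical helper: turn one (first . last) pair into concrete
;; inclusive byte offsets for a resource of LEN bytes.
(define (resolve-range range len)
  (let ((start (car range))
        (end (or (cdr range) (- len 1))))
    (cons start (min end (- len 1)))))

(map (lambda (range) (resolve-range range 100))
     (cdr (parse-header 'range "bytes=10-30,50-")))
@result{} ((10 . 30) (50 . 99))
@end example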
922
923 @deftypevr {HTTP Header} URI referer
924 The URI of the resource that referred the user to this resource. The
925 name of the header is a misspelling, but we are stuck with it.
926 @example
927 (parse-header 'referer "http://www.gnu.org/")
928 @result{} #<uri ...>
929 @end example
930 @end deftypevr
931
932 @deftypevr {HTTP Header} List te
933 A list of transfer codings, expressed as key-value lists. A common
934 transfer coding is @code{trailers}.
935 @example
936 (parse-header 'te "trailers")
937 @result{} ((trailers))
938 @end example
939 @end deftypevr
940
941 @deftypevr {HTTP Header} String user-agent
942 A string indicating the user agent making the request. The
943 specification defines a structured format for this header, but it is
944 widely disregarded, so Guile does not attempt to parse strictly.
945 @example
946 (parse-header 'user-agent "Mozilla/5.0")
947 @result{} "Mozilla/5.0"
948 @end example
949 @end deftypevr
950
951
952 @subsubsection Response Headers
953
954 @deftypevr {HTTP Header} List accept-ranges
955 A list of range units that the server supports, as symbols.
956 @example
957 (parse-header 'accept-ranges "bytes")
958 @result{} (bytes)
959 @end example
960 @end deftypevr
961
962 @deftypevr {HTTP Header} UInt age
963 The age of a cached response, in seconds.
964 @example
965 (parse-header 'age "3600")
966 @result{} 3600
967 @end example
968 @end deftypevr
969
970 @deftypevr {HTTP Header} ETag etag
971 The entity-tag of the resource.
972 @example
973 (parse-header 'etag "\"foo\"")
974 @result{} ("foo" . #t)
975 @end example
976 @end deftypevr
977
978 @deftypevr {HTTP Header} URI location
979 A URI on which a request may be completed. Used in combination with a
980 redirecting status code to perform client-side redirection.
981 @example
982 (parse-header 'location "http://example.com/other")
983 @result{} #<uri ...>
984 @end example
985 @end deftypevr
986
987 @deftypevr {HTTP Header} List proxy-authenticate
988 A list of challenges to a proxy, indicating the need for authentication.
989 @example
990 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
991 @result{} ((basic (realm . "foo")))
992 @end example
993 @end deftypevr
994
995 @deftypevr {HTTP Header} UInt|Date retry-after
996 Used in combination with a server-busy status code, like 503, to
997 indicate that a client should retry later. Either a number of seconds,
998 or a date.
999 @example
1000 (parse-header 'retry-after "60")
1001 @result{} 60
1002 @end example
1003 @end deftypevr
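
Because the parsed value is either an integer or a date, a client
usually normalizes it to a delay in seconds. A sketch using SRFI-19,
where @code{retry-delay} is a hypothetical helper and its second
argument is assumed to be a SRFI-19 UTC time object:

@example
(use-modules (srfi srfi-19))

;; Hypothetical helper: seconds to wait, given the parsed header
;; value and the current time; #f for any other kind of value.
(define (retry-delay retry-after now)
  (cond
   ((integer? retry-after) retry-after)
   ((date? retry-after)
    (max 0 (time-second
            (time-difference (date->time-utc retry-after) now))))
   (else #f)))

(retry-delay 60 (current-time time-utc))
@result{} 60
@end example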
1004
1005 @deftypevr {HTTP Header} String server
1006 A string identifying the server.
1007 @example
1008 (parse-header 'server "My first web server")
1009 @result{} "My first web server"
1010 @end example
1011 @end deftypevr
1012
1013 @deftypevr {HTTP Header} *|List vary
1014 A set of request headers that were used in computing this response.
1015 Used to indicate that server-side content negotiation was performed, for
1016 example in response to the @code{accept-language} header. Can also be
1017 the symbol @code{*}, indicating that all headers were considered.
1018 @example
1019 (parse-header 'vary "Accept-Language, Accept")
1020 @result{} (accept-language accept)
1021 @end example
1022 @end deftypevr
1023
1024 @deftypevr {HTTP Header} List www-authenticate
1025 A list of challenges to a user, indicating the need for authentication.
1026 @example
1027 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1028 @result{} ((basic (realm . "foo")))
1029 @end example
1030 @end deftypevr
1031
1032 @node Transfer Codings
1033 @subsection Transfer Codings
1034
1035 HTTP 1.1 allows for various transfer codings to be applied to message
1036 bodies. These include various types of compression, and HTTP chunked
encoding. Currently, only chunked encoding is supported by Guile.
1038
1039 Chunked coding is an optional coding that may be applied to message
1040 bodies, to allow messages whose length is not known beforehand to be
1041 returned. Such messages can be split into chunks, terminated by a final
1042 zero length chunk.
1043
To make dealing with encodings simpler, Guile provides
1045 procedures to create ports that ``wrap'' existing ports, applying
1046 transformations transparently under the hood.
1047
1048 These procedures are in the @code{(web http)} module.
1049
1050 @example
1051 (use-modules (web http))
1052 @end example
1053
1054 @deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f]
Returns a new port that transparently reads and decodes chunk-encoded
1056 data from @var{port}. If no more chunk-encoded data is available, it
1057 returns the end-of-file object. When the port is closed, @var{port} will
1058 also be closed, unless @var{keep-alive?} is true.
1059 @end deffn
1060
1061 @example
1062 (use-modules (ice-9 rdelim))
1063
1064 (define s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n")
1065 (define p (make-chunked-input-port (open-input-string s)))
(read-line p)
@result{} "First line"
(read-line p)
@result{} " Second line"
1070 @end example
1071
1072 @deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f]
Returns a new port that transparently encodes data as chunk-encoded
before writing it to @var{port}. Writes to this port are buffered until
the port is flushed, at which point a chunk containing all the data
buffered so far is written. When the port is closed, any remaining data
is written to @var{port}, followed by the terminating zero chunk.
Closing the port also closes @var{port}, unless @var{keep-alive?} is
true.

Note that forcing a chunked output port when no data is buffered does
not write a zero chunk, as this would cause the data to be interpreted
incorrectly by the client.
1084 @end deffn
1085
1086 @example
(call-with-output-string
  (lambda (out)
    (define out* (make-chunked-output-port out #:keep-alive? #t))
    (display "first chunk" out*)
    (force-output out*)
    (force-output out*) ; note this does not write a zero chunk
    (display "second chunk" out*)
    (close-port out*)))
1095 @result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n"
1096 @end example
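
The two kinds of port can be combined to check a round trip. The
following sketch (variable names are arbitrary) encodes two writes as
chunks into a string and then decodes that string again:

@example
(use-modules (web http) (ice-9 rdelim))

(define encoded
  (call-with-output-string
    (lambda (out)
      (let ((chunked (make-chunked-output-port out #:keep-alive? #t)))
        (display "hello, " chunked)
        (force-output chunked)     ; emits the first chunk
        (display "world" chunked)
        (close-port chunked)))))   ; emits the rest and the zero chunk

(define decoded
  (make-chunked-input-port (open-input-string encoded)))
(read-line decoded)
@result{} "hello, world"
@end example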
1097
1098 @node Requests
1099 @subsection HTTP Requests
1100
1101 @example
1102 (use-modules (web request))
1103 @end example
1104
1105 The request module contains a data type for HTTP requests.
1106
1107 @subsubsection An Important Note on Character Sets
1108
1109 HTTP requests consist of two parts: the request proper, consisting of a
1110 request line and a set of headers, and (optionally) a body. The body
1111 might have a binary content-type, and even in the textual case its
1112 length is specified in bytes, not characters.
1113
Therefore, HTTP is a fundamentally binary protocol. However, the request
1115 line and headers are specified to be in a subset of ASCII, so they can
1116 be treated as text, provided that the port's encoding is set to an
1117 ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
1118 is just such an encoding, and happens to be very efficient for Guile.
1119
So when reading requests from the wire, or writing them out, Guile sets
the port's encoding to latin-1 and treats the request headers as text.
1123
The request body is another issue. Binary data is probably already in a
bytevector, so we use the R6RS binary output procedures to write out the
payload. Textual data must first be encoded in some character encoding,
usually UTF-8, and the resulting bytevector is then written out to the
port.
1129
1130 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1131 any loss of generality.
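
The distinction between characters and bytes matters whenever a textual
body contains non-ASCII characters. As a small illustration (the string
is arbitrary), the length that belongs in a @code{content-length} header
is that of the encoded bytevector, not of the string:

@example
(use-modules (rnrs bytevectors))

;; "caf" followed by U+00E9, LATIN SMALL LETTER E WITH ACUTE.
(define text (string-append "caf" (string (integer->char #xE9))))

(string-length text)
@result{} 4
(bytevector-length (string->utf8 text))
@result{} 5
@end example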
1132
1133 @subsubsection Request API
1134
1135 @deffn {Scheme Procedure} request? obj
1136 @deffnx {Scheme Procedure} request-method request
1137 @deffnx {Scheme Procedure} request-uri request
1138 @deffnx {Scheme Procedure} request-version request
1139 @deffnx {Scheme Procedure} request-headers request
1140 @deffnx {Scheme Procedure} request-meta request
1141 @deffnx {Scheme Procedure} request-port request
1142 A predicate and field accessors for the request type. The fields are as
1143 follows:
1144 @table @code
1145 @item method
1146 The HTTP method, for example, @code{GET}.
1147 @item uri
1148 The URI as a URI record.
1149 @item version
1150 The HTTP version pair, like @code{(1 . 1)}.
1151 @item headers
1152 The request headers, as an alist of parsed values.
1153 @item meta
1154 An arbitrary alist of other data, for example information returned in
1155 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1156 Communication}).
1157 @item port
1158 The port on which to read or write a request body, if any.
1159 @end table
1160 @end deffn
1161
1162 @deffn {Scheme Procedure} read-request port [meta='()]
1163 Read an HTTP request from @var{port}, optionally attaching the given
1164 metadata, @var{meta}.
1165
1166 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1167 (latin-1), so that reading one character reads one byte. See the
1168 discussion of character sets above, for more information.
1169
1170 Note that the body is not part of the request. Once you have read a
1171 request, you may read the body separately, and likewise for writing
1172 requests.
1173 @end deffn
1174
1175 @deffn {Scheme Procedure} build-request uri [#:method='GET] @
1176 [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] @
1177 [#:validate-headers?=#t]
1178 Construct an HTTP request object. If @var{validate-headers?} is true,
1179 the headers are each run through their respective validators.
1180 @end deffn
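
For instance, a request object might be built and inspected as follows.
This is a sketch: the URI and header values are arbitrary, and
@code{string->uri} comes from @code{(web uri)}.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((host . ("www.gnu.org" . #f))
                             (user-agent . "my-client/0.1"))))

(request-method req)
@result{} GET
(request-version req)
@result{} (1 . 1)
(request-host req)
@result{} ("www.gnu.org" . #f)
@end example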
1181
1182 @deffn {Scheme Procedure} write-request r port
1183 Write the given HTTP request to @var{port}.
1184
1185 Return a new request, whose @code{request-port} will continue writing
1186 on @var{port}, perhaps using some transfer encoding.
1187 @end deffn
1188
1189 @deffn {Scheme Procedure} read-request-body r
1190 Reads the request body from @var{r}, as a bytevector. Return @code{#f}
1191 if there was no request body.
1192 @end deffn
1193
1194 @deffn {Scheme Procedure} write-request-body r bv
1195 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1196 request @var{r}.
1197 @end deffn
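
As noted above, the body travels separately from the request object.
The following sketch writes a request and its body to a string port and
reads both back. It assumes that a @code{content-length} header is
supplied so that the reader knows how many bytes to expect; the URI and
payload are arbitrary.

@example
(use-modules (web request) (web uri) (rnrs bytevectors))

(define payload (string->utf8 "name=guile"))

(define wire
  (call-with-output-string
    (lambda (port)
      (let* ((req (build-request
                   (string->uri "http://example.com/submit")
                   #:method 'POST
                   #:port port
                   #:headers `((host . ("example.com" . #f))
                               (content-length
                                . ,(bytevector-length payload)))))
             (req (write-request req port)))
        (write-request-body req payload)))))

(define parsed (read-request (open-input-string wire)))
(request-method parsed)
@result{} POST
(utf8->string (read-request-body parsed))
@result{} "name=guile"
@end example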
1198
1199 The various headers that are typically associated with HTTP requests may
1200 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1201 more information on the format of parsed headers.
1202
1203 @deffn {Scheme Procedure} request-accept request [default='()]
1204 @deffnx {Scheme Procedure} request-accept-charset request [default='()]
1205 @deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1206 @deffnx {Scheme Procedure} request-accept-language request [default='()]
1207 @deffnx {Scheme Procedure} request-allow request [default='()]
1208 @deffnx {Scheme Procedure} request-authorization request [default=#f]
1209 @deffnx {Scheme Procedure} request-cache-control request [default='()]
1210 @deffnx {Scheme Procedure} request-connection request [default='()]
1211 @deffnx {Scheme Procedure} request-content-encoding request [default='()]
1212 @deffnx {Scheme Procedure} request-content-language request [default='()]
1213 @deffnx {Scheme Procedure} request-content-length request [default=#f]
1214 @deffnx {Scheme Procedure} request-content-location request [default=#f]
1215 @deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1216 @deffnx {Scheme Procedure} request-content-range request [default=#f]
1217 @deffnx {Scheme Procedure} request-content-type request [default=#f]
1218 @deffnx {Scheme Procedure} request-date request [default=#f]
1219 @deffnx {Scheme Procedure} request-expect request [default='()]
1220 @deffnx {Scheme Procedure} request-expires request [default=#f]
1221 @deffnx {Scheme Procedure} request-from request [default=#f]
1222 @deffnx {Scheme Procedure} request-host request [default=#f]
1223 @deffnx {Scheme Procedure} request-if-match request [default=#f]
1224 @deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1225 @deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1226 @deffnx {Scheme Procedure} request-if-range request [default=#f]
1227 @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1228 @deffnx {Scheme Procedure} request-last-modified request [default=#f]
1229 @deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1230 @deffnx {Scheme Procedure} request-pragma request [default='()]
1231 @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1232 @deffnx {Scheme Procedure} request-range request [default=#f]
1233 @deffnx {Scheme Procedure} request-referer request [default=#f]
1234 @deffnx {Scheme Procedure} request-te request [default=#f]
1235 @deffnx {Scheme Procedure} request-trailer request [default='()]
1236 @deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1237 @deffnx {Scheme Procedure} request-upgrade request [default='()]
1238 @deffnx {Scheme Procedure} request-user-agent request [default=#f]
1239 @deffnx {Scheme Procedure} request-via request [default='()]
1240 @deffnx {Scheme Procedure} request-warning request [default='()]
1241 Return the given request header, or @var{default} if none was present.
1242 @end deffn
1243
1244 @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1245 A helper routine to determine the absolute URI of a request, using the
1246 @code{host} header and the default host and port.
1247 @end deffn
1248
1249
1250 @node Responses
1251 @subsection HTTP Responses
1252
1253 @example
1254 (use-modules (web response))
1255 @end example
1256
1257 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
responses. Again, the body is represented separately from the response.
1259
1260 @deffn {Scheme Procedure} response? obj
1261 @deffnx {Scheme Procedure} response-version response
1262 @deffnx {Scheme Procedure} response-code response
1263 @deffnx {Scheme Procedure} response-reason-phrase response
1264 @deffnx {Scheme Procedure} response-headers response
1265 @deffnx {Scheme Procedure} response-port response
1266 A predicate and field accessors for the response type. The fields are as
1267 follows:
1268 @table @code
1269 @item version
1270 The HTTP version pair, like @code{(1 . 1)}.
1271 @item code
1272 The HTTP response code, like @code{200}.
1273 @item reason-phrase
1274 The reason phrase, or the standard reason phrase for the response's
1275 code.
1276 @item headers
1277 The response headers, as an alist of parsed values.
1278 @item port
1279 The port on which to read or write a response body, if any.
1280 @end table
1281 @end deffn
1282
1283 @deffn {Scheme Procedure} read-response port
1284 Read an HTTP response from @var{port}.
1285
1286 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1287 (latin-1), so that reading one character reads one byte. See the
discussion of character sets in @ref{Requests}, for more information.
1289 @end deffn
1290
1291 @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1292 Construct an HTTP response object. If @var{validate-headers?} is true,
1293 the headers are each run through their respective validators.
1294 @end deffn
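
As an illustration, the sketch below builds a response with an arbitrary
status code and header, then reads some fields back. Since no reason
phrase is supplied, the accessor falls back to the standard phrase for
the code.

@example
(use-modules (web response))

(define resp
  (build-response
   #:code 404
   #:headers '((content-type . (text/plain (charset . "utf-8"))))))

(response-code resp)
@result{} 404
(response-reason-phrase resp)
@result{} "Not Found"
(response-content-type resp)
@result{} (text/plain (charset . "utf-8"))
@end example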
1295
1296 @deffn {Scheme Procedure} adapt-response-version response version
1297 Adapt the given response to a different HTTP version. Return a new HTTP
1298 response.
1299
1300 The idea is that many applications might just build a response for the
1301 default HTTP version, and this method could handle a number of
1302 programmatic transformations to respond to older HTTP versions (0.9 and
1303 1.0). But currently this function is a bit heavy-handed, just updating
1304 the version field.
1305 @end deffn
1306
1307 @deffn {Scheme Procedure} write-response r port
1308 Write the given HTTP response to @var{port}.
1309
1310 Return a new response, whose @code{response-port} will continue writing
1311 on @var{port}, perhaps using some transfer encoding.
1312 @end deffn
1313
1314 @deffn {Scheme Procedure} response-must-not-include-body? r
1315 Some responses, like those with status code 304, are specified as never
1316 having bodies. This predicate returns @code{#t} for those responses.
1317
Note, though, that responses to @code{HEAD} requests must not have a
body either.
1320 @end deffn
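
For example, using responses built only for the purpose of this sketch:

@example
(use-modules (web response))

(response-must-not-include-body? (build-response #:code 304))
@result{} #t
(response-must-not-include-body? (build-response #:code 200))
@result{} #f
@end example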
1321
1322 @deffn {Scheme Procedure} response-body-port r [#:decode?=#t] [#:keep-alive?=#t]
1323 Return an input port from which the body of @var{r} can be read. The encoding
1324 of the returned port is set according to @var{r}'s @code{content-type} header,
1325 when it's textual, except if @var{decode?} is @code{#f}. Return @code{#f}
1326 when no body is available.
1327
1328 When @var{keep-alive?} is @code{#f}, closing the returned port also closes
1329 @var{r}'s response port.
1330 @end deffn
1331
1332 @deffn {Scheme Procedure} read-response-body r
1333 Read the response body from @var{r}, as a bytevector. Returns @code{#f}
1334 if there was no response body.
1335 @end deffn
1336
1337 @deffn {Scheme Procedure} write-response-body r bv
1338 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1339 response @var{r}.
1340 @end deffn
1341
1342 As with requests, the various headers that are typically associated with
1343 HTTP responses may be accessed with these dedicated accessors.
1344 @xref{HTTP Headers}, for more information on the format of parsed
1345 headers.
1346
1347 @deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1348 @deffnx {Scheme Procedure} response-age response [default='()]
1349 @deffnx {Scheme Procedure} response-allow response [default='()]
1350 @deffnx {Scheme Procedure} response-cache-control response [default='()]
1351 @deffnx {Scheme Procedure} response-connection response [default='()]
1352 @deffnx {Scheme Procedure} response-content-encoding response [default='()]
1353 @deffnx {Scheme Procedure} response-content-language response [default='()]
1354 @deffnx {Scheme Procedure} response-content-length response [default=#f]
1355 @deffnx {Scheme Procedure} response-content-location response [default=#f]
1356 @deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1357 @deffnx {Scheme Procedure} response-content-range response [default=#f]
1358 @deffnx {Scheme Procedure} response-content-type response [default=#f]
1359 @deffnx {Scheme Procedure} response-date response [default=#f]
1360 @deffnx {Scheme Procedure} response-etag response [default=#f]
1361 @deffnx {Scheme Procedure} response-expires response [default=#f]
1362 @deffnx {Scheme Procedure} response-last-modified response [default=#f]
1363 @deffnx {Scheme Procedure} response-location response [default=#f]
1364 @deffnx {Scheme Procedure} response-pragma response [default='()]
1365 @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1366 @deffnx {Scheme Procedure} response-retry-after response [default=#f]
1367 @deffnx {Scheme Procedure} response-server response [default=#f]
1368 @deffnx {Scheme Procedure} response-trailer response [default='()]
1369 @deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1370 @deffnx {Scheme Procedure} response-upgrade response [default='()]
1371 @deffnx {Scheme Procedure} response-vary response [default='()]
1372 @deffnx {Scheme Procedure} response-via response [default='()]
1373 @deffnx {Scheme Procedure} response-warning response [default='()]
1374 @deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
1375 Return the given response header, or @var{default} if none was present.
1376 @end deffn
1377
1378 @deffn {Scheme Procedure} text-content-type? @var{type}
1379 Return @code{#t} if @var{type}, a symbol as returned by
1380 @code{response-content-type}, represents a textual type such as
1381 @code{text/plain}.
1382 @end deffn
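
Note that a parsed @code{content-type} header is a list whose car is the
media type symbol, so the symbol has to be extracted before it is
tested. A short sketch, with an arbitrary response:

@example
(use-modules (web response))

(define resp
  (build-response #:headers '((content-type . (text/html)))))

(text-content-type? (car (response-content-type resp)))
@result{} #t
(text-content-type? 'image/png)
@result{} #f
@end example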
1383
1384
1385 @node Web Client
1386 @subsection Web Client
1387
1388 @code{(web client)} provides a simple, synchronous HTTP client, built on
1389 the lower-level HTTP, request, and response modules.
1390
1391 @example
1392 (use-modules (web client))
1393 @end example
1394
1395 @deffn {Scheme Procedure} open-socket-for-uri uri
Return an open input/output port for a connection to @var{uri}.
1397 @end deffn
1398
1399 @deffn {Scheme Procedure} http-get uri arg...
1400 @deffnx {Scheme Procedure} http-head uri arg...
1401 @deffnx {Scheme Procedure} http-post uri arg...
1402 @deffnx {Scheme Procedure} http-put uri arg...
1403 @deffnx {Scheme Procedure} http-delete uri arg...
1404 @deffnx {Scheme Procedure} http-trace uri arg...
1405 @deffnx {Scheme Procedure} http-options uri arg...
1406
1407 Connect to the server corresponding to @var{uri} and make a request over
1408 HTTP, using the appropriate method (@code{GET}, @code{HEAD}, etc.).
1409
1410 All of these procedures have the same prototype: a URI followed by an
1411 optional sequence of keyword arguments. These keyword arguments allow
1412 you to modify the requests in various ways, for example attaching a body
1413 to the request, or setting specific headers. The following table lists
1414 the keyword arguments and their default values.
1415
1416 @table @code
1417 @item #:body #f
@item #:port (open-socket-for-uri @var{uri})
1419 @item #:version '(1 . 1)
1420 @item #:keep-alive? #f
1421 @item #:headers '()
1422 @item #:decode-body? #t
1423 @item #:streaming? #f
1424 @end table
1425
1426 If you already have a port open, pass it as @var{port}. Otherwise, a
1427 connection will be opened to the server corresponding to @var{uri}. Any
1428 extra headers in the alist @var{headers} will be added to the request.
1429
1430 If @var{body} is not @code{#f}, a message body will also be sent with
1431 the HTTP request. If @var{body} is a string, it is encoded according to
1432 the content-type in @var{headers}, defaulting to UTF-8. Otherwise
1433 @var{body} should be a bytevector, or @code{#f} for no body. Although a
1434 message body may be sent with any request, usually only @code{POST} and
1435 @code{PUT} requests have bodies.
1436
If @var{decode-body?} is true, as is the default, the body of the
response will be decoded to a string, if it has a textual content-type.
Otherwise it will be returned as a bytevector.
1440
1441 However, if @var{streaming?} is true, instead of eagerly reading the
1442 response body from the server, this function only reads off the headers.
1443 The response body will be returned as a port on which the data may be
1444 read.
1445
1446 Unless @var{keep-alive?} is true, the port will be closed after the full
1447 response body has been read.
1448
Returns two values: the response read from the server, and the response
body as a string, a bytevector, @code{#f}, or a port (if
@var{streaming?} is true).
1452 @end deffn
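
A typical call receives both return values, for example with
@code{call-with-values}. The sketch below wraps this in a hypothetical
helper, @code{fetch-status}; calling it requires network access, and the
URL in the comment is only a placeholder.

@example
(use-modules (web client) (web response) (web uri))

;; Hypothetical helper: fetch URL (a string) and return two values,
;; the numeric status code and the decoded body.
(define (fetch-status url)
  (call-with-values
      (lambda () (http-get (string->uri url)))
    (lambda (response body)
      (values (response-code response) body))))

;; For example (not run here, as it needs the network):
;; (fetch-status "http://www.gnu.org/software/guile/")
@end example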
1453
1454 @code{http-get} is useful for making one-off requests to web sites. If
1455 you are writing a web spider or some other client that needs to handle a
1456 number of requests in parallel, it's better to build an event-driven URL
1457 fetcher, similar in structure to the web server (@pxref{Web Server}).
1458
Another option, good but not as performant, would be to use threads,
possibly via @code{par-map} or futures.
1461
1462 @deffn {Scheme Parameter} current-http-proxy
1463 Either @code{#f} or a non-empty string containing the URL of the HTTP
1464 proxy server to be used by the procedures in the @code{(web client)}
1465 module, including @code{open-socket-for-uri}. Its initial value is
1466 based on the @env{http_proxy} environment variable.
1467
1468 @example
1469 (current-http-proxy) @result{} "http://localhost:8123/"
(parameterize ((current-http-proxy #f))
  (http-get "http://example.com/")) ; temporarily bypass proxy
1472 (current-http-proxy) @result{} "http://localhost:8123/"
1473 @end example
1474 @end deffn
1475
1476
1477 @node Web Server
1478 @subsection Web Server
1479
1480 @code{(web server)} is a generic web server interface, along with a main
1481 loop implementation for web servers controlled by Guile.
1482
1483 @example
1484 (use-modules (web server))
1485 @end example
1486
1487 The lowest layer is the @code{<server-impl>} object, which defines a set
1488 of hooks to open a server, read a request from a client, write a
1489 response to a client, and close a server. These hooks -- @code{open},
1490 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1491 together in a @code{<server-impl>} object. Procedures in this module take a
1492 @code{<server-impl>} object, if needed.
1493
1494 A @code{<server-impl>} may also be looked up by name. If you pass the
1495 @code{http} symbol to @code{run-server}, Guile looks for a variable
1496 named @code{http} in the @code{(web server http)} module, which should
1497 be bound to a @code{<server-impl>} object. Such a binding is made by
1498 instantiation of the @code{define-server-impl} syntax. In this way the
1499 run-server loop can automatically load other backends if available.
1500
1501 The life cycle of a server goes as follows:
1502
1503 @enumerate
1504 @item
1505 The @code{open} hook is called, to open the server. @code{open} takes
1506 zero or more arguments, depending on the backend, and returns an opaque
1507 server socket object, or signals an error.
1508
1509 @item
1510 The @code{read} hook is called, to read a request from a new client.
1511 The @code{read} hook takes one argument, the server socket. It should
1512 return three values: an opaque client socket, the request, and the
1513 request body. The request should be a @code{<request>} object, from
1514 @code{(web request)}. The body should be a string or a bytevector, or
1515 @code{#f} if there is no body.
1516
If the read failed, the @code{read} hook may return @code{#f} for the
client socket, request, and body.
1519
1520 @item
1521 A user-provided handler procedure is called, with the request and body
1522 as its arguments. The handler should return two values: the response,
1523 as a @code{<response>} record from @code{(web response)}, and the
response body as a bytevector, or @code{#f} if not present.
1525
The response and response body are run through @code{sanitize-response},
1527 documented below. This allows the handler writer to take some
1528 convenient shortcuts: for example, instead of a @code{<response>}, the
1529 handler can simply return an alist of headers, in which case a default
1530 response object is constructed with those headers. Instead of a
1531 bytevector for the body, the handler can return a string, which will be
1532 serialized into an appropriate encoding; or it can return a procedure,
1533 which will be called on a port to write out the data. See the
1534 @code{sanitize-response} documentation, for more.
1535
1536 @item
1537 The @code{write} hook is called with three arguments: the client
1538 socket, the response, and the body. The @code{write} hook returns no
1539 values.
1540
1541 @item
At this point the request handling is complete. The server loop then
goes back and tries to read a new request.
1544
1545 @item
1546 If the user interrupts the loop, the @code{close} hook is called on
1547 the server socket.
1548 @end enumerate
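
From the handler's point of view, the cycle above reduces to a procedure
of two arguments that returns two values. A minimal sketch, relying on
the shortcuts described above (an alist of headers in place of a
response object, and a string in place of a bytevector);
@code{hello-handler} is a hypothetical name:

@example
(use-modules (web server))

(define (hello-handler request body)
  (values '((content-type . (text/plain)))
          "Hello, world!"))

;; Starting the server blocks until it is interrupted:
;; (run-server hello-handler)
@end example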
1549
1550 A user may define a server implementation with the following form:
1551
1552 @deffn {Scheme Syntax} define-server-impl name open read write close
1553 Make a @code{<server-impl>} object with the hooks @var{open},
1554 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1555 @var{name} in the current module.
1556 @end deffn
1557
1558 @deffn {Scheme Procedure} lookup-server-impl impl
1559 Look up a server implementation. If @var{impl} is a server
1560 implementation already, it is returned directly. If it is a symbol, the
1561 binding named @var{impl} in the @code{(web server @var{impl})} module is
1562 looked up. Otherwise an error is signaled.
1563
1564 Currently a server implementation is a somewhat opaque type, useful only
1565 for passing to other procedures in this module, like @code{read-client}.
1566 @end deffn
1567
1568 The @code{(web server)} module defines a number of routines that use
1569 @code{<server-impl>} objects to implement parts of a web server. Because
1570 the accessors for the various fields of a @code{<server-impl>} are not
1571 exposed, these routines are the only procedures with any access to the
1572 impl objects.
1573
1574 @deffn {Scheme Procedure} open-server impl open-params
1575 Open a server for the given implementation. Return one value, the new
1576 server object. The implementation's @code{open} procedure is applied to
1577 @var{open-params}, which should be a list.
1578 @end deffn
1579
1580 @deffn {Scheme Procedure} read-client impl server
1581 Read a new client from @var{server}, by applying the implementation's
1582 @code{read} procedure to the server. If successful, return three
1583 values: an object corresponding to the client, a request object, and the
1584 request body. If any exception occurs, return @code{#f} for all three
1585 values.
1586 @end deffn
1587
1588 @deffn {Scheme Procedure} handle-request handler request body state
1589 Handle a given request, returning the response and body.
1590
1591 The response and response body are produced by calling the given
1592 @var{handler} with @var{request} and @var{body} as arguments.
1593
1594 The elements of @var{state} are also passed to @var{handler} as
1595 arguments, and may be returned as additional values. The new
1596 @var{state}, collected from the @var{handler}'s return values, is then
1597 returned as a list. The idea is that a server loop receives a handler
1598 from the user, along with whatever state values the user is interested
1599 in, allowing the user's handler to explicitly manage its state.
1600 @end deffn
1601
1602 @deffn {Scheme Procedure} sanitize-response request response body
1603 ``Sanitize'' the given response and body, making them appropriate for
1604 the given request.
1605
1606 As a convenience to web handler authors, @var{response} may be given as
1607 an alist of headers, in which case it is used to construct a default
1608 response. Ensures that the response version corresponds to the request
1609 version. If @var{body} is a string, encodes the string to a bytevector,
1610 in an encoding appropriate for @var{response}. Adds a
1611 @code{content-length} and @code{content-type} header, as necessary.
1612
1613 If @var{body} is a procedure, it is called with a port as an argument,
1614 and the output collected as a bytevector. In the future we might try to
1615 instead use a compressing, chunk-encoded port, and call this procedure
1616 later, in the @code{write-client} procedure. Authors are advised not to rely on
1617 the procedure being called at any particular time.
1618 @end deffn
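
As a rough illustration of the shortcuts described above, the following
sketch calls @code{sanitize-response} by hand with an alist of headers
and a string body. The request is built with @code{build-request} purely
for the example, and the URI is an assumption; in a running server the
request would come from the @code{read} hook instead.

@example
(use-modules (web server)
             (web request)
             (web response)
             (web uri)
             (rnrs bytevectors))

;; A hypothetical request, built by hand for illustration only.
(define request
  (build-request (string->uri "http://localhost:8080/")))

(call-with-values
    (lambda ()
      ;; An alist of headers stands in for the response, and a
      ;; string stands in for the body.
      (sanitize-response request
                         '((content-type . (text/plain)))
                         "Hello World!"))
  (lambda (response body)
    ;; Collect a few facts about the sanitized values.
    (list (response? response)
          (bytevector? body)
          (response-content-length response))))
@end example

Inspecting the returned list at the REPL is a quick way to check what a
handler's shortcuts turn into before they are written to the client.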
1619
1620 @deffn {Scheme Procedure} write-client impl server client response body
1621 Write an HTTP response and body to @var{client}. If the server and
1622 client support persistent connections, it is the implementation's
1623 responsibility to keep track of the client thereafter, presumably by
1624 attaching it to the @var{server} argument somehow.
1625 @end deffn
1626
1627 @deffn {Scheme Procedure} close-server impl server
1628 Release resources allocated by a previous invocation of
1629 @code{open-server}.
1630 @end deffn
1631
1632 Given the procedures above, it is a small matter to make a web server:
1633
1634 @deffn {Scheme Procedure} serve-one-client handler impl server state
1635 Read one request from @var{server}, call @var{handler} on the request
1636 and body, and write the response to the client. Return the new state
1637 produced by the handler procedure.
1638 @end deffn
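
To show how these procedures fit together, here is a minimal sketch of
a server that handles a fixed number of requests and then shuts down.
The name @code{serve-n-requests} and the choice of port are assumptions
made for the example, not part of the @code{(web server)} module.

@example
(use-modules (web server))

;; Hypothetical helper: serve up to N requests with HANDLER, then
;; close the server.  No extra state is threaded through the handler.
(define (serve-n-requests handler n)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '(#:port 8080))))
    (let loop ((i 0) (state '()))
      (if (< i n)
          ;; serve-one-client returns the new state as a list.
          (loop (+ i 1)
                (serve-one-client handler impl server state))
          (close-server impl server)))))
@end example

A failed read still counts as an iteration in this sketch, so fewer
than @var{n} responses may actually be written.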
1639
1640 @deffn {Scheme Procedure} run-server handler @
1641 [impl='http] [open-params='()] @
1642 arg @dots{}
1643 Run Guile's built-in web server.
1644
1645 @var{handler} should be a procedure that takes two or more arguments,
1646 the HTTP request and request body, and returns two or more values, the
1647 response and response body.
1648
1649 For examples, skip ahead to the next section, @ref{Web Examples}.
1650
1651 The response and body will be run through @code{sanitize-response}
1652 before sending back to the client.
1653
1654 Additional arguments to @var{handler} are taken from @var{arg}
1655 @enddots{}. These arguments comprise a @dfn{state}. Additional return
1656 values are accumulated into a new state, which will be used for
1657 subsequent requests. In this way a handler can explicitly manage its
1658 state.
1659 @end deffn
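
As a sketch of explicit state management, the hypothetical handler below
keeps a request counter as a single state value. The name
@code{counting-handler} and the initial count of 0 are assumptions made
for the example.

@example
(use-modules (web server))

;; Hypothetical handler: takes the request, the body, and one state
;; value (the count so far), and returns the response headers, the
;; response body, and the new count.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Request number "
                         (number->string count)
                         "\n")
          (+ count 1)))

;; To serve it, pass the initial state after the open-params:
;; (run-server counting-handler 'http '() 0)
@end example

Because the state is passed in and returned explicitly, the handler
itself needs no global variables.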
1660
1661 The default web server implementation is @code{http}, which binds to a
1662 socket and listens for requests on it.
1663
1664 @deffn {HTTP Implementation} http [#:host=#f] @
1665 [#:family=AF_INET] @
1666 [#:addr=INADDR_LOOPBACK] @
1667 [#:port=8080] [#:socket]
1668 The default HTTP implementation. We document it as a function with
1669 keyword arguments, because that is precisely the way that it is -- all
1670 of the @var{open-params} to @code{run-server} get passed to the
1671 implementation's open function.
1672
1673 @example
1674 ;; The defaults: localhost:8080
1675 (run-server handler)
1676 ;; Same thing
1677 (run-server handler 'http '())
1678 ;; On a different port
1679 (run-server handler 'http '(#:port 8081))
1680 ;; IPv6
1681 (run-server handler 'http '(#:family AF_INET6 #:port 8081))
1682 ;; Custom socket
1683 (run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
1684 @end example
1685 @end deffn
1686
1687 @node Web Examples
1688 @subsection Web Examples
1689
1690 Well, enough about the tedious internals. Let's make a web application!
1691
1692 @subsubsection Hello, World!
1693
1694 The first program we have to write, of course, is ``Hello, World!''.
1695 This means that we have to implement a web handler that does what we
1696 want.
1697
1698 Now we define a handler, a function of two arguments and two return
1699 values:
1700
1701 @example
1702 (define (handler request request-body)
1703   (values @var{response} @var{response-body}))
1704 @end example
1705
1706 In this first example, we take advantage of a short-cut, returning an
1707 alist of headers instead of a proper response object. The response body
1708 is our payload:
1709
1710 @example
1711 (define (hello-world-handler request request-body)
1712   (values '((content-type . (text/plain)))
1713           "Hello World!"))
1714 @end example
1715
1716 Now let's test it, by running a server with this handler. Load up the
1717 web server module if you haven't yet done so, and run a server with this
1718 handler:
1719
1720 @example
1721 (use-modules (web server))
1722 (run-server hello-world-handler)
1723 @end example
1724
1725 By default, the web server listens for requests on
1726 @code{localhost:8080}. Visit that address in your web browser to
1727 test. If you see the string, @code{Hello World!}, sweet!
1728
1729 @subsubsection Inspecting the Request
1730
1731 The Hello World program above is a general greeter, responding to all
1732 URIs. To make a more exclusive greeter, we need to inspect the request
1733 object, and conditionally produce different results. So let's load up
1734 the request, response, and URI modules, and do just that.
1735
1736 @example
1737 (use-modules (web server)) ; you probably did this already
1738 (use-modules (web request)
1739              (web response)
1740              (web uri))
1741
1742 (define (request-path-components request)
1743   (split-and-decode-uri-path (uri-path (request-uri request))))
1744
1745 (define (hello-hacker-handler request body)
1746   (if (equal? (request-path-components request)
1747               '("hacker"))
1748       (values '((content-type . (text/plain)))
1749               "Hello hacker!")
1750       (not-found request)))
1751
1752 (run-server hello-hacker-handler)
1753 @end example
1754
1755 Here we see that we have defined a helper to return the components of
1756 the URI path as a list of strings, and used that to check for a request
1757 to @code{/hacker/}. Then the success case is just as before -- visit
1758 @code{http://localhost:8080/hacker/} in your browser to check.
1759
1760 You should always match against URI path components as decoded by
1761 @code{split-and-decode-uri-path}. The above example will work for
1762 @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
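
To see why all three paths match, you can try the decoder by itself at
the REPL. This is only a small sketch using
@code{split-and-decode-uri-path} from @code{(web uri)}:

@example
(use-modules (web uri))

;; Empty components are dropped and percent-escapes are decoded, so
;; each of these calls returns the same one-element list.
(split-and-decode-uri-path "/hacker/")
(split-and-decode-uri-path "//hacker///")
(split-and-decode-uri-path "/h%61ck%65r")
@result{} ("hacker")
@end example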
1763
1764 But we forgot to define @code{not-found}! If you are pasting these
1765 examples into a REPL, accessing any other URI in your web browser will
1766 drop your Guile console into the debugger:
1767
1768 @example
1769 <unnamed port>:38:7: In procedure module-lookup:
1770 <unnamed port>:38:7: Unbound variable: not-found
1771
1772 Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
1773 scheme@@(guile-user) [1]>
1774 @end example
1775
1776 So let's define the function, right there in the debugger. As you
1777 probably know, we'll want to return a 404 response.
1778
1779 @example
1780 ;; Paste this in your REPL
1781 (define (not-found request)
1782   (values (build-response #:code 404)
1783           (string-append "Resource not found: "
1784                          (uri->string (request-uri request)))))
1785
1786 ;; Now paste this to let the web server keep going:
1787 ,continue
1788 @end example
1789
1790 Now if you access @code{http://localhost:8080/foo/}, you get this error
1791 message. (Note that some popular web browsers won't show
1792 server-generated 404 messages, showing their own instead, unless the 404
1793 message body is long enough.)
1794
1795 @subsubsection Higher-Level Interfaces
1796
1797 The web handler interface is a common baseline that all kinds of Guile
1798 web applications can use. You will usually want to build something on
1799 top of it, however, especially when producing HTML. Here is a simple
1800 example that builds up HTML output using SXML (@pxref{SXML}).
1801
1802 First, load up the modules:
1803
1804 @example
1805 (use-modules (web server)
1806              (web request)
1807              (web response)
1808              (sxml simple))
1809 @end example
1810
1811 Now we define a simple templating function that takes a list of HTML
1812 body elements, as SXML, and puts them in our super template:
1813
1814 @example
1815 (define (templatize title body)
1816   `(html (head (title ,title))
1817          (body ,@@body)))
1818 @end example
1819
1820 For example, the simplest Hello HTML can be produced like this:
1821
1822 @example
1823 (sxml->xml (templatize "Hello!" '((b "Hi!"))))
1824 @print{}
1825 <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
1826 @end example
1827
1828 Much better to work with Scheme data types than to work with HTML as
1829 strings. Now we define a little response helper:
1830
1831 @example
1832 (define* (respond #:optional body #:key
1833                   (status 200)
1834                   (title "Hello hello!")
1835                   (doctype "<!DOCTYPE html>\n")
1836                   (content-type-params '((charset . "utf-8")))
1837                   (content-type 'text/html)
1838                   (extra-headers '())
1839                   (sxml (and body (templatize title body))))
1840   (values (build-response
1841            #:code status
1842            #:headers `((content-type
1843                         . (,content-type ,@@content-type-params))
1844                        ,@@extra-headers))
1845           (lambda (port)
1846             (if sxml
1847                 (begin
1848                   (if doctype (display doctype port))
1849                   (sxml->xml sxml port))))))
1850 @end example
1851
1852 Here we see the power of keyword arguments with default initializers. By
1853 the time the arguments are fully parsed, the @code{sxml} local variable
1854 will hold the templated SXML, ready for sending out to the client.
1855
1856 Also, instead of returning the body as a string, @code{respond} gives a
1857 procedure, which will be called by the web server to write out the
1858 response to the client.
1859
1860 Now, a simple example using this responder, which lays out the incoming
1861 headers in an HTML table.
1862
1863 @example
1864 (define (debug-page request body)
1865   (respond
1866    `((h1 "hello world!")
1867      (table
1868       (tr (th "header") (th "value"))
1869       ,@@(map (lambda (pair)
1870                 `(tr (td (tt ,(with-output-to-string
1871                                 (lambda () (display (car pair))))))
1872                      (td (tt ,(with-output-to-string
1873                                 (lambda ()
1874                                   (write (cdr pair))))))))
1875               (request-headers request))))))
1876
1877 (run-server debug-page)
1878 @end example
1879
1880 Now if you visit any local address in your web browser, you will actually
1881 see some HTML, finally.
1882
1883 @subsubsection Conclusion
1884
1885 Well, this is about as far as Guile's built-in web support goes, for
1886 now. There are many ways to make a web application, but hopefully by
1887 standardizing the most fundamental data types, users will be able to
1888 choose the approach that suits them best, while also being able to
1889 switch between implementations of the server. This is a relatively new
1890 part of Guile, so if you have feedback, let us know, and we can take it
1891 into account. Happy hacking on the web!
1892
1893 @c Local Variables:
1894 @c TeX-master: "guile.texi"
1895 @c End: