1 @c -*-texinfo-*-
2 @c This is part of the GNU Guile Reference Manual.
3 @c Copyright (C) 2010, 2011, 2012 Free Software Foundation, Inc.
4 @c See the file guile.texi for copying conditions.
5
6 @node Web
7 @section @acronym{HTTP}, the Web, and All That
8 @cindex Web
9 @cindex WWW
10 @cindex HTTP
11
12 It has always been possible to connect computers together and share
13 information between them, but the rise of the World-Wide Web over the
14 last couple of decades has made it much easier to do so. The result is
15 a richly connected network of computation, of which Guile forms a part.
16
17 By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for
18 protocol, but this phrase appears repeatedly in RFC 2616.} as handled by
19 servers, clients, proxies, caches, and the various kinds of messages and
20 message components that can be sent and received by that protocol,
21 notably HTML.
22
23 On one level, the web is text in motion: the protocols themselves are
24 textual (though the payload may be binary), and it's possible to create
25 a socket and speak text to the web. But such an approach is obviously
26 primitive. This section details the higher-level data types and
27 operations provided by Guile: URIs, HTTP request and response records,
28 and a conventional web server implementation.
29
30 The material in this section is arranged in ascending order, in which
31 later concepts build on previous ones. If you prefer to start with the
32 highest-level perspective, @pxref{Web Examples}, and work your way
33 back.
34
35 @menu
36 * Types and the Web:: Types prevent bugs and security problems.
37 * URIs:: Uniform Resource Identifiers.
38 * HTTP:: The Hyper-Text Transfer Protocol.
39 * HTTP Headers:: How Guile represents specific header values.
40 * Transfer Codings:: HTTP Transfer Codings.
41 * Requests:: HTTP requests.
42 * Responses:: HTTP responses.
43 * Web Client:: Accessing web resources over HTTP.
44 * Web Server:: Serving HTTP to the internet.
45 * Web Examples:: How to use this thing.
46 @end menu
47
48 @node Types and the Web
49 @subsection Types and the Web
50
51 It is a truth universally acknowledged, that a program with good use of
52 data types, will be free from many common bugs. Unfortunately, the
53 common practice in web programming seems to ignore this maxim. This
54 subsection makes the case for expressive data types in web programming.
55
56 By ``expressive data types'', we mean that the data types @emph{say}
57 something about how a program solves a problem. For example, if we
58 choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}),
59 this indicates that there is a part of the program that will always have
60 valid dates. Error handling for a number of basic cases, like invalid
61 dates, occurs on the boundary in which we produce a SRFI 19 date record
62 from other types, like strings.
63
64 With regard to the web, data types are helpful in the two broad phases
65 of HTTP messages: parsing and generation.
66
67 Consider a server, which has to parse a request, and produce a response.
68 Guile will parse the request into an HTTP request object
69 (@pxref{Requests}), with each header parsed into an appropriate Scheme
70 data type. This transition from an incoming stream of characters to
71 typed data is a state change in a program---the strings might parse, or
72 they might not, and something has to happen if they do not. (Guile
73 throws an error in this case.) But after you have the parsed request,
74 ``client'' code (code built on top of the Guile web framework) will not
75 have to check for syntactic validity. The types already make this
76 information manifest.
77
78 This state change on the parsing boundary makes programs more robust,
79 as they themselves are freed from the need to do a number of common
80 error checks, and they can use normal Scheme procedures to handle a
81 request instead of ad-hoc string parsers.
82
83 The need for types on the response generation side (in a server) is more
84 subtle, though not less important. Consider the example of a POST
85 handler, which prints out the text that a user submits from a form.
86 Such a handler might include a procedure like this:
87
88 @example
89 ;; First, a helper procedure
90 (define (para . contents)
91   (string-append "<p>" (string-concatenate contents) "</p>"))
92
93 ;; Now the meat of our simple web application
94 (define (you-said text)
95   (para "You said: " text))
96
97 (display (you-said "Hi!"))
98 @print{} <p>You said: Hi!</p>
99 @end example
100
101 This is a perfectly valid implementation, provided that the incoming
102 text does not contain the special HTML characters @samp{<}, @samp{>}, or
103 @samp{&}. But this provision of a restricted character set is not
104 reflected anywhere in the program itself: we must @emph{assume} that the
105 programmer understands this, and performs the check elsewhere.
106
107 Unfortunately, the short history of the practice of programming does not
108 bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS})
109 vulnerability is just such a common error in which unfiltered user input
110 is allowed into the output. A user could submit a crafted comment to
111 your web site that results in visitors running malicious JavaScript,
112 within the security context of your domain:
113
114 @example
115 (display (you-said "<script src=\"http://bad.com/nasty.js\" />"))
116 @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p>
117 @end example
118
119 The fundamental problem here is that both user data and the program
120 template are represented using strings. This identity means that types
121 can't help the programmer to make a distinction between these two, so
122 they get confused.
123
124 There are a number of possible solutions, but perhaps the best is to
125 treat HTML not as strings, but as native s-expressions: as SXML. The
126 basic idea is that HTML is either text, represented by a string, or an
127 element, represented as a tagged list. So @samp{foo} becomes
128 @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}.
129 Attributes, if present, go in a tagged list headed by @samp{@@}, like
130 @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml
131 simple}, for more information.
132
133 The good thing about SXML is that HTML elements cannot be confused with
134 text. Let's make a new definition of @code{para}:
135
136 @example
137 (define (para . contents)
138   `(p ,@@contents))
139
140 (use-modules (sxml simple))
141 (sxml->xml (you-said "Hi!"))
142 @print{} <p>You said: Hi!</p>
143
144 (sxml->xml (you-said "<i>Rats, foiled again!</i>"))
145 @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p>
146 @end example
147
148 So we see in the second example that HTML elements cannot be unwittingly
149 introduced into the output. However, it is now perfectly acceptable to
150 pass SXML to @code{you-said}; in fact, that is the big advantage of SXML
151 over everything-as-a-string.
152
153 @example
154 (sxml->xml (you-said (you-said "<Hi!>")))
155 @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p>
156 @end example
157
158 The SXML types allow procedures to @emph{compose}. The types make
159 manifest which parts are HTML elements, and which are text. So you
160 needn't worry about escaping user input; the type transition back to a
161 string handles that for you. @acronym{XSS} vulnerabilities are a thing
162 of the past.
163
164 Well. That's all very nice and opinionated and such, but how do I use
165 the thing? Read on!
166
167 @node URIs
168 @subsection Uniform Resource Identifiers
169
170 Guile provides a standard data type for Uniform Resource Identifiers
171 (URIs), as defined in RFC 3986.
172
173 The generic URI syntax is as follows:
174
175 @example
176 URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \
177        [ "?" query ] [ "#" fragment ]
178 @end example
179
180 For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the
181 scheme is @code{http}, the host is @code{www.gnu.org}, the path is
182 @code{/help/}, and there is no userinfo, port, query, or fragment. All
183 URIs have a scheme and a path (though the path might be empty). Some
184 URIs have a host, and some of those have ports and userinfo. Any URI
185 might have a query part or a fragment.
186
187 Userinfo is something of an abstraction, as some legacy URI schemes
188 allowed userinfo of the form @code{@var{username}:@var{passwd}}. But
189 since passwords do not belong in URIs, the RFC does not want to condone
190 this practice, so it calls anything before the @code{@@} sign
191 @dfn{userinfo}.
192
193 Properly speaking, a fragment is not part of a URI. For example, when a
194 web browser follows a link to @indicateurl{http://example.com/#foo}, it
195 sends a request for @indicateurl{http://example.com/}, then looks in the
196 resulting page for the part identified by the fragment @code{foo}. A
197 fragment identifies a part of a resource, not the resource itself. But
198 it is useful to have a fragment field in the URI record itself, so we
199 hope you will forgive the inconsistency.
200
201 @example
202 (use-modules (web uri))
203 @end example
204
205 The following procedures can be found in the @code{(web uri)}
206 module. Load it into your Guile, using a form like the above, to have
207 access to them.
208
209 @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @
210 [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @
211 [#:fragment=@code{#f}] [#:validate?=@code{#t}]
212 Construct a URI object. @var{scheme} should be a symbol, @var{port}
213 either a positive, exact integer or @code{#f}, and the rest of the
214 fields are either strings or @code{#f}. If @var{validate?} is true,
215 also run some consistency checks to make sure that the constructed URI
216 is valid.
217 @end deffn
218
219 @deffn {Scheme Procedure} uri? x
220 @deffnx {Scheme Procedure} uri-scheme uri
221 @deffnx {Scheme Procedure} uri-userinfo uri
222 @deffnx {Scheme Procedure} uri-host uri
223 @deffnx {Scheme Procedure} uri-port uri
224 @deffnx {Scheme Procedure} uri-path uri
225 @deffnx {Scheme Procedure} uri-query uri
226 @deffnx {Scheme Procedure} uri-fragment uri
227 A predicate and field accessors for the URI record type. The URI scheme
228 will be a symbol, the port either a positive, exact integer or @code{#f},
229 and the rest either strings or @code{#f} if not present.
230 @end deffn
231
232 @deffn {Scheme Procedure} string->uri string
233 Parse @var{string} into a URI object. Return @code{#f} if the string
234 could not be parsed.
235 @end deffn
236
237 @deffn {Scheme Procedure} uri->string uri
238 Serialize @var{uri} to a string. If the URI has a port that is the
239 default port for its scheme, the port is not included in the
240 serialization.
241 @end deffn
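
The constructor, accessors, parser, and serializer above can be combined
as in the following short sketch. It reuses the example URI from earlier
in this subsection and assumes nothing beyond the @code{(web uri)}
module.

@example
(use-modules (web uri))

;; Parse a string into a URI record and inspect its fields.
(define uri (string->uri "http://www.gnu.org/help/"))

(uri-scheme uri)
@result{} http
(uri-host uri)
@result{} "www.gnu.org"
(uri-path uri)
@result{} "/help/"
(uri-port uri)
@result{} #f

;; Build the same URI from its parts and serialize it again.
(uri->string (build-uri 'http #:host "www.gnu.org" #:path "/help/"))
@result{} "http://www.gnu.org/help/"
@end example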
242
243 @deffn {Scheme Procedure} declare-default-port! scheme port
244 Declare a default port for the given URI scheme.
245 @end deffn
246
247 @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}]
248 Percent-decode the given @var{str}, according to @var{encoding}, which
249 should be the name of a character encoding.
250
251 Note that this function should not generally be applied to a full URI
252 string. For paths, use @code{split-and-decode-uri-path} instead. For query
253 strings, split the query on @code{&} and @code{=} boundaries, and decode
254 the components separately.
255
256 Note also that percent-encoded strings encode @emph{bytes}, not
257 characters. There is no guarantee that a given byte sequence is a valid
258 string encoding. Therefore this routine may signal an error if the
259 decoded bytes are not valid for the given encoding. Pass @code{#f} for
260 @var{encoding} if you want decoded bytes as a bytevector directly.
261 @xref{Ports, @code{set-port-encoding!}}, for more information on
262 character encodings.
263
264 Returns a string of the decoded characters, or a bytevector if
265 @var{encoding} was @code{#f}.
266 @end deffn
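
As a minimal sketch of the advice above on query strings, the following
splits a query on @code{&} and @code{=} and decodes each component
separately. The helper name @code{decode-query} is hypothetical and is
not part of @code{(web uri)}. Representing a missing value as @code{#f}
is likewise only an assumption made for this illustration.

@example
(use-modules (web uri))

;; Hypothetical helper: split QUERY on "&", split each part on its
;; first "=", and percent-decode the key and the value separately.
(define (decode-query query)
  (map (lambda (part)
         (let ((idx (string-index part #\=)))
           (if idx
               (cons (uri-decode (substring part 0 idx))
                     (uri-decode (substring part (+ idx 1))))
               ;; No "=" in this part: a key without a value.
               (cons (uri-decode part) #f))))
       (string-split query #\&)))

(decode-query "q=scrambled%20eggs&lang=en")
@result{} (("q" . "scrambled eggs") ("lang" . "en"))
@end example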
267
268 @c FIXME: clarify return type, indicate default values, and document the
269 @c type of unescaped-chars.
270
271 @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars]
272 Percent-encode any character not in the character set,
273 @var{unescaped-chars}.
274
275 The default character set includes alphanumerics from ASCII, as well as
276 the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any
277 other character will be percent-encoded, by writing out the character to
278 a bytevector within the given @var{encoding}, then encoding each byte as
279 @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
280 the byte.
281 @end deffn
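
A few calls may make the default behaviour more concrete. This is only
an illustration using the default encoding and the default set of
unescaped characters. The input strings are made up for the example.

@example
(use-modules (web uri))

;; Characters outside the default set are written as %HH.
(uri-encode "biscuits&gravy")
@result{} "biscuits%26gravy"
(uri-encode "scrambled eggs")
@result{} "scrambled%20eggs"

;; ASCII alphanumerics and - . _ ~ pass through unchanged.
(uri-encode "guile-2.0_web.texi~")
@result{} "guile-2.0_web.texi~"

;; Encoding followed by decoding gives back the original string.
(uri-decode (uri-encode "50% off"))
@result{} "50% off"
@end example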
282
283 @deffn {Scheme Procedure} split-and-decode-uri-path path
284 Split @var{path} into its components, and decode each component,
285 removing empty components.
286
287 For example, @code{"/foo/bar%20baz/"} decodes to the two-element list,
288 @code{("foo" "bar baz")}.
289 @end deffn
290
291 @deffn {Scheme Procedure} encode-and-join-uri-path parts
292 URI-encode each element of @var{parts}, which should be a list of
293 strings, and join the parts together with @code{/} as a delimiter.
294
295 For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes
296 as @code{"scrambled%20eggs/biscuits%26gravy"}.
297 @end deffn
298
299 @node HTTP
300 @subsection The Hyper-Text Transfer Protocol
301
302 The initial motivation for including web functionality in Guile, rather
303 than relying on an external package, was to establish a standard base on
304 which people can share code. To that end, we continue the focus on data
305 types by providing a number of low-level parsers and unparsers for
306 elements of the HTTP protocol.
307
308 If you want to skip the low-level details for now and move on to web
309 pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the
310 HTTP module, and read on.
311
312 @example
313 (use-modules (web http))
314 @end example
315
316 The focus of the @code{(web http)} module is to parse and unparse
317 standard HTTP headers, representing them to Guile as native data
318 structures. For example, a @code{Date:} header will be represented as a
319 SRFI-19 date record (@pxref{SRFI-19}), rather than as a string.
320
321 Guile tries to follow RFCs fairly strictly---the road to perdition being
322 paved with compatibility hacks---though some allowances are made for
323 not-too-divergent texts.
324
325 Header names are represented as lower-case symbols.
326
327 @deffn {Scheme Procedure} string->header name
328 Parse @var{name} to a symbolic header name.
329 @end deffn
330
331 @deffn {Scheme Procedure} header->string sym
332 Return the string form for the header named @var{sym}.
333 @end deffn
334
335 For example:
336
337 @example
338 (string->header "Content-Length")
339 @result{} content-length
340 (header->string 'content-length)
341 @result{} "Content-Length"
342
343 (string->header "FOO")
344 @result{} foo
345 (header->string 'foo)
346 @result{} "Foo"
347 @end example
348
349 Guile keeps a registry of known headers, their string names, and some
350 parsing and serialization procedures. If a header is unknown, its
351 string name is simply its symbol name in title-case.
352
353 @deffn {Scheme Procedure} known-header? sym
354 Return @code{#t} iff @var{sym} is a known header, with associated
355 parsers and serialization procedures.
356 @end deffn
357
358 @deffn {Scheme Procedure} header-parser sym
359 Return the value parser for headers named @var{sym}. The result is a
360 procedure that takes one argument, a string, and returns the parsed
361 value. If the header isn't known to Guile, a default parser is returned
362 that passes through the string unchanged.
363 @end deffn
364
365 @deffn {Scheme Procedure} header-validator sym
366 Return a predicate which returns @code{#t} if the given value is valid
367 for headers named @var{sym}. The default validator for unknown headers
368 is @code{string?}.
369 @end deffn
370
371 @deffn {Scheme Procedure} header-writer sym
372 Return a procedure that writes values for headers named @var{sym} to a
373 port. The resulting procedure takes two arguments: a value and a port.
374 The default writer is @code{display}.
375 @end deffn
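
The following sketch shows how the registry procedures fit together for
one known header and one unknown header. The name @code{x-made-up} is
hypothetical and only stands in for a header that Guile does not know
about.

@example
(use-modules (web http))

;; A known header has a dedicated parser, validator, and writer.
(known-header? 'content-length)
@result{} #t
((header-parser 'content-length) "300")
@result{} 300
((header-validator 'content-length) 300)
@result{} #t
((header-validator 'content-length) "300")
@result{} #f
((header-writer 'content-length) 300 (current-output-port))
@print{} 300

;; An unknown header falls back to the defaults described above.
(known-header? 'x-made-up)
@result{} #f
((header-parser 'x-made-up) "anything")
@result{} "anything"
((header-validator 'x-made-up) "anything")
@result{} #t
@end example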
376
377 For more on the set of headers that Guile knows about out of the box,
378 @pxref{HTTP Headers}. To add your own, use the @code{declare-header!}
379 procedure:
380
381 @deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}]
382 Declare a parser, validator, and writer for a given header.
383 @end deffn
384
385 For example, let's say you are running a web server behind some sort of
386 proxy, and your proxy adds an @code{X-Client-Address} header, indicating
387 the IPv4 address of the original client. You would like for the HTTP
388 request record to parse out this header to a Scheme value, instead of
389 leaving it as a string. You could register this header with Guile's
390 HTTP stack like this:
391
392 @example
393 (declare-header! "X-Client-Address"
394   (lambda (str)
395     (inet-aton str))
396   (lambda (ip)
397     (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
398   (lambda (ip port)
399     (display (inet-ntoa ip) port)))
400 @end example
401
402 @deffn {Scheme Procedure} declare-opaque-header! name
403 A specialised version of @code{declare-header!} for the case in which
404 you want a header's value to be returned/written ``as-is''.
405 @end deffn
406
407 @deffn {Scheme Procedure} valid-header? sym val
408 Return a true value iff @var{val} is a valid Scheme value for the header
409 with name @var{sym}.
410 @end deffn
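
Once a header has been declared, the generic procedures of
@code{(web http)} pick up its parser, validator, and writer. The sketch
below repeats the @code{X-Client-Address} registration in compact form so
that it can be evaluated on its own. The address @code{127.0.0.1} is just
sample input.

@example
(use-modules (web http))

;; Same registration as above, with the parser passed directly.
(declare-header! "X-Client-Address"
  inet-aton
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))

;; The string is now parsed to an integer instead of being kept as-is.
(parse-header 'x-client-address "127.0.0.1")
@result{} 2130706433

;; The validator accepts the parsed form and rejects the raw string.
(valid-header? 'x-client-address 2130706433)
@result{} #t
(valid-header? 'x-client-address "127.0.0.1")
@result{} #f
@end example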
411
412 Now that we have a generic interface for reading and writing headers, we
413 do just that.
414
415 @deffn {Scheme Procedure} read-header port
416 Read one HTTP header from @var{port}. Return two values: the header
417 name and the parsed Scheme value. May raise an exception if the header
418 was known but the value was invalid.
419
420 Returns the end-of-file object for both values if the end of the message
421 headers was reached (i.e., a blank line).
422 @end deffn
423
424 @deffn {Scheme Procedure} parse-header name val
425 Parse @var{val}, a string, with the parser for the header named
426 @var{name}. Returns the parsed value.
427 @end deffn
428
429 @deffn {Scheme Procedure} write-header name val port
430 Write the given header name and value to @var{port}, using the writer
431 from @code{header-writer}.
432 @end deffn
433
434 @deffn {Scheme Procedure} read-headers port
435 Read the headers of an HTTP message from @var{port}, returning them
436 as an ordered alist.
437 @end deffn
438
439 @deffn {Scheme Procedure} write-headers headers port
440 Write the given header alist to @var{port}. Doesn't write the final
441 @samp{\r\n}, as the user might want to add another header.
442 @end deffn
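
As an illustration, headers can be read from and written to string
ports. This is a sketch under the assumption that a string port is an
acceptable stand-in for a socket. The header values are made up.

@example
(use-modules (web http))

;; Read a header block, terminated by a blank line, into an alist.
(define headers
  (call-with-input-string
      "Content-Length: 300\r\nConnection: close\r\n\r\n"
    read-headers))

(assq-ref headers 'content-length)
@result{} 300
(assq-ref headers 'connection)
@result{} (close)

;; Write an alist of headers; the final blank line is left to the caller.
(call-with-output-string
  (lambda (port)
    (write-headers '((content-length . 300)) port)))
@result{} "Content-Length: 300\r\n"
@end example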
443
444 The @code{(web http)} module also has some utility procedures to read
445 and write request and response lines.
446
447 @deffn {Scheme Procedure} parse-http-method str [start] [end]
448 Parse an HTTP method from @var{str}. The result is an upper-case symbol,
449 like @code{GET}.
450 @end deffn
451
452 @deffn {Scheme Procedure} parse-http-version str [start] [end]
453 Parse an HTTP version from @var{str}, returning it as a major-minor
454 pair. For example, @code{HTTP/1.1} parses as the pair of integers,
455 @code{(1 . 1)}.
456 @end deffn
457
458 @deffn {Scheme Procedure} parse-request-uri str [start] [end]
459 Parse a URI from an HTTP request line. Note that URIs in requests do not
460 have to have a scheme or host name. The result is a URI object.
461 @end deffn
462
463 @deffn {Scheme Procedure} read-request-line port
464 Read the first line of an HTTP request from @var{port}, returning three
465 values: the method, the URI, and the version.
466 @end deffn
467
468 @deffn {Scheme Procedure} write-request-line method uri version port
469 Write the first line of an HTTP request to @var{port}.
470 @end deffn
471
472 @deffn {Scheme Procedure} read-response-line port
473 Read the first line of an HTTP response from @var{port}, returning three
474 values: the HTTP version, the response code, and the ``reason phrase''.
475 @end deffn
476
477 @deffn {Scheme Procedure} write-response-line version code reason-phrase port
478 Write the first line of an HTTP response to @var{port}.
479 @end deffn
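
The following sketch exercises these procedures on string ports. Since
@code{read-request-line} and @code{read-response-line} return three
values, the example collects them into a list with
@code{call-with-values}. The request and response lines are sample
input, not taken from a real exchange.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")
@result{} GET
(parse-http-version "HTTP/1.1")
@result{} (1 . 1)

;; Read a request line: method, URI, and version.
(call-with-values
    (lambda ()
      (read-request-line
       (open-input-string "GET /help/ HTTP/1.1\r\n")))
  (lambda (method uri version)
    (list method (uri-path uri) version)))
@result{} (GET "/help/" (1 . 1))

;; Read a response line: version, code, and reason phrase.
(call-with-values
    (lambda ()
      (read-response-line
       (open-input-string "HTTP/1.1 200 OK\r\n")))
  (lambda (version code reason-phrase)
    (list version code reason-phrase)))
@result{} ((1 . 1) 200 "OK")
@end example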
480
481
482 @node HTTP Headers
483 @subsection HTTP Headers
484
485 In addition to defining the infrastructure to parse headers, the
486 @code{(web http)} module defines specific parsers and unparsers for all
487 headers defined in the HTTP/1.1 standard.
488
489 For example, if you receive a header named @samp{Accept-Language} with a
490 value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined
491 below):
492
493 @example
494 (parse-header 'accept-language "en, es;q=0.8")
495 @result{} ((1000 . "en") (800 . "es"))
496 @end example
497
498 The format of the value for @samp{Accept-Language} headers is defined
499 below, along with all other headers defined in the HTTP standard. (If
500 the header were unknown, the value would have been returned as a
501 string.)
502
503 For brevity, the header definitions below are given in the form,
504 @var{Type} @code{@var{name}}, indicating that values for the header
505 @code{@var{name}} will be of the given @var{Type}. Since Guile
506 internally treats header names in lower case, in this document we give
507 types title-cased names. A short description of each header's
508 purpose and an example follow.
509
510 For full details on the meanings of all of these headers, see the HTTP
511 1.1 standard, RFC 2616.
512
513 @subsubsection HTTP Header Types
514
515 Here we define the types that are used below, when defining headers.
516
517 @deftp {HTTP Header Type} Date
518 A SRFI-19 date.
519 @end deftp
520
521 @deftp {HTTP Header Type} KVList
522 A list whose elements are keys or key-value pairs. Keys are parsed to
523 symbols. Values are strings by default. Non-string values are the
524 exception, and are mentioned explicitly below, as appropriate.
525 @end deftp
526
527 @deftp {HTTP Header Type} SList
528 A list of strings.
529 @end deftp
530
531 @deftp {HTTP Header Type} Quality
532 An exact integer between 0 and 1000. Qualities are used to express
533 preference, given multiple options. An option with a quality of 870,
534 for example, is preferred over an option with quality 500.
535
536 (Qualities are written out over the wire as numbers between 0.0 and
537 1.0, but since the standard only allows three digits after the decimal,
538 it's equivalent to integers between 0 and 1000, so that's what Guile
539 uses.)
540 @end deftp
541
542 @deftp {HTTP Header Type} QList
543 A quality list: a list of pairs, the car of which is a quality, and the
544 cdr a string. Used to express a list of options, along with their
545 qualities.
546 @end deftp
547
548 @deftp {HTTP Header Type} ETag
549 An entity tag, represented as a pair. The car of the pair is an opaque
550 string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity
551 tag, and @code{#f} otherwise.
552 @end deftp
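
To make two of these types concrete, here is a small sketch that works
with a QList and with ETag values. The helper
@code{preferred-option} is hypothetical and not part of
@code{(web http)}. The example also assumes that the standard
@code{etag} header is parsed to the ETag type described above.

@example
(use-modules (web http))

;; A QList pairs each option (the cdr) with its Quality (the car).
(define langs (parse-header 'accept-language "en, es;q=0.8"))
langs
@result{} ((1000 . "en") (800 . "es"))

;; Hypothetical helper: return the option with the highest quality,
;; or #f for an empty list.
(define (preferred-option qlist)
  (and (pair? qlist)
       (cdr (car (sort qlist
                       (lambda (a b) (> (car a) (car b))))))))

(preferred-option langs)
@result{} "en"

;; An ETag is a pair of an opaque string and a ``strong'' flag.
(parse-header 'etag "\"xyzzy\"")
@result{} ("xyzzy" . #t)
(parse-header 'etag "W/\"xyzzy\"")
@result{} ("xyzzy" . #f)
@end example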
553
554 @subsubsection General Headers
555
556 General HTTP headers may be present in any HTTP message.
557
558 @deftypevr {HTTP Header} KVList cache-control
559 A key-value list of cache-control directives. See RFC 2616 for more
560 details.
561
562 If present, parameters to @code{max-age}, @code{max-stale},
563 @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative
564 integers.
565
566 If present, parameters to @code{private} and @code{no-cache} are parsed
567 as lists of header names, as symbols.
568
569 @example
570 (parse-header 'cache-control "no-cache,no-store")
571 @result{} (no-cache no-store)
572 (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store")
573 @result{} ((no-cache . (authorization date)) no-store)
574 (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10")
575 @result{} ((no-cache . (authorization date)) (max-age . 10))
576 @end example
577 @end deftypevr
578
579 @deftypevr {HTTP Header} List connection
580 A list of header names that apply only to this HTTP connection, as
581 symbols. Additionally, the symbol @samp{close} may be present, to
582 indicate that the server should close the connection after responding to
583 the request.
584 @example
585 (parse-header 'connection "close")
586 @result{} (close)
587 @end example
588 @end deftypevr
589
590 @deftypevr {HTTP Header} Date date
591 The date that a given HTTP message was originated.
592 @example
593 (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT")
594 @result{} #<date ...>
595 @end example
596 @end deftypevr
597
598 @deftypevr {HTTP Header} KVList pragma
599 A key-value list of implementation-specific directives.
600 @example
601 (parse-header 'pragma "no-cache, broccoli=tasty")
602 @result{} (no-cache (broccoli . "tasty"))
603 @end example
604 @end deftypevr
605
606 @deftypevr {HTTP Header} List trailer
607 A list of header names which will appear after the message body, instead
608 of with the message headers.
609 @example
610 (parse-header 'trailer "ETag")
611 @result{} (etag)
612 @end example
613 @end deftypevr
614
615 @deftypevr {HTTP Header} List transfer-encoding
616 A list of transfer codings, expressed as key-value lists. The only
617 transfer coding defined by the specification is @code{chunked}.
618 @example
619 (parse-header 'transfer-encoding "chunked")
620 @result{} ((chunked))
621 @end example
622 @end deftypevr
623
624 @deftypevr {HTTP Header} List upgrade
625 A list of strings, indicating additional protocols that a server could use
626 in response to a request.
627 @example
628 (parse-header 'upgrade "WebSocket")
629 @result{} ("WebSocket")
630 @end example
631 @end deftypevr
632
633 @c FIXME: parse out more fully?
634 @deftypevr {HTTP Header} List via
635 A list of strings, indicating the protocol versions and hosts of
636 intermediate servers and proxies. There may be multiple @code{via}
637 headers in one message.
638 @example
639 (parse-header 'via "1.0 venus, 1.1 mars")
640 @result{} ("1.0 venus" "1.1 mars")
641 @end example
642 @end deftypevr
643
644 @deftypevr {HTTP Header} List warning
645 A list of warnings given by a server or intermediate proxy. Each
646 warning is itself a list of four elements: a code, as an exact integer
647 between 0 and 1000, a host as a string, the warning text as a string,
648 and either @code{#f} or a SRFI-19 date.
649
650 There may be multiple @code{warning} headers in one message.
651 @example
652 (parse-header 'warning "123 foo \"core breach imminent\"")
653 @result{} ((123 "foo" "core breach imminent" #f))
654 @end example
655 @end deftypevr
656
657
658 @subsubsection Entity Headers
659
660 Entity headers may be present in any HTTP message, and refer to the
661 resource referenced in the HTTP request or response.
662
663 @deftypevr {HTTP Header} List allow
664 A list of allowed methods on a given resource, as symbols.
665 @example
666 (parse-header 'allow "GET, HEAD")
667 @result{} (GET HEAD)
668 @end example
669 @end deftypevr
670
671 @deftypevr {HTTP Header} List content-encoding
672 A list of content codings, as symbols.
673 @example
674 (parse-header 'content-encoding "gzip")
675 @result{} (gzip)
676 @end example
677 @end deftypevr
678
679 @deftypevr {HTTP Header} List content-language
680 The languages that a resource is in, as strings.
681 @example
682 (parse-header 'content-language "en")
683 @result{} ("en")
684 @end example
685 @end deftypevr
686
687 @deftypevr {HTTP Header} UInt content-length
688 The number of bytes in a resource, as an exact, non-negative integer.
689 @example
690 (parse-header 'content-length "300")
691 @result{} 300
692 @end example
693 @end deftypevr
694
695 @deftypevr {HTTP Header} URI content-location
696 The canonical URI for a resource, in the case that it is also accessible
697 from a different URI.
698 @example
699 (parse-header 'content-location "http://example.com/foo")
700 @result{} #<<uri> ...>
701 @end example
702 @end deftypevr
703
704 @deftypevr {HTTP Header} String content-md5
705 The MD5 digest of a resource.
706 @example
707 (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5")
708 @result{} "ffaea1a79810785575e29e2bd45e2fa5"
709 @end example
710 @end deftypevr
711
712 @deftypevr {HTTP Header} List content-range
713 A range specification, as a list of three elements: the symbol
714 @code{bytes}, either the symbol @code{*} or a pair of integers,
715 indicating the byte range, and either @code{*} or an integer, for the
716 instance length. Used to indicate that a response only includes part of
717 a resource.
718 @example
719 (parse-header 'content-range "bytes 10-20/*")
720 @result{} (bytes (10 . 20) *)
721 @end example
722 @end deftypevr
723
724 @deftypevr {HTTP Header} List content-type
725 The MIME type of a resource, as a symbol, along with any parameters.
726 @example
727 (parse-header 'content-type "text/plain")
728 @result{} (text/plain)
729 (parse-header 'content-type "text/plain;charset=utf-8")
730 @result{} (text/plain (charset . "utf-8"))
731 @end example
732 Note that the @code{charset} parameter is something of a misnomer, and
733 the HTTP specification admits this. It specifies the @emph{encoding} of
734 the characters, not the character set.
735 @end deftypevr
736
737 @deftypevr {HTTP Header} Date expires
738 The date/time after which the resource given in a response is considered
739 stale.
740 @example
741 (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT")
742 @result{} #<date ...>
743 @end example
744 @end deftypevr
745
746 @deftypevr {HTTP Header} Date last-modified
747 The date/time on which the resource given in a response was last
748 modified.
749 @example
750 (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT")
751 @result{} #<date ...>
752 @end example
753 @end deftypevr
754
755
756 @subsubsection Request Headers
757
758 Request headers may only appear in an HTTP request, not in a response.
759
760 @deftypevr {HTTP Header} List accept
761 A list of preferred media types for a response. Each element of the
762 list is itself a list, in the same format as @code{content-type}.
763 @example
764 (parse-header 'accept "text/html,text/plain;charset=utf-8")
765 @result{} ((text/html) (text/plain (charset . "utf-8")))
766 @end example
767 Preference is expressed with quality values:
768 @example
769 (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6")
770 @result{} ((text/html (q . 800)) (text/plain (q . 600)))
771 @end example
772 @end deftypevr
773
774 @deftypevr {HTTP Header} QList accept-charset
775 A quality list of acceptable charsets. Note again that what HTTP calls
776 a ``charset'' is what Guile calls a ``character encoding''.
777 @example
778 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8")
779 @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1"))
780 @end example
781 @end deftypevr
782
783 @deftypevr {HTTP Header} QList accept-encoding
784 A quality list of acceptable content codings.
785 @example
786 (parse-header 'accept-encoding "gzip,identity;q=0.8")
787 @result{} ((1000 . "gzip") (800 . "identity"))
788 @end example
789 @end deftypevr
790
791 @deftypevr {HTTP Header} QList accept-language
792 A quality list of acceptable languages.
793 @example
794 (parse-header 'accept-language "cn,en;q=0.75")
795 @result{} ((1000 . "cn") (750 . "en"))
796 @end example
797 @end deftypevr
798
799 @deftypevr {HTTP Header} Pair authorization
800 Authorization credentials. The car of the pair indicates the
801 authentication scheme, like @code{basic}. For basic authentication, the
802 cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}}
803 string. For other authentication schemes, like @code{digest}, the cdr
804 will be a key-value list of credentials.
805 @example
806 (parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
807 @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==")
808 @end example
809 @end deftypevr
810
811 @deftypevr {HTTP Header} List expect
812 A list of expectations that a client has of a server. The expectations
813 are key-value lists.
814 @example
815 (parse-header 'expect "100-continue")
816 @result{} ((100-continue))
817 @end example
818 @end deftypevr
819
820 @deftypevr {HTTP Header} String from
821 The email address of a user making an HTTP request.
822 @example
823 (parse-header 'from "bob@@example.com")
824 @result{} "bob@@example.com"
825 @end example
826 @end deftypevr
827
828 @deftypevr {HTTP Header} Pair host
829 The host for the resource being requested, as a hostname-port pair. If
830 no port is given, the port is @code{#f}.
831 @example
832 (parse-header 'host "gnu.org:80")
833 @result{} ("gnu.org" . 80)
834 (parse-header 'host "gnu.org")
835 @result{} ("gnu.org" . #f)
836 @end example
837 @end deftypevr
838
839 @deftypevr {HTTP Header} *|List if-match
840 A set of etags, indicating that the request should proceed if and only
841 if the etag of the resource is in that set. Either the symbol @code{*},
842 indicating any etag, or a list of entity tags.
843 @example
844 (parse-header 'if-match "*")
845 @result{} *
846 (parse-header 'if-match "\"asdfadf\"")
847 @result{} (("asdfadf" . #t))
848 (parse-header 'if-match "W/\"asdfadf\"")
849 @result{} (("asdfadf" . #f))
850 @end example
851 @end deftypevr
852
853 @deftypevr {HTTP Header} Date if-modified-since
854 Indicates that a response should proceed if and only if the resource has
855 been modified since the given date.
856 @example
857 (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT")
858 @result{} #<date ...>
859 @end example
860 @end deftypevr
861
862 @deftypevr {HTTP Header} *|List if-none-match
863 A set of etags, indicating that the request should proceed if and only
864 if the etag of the resource is not in the set. Either the symbol
865 @code{*}, indicating any etag, or a list of entity tags.
866 @example
867 (parse-header 'if-none-match "*")
868 @result{} *
869 @end example
870 @end deftypevr
871
872 @deftypevr {HTTP Header} ETag|Date if-range
873 Indicates that the range request should proceed if and only if the
874 resource matches a modification date or an etag. Either an entity tag,
875 or a SRFI-19 date.
876 @example
877 (parse-header 'if-range "\"original-etag\"")
878 @result{} ("original-etag" . #t)
879 @end example
880 @end deftypevr
881
882 @deftypevr {HTTP Header} Date if-unmodified-since
883 Indicates that a response should proceed if and only if the resource has
884 not been modified since the given date.
885 @example
886 (parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT")
887 @result{} #<date ...>
888 @end example
889 @end deftypevr
890
891 @deftypevr {HTTP Header} UInt max-forwards
892 The maximum number of proxy or gateway hops that a request should be
893 subject to.
894 @example
895 (parse-header 'max-forwards "10")
896 @result{} 10
897 @end example
898 @end deftypevr
899
900 @deftypevr {HTTP Header} Pair proxy-authorization
901 Authorization credentials for a proxy connection. See the documentation
902 for @code{authorization} above for more information on the format.
903 @example
904 (parse-header 'proxy-authorization "Digest foo=bar,baz=qux")
905 @result{} (digest (foo . "bar") (baz . "qux"))
906 @end example
907 @end deftypevr
908
909 @deftypevr {HTTP Header} Pair range
910 A range request, indicating that the client wants only part of a
911 resource. The car of the pair is the symbol @code{bytes}, and the cdr
912 is a list of pairs. Each element of the cdr indicates a range; the car
913 is the first byte position and the cdr is the last byte position, as
914 integers, or @code{#f} if not given.
915 @example
916 (parse-header 'range "bytes=10-30,50-")
917 @result{} (bytes (10 . 30) (50 . #f))
918 @end example
919 @end deftypevr
920
921 @deftypevr {HTTP Header} URI referer
922 The URI of the resource that referred the user to this resource. The
923 name of the header is a misspelling, but we are stuck with it.
924 @example
925 (parse-header 'referer "http://www.gnu.org/")
926 @result{} #<uri ...>
927 @end example
928 @end deftypevr
929
930 @deftypevr {HTTP Header} List te
931 A list of transfer codings, expressed as key-value lists. A common
932 transfer coding is @code{trailers}.
933 @example
934 (parse-header 'te "trailers")
935 @result{} ((trailers))
936 @end example
937 @end deftypevr
938
939 @deftypevr {HTTP Header} String user-agent
940 A string indicating the user agent making the request. The
941 specification defines a structured format for this header, but it is
942 widely disregarded, so Guile does not attempt to parse strictly.
943 @example
944 (parse-header 'user-agent "Mozilla/5.0")
945 @result{} "Mozilla/5.0"
946 @end example
947 @end deftypevr
948
949
950 @subsubsection Response Headers
951
952 @deftypevr {HTTP Header} List accept-ranges
953 A list of range units that the server supports, as symbols.
954 @example
955 (parse-header 'accept-ranges "bytes")
956 @result{} (bytes)
957 @end example
958 @end deftypevr
959
960 @deftypevr {HTTP Header} UInt age
961 The age of a cached response, in seconds.
962 @example
963 (parse-header 'age "3600")
964 @result{} 3600
965 @end example
966 @end deftypevr
967
968 @deftypevr {HTTP Header} ETag etag
969 The entity-tag of the resource.
970 @example
971 (parse-header 'etag "\"foo\"")
972 @result{} ("foo" . #t)
973 @end example
974 @end deftypevr
975
976 @deftypevr {HTTP Header} URI location
977 A URI on which a request may be completed. Used in combination with a
978 redirecting status code to perform client-side redirection.
979 @example
980 (parse-header 'location "http://example.com/other")
981 @result{} #<uri ...>
982 @end example
983 @end deftypevr
984
985 @deftypevr {HTTP Header} List proxy-authenticate
986 A list of challenges to a proxy, indicating the need for authentication.
987 @example
988 (parse-header 'proxy-authenticate "Basic realm=\"foo\"")
989 @result{} ((basic (realm . "foo")))
990 @end example
991 @end deftypevr
992
993 @deftypevr {HTTP Header} UInt|Date retry-after
994 Used in combination with a server-busy status code, like 503, to
995 indicate that a client should retry later. Either a number of seconds,
996 or a date.
997 @example
998 (parse-header 'retry-after "60")
999 @result{} 60
1000 @end example
1001 @end deftypevr
1002
1003 @deftypevr {HTTP Header} String server
1004 A string identifying the server.
1005 @example
1006 (parse-header 'server "My first web server")
1007 @result{} "My first web server"
1008 @end example
1009 @end deftypevr
1010
1011 @deftypevr {HTTP Header} *|List vary
1012 A set of request headers that were used in computing this response.
1013 Used to indicate that server-side content negotiation was performed, for
1014 example in response to the @code{accept-language} header. Can also be
1015 the symbol @code{*}, indicating that all headers were considered.
1016 @example
1017 (parse-header 'vary "Accept-Language, Accept")
1018 @result{} (accept-language accept)
1019 @end example
1020 @end deftypevr
1021
1022 @deftypevr {HTTP Header} List www-authenticate
1023 A list of challenges to a user, indicating the need for authentication.
1024 @example
1025 (parse-header 'www-authenticate "Basic realm=\"foo\"")
1026 @result{} ((basic (realm . "foo")))
1027 @end example
1028 @end deftypevr
1029
1030 @node Transfer Codings
1031 @subsection Transfer Codings
1032
1033 HTTP 1.1 allows for various transfer codings to be applied to message
1034 bodies. These include various types of compression, and HTTP chunked
1035 encoding. Currently, only chunked encoding is supported by Guile.
1036
1037 Chunked coding is an optional coding that may be applied to message
1038 bodies, to allow messages whose length is not known beforehand to be
1039 returned. Such messages can be split into chunks, terminated by a final
1040 zero length chunk.
1041
1042 In order to make dealing with encodings simpler, Guile provides
1043 procedures to create ports that ``wrap'' existing ports, applying
1044 transformations transparently under the hood.
1045
1046 These procedures are in the @code{(web http)} module.
1047
1048 @example
1049 (use-modules (web http))
1050 @end example
1051
1052 @deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f]
1053 Returns a new port, that transparently reads and decodes chunk-encoded
1054 data from @var{port}. If no more chunk-encoded data is available, it
1055 returns the end-of-file object. When the port is closed, @var{port} will
1056 also be closed, unless @var{keep-alive?} is true.
1057 @end deffn
1058
1059 @example
1060 (use-modules (ice-9 rdelim))
1061
1062 (define s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n")
1063 (define p (make-chunked-input-port (open-input-string s)))
1064 (read-line p)
1065 @result{} "First line"
1066 (read-line p)
1067 @result{} " Second line"
1068 @end example
1069
1070 @deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f]
1071 Returns a new port, which transparently encodes data as chunk-encoded
1072 before writing it to @var{port}. Whenever a write occurs on this port,
1073 it buffers it, until the port is flushed, at which point it writes a
1074 chunk containing all the data written so far. When the port is closed,
1075 the data remaining is written to @var{port}, as is the terminating zero
1076 chunk. It also causes @var{port} to be closed, unless @var{keep-alive?}
1077 is true.
1078
1079 Note that forcing a chunked output port when no data is buffered
1080 does not write a zero chunk, as this would cause the data to be
1081 interpreted incorrectly by the client.
1082 @end deffn
1083
1084 @example
1085 (call-with-output-string
1086 (lambda (out)
1087 (define out* (make-chunked-output-port out #:keep-alive? #t))
1088 (display "first chunk" out*)
1089 (force-output out*)
1090 (force-output out*) ; note this does not write a zero chunk
1091 (display "second chunk" out*)
1092 (close-port out*)))
1093 @result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n"
1094 @end example
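
The two port wrappers can be combined into a round trip.  The following
is a minimal sketch, assuming that @code{get-string-all} from
@code{(rnrs io ports)} is available and that the payload is plain ASCII;
the names @code{encoded} and @code{decoded} are only illustrative.

@example
(use-modules (web http)
             ((rnrs io ports) #:select (get-string-all)))

;; Encode two writes as two chunks, followed by the terminating chunk.
(define encoded
  (call-with-output-string
    (lambda (out)
      (let ((chunked (make-chunked-output-port out #:keep-alive? #t)))
        (display "hello, " chunked)
        (force-output chunked)       ; ends the first chunk
        (display "world" chunked)
        (close-port chunked)))))     ; writes the rest and the zero chunk

;; Decode the chunked text again through an input wrapper.
(define decoded
  (get-string-all
    (make-chunked-input-port (open-input-string encoded))))

;; Expected to be true if the round trip preserved the text.
(string=? decoded "hello, world")
@end example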
1095
1096 @node Requests
1097 @subsection HTTP Requests
1098
1099 @example
1100 (use-modules (web request))
1101 @end example
1102
1103 The request module contains a data type for HTTP requests.
1104
1105 @subsubsection An Important Note on Character Sets
1106
1107 HTTP requests consist of two parts: the request proper, consisting of a
1108 request line and a set of headers, and (optionally) a body. The body
1109 might have a binary content-type, and even in the textual case its
1110 length is specified in bytes, not characters.
1111
1112 Therefore, HTTP is a fundamentally binary protocol. However the request
1113 line and headers are specified to be in a subset of ASCII, so they can
1114 be treated as text, provided that the port's encoding is set to an
1115 ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1)
1116 is just such an encoding, and happens to be very efficient for Guile.
1117
1118 So what Guile does when reading requests from the wire, or writing them
1119 out, is to set the port's encoding to latin-1 and to treat the request
1120 headers as text.
1121
1122 The request body is another issue. For binary data, the data is
1123 probably in a bytevector, so we use the R6RS binary output procedures to
1124 write out the binary payload. Textual data usually has to be written
1125 out to some character encoding, usually UTF-8, and then the resulting
1126 bytevector is written out to the port.
1127
1128 In summary, Guile reads and writes HTTP over latin-1 sockets, without
1129 any loss of generality.
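
As a small illustration of why body lengths are counted in bytes, the
sketch below encodes a short string with @code{string->utf8} from
@code{(rnrs bytevectors)}.  The variable names @code{text} and
@code{body} are hypothetical.

@example
(use-modules (rnrs bytevectors))

;; A two-character string: GREEK SMALL LETTER LAMBDA followed by "x".
(define text "\u03bbx")

;; Encode the text to bytes before writing it out as a body.
(define body (string->utf8 text))

(string-length text)       ; 2 characters
(bytevector-length body)   ; 3 bytes: the lambda takes two bytes in UTF-8
@end example

A @code{content-length} header for this body would therefore be 3, not
2.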
1130
1131 @subsubsection Request API
1132
1133 @deffn {Scheme Procedure} request?
1134 @deffnx {Scheme Procedure} request-method
1135 @deffnx {Scheme Procedure} request-uri
1136 @deffnx {Scheme Procedure} request-version
1137 @deffnx {Scheme Procedure} request-headers
1138 @deffnx {Scheme Procedure} request-meta
1139 @deffnx {Scheme Procedure} request-port
1140 A predicate and field accessors for the request type. The fields are as
1141 follows:
1142 @table @code
1143 @item method
1144 The HTTP method, for example, @code{GET}.
1145 @item uri
1146 The URI as a URI record.
1147 @item version
1148 The HTTP version pair, like @code{(1 . 1)}.
1149 @item headers
1150 The request headers, as an alist of parsed values.
1151 @item meta
1152 An arbitrary alist of other data, for example information returned in
1153 the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and
1154 Communication}).
1155 @item port
1156 The port on which to read or write a request body, if any.
1157 @end table
1158 @end deffn
1159
1160 @deffn {Scheme Procedure} read-request port [meta='()]
1161 Read an HTTP request from @var{port}, optionally attaching the given
1162 metadata, @var{meta}.
1163
1164 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1165 (latin-1), so that reading one character reads one byte. See the
1166 discussion of character sets above, for more information.
1167
1168 Note that the body is not part of the request. Once you have read a
1169 request, you may read the body separately, and likewise for writing
1170 requests.
1171 @end deffn
1172
1173 @deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t]
1174 Construct an HTTP request object. If @var{validate-headers?} is true,
1175 the headers are each run through their respective validators.
1176 @end deffn
1177
1178 @deffn {Scheme Procedure} write-request r port
1179 Write the given HTTP request to @var{port}.
1180
1181 Return a new request, whose @code{request-port} will continue writing
1182 on @var{port}, perhaps using some transfer encoding.
1183 @end deffn
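
The following sketch builds a request by hand and serializes its request
line and headers into a string.  The URI, the @code{user-agent} value and
the name @code{req} are made up for illustration; the @code{host} header
is given explicitly, in the hostname-port pair format described in
@ref{HTTP Headers}.

@example
(use-modules (web request) (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((host . ("www.gnu.org" . #f))
                             (user-agent . "example-client/0.1"))))

(request-method req)    ; the symbol GET, the default method
(request-version req)   ; (1 . 1), the default version

;; Write the request line and headers to a string port.  No body is
;; written; that is a separate step.
(define serialized
  (call-with-output-string
    (lambda (port)
      (write-request req port))))
@end example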
1184
1185 @deffn {Scheme Procedure} read-request-body r
1186 Reads the request body from @var{r}, as a bytevector. Return @code{#f}
1187 if there was no request body.
1188 @end deffn
1189
1190 @deffn {Scheme Procedure} write-request-body r bv
1191 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1192 request @var{r}.
1193 @end deffn
1194
1195 The various headers that are typically associated with HTTP requests may
1196 be accessed with these dedicated accessors. @xref{HTTP Headers}, for
1197 more information on the format of parsed headers.
1198
1199 @deffn {Scheme Procedure} request-accept request [default='()]
1200 @deffnx {Scheme Procedure} request-accept-charset request [default='()]
1201 @deffnx {Scheme Procedure} request-accept-encoding request [default='()]
1202 @deffnx {Scheme Procedure} request-accept-language request [default='()]
1203 @deffnx {Scheme Procedure} request-allow request [default='()]
1204 @deffnx {Scheme Procedure} request-authorization request [default=#f]
1205 @deffnx {Scheme Procedure} request-cache-control request [default='()]
1206 @deffnx {Scheme Procedure} request-connection request [default='()]
1207 @deffnx {Scheme Procedure} request-content-encoding request [default='()]
1208 @deffnx {Scheme Procedure} request-content-language request [default='()]
1209 @deffnx {Scheme Procedure} request-content-length request [default=#f]
1210 @deffnx {Scheme Procedure} request-content-location request [default=#f]
1211 @deffnx {Scheme Procedure} request-content-md5 request [default=#f]
1212 @deffnx {Scheme Procedure} request-content-range request [default=#f]
1213 @deffnx {Scheme Procedure} request-content-type request [default=#f]
1214 @deffnx {Scheme Procedure} request-date request [default=#f]
1215 @deffnx {Scheme Procedure} request-expect request [default='()]
1216 @deffnx {Scheme Procedure} request-expires request [default=#f]
1217 @deffnx {Scheme Procedure} request-from request [default=#f]
1218 @deffnx {Scheme Procedure} request-host request [default=#f]
1219 @deffnx {Scheme Procedure} request-if-match request [default=#f]
1220 @deffnx {Scheme Procedure} request-if-modified-since request [default=#f]
1221 @deffnx {Scheme Procedure} request-if-none-match request [default=#f]
1222 @deffnx {Scheme Procedure} request-if-range request [default=#f]
1223 @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f]
1224 @deffnx {Scheme Procedure} request-last-modified request [default=#f]
1225 @deffnx {Scheme Procedure} request-max-forwards request [default=#f]
1226 @deffnx {Scheme Procedure} request-pragma request [default='()]
1227 @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f]
1228 @deffnx {Scheme Procedure} request-range request [default=#f]
1229 @deffnx {Scheme Procedure} request-referer request [default=#f]
1230 @deffnx {Scheme Procedure} request-te request [default=#f]
1231 @deffnx {Scheme Procedure} request-trailer request [default='()]
1232 @deffnx {Scheme Procedure} request-transfer-encoding request [default='()]
1233 @deffnx {Scheme Procedure} request-upgrade request [default='()]
1234 @deffnx {Scheme Procedure} request-user-agent request [default=#f]
1235 @deffnx {Scheme Procedure} request-via request [default='()]
1236 @deffnx {Scheme Procedure} request-warning request [default='()]
1237 Return the given request header, or @var{default} if none was present.
1238 @end deffn
1239
1240 @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f]
1241 A helper routine to determine the absolute URI of a request, using the
1242 @code{host} header and the default host and port.
1243 @end deffn
1244
1245
1246 @node Responses
1247 @subsection HTTP Responses
1248
1249 @example
1250 (use-modules (web response))
1251 @end example
1252
1253 As with requests (@pxref{Requests}), Guile offers a data type for HTTP
1254 responses. Again, the body is represented separately from the response.
1255
1256 @deffn {Scheme Procedure} response?
1257 @deffnx {Scheme Procedure} response-version
1258 @deffnx {Scheme Procedure} response-code
1259 @deffnx {Scheme Procedure} response-reason-phrase response
1260 @deffnx {Scheme Procedure} response-headers
1261 @deffnx {Scheme Procedure} response-port
1262 A predicate and field accessors for the response type. The fields are as
1263 follows:
1264 @table @code
1265 @item version
1266 The HTTP version pair, like @code{(1 . 1)}.
1267 @item code
1268 The HTTP response code, like @code{200}.
1269 @item reason-phrase
1270 The reason phrase, or the standard reason phrase for the response's
1271 code.
1272 @item headers
1273 The response headers, as an alist of parsed values.
1274 @item port
1275 The port on which to read or write a response body, if any.
1276 @end table
1277 @end deffn
1278
1279 @deffn {Scheme Procedure} read-response port
1280 Read an HTTP response from @var{port}.
1281
1282 As a side effect, sets the encoding on @var{port} to ISO-8859-1
1283 (latin-1), so that reading one character reads one byte. See the
1284 discussion of character sets in @ref{Requests}, for more information.
1285 @end deffn
1286
1287 @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
1288 Construct an HTTP response object. If @var{validate-headers?} is true,
1289 the headers are each run through their respective validators.
1290 @end deffn
1291
1292 @deffn {Scheme Procedure} adapt-response-version response version
1293 Adapt the given response to a different HTTP version. Return a new HTTP
1294 response.
1295
1296 The idea is that many applications might just build a response for the
1297 default HTTP version, and this method could handle a number of
1298 programmatic transformations to respond to older HTTP versions (0.9 and
1299 1.0). But currently this function is a bit heavy-handed, just updating
1300 the version field.
1301 @end deffn
1302
1303 @deffn {Scheme Procedure} write-response r port
1304 Write the given HTTP response to @var{port}.
1305
1306 Return a new response, whose @code{response-port} will continue writing
1307 on @var{port}, perhaps using some transfer encoding.
1308 @end deffn
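
As a minimal sketch, a response can be built and serialized in the same
way as a request.  The status code and header below are arbitrary
choices, and the names @code{resp} and @code{serialized} are
hypothetical.

@example
(use-modules (web response))

(define resp
  (build-response #:code 404
                  #:headers '((content-type . (text/plain)))))

(response-code resp)            ; 404
(response-version resp)         ; (1 . 1), the default version
(response-reason-phrase resp)   ; the standard phrase for 404, as none
                                ; was given

;; Write the status line and headers to a string port.  The body, if
;; any, is written separately.
(define serialized
  (call-with-output-string
    (lambda (port)
      (write-response resp port))))
@end example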
1309
1310 @deffn {Scheme Procedure} response-must-not-include-body? r
1311 Some responses, like those with status code 304, are specified as never
1312 having bodies. This predicate returns @code{#t} for those responses.
1313
1314 Note also, though, that responses to @code{HEAD} requests must also not
1315 have a body.
1316 @end deffn
1317
1318 @deffn {Scheme Procedure} response-body-port r [#:decode?=#t] [#:keep-alive?=#t]
1319 Return an input port from which the body of @var{r} can be read. The encoding
1320 of the returned port is set according to @var{r}'s @code{content-type} header,
1321 when it's textual, except if @var{decode?} is @code{#f}. Return @code{#f}
1322 when no body is available.
1323
1324 When @var{keep-alive?} is @code{#f}, closing the returned port also closes
1325 @var{r}'s response port.
1326 @end deffn
1327
1328 @deffn {Scheme Procedure} read-response-body r
1329 Read the response body from @var{r}, as a bytevector. Returns @code{#f}
1330 if there was no response body.
1331 @end deffn
1332
1333 @deffn {Scheme Procedure} write-response-body r bv
1334 Write @var{bv}, a bytevector, to the port corresponding to the HTTP
1335 response @var{r}.
1336 @end deffn
1337
1338 As with requests, the various headers that are typically associated with
1339 HTTP responses may be accessed with these dedicated accessors.
1340 @xref{HTTP Headers}, for more information on the format of parsed
1341 headers.
1342
1343 @deffn {Scheme Procedure} response-accept-ranges response [default=#f]
1344 @deffnx {Scheme Procedure} response-age response [default='()]
1345 @deffnx {Scheme Procedure} response-allow response [default='()]
1346 @deffnx {Scheme Procedure} response-cache-control response [default='()]
1347 @deffnx {Scheme Procedure} response-connection response [default='()]
1348 @deffnx {Scheme Procedure} response-content-encoding response [default='()]
1349 @deffnx {Scheme Procedure} response-content-language response [default='()]
1350 @deffnx {Scheme Procedure} response-content-length response [default=#f]
1351 @deffnx {Scheme Procedure} response-content-location response [default=#f]
1352 @deffnx {Scheme Procedure} response-content-md5 response [default=#f]
1353 @deffnx {Scheme Procedure} response-content-range response [default=#f]
1354 @deffnx {Scheme Procedure} response-content-type response [default=#f]
1355 @deffnx {Scheme Procedure} response-date response [default=#f]
1356 @deffnx {Scheme Procedure} response-etag response [default=#f]
1357 @deffnx {Scheme Procedure} response-expires response [default=#f]
1358 @deffnx {Scheme Procedure} response-last-modified response [default=#f]
1359 @deffnx {Scheme Procedure} response-location response [default=#f]
1360 @deffnx {Scheme Procedure} response-pragma response [default='()]
1361 @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f]
1362 @deffnx {Scheme Procedure} response-retry-after response [default=#f]
1363 @deffnx {Scheme Procedure} response-server response [default=#f]
1364 @deffnx {Scheme Procedure} response-trailer response [default='()]
1365 @deffnx {Scheme Procedure} response-transfer-encoding response [default='()]
1366 @deffnx {Scheme Procedure} response-upgrade response [default='()]
1367 @deffnx {Scheme Procedure} response-vary response [default='()]
1368 @deffnx {Scheme Procedure} response-via response [default='()]
1369 @deffnx {Scheme Procedure} response-warning response [default='()]
1370 @deffnx {Scheme Procedure} response-www-authenticate response [default=#f]
1371 Return the given response header, or @var{default} if none was present.
1372 @end deffn
1373
1374 @deffn {Scheme Procedure} text-content-type? @var{type}
1375 Return @code{#t} if @var{type}, a symbol as returned by
1376 @code{response-content-type}, represents a textual type such as
1377 @code{text/plain}.
1378 @end deffn
1379
1380
1381 @node Web Client
1382 @subsection Web Client
1383
1384 @code{(web client)} provides a simple, synchronous HTTP client, built on
1385 the lower-level HTTP, request, and response modules.
1386
1387 @deffn {Scheme Procedure} open-socket-for-uri uri
1388 Return an open input/output port for a connection to URI.
1389 @end deffn
1390
1391 @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1392 Connect to the server corresponding to @var{uri} and ask for the
1393 resource, using the @code{GET} method. If you already have a port open,
1394 pass it as @var{port}. The port will be closed at the end of the
1395 request unless @var{keep-alive?} is true. Any extra headers in the
1396 alist @var{extra-headers} will be added to the request.
1397
1398 If @var{decode-body?} is true, as is the default, the body of the
1399 response will be decoded to a string, if it is a textual content-type.
1400 Otherwise it will be returned as a bytevector.
1401 @end deffn
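
A typical call might look like the sketch below.  It assumes that
@code{http-get} returns two values, the response object and the body,
and that the machine has network access; the URI is only an example.

@example
(use-modules (web client) (web uri) (web response))

(call-with-values
  (lambda ()
    (http-get (string->uri "http://www.gnu.org/")))
  (lambda (response body)
    ;; With the default #:decode-body? #t, BODY is a string for textual
    ;; content types and a bytevector otherwise.
    (format #t "status: ~a~%" (response-code response))
    (format #t "type:   ~a~%" (response-content-type response))
    (format #t "textual body? ~a~%" (string? body))))
@end example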
1402
1403 @deffn {Scheme Procedure} http-get* uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t]
1404 Like @code{http-get}, but return an input port from which to read. When
1405 @var{decode-body?} is true, as is the default, the returned port has its
1406 encoding set appropriately if the data at @var{uri} is textual. Closing the
1407 returned port closes @var{port}, unless @var{keep-alive?} is true.
1408 @end deffn
1409
1410 @code{http-get} is useful for making one-off requests to web sites. If
1411 you are writing a web spider or some other client that needs to handle a
1412 number of requests in parallel, it's better to build an event-driven URL
1413 fetcher, similar in structure to the web server (@pxref{Web Server}).
1414
1415 Another option, good but not as performant, would be to use threads,
1416 possibly via @code{par-map} or futures.
1417
1418 More helper procedures for the other common HTTP verbs would be a good
1419 addition to this module. Send your code to
1420 @email{guile-user@@gnu.org}.
1421
1422
1423 @node Web Server
1424 @subsection Web Server
1425
1426 @code{(web server)} is a generic web server interface, along with a main
1427 loop implementation for web servers controlled by Guile.
1428
1429 @example
1430 (use-modules (web server))
1431 @end example
1432
1433 The lowest layer is the @code{<server-impl>} object, which defines a set
1434 of hooks to open a server, read a request from a client, write a
1435 response to a client, and close a server. These hooks -- @code{open},
1436 @code{read}, @code{write}, and @code{close}, respectively -- are bound
1437 together in a @code{<server-impl>} object. Procedures in this module take a
1438 @code{<server-impl>} object, if needed.
1439
1440 A @code{<server-impl>} may also be looked up by name. If you pass the
1441 @code{http} symbol to @code{run-server}, Guile looks for a variable
1442 named @code{http} in the @code{(web server http)} module, which should
1443 be bound to a @code{<server-impl>} object. Such a binding is made by
1444 instantiation of the @code{define-server-impl} syntax. In this way the
1445 run-server loop can automatically load other backends if available.
1446
1447 The life cycle of a server goes as follows:
1448
1449 @enumerate
1450 @item
1451 The @code{open} hook is called, to open the server. @code{open} takes 0 or
1452 more arguments, depending on the backend, and returns an opaque
1453 server socket object, or signals an error.
1454
1455 @item
1456 The @code{read} hook is called, to read a request from a new client.
1457 The @code{read} hook takes one argument, the server socket. It should
1458 return three values: an opaque client socket, the request, and the
1459 request body. The request should be a @code{<request>} object, from
1460 @code{(web request)}. The body should be a string or a bytevector, or
1461 @code{#f} if there is no body.
1462
1463 If the read failed, the @code{read} hook may return @code{#f} for the client
1464 socket, request, and body.
1465
1466 @item
1467 A user-provided handler procedure is called, with the request and body
1468 as its arguments. The handler should return two values: the response,
1469 as a @code{<response>} record from @code{(web response)}, and the
1470 response body as a bytevector, or @code{#f} if not present.
1471
1472 The response and response body are run through @code{sanitize-response},
1473 documented below. This allows the handler writer to take some
1474 convenient shortcuts: for example, instead of a @code{<response>}, the
1475 handler can simply return an alist of headers, in which case a default
1476 response object is constructed with those headers. Instead of a
1477 bytevector for the body, the handler can return a string, which will be
1478 serialized into an appropriate encoding; or it can return a procedure,
1479 which will be called on a port to write out the data. See the
1480 @code{sanitize-response} documentation, for more.
1481
1482 @item
1483 The @code{write} hook is called with three arguments: the client
1484 socket, the response, and the body. The @code{write} hook returns no
1485 values.
1486
1487 @item
1488 At this point the request handling is complete. In a server loop, we
1489 go back and try to read a new request.
1490
1491 @item
1492 If the user interrupts the loop, the @code{close} hook is called on
1493 the server socket.
1494 @end enumerate
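
The handler step of this life cycle can be sketched as follows.  The
handler name @code{hello-handler} is hypothetical, and the example relies
on the shortcuts described above: an alist of headers in place of a
@code{<response>}, and a string in place of a bytevector body.

@example
(use-modules (web server))

;; A handler receives the request and its body, and returns two values.
(define (hello-handler request body)
  (values '((content-type . (text/plain)))  ; alist shortcut for a response
          "Hello, world!\n"))               ; string shortcut for a body

;; Blocks, serving requests with the default http implementation until
;; interrupted.
(run-server hello-handler)
@end example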
1495
1496 A user may define a server implementation with the following form:
1497
1498 @deffn {Scheme Syntax} define-server-impl name open read write close
1499 Make a @code{<server-impl>} object with the hooks @var{open},
1500 @var{read}, @var{write}, and @var{close}, and bind it to the symbol
1501 @var{name} in the current module.
1502 @end deffn
1503
1504 @deffn {Scheme Procedure} lookup-server-impl impl
1505 Look up a server implementation. If @var{impl} is a server
1506 implementation already, it is returned directly. If it is a symbol, the
1507 binding named @var{impl} in the @code{(web server @var{impl})} module is
1508 looked up. Otherwise an error is signaled.
1509
1510 Currently a server implementation is a somewhat opaque type, useful only
1511 for passing to other procedures in this module, like @code{read-client}.
1512 @end deffn
1513
1514 The @code{(web server)} module defines a number of routines that use
1515 @code{<server-impl>} objects to implement parts of a web server. Given
1516 that we don't expose the accessors for the various fields of a
1517 @code{<server-impl>}, indeed these routines are the only procedures with
1518 any access to the impl objects.
1519
1520 @deffn {Scheme Procedure} open-server impl open-params
1521 Open a server for the given implementation. Return one value, the new
1522 server object. The implementation's @code{open} procedure is applied to
1523 @var{open-params}, which should be a list.
1524 @end deffn
1525
1526 @deffn {Scheme Procedure} read-client impl server
1527 Read a new client from @var{server}, by applying the implementation's
1528 @code{read} procedure to the server. If successful, return three
1529 values: an object corresponding to the client, a request object, and the
1530 request body. If any exception occurs, return @code{#f} for all three
1531 values.
1532 @end deffn

@deffn {Scheme Procedure} handle-request handler request body state
Handle a given request, returning the response and body.

The response and response body are produced by calling the given
@var{handler} with @var{request} and @var{body} as arguments.

The elements of @var{state} are also passed to @var{handler} as
arguments, and may be returned as additional values. The new
@var{state}, collected from the @var{handler}'s return values, is then
returned as a list. The idea is that a server loop receives a handler
from the user, along with whatever state values the user is interested
in, allowing the user's handler to explicitly manage its state.
@end deffn
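
As a minimal sketch of this state-passing convention, the following
calls @code{handle-request} directly, without a running server. The
names @code{counting-handler}, @code{request} and @code{new-state}, the
request URI, and the initial state are all assumptions made up for this
illustration.

@example
(use-modules (web server)
             (web request)
             (web uri))

;; Hypothetical handler: takes one extra state argument, a request
;; counter, and returns its successor as a third value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Request number " (number->string count))
          (+ count 1)))

;; A request record built by hand, standing in for one read from a
;; client.
(define request
  (build-request (string->uri "http://localhost:8080/")))

;; The state goes in as a list and comes back as a list.
(define new-state
  (call-with-values
      (lambda () (handle-request counting-handler request #f '(1)))
    (lambda (response body state)
      state)))
@end example

Here @code{new-state} should hold the incremented counter, wrapped in a
list and ready to be passed to the next call.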

@deffn {Scheme Procedure} sanitize-response request response body
``Sanitize'' the given response and body, making them appropriate for the
given request.

As a convenience to web handler authors, @var{response} may be given as
an alist of headers, in which case it is used to construct a default
response. This procedure ensures that the response version corresponds
to the request version. If @var{body} is a string, it encodes the string
to a bytevector, in an encoding appropriate for @var{response}. It adds
@code{content-length} and @code{content-type} headers, as necessary.

If @var{body} is a procedure, it is called with a port as an argument,
and the output collected as a bytevector. In the future we might try to
instead use a compressing, chunk-encoded port, and call this procedure
later, in the @code{write-client} procedure. Authors are advised not to
rely on the procedure being called at any particular time.
@end deffn
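
The alist and string conveniences can be tried out by hand. This is
only a sketch: the request is constructed with @code{build-request}
purely for illustration, and the names @code{request} and
@code{sanitized} are assumptions, not part of the module.

@example
(use-modules (web server)
             (web request)
             (web response)
             (web uri)
             (rnrs bytevectors))

(define request
  (build-request (string->uri "http://localhost:8080/")))

;; Pass an alist of headers and a string body; collect a few
;; properties of the sanitized response and body into a list.
(define sanitized
  (call-with-values
      (lambda ()
        (sanitize-response request
                           '((content-type . (text/plain)))
                           "Hello World!"))
    (lambda (response body)
      (list (response-content-type response)
            (response-content-length response)
            (bytevector? body)))))
@end example

The body should come back as a bytevector, and the response should
carry a @code{content-length} header matching its length.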

@deffn {Scheme Procedure} write-client impl server client response body
Write an HTTP response and body to @var{client}. If the server and
client support persistent connections, it is the implementation's
responsibility to keep track of the client thereafter, presumably by
attaching it to the @var{server} argument somehow.
@end deffn

@deffn {Scheme Procedure} close-server impl server
Release resources allocated by a previous invocation of
@code{open-server}.
@end deffn

Given the procedures above, it is a small matter to make a web server:

@deffn {Scheme Procedure} serve-one-client handler impl server state
Read one request from @var{server}, call @var{handler} on the request
and body, and write the response to the client. Return the new state
produced by the handler procedure.
@end deffn
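
To show how these pieces fit together, here is a rough sketch of a
bounded server loop. The helper @code{serve-some-requests} is
hypothetical and not part of Guile. It assumes the default @code{http}
implementation with default parameters, and it leaves out any signal or
error handling.

@example
(use-modules (web server))

;; Hypothetical helper: call serve-one-client COUNT times, threading
;; the handler state through, then release the server.
(define (serve-some-requests handler count)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '())))
    (let loop ((remaining count) (state '()))
      (if (zero? remaining)
          (close-server impl server)
          (loop (- remaining 1)
                (serve-one-client handler impl server state))))))
@end example

Each iteration handles at most one request, so the loop serves no more
than @var{count} requests before closing the server.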

@deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state
Run Guile's built-in web server.

@var{handler} should be a procedure that takes two or more arguments,
the HTTP request and request body, and returns two or more values, the
response and response body.

For examples, skip ahead to the next section, @ref{Web Examples}.

The response and body will be run through @code{sanitize-response}
before sending back to the client.

Additional arguments to @var{handler} are taken from @var{state}.
Additional return values are accumulated into a new @var{state}, which
will be used for subsequent requests. In this way a handler can
explicitly manage its state.
@end deffn

The default web server implementation is @code{http}, which binds to a
socket and listens for requests on it.

@deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket]
The default HTTP implementation. We document it as a function with
keyword arguments because that is precisely how it behaves: all of the
@var{open-params} to @code{run-server} get passed to the
implementation's open function.

@example
;; The defaults: localhost:8080
(run-server handler)
;; Same thing
(run-server handler 'http '())
;; On a different port
(run-server handler 'http '(#:port 8081))
;; IPv6
(run-server handler 'http '(#:family AF_INET6 #:port 8081))
;; Custom socket
(run-server handler 'http `(#:socket ,(sudo-make-me-a-socket)))
@end example
@end deffn

@node Web Examples
@subsection Web Examples

Well, enough about the tedious internals. Let's make a web application!

@subsubsection Hello, World!

The first program we have to write, of course, is ``Hello, World!''.
This means that we have to implement a web handler that does what we
want.

A handler is a function of two arguments and two return values:

@example
(define (handler request request-body)
  (values @var{response} @var{response-body}))
@end example

In this first example, we take advantage of a short-cut, returning an
alist of headers instead of a proper response object. The response body
is our payload:

@example
(define (hello-world-handler request request-body)
  (values '((content-type . (text/plain)))
          "Hello World!"))
@end example

Now let's test it. Load up the web server module if you haven't yet
done so, and run a server with this handler:

@example
(use-modules (web server))
(run-server hello-world-handler)
@end example

By default, the web server listens for requests on
@code{localhost:8080}. Visit that address in your web browser to
test. If you see the string, @code{Hello World!}, sweet!

@subsubsection Inspecting the Request

The Hello World program above is a general greeter, responding to all
URIs. To make a more exclusive greeter, we need to inspect the request
object, and conditionally produce different results. So let's load up
the request, response, and URI modules, and do just that.

@example
(use-modules (web server)) ; you probably did this already
(use-modules (web request)
             (web response)
             (web uri))

(define (request-path-components request)
  (split-and-decode-uri-path (uri-path (request-uri request))))

(define (hello-hacker-handler request body)
  (if (equal? (request-path-components request)
              '("hacker"))
      (values '((content-type . (text/plain)))
              "Hello hacker!")
      (not-found request)))

(run-server hello-hacker-handler)
@end example

Here we see that we have defined a helper to return the components of
the URI path as a list of strings, and used that to check for a request
to @code{/hacker/}. Then the success case is just as before -- visit
@code{http://localhost:8080/hacker/} in your browser to check.

You should always match against URI path components as decoded by
@code{split-and-decode-uri-path}. The above example will work for
@code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}.
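
To check this without a browser, the three paths can be decoded
directly at the REPL. The variable name @code{decoded-paths} is only
an illustration.

@example
(use-modules (web uri))

;; Empty path components are dropped and percent-escapes are decoded,
;; so all three spellings yield the same list of components.
(define decoded-paths
  (map split-and-decode-uri-path
       '("/hacker/" "//hacker///" "/h%61ck%65r")))
@end example

Each element of @code{decoded-paths} is the list @code{("hacker")},
which is why a single @code{equal?} test covers all three spellings.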

But we forgot to define @code{not-found}! If you are pasting these
examples into a REPL, accessing any other URI in your web browser will
drop your Guile console into the debugger:

@example
<unnamed port>:38:7: In procedure module-lookup:
<unnamed port>:38:7: Unbound variable: not-found

Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue.
scheme@@(guile-user) [1]>
@end example

So let's define the function, right there in the debugger. As you
probably know, we'll want to return a 404 response.

@example
;; Paste this in your REPL
(define (not-found request)
  (values (build-response #:code 404)
          (string-append "Resource not found: "
                         (uri->string (request-uri request)))))

;; Now paste this to let the web server keep going:
,continue
@end example

Now if you access @code{http://localhost:8080/foo/}, you get this error
message. (Note that some popular web browsers won't show
server-generated 404 messages, showing their own instead, unless the 404
message body is long enough.)

@subsubsection Higher-Level Interfaces

The web handler interface is a common baseline that all kinds of Guile
web applications can use. You will usually want to build something on
top of it, however, especially when producing HTML. Here is a simple
example that builds up HTML output using SXML (@pxref{sxml simple}).

First, load up the modules:

@example
(use-modules (web server)
             (web request)
             (web response)
             (sxml simple))
@end example

Now we define a simple templating function that takes a list of HTML
body elements, as SXML, and puts them in our super template:

@example
(define (templatize title body)
  `(html (head (title ,title))
         (body ,@@body)))
@end example

For example, the simplest Hello HTML can be produced like this:

@example
(sxml->xml (templatize "Hello!" '((b "Hi!"))))
@print{}
<html><head><title>Hello!</title></head><body><b>Hi!</b></body></html>
@end example

It is much better to work with Scheme data types than with HTML as
strings. Now we define a little response helper:

@example
(define* (respond #:optional body #:key
                  (status 200)
                  (title "Hello hello!")
                  (doctype "<!DOCTYPE html>\n")
                  (content-type-params '((charset . "utf-8")))
                  (content-type 'text/html)
                  (extra-headers '())
                  (sxml (and body (templatize title body))))
  (values (build-response
           #:code status
           #:headers `((content-type
                        . (,content-type ,@@content-type-params))
                       ,@@extra-headers))
          (lambda (port)
            (if sxml
                (begin
                  (if doctype (display doctype port))
                  (sxml->xml sxml port))))))
@end example

Here we see the power of keyword arguments with default initializers. By
the time the arguments are fully parsed, the @code{sxml} local variable
will hold the templated SXML, ready for sending out to the client.

Also, instead of returning the body as a string, @code{respond} gives a
procedure, which will be called by the web server to write out the
response to the client.
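
A procedure body can also be exercised on its own by passing it through
@code{sanitize-response}, which calls it with a port and collects the
output. This is a sketch only: the hand-built request and the names
@code{request} and @code{body-text} are assumptions for illustration.

@example
(use-modules (web server)
             (web request)
             (web response)
             (web uri)
             (rnrs bytevectors))

(define request
  (build-request (string->uri "http://localhost:8080/")))

;; The body is a procedure of one argument, a port.  The sanitized
;; body is a bytevector, decoded here back into a string.
(define body-text
  (call-with-values
      (lambda ()
        (sanitize-response
         request
         (build-response
          #:headers '((content-type . (text/plain))))
         (lambda (port)
           (display "Hello from a procedure" port))))
    (lambda (response body)
      (utf8->string body))))
@end example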

Now, a simple example using this responder, which lays out the incoming
headers in an HTML table.

@example
(define (debug-page request body)
  (respond
   `((h1 "hello world!")
     (table
      (tr (th "header") (th "value"))
      ,@@(map (lambda (pair)
               `(tr (td (tt ,(with-output-to-string
                               (lambda () (display (car pair))))))
                    (td (tt ,(with-output-to-string
                               (lambda ()
                                 (write (cdr pair))))))))
             (request-headers request))))))

(run-server debug-page)
@end example

Now if you visit any local address in your web browser, you will
actually see some HTML, finally.

@subsubsection Conclusion

Well, this is about as far as Guile's built-in web support goes, for
now. There are many ways to make a web application, but hopefully by
standardizing the most fundamental data types, users will be able to
choose the approach that suits them best, while also being able to
switch between implementations of the server. This is a relatively new
part of Guile, so if you have feedback, let us know, and we can take it
into account. Happy hacking on the web!

@c Local Variables:
@c TeX-master: "guile.texi"
@c End: