| 1 | @c -*-texinfo-*- |
| 2 | @c This is part of the GNU Guile Reference Manual. |
| 3 | @c Copyright (C) 2010, 2011, 2012, 2013 Free Software Foundation, Inc. |
| 4 | @c See the file guile.texi for copying conditions. |
| 5 | |
| 6 | @node Web |
| 7 | @section @acronym{HTTP}, the Web, and All That |
| 8 | @cindex Web |
| 9 | @cindex WWW |
| 10 | @cindex HTTP |
| 11 | |
| 12 | It has always been possible to connect computers together and share |
| 13 | information between them, but the rise of the World Wide Web over the |
| 14 | last couple of decades has made it much easier to do so. The result is |
| 15 | a richly connected network of computation, in which Guile forms a part. |
| 16 | |
| 17 | By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for |
| 18 | protocol, but this phrase appears repeatedly in RFC 2616.} as handled by |
| 19 | servers, clients, proxies, caches, and the various kinds of messages and |
| 20 | message components that can be sent and received by that protocol, |
| 21 | notably HTML. |
| 22 | |
| 23 | On one level, the web is text in motion: the protocols themselves are |
| 24 | textual (though the payload may be binary), and it's possible to create |
| 25 | a socket and speak text to the web. But such an approach is obviously |
| 26 | primitive. This section details the higher-level data types and |
| 27 | operations provided by Guile: URIs, HTTP request and response records, |
| 28 | and a conventional web server implementation. |
| 29 | |
| 30 | The material in this section is arranged in ascending order, in which |
| 31 | later concepts build on previous ones. If you prefer to start with the |
| 32 | highest-level perspective, @pxref{Web Examples}, and work your way |
| 33 | back. |
| 34 | |
| 35 | @menu |
| 36 | * Types and the Web:: Types prevent bugs and security problems. |
| 37 | * URIs:: Uniform Resource Identifiers. |
| 38 | * HTTP:: The Hyper-Text Transfer Protocol. |
| 39 | * HTTP Headers:: How Guile represents specific header values. |
| 40 | * Transfer Codings:: HTTP Transfer Codings. |
| 41 | * Requests:: HTTP requests. |
| 42 | * Responses:: HTTP responses. |
| 43 | * Web Client:: Accessing web resources over HTTP. |
| 44 | * Web Server:: Serving HTTP to the internet. |
| 45 | * Web Examples:: How to use this thing. |
| 46 | @end menu |
| 47 | |
| 48 | @node Types and the Web |
| 49 | @subsection Types and the Web |
| 50 | |
| 51 | It is a truth universally acknowledged, that a program with good use of |
| 52 | data types, will be free from many common bugs. Unfortunately, the |
| 53 | common practice in web programming seems to ignore this maxim. This |
| 54 | subsection makes the case for expressive data types in web programming. |
| 55 | |
| 56 | By ``expressive data types'', we mean that the data types @emph{say} |
| 57 | something about how a program solves a problem. For example, if we |
| 58 | choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}), |
| 59 | this indicates that there is a part of the program that will always have |
| 60 | valid dates. Error handling for a number of basic cases, like invalid |
| 61 | dates, occurs on the boundary in which we produce a SRFI 19 date record |
| 62 | from other types, like strings. |
| 63 | |
| 64 | With regard to the web, data types are helpful in the two broad phases |
| 65 | of HTTP messages: parsing and generation. |
| 66 | |
| 67 | Consider a server, which has to parse a request, and produce a response. |
| 68 | Guile will parse the request into an HTTP request object |
| 69 | (@pxref{Requests}), with each header parsed into an appropriate Scheme |
| 70 | data type. This transition from an incoming stream of characters to |
| 71 | typed data is a state change in a program---the strings might parse, or |
| 72 | they might not, and something has to happen if they do not. (Guile |
| 73 | throws an error in this case.) But after you have the parsed request, |
| 74 | ``client'' code (code built on top of the Guile web framework) will not |
| 75 | have to check for syntactic validity. The types already make this |
| 76 | information manifest. |
| 77 | |
| 78 | This state change on the parsing boundary makes programs more robust, |
| 79 | as they themselves are freed from the need to do a number of common |
| 80 | error checks, and they can use normal Scheme procedures to handle a |
| 81 | request instead of ad-hoc string parsers. |
| 82 | |
| 83 | The need for types on the response generation side (in a server) is more |
| 84 | subtle, though not less important. Consider the example of a POST |
| 85 | handler, which prints out the text that a user submits from a form. |
| 86 | Such a handler might include a procedure like this: |
| 87 | |
| 88 | @example |
| 89 | ;; First, a helper procedure |
| 90 | (define (para . contents) |
| 91 | (string-append "<p>" (string-concatenate contents) "</p>")) |
| 92 | |
| 93 | ;; Now the meat of our simple web application |
| 94 | (define (you-said text) |
| 95 | (para "You said: " text)) |
| 96 | |
| 97 | (display (you-said "Hi!")) |
| 98 | @print{} <p>You said: Hi!</p> |
| 99 | @end example |
| 100 | |
| 101 | This is a perfectly valid implementation, provided that the incoming |
| 102 | text does not contain the special HTML characters @samp{<}, @samp{>}, or |
| 103 | @samp{&}. But this provision of a restricted character set is not |
| 104 | reflected anywhere in the program itself: we must @emph{assume} that the |
| 105 | programmer understands this, and performs the check elsewhere. |
| 106 | |
| 107 | Unfortunately, the short history of the practice of programming does not |
| 108 | bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS}) |
| 109 | vulnerability is just such a common error in which unfiltered user input |
| 110 | is allowed into the output. A user could submit a crafted comment to |
| 111 | your web site which results in visitors running malicious JavaScript, |
| 112 | within the security context of your domain: |
| 113 | |
| 114 | @example |
| 115 | (display (you-said "<script src=\"http://bad.com/nasty.js\" />")) |
| 116 | @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p> |
| 117 | @end example |
| 118 | |
| 119 | The fundamental problem here is that both user data and the program |
| 120 | template are represented using strings. This identity means that types |
| 121 | can't help the programmer to make a distinction between these two, so |
| 122 | they get confused. |
| 123 | |
| 124 | There are a number of possible solutions, but perhaps the best is to |
| 125 | treat HTML not as strings, but as native s-expressions: as SXML. The |
| 126 | basic idea is that HTML is either text, represented by a string, or an |
| 127 | element, represented as a tagged list. So @samp{foo} becomes |
| 128 | @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}. |
| 129 | Attributes, if present, go in a tagged list headed by @samp{@@}, like |
| 130 | @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{SXML}, for |
| 131 | more information. |
| 132 | |
| 133 | The good thing about SXML is that HTML elements cannot be confused with |
| 134 | text. Let's make a new definition of @code{para}: |
| 135 | |
| 136 | @example |
| 137 | (define (para . contents) |
| 138 | `(p ,@@contents)) |
| 139 | |
| 140 | (use-modules (sxml simple)) |
| 141 | (sxml->xml (you-said "Hi!")) |
| 142 | @print{} <p>You said: Hi!</p> |
| 143 | |
| 144 | (sxml->xml (you-said "<i>Rats, foiled again!</i>")) |
| 145 | @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p> |
| 146 | @end example |
| 147 | |
| 148 | So we see in the second example that HTML elements cannot be unwittingly |
| 149 | introduced into the output. However it is now perfectly acceptable to |
| 150 | pass SXML to @code{you-said}; in fact, that is the big advantage of SXML |
| 151 | over everything-as-a-string. |
| 152 | |
| 153 | @example |
| 154 | (sxml->xml (you-said (you-said "<Hi!>"))) |
| 155 | @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p> |
| 156 | @end example |
| 157 | |
| 158 | The SXML types allow procedures to @emph{compose}. The types make |
| 159 | manifest which parts are HTML elements, and which are text. So you |
| 160 | needn't worry about escaping user input; the type transition back to a |
| 161 | string handles that for you. @acronym{XSS} vulnerabilities are a thing |
| 162 | of the past. |
| 163 | |
| 164 | Well. That's all very nice and opinionated and such, but how do I use |
| 165 | the thing? Read on! |
| 166 | |
| 167 | @node URIs |
| 168 | @subsection Uniform Resource Identifiers |
| 169 | |
| 170 | Guile provides a standard data type for Uniform Resource Identifiers |
| 171 | (URIs), as defined in RFC 3986. |
| 172 | |
| 173 | The generic URI syntax is as follows: |
| 174 | |
| 175 | @example |
| 176 | URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \ |
| 177 | [ "?" query ] [ "#" fragment ] |
| 178 | @end example |
| 179 | |
| 180 | For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the |
| 181 | scheme is @code{http}, the host is @code{www.gnu.org}, the path is |
| 182 | @code{/help/}, and there is no userinfo, port, query, or fragment. All |
| 183 | URIs have a scheme and a path (though the path might be empty). Some |
| 184 | URIs have a host, and some of those have ports and userinfo. Any URI |
| 185 | might have a query part or a fragment. |
| 186 | |
| 187 | Userinfo is something of an abstraction, as some legacy URI schemes |
| 188 | allowed userinfo of the form @code{@var{username}:@var{passwd}}. But |
| 189 | since passwords do not belong in URIs, the RFC does not want to condone |
| 190 | this practice, so it calls anything before the @code{@@} sign |
| 191 | @dfn{userinfo}. |
| 192 | |
| 193 | Properly speaking, a fragment is not part of a URI. For example, when a |
| 194 | web browser follows a link to @indicateurl{http://example.com/#foo}, it |
| 195 | sends a request for @indicateurl{http://example.com/}, then looks in the |
| 196 | resulting page for the fragment identified by @code{foo}. A |
| 197 | fragment identifies a part of a resource, not the resource itself. But |
| 198 | it is useful to have a fragment field in the URI record itself, so we |
| 199 | hope you will forgive the inconsistency. |
| 200 | |
| 201 | @example |
| 202 | (use-modules (web uri)) |
| 203 | @end example |
| 204 | |
| 205 | The following procedures can be found in the @code{(web uri)} |
| 206 | module. Load it into your Guile, using a form like the above, to have |
| 207 | access to them. |
| 208 | |
| 209 | @deffn {Scheme Procedure} build-uri scheme @ |
| 210 | [#:userinfo=@code{#f}] [#:host=@code{#f}] [#:port=@code{#f}] @ |
| 211 | [#:path=@code{""}] [#:query=@code{#f}] [#:fragment=@code{#f}] @ |
| 212 | [#:validate?=@code{#t}] |
| 213 | Construct a URI object. @var{scheme} should be a symbol, @var{port} |
| 214 | either a positive, exact integer or @code{#f}, and the rest of the |
| 215 | fields are either strings or @code{#f}. If @var{validate?} is true, |
| 216 | also run some consistency checks to make sure that the constructed URI |
| 217 | is valid. |
| 218 | @end deffn |
| 219 | |
| 220 | @deffn {Scheme Procedure} uri? obj |
| 221 | @deffnx {Scheme Procedure} uri-scheme uri |
| 222 | @deffnx {Scheme Procedure} uri-userinfo uri |
| 223 | @deffnx {Scheme Procedure} uri-host uri |
| 224 | @deffnx {Scheme Procedure} uri-port uri |
| 225 | @deffnx {Scheme Procedure} uri-path uri |
| 226 | @deffnx {Scheme Procedure} uri-query uri |
| 227 | @deffnx {Scheme Procedure} uri-fragment uri |
| 228 | A predicate and field accessors for the URI record type. The URI scheme |
| 229 | will be a symbol, the port either a positive, exact integer or @code{#f}, |
| 230 | and the rest either strings or @code{#f} if not present. |
| 231 | @end deffn |
| 232 | |
| 233 | @deffn {Scheme Procedure} string->uri string |
| 234 | Parse @var{string} into a URI object. Return @code{#f} if the string |
| 235 | could not be parsed. |
| 236 | @end deffn |
| 237 | |
| 238 | @deffn {Scheme Procedure} uri->string uri |
| 239 | Serialize @var{uri} to a string. If the URI has a port that is the |
| 240 | default port for its scheme, the port is not included in the |
| 241 | serialization. |
| 242 | @end deffn |
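|  | |
|  | As a brief illustration of the procedures above, the following sketch |
|  | parses a URI, inspects its fields, and builds another one from parts. |
|  | The URIs themselves are arbitrary examples chosen for this sketch. |
|  | |
|  | @example |
|  | (use-modules (web uri)) |
|  | |
|  | ;; Parse a string into a URI record, then read its fields. |
|  | (define u (string->uri "http://www.gnu.org/help/?q=guile#top")) |
|  | (uri-scheme u) |
|  | @result{} http |
|  | (uri-host u) |
|  | @result{} "www.gnu.org" |
|  | (uri-port u) |
|  | @result{} #f |
|  | (uri-path u) |
|  | @result{} "/help/" |
|  | (uri-query u) |
|  | @result{} "q=guile" |
|  | (uri-fragment u) |
|  | @result{} "top" |
|  | |
|  | ;; Build a URI from parts and serialize it back to a string. |
|  | (uri->string (build-uri 'http #:host "example.com" #:path "/foo")) |
|  | @result{} "http://example.com/foo" |
|  | @end example |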
| 243 | |
| 244 | @deffn {Scheme Procedure} declare-default-port! scheme port |
| 245 | Declare a default port for the given URI scheme. |
| 246 | @end deffn |
| 247 | |
| 248 | @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] |
| 249 | Percent-decode the given @var{str}, according to @var{encoding}, which |
| 250 | should be the name of a character encoding. |
| 251 | |
| 252 | Note that this function should not generally be applied to a full URI |
| 253 | string. For paths, use @code{split-and-decode-uri-path} instead. For |
| 254 | query strings, split the query on @code{&} and @code{=} boundaries, and |
| 255 | decode the components separately. |
| 256 | |
| 257 | Note also that percent-encoded strings encode @emph{bytes}, not |
| 258 | characters. There is no guarantee that a given byte sequence is a valid |
| 259 | string encoding. Therefore this routine may signal an error if the |
| 260 | decoded bytes are not valid for the given encoding. Pass @code{#f} for |
| 261 | @var{encoding} if you want decoded bytes as a bytevector directly. |
| 262 | @xref{Ports, @code{set-port-encoding!}}, for more information on |
| 263 | character encodings. |
| 264 | |
| 265 | Returns a string of the decoded characters, or a bytevector if |
| 266 | @var{encoding} was @code{#f}. |
| 267 | @end deffn |
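|  | |
|  | The advice above about query strings can be sketched as follows. The |
|  | helper @code{decode-query} is hypothetical and not part of |
|  | @code{(web uri)}; it assumes a query made of @samp{&}-separated |
|  | @samp{key=value} pairs, and treats a key without @samp{=} as having an |
|  | empty value. |
|  | |
|  | @example |
|  | (use-modules (web uri)) |
|  | |
|  | ;; Hypothetical helper: split first, then decode each component. |
|  | (define (decode-query query) |
|  | (map (lambda (kv) |
|  | (let ((i (string-index kv #\=))) |
|  | (if i |
|  | (cons (uri-decode (substring kv 0 i)) |
|  | (uri-decode (substring kv (1+ i)))) |
|  | (cons (uri-decode kv) "")))) |
|  | (string-split query #\&))) |
|  | |
|  | (decode-query "name=Guile%20Scheme&lang=en") |
|  | @result{} (("name" . "Guile Scheme") ("lang" . "en")) |
|  | @end example |
|  | |
|  | Splitting before decoding matters: a percent-encoded @samp{&} or |
|  | @samp{=} inside a value would otherwise be mistaken for a delimiter. |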
| 268 | |
| 269 | @c FIXME: clarify return type, indicate default values, and document the |
| 270 | @c type of unescaped-chars. |
| 271 | |
| 272 | @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars] |
| 273 | Percent-encode any character not in the character set, |
| 274 | @var{unescaped-chars}. |
| 275 | |
| 276 | The default character set includes alphanumerics from ASCII, as well as |
| 277 | the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any |
| 278 | other character will be percent-encoded, by writing out the character to |
| 279 | a bytevector within the given @var{encoding}, then encoding each byte as |
| 280 | @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of |
| 281 | the byte. |
| 282 | @end deffn |
| 283 | |
| 284 | @deffn {Scheme Procedure} split-and-decode-uri-path path |
| 285 | Split @var{path} into its components, and decode each component, |
| 286 | removing empty components. |
| 287 | |
| 288 | For example, @code{"/foo/bar%20baz/"} decodes to the two-element list, |
| 289 | @code{("foo" "bar baz")}. |
| 290 | @end deffn |
| 291 | |
| 292 | @deffn {Scheme Procedure} encode-and-join-uri-path parts |
| 293 | URI-encode each element of @var{parts}, which should be a list of |
| 294 | strings, and join the parts together with @code{/} as a delimiter. |
| 295 | |
| 296 | For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes |
| 297 | as @code{"scrambled%20eggs/biscuits%26gravy"}. |
| 298 | @end deffn |
| 299 | |
| 300 | @node HTTP |
| 301 | @subsection The Hyper-Text Transfer Protocol |
| 302 | |
| 303 | The initial motivation for including web functionality in Guile, rather |
| 304 | than rely on an external package, was to establish a standard base on |
| 305 | which people can share code. To that end, we continue the focus on data |
| 306 | types by providing a number of low-level parsers and unparsers for |
| 307 | elements of the HTTP protocol. |
| 308 | |
| 309 | If you want to skip the low-level details for now and move on to web |
| 310 | pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the |
| 311 | HTTP module, and read on. |
| 312 | |
| 313 | @example |
| 314 | (use-modules (web http)) |
| 315 | @end example |
| 316 | |
| 317 | The focus of the @code{(web http)} module is to parse and unparse |
| 318 | standard HTTP headers, representing them to Guile as native data |
| 319 | structures. For example, a @code{Date:} header will be represented as a |
| 320 | SRFI-19 date record (@pxref{SRFI-19}), rather than as a string. |
| 321 | |
| 322 | Guile tries to follow RFCs fairly strictly---the road to perdition being |
| 323 | paved with compatibility hacks---though some allowances are made for |
| 324 | not-too-divergent texts. |
| 325 | |
| 326 | Header names are represented as lower-case symbols. |
| 327 | |
| 328 | @deffn {Scheme Procedure} string->header name |
| 329 | Parse @var{name} to a symbolic header name. |
| 330 | @end deffn |
| 331 | |
| 332 | @deffn {Scheme Procedure} header->string sym |
| 333 | Return the string form for the header named @var{sym}. |
| 334 | @end deffn |
| 335 | |
| 336 | For example: |
| 337 | |
| 338 | @example |
| 339 | (string->header "Content-Length") |
| 340 | @result{} content-length |
| 341 | (header->string 'content-length) |
| 342 | @result{} "Content-Length" |
| 343 | |
| 344 | (string->header "FOO") |
| 345 | @result{} foo |
| 346 | (header->string 'foo) |
| 347 | @result{} "Foo" |
| 348 | @end example |
| 349 | |
| 350 | Guile keeps a registry of known headers, their string names, and some |
| 351 | parsing and serialization procedures. If a header is unknown, its |
| 352 | string name is simply its symbol name in title-case. |
| 353 | |
| 354 | @deffn {Scheme Procedure} known-header? sym |
| 355 | Return @code{#t} if @var{sym} is a known header, with associated |
| 356 | parsers and serialization procedures, or @code{#f} otherwise. |
| 357 | @end deffn |
| 358 | |
| 359 | @deffn {Scheme Procedure} header-parser sym |
| 360 | Return the value parser for headers named @var{sym}. The result is a |
| 361 | procedure that takes one argument, a string, and returns the parsed |
| 362 | value. If the header isn't known to Guile, a default parser is returned |
| 363 | that passes through the string unchanged. |
| 364 | @end deffn |
| 365 | |
| 366 | @deffn {Scheme Procedure} header-validator sym |
| 367 | Return a predicate which returns @code{#t} if the given value is valid |
| 368 | for headers named @var{sym}. The default validator for unknown headers |
| 369 | is @code{string?}. |
| 370 | @end deffn |
| 371 | |
| 372 | @deffn {Scheme Procedure} header-writer sym |
| 373 | Return a procedure that writes values for headers named @var{sym} to a |
| 374 | port. The resulting procedure takes two arguments: a value and a port. |
| 375 | The default writer is @code{display}. |
| 376 | @end deffn |
| 377 | |
| 378 | For more on the set of headers that Guile knows about out of the box, |
| 379 | @pxref{HTTP Headers}. To add your own, use the @code{declare-header!} |
| 380 | procedure: |
| 381 | |
| 382 | @deffn {Scheme Procedure} declare-header! name parser validator writer @ |
| 383 | [#:multiple?=@code{#f}] |
| 384 | Declare a parser, validator, and writer for a given header. |
| 385 | @end deffn |
| 386 | |
| 387 | For example, let's say you are running a web server behind some sort of |
| 388 | proxy, and your proxy adds an @code{X-Client-Address} header, indicating |
| 389 | the IPv4 address of the original client. You would like for the HTTP |
| 390 | request record to parse out this header to a Scheme value, instead of |
| 391 | leaving it as a string. You could register this header with Guile's |
| 392 | HTTP stack like this: |
| 393 | |
| 394 | @example |
| 395 | (declare-header! "X-Client-Address" |
| 396 | (lambda (str) |
| 397 | (inet-aton str)) |
| 398 | (lambda (ip) |
| 399 | (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff))) |
| 400 | (lambda (ip port) |
| 401 | (display (inet-ntoa ip) port))) |
| 402 | @end example |
| 403 | |
| 404 | @deffn {Scheme Procedure} declare-opaque-header! name |
| 405 | A specialised version of @code{declare-header!} for the case in which |
| 406 | you want a header's value to be returned/written ``as-is''. |
| 407 | @end deffn |
| 408 | |
| 409 | @deffn {Scheme Procedure} valid-header? sym val |
| 410 | Return a true value if @var{val} is a valid Scheme value for the header |
| 411 | with name @var{sym}, or @code{#f} otherwise. |
| 412 | @end deffn |
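|  | |
|  | Assuming the @code{X-Client-Address} declaration shown earlier has been |
|  | evaluated, the registered parser and validator could be exercised |
|  | roughly like this. The address is an arbitrary example; |
|  | @code{inet-aton} yields the address as an integer. |
|  | |
|  | @example |
|  | ;; The header name is looked up by its lower-case symbol. |
|  | (parse-header 'x-client-address "127.0.0.1") |
|  | @result{} 2130706433 |
|  | |
|  | ;; The validator accepts parsed values and rejects raw strings. |
|  | (valid-header? 'x-client-address 2130706433) |
|  | @result{} #t |
|  | (valid-header? 'x-client-address "127.0.0.1") |
|  | @result{} #f |
|  | @end example |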
| 413 | |
| 414 | Now that we have a generic interface for reading and writing headers, we |
| 415 | do just that. |
| 416 | |
| 417 | @deffn {Scheme Procedure} read-header port |
| 418 | Read one HTTP header from @var{port}. Return two values: the header |
| 419 | name and the parsed Scheme value. May raise an exception if the header |
| 420 | was known but the value was invalid. |
| 421 | |
| 422 | Returns the end-of-file object for both values if the end of the message |
| 423 | headers was reached (i.e., a blank line). |
| 424 | @end deffn |
| 425 | |
| 426 | @deffn {Scheme Procedure} parse-header name val |
| 427 | Parse @var{val}, a string, with the parser for the header named |
| 428 | @var{name}. Returns the parsed value. |
| 429 | @end deffn |
| 430 | |
| 431 | @deffn {Scheme Procedure} write-header name val port |
| 432 | Write the given header name and value to @var{port}, using the writer |
| 433 | from @code{header-writer}. |
| 434 | @end deffn |
| 435 | |
| 436 | @deffn {Scheme Procedure} read-headers port |
| 437 | Read the headers of an HTTP message from @var{port}, returning them |
| 438 | as an ordered alist. |
| 439 | @end deffn |
| 440 | |
| 441 | @deffn {Scheme Procedure} write-headers headers port |
| 442 | Write the given header alist to @var{port}. Doesn't write the final |
| 443 | @samp{\r\n}, as the user might want to add another header. |
| 444 | @end deffn |
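|  | |
|  | As a small sketch of these procedures, the following reads a header |
|  | block from a string port and writes one header back out. The header |
|  | text is made up for the example; real code would normally read from a |
|  | socket port. |
|  | |
|  | @example |
|  | (use-modules (web http)) |
|  | |
|  | ;; Two headers followed by the blank line that ends the header block. |
|  | (define port |
|  | (open-input-string |
|  | "Content-Length: 300\r\nConnection: close\r\n\r\n")) |
|  | |
|  | (read-headers port) |
|  | @result{} ((content-length . 300) (connection close)) |
|  | |
|  | ;; Each header is written on its own line, ending in \r\n. |
|  | (write-headers '((content-length . 300)) (current-output-port)) |
|  | @print{} Content-Length: 300 |
|  | @end example |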
| 445 | |
| 446 | The @code{(web http)} module also has some utility procedures to read |
| 447 | and write request and response lines. |
| 448 | |
| 449 | @deffn {Scheme Procedure} parse-http-method str [start] [end] |
| 450 | Parse an HTTP method from @var{str}. The result is an upper-case symbol, |
| 451 | like @code{GET}. |
| 452 | @end deffn |
| 453 | |
| 454 | @deffn {Scheme Procedure} parse-http-version str [start] [end] |
| 455 | Parse an HTTP version from @var{str}, returning it as a major--minor |
| 456 | pair. For example, @code{HTTP/1.1} parses as the pair of integers, |
| 457 | @code{(1 . 1)}. |
| 458 | @end deffn |
| 459 | |
| 460 | @deffn {Scheme Procedure} parse-request-uri str [start] [end] |
| 461 | Parse a URI from an HTTP request line. Note that URIs in requests do not |
| 462 | have to have a scheme or host name. The result is a URI object. |
| 463 | @end deffn |
| 464 | |
| 465 | @deffn {Scheme Procedure} read-request-line port |
| 466 | Read the first line of an HTTP request from @var{port}, returning three |
| 467 | values: the method, the URI, and the version. |
| 468 | @end deffn |
| 469 | |
| 470 | @deffn {Scheme Procedure} write-request-line method uri version port |
| 471 | Write the first line of an HTTP request to @var{port}. |
| 472 | @end deffn |
| 473 | |
| 474 | @deffn {Scheme Procedure} read-response-line port |
| 475 | Read the first line of an HTTP response from @var{port}, returning three |
| 476 | values: the HTTP version, the response code, and the ``reason phrase''. |
| 477 | @end deffn |
| 478 | |
| 479 | @deffn {Scheme Procedure} write-response-line version code reason-phrase port |
| 480 | Write the first line of an HTTP response to @var{port}. |
| 481 | @end deffn |
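|  | |
|  | For illustration, the parsers above might be used as follows. The |
|  | response line is a made-up example, and the three values returned by |
|  | @code{read-response-line} are collected into a list only so that they |
|  | can be shown together. |
|  | |
|  | @example |
|  | (use-modules (web http)) |
|  | |
|  | (parse-http-method "GET") |
|  | @result{} GET |
|  | (parse-http-version "HTTP/1.1") |
|  | @result{} (1 . 1) |
|  | |
|  | ;; Read a status line from a string port. |
|  | (define port (open-input-string "HTTP/1.1 200 OK\r\n")) |
|  | (call-with-values (lambda () (read-response-line port)) list) |
|  | @result{} ((1 . 1) 200 "OK") |
|  | @end example |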
| 482 | |
| 483 | |
| 484 | @node HTTP Headers |
| 485 | @subsection HTTP Headers |
| 486 | |
| 487 | In addition to defining the infrastructure to parse headers, the |
| 488 | @code{(web http)} module defines specific parsers and unparsers for all |
| 489 | headers defined in the HTTP/1.1 standard. |
| 490 | |
| 491 | For example, if you receive a header named @samp{Accept-Language} with a |
| 492 | value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined |
| 493 | below): |
| 494 | |
| 495 | @example |
| 496 | (parse-header 'accept-language "en, es;q=0.8") |
| 497 | @result{} ((1000 . "en") (800 . "es")) |
| 498 | @end example |
| 499 | |
| 500 | The format of the value for @samp{Accept-Language} headers is defined |
| 501 | below, along with all other headers defined in the HTTP standard. (If |
| 502 | the header were unknown, the value would have been returned as a |
| 503 | string.) |
| 504 | |
| 505 | For brevity, the header definitions below are given in the form, |
| 506 | @var{Type} @code{@var{name}}, indicating that values for the header |
| 507 | @code{@var{name}} will be of the given @var{Type}. Since Guile |
| 508 | internally treats header names in lower case, in this document we give |
| 509 | types title-cased names. A short description of each header's |
| 510 | purpose and an example follow. |
| 511 | |
| 512 | For full details on the meanings of all of these headers, see the HTTP |
| 513 | 1.1 standard, RFC 2616. |
| 514 | |
| 515 | @subsubsection HTTP Header Types |
| 516 | |
| 517 | Here we define the types that are used below, when defining headers. |
| 518 | |
| 519 | @deftp {HTTP Header Type} Date |
| 520 | A SRFI-19 date. |
| 521 | @end deftp |
| 522 | |
| 523 | @deftp {HTTP Header Type} KVList |
| 524 | A list whose elements are keys or key-value pairs. Keys are parsed to |
| 525 | symbols. Values are strings by default. Non-string values are the |
| 526 | exception, and are mentioned explicitly below, as appropriate. |
| 527 | @end deftp |
| 528 | |
| 529 | @deftp {HTTP Header Type} SList |
| 530 | A list of strings. |
| 531 | @end deftp |
| 532 | |
| 533 | @deftp {HTTP Header Type} Quality |
| 534 | An exact integer between 0 and 1000. Qualities are used to express |
| 535 | preference, given multiple options. An option with a quality of 870, |
| 536 | for example, is preferred over an option with quality 500. |
| 537 | |
| 538 | (Qualities are written out over the wire as numbers between 0.0 and |
| 539 | 1.0, but since the standard only allows three digits after the decimal, |
| 540 | it's equivalent to integers between 0 and 1000, so that's what Guile |
| 541 | uses.) |
| 542 | @end deftp |
| 543 | |
| 544 | @deftp {HTTP Header Type} QList |
| 545 | A quality list: a list of pairs, the car of which is a quality, and the |
| 546 | cdr a string. Used to express a list of options, along with their |
| 547 | qualities. |
| 548 | @end deftp |
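|  | |
|  | Because a quality list is an ordinary list of pairs, the usual list |
|  | procedures apply to it. As a minimal sketch, the hypothetical helper |
|  | @code{preferred-option} below (not part of @code{(web http)}) picks the |
|  | option with the highest quality: |
|  | |
|  | @example |
|  | (use-modules (web http)) |
|  | |
|  | ;; Hypothetical helper: return the option with the highest quality, |
|  | ;; or #f if the list is empty. |
|  | (define (preferred-option qlist) |
|  | (and (pair? qlist) |
|  | (cdr (car (sort qlist |
|  | (lambda (a b) (> (car a) (car b)))))))) |
|  | |
|  | (preferred-option (parse-header 'accept-language "en, es;q=0.8")) |
|  | @result{} "en" |
|  | @end example |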
| 549 | |
| 550 | @deftp {HTTP Header Type} ETag |
| 551 | An entity tag, represented as a pair. The car of the pair is an opaque |
| 552 | string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity |
| 553 | tag, and @code{#f} otherwise. |
| 554 | @end deftp |
| 555 | |
| 556 | @subsubsection General Headers |
| 557 | |
| 558 | General HTTP headers may be present in any HTTP message. |
| 559 | |
| 560 | @deftypevr {HTTP Header} KVList cache-control |
| 561 | A key-value list of cache-control directives. See RFC 2616, for more |
| 562 | details. |
| 563 | |
| 564 | If present, parameters to @code{max-age}, @code{max-stale}, |
| 565 | @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative |
| 566 | integers. |
| 567 | |
| 568 | If present, parameters to @code{private} and @code{no-cache} are parsed |
| 569 | as lists of header names, as symbols. |
| 570 | |
| 571 | @example |
| 572 | (parse-header 'cache-control "no-cache,no-store") |
| 573 | @result{} (no-cache no-store) |
| 574 | (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store") |
| 575 | @result{} ((no-cache . (authorization date)) no-store) |
| 576 | (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10") |
| 577 | @result{} ((no-cache . (authorization date)) (max-age . 10)) |
| 578 | @end example |
| 579 | @end deftypevr |
| 580 | |
| 581 | @deftypevr {HTTP Header} List connection |
| 582 | A list of header names that apply only to this HTTP connection, as |
| 583 | symbols. Additionally, the symbol @samp{close} may be present, to |
| 584 | indicate that the server should close the connection after responding to |
| 585 | the request. |
| 586 | @example |
| 587 | (parse-header 'connection "close") |
| 588 | @result{} (close) |
| 589 | @end example |
| 590 | @end deftypevr |
| 591 | |
| 592 | @deftypevr {HTTP Header} Date date |
| 593 | The date that a given HTTP message was originated. |
| 594 | @example |
| 595 | (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT") |
| 596 | @result{} #<date ...> |
| 597 | @end example |
| 598 | @end deftypevr |
| 599 | |
| 600 | @deftypevr {HTTP Header} KVList pragma |
| 601 | A key-value list of implementation-specific directives. |
| 602 | @example |
| 603 | (parse-header 'pragma "no-cache, broccoli=tasty") |
| 604 | @result{} (no-cache (broccoli . "tasty")) |
| 605 | @end example |
| 606 | @end deftypevr |
| 607 | |
| 608 | @deftypevr {HTTP Header} List trailer |
| 609 | A list of header names which will appear after the message body, instead |
| 610 | of with the message headers. |
| 611 | @example |
| 612 | (parse-header 'trailer "ETag") |
| 613 | @result{} (etag) |
| 614 | @end example |
| 615 | @end deftypevr |
| 616 | |
| 617 | @deftypevr {HTTP Header} List transfer-encoding |
| 618 | A list of transfer codings, expressed as key-value lists. The only |
| 619 | transfer coding defined by the specification is @code{chunked}. |
| 620 | @example |
| 621 | (parse-header 'transfer-encoding "chunked") |
| 622 | @result{} ((chunked)) |
| 623 | @end example |
| 624 | @end deftypevr |
| 625 | |
| 626 | @deftypevr {HTTP Header} List upgrade |
| 627 | A list of strings, indicating additional protocols that a server could use |
| 628 | in response to a request. |
| 629 | @example |
| 630 | (parse-header 'upgrade "WebSocket") |
| 631 | @result{} ("WebSocket") |
| 632 | @end example |
| 633 | @end deftypevr |
| 634 | |
| 635 | @c FIXME: parse out more fully? |
| 636 | @deftypevr {HTTP Header} List via |
| 637 | A list of strings, indicating the protocol versions and hosts of |
| 638 | intermediate servers and proxies. There may be multiple @code{via} |
| 639 | headers in one message. |
| 640 | @example |
| 641 | (parse-header 'via "1.0 venus, 1.1 mars") |
| 642 | @result{} ("1.0 venus" "1.1 mars") |
| 643 | @end example |
| 644 | @end deftypevr |
| 645 | |
| 646 | @deftypevr {HTTP Header} List warning |
| 647 | A list of warnings given by a server or intermediate proxy. Each |
| 648 | warning is itself a list of four elements: a code, as an exact integer |
| 649 | between 0 and 1000, a host as a string, the warning text as a string, |
| 650 | and either @code{#f} or a SRFI-19 date. |
| 651 | |
| 652 | There may be multiple @code{warning} headers in one message. |
| 653 | @example |
| 654 | (parse-header 'warning "123 foo \"core breach imminent\"") |
| 655 | @result{} ((123 "foo" "core breach imminent" #f)) |
| 656 | @end example |
| 657 | @end deftypevr |
| 658 | |
| 659 | |
| 660 | @subsubsection Entity Headers |
| 661 | |
| 662 | Entity headers may be present in any HTTP message, and refer to the |
| 663 | resource referenced in the HTTP request or response. |
| 664 | |
| 665 | @deftypevr {HTTP Header} List allow |
| 666 | A list of allowed methods on a given resource, as symbols. |
| 667 | @example |
| 668 | (parse-header 'allow "GET, HEAD") |
| 669 | @result{} (GET HEAD) |
| 670 | @end example |
| 671 | @end deftypevr |
| 672 | |
| 673 | @deftypevr {HTTP Header} List content-encoding |
| 674 | A list of content codings, as symbols. |
| 675 | @example |
| 676 | (parse-header 'content-encoding "gzip") |
| 677 | @result{} (gzip) |
| 678 | @end example |
| 679 | @end deftypevr |
| 680 | |
| 681 | @deftypevr {HTTP Header} List content-language |
| 682 | The languages that a resource is in, as strings. |
| 683 | @example |
| 684 | (parse-header 'content-language "en") |
| 685 | @result{} ("en") |
| 686 | @end example |
| 687 | @end deftypevr |
| 688 | |
| 689 | @deftypevr {HTTP Header} UInt content-length |
| 690 | The number of bytes in a resource, as an exact, non-negative integer. |
| 691 | @example |
| 692 | (parse-header 'content-length "300") |
| 693 | @result{} 300 |
| 694 | @end example |
| 695 | @end deftypevr |
| 696 | |
| 697 | @deftypevr {HTTP Header} URI content-location |
| 698 | The canonical URI for a resource, in the case that it is also accessible |
| 699 | from a different URI. |
| 700 | @example |
| 701 | (parse-header 'content-location "http://example.com/foo") |
| 702 | @result{} #<<uri> ...> |
| 703 | @end example |
| 704 | @end deftypevr |
| 705 | |
| 706 | @deftypevr {HTTP Header} String content-md5 |
| 707 | The MD5 digest of a resource. |
| 708 | @example |
| 709 | (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5") |
| 710 | @result{} "ffaea1a79810785575e29e2bd45e2fa5" |
| 711 | @end example |
| 712 | @end deftypevr |
| 713 | |
| 714 | @deftypevr {HTTP Header} List content-range |
| 715 | A range specification, as a list of three elements: the symbol |
| 716 | @code{bytes}, either the symbol @code{*} or a pair of integers, |
| 717 | indicating the byte range, and either @code{*} or an integer, for the |
| 718 | instance length. Used to indicate that a response only includes part of |
| 719 | a resource. |
| 720 | @example |
| 721 | (parse-header 'content-range "bytes 10-20/*") |
| 722 | @result{} (bytes (10 . 20) *) |
| 723 | @end example |
| 724 | @end deftypevr |
| 725 | |
| 726 | @deftypevr {HTTP Header} List content-type |
| 727 | The MIME type of a resource, as a symbol, along with any parameters. |
| 728 | @example |
| 729 | (parse-header 'content-type "text/plain") |
| 730 | @result{} (text/plain) |
| 731 | (parse-header 'content-type "text/plain;charset=utf-8") |
| 732 | @result{} (text/plain (charset . "utf-8")) |
| 733 | @end example |
| 734 | Note that the @code{charset} parameter is something of a misnomer, and |
| 735 | the HTTP specification admits this. It specifies the @emph{encoding} of |
| 736 | the characters, not the character set. |
| 737 | @end deftypevr |
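
As a small illustration of working with the parsed value, the following
sketch pulls the media type and the @code{charset} parameter out of a
@code{content-type} header.  The helper name @code{content-type-charset}
is hypothetical and is not part of Guile.

```scheme
(use-modules (web http))

;; Hypothetical helper: return the charset parameter of a parsed
;; content-type value, or DEFAULT if the parameter is absent.
(define* (content-type-charset parsed #:optional (default #f))
  (or (assq-ref (cdr parsed) 'charset) default))

(define parsed (parse-header 'content-type "text/plain;charset=utf-8"))

(car parsed)                                      ; the media type, text/plain
(content-type-charset parsed)                     ; "utf-8"
(content-type-charset '(text/html) "iso-8859-1")  ; no parameter, so the default
```

The parameters form an ordinary association list keyed by symbols, so
@code{assq-ref} is enough to look them up.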
| 738 | |
| 739 | @deftypevr {HTTP Header} Date expires |
| 740 | The date/time after which the resource given in a response is considered |
| 741 | stale. |
| 742 | @example |
| 743 | (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT") |
| 744 | @result{} #<date ...> |
| 745 | @end example |
| 746 | @end deftypevr |
| 747 | |
| 748 | @deftypevr {HTTP Header} Date last-modified |
| 749 | The date/time on which the resource given in a response was last |
| 750 | modified. |
| 751 | @example |
| 752 | (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT") |
| 753 | @result{} #<date ...> |
| 754 | @end example |
| 755 | @end deftypevr |
| 756 | |
| 757 | |
| 758 | @subsubsection Request Headers |
| 759 | |
| 760 | Request headers may only appear in an HTTP request, not in a response. |
| 761 | |
| 762 | @deftypevr {HTTP Header} List accept |
| 763 | A list of preferred media types for a response. Each element of the |
| 764 | list is itself a list, in the same format as @code{content-type}. |
| 765 | @example |
| 766 | (parse-header 'accept "text/html,text/plain;charset=utf-8") |
| 767 | @result{} ((text/html) (text/plain (charset . "utf-8"))) |
| 768 | @end example |
| 769 | Preference is expressed with quality values: |
| 770 | @example |
| 771 | (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6") |
| 772 | @result{} ((text/html (q . 800)) (text/plain (q . 600))) |
| 773 | @end example |
| 774 | @end deftypevr |
| 775 | |
| 776 | @deftypevr {HTTP Header} QList accept-charset |
| 777 | A quality list of acceptable charsets. Note again that what HTTP calls |
| 778 | a ``charset'' is what Guile calls a ``character encoding''. |
| 779 | @example |
| 780 | (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8") |
| 781 | @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1")) |
| 782 | @end example |
| 783 | @end deftypevr |
| 784 | |
| 785 | @deftypevr {HTTP Header} QList accept-encoding |
| 786 | A quality list of acceptable content codings. |
| 787 | @example |
| 788 | (parse-header 'accept-encoding "gzip,identity;q=0.8") |
| 789 | @result{} ((1000 . "gzip") (800 . "identity")) |
| 790 | @end example |
| 791 | @end deftypevr |
| 792 | |
| 793 | @deftypevr {HTTP Header} QList accept-language |
| 794 | A quality list of acceptable languages. |
| 795 | @example |
| 796 | (parse-header 'accept-language "cn,en;q=0.75") |
| 797 | @result{} ((1000 . "cn") (750 . "en")) |
| 798 | @end example |
| 799 | @end deftypevr |
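
Since a quality list is a list of @code{(@var{quality} . @var{value})}
pairs, choosing the preferred entry is a matter of comparing the cars.
The sketch below assumes a hypothetical helper named
@code{best-quality}; it is not provided by Guile.

```scheme
(use-modules (web http))

;; Hypothetical helper: return the value with the highest quality in
;; QLIST, or #f for an empty list.  Entries are (quality . value) pairs.
(define (best-quality qlist)
  (and (pair? qlist)
       (cdr (car (sort qlist (lambda (a b) (> (car a) (car b))))))))

;; "de" carries no explicit q parameter, so its quality defaults to 1000.
(best-quality (parse-header 'accept-language "en;q=0.75,de"))  ; "de"
(best-quality '())                                             ; #f
```

@code{sort} returns a fresh list, so the parsed header value itself is
left untouched.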
| 800 | |
| 801 | @deftypevr {HTTP Header} Pair authorization |
| 802 | Authorization credentials. The car of the pair indicates the |
| 803 | authentication scheme, like @code{basic}. For basic authentication, the |
| 804 | cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}} |
| 805 | string. For other authentication schemes, like @code{digest}, the cdr |
| 806 | will be a key-value list of credentials. |
| 807 | @example |
| 808 | (parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==") |
| 809 | @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==") |
| 810 | @end example |
| 811 | @end deftypevr |
| 812 | |
| 813 | @deftypevr {HTTP Header} List expect |
| 814 | A list of expectations that a client has of a server. The expectations |
| 815 | are key-value lists. |
| 816 | @example |
| 817 | (parse-header 'expect "100-continue") |
| 818 | @result{} ((100-continue)) |
| 819 | @end example |
| 820 | @end deftypevr |
| 821 | |
| 822 | @deftypevr {HTTP Header} String from |
| 823 | The email address of a user making an HTTP request. |
| 824 | @example |
| 825 | (parse-header 'from "bob@@example.com") |
| 826 | @result{} "bob@@example.com" |
| 827 | @end example |
| 828 | @end deftypevr |
| 829 | |
| 830 | @deftypevr {HTTP Header} Pair host |
| 831 | The host for the resource being requested, as a hostname-port pair. If |
| 832 | no port is given, the port is @code{#f}. |
| 833 | @example |
| 834 | (parse-header 'host "gnu.org:80") |
| 835 | @result{} ("gnu.org" . 80) |
| 836 | (parse-header 'host "gnu.org") |
| 837 | @result{} ("gnu.org" . #f) |
| 838 | @end example |
| 839 | @end deftypevr |
| 840 | |
| 841 | @deftypevr {HTTP Header} *|List if-match |
| 842 | A set of etags, indicating that the request should proceed if and only |
| 843 | if the etag of the resource is in that set. Either the symbol @code{*}, |
| 844 | indicating any etag, or a list of entity tags. |
| 845 | @example |
| 846 | (parse-header 'if-match "*") |
| 847 | @result{} * |
| 848 | (parse-header 'if-match "\"asdfadf\"") |
| 849 | @result{} (("asdfadf" . #t)) |
| 850 | (parse-header 'if-match "W/\"asdfadf\"") |
| 851 | @result{} (("asdfadf" . #f)) |
| 852 | @end example |
| 853 | @end deftypevr |
| 854 | |
| 855 | @deftypevr {HTTP Header} Date if-modified-since |
| 856 | Indicates that a response should proceed if and only if the resource has |
| 857 | been modified since the given date. |
| 858 | @example |
| 859 | (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT") |
| 860 | @result{} #<date ...> |
| 861 | @end example |
| 862 | @end deftypevr |
| 863 | |
| 864 | @deftypevr {HTTP Header} *|List if-none-match |
| 865 | A set of etags, indicating that the request should proceed if and only |
| 866 | if the etag of the resource is not in the set. Either the symbol |
| 867 | @code{*}, indicating any etag, or a list of entity tags. |
| 868 | @example |
| 869 | (parse-header 'if-none-match "*") |
| 870 | @result{} * |
| 871 | @end example |
| 872 | @end deftypevr |
| 873 | |
| 874 | @deftypevr {HTTP Header} ETag|Date if-range |
| 875 | Indicates that the range request should proceed if and only if the |
| 876 | resource matches a modification date or an etag. Either an entity tag, |
| 877 | or a SRFI-19 date. |
| 878 | @example |
| 879 | (parse-header 'if-range "\"original-etag\"") |
| 880 | @result{} ("original-etag" . #t) |
| 881 | @end example |
| 882 | @end deftypevr |
| 883 | |
| 884 | @deftypevr {HTTP Header} Date if-unmodified-since |
| 885 | Indicates that a response should proceed if and only if the resource has |
| 886 | not been modified since the given date. |
| 887 | @example |
| 888 | (parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT") |
| 889 | @result{} #<date ...> |
| 890 | @end example |
| 891 | @end deftypevr |
| 892 | |
| 893 | @deftypevr {HTTP Header} UInt max-forwards |
| 894 | The maximum number of proxy or gateway hops that a request should be |
| 895 | subject to. |
| 896 | @example |
| 897 | (parse-header 'max-forwards "10") |
| 898 | @result{} 10 |
| 899 | @end example |
| 900 | @end deftypevr |
| 901 | |
| 902 | @deftypevr {HTTP Header} Pair proxy-authorization |
| 903 | Authorization credentials for a proxy connection. See the documentation |
| 904 | for @code{authorization} above for more information on the format. |
| 905 | @example |
| 906 | (parse-header 'proxy-authorization "Digest foo=bar,baz=qux") |
| 907 | @result{} (digest (foo . "bar") (baz . "qux")) |
| 908 | @end example |
| 909 | @end deftypevr |
| 910 | |
| 911 | @deftypevr {HTTP Header} Pair range |
| 912 | A range request, indicating that the client wants only part of a |
| 913 | resource. The car of the pair is the symbol @code{bytes}, and the cdr |
| 914 | is a list of pairs. Each element of the cdr indicates a range; the car |
| 915 | is the first byte position and the cdr is the last byte position, as |
| 916 | integers, or @code{#f} if not given. |
| 917 | @example |
| 918 | (parse-header 'range "bytes=10-30,50-") |
| 919 | @result{} (bytes (10 . 30) (50 . #f)) |
| 920 | @end example |
| 921 | @end deftypevr |
| 922 | |
| 923 | @deftypevr {HTTP Header} URI referer |
| 924 | The URI of the resource that referred the user to this resource. The |
| 925 | name of the header is a misspelling, but we are stuck with it. |
| 926 | @example |
| 927 | (parse-header 'referer "http://www.gnu.org/") |
| 928 | @result{} #<uri ...> |
| 929 | @end example |
| 930 | @end deftypevr |
| 931 | |
| 932 | @deftypevr {HTTP Header} List te |
| 933 | A list of transfer codings, expressed as key-value lists. A common |
| 934 | transfer coding is @code{trailers}. |
| 935 | @example |
| 936 | (parse-header 'te "trailers") |
| 937 | @result{} ((trailers)) |
| 938 | @end example |
| 939 | @end deftypevr |
| 940 | |
| 941 | @deftypevr {HTTP Header} String user-agent |
| 942 | A string indicating the user agent making the request. The |
| 943 | specification defines a structured format for this header, but it is |
| 944 | widely disregarded, so Guile does not attempt to parse strictly. |
| 945 | @example |
| 946 | (parse-header 'user-agent "Mozilla/5.0") |
| 947 | @result{} "Mozilla/5.0" |
| 948 | @end example |
| 949 | @end deftypevr |
| 950 | |
| 951 | |
| 952 | @subsubsection Response Headers |
| 953 | |
| 954 | @deftypevr {HTTP Header} List accept-ranges |
| 955 | A list of range units that the server supports, as symbols. |
| 956 | @example |
| 957 | (parse-header 'accept-ranges "bytes") |
| 958 | @result{} (bytes) |
| 959 | @end example |
| 960 | @end deftypevr |
| 961 | |
| 962 | @deftypevr {HTTP Header} UInt age |
| 963 | The age of a cached response, in seconds. |
| 964 | @example |
| 965 | (parse-header 'age "3600") |
| 966 | @result{} 3600 |
| 967 | @end example |
| 968 | @end deftypevr |
| 969 | |
| 970 | @deftypevr {HTTP Header} ETag etag |
| 971 | The entity-tag of the resource. |
| 972 | @example |
| 973 | (parse-header 'etag "\"foo\"") |
| 974 | @result{} ("foo" . #t) |
| 975 | @end example |
| 976 | @end deftypevr |
| 977 | |
| 978 | @deftypevr {HTTP Header} URI location |
| 979 | A URI on which a request may be completed. Used in combination with a |
| 980 | redirecting status code to perform client-side redirection. |
| 981 | @example |
| 982 | (parse-header 'location "http://example.com/other") |
| 983 | @result{} #<uri ...> |
| 984 | @end example |
| 985 | @end deftypevr |
| 986 | |
| 987 | @deftypevr {HTTP Header} List proxy-authenticate |
| 988 | A list of challenges to a proxy, indicating the need for authentication. |
| 989 | @example |
| 990 | (parse-header 'proxy-authenticate "Basic realm=\"foo\"") |
| 991 | @result{} ((basic (realm . "foo"))) |
| 992 | @end example |
| 993 | @end deftypevr |
| 994 | |
| 995 | @deftypevr {HTTP Header} UInt|Date retry-after |
| 996 | Used in combination with a server-busy status code, like 503, to |
| 997 | indicate that a client should retry later. Either a number of seconds, |
| 998 | or a date. |
| 999 | @example |
| 1000 | (parse-header 'retry-after "60") |
| 1001 | @result{} 60 |
| 1002 | @end example |
| 1003 | @end deftypevr |
| 1004 | |
| 1005 | @deftypevr {HTTP Header} String server |
| 1006 | A string identifying the server. |
| 1007 | @example |
| 1008 | (parse-header 'server "My first web server") |
| 1009 | @result{} "My first web server" |
| 1010 | @end example |
| 1011 | @end deftypevr |
| 1012 | |
| 1013 | @deftypevr {HTTP Header} *|List vary |
| 1014 | A set of request headers that were used in computing this response. |
| 1015 | Used to indicate that server-side content negotiation was performed, for |
| 1016 | example in response to the @code{accept-language} header. Can also be |
| 1017 | the symbol @code{*}, indicating that all headers were considered. |
| 1018 | @example |
| 1019 | (parse-header 'vary "Accept-Language, Accept") |
| 1020 | @result{} (accept-language accept) |
| 1021 | @end example |
| 1022 | @end deftypevr |
| 1023 | |
| 1024 | @deftypevr {HTTP Header} List www-authenticate |
| 1025 | A list of challenges to a user, indicating the need for authentication. |
| 1026 | @example |
| 1027 | (parse-header 'www-authenticate "Basic realm=\"foo\"") |
| 1028 | @result{} ((basic (realm . "foo"))) |
| 1029 | @end example |
| 1030 | @end deftypevr |
| 1031 | |
| 1032 | @node Transfer Codings |
| 1033 | @subsection Transfer Codings |
| 1034 | |
| 1035 | HTTP 1.1 allows for various transfer codings to be applied to message |
| 1036 | bodies. These include various types of compression, and HTTP chunked |
| 1037 | encoding. Currently, only chunked encoding is supported by Guile. |
| 1038 | |
| 1039 | Chunked coding is an optional coding that may be applied to message |
| 1040 | bodies, to allow messages whose length is not known beforehand to be |
| 1041 | returned. Such messages can be split into chunks, terminated by a final |
| 1042 | zero length chunk. |
| 1043 | |
| 1044 | To make dealing with encodings simpler, Guile provides |
| 1045 | procedures to create ports that ``wrap'' existing ports, applying |
| 1046 | transformations transparently under the hood. |
| 1047 | |
| 1048 | These procedures are in the @code{(web http)} module. |
| 1049 | |
| 1050 | @example |
| 1051 | (use-modules (web http)) |
| 1052 | @end example |
| 1053 | |
| 1054 | @deffn {Scheme Procedure} make-chunked-input-port port [#:keep-alive?=#f] |
| 1055 | Returns a new port, that transparently reads and decodes chunk-encoded |
| 1056 | data from @var{port}. If no more chunk-encoded data is available, it |
| 1057 | returns the end-of-file object. When the port is closed, @var{port} will |
| 1058 | also be closed, unless @var{keep-alive?} is true. |
| 1059 | @end deffn |
| 1060 | |
| 1061 | @example |
| 1062 | (use-modules (ice-9 rdelim)) |
| 1063 | |
| 1064 | (define s "5\r\nFirst\r\nA\r\n line\n Sec\r\n8\r\nond line\r\n0\r\n") |
| 1065 | (define p (make-chunked-input-port (open-input-string s))) |
| 1066 | (read-line p) |
| 1067 | @result{} "First line" |
| 1068 | (read-line p) |
| 1069 | @result{} " Second line" |
| 1070 | @end example |
| 1071 | |
| 1072 | @deffn {Scheme Procedure} make-chunked-output-port port [#:keep-alive?=#f] |
| 1073 | Returns a new port, which transparently encodes data as chunk-encoded |
| 1074 | before writing it to @var{port}. Whenever a write occurs on this port, |
| 1075 | it buffers it, until the port is flushed, at which point it writes a |
| 1076 | chunk containing all the data written so far. When the port is closed, |
| 1077 | the data remaining is written to @var{port}, as is the terminating zero |
| 1078 | chunk. It also causes @var{port} to be closed, unless @var{keep-alive?} |
| 1079 | is true. |
| 1080 | |
| 1081 | Note: forcing a chunked output port when no data is buffered does not |
| 1082 | write a zero chunk, as this would cause the data to be |
| 1083 | interpreted incorrectly by the client. |
| 1084 | @end deffn |
| 1085 | |
| 1086 | @example |
| 1087 | (call-with-output-string |
| 1088 | (lambda (out) |
| 1089 | (define out* (make-chunked-output-port out #:keep-alive? #t)) |
| 1090 | (display "first chunk" out*) |
| 1091 | (force-output out*) |
| 1092 | (force-output out*) ; note this does not write a zero chunk |
| 1093 | (display "second chunk" out*) |
| 1094 | (close-port out*))) |
| 1095 | @result{} "b\r\nfirst chunk\r\nc\r\nsecond chunk\r\n0\r\n" |
| 1096 | @end example |
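
The two kinds of port are inverses of each other.  As a rough sketch,
data written through a chunked output port can be fed back through a
chunked input port to recover the original text.  The variable names
here are illustrative only, and @code{get-string-all} is taken from
@code{(rnrs io ports)}.

```scheme
(use-modules (web http)
             ((rnrs io ports) #:select (get-string-all)))

;; Encode two writes as two chunks, followed by the terminating zero chunk.
(define encoded
  (call-with-output-string
    (lambda (out)
      (let ((chunked (make-chunked-output-port out #:keep-alive? #t)))
        (display "hello, " chunked)
        (force-output chunked)
        (display "world" chunked)
        (close-port chunked)))))

;; Decode again; the chunk boundaries are invisible to the reader.
(define decoded
  (get-string-all
    (make-chunked-input-port (open-input-string encoded))))

;; decoded is expected to be "hello, world"
```

Passing @code{#:keep-alive? #t} on the output side keeps the underlying
string port open, so that @code{call-with-output-string} can still
collect what was written.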
| 1097 | |
| 1098 | @node Requests |
| 1099 | @subsection HTTP Requests |
| 1100 | |
| 1101 | @example |
| 1102 | (use-modules (web request)) |
| 1103 | @end example |
| 1104 | |
| 1105 | The request module contains a data type for HTTP requests. |
| 1106 | |
| 1107 | @subsubsection An Important Note on Character Sets |
| 1108 | |
| 1109 | HTTP requests consist of two parts: the request proper, consisting of a |
| 1110 | request line and a set of headers, and (optionally) a body. The body |
| 1111 | might have a binary content-type, and even in the textual case its |
| 1112 | length is specified in bytes, not characters. |
| 1113 | |
| 1114 | Therefore, HTTP is a fundamentally binary protocol. However, the request |
| 1115 | line and headers are specified to be in a subset of ASCII, so they can |
| 1116 | be treated as text, provided that the port's encoding is set to an |
| 1117 | ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1) |
| 1118 | is just such an encoding, and happens to be very efficient for Guile. |
| 1119 | |
| 1120 | So what Guile does when reading requests from the wire, or writing them |
| 1121 | out, is to set the port's encoding to latin-1 and treat the request |
| 1122 | headers as text. |
| 1123 | |
| 1124 | The request body is another issue. For binary data, the data is |
| 1125 | probably in a bytevector, so we use the R6RS binary output procedures to |
| 1126 | write out the binary payload. Textual data usually has to be written |
| 1127 | out to some character encoding, usually UTF-8, and then the resulting |
| 1128 | bytevector is written out to the port. |
| 1129 | |
| 1130 | In summary, Guile reads and writes HTTP over latin-1 sockets, without |
| 1131 | any loss of generality. |
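
The difference between characters and bytes matters as soon as a
textual body contains non-ASCII characters.  The following sketch,
using @code{string->utf8} from @code{(rnrs bytevectors)}, shows that
the length to advertise is that of the encoded bytevector and not that
of the string.  The sample text is arbitrary.

```scheme
(use-modules (rnrs bytevectors))

;; Two characters: GREEK SMALL LETTER LAMDA (U+03BB) followed by "x".
(define text "\u03bbx")

;; Encode the text first; the bytevector is what goes over the wire.
(define body (string->utf8 text))

(string-length text)      ; 2 characters
(bytevector-length body)  ; 3 bytes, the value to use for content-length
```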
| 1132 | |
| 1133 | @subsubsection Request API |
| 1134 | |
| 1135 | @deffn {Scheme Procedure} request? obj |
| 1136 | @deffnx {Scheme Procedure} request-method request |
| 1137 | @deffnx {Scheme Procedure} request-uri request |
| 1138 | @deffnx {Scheme Procedure} request-version request |
| 1139 | @deffnx {Scheme Procedure} request-headers request |
| 1140 | @deffnx {Scheme Procedure} request-meta request |
| 1141 | @deffnx {Scheme Procedure} request-port request |
| 1142 | A predicate and field accessors for the request type. The fields are as |
| 1143 | follows: |
| 1144 | @table @code |
| 1145 | @item method |
| 1146 | The HTTP method, for example, @code{GET}. |
| 1147 | @item uri |
| 1148 | The URI as a URI record. |
| 1149 | @item version |
| 1150 | The HTTP version pair, like @code{(1 . 1)}. |
| 1151 | @item headers |
| 1152 | The request headers, as an alist of parsed values. |
| 1153 | @item meta |
| 1154 | An arbitrary alist of other data, for example information returned in |
| 1155 | the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and |
| 1156 | Communication}). |
| 1157 | @item port |
| 1158 | The port on which to read or write a request body, if any. |
| 1159 | @end table |
| 1160 | @end deffn |
| 1161 | |
| 1162 | @deffn {Scheme Procedure} read-request port [meta='()] |
| 1163 | Read an HTTP request from @var{port}, optionally attaching the given |
| 1164 | metadata, @var{meta}. |
| 1165 | |
| 1166 | As a side effect, sets the encoding on @var{port} to ISO-8859-1 |
| 1167 | (latin-1), so that reading one character reads one byte. See the |
| 1168 | discussion of character sets above, for more information. |
| 1169 | |
| 1170 | Note that the body is not part of the request. Once you have read a |
| 1171 | request, you may read the body separately, and likewise for writing |
| 1172 | requests. |
| 1173 | @end deffn |
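
For experimentation, a request can be read from a string port instead
of a socket.  The sketch below assumes a minimal hand-written request;
the host name and path are placeholders.

```scheme
(use-modules (web request)
             (web uri))

;; A minimal request as it might appear on the wire.
(define wire "GET /index.html HTTP/1.1\r\nHost: www.gnu.org\r\n\r\n")

(define req (read-request (open-input-string wire)))

(request-method req)             ; GET
(request-version req)            ; (1 . 1)
(uri-path (request-uri req))     ; "/index.html"
(request-host req)               ; ("www.gnu.org" . #f)
(request-user-agent req "none")  ; "none", since the header is absent
```

The header accessors used at the end are the dedicated ones from
@code{(web request)}; their optional second argument is the value
returned when the header is missing.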
| 1174 | |
| 1175 | @deffn {Scheme Procedure} build-request uri [#:method='GET] @ |
| 1176 | [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] @ |
| 1177 | [#:validate-headers?=#t] |
| 1178 | Construct an HTTP request object. If @var{validate-headers?} is true, |
| 1179 | the headers are each run through their respective validators. |
| 1180 | @end deffn |
| 1181 | |
| 1182 | @deffn {Scheme Procedure} write-request r port |
| 1183 | Write the given HTTP request to @var{port}. |
| 1184 | |
| 1185 | Return a new request, whose @code{request-port} will continue writing |
| 1186 | on @var{port}, perhaps using some transfer encoding. |
| 1187 | @end deffn |
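
As a sketch of how @code{build-request} and @code{write-request} fit
together, the following builds a @code{GET} request by hand and
serializes it to a string.  The URI and header values are arbitrary
examples, and the @code{host} header is given explicitly rather than
relying on any defaulting.

```scheme
(use-modules (web request)
             (web uri))

(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((host . ("www.gnu.org" . #f))
                             (accept . ((text/html))))))

;; Serialize the request line and headers; a GET request has no body.
(define wire
  (call-with-output-string
    (lambda (port)
      (write-request req port))))

;; wire now starts with the request line, followed by the headers
;; and an empty line.
```

Note that the header values are given in their parsed form, which is
what the validators check when @code{#:validate-headers?} is true.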
| 1188 | |
| 1189 | @deffn {Scheme Procedure} read-request-body r |
| 1190 | Reads the request body from @var{r}, as a bytevector. Return @code{#f} |
| 1191 | if there was no request body. |
| 1192 | @end deffn |
| 1193 | |
| 1194 | @deffn {Scheme Procedure} write-request-body r bv |
| 1195 | Write @var{bv}, a bytevector, to the port corresponding to the HTTP |
| 1196 | request @var{r}. |
| 1197 | @end deffn |
| 1198 | |
| 1199 | The various headers that are typically associated with HTTP requests may |
| 1200 | be accessed with these dedicated accessors. @xref{HTTP Headers}, for |
| 1201 | more information on the format of parsed headers. |
| 1202 | |
| 1203 | @deffn {Scheme Procedure} request-accept request [default='()] |
| 1204 | @deffnx {Scheme Procedure} request-accept-charset request [default='()] |
| 1205 | @deffnx {Scheme Procedure} request-accept-encoding request [default='()] |
| 1206 | @deffnx {Scheme Procedure} request-accept-language request [default='()] |
| 1207 | @deffnx {Scheme Procedure} request-allow request [default='()] |
| 1208 | @deffnx {Scheme Procedure} request-authorization request [default=#f] |
| 1209 | @deffnx {Scheme Procedure} request-cache-control request [default='()] |
| 1210 | @deffnx {Scheme Procedure} request-connection request [default='()] |
| 1211 | @deffnx {Scheme Procedure} request-content-encoding request [default='()] |
| 1212 | @deffnx {Scheme Procedure} request-content-language request [default='()] |
| 1213 | @deffnx {Scheme Procedure} request-content-length request [default=#f] |
| 1214 | @deffnx {Scheme Procedure} request-content-location request [default=#f] |
| 1215 | @deffnx {Scheme Procedure} request-content-md5 request [default=#f] |
| 1216 | @deffnx {Scheme Procedure} request-content-range request [default=#f] |
| 1217 | @deffnx {Scheme Procedure} request-content-type request [default=#f] |
| 1218 | @deffnx {Scheme Procedure} request-date request [default=#f] |
| 1219 | @deffnx {Scheme Procedure} request-expect request [default='()] |
| 1220 | @deffnx {Scheme Procedure} request-expires request [default=#f] |
| 1221 | @deffnx {Scheme Procedure} request-from request [default=#f] |
| 1222 | @deffnx {Scheme Procedure} request-host request [default=#f] |
| 1223 | @deffnx {Scheme Procedure} request-if-match request [default=#f] |
| 1224 | @deffnx {Scheme Procedure} request-if-modified-since request [default=#f] |
| 1225 | @deffnx {Scheme Procedure} request-if-none-match request [default=#f] |
| 1226 | @deffnx {Scheme Procedure} request-if-range request [default=#f] |
| 1227 | @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f] |
| 1228 | @deffnx {Scheme Procedure} request-last-modified request [default=#f] |
| 1229 | @deffnx {Scheme Procedure} request-max-forwards request [default=#f] |
| 1230 | @deffnx {Scheme Procedure} request-pragma request [default='()] |
| 1231 | @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f] |
| 1232 | @deffnx {Scheme Procedure} request-range request [default=#f] |
| 1233 | @deffnx {Scheme Procedure} request-referer request [default=#f] |
| 1234 | @deffnx {Scheme Procedure} request-te request [default=#f] |
| 1235 | @deffnx {Scheme Procedure} request-trailer request [default='()] |
| 1236 | @deffnx {Scheme Procedure} request-transfer-encoding request [default='()] |
| 1237 | @deffnx {Scheme Procedure} request-upgrade request [default='()] |
| 1238 | @deffnx {Scheme Procedure} request-user-agent request [default=#f] |
| 1239 | @deffnx {Scheme Procedure} request-via request [default='()] |
| 1240 | @deffnx {Scheme Procedure} request-warning request [default='()] |
| 1241 | Return the given request header, or @var{default} if none was present. |
| 1242 | @end deffn |
| 1243 | |
| 1244 | @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f] |
| 1245 | A helper routine to determine the absolute URI of a request, using the |
| 1246 | @code{host} header and the default host and port. |
| 1247 | @end deffn |
| 1248 | |
| 1249 | |
| 1250 | @node Responses |
| 1251 | @subsection HTTP Responses |
| 1252 | |
| 1253 | @example |
| 1254 | (use-modules (web response)) |
| 1255 | @end example |
| 1256 | |
| 1257 | As with requests (@pxref{Requests}), Guile offers a data type for HTTP |
| 1258 | responses. Again, the body is represented separately from the response. |
| 1259 | |
| 1260 | @deffn {Scheme Procedure} response? obj |
| 1261 | @deffnx {Scheme Procedure} response-version response |
| 1262 | @deffnx {Scheme Procedure} response-code response |
| 1263 | @deffnx {Scheme Procedure} response-reason-phrase response |
| 1264 | @deffnx {Scheme Procedure} response-headers response |
| 1265 | @deffnx {Scheme Procedure} response-port response |
| 1266 | A predicate and field accessors for the response type. The fields are as |
| 1267 | follows: |
| 1268 | @table @code |
| 1269 | @item version |
| 1270 | The HTTP version pair, like @code{(1 . 1)}. |
| 1271 | @item code |
| 1272 | The HTTP response code, like @code{200}. |
| 1273 | @item reason-phrase |
| 1274 | The reason phrase, or the standard reason phrase for the response's |
| 1275 | code. |
| 1276 | @item headers |
| 1277 | The response headers, as an alist of parsed values. |
| 1278 | @item port |
| 1279 | The port on which to read or write a response body, if any. |
| 1280 | @end table |
| 1281 | @end deffn |
| 1282 | |
| 1283 | @deffn {Scheme Procedure} read-response port |
| 1284 | Read an HTTP response from @var{port}. |
| 1285 | |
| 1286 | As a side effect, sets the encoding on @var{port} to ISO-8859-1 |
| 1287 | (latin-1), so that reading one character reads one byte. See the |
| 1288 | discussion of character sets in @ref{Requests}, for more information. |
| 1289 | @end deffn |
| 1290 | |
| 1291 | @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t] |
| 1292 | Construct an HTTP response object. If @var{validate-headers?} is true, |
| 1293 | the headers are each run through their respective validators. |
| 1294 | @end deffn |
| 1295 | |
| 1296 | @deffn {Scheme Procedure} adapt-response-version response version |
| 1297 | Adapt the given response to a different HTTP version. Return a new HTTP |
| 1298 | response. |
| 1299 | |
| 1300 | The idea is that many applications might just build a response for the |
| 1301 | default HTTP version, and this method could handle a number of |
| 1302 | programmatic transformations to respond to older HTTP versions (0.9 and |
| 1303 | 1.0). But currently this function is a bit heavy-handed, just updating |
| 1304 | the version field. |
| 1305 | @end deffn |
| 1306 | |
| 1307 | @deffn {Scheme Procedure} write-response r port |
| 1308 | Write the given HTTP response to @var{port}. |
| 1309 | |
| 1310 | Return a new response, whose @code{response-port} will continue writing |
| 1311 | on @var{port}, perhaps using some transfer encoding. |
| 1312 | @end deffn |
| 1313 | |
| 1314 | @deffn {Scheme Procedure} response-must-not-include-body? r |
| 1315 | Some responses, like those with status code 304, are specified as never |
| 1316 | having bodies. This predicate returns @code{#t} for those responses. |
| 1317 | |
| 1318 | Note also, though, that responses to @code{HEAD} requests must also not |
| 1319 | have a body. |
| 1320 | @end deffn |
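
A server typically combines this predicate with a check of the request
method.  The following sketch assumes a hypothetical helper,
@code{body-allowed?}, which is not part of @code{(web response)}.

```scheme
(use-modules (web response))

(define not-modified (build-response #:code 304))
(define ok
  (build-response #:code 200
                  #:headers '((content-type . (text/plain)))))

(response-must-not-include-body? not-modified)  ; #t
(response-must-not-include-body? ok)            ; #f

;; Hypothetical helper: a body may be sent only if neither the request
;; method nor the response status forbids it.
(define (body-allowed? method response)
  (not (or (eq? method 'HEAD)
           (response-must-not-include-body? response))))

(body-allowed? 'GET ok)            ; #t
(body-allowed? 'HEAD ok)           ; #f
(body-allowed? 'GET not-modified)  ; #f
```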
| 1321 | |
| 1322 | @deffn {Scheme Procedure} response-body-port r [#:decode?=#t] [#:keep-alive?=#t] |
| 1323 | Return an input port from which the body of @var{r} can be read. The encoding |
| 1324 | of the returned port is set according to @var{r}'s @code{content-type} header, |
| 1325 | when it's textual, except if @var{decode?} is @code{#f}. Return @code{#f} |
| 1326 | when no body is available. |
| 1327 | |
| 1328 | When @var{keep-alive?} is @code{#f}, closing the returned port also closes |
| 1329 | @var{r}'s response port. |
| 1330 | @end deffn |
| 1331 | |
| 1332 | @deffn {Scheme Procedure} read-response-body r |
| 1333 | Read the response body from @var{r}, as a bytevector. Returns @code{#f} |
| 1334 | if there was no response body. |
| 1335 | @end deffn |
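
As with requests, a response can be read from a string port for
testing.  The sketch below uses a hand-written response with a short
ASCII body, and decodes the resulting bytevector explicitly with
@code{utf8->string} from @code{(rnrs bytevectors)}.

```scheme
(use-modules (web response)
             (rnrs bytevectors))

(define wire
  (string-append "HTTP/1.1 200 OK\r\n"
                 "Content-Type: text/plain\r\n"
                 "Content-Length: 5\r\n"
                 "\r\n"
                 "hello"))

(define r (read-response (open-input-string wire)))

(response-code r)            ; 200
(response-content-length r)  ; 5

;; The body arrives as a bytevector; decode it explicitly.
(define body (read-response-body r))
(utf8->string body)          ; "hello"
```

The headers must be read before the body, because the body is read
from the port stored in the response by @code{read-response}.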
| 1336 | |
| 1337 | @deffn {Scheme Procedure} write-response-body r bv |
| 1338 | Write @var{bv}, a bytevector, to the port corresponding to the HTTP |
| 1339 | response @var{r}. |
| 1340 | @end deffn |
| 1341 | |
| 1342 | As with requests, the various headers that are typically associated with |
| 1343 | HTTP responses may be accessed with these dedicated accessors. |
| 1344 | @xref{HTTP Headers}, for more information on the format of parsed |
| 1345 | headers. |
| 1346 | |
| 1347 | @deffn {Scheme Procedure} response-accept-ranges response [default=#f] |
| 1348 | @deffnx {Scheme Procedure} response-age response [default='()] |
| 1349 | @deffnx {Scheme Procedure} response-allow response [default='()] |
| 1350 | @deffnx {Scheme Procedure} response-cache-control response [default='()] |
| 1351 | @deffnx {Scheme Procedure} response-connection response [default='()] |
| 1352 | @deffnx {Scheme Procedure} response-content-encoding response [default='()] |
| 1353 | @deffnx {Scheme Procedure} response-content-language response [default='()] |
| 1354 | @deffnx {Scheme Procedure} response-content-length response [default=#f] |
| 1355 | @deffnx {Scheme Procedure} response-content-location response [default=#f] |
| 1356 | @deffnx {Scheme Procedure} response-content-md5 response [default=#f] |
| 1357 | @deffnx {Scheme Procedure} response-content-range response [default=#f] |
| 1358 | @deffnx {Scheme Procedure} response-content-type response [default=#f] |
| 1359 | @deffnx {Scheme Procedure} response-date response [default=#f] |
| 1360 | @deffnx {Scheme Procedure} response-etag response [default=#f] |
| 1361 | @deffnx {Scheme Procedure} response-expires response [default=#f] |
| 1362 | @deffnx {Scheme Procedure} response-last-modified response [default=#f] |
| 1363 | @deffnx {Scheme Procedure} response-location response [default=#f] |
| 1364 | @deffnx {Scheme Procedure} response-pragma response [default='()] |
| 1365 | @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f] |
| 1366 | @deffnx {Scheme Procedure} response-retry-after response [default=#f] |
| 1367 | @deffnx {Scheme Procedure} response-server response [default=#f] |
| 1368 | @deffnx {Scheme Procedure} response-trailer response [default='()] |
| 1369 | @deffnx {Scheme Procedure} response-transfer-encoding response [default='()] |
| 1370 | @deffnx {Scheme Procedure} response-upgrade response [default='()] |
| 1371 | @deffnx {Scheme Procedure} response-vary response [default='()] |
| 1372 | @deffnx {Scheme Procedure} response-via response [default='()] |
| 1373 | @deffnx {Scheme Procedure} response-warning response [default='()] |
| 1374 | @deffnx {Scheme Procedure} response-www-authenticate response [default=#f] |
| 1375 | Return the given response header, or @var{default} if none was present. |
| 1376 | @end deffn |
| 1377 | |
| 1378 | @deffn {Scheme Procedure} text-content-type? @var{type} |
| 1379 | Return @code{#t} if @var{type}, a media type symbol such as the one at |
| 1380 | the head of the list returned by @code{response-content-type}, |
| 1381 | represents a textual type such as @code{text/plain}. |
| 1382 | @end deffn |
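
The header accessors above can be tried on a response built by hand.
The following is a minimal sketch; the header values are made up for
illustration, and it assumes that @code{build-response} and
@code{text-content-type?} are both available from
@code{(web response)}.

@example
(use-modules (web response))

;; A hand-built response with a single header; the value is illustrative.
(define resp
  (build-response
   #:headers '((content-type . (text/html (charset . "utf-8"))))))

;; A header that is present comes back in its parsed form.
(response-content-type resp)
@result{} (text/html (charset . "utf-8"))

;; Absent headers fall back to the built-in or caller-supplied default.
(response-etag resp)
@result{} #f
(response-server resp "unknown")
@result{} "unknown"

;; text-content-type? wants the media type symbol, not the whole list.
(text-content-type? (car (response-content-type resp)))
@result{} #t
(text-content-type? 'image/png)
@result{} #f
@end example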
| 1383 | |
| 1384 | |
| 1385 | @node Web Client |
| 1386 | @subsection Web Client |
| 1387 | |
| 1388 | @code{(web client)} provides a simple, synchronous HTTP client, built on |
| 1389 | the lower-level HTTP, request, and response modules. |
| 1390 | |
| 1391 | @example |
| 1392 | (use-modules (web client)) |
| 1393 | @end example |
| 1394 | |
| 1395 | @deffn {Scheme Procedure} open-socket-for-uri uri |
| 1396 | Return an open input/output port for a connection to @var{uri}. |
| 1397 | @end deffn |
| 1398 | |
| 1399 | @deffn {Scheme Procedure} http-get uri arg... |
| 1400 | @deffnx {Scheme Procedure} http-head uri arg... |
| 1401 | @deffnx {Scheme Procedure} http-post uri arg... |
| 1402 | @deffnx {Scheme Procedure} http-put uri arg... |
| 1403 | @deffnx {Scheme Procedure} http-delete uri arg... |
| 1404 | @deffnx {Scheme Procedure} http-trace uri arg... |
| 1405 | @deffnx {Scheme Procedure} http-options uri arg... |
| 1406 | |
| 1407 | Connect to the server corresponding to @var{uri} and make a request over |
| 1408 | HTTP, using the appropriate method (@code{GET}, @code{HEAD}, etc.). |
| 1409 | |
| 1410 | All of these procedures have the same prototype: a URI followed by an |
| 1411 | optional sequence of keyword arguments. These keyword arguments allow |
| 1412 | you to modify the requests in various ways, for example attaching a body |
| 1413 | to the request, or setting specific headers. The following table lists |
| 1414 | the keyword arguments and their default values. |
| 1415 | |
| 1416 | @table @code |
| 1417 | @item #:body #f |
| 1418 | @item #:port (open-socket-for-uri @var{uri}) |
| 1419 | @item #:version '(1 . 1) |
| 1420 | @item #:keep-alive? #f |
| 1421 | @item #:headers '() |
| 1422 | @item #:decode-body? #t |
| 1423 | @item #:streaming? #f |
| 1424 | @end table |
| 1425 | |
| 1426 | If you already have a port open, pass it as @var{port}. Otherwise, a |
| 1427 | connection will be opened to the server corresponding to @var{uri}. Any |
| 1428 | extra headers in the alist @var{headers} will be added to the request. |
| 1429 | |
| 1430 | If @var{body} is not @code{#f}, a message body will also be sent with |
| 1431 | the HTTP request. If @var{body} is a string, it is encoded according to |
| 1432 | the content-type in @var{headers}, defaulting to UTF-8. Otherwise |
| 1433 | @var{body} should be a bytevector, or @code{#f} for no body. Although a |
| 1434 | message body may be sent with any request, usually only @code{POST} and |
| 1435 | @code{PUT} requests have bodies. |
| 1436 | |
| 1437 | If @var{decode-body?} is true, as is the default, the body of the |
| 1438 | response will be decoded to a string if it has a textual content-type. |
| 1439 | Otherwise it will be returned as a bytevector. |
| 1440 | |
| 1441 | However, if @var{streaming?} is true, instead of eagerly reading the |
| 1442 | response body from the server, this function only reads off the headers. |
| 1443 | The response body will be returned as a port on which the data may be |
| 1444 | read. |
| 1445 | |
| 1446 | Unless @var{keep-alive?} is true, the port will be closed after the full |
| 1447 | response body has been read. |
| 1448 | |
| 1449 | Returns two values: the response read from the server, and the response |
| 1450 | body as a string, a bytevector, @code{#f}, or a port (if |
| 1451 | @var{streaming?} is true). |
| 1452 | @end deffn |
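
As a rough illustration of the calling convention, here is a small
sketch.  @code{fetch} is a hypothetical helper, not part of
@code{(web client)}, and the @code{user-agent} value and the URL in the
commented-out call are placeholders.

@example
(use-modules (web client)
             (web response)
             (web uri))

;; Hypothetical helper: GET the given URL string and return two
;; values, the status code and the body.
(define (fetch url)
  (call-with-values
      (lambda ()
        (http-get (string->uri url)
                  #:headers '((user-agent . "guile-example/0.1"))
                  #:decode-body? #t))
    (lambda (response body)
      ;; body is a string, a bytevector, or #f, as described above.
      (values (response-code response) body))))

;; Requires network access, so it is left commented out:
;; (fetch "http://www.gnu.org/")
@end example

Wrapping the call in a procedure keeps the two return values together
and leaves a single place to add more keyword arguments later.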
| 1453 | |
| 1454 | @code{http-get} is useful for making one-off requests to web sites. If |
| 1455 | you are writing a web spider or some other client that needs to handle a |
| 1456 | number of requests in parallel, it's better to build an event-driven URL |
| 1457 | fetcher, similar in structure to the web server (@pxref{Web Server}). |
| 1458 | |
| 1459 | Another option, good but not as performant, would be to use threads, |
| 1460 | possibly via @code{par-map} or futures. |
| 1461 | |
| 1462 | |
| 1463 | @node Web Server |
| 1464 | @subsection Web Server |
| 1465 | |
| 1466 | @code{(web server)} is a generic web server interface, along with a main |
| 1467 | loop implementation for web servers controlled by Guile. |
| 1468 | |
| 1469 | @example |
| 1470 | (use-modules (web server)) |
| 1471 | @end example |
| 1472 | |
| 1473 | The lowest layer is the @code{<server-impl>} object, which defines a set |
| 1474 | of hooks to open a server, read a request from a client, write a |
| 1475 | response to a client, and close a server. These hooks -- @code{open}, |
| 1476 | @code{read}, @code{write}, and @code{close}, respectively -- are bound |
| 1477 | together in a @code{<server-impl>} object. Procedures in this module take a |
| 1478 | @code{<server-impl>} object, if needed. |
| 1479 | |
| 1480 | A @code{<server-impl>} may also be looked up by name. If you pass the |
| 1481 | @code{http} symbol to @code{run-server}, Guile looks for a variable |
| 1482 | named @code{http} in the @code{(web server http)} module, which should |
| 1483 | be bound to a @code{<server-impl>} object. Such a binding is made by |
| 1484 | instantiation of the @code{define-server-impl} syntax. In this way the |
| 1485 | run-server loop can automatically load other backends if available. |
| 1486 | |
| 1487 | The life cycle of a server goes as follows: |
| 1488 | |
| 1489 | @enumerate |
| 1490 | @item |
| 1491 | The @code{open} hook is called, to open the server. @code{open} takes |
| 1492 | zero or more arguments, depending on the backend, and returns an opaque |
| 1493 | server socket object, or signals an error. |
| 1494 | |
| 1495 | @item |
| 1496 | The @code{read} hook is called, to read a request from a new client. |
| 1497 | The @code{read} hook takes one argument, the server socket. It should |
| 1498 | return three values: an opaque client socket, the request, and the |
| 1499 | request body. The request should be a @code{<request>} object, from |
| 1500 | @code{(web request)}. The body should be a string or a bytevector, or |
| 1501 | @code{#f} if there is no body. |
| 1502 | |
| 1503 | If the read failed, the @code{read} hook may return @code{#f} for the client |
| 1504 | socket, request, and body. |
| 1505 | |
| 1506 | @item |
| 1507 | A user-provided handler procedure is called, with the request and body |
| 1508 | as its arguments. The handler should return two values: the response, |
| 1509 | as a @code{<response>} record from @code{(web response)}, and the |
| 1510 | response body as a bytevector, or @code{#f} if not present. |
| 1511 | |
| 1512 | The response and response body are run through @code{sanitize-response}, |
| 1513 | documented below. This allows the handler writer to take some |
| 1514 | convenient shortcuts: for example, instead of a @code{<response>}, the |
| 1515 | handler can simply return an alist of headers, in which case a default |
| 1516 | response object is constructed with those headers. Instead of a |
| 1517 | bytevector for the body, the handler can return a string, which will be |
| 1518 | serialized into an appropriate encoding; or it can return a procedure, |
| 1519 | which will be called on a port to write out the data. See the |
| 1520 | @code{sanitize-response} documentation for more. |
| 1521 | |
| 1522 | @item |
| 1523 | The @code{write} hook is called with three arguments: the client |
| 1524 | socket, the response, and the body. The @code{write} hook returns no |
| 1525 | values. |
| 1526 | |
| 1527 | @item |
| 1528 | At this point the request handling is complete. In a server loop, |
| 1529 | control returns to the @code{read} hook to wait for a new request. |
| 1530 | |
| 1531 | @item |
| 1532 | If the user interrupts the loop, the @code{close} hook is called on |
| 1533 | the server socket. |
| 1534 | @end enumerate |
| 1535 | |
| 1536 | A user may define a server implementation with the following form: |
| 1537 | |
| 1538 | @deffn {Scheme Syntax} define-server-impl name open read write close |
| 1539 | Make a @code{<server-impl>} object with the hooks @var{open}, |
| 1540 | @var{read}, @var{write}, and @var{close}, and bind it to the symbol |
| 1541 | @var{name} in the current module. |
| 1542 | @end deffn |
| 1543 | |
| 1544 | @deffn {Scheme Procedure} lookup-server-impl impl |
| 1545 | Look up a server implementation. If @var{impl} is a server |
| 1546 | implementation already, it is returned directly. If it is a symbol, the |
| 1547 | binding named @var{impl} in the @code{(web server @var{impl})} module is |
| 1548 | looked up. Otherwise an error is signaled. |
| 1549 | |
| 1550 | Currently a server implementation is a somewhat opaque type, useful only |
| 1551 | for passing to other procedures in this module, like @code{read-client}. |
| 1552 | @end deffn |
| 1553 | |
| 1554 | The @code{(web server)} module defines a number of routines that use |
| 1555 | @code{<server-impl>} objects to implement parts of a web server. Because |
| 1556 | the accessors for the fields of a @code{<server-impl>} are not exposed, |
| 1557 | these routines are the only procedures with any access to the impl |
| 1558 | objects. |
| 1559 | |
| 1560 | @deffn {Scheme Procedure} open-server impl open-params |
| 1561 | Open a server for the given implementation. Return one value, the new |
| 1562 | server object. The implementation's @code{open} procedure is applied to |
| 1563 | @var{open-params}, which should be a list. |
| 1564 | @end deffn |
| 1565 | |
| 1566 | @deffn {Scheme Procedure} read-client impl server |
| 1567 | Read a new client from @var{server}, by applying the implementation's |
| 1568 | @code{read} procedure to the server. If successful, return three |
| 1569 | values: an object corresponding to the client, a request object, and the |
| 1570 | request body. If any exception occurs, return @code{#f} for all three |
| 1571 | values. |
| 1572 | @end deffn |
| 1573 | |
| 1574 | @deffn {Scheme Procedure} handle-request handler request body state |
| 1575 | Handle a given request, returning the response and body. |
| 1576 | |
| 1577 | The response and response body are produced by calling the given |
| 1578 | @var{handler} with @var{request} and @var{body} as arguments. |
| 1579 | |
| 1580 | The elements of @var{state} are also passed to @var{handler} as |
| 1581 | arguments, and may be returned as additional values. The new |
| 1582 | @var{state}, collected from the @var{handler}'s return values, is then |
| 1583 | returned as a list. The idea is that a server loop receives a handler |
| 1584 | from the user, along with whatever state values the user is interested |
| 1585 | in, allowing the user's handler to explicitly manage its state. |
| 1586 | @end deffn |
| 1587 | |
| 1588 | @deffn {Scheme Procedure} sanitize-response request response body |
| 1589 | ``Sanitize'' the given response and body, making them appropriate for |
| 1590 | the given request. |
| 1591 | |
| 1592 | As a convenience to web handler authors, @var{response} may be given as |
| 1593 | an alist of headers, in which case it is used to construct a default |
| 1594 | response. Ensures that the response version corresponds to the request |
| 1595 | version. If @var{body} is a string, encodes the string to a bytevector, |
| 1596 | in an encoding appropriate for @var{response}. Adds a |
| 1597 | @code{content-length} and @code{content-type} header, as necessary. |
| 1598 | |
| 1599 | If @var{body} is a procedure, it is called with a port as an argument, |
| 1600 | and the output collected as a bytevector. In the future we might try to |
| 1601 | instead use a compressing, chunk-encoded port, and call this procedure |
| 1602 | later, in the @code{write-client} procedure. Authors are advised not to rely on |
| 1603 | the procedure being called at any particular time. |
| 1604 | @end deffn |
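
To see what @code{sanitize-response} does with these shortcuts, it can
be called by hand.  This is only a sketch: the request is built
directly with @code{build-request} instead of being read from a client,
and the URI is a placeholder.

@example
(use-modules (web server)
             (web request)
             (web response)
             (web uri))

;; A stand-in for a request that a server would normally read.
(define req
  (build-request (string->uri "http://localhost:8080/")))

;; Pass an alist of headers and a string body, as a handler might,
;; and collect the two return values in a list.
(define sanitized
  (call-with-values
      (lambda ()
        (sanitize-response req '((content-type . (text/plain))) "Hello"))
    list))

(define response (car sanitized))
(define body (cadr sanitized))

;; The alist has become a response record, and the string a bytevector.
(response? response)
@result{} #t
(bytevector? body)
@result{} #t
;; A content-length matching the encoded body has been added.
(response-content-length response)
@result{} 5
@end example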
| 1605 | |
| 1606 | @deffn {Scheme Procedure} write-client impl server client response body |
| 1607 | Write an HTTP response and body to @var{client}. If the server and |
| 1608 | client support persistent connections, it is the implementation's |
| 1609 | responsibility to keep track of the client thereafter, presumably by |
| 1610 | attaching it to the @var{server} argument somehow. |
| 1611 | @end deffn |
| 1612 | |
| 1613 | @deffn {Scheme Procedure} close-server impl server |
| 1614 | Release resources allocated by a previous invocation of |
| 1615 | @code{open-server}. |
| 1616 | @end deffn |
| 1617 | |
| 1618 | Given the procedures above, it is a small matter to make a web server: |
| 1619 | |
| 1620 | @deffn {Scheme Procedure} serve-one-client handler impl server state |
| 1621 | Read one request from @var{server}, call @var{handler} on the request |
| 1622 | and body, and write the response to the client. Return the new state |
| 1623 | produced by the handler procedure. |
| 1624 | @end deffn |
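
The whole life cycle can be exercised without touching the network by
plugging in a toy implementation.  The sketch below is illustrative
only: @code{canned}, @code{written} and @code{greet} are hypothetical
names, the ``sockets'' are plain symbols, and it assumes that the
@code{write} hook receives the response and body after they have been
through @code{sanitize-response}.

@example
(use-modules (web server)
             (web request)
             (web response)
             (web uri))

;; Responses "sent" by the toy implementation, most recent first.
(define written '())

(define-server-impl canned
  ;; open: takes no parameters; returns an opaque server object.
  (lambda () 'canned-server)
  ;; read: hand out the same request every time, with no body.
  (lambda (server)
    (values 'canned-client
            (build-request (string->uri "http://localhost/hello"))
            #f))
  ;; write: record the status code and body instead of sending them.
  (lambda (server client response body)
    (set! written
          (cons (list (response-code response) body) written))
    (values))
  ;; close: nothing to release.
  (lambda (server) #t))

;; An ordinary handler, using the alist and string shortcuts.
(define (greet request body)
  (values '((content-type . (text/plain))) "hi"))

;; Open, serve exactly one request with an empty state, and close.
(define server (open-server canned '()))
(serve-one-client greet canned server '())
(close-server canned server)
@end example

Afterwards @code{written} should hold a single entry, for the one
request that was served.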
| 1625 | |
| 1626 | @deffn {Scheme Procedure} run-server handler @ |
| 1627 | [impl='http] [open-params='()] @ |
| 1628 | arg @dots{} |
| 1629 | Run Guile's built-in web server. |
| 1630 | |
| 1631 | @var{handler} should be a procedure that takes two or more arguments, |
| 1632 | the HTTP request and request body, and returns two or more values, the |
| 1633 | response and response body. |
| 1634 | |
| 1635 | For examples, skip ahead to the next section, @ref{Web Examples}. |
| 1636 | |
| 1637 | The response and body will be run through @code{sanitize-response} |
| 1638 | before sending back to the client. |
| 1639 | |
| 1640 | Additional arguments to @var{handler} are taken from @var{arg} |
| 1641 | @enddots{}. These arguments comprise a @dfn{state}. Additional return |
| 1642 | values are accumulated into a new state, which will be used for |
| 1643 | subsequent requests. In this way a handler can explicitly manage its |
| 1644 | state. |
| 1645 | @end deffn |
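
A minimal sketch of the state-passing convention follows.
@code{counting-handler} is a hypothetical handler whose single state
value is a request counter; the initial value @code{0} is supplied as
the last argument to @code{run-server}.

@example
(use-modules (web server))

;; Takes the request, the request body, and one state value.  Returns
;; the response headers, the response body, and the next state value.
(define (counting-handler request body count)
  (values '((content-type . (text/plain)))
          (string-append "Request number "
                         (number->string count)
                         "\n")
          (+ count 1)))

;; Blocks and serves requests until interrupted, so it is left
;; commented out:
;; (run-server counting-handler 'http '() 0)
@end example

Because the counter travels through the arguments and return values,
the handler needs no global variable to remember it between requests.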
| 1646 | |
| 1647 | The default web server implementation is @code{http}, which binds to a |
| 1648 | socket and listens for requests on it. |
| 1649 | |
| 1650 | @deffn {HTTP Implementation} http [#:host=#f] @ |
| 1651 | [#:family=AF_INET] @ |
| 1652 | [#:addr=INADDR_LOOPBACK] @ |
| 1653 | [#:port=8080] [#:socket] |
| 1654 | The default HTTP implementation. We document it as a function with |
| 1655 | keyword arguments, because that is precisely the way that it is -- all |
| 1656 | of the @var{open-params} to @code{run-server} get passed to the |
| 1657 | implementation's open function. |
| 1658 | |
| 1659 | @example |
| 1660 | ;; The defaults: localhost:8080 |
| 1661 | (run-server handler) |
| 1662 | ;; Same thing |
| 1663 | (run-server handler 'http '()) |
| 1664 | ;; On a different port |
| 1665 | (run-server handler 'http '(#:port 8081)) |
| 1666 | ;; IPv6 |
| 1667 | (run-server handler 'http '(#:family AF_INET6 #:port 8081)) |
| 1668 | ;; Custom socket |
| 1669 | (run-server handler 'http `(#:socket ,(sudo-make-me-a-socket))) |
| 1670 | @end example |
| 1671 | @end deffn |
| 1672 | |
| 1673 | @node Web Examples |
| 1674 | @subsection Web Examples |
| 1675 | |
| 1676 | Well, enough about the tedious internals. Let's make a web application! |
| 1677 | |
| 1678 | @subsubsection Hello, World! |
| 1679 | |
| 1680 | The first program we have to write, of course, is ``Hello, World!''. |
| 1681 | This means that we have to implement a web handler that does what we |
| 1682 | want. |
| 1683 | |
| 1684 | Now we define a handler, a function of two arguments and two return |
| 1685 | values: |
| 1686 | |
| 1687 | @example |
| 1688 | (define (handler request request-body) |
| 1689 | (values @var{response} @var{response-body})) |
| 1690 | @end example |
| 1691 | |
| 1692 | In this first example, we take advantage of a short-cut, returning an |
| 1693 | alist of headers instead of a proper response object. The response body |
| 1694 | is our payload: |
| 1695 | |
| 1696 | @example |
| 1697 | (define (hello-world-handler request request-body) |
| 1698 | (values '((content-type . (text/plain))) |
| 1699 | "Hello World!")) |
| 1700 | @end example |
| 1701 | |
| 1702 | Now let's test it. Load up the web server module, if you haven't |
| 1703 | yet done so, and pass the handler to @code{run-server}, which starts |
| 1704 | a server with it: |
| 1705 | |
| 1706 | @example |
| 1707 | (use-modules (web server)) |
| 1708 | (run-server hello-world-handler) |
| 1709 | @end example |
| 1710 | |
| 1711 | By default, the web server listens for requests on |
| 1712 | @code{localhost:8080}. Visit that address in your web browser to |
| 1713 | test. If you see the string, @code{Hello World!}, sweet! |
| 1714 | |
| 1715 | @subsubsection Inspecting the Request |
| 1716 | |
| 1717 | The Hello World program above is a general greeter, responding to all |
| 1718 | URIs. To make a more exclusive greeter, we need to inspect the request |
| 1719 | object, and conditionally produce different results. So let's load up |
| 1720 | the request, response, and URI modules, and do just that. |
| 1721 | |
| 1722 | @example |
| 1723 | (use-modules (web server)) ; you probably did this already |
| 1724 | (use-modules (web request) |
| 1725 | (web response) |
| 1726 | (web uri)) |
| 1727 | |
| 1728 | (define (request-path-components request) |
| 1729 | (split-and-decode-uri-path (uri-path (request-uri request)))) |
| 1730 | |
| 1731 | (define (hello-hacker-handler request body) |
| 1732 | (if (equal? (request-path-components request) |
| 1733 | '("hacker")) |
| 1734 | (values '((content-type . (text/plain))) |
| 1735 | "Hello hacker!") |
| 1736 | (not-found request))) |
| 1737 | |
| 1738 | (run-server hello-hacker-handler) |
| 1739 | @end example |
| 1740 | |
| 1741 | Here we see that we have defined a helper to return the components of |
| 1742 | the URI path as a list of strings, and used that to check for a request |
| 1743 | to @code{/hacker/}. Then the success case is just as before -- visit |
| 1744 | @code{http://localhost:8080/hacker/} in your browser to check. |
| 1745 | |
| 1746 | You should always match against URI path components as decoded by |
| 1747 | @code{split-and-decode-uri-path}. The above example will work for |
| 1748 | @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}. |
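
This is easy to check at the REPL.  As a quick sketch, all three
spellings should decode to the same list of components:

@example
(use-modules (web uri))

;; Empty components are dropped and percent-escapes are decoded.
(split-and-decode-uri-path "/hacker/")
@result{} ("hacker")
(split-and-decode-uri-path "//hacker///")
@result{} ("hacker")
(split-and-decode-uri-path "/h%61ck%65r")
@result{} ("hacker")
@end example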
| 1749 | |
| 1750 | But we forgot to define @code{not-found}! If you are pasting these |
| 1751 | examples into a REPL, accessing any other URI in your web browser will |
| 1752 | drop your Guile console into the debugger: |
| 1753 | |
| 1754 | @example |
| 1755 | <unnamed port>:38:7: In procedure module-lookup: |
| 1756 | <unnamed port>:38:7: Unbound variable: not-found |
| 1757 | |
| 1758 | Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue. |
| 1759 | scheme@@(guile-user) [1]> |
| 1760 | @end example |
| 1761 | |
| 1762 | So let's define the function, right there in the debugger. As you |
| 1763 | probably know, we'll want to return a 404 response. |
| 1764 | |
| 1765 | @example |
| 1766 | ;; Paste this in your REPL |
| 1767 | (define (not-found request) |
| 1768 | (values (build-response #:code 404) |
| 1769 | (string-append "Resource not found: " |
| 1770 | (uri->string (request-uri request))))) |
| 1771 | |
| 1772 | ;; Now paste this to let the web server keep going: |
| 1773 | ,continue |
| 1774 | @end example |
| 1775 | |
| 1776 | Now if you access @code{http://localhost:8080/foo/}, you get this error |
| 1777 | message. (Note that some popular web browsers won't show |
| 1778 | server-generated 404 messages, showing their own instead, unless the 404 |
| 1779 | message body is long enough.) |
| 1780 | |
| 1781 | @subsubsection Higher-Level Interfaces |
| 1782 | |
| 1783 | The web handler interface is a common baseline that all kinds of Guile |
| 1784 | web applications can use. You will usually want to build something on |
| 1785 | top of it, however, especially when producing HTML. Here is a simple |
| 1786 | example that builds up HTML output using SXML (@pxref{SXML}). |
| 1787 | |
| 1788 | First, load up the modules: |
| 1789 | |
| 1790 | @example |
| 1791 | (use-modules (web server) |
| 1792 | (web request) |
| 1793 | (web response) |
| 1794 | (sxml simple)) |
| 1795 | @end example |
| 1796 | |
| 1797 | Now we define a simple templating function that takes a list of HTML |
| 1798 | body elements, as SXML, and puts them in our super template: |
| 1799 | |
| 1800 | @example |
| 1801 | (define (templatize title body) |
| 1802 | `(html (head (title ,title)) |
| 1803 | (body ,@@body))) |
| 1804 | @end example |
| 1805 | |
| 1806 | For example, the simplest Hello HTML can be produced like this: |
| 1807 | |
| 1808 | @example |
| 1809 | (sxml->xml (templatize "Hello!" '((b "Hi!")))) |
| 1810 | @print{} |
| 1811 | <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html> |
| 1812 | @end example |
| 1813 | |
| 1814 | Much better to work with Scheme data types than to work with HTML as |
| 1815 | strings. Now we define a little response helper: |
| 1816 | |
| 1817 | @example |
| 1818 | (define* (respond #:optional body #:key |
| 1819 | (status 200) |
| 1820 | (title "Hello hello!") |
| 1821 | (doctype "<!DOCTYPE html>\n") |
| 1822 | (content-type-params '((charset . "utf-8"))) |
| 1823 | (content-type 'text/html) |
| 1824 | (extra-headers '()) |
| 1825 | (sxml (and body (templatize title body)))) |
| 1826 | (values (build-response |
| 1827 | #:code status |
| 1828 | #:headers `((content-type |
| 1829 | . (,content-type ,@@content-type-params)) |
| 1830 | ,@@extra-headers)) |
| 1831 | (lambda (port) |
| 1832 | (if sxml |
| 1833 | (begin |
| 1834 | (if doctype (display doctype port)) |
| 1835 | (sxml->xml sxml port)))))) |
| 1836 | @end example |
| 1837 | |
| 1838 | Here we see the power of keyword arguments with default initializers. By |
| 1839 | the time the arguments are fully parsed, the @code{sxml} local variable |
| 1840 | will hold the templated SXML, ready for sending out to the client. |
| 1841 | |
| 1842 | Also, instead of returning the body as a string, @code{respond} gives a |
| 1843 | procedure, which will be called by the web server to write out the |
| 1844 | response to the client. |
| 1845 | |
| 1846 | Now, a simple example using this responder, which lays out the incoming |
| 1847 | headers in an HTML table. |
| 1848 | |
| 1849 | @example |
| 1850 | (define (debug-page request body) |
| 1851 | (respond |
| 1852 | `((h1 "hello world!") |
| 1853 | (table |
| 1854 | (tr (th "header") (th "value")) |
| 1855 | ,@@(map (lambda (pair) |
| 1856 | `(tr (td (tt ,(with-output-to-string |
| 1857 | (lambda () (display (car pair)))))) |
| 1858 | (td (tt ,(with-output-to-string |
| 1859 | (lambda () |
| 1860 | (write (cdr pair)))))))) |
| 1861 | (request-headers request)))))) |
| 1862 | |
| 1863 | (run-server debug-page) |
| 1864 | @end example |
| 1865 | |
| 1866 | Now if you visit any local address in your web browser, you will |
| 1867 | finally see some HTML. |
| 1868 | |
| 1869 | @subsubsection Conclusion |
| 1870 | |
| 1871 | Well, this is about as far as Guile's built-in web support goes, for |
| 1872 | now. There are many ways to make a web application, but hopefully by |
| 1873 | standardizing the most fundamental data types, users will be able to |
| 1874 | choose the approach that suits them best, while also being able to |
| 1875 | switch between implementations of the server. This is a relatively new |
| 1876 | part of Guile, so if you have feedback, let us know, and we can take it |
| 1877 | into account. Happy hacking on the web! |
| 1878 | |
| 1879 | @c Local Variables: |
| 1880 | @c TeX-master: "guile.texi" |
| 1881 | @c End: |