Commit | Line | Data |
---|---|---|
8db7e094 AW |
1 | @c -*-texinfo-*- |
2 | @c This is part of the GNU Guile Reference Manual. | |
32de1aa7 | 3 | @c Copyright (C) 2010, 2011 Free Software Foundation, Inc. |
8db7e094 AW |
4 | @c See the file guile.texi for copying conditions. |
5 | ||
6 | @node Web | |
7 | @section @acronym{HTTP}, the Web, and All That | |
8 | @cindex Web | |
9 | @cindex WWW | |
10 | @cindex HTTP | |
11 | ||
d75a81b1 AW |
12 | It has always been possible to connect computers together and share |
13 | information between them, but the rise of the World-Wide Web over the | |
14 | last couple of decades has made it much easier to do so. The result is | |
15 | a richly connected network of computation, in which Guile forms a part. | |
8db7e094 | 16 | |
d75a81b1 AW |
17 | By ``the web'', we mean the HTTP protocol@footnote{Yes, the P is for |
18 | protocol, but this phrase appears repeatedly in RFC 2616.} as handled by | |
19 | servers, clients, proxies, caches, and the various kinds of messages and | |
20 | message components that can be sent and received by that protocol, | |
21 | notably HTML. | |
8db7e094 | 22 | |
d75a81b1 AW |
23 | On one level, the web is text in motion: the protocols themselves are |
24 | textual (though the payload may be binary), and it's possible to create | |
25 | a socket and speak text to the web. But such an approach is obviously | |
26 | primitive. This section details the higher-level data types and | |
27 | operations provided by Guile: URIs, HTTP request and response records, | |
28 | and a conventional web server implementation. | |
8db7e094 | 29 | |
d75a81b1 AW |
30 | The material in this section is arranged in ascending order, in which |
31 | later concepts build on previous ones. If you prefer to start with the | |
32 | highest-level perspective, @pxref{Web Examples}, and work your way | |
33 | back. | |
8db7e094 AW |
34 | |
35 | @menu | |
d75a81b1 | 36 | * Types and the Web:: Types prevent bugs and security problems. |
8db7e094 AW |
37 | * URIs:: Uniform Resource Identifiers. |
38 | * HTTP:: The Hyper-Text Transfer Protocol. | |
1148d029 | 39 | * HTTP Headers:: How Guile represents specific header values. |
8db7e094 AW |
40 | * Requests:: HTTP requests. |
41 | * Responses:: HTTP responses. | |
ec811439 | 42 | * Web Client:: Accessing web resources over HTTP. |
8db7e094 | 43 | * Web Server:: Serving HTTP to the internet. |
e471a3ee | 44 | * Web Examples:: How to use this thing. |
8db7e094 AW |
45 | @end menu |
46 | ||
d75a81b1 AW |
47 | @node Types and the Web |
48 | @subsection Types and the Web | |
49 | ||
50 | It is a truth universally acknowledged, that a program with good use of | |
51 | data types, will be free from many common bugs. Unfortunately, the | |
52 | common practice in web programming seems to ignore this maxim. This | |
53 | subsection makes the case for expressive data types in web programming. | |
54 | ||
55 | By ``expressive data types'', we mean that the data types @emph{say} | |
56 | something about how a program solves a problem. For example, if we | |
57 | choose to represent dates using SRFI 19 date records (@pxref{SRFI-19}), | |
58 | this indicates that there is a part of the program that will always have | |
59 | valid dates. Error handling for a number of basic cases, like invalid | |
60 | dates, occurs on the boundary in which we produce a SRFI 19 date record | |
61 | from other types, like strings. | |
62 | ||
5ec48b70 NJ |
63 | With regard to the web, data types are helpful in the two broad phases |
64 | of HTTP messages: parsing and generation. | |
d75a81b1 AW |
65 | |
66 | Consider a server, which has to parse a request, and produce a response. | |
67 | Guile will parse the request into an HTTP request object | |
68 | (@pxref{Requests}), with each header parsed into an appropriate Scheme | |
69 | data type. This transition from an incoming stream of characters to | |
70 | typed data is a state change in a program---the strings might parse, or | |
71 | they might not, and something has to happen if they do not. (Guile | |
72 | throws an error in this case.) But after you have the parsed request, | |
73 | ``client'' code (code built on top of the Guile web framework) will not | |
74 | have to check for syntactic validity. The types already make this | |
75 | information manifest. | |
76 | ||
77 | This state change on the parsing boundary makes programs more robust, | |
78 | as they themselves are freed from the need to do a number of common | |
79 | error checks, and they can use normal Scheme procedures to handle a | |
80 | request instead of ad-hoc string parsers. | |
81 | ||
82 | The need for types on the response generation side (in a server) is more | |
83 | subtle, though not less important. Consider the example of a POST | |
84 | handler, which prints out the text that a user submits from a form. | |
85 | Such a handler might include a procedure like this: | |
86 | ||
87 | @example | |
88 | ;; First, a helper procedure | |
89 | (define (para . contents) | |
90 | (string-append "<p>" (string-concatenate contents) "</p>")) | |
91 | ||
92 | ;; Now the meat of our simple web application | |
93 | (define (you-said text) | |
94 | (para "You said: " text)) | |
95 | ||
96 | (display (you-said "Hi!")) | |
97 | @print{} <p>You said: Hi!</p> | |
98 | @end example | |
99 | ||
100 | This is a perfectly valid implementation, provided that the incoming | |
101 | text does not contain the special HTML characters @samp{<}, @samp{>}, or | |
102 | @samp{&}. But this provision of a restricted character set is not | |
103 | reflected anywhere in the program itself: we must @emph{assume} that the | |
104 | programmer understands this, and performs the check elsewhere. | |
105 | ||
106 | Unfortunately, the short history of the practice of programming does not | |
107 | bear out this assumption. A @dfn{cross-site scripting} (@acronym{XSS}) | |
108 | vulnerability is just such a common error in which unfiltered user input | |
109 | is allowed into the output. A user could submit a crafted comment to | |
110 | your web site which results in visitors running malicious JavaScript, | |
111 | within the security context of your domain: | |
112 | ||
113 | @example | |
114 | (display (you-said "<script src=\"http://bad.com/nasty.js\" />")) | |
115 | @print{} <p>You said: <script src="http://bad.com/nasty.js" /></p> | |
116 | @end example | |
117 | ||
118 | The fundamental problem here is that both user data and the program | |
119 | template are represented using strings. This identity means that types | |
120 | can't help the programmer to make a distinction between these two, so | |
121 | they get confused. | |
122 | ||
123 | There are a number of possible solutions, but perhaps the best is to | |
124 | treat HTML not as strings, but as native s-expressions: as SXML. The | |
125 | basic idea is that HTML is either text, represented by a string, or an | |
126 | element, represented as a tagged list. So @samp{foo} becomes | |
127 | @samp{"foo"}, and @samp{<b>foo</b>} becomes @samp{(b "foo")}. | |
128 | Attributes, if present, go in a tagged list headed by @samp{@@}, like | |
129 | @samp{(img (@@ (src "http://example.com/foo.png")))}. @xref{sxml | |
130 | simple}, for more information. | |
131 | ||
132 | The good thing about SXML is that HTML elements cannot be confused with | |
133 | text. Let's make a new definition of @code{para}: | |
134 | ||
135 | @example | |
136 | (define (para . contents) | |
137 | `(p ,@@contents)) | |
138 | ||
139 | (use-modules (sxml simple)) | |
140 | (sxml->xml (you-said "Hi!")) | |
141 | @print{} <p>You said: Hi!</p> | |
142 | ||
143 | (sxml->xml (you-said "<i>Rats, foiled again!</i>")) | |
144 | @print{} <p>You said: &lt;i&gt;Rats, foiled again!&lt;/i&gt;</p> | |
145 | @end example | |
146 | ||
147 | So we see in the second example that HTML elements cannot be unwittingly | |
569269b4 AW |
148 | introduced into the output. However it is now perfectly acceptable to |
149 | pass SXML to @code{you-said}; in fact, that is the big advantage of SXML | |
150 | over everything-as-a-string. | |
d75a81b1 AW |
151 | |
152 | @example | |
153 | (sxml->xml (you-said (you-said "<Hi!>"))) | |
154 | @print{} <p>You said: <p>You said: &lt;Hi!&gt;</p></p> | |
155 | @end example | |
156 | ||
157 | The SXML types allow procedures to @emph{compose}. The types make | |
158 | manifest which parts are HTML elements, and which are text. So you | |
159 | needn't worry about escaping user input; the type transition back to a | |
160 | string handles that for you. @acronym{XSS} vulnerabilities are a thing | |
161 | of the past. | |
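
As a further sketch of the same idea, the escaping also applies to
attribute values.  The @code{link} helper below is hypothetical and is
not part of Guile; only @code{sxml->xml} comes from the
@code{(sxml simple)} module, and the URL is a placeholder.

@example
(use-modules (sxml simple))

;; Hypothetical helper: build an anchor element as SXML.
(define (link url text)
  `(a (@@ (href ,url)) ,text))

(sxml->xml (link "http://example.com/?a=1&b=2" "Q&A"))
@print{} <a href="http://example.com/?a=1&amp;b=2">Q&amp;A</a>
@end example

Neither the attribute value nor the link text needed manual escaping;
both are plain strings until the final conversion.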
162 | ||
163 | Well. That's all very nice and opinionated and such, but how do I use | |
164 | the thing? Read on! | |
165 | ||
8db7e094 AW |
166 | @node URIs |
167 | @subsection Uniform Resource Identifiers | |
168 | ||
299cd1a2 AW |
169 | Guile provides a standard data type for Uniform Resource Identifiers |
170 | (URIs), as defined in RFC 3986. | |
8db7e094 | 171 | |
299cd1a2 | 172 | The generic URI syntax is as follows: |
8db7e094 | 173 | |
299cd1a2 | 174 | @example |
ac7f17e3 | 175 | URI := scheme ":" ["//" [userinfo "@@"] host [":" port]] path \ |
299cd1a2 AW |
176 | [ "?" query ] [ "#" fragment ] |
177 | @end example | |
8db7e094 | 178 | |
b3f94448 AW |
179 | For example, in the URI, @indicateurl{http://www.gnu.org/help/}, the |
180 | scheme is @code{http}, the host is @code{www.gnu.org}, the path is | |
181 | @code{/help/}, and there is no userinfo, port, query, or fragment. All URIs | |
182 | have a scheme and a path (though the path might be empty). Some URIs | |
183 | have a host, and some of those have ports and userinfo. Any URI might | |
184 | have a query part or a fragment. | |
8db7e094 | 185 | |
299cd1a2 | 186 | Userinfo is something of an abstraction, as some legacy URI schemes |
b3f94448 AW |
187 | allowed userinfo of the form @code{@var{username}:@var{passwd}}. But |
188 | since passwords do not belong in URIs, the RFC does not want to condone | |
189 | this practice, so it calls anything before the @code{@@} sign | |
299cd1a2 | 190 | @dfn{userinfo}. |
8db7e094 | 191 | |
b3f94448 AW |
192 | Properly speaking, a fragment is not part of a URI. For example, when a |
193 | web browser follows a link to @indicateurl{http://example.com/#foo}, it | |
194 | sends a request for @indicateurl{http://example.com/}, then looks in the | |
195 | resulting page for the fragment identified by @code{foo}. A | |
196 | fragment identifies a part of a resource, not the resource itself. But | |
197 | it is useful to have a fragment field in the URI record itself, so we | |
198 | hope you will forgive the inconsistency. | |
8db7e094 | 199 | |
299cd1a2 AW |
200 | @example |
201 | (use-modules (web uri)) | |
202 | @end example | |
8db7e094 | 203 | |
299cd1a2 AW |
204 | The following procedures can be found in the @code{(web uri)} |
205 | module. Load it into your Guile, using a form like the above, to have | |
206 | access to them. | |
8db7e094 | 207 | |
2e6f5ea4 | 208 | @deffn {Scheme Procedure} build-uri scheme [#:userinfo=@code{#f}] [#:host=@code{#f}] @ |
569269b4 AW |
209 | [#:port=@code{#f}] [#:path=@code{""}] [#:query=@code{#f}] @ |
210 | [#:fragment=@code{#f}] [#:validate?=@code{#t}] | |
211 | Construct a URI object. @var{scheme} should be a symbol, @var{port} | |
212 | an exact integer or @code{#f}, and the rest of the fields either strings or @code{#f}. If @var{validate?} is | |
213 | true, also run some consistency checks to make sure that the constructed | |
214 | URI is valid. | |
2e6f5ea4 AW |
215 | @end deffn |
216 | ||
217 | @deffn {Scheme Procedure} uri? x | |
218 | @deffnx {Scheme Procedure} uri-scheme uri | |
219 | @deffnx {Scheme Procedure} uri-userinfo uri | |
220 | @deffnx {Scheme Procedure} uri-host uri | |
221 | @deffnx {Scheme Procedure} uri-port uri | |
222 | @deffnx {Scheme Procedure} uri-path uri | |
223 | @deffnx {Scheme Procedure} uri-query uri | |
224 | @deffnx {Scheme Procedure} uri-fragment uri | |
569269b4 AW |
225 | A predicate and field accessors for the URI record type. The URI scheme |
226 | will be a symbol, the port an exact integer or @code{#f}, and the rest either strings or @code{#f} if not | |
227 | present. | |
2e6f5ea4 | 228 | @end deffn |
299cd1a2 | 229 | |
2e6f5ea4 | 230 | @deffn {Scheme Procedure} string->uri string |
569269b4 AW |
231 | Parse @var{string} into a URI object. Return @code{#f} if the string |
232 | could not be parsed. | |
2e6f5ea4 | 233 | @end deffn |
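
For instance, parsing the example URI from the start of this subsection
and picking it apart might look like the following sketch.  The variable
name @code{u} is arbitrary.

@example
(use-modules (web uri))

(define u (string->uri "http://www.gnu.org/help/"))

(uri? u)         @result{} #t
(uri-scheme u)   @result{} http
(uri-host u)     @result{} "www.gnu.org"
(uri-path u)     @result{} "/help/"
(uri-port u)     @result{} #f
(uri-query u)    @result{} #f
(uri-fragment u) @result{} #f

;; A string that is not a URI yields #f rather than an error.
(string->uri "not a uri")
@result{} #f
@end example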
8db7e094 | 234 | |
2e6f5ea4 | 235 | @deffn {Scheme Procedure} uri->string uri |
569269b4 AW |
236 | Serialize @var{uri} to a string. If the URI has a port that is the |
237 | default port for its scheme, the port is not included in the | |
238 | serialization. | |
2e6f5ea4 | 239 | @end deffn |
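
A short sketch of going the other way, from fields to a string.  The
host name is a placeholder, and the results assume that port 80 is
registered as the default for @code{http}, as it should be in a stock
Guile.

@example
(use-modules (web uri))

(uri->string (build-uri 'http #:host "example.com" #:path "/foo"
                        #:query "a=1"))
@result{} "http://example.com/foo?a=1"

;; The default port for the scheme is dropped ...
(uri->string (build-uri 'http #:host "example.com" #:port 80
                        #:path "/"))
@result{} "http://example.com/"

;; ... but any other port is kept.
(uri->string (build-uri 'http #:host "example.com" #:port 8080
                        #:path "/"))
@result{} "http://example.com:8080/"
@end example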
8db7e094 | 240 | |
2e6f5ea4 | 241 | @deffn {Scheme Procedure} declare-default-port! scheme port |
569269b4 | 242 | Declare a default port for the given URI scheme. |
2e6f5ea4 | 243 | @end deffn |
8db7e094 | 244 | |
2e6f5ea4 | 245 | @deffn {Scheme Procedure} uri-decode str [#:encoding=@code{"utf-8"}] |
569269b4 AW |
246 | Percent-decode the given @var{str}, according to @var{encoding}, which |
247 | should be the name of a character encoding. | |
8db7e094 AW |
248 | |
249 | Note that this function should not generally be applied to a full URI | |
250 | string. For paths, use @code{split-and-decode-uri-path} instead. For query | |
251 | strings, split the query on @code{&} and @code{=} boundaries, and decode | |
252 | the components separately. | |
253 | ||
569269b4 AW |
254 | Note also that percent-encoded strings encode @emph{bytes}, not |
255 | characters. There is no guarantee that a given byte sequence is a valid | |
256 | string encoding. Therefore this routine may signal an error if the | |
257 | decoded bytes are not valid for the given encoding. Pass @code{#f} for | |
258 | @var{encoding} if you want decoded bytes as a bytevector directly. | |
259 | @xref{Ports, @code{set-port-encoding!}}, for more information on | |
260 | character encodings. | |
261 | ||
262 | Returns a string of the decoded characters, or a bytevector if | |
263 | @var{encoding} was @code{#f}. | |
2e6f5ea4 | 264 | @end deffn |
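
The following sketch illustrates the notes in the description of
@code{uri-decode}.  @code{decode-query} is a hypothetical helper, not a
procedure provided by Guile, and it ignores details such as repeated
keys.

@example
(use-modules (web uri))

(uri-decode "bar%20baz")
@result{} "bar baz"

;; Ask for the raw bytes instead of a string.
(uri-decode "%C3%A9" #:encoding #f)
@result{} #vu8(195 169)

;; Hypothetical helper: split the query first, then decode each
;; component separately.
(define (decode-query query)
  (map (lambda (kv)
         (let ((i (string-index kv #\=)))
           (if i
               (cons (uri-decode (substring kv 0 i))
                     (uri-decode (substring kv (+ i 1))))
               (cons (uri-decode kv) ""))))
       (string-split query #\&)))

(decode-query "q=scrambled%20eggs&side=biscuits%26gravy")
@result{} (("q" . "scrambled eggs") ("side" . "biscuits&gravy"))
@end example

Decoding before splitting would have turned @samp{%26} into a literal
@samp{&} and broken the second pair in two, which is why the split
comes first.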
8db7e094 | 265 | |
569269b4 AW |
266 | Fixme: clarify return type. indicate default values. type of |
267 | unescaped-chars. | |
8db7e094 | 268 | |
2e6f5ea4 | 269 | @deffn {Scheme Procedure} uri-encode str [#:encoding=@code{"utf-8"}] [#:unescaped-chars] |
569269b4 AW |
270 | Percent-encode any character not in the character set, |
271 | @var{unescaped-chars}. | |
272 | ||
273 | The default character set includes alphanumerics from ASCII, as well as | |
274 | the special characters @samp{-}, @samp{.}, @samp{_}, and @samp{~}. Any | |
275 | other character will be percent-encoded, by writing out the character to | |
276 | a bytevector within the given @var{encoding}, then encoding each byte as | |
8db7e094 AW |
277 | @code{%@var{HH}}, where @var{HH} is the hexadecimal representation of |
278 | the byte. | |
2e6f5ea4 | 279 | @end deffn |
8db7e094 | 280 | |
2e6f5ea4 | 281 | @deffn {Scheme Procedure} split-and-decode-uri-path path |
8db7e094 AW |
282 | Split @var{path} into its components, and decode each component, |
283 | removing empty components. | |
284 | ||
569269b4 AW |
285 | For example, @code{"/foo/bar%20baz/"} decodes to the two-element list, |
286 | @code{("foo" "bar baz")}. | |
2e6f5ea4 | 287 | @end deffn |
8db7e094 | 288 | |
2e6f5ea4 | 289 | @deffn {Scheme Procedure} encode-and-join-uri-path parts |
8db7e094 AW |
290 | URI-encode each element of @var{parts}, which should be a list of |
291 | strings, and join the parts together with @code{/} as a delimiter. | |
569269b4 AW |
292 | |
293 | For example, the list @code{("scrambled eggs" "biscuits&gravy")} encodes | |
294 | as @code{"scrambled%20eggs/biscuits%26gravy"}. | |
2e6f5ea4 | 295 | @end deffn |
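
Put together, the encoding procedures and the path helpers can be tried
at the REPL as follows.  This is only a sketch that restates the
examples above as code and adds a round trip.

@example
(use-modules (web uri))

(uri-encode "biscuits&gravy")
@result{} "biscuits%26gravy"

(split-and-decode-uri-path "/foo/bar%20baz/")
@result{} ("foo" "bar baz")

(encode-and-join-uri-path '("scrambled eggs" "biscuits&gravy"))
@result{} "scrambled%20eggs/biscuits%26gravy"

;; Splitting the joined path gives back the original components.
(split-and-decode-uri-path
 (encode-and-join-uri-path '("scrambled eggs" "biscuits&gravy")))
@result{} ("scrambled eggs" "biscuits&gravy")
@end example

Note that the joined result has no leading @samp{/}; prepend one
yourself if you need an absolute path.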
8db7e094 AW |
296 | |
297 | @node HTTP | |
298 | @subsection The Hyper-Text Transfer Protocol | |
299 | ||
299cd1a2 AW |
300 | The initial motivation for including web functionality in Guile, rather |
301 | than rely on an external package, was to establish a standard base on | |
302 | which people can share code. To that end, we continue the focus on data | |
303 | types by providing a number of low-level parsers and unparsers for | |
304 | elements of the HTTP protocol. | |
305 | ||
306 | If you want to skip the low-level details for now and move on to web |
ec811439 AW |
307 | pages, @pxref{Web Client}, and @pxref{Web Server}. Otherwise, load the |
308 | HTTP module, and read on. | |
299cd1a2 | 309 | |
8db7e094 AW |
310 | @example |
311 | (use-modules (web http)) | |
312 | @end example | |
313 | ||
299cd1a2 AW |
314 | The focus of the @code{(web http)} module is to parse and unparse |
315 | standard HTTP headers, representing them to Guile as native data | |
316 | structures. For example, a @code{Date:} header will be represented as a | |
317 | SRFI-19 date record (@pxref{SRFI-19}), rather than as a string. | |
318 | ||
319 | Guile tries to follow RFCs fairly strictly---the road to perdition being | |
320 | paved with compatibility hacks---though some allowances are made for | |
321 | not-too-divergent texts. | |
322 | ||
32de1aa7 AW |
323 | Header names are represented as lower-case symbols. |
324 | ||
2e6f5ea4 | 325 | @deffn {Scheme Procedure} string->header name |
32de1aa7 | 326 | Parse @var{name} to a symbolic header name. |
2e6f5ea4 | 327 | @end deffn |
8db7e094 | 328 | |
2e6f5ea4 | 329 | @deffn {Scheme Procedure} header->string sym |
32de1aa7 | 330 | Return the string form for the header named @var{sym}. |
2e6f5ea4 | 331 | @end deffn |
32de1aa7 AW |
332 | |
333 | For example: | |
334 | ||
335 | @example | |
336 | (string->header "Content-Length") | |
337 | @result{} content-length | |
338 | (header->string 'content-length) | |
339 | @result{} "Content-Length" | |
340 | ||
341 | (string->header "FOO") | |
342 | @result{} foo | |
5ec48b70 | 343 | (header->string 'foo) |
32de1aa7 AW |
344 | @result{} "Foo" |
345 | @end example | |
346 | ||
347 | Guile keeps a registry of known headers, their string names, and some | |
348 | parsing and serialization procedures. If a header is unknown, its | |
349 | string name is simply its symbol name in title-case. | |
350 | ||
2e6f5ea4 | 351 | @deffn {Scheme Procedure} known-header? sym |
32de1aa7 AW |
352 | Return @code{#t} iff @var{sym} is a known header, with associated |
353 | parsers and serialization procedures. | |
2e6f5ea4 | 354 | @end deffn |
32de1aa7 | 355 | |
2e6f5ea4 | 356 | @deffn {Scheme Procedure} header-parser sym |
32de1aa7 AW |
357 | Return the value parser for headers named @var{sym}. The result is a |
358 | procedure that takes one argument, a string, and returns the parsed | |
359 | value. If the header isn't known to Guile, a default parser is returned | |
360 | that passes through the string unchanged. | |
2e6f5ea4 | 361 | @end deffn |
32de1aa7 | 362 | |
2e6f5ea4 | 363 | @deffn {Scheme Procedure} header-validator sym |
32de1aa7 AW |
364 | Return a predicate which returns @code{#t} if the given value is valid |
365 | for headers named @var{sym}. The default validator for unknown headers | |
366 | is @code{string?}. | |
2e6f5ea4 | 367 | @end deffn |
32de1aa7 | 368 | |
2e6f5ea4 | 369 | @deffn {Scheme Procedure} header-writer sym |
32de1aa7 AW |
370 | Return a procedure that writes values for headers named @var{sym} to a |
371 | port. The resulting procedure takes two arguments: a value and a port. | |
372 | The default writer is @code{display}. | |
2e6f5ea4 | 373 | @end deffn |
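
As a rough illustration of the registry, the sketch below compares a
header that Guile knows about with one that it does not.  The name
@code{x-made-up} is invented for this example and is assumed not to be
registered.

@example
(use-modules (web http))

(known-header? 'content-length)  @result{} #t
(known-header? 'x-made-up)       @result{} #f

;; A known header gets a typed value; an unknown one stays a string.
((header-parser 'content-length) "300")   @result{} 300
((header-parser 'x-made-up) "anything")   @result{} "anything"

((header-validator 'content-length) 300)  @result{} #t
((header-validator 'x-made-up) 300)       @result{} #f

((header-writer 'content-length) 300 (current-output-port))
@print{} 300
@end example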
32de1aa7 AW |
374 | |
375 | For more on the set of headers that Guile knows about out of the box, | |
376 | @pxref{HTTP Headers}. To add your own, use the @code{declare-header!} | |
377 | procedure: | |
378 | ||
2e6f5ea4 | 379 | @deffn {Scheme Procedure} declare-header! name parser validator writer [#:multiple?=@code{#f}] |
32de1aa7 | 380 | Declare a parser, validator, and writer for a given header. |
2e6f5ea4 | 381 | @end deffn |
8db7e094 | 382 | |
929ccf48 AW |
383 | For example, let's say you are running a web server behind some sort of |
384 | proxy, and your proxy adds an @code{X-Client-Address} header, indicating | |
385 | the IPv4 address of the original client. You would like for the HTTP | |
386 | request record to parse out this header to a Scheme value, instead of | |
387 | leaving it as a string. You could register this header with Guile's | |
388 | HTTP stack like this: | |
389 | ||
390 | @example | |
32de1aa7 AW |
391 | (declare-header! "X-Client-Address" |
392 | (lambda (str) | |
393 | (inet-aton str)) | |
394 | (lambda (ip) | |
395 | (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff))) | |
396 | (lambda (ip port) | |
397 | (display (inet-ntoa ip) port))) | |
929ccf48 AW |
398 | @end example |
399 | ||
2e6f5ea4 | 400 | @deffn {Scheme Procedure} valid-header? sym val |
929ccf48 AW |
401 | Return a true value iff @var{val} is a valid Scheme value for the header |
402 | with name @var{sym}. | |
2e6f5ea4 | 403 | @end deffn |
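
Once the @code{X-Client-Address} header from the example above has been
declared, it behaves like any built-in header.  The sketch below repeats
that registration so that it can be run on its own; the address is just
the loopback address, chosen for illustration.

@example
(use-modules (web http))

;; Same registration as in the example above.
(declare-header! "X-Client-Address"
  (lambda (str)
    (inet-aton str))
  (lambda (ip)
    (and (integer? ip) (exact? ip) (<= 0 ip #xffffffff)))
  (lambda (ip port)
    (display (inet-ntoa ip) port)))

(parse-header 'x-client-address "127.0.0.1")
@result{} 2130706433

(valid-header? 'x-client-address 2130706433)
@result{} #t

;; The unparsed string is not a valid value for this header.
(valid-header? 'x-client-address "127.0.0.1")
@result{} #f
@end example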
8db7e094 | 404 | |
299cd1a2 AW |
405 | Now that we have a generic interface for reading and writing headers, we |
406 | do just that. | |
407 | ||
2e6f5ea4 | 408 | @deffn {Scheme Procedure} read-header port |
929ccf48 | 409 | Read one HTTP header from @var{port}. Return two values: the header |
8db7e094 AW |
410 | name and the parsed Scheme value. May raise an exception if the header |
411 | was known but the value was invalid. | |
412 | ||
929ccf48 AW |
413 | Returns the end-of-file object for both values if the end of the message |
414 | headers was reached (i.e., a blank line). | |
2e6f5ea4 | 415 | @end deffn |
8db7e094 | 416 | |
2e6f5ea4 | 417 | @deffn {Scheme Procedure} parse-header name val |
8db7e094 | 418 | Parse @var{val}, a string, with the parser for the header named |
32de1aa7 | 419 | @var{name}. Returns the parsed value. |
2e6f5ea4 | 420 | @end deffn |
8db7e094 | 421 | |
2e6f5ea4 | 422 | @deffn {Scheme Procedure} write-header name val port |
32de1aa7 AW |
423 | Write the given header name and value to @var{port}, using the writer |
424 | from @code{header-writer}. | |
2e6f5ea4 | 425 | @end deffn |
8db7e094 | 426 | |
2e6f5ea4 | 427 | @deffn {Scheme Procedure} read-headers port |
929ccf48 AW |
428 | Read the headers of an HTTP message from @var{port}, returning the |
429 | headers as an ordered alist. | |
2e6f5ea4 | 430 | @end deffn |
8db7e094 | 431 | |
2e6f5ea4 | 432 | @deffn {Scheme Procedure} write-headers headers port |
8db7e094 | 433 | Write the given header alist to @var{port}. Doesn't write the final |
32de1aa7 | 434 | @samp{\r\n}, as the user might want to add another header. |
2e6f5ea4 | 435 | @end deffn |
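
These procedures work on any port, so a string port is enough to
experiment with them.  The following is a minimal sketch; the header
block is made up for the example.

@example
(use-modules (web http))

(define headers
  (call-with-input-string
      "Content-Length: 300\r\nConnection: close\r\n\r\n"
    read-headers))

headers
@result{} ((content-length . 300) (connection close))

;; Write the headers back out, then add the terminating blank
;; line ourselves, since write-headers leaves it off.
(call-with-output-string
  (lambda (port)
    (write-headers headers port)
    (display "\r\n" port)))
@result{} "Content-Length: 300\r\nConnection: close\r\n\r\n"
@end example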
8db7e094 | 436 | |
299cd1a2 AW |
437 | The @code{(web http)} module also has some utility procedures to read |
438 | and write request and response lines. | |
439 | ||
2e6f5ea4 | 440 | @deffn {Scheme Procedure} parse-http-method str [start] [end] |
8db7e094 AW |
441 | Parse an HTTP method from @var{str}. The result is an upper-case symbol, |
442 | like @code{GET}. | |
2e6f5ea4 | 443 | @end deffn |
8db7e094 | 444 | |
2e6f5ea4 | 445 | @deffn {Scheme Procedure} parse-http-version str [start] [end] |
8db7e094 AW |
446 | Parse an HTTP version from @var{str}, returning it as a major-minor |
447 | pair. For example, @code{HTTP/1.1} parses as the pair of integers, | |
448 | @code{(1 . 1)}. | |
2e6f5ea4 | 449 | @end deffn |
8db7e094 | 450 | |
2e6f5ea4 | 451 | @deffn {Scheme Procedure} parse-request-uri str [start] [end] |
8db7e094 AW |
452 | Parse a URI from an HTTP request line. Note that URIs in requests do not |
453 | have to have a scheme or host name. The result is a URI object. | |
2e6f5ea4 | 454 | @end deffn |
8db7e094 | 455 | |
2e6f5ea4 | 456 | @deffn {Scheme Procedure} read-request-line port |
8db7e094 AW |
457 | Read the first line of an HTTP request from @var{port}, returning three |
458 | values: the method, the URI, and the version. | |
2e6f5ea4 | 459 | @end deffn |
8db7e094 | 460 | |
2e6f5ea4 | 461 | @deffn {Scheme Procedure} write-request-line method uri version port |
8db7e094 | 462 | Write the first line of an HTTP request to @var{port}. |
2e6f5ea4 | 463 | @end deffn |
8db7e094 | 464 | |
2e6f5ea4 | 465 | @deffn {Scheme Procedure} read-response-line port |
8db7e094 AW |
466 | Read the first line of an HTTP response from @var{port}, returning three |
467 | values: the HTTP version, the response code, and the ``reason phrase''. | |
2e6f5ea4 | 468 | @end deffn |
8db7e094 | 469 | |
2e6f5ea4 | 470 | @deffn {Scheme Procedure} write-response-line version code reason-phrase port |
8db7e094 | 471 | Write the first line of an HTTP response to @var{port}. |
2e6f5ea4 | 472 | @end deffn |
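
A short sketch of these utilities at work, again using string ports.
The request line is invented for the example.

@example
(use-modules (web http) (web uri))

(parse-http-method "GET")        @result{} GET
(parse-http-version "HTTP/1.1")  @result{} (1 . 1)

;; read-request-line returns three values; collect them in a list.
(call-with-input-string "GET /help/ HTTP/1.1\r\n"
  (lambda (port)
    (call-with-values (lambda () (read-request-line port))
      (lambda (method uri version)
        (list method (uri-path uri) version)))))
@result{} (GET "/help/" (1 . 1))

(call-with-output-string
  (lambda (port)
    (write-response-line '(1 . 1) 200 "OK" port)))
@result{} "HTTP/1.1 200 OK\r\n"
@end example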
8db7e094 AW |
473 | |
474 | ||
1148d029 AW |
475 | @node HTTP Headers |
476 | @subsection HTTP Headers | |
477 | ||
ff8339db AW |
478 | In addition to defining the infrastructure to parse headers, the |
479 | @code{(web http)} module defines specific parsers and unparsers for all | |
480 | headers defined in the HTTP/1.1 standard. | |
1148d029 | 481 | |
ff8339db AW |
482 | For example, if you receive a header named @samp{Accept-Language} with a |
483 | value @samp{en, es;q=0.8}, Guile parses it as a quality list (defined | |
484 | below): | |
485 | ||
486 | @example | |
487 | (parse-header 'accept-language "en, es;q=0.8") | |
488 | @result{} ((1000 . "en") (800 . "es")) | |
489 | @end example | |
490 | ||
491 | The format of the value for @samp{Accept-Language} headers is defined | |
492 | below, along with all other headers defined in the HTTP standard. (If | |
493 | the header were unknown, the value would have been returned as a | |
494 | string.) | |
495 | ||
496 | For brevity, the header definitions below are given in the form, | |
497 | @var{Type} @code{@var{name}}, indicating that values for the header | |
498 | @code{@var{name}} will be of the given @var{Type}. Since Guile | |
499 | internally treats header names in lower case, in this document we give | |
500 | types title-cased names. A short description of each header's | |
501 | purpose and an example follow. | |
502 | ||
503 | For full details on the meanings of all of these headers, see the HTTP | |
504 | 1.1 standard, RFC 2616. | |
505 | ||
506 | @subsubsection HTTP Header Types | |
507 | ||
508 | Here we define the types that are used below, when defining headers. | |
509 | ||
510 | @deftp {HTTP Header Type} Date | |
511 | A SRFI-19 date. | |
512 | @end deftp | |
513 | ||
514 | @deftp {HTTP Header Type} KVList | |
515 | A list whose elements are keys or key-value pairs. Keys are parsed to | |
516 | symbols. Values are strings by default. Non-string values are the | |
517 | exception, and are mentioned explicitly below, as appropriate. | |
518 | @end deftp | |
519 | ||
520 | @deftp {HTTP Header Type} SList | |
521 | A list of strings. | |
522 | @end deftp | |
523 | ||
524 | @deftp {HTTP Header Type} Quality | |
525 | An exact integer between 0 and 1000. Qualities are used to express | |
526 | preference, given multiple options. An option with a quality of 870, | |
527 | for example, is preferred over an option with quality 500. | |
528 | ||
529 | (Qualities are written out over the wire as numbers between 0.0 and | |
530 | 1.0, but since the standard only allows three digits after the decimal, | |
531 | it's equivalent to integers between 0 and 1000, so that's what Guile | |
532 | uses.) | |
533 | @end deftp | |
534 | ||
535 | @deftp {HTTP Header Type} QList | |
536 | A quality list: a list of pairs, the car of which is a quality, and the | |
537 | cdr a string. Used to express a list of options, along with their | |
538 | qualities. | |
539 | @end deftp | |
540 | ||
541 | @deftp {HTTP Header Type} ETag | |
542 | An entity tag, represented as a pair. The car of the pair is an opaque | |
543 | string, and the cdr is @code{#t} if the entity tag is a ``strong'' entity | |
544 | tag, and @code{#f} otherwise. | |
545 | @end deftp | |
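
Because these types are ordinary Scheme data, they can be processed with
ordinary Scheme procedures.  As a sketch, the hypothetical helper
@code{preferred-option} below (not part of Guile) picks the entry with
the highest quality from a quality list.

@example
(use-modules (web http))

;; Hypothetical helper: return the option with the highest
;; quality, or #f if the list is empty.
(define (preferred-option qlist)
  (and (pair? qlist)
       (cdr (car (sort qlist
                       (lambda (a b) (> (car a) (car b))))))))

(preferred-option (parse-header 'accept-language "es;q=0.8, en"))
@result{} "en"
@end example

No string parsing is needed at this point: the qualities are already
exact integers, so a plain numeric comparison suffices.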
1148d029 AW |
546 | |
547 | @subsubsection General Headers | |
548 | ||
ff8339db AW |
549 | General HTTP headers may be present in any HTTP message. |
550 | ||
551 | @deftypevr {HTTP Header} KVList cache-control | |
552 | A key-value list of cache-control directives. See RFC 2616 for more | |
553 | details. | |
1148d029 AW |
554 | |
555 | If present, parameters to @code{max-age}, @code{max-stale}, | |
556 | @code{min-fresh}, and @code{s-maxage} are all parsed as non-negative | |
557 | integers. | |
558 | ||
559 | If present, parameters to @code{private} and @code{no-cache} are parsed | |
ff8339db | 560 | as lists of header names, as symbols. |
1148d029 | 561 | |
ff8339db AW |
562 | @example |
563 | (parse-header 'cache-control "no-cache,no-store") |
564 | @result{} (no-cache no-store) | |
565 | (parse-header 'cache-control "no-cache=\"Authorization,Date\",no-store") | |
566 | @result{} ((no-cache . (authorization date)) no-store) | |
567 | (parse-header 'cache-control "no-cache=\"Authorization,Date\",max-age=10") | |
568 | @result{} ((no-cache . (authorization date)) (max-age . 10)) | |
569 | @end example | |
570 | @end deftypevr | |
1148d029 | 571 | |
ff8339db AW |
572 | @deftypevr {HTTP Header} List connection |
573 | A list of header names that apply only to this HTTP connection, as | |
574 | symbols. Additionally, the symbol @samp{close} may be present, to | |
575 | indicate that the server should close the connection after responding to | |
576 | the request. | |
577 | @example | |
578 | (parse-header 'connection "close") | |
579 | @result{} (close) | |
580 | @end example | |
581 | @end deftypevr | |
1148d029 | 582 | |
ff8339db AW |
583 | @deftypevr {HTTP Header} Date date |
584 | The date that a given HTTP message was originated. | |
585 | @example | |
586 | (parse-header 'date "Tue, 15 Nov 1994 08:12:31 GMT") | |
587 | @result{} #<date ...> | |
588 | @end example | |
589 | @end deftypevr | |
1148d029 | 590 | |
ff8339db AW |
591 | @deftypevr {HTTP Header} KVList pragma |
592 | A key-value list of implementation-specific directives. | |
593 | @example | |
594 | (parse-header 'pragma "no-cache, broccoli=tasty") | |
595 | @result{} (no-cache (broccoli . "tasty")) | |
596 | @end example | |
597 | @end deftypevr | |
1148d029 | 598 | |
ff8339db AW |
599 | @deftypevr {HTTP Header} List trailer |
600 | A list of header names which will appear after the message body, instead | |
601 | of with the message headers. | |
602 | @example | |
603 | (parse-header 'trailer "ETag") | |
604 | @result{} (etag) | |
605 | @end example | |
606 | @end deftypevr | |
1148d029 | 607 | |
ff8339db AW |
608 | @deftypevr {HTTP Header} List transfer-encoding |
609 | A list of transfer codings, expressed as key-value lists. The only | |
610 | transfer coding defined by the specification is @code{chunked}. | |
611 | @example | |
612 | (parse-header 'transfer-encoding "chunked") | |
07154109 | 613 | @result{} ((chunked)) |
ff8339db AW |
614 | @end example |
615 | @end deftypevr | |
1148d029 | 616 | |
ff8339db AW |
617 | @deftypevr {HTTP Header} List upgrade |
618 | A list of strings, indicating additional protocols that a server could use | |
619 | in response to a request. | |
620 | @example | |
621 | (parse-header 'upgrade "WebSocket") | |
622 | @result{} ("WebSocket") | |
623 | @end example | |
624 | @end deftypevr | |
1148d029 | 625 | |
ff8339db AW |
626 | FIXME: parse out more fully? |
627 | @deftypevr {HTTP Header} List via | |
628 | A list of strings, indicating the protocol versions and hosts of | |
629 | intermediate servers and proxies. There may be multiple @code{via} | |
630 | headers in one message. | |
631 | @example | |
632 | (parse-header 'via "1.0 venus, 1.1 mars") | |
633 | @result{} ("1.0 venus" "1.1 mars") | |
634 | @end example | |
635 | @end deftypevr | |
636 | ||
637 | @deftypevr {HTTP Header} List warning | |
638 | A list of warnings given by a server or intermediate proxy. Each | |
639 | warning is itself a list of four elements: a code, as an exact integer | |
640 | between 0 and 1000, a host as a string, the warning text as a string, | |
641 | and either @code{#f} or a SRFI-19 date. | |
1148d029 AW |
642 | |
643 | There may be multiple @code{warning} headers in one message. | |
ff8339db AW |
644 | @example |
645 | (parse-header 'warning "123 foo \"core breach imminent\"") | |
646 | @result{} ((123 "foo" "core breach imminent" #f)) | |
647 | @end example | |
648 | @end deftypevr | |
1148d029 AW |
649 | |
650 | ||
651 | @subsubsection Entity Headers | |
652 | ||
ff8339db AW |
653 | Entity headers may be present in any HTTP message, and refer to the |
654 | resource referenced in the HTTP request or response. | |
1148d029 | 655 | |
ff8339db AW |
656 | @deftypevr {HTTP Header} List allow |
657 | A list of allowed methods on a given resource, as symbols. | |
658 | @example | |
659 | (parse-header 'allow "GET, HEAD") | |
660 | @result{} (GET HEAD) | |
661 | @end example | |
662 | @end deftypevr | |
1148d029 | 663 | |
ff8339db AW |
664 | @deftypevr {HTTP Header} List content-encoding |
665 | A list of content codings, as symbols. | |
666 | @example | |
667 | (parse-header 'content-encoding "gzip") | |
668 | @result{} (gzip) | |
669 | @end example | |
670 | @end deftypevr | |
1148d029 | 671 | |
ff8339db AW |
672 | @deftypevr {HTTP Header} List content-language |
673 | The languages that a resource is in, as strings. | |
674 | @example | |
675 | (parse-header 'content-language "en") | |
676 | @result{} ("en") | |
677 | @end example | |
678 | @end deftypevr | |
1148d029 | 679 | |
ff8339db AW |
680 | @deftypevr {HTTP Header} UInt content-length |
681 | The number of bytes in a resource, as an exact, non-negative integer. | |
682 | @example | |
683 | (parse-header 'content-length "300") | |
684 | @result{} 300 | |
685 | @end example | |
686 | @end deftypevr | |
1148d029 | 687 | |
ff8339db AW |
688 | @deftypevr {HTTP Header} URI content-location |
689 | The canonical URI for a resource, in the case that it is also accessible | |
690 | from a different URI. | |
691 | @example | |
692 | (parse-header 'content-location "http://example.com/foo") | |
693 | @result{} #<<uri> ...> | |
694 | @end example | |
695 | @end deftypevr | |
1148d029 | 696 | |
ff8339db AW |
697 | @deftypevr {HTTP Header} String content-md5 |
698 | The MD5 digest of a resource. | |
699 | @example | |
700 | (parse-header 'content-md5 "ffaea1a79810785575e29e2bd45e2fa5") | |
701 | @result{} "ffaea1a79810785575e29e2bd45e2fa5" | |
702 | @end example | |
703 | @end deftypevr | |
704 | ||
705 | @deftypevr {HTTP Header} List content-range | |
706 | A range specification, as a list of three elements: the symbol | |
707 | @code{bytes}, either the symbol @code{*} or a pair of integers, | |
708 | indicating the byte range, and either @code{*} or an integer, for the | |
709 | instance length. Used to indicate that a response only includes part of | |
710 | a resource. | |
711 | @example | |
712 | (parse-header 'content-range "bytes 10-20/*") | |
713 | @result{} (bytes (10 . 20) *) | |
714 | @end example | |
715 | @end deftypevr | |
1148d029 | 716 | |
ff8339db AW |
717 | @deftypevr {HTTP Header} List content-type |
718 | The MIME type of a resource, as a symbol, along with any parameters. | |
719 | @example | |
720 | (parse-header 'content-type "text/plain") | |
721 | @result{} (text/plain) | |
722 | (parse-header 'content-type "text/plain;charset=utf-8") | |
723 | @result{} (text/plain (charset . "utf-8")) | |
724 | @end example | |
725 | Note that the @code{charset} parameter is something of a misnomer, and | |
726 | the HTTP specification admits this. It specifies the @emph{encoding} of | |
727 | the characters, not the character set. | |
728 | @end deftypevr | |
729 | ||
730 | @deftypevr {HTTP Header} Date expires | |
731 | The date/time after which the resource given in a response is considered | |
732 | stale. | |
733 | @example | |
734 | (parse-header 'expires "Tue, 15 Nov 1994 08:12:31 GMT") | |
735 | @result{} #<date ...> | |
736 | @end example | |
737 | @end deftypevr | |
58baff08 | 738 | |
ff8339db AW |
739 | @deftypevr {HTTP Header} Date last-modified |
740 | The date/time on which the resource given in a response was last | |
741 | modified. | |
742 | @example | |
743 | (parse-header 'last-modified "Tue, 15 Nov 1994 08:12:31 GMT") | |
744 | @result{} #<date ...> | |
745 | @end example | |
746 | @end deftypevr | |
1148d029 AW |
747 | |
748 | ||
749 | @subsubsection Request Headers | |
750 | ||
ff8339db | 751 | Request headers may only appear in an HTTP request, not in a response. |
1148d029 | 752 | |
ff8339db AW |
753 | @deftypevr {HTTP Header} List accept |
754 | A list of preferred media types for a response. Each element of the | |
755 | list is itself a list, in the same format as @code{content-type}. | |
756 | @example | |
757 | (parse-header 'accept "text/html,text/plain;charset=utf-8") | |
758 | @result{} ((text/html) (text/plain (charset . "utf-8"))) | |
759 | @end example | |
ecb87335 | 760 | Preference is expressed with quality values: |
ff8339db AW |
761 | @example |
762 | (parse-header 'accept "text/html;q=0.8,text/plain;q=0.6") | |
763 | @result{} ((text/html (q . 800)) (text/plain (q . 600))) | |
764 | @end example | |
765 | @end deftypevr | |
1148d029 | 766 | |
ff8339db AW |
767 | @deftypevr {HTTP Header} QList accept-charset |
768 | A quality list of acceptable charsets. Note again that what HTTP calls | |
769 | a ``charset'' is what Guile calls a ``character encoding''. | |
770 | @example | |
771 | (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8") | |
772 | @result{} ((1000 . "iso-8859-5") (800 . "unicode-1-1")) | |
773 | @end example | |
774 | @end deftypevr | |
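
As a minimal sketch of how a parsed quality list might be consumed, the
following picks the entry with the highest quality value.  The helper
@code{preferred-value} is hypothetical and not part of Guile; it assumes
only the @code{(@var{quality} . @var{value})} pair layout shown above.
@example
(use-modules (web http))

;; Hypothetical helper, not part of (web http): return the value with
;; the highest quality from a parsed quality list, or #f if the list
;; is empty.
(define (preferred-value qlist)
  (and (pair? qlist)
       (cdr (car (sort qlist (lambda (a b) (> (car a) (car b))))))))

(preferred-value
 (parse-header 'accept-charset "iso-8859-5, unicode-1-1;q=0.8"))
@end example
Given the parsed value shown above, this should select
@code{"iso-8859-5"}, the entry with quality 1000.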
1148d029 | 775 | |
ff8339db AW |
776 | @deftypevr {HTTP Header} QList accept-encoding |
777 | A quality list of acceptable content codings. | |
778 | @example | |
779 | (parse-header 'accept-encoding "gzip,identity;q=0.8") | |
780 | @result{} ((1000 . "gzip") (800 . "identity")) | |
781 | @end example | |
782 | @end deftypevr | |
1148d029 | 783 | |
ff8339db AW |
784 | @deftypevr {HTTP Header} QList accept-language |
785 | A quality list of acceptable languages. | |
786 | @example | |
787 | (parse-header 'accept-language "cn,en;q=0.75") | |
788 | @result{} ((1000 . "cn") (750 . "en")) | |
789 | @end example | |
790 | @end deftypevr | |
791 | ||
792 | @deftypevr {HTTP Header} Pair authorization | |
793 | Authorization credentials. The car of the pair indicates the | |
794 | authentication scheme, like @code{basic}. For basic authentication, the | |
795 | cdr of the pair will be the base64-encoded @samp{@var{user}:@var{pass}} | |
796 | string. For other authentication schemes, like @code{digest}, the cdr | |
797 | will be a key-value list of credentials. | |
798 | @example | |
799 | (parse-header 'authorization "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==") | |
800 | @result{} (basic . "QWxhZGRpbjpvcGVuIHNlc2FtZQ==") | |
801 | @end example | |
802 | @end deftypevr | |
1148d029 | 803 | |
ff8339db AW |
804 | @deftypevr {HTTP Header} List expect |
805 | A list of expectations that a client has of a server. The expectations | |
806 | are key-value lists. | |
807 | @example | |
808 | (parse-header 'expect "100-continue") | |
809 | @result{} ((100-continue)) | |
810 | @end example | |
811 | @end deftypevr | |
1148d029 | 812 | |
ff8339db AW |
813 | @deftypevr {HTTP Header} String from |
814 | The email address of a user making an HTTP request. | |
815 | @example | |
816 | (parse-header 'from "bob@@example.com") | |
817 | @result{} "bob@@example.com" | |
818 | @end example | |
819 | @end deftypevr | |
1148d029 | 820 | |
ff8339db AW |
821 | @deftypevr {HTTP Header} Pair host |
822 | The host for the resource being requested, as a hostname-port pair. If | |
823 | no port is given, the port is @code{#f}. | |
824 | @example | |
825 | (parse-header 'host "gnu.org:80") | |
826 | @result{} ("gnu.org" . 80) | |
827 | (parse-header 'host "gnu.org") | |
828 | @result{} ("gnu.org" . #f) | |
829 | @end example | |
830 | @end deftypevr | |
1148d029 | 831 | |
ff8339db AW |
832 | @deftypevr {HTTP Header} *|List if-match |
833 | A set of etags, indicating that the request should proceed if and only | |
834 | if the etag of the resource is in that set. Either the symbol @code{*}, | |
835 | indicating any etag, or a list of entity tags. | |
836 | @example | |
837 | (parse-header 'if-match "*") | |
838 | @result{} * | |
839 | (parse-header 'if-match "\"asdfadf\"") | |
840 | @result{} (("asdfadf" . #t)) | |
841 | (parse-header 'if-match "W/\"asdfadf\"") | |
842 | @result{} (("asdfadf" . #f)) | |
843 | @end example | |
844 | @end deftypevr | |
1148d029 | 845 | |
ff8339db AW |
846 | @deftypevr {HTTP Header} Date if-modified-since |
847 | Indicates that a response should proceed if and only if the resource has | |
848 | been modified since the given date. | |
849 | @example | |
850 | (parse-header 'if-modified-since "Tue, 15 Nov 1994 08:12:31 GMT") | |
851 | @result{} #<date ...> | |
852 | @end example | |
853 | @end deftypevr | |
1148d029 | 854 | |
ff8339db AW |
855 | @deftypevr {HTTP Header} *|List if-none-match |
856 | A set of etags, indicating that the request should proceed if and only | |
857 | if the etag of the resource is not in the set. Either the symbol | |
858 | @code{*}, indicating any etag, or a list of entity tags. | |
859 | @example | |
860 | (parse-header 'if-none-match "*") | |
861 | @result{} * | |
862 | @end example | |
863 | @end deftypevr | |
1148d029 | 864 | |
ff8339db AW |
865 | @deftypevr {HTTP Header} ETag|Date if-range |
866 | Indicates that the range request should proceed if and only if the | |
867 | resource matches a modification date or an etag. Either an entity tag, | |
868 | or a SRFI-19 date. | |
869 | @example | |
870 | (parse-header 'if-range "\"original-etag\"") | |
871 | @result{} ("original-etag" . #t) | |
872 | @end example | |
873 | @end deftypevr | |
1148d029 | 874 | |
ff8339db AW |
875 | @deftypevr {HTTP Header} Date if-unmodified-since |
876 | Indicates that a response should proceed if and only if the resource has | |
877 | not been modified since the given date. | |
878 | @example | |
879 | (parse-header 'if-unmodified-since "Tue, 15 Nov 1994 08:12:31 GMT") | |
880 | @result{} #<date ...> | |
881 | @end example | |
882 | @end deftypevr | |
1148d029 | 883 | |
ff8339db AW |
884 | @deftypevr {HTTP Header} UInt max-forwards |
885 | The maximum number of proxy or gateway hops that a request should be | |
886 | subject to. | |
887 | @example | |
888 | (parse-header 'max-forwards "10") | |
889 | @result{} 10 | |
890 | @end example | |
891 | @end deftypevr | |
1148d029 | 892 | |
ff8339db AW |
893 | @deftypevr {HTTP Header} Pair proxy-authorization |
894 | Authorization credentials for a proxy connection. See the documentation | |
895 | for @code{authorization} above for more information on the format. | |
896 | @example | |
897 | (parse-header 'proxy-authorization "Digest foo=bar,baz=qux") | |
898 | @result{} (digest (foo . "bar") (baz . "qux")) | |
899 | @end example | |
900 | @end deftypevr | |
901 | ||
902 | @deftypevr {HTTP Header} Pair range | |
903 | A range request, indicating that the client wants only part of a | |
904 | resource. The car of the pair is the symbol @code{bytes}, and the cdr | |
905 | is a list of pairs. Each element of the cdr indicates a range; the car | |
906 | is the first byte position and the cdr is the last byte position, as | |
907 | integers, or @code{#f} if not given. | |
908 | @example | |
909 | (parse-header 'range "bytes=10-30,50-") | |
910 | @result{} (bytes (10 . 30) (50 . #f)) | |
911 | @end example | |
912 | @end deftypevr | |
1148d029 | 913 | |
ff8339db AW |
914 | @deftypevr {HTTP Header} URI referer |
915 | The URI of the resource that referred the user to this resource. The | |
916 | name of the header is a misspelling, but we are stuck with it. | |
917 | @example | |
918 | (parse-header 'referer "http://www.gnu.org/") | |
919 | @result{} #<uri ...> | |
920 | @end example | |
921 | @end deftypevr | |
1148d029 | 922 | |
ff8339db AW |
923 | @deftypevr {HTTP Header} List te |
924 | A list of transfer codings, expressed as key-value lists. A common | |
925 | transfer coding is @code{trailers}. | |
926 | @example | |
927 | (parse-header 'te "trailers") | |
928 | @result{} ((trailers)) | |
929 | @end example | |
930 | @end deftypevr | |
1148d029 | 931 | |
ff8339db AW |
932 | @deftypevr {HTTP Header} String user-agent |
933 | A string indicating the user agent making the request. The | |
934 | specification defines a structured format for this header, but it is | |
935 | widely disregarded, so Guile does not attempt to parse strictly. | |
936 | @example | |
937 | (parse-header 'user-agent "Mozilla/5.0") | |
938 | @result{} "Mozilla/5.0" | |
939 | @end example | |
940 | @end deftypevr | |
1148d029 AW |
941 | |
942 | ||
943 | @subsubsection Response Headers | |
944 | ||
ff8339db AW |
945 | @deftypevr {HTTP Header} List accept-ranges |
946 | A list of range units that the server supports, as symbols. | |
947 | @example | |
948 | (parse-header 'accept-ranges "bytes") | |
949 | @result{} (bytes) | |
950 | @end example | |
951 | @end deftypevr | |
1148d029 | 952 | |
ff8339db AW |
953 | @deftypevr {HTTP Header} UInt age |
954 | The age of a cached response, in seconds. | |
955 | @example | |
956 | (parse-header 'age "3600") | |
957 | @result{} 3600 | |
958 | @end example | |
959 | @end deftypevr | |
1148d029 | 960 | |
ff8339db AW |
961 | @deftypevr {HTTP Header} ETag etag |
962 | The entity-tag of the resource. | |
963 | @example | |
964 | (parse-header 'etag "\"foo\"") | |
965 | @result{} ("foo" . #t) | |
966 | @end example | |
967 | @end deftypevr | |
1148d029 | 968 | |
ff8339db AW |
969 | @deftypevr {HTTP Header} URI location |
970 | A URI on which a request may be completed. Used in combination with a | |
971 | redirecting status code to perform client-side redirection. | |
972 | @example | |
973 | (parse-header 'location "http://example.com/other") | |
974 | @result{} #<uri ...> | |
975 | @end example | |
976 | @end deftypevr | |
1148d029 | 977 | |
ff8339db AW |
978 | @deftypevr {HTTP Header} List proxy-authenticate |
979 | A list of challenges to a proxy, indicating the need for authentication. | |
980 | @example | |
981 | (parse-header 'proxy-authenticate "Basic realm=\"foo\"") | |
982 | @result{} ((basic (realm . "foo"))) | |
983 | @end example | |
984 | @end deftypevr | |
1148d029 | 985 | |
ff8339db AW |
986 | @deftypevr {HTTP Header} UInt|Date retry-after |
987 | Used in combination with a server-busy status code, like 503, to | |
988 | indicate that a client should retry later. Either a number of seconds, | |
989 | or a date. | |
990 | @example | |
991 | (parse-header 'retry-after "60") | |
992 | @result{} 60 | |
993 | @end example | |
994 | @end deftypevr | |
1148d029 | 995 | |
ff8339db AW |
996 | @deftypevr {HTTP Header} String server |
997 | A string identifying the server. | |
998 | @example | |
999 | (parse-header 'server "My first web server") | |
1000 | @result{} "My first web server" | |
1001 | @end example | |
1002 | @end deftypevr | |
1148d029 | 1003 | |
ff8339db AW |
1004 | @deftypevr {HTTP Header} *|List vary |
1005 | A set of request headers that were used in computing this response. | |
ecb87335 | 1006 | Used to indicate that server-side content negotiation was performed, for |
ff8339db AW |
1007 | example in response to the @code{accept-language} header. Can also be |
1008 | the symbol @code{*}, indicating that all headers were considered. | |
1009 | @example | |
1010 | (parse-header 'vary "Accept-Language, Accept") | |
1011 | @result{} (accept-language accept) | |
1012 | @end example | |
1013 | @end deftypevr | |
1148d029 | 1014 | |
ff8339db AW |
1015 | @deftypevr {HTTP Header} List www-authenticate |
1016 | A list of challenges to a user, indicating the need for authentication. | |
1017 | @example | |
1018 | (parse-header 'www-authenticate "Basic realm=\"foo\"") | |
1019 | @result{} ((basic (realm . "foo"))) | |
1020 | @end example | |
1021 | @end deftypevr | |
1148d029 AW |
1022 | |
1023 | ||
8db7e094 AW |
1024 | @node Requests |
1025 | @subsection HTTP Requests | |
1026 | ||
1027 | @example | |
1028 | (use-modules (web request)) | |
1029 | @end example | |
1030 | ||
de54fb6d | 1031 | The request module contains a data type for HTTP requests. |
8db7e094 | 1032 | |
de54fb6d AW |
1033 | @subsubsection An Important Note on Character Sets |
1034 | ||
1035 | HTTP requests consist of two parts: the request proper, consisting of a | |
1036 | request line and a set of headers, and (optionally) a body. The body | |
1037 | might have a binary content-type, and even in the textual case its | |
1038 | length is specified in bytes, not characters. | |
1039 | ||
1040 | Therefore, HTTP is a fundamentally binary protocol. However the request | |
1041 | line and headers are specified to be in a subset of ASCII, so they can | |
1042 | be treated as text, provided that the port's encoding is set to an | |
1043 | ASCII-compatible one-byte-per-character encoding. ISO-8859-1 (latin-1) | |
1044 | is just such an encoding, and happens to be very efficient for Guile. | |
1045 | ||
1046 | So what Guile does when reading requests from the wire, or writing them | |
1047 | out, is to set the port's encoding to latin-1, and to treat the request | |
1048 | headers as text. | |
1049 | ||
1050 | The request body is another issue. For binary data, the data is | |
1051 | probably in a bytevector, so we use the R6RS binary output procedures to | |
1052 | write out the binary payload. Textual data usually has to be written | |
1053 | out to some character encoding, usually UTF-8, and then the resulting | |
1054 | bytevector is written out to the port. | |
1055 | ||
1056 | In summary, Guile reads and writes HTTP over latin-1 sockets, without | |
1057 | any loss of generality. | |
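
To make the byte-versus-character distinction concrete, here is a small
sketch that encodes a textual body as UTF-8 before it would be written
out.  The variable names are illustrative only.
@example
(use-modules (rnrs bytevectors))

;; A four-character string; "\xe9" is e-acute in Guile's default
;; string syntax.
(define text "caf\xe9")

;; Encode the text as UTF-8.  The non-ASCII character takes two bytes.
(define body (string->utf8 text))

(string-length text)      ; 4 characters
(bytevector-length body)  ; 5 bytes: the value a content-length needs
@end example
Any @code{content-length} header should therefore be computed from the
encoded bytevector, not from the string.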
1058 | ||
1059 | @subsubsection Request API | |
8db7e094 | 1060 | |
2e6f5ea4 AW |
1061 | @deffn {Scheme Procedure} request? |
1062 | @deffnx {Scheme Procedure} request-method | |
1063 | @deffnx {Scheme Procedure} request-uri | |
1064 | @deffnx {Scheme Procedure} request-version | |
1065 | @deffnx {Scheme Procedure} request-headers | |
1066 | @deffnx {Scheme Procedure} request-meta | |
1067 | @deffnx {Scheme Procedure} request-port | |
e471a3ee AW |
1068 | A predicate and field accessors for the request type. The fields are as |
1069 | follows: | |
1070 | @table @code | |
1071 | @item method | |
1072 | The HTTP method, for example, @code{GET}. | |
1073 | @item uri | |
1074 | The URI as a URI record. | |
1075 | @item version | |
1076 | The HTTP version pair, like @code{(1 . 1)}. | |
1077 | @item headers | |
1078 | The request headers, as an alist of parsed values. | |
1079 | @item meta | |
1080 | An arbitrary alist of other data, for example information returned in | |
1081 | the @code{sockaddr} from @code{accept} (@pxref{Network Sockets and | |
1082 | Communication}). | |
1083 | @item port | |
1084 | The port on which to read or write a request body, if any. | |
1085 | @end table | |
2e6f5ea4 | 1086 | @end deffn |
8db7e094 | 1087 | |
2e6f5ea4 | 1088 | @deffn {Scheme Procedure} read-request port [meta='()] |
8db7e094 AW |
1089 | Read an HTTP request from @var{port}, optionally attaching the given |
1090 | metadata, @var{meta}. | |
1091 | ||
1092 | As a side effect, sets the encoding on @var{port} to ISO-8859-1 | |
1093 | (latin-1), so that reading one character reads one byte. See the | |
de54fb6d AW |
1094 | discussion of character sets above, for more information. |
1095 | ||
1096 | Note that the body is not part of the request. Once you have read a | |
1097 | request, you may read the body separately, and likewise for writing | |
1098 | requests. | |
2e6f5ea4 | 1099 | @end deffn |
de54fb6d | 1100 | |
2e6f5ea4 | 1101 | @deffn {Scheme Procedure} build-request uri [#:method='GET] [#:version='(1 . 1)] [#:headers='()] [#:port=#f] [#:meta='()] [#:validate-headers?=#t] |
de54fb6d AW |
1102 | Construct an HTTP request object. If @var{validate-headers?} is true, |
1103 | the headers are each run through their respective validators. | |
2e6f5ea4 | 1104 | @end deffn |
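
As an illustrative sketch, a request object can be built and inspected
without touching the network.  The URI and the header value below are
arbitrary examples.
@example
(use-modules (web request) (web uri))

;; Build a GET request.  The accept header is given in its parsed
;; form, as validated by build-request.
(define req
  (build-request (string->uri "http://www.gnu.org/software/guile/")
                 #:headers '((accept . ((text/html))))))

(request-method req)   ; the symbol GET, the default method
(request-version req)  ; (1 . 1), the default version
(request-accept req)   ; ((text/html))
(uri-path (request-uri req))
@end example
Because no @code{#:port} is given, this request has no port from which
a body could be read or to which one could be written.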
8db7e094 | 1105 | |
2e6f5ea4 | 1106 | @deffn {Scheme Procedure} write-request r port |
8db7e094 AW |
1107 | Write the given HTTP request to @var{port}. |
1108 | ||
de54fb6d | 1109 | Return a new request, whose @code{request-port} will continue writing |
8db7e094 | 1110 | on @var{port}, perhaps using some transfer encoding. |
2e6f5ea4 | 1111 | @end deffn |
8db7e094 | 1112 | |
2e6f5ea4 | 1113 | @deffn {Scheme Procedure} read-request-body r |
de54fb6d | 1114 | Read the request body from @var{r}, as a bytevector. Return @code{#f}
8db7e094 | 1115 | if there was no request body. |
2e6f5ea4 | 1116 | @end deffn |
8db7e094 | 1117 | |
2e6f5ea4 | 1118 | @deffn {Scheme Procedure} write-request-body r bv |
8db7e094 AW |
1119 | Write @var{bv}, a bytevector, to the port corresponding to the HTTP |
1120 | request @var{r}. | |
2e6f5ea4 | 1121 | @end deffn |
8db7e094 | 1122 | |
e471a3ee AW |
1123 | The various headers that are typically associated with HTTP requests may |
1124 | be accessed with these dedicated accessors. @xref{HTTP Headers}, for | |
1125 | more information on the format of parsed headers. | |
1126 | ||
2e6f5ea4 AW |
1127 | @deffn {Scheme Procedure} request-accept request [default='()] |
1128 | @deffnx {Scheme Procedure} request-accept-charset request [default='()] | |
1129 | @deffnx {Scheme Procedure} request-accept-encoding request [default='()] | |
1130 | @deffnx {Scheme Procedure} request-accept-language request [default='()] | |
1131 | @deffnx {Scheme Procedure} request-allow request [default='()] | |
1132 | @deffnx {Scheme Procedure} request-authorization request [default=#f] | |
1133 | @deffnx {Scheme Procedure} request-cache-control request [default='()] | |
1134 | @deffnx {Scheme Procedure} request-connection request [default='()] | |
1135 | @deffnx {Scheme Procedure} request-content-encoding request [default='()] | |
1136 | @deffnx {Scheme Procedure} request-content-language request [default='()] | |
1137 | @deffnx {Scheme Procedure} request-content-length request [default=#f] | |
1138 | @deffnx {Scheme Procedure} request-content-location request [default=#f] | |
1139 | @deffnx {Scheme Procedure} request-content-md5 request [default=#f] | |
1140 | @deffnx {Scheme Procedure} request-content-range request [default=#f] | |
1141 | @deffnx {Scheme Procedure} request-content-type request [default=#f] | |
1142 | @deffnx {Scheme Procedure} request-date request [default=#f] | |
1143 | @deffnx {Scheme Procedure} request-expect request [default='()] | |
1144 | @deffnx {Scheme Procedure} request-expires request [default=#f] | |
1145 | @deffnx {Scheme Procedure} request-from request [default=#f] | |
1146 | @deffnx {Scheme Procedure} request-host request [default=#f] | |
1147 | @deffnx {Scheme Procedure} request-if-match request [default=#f] | |
1148 | @deffnx {Scheme Procedure} request-if-modified-since request [default=#f] | |
1149 | @deffnx {Scheme Procedure} request-if-none-match request [default=#f] | |
1150 | @deffnx {Scheme Procedure} request-if-range request [default=#f] | |
1151 | @deffnx {Scheme Procedure} request-if-unmodified-since request [default=#f] | |
1152 | @deffnx {Scheme Procedure} request-last-modified request [default=#f] | |
1153 | @deffnx {Scheme Procedure} request-max-forwards request [default=#f] | |
1154 | @deffnx {Scheme Procedure} request-pragma request [default='()] | |
1155 | @deffnx {Scheme Procedure} request-proxy-authorization request [default=#f] | |
1156 | @deffnx {Scheme Procedure} request-range request [default=#f] | |
1157 | @deffnx {Scheme Procedure} request-referer request [default=#f] | |
1158 | @deffnx {Scheme Procedure} request-te request [default=#f] | |
1159 | @deffnx {Scheme Procedure} request-trailer request [default='()] | |
1160 | @deffnx {Scheme Procedure} request-transfer-encoding request [default='()] | |
1161 | @deffnx {Scheme Procedure} request-upgrade request [default='()] | |
1162 | @deffnx {Scheme Procedure} request-user-agent request [default=#f] | |
1163 | @deffnx {Scheme Procedure} request-via request [default='()] | |
1164 | @deffnx {Scheme Procedure} request-warning request [default='()] | |
e471a3ee | 1165 | Return the given request header, or @var{default} if none was present. |
2e6f5ea4 | 1166 | @end deffn |
8db7e094 | 1167 | |
2e6f5ea4 | 1168 | @deffn {Scheme Procedure} request-absolute-uri r [default-host=#f] [default-port=#f] |
e471a3ee AW |
1169 | A helper routine to determine the absolute URI of a request, using the |
1170 | @code{host} header and the default host and port. | |
2e6f5ea4 | 1171 | @end deffn |
8db7e094 AW |
1172 | |
1173 | ||
8db7e094 AW |
1174 | @node Responses |
1175 | @subsection HTTP Responses | |
1176 | ||
1177 | @example | |
1178 | (use-modules (web response)) | |
1179 | @end example | |
1180 | ||
e471a3ee AW |
1181 | As with requests (@pxref{Requests}), Guile offers a data type for HTTP |
1182 | responses. Again, the body is represented separately from the response. | |
8db7e094 | 1183 | |
2e6f5ea4 AW |
1184 | @deffn {Scheme Procedure} response? |
1185 | @deffnx {Scheme Procedure} response-version | |
1186 | @deffnx {Scheme Procedure} response-code | |
1187 | @deffnx {Scheme Procedure} response-reason-phrase response | |
1188 | @deffnx {Scheme Procedure} response-headers | |
1189 | @deffnx {Scheme Procedure} response-port | |
e471a3ee AW |
1190 | A predicate and field accessors for the response type. The fields are as |
1191 | follows: | |
1192 | @table @code | |
1193 | @item version | |
1194 | The HTTP version pair, like @code{(1 . 1)}. | |
1195 | @item code | |
1196 | The HTTP response code, like @code{200}. | |
1197 | @item reason-phrase | |
1198 | The reason phrase, or the standard reason phrase for the response's | |
1199 | code. | |
1200 | @item headers | |
1201 | The response headers, as an alist of parsed values. | |
1202 | @item port | |
1203 | The port on which to read or write a response body, if any. | |
1204 | @end table | |
2e6f5ea4 | 1205 | @end deffn |
8db7e094 | 1206 | |
2e6f5ea4 | 1207 | @deffn {Scheme Procedure} read-response port |
de54fb6d | 1208 | Read an HTTP response from @var{port}. |
8db7e094 AW |
1209 | |
1210 | As a side effect, sets the encoding on @var{port} to ISO-8859-1 | |
1211 | (latin-1), so that reading one character reads one byte. See the | |
de54fb6d | 1212 | discussion of character sets in @ref{Requests}, for more information.
2e6f5ea4 | 1213 | @end deffn |
8db7e094 | 1214 | |
2e6f5ea4 | 1215 | @deffn {Scheme Procedure} build-response [#:version='(1 . 1)] [#:code=200] [#:reason-phrase=#f] [#:headers='()] [#:port=#f] [#:validate-headers?=#t]
8db7e094 AW |
1216 | Construct an HTTP response object. If @var{validate-headers?} is true, |
1217 | the headers are each run through their respective validators. | |
2e6f5ea4 | 1218 | @end deffn |
8db7e094 | 1219 | |
2e6f5ea4 | 1220 | @deffn {Scheme Procedure} adapt-response-version response version |
de54fb6d | 1221 | Adapt the given response to a different HTTP version. Return a new HTTP |
8db7e094 AW |
1222 | response. |
1223 | ||
1224 | The idea is that many applications might just build a response for the | |
1225 | default HTTP version, and this method could handle a number of | |
1226 | programmatic transformations to respond to older HTTP versions (0.9 and | |
1227 | 1.0). But currently this function is a bit heavy-handed, just updating | |
1228 | the version field. | |
2e6f5ea4 | 1229 | @end deffn |
8db7e094 | 1230 | |
2e6f5ea4 | 1231 | @deffn {Scheme Procedure} write-response r port |
8db7e094 AW |
1232 | Write the given HTTP response to @var{port}. |
1233 | ||
de54fb6d | 1234 | Return a new response, whose @code{response-port} will continue writing |
8db7e094 | 1235 | on @var{port}, perhaps using some transfer encoding. |
2e6f5ea4 | 1236 | @end deffn |
8db7e094 | 1237 | |
2e6f5ea4 | 1238 | @deffn {Scheme Procedure} read-response-body r |
de54fb6d | 1239 | Read the response body from @var{r}, as a bytevector. Return @code{#f}
8db7e094 | 1240 | if there was no response body. |
2e6f5ea4 | 1241 | @end deffn |
8db7e094 | 1242 | |
2e6f5ea4 | 1243 | @deffn {Scheme Procedure} write-response-body r bv |
8db7e094 AW |
1244 | Write @var{bv}, a bytevector, to the port corresponding to the HTTP |
1245 | response @var{r}. | |
2e6f5ea4 | 1246 | @end deffn |
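
The following sketch shows one way these procedures might fit together:
it builds a response, writes the response line and headers to an
in-memory port, then writes the body.  Using a bytevector output port
is an assumption made for illustration; a server would normally write
to a socket.
@example
(use-modules (web response)
             (rnrs bytevectors)
             (rnrs io ports))

;; The body is encoded first, so that its length in bytes is known.
(define body (string->utf8 "Hello, world!\n"))

(define (response->string)
  (call-with-values open-bytevector-output-port
    (lambda (port get-bytes)
      ;; write-response returns a response whose port is PORT.
      (let ((r (write-response
                (build-response
                 #:headers `((content-type
                              . (text/plain (charset . "utf-8")))
                             (content-length
                              . ,(bytevector-length body))))
                port)))
        (write-response-body r body)
        (utf8->string (get-bytes))))))

(display (response->string))
@end example
The response returned by @code{write-response}, not the one passed to
it, is the one to hand to @code{write-response-body}, since only the
former is associated with @var{port}.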
8db7e094 | 1247 | |
e471a3ee AW |
1248 | As with requests, the various headers that are typically associated with |
1249 | HTTP responses may be accessed with these dedicated accessors. | |
1250 | @xref{HTTP Headers}, for more information on the format of parsed | |
1251 | headers. | |
1252 | ||
2e6f5ea4 AW |
1253 | @deffn {Scheme Procedure} response-accept-ranges response [default=#f] |
1254 | @deffnx {Scheme Procedure} response-age response [default='()] | |
1255 | @deffnx {Scheme Procedure} response-allow response [default='()] | |
1256 | @deffnx {Scheme Procedure} response-cache-control response [default='()] | |
1257 | @deffnx {Scheme Procedure} response-connection response [default='()] | |
1258 | @deffnx {Scheme Procedure} response-content-encoding response [default='()] | |
1259 | @deffnx {Scheme Procedure} response-content-language response [default='()] | |
1260 | @deffnx {Scheme Procedure} response-content-length response [default=#f] | |
1261 | @deffnx {Scheme Procedure} response-content-location response [default=#f] | |
1262 | @deffnx {Scheme Procedure} response-content-md5 response [default=#f] | |
1263 | @deffnx {Scheme Procedure} response-content-range response [default=#f] | |
1264 | @deffnx {Scheme Procedure} response-content-type response [default=#f] | |
1265 | @deffnx {Scheme Procedure} response-date response [default=#f] | |
1266 | @deffnx {Scheme Procedure} response-etag response [default=#f] | |
1267 | @deffnx {Scheme Procedure} response-expires response [default=#f] | |
1268 | @deffnx {Scheme Procedure} response-last-modified response [default=#f] | |
1269 | @deffnx {Scheme Procedure} response-location response [default=#f] | |
1270 | @deffnx {Scheme Procedure} response-pragma response [default='()] | |
1271 | @deffnx {Scheme Procedure} response-proxy-authenticate response [default=#f] | |
1272 | @deffnx {Scheme Procedure} response-retry-after response [default=#f] | |
1273 | @deffnx {Scheme Procedure} response-server response [default=#f] | |
1274 | @deffnx {Scheme Procedure} response-trailer response [default='()] | |
1275 | @deffnx {Scheme Procedure} response-transfer-encoding response [default='()] | |
1276 | @deffnx {Scheme Procedure} response-upgrade response [default='()] | |
1277 | @deffnx {Scheme Procedure} response-vary response [default='()] | |
1278 | @deffnx {Scheme Procedure} response-via response [default='()] | |
1279 | @deffnx {Scheme Procedure} response-warning response [default='()] | |
1280 | @deffnx {Scheme Procedure} response-www-authenticate response [default=#f] | |
de54fb6d | 1281 | Return the given response header, or @var{default} if none was present. |
2e6f5ea4 | 1282 | @end deffn |
8db7e094 AW |
1283 | |
1284 | ||
ec811439 AW |
1285 | @node Web Client |
1286 | @subsection Web Client | |
1287 | ||
1288 | @code{(web client)} provides a simple, synchronous HTTP client, built on | |
1289 | the lower-level HTTP, request, and response modules. | |
1290 | ||
1291 | @deffn {Scheme Procedure} open-socket-for-uri uri | |
1292 | @end deffn | |
1293 | ||
1294 | @deffn {Scheme Procedure} http-get uri [#:port=(open-socket-for-uri uri)] [#:version='(1 . 1)] [#:keep-alive?=#f] [#:extra-headers='()] [#:decode-body?=#t] | |
1295 | Connect to the server corresponding to @var{uri} and ask for the | |
1296 | resource, using the @code{GET} method. If you already have a port open, | |
1297 | pass it as @var{port}. The port will be closed at the end of the | |
1298 | request unless @var{keep-alive?} is true. Any extra headers in the | |
1299 | alist @var{extra-headers} will be added to the request. | |
1300 | ||
1301 | If @var{decode-body?} is true, as is the default, the body of the | |
1302 | response will be decoded to a string if it has a textual content type. | |
1303 | Otherwise it will be returned as a bytevector. | |
1304 | @end deffn | |
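
As a minimal sketch of a one-off request, the following assumes that a
server is already listening on @code{localhost:8080} (for instance one
of the servers from @ref{Web Examples}) and that @code{http-get}
returns the response and the body as two values:

@example
(use-modules (web client)
             (web response)
             (web uri)
             (ice-9 receive))

;; Assumption: something is serving HTTP on localhost:8080.
(receive (response body)
    (http-get (string->uri "http://localhost:8080/"))
  ;; Print the numeric status code, then the decoded body.
  (display (response-code response))
  (newline)
  (display body)
  (newline))
@end example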
1305 | ||
1306 | @code{http-get} is useful for making one-off requests to web sites. If | |
1307 | you are writing a web spider or some other client that needs to handle a | |
1308 | number of requests in parallel, it's better to build an event-driven URL | |
1309 | fetcher, similar in structure to the web server (@pxref{Web Server}). | |
1310 | ||
1311 | Another option, good but not as performant, would be to use threads, | |
1312 | possibly via par-map or futures. | |
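
The threaded approach might look something like the following sketch.
The helper name @code{fetch-status} and the two URLs are made up for
illustration, and the code assumes a server is listening on
@code{localhost:8080}:

@example
(use-modules (ice-9 threads)   ; for par-map
             (ice-9 receive)
             (web client)
             (web response)
             (web uri))

;; Hypothetical helper: fetch one URI and return its status code.
(define (fetch-status uri-string)
  (receive (response body)
      (http-get (string->uri uri-string))
    (response-code response)))

;; Fetch both resources in parallel; the result is a list of codes.
(par-map fetch-status
         '("http://localhost:8080/hacker/"
           "http://localhost:8080/foo/"))
@end example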
1313 | ||
1314 | More helper procedures for the other common HTTP verbs would be a good | |
1315 | addition to this module. Send your code to | |
1316 | @email{guile-user@@gnu.org}. | |
1317 | ||
1318 | ||
8db7e094 AW |
1319 | @node Web Server |
1320 | @subsection Web Server | |
1321 | ||
1322 | @code{(web server)} is a generic web server interface, along with a main | |
1323 | loop implementation for web servers controlled by Guile. | |
1324 | ||
5cdab8b8 AW |
1325 | @example |
1326 | (use-modules (web server)) | |
1327 | @end example | |
1328 | ||
1329 | The lowest layer is the @code{<server-impl>} object, which defines a set | |
1330 | of hooks to open a server, read a request from a client, write a | |
1331 | response to a client, and close a server. These hooks -- @code{open}, | |
1332 | @code{read}, @code{write}, and @code{close}, respectively -- are bound | |
1333 | together in a @code{<server-impl>} object. Procedures in this module take a | |
1334 | @code{<server-impl>} object, if needed. | |
1335 | ||
1336 | A @code{<server-impl>} may also be looked up by name. If you pass the | |
1337 | @code{http} symbol to @code{run-server}, Guile looks for a variable | |
1338 | named @code{http} in the @code{(web server http)} module, which should | |
1339 | be bound to a @code{<server-impl>} object. Such a binding is made by | |
1340 | instantiation of the @code{define-server-impl} syntax. In this way the | |
1341 | run-server loop can automatically load other backends if available. | |
8db7e094 AW |
1342 | |
1343 | The life cycle of a server goes as follows: | |
1344 | ||
1345 | @enumerate | |
1346 | @item | |
1347 | The @code{open} hook is called, to open the server. @code{open} takes 0 or | |
1348 | more arguments, depending on the backend, and returns an opaque | |
1349 | server socket object, or signals an error. | |
1350 | ||
1351 | @item | |
1352 | The @code{read} hook is called, to read a request from a new client. | |
5cdab8b8 AW |
1353 | The @code{read} hook takes one argument, the server socket. It should |
1354 | return three values: an opaque client socket, the request, and the | |
1355 | request body. The request should be a @code{<request>} object, from | |
1356 | @code{(web request)}. The body should be a string or a bytevector, or | |
1357 | @code{#f} if there is no body. | |
8db7e094 AW |
1358 | |
1359 | If the read failed, the @code{read} hook may return @code{#f} for the client | |
1360 | socket, request, and body. | |
1361 | ||
1362 | @item | |
09b7459b AW |
1363 | A user-provided handler procedure is called, with the request and body |
1364 | as its arguments. The handler should return two values: the response, | |
1365 | as a @code{<response>} record from @code{(web response)}, and the | |
1366 | response body as a bytevector, or @code{#f} if not present. | |
1367 | ||
1368 | The response and response body are run through @code{sanitize-response}, | |
1369 | documented below. This allows the handler writer to take some | |
1370 | convenient shortcuts: for example, instead of a @code{<response>}, the | |
1371 | handler can simply return an alist of headers, in which case a default | |
1372 | response object is constructed with those headers. Instead of a | |
1373 | bytevector for the body, the handler can return a string, which will be | |
1374 | serialized into an appropriate encoding; or it can return a procedure, | |
1375 | which will be called on a port to write out the data. See the | |
1376 | @code{sanitize-response} documentation for more. | |
8db7e094 AW |
1377 | |
1378 | @item | |
1379 | The @code{write} hook is called with three arguments: the client | |
1380 | socket, the response, and the body. The @code{write} hook returns no | |
1381 | values. | |
1382 | ||
1383 | @item | |
1384 | At this point the request handling is complete. In a server loop, we | |
1385 | go back and try to read a new request. | |
1386 | ||
1387 | @item | |
1388 | If the user interrupts the loop, the @code{close} hook is called on | |
1389 | the server socket. | |
1390 | @end enumerate | |
1391 | ||
5cdab8b8 AW |
1392 | A user may define a server implementation with the following form: |
1393 | ||
2e6f5ea4 | 1394 | @deffn {Scheme Syntax} define-server-impl name open read write close |
5cdab8b8 AW |
1395 | Make a @code{<server-impl>} object with the hooks @var{open}, |
1396 | @var{read}, @var{write}, and @var{close}, and bind it to the symbol | |
1397 | @var{name} in the current module. | |
2e6f5ea4 | 1398 | @end deffn |
8db7e094 | 1399 | |
2e6f5ea4 | 1400 | @deffn {Scheme Procedure} lookup-server-impl impl |
8db7e094 AW |
1401 | Look up a server implementation. If @var{impl} is a server |
1402 | implementation already, it is returned directly. If it is a symbol, the | |
1403 | binding named @var{impl} in the @code{(web server @var{impl})} module is | |
1404 | looked up. Otherwise an error is signaled. | |
1405 | ||
1406 | Currently a server implementation is a somewhat opaque type, useful only | |
1407 | for passing to other procedures in this module, like @code{read-client}. | |
2e6f5ea4 | 1408 | @end deffn |
8db7e094 | 1409 | |
5cdab8b8 AW |
1410 | The @code{(web server)} module defines a number of routines that use |
1411 | @code{<server-impl>} objects to implement parts of a web server. Because | |
1412 | the accessors for the various fields of a @code{<server-impl>} are not | |
1413 | exposed, these routines are the only procedures with any access to the | |
1414 | impl objects. | |
1415 | ||
2e6f5ea4 | 1416 | @deffn {Scheme Procedure} open-server impl open-params |
f4ec6877 | 1417 | Open a server for the given implementation. Return one value, the new |
8db7e094 AW |
1418 | server object. The implementation's @code{open} procedure is applied to |
1419 | @var{open-params}, which should be a list. | |
2e6f5ea4 | 1420 | @end deffn |
8db7e094 | 1421 | |
2e6f5ea4 | 1422 | @deffn {Scheme Procedure} read-client impl server |
8db7e094 | 1423 | Read a new client from @var{server}, by applying the implementation's |
f4ec6877 | 1424 | @code{read} procedure to the server. If successful, return three |
8db7e094 | 1425 | values: an object corresponding to the client, a request object, and the |
f4ec6877 | 1426 | request body. If any exception occurs, return @code{#f} for all three |
8db7e094 | 1427 | values. |
2e6f5ea4 | 1428 | @end deffn |
8db7e094 | 1429 | |
2e6f5ea4 | 1430 | @deffn {Scheme Procedure} handle-request handler request body state |
8db7e094 AW |
1431 | Handle a given request, returning the response and body. |
1432 | ||
1433 | The response and response body are produced by calling the given | |
1434 | @var{handler} with @var{request} and @var{body} as arguments. | |
1435 | ||
1436 | The elements of @var{state} are also passed to @var{handler} as | |
1437 | arguments, and may be returned as additional values. The new | |
1438 | @var{state}, collected from the @var{handler}'s return values, is then | |
1439 | returned as a list. The idea is that a server loop receives a handler | |
1440 | from the user, along with whatever state values the user is interested | |
1441 | in, allowing the user's handler to explicitly manage its state. | |
2e6f5ea4 | 1442 | @end deffn |
8db7e094 | 1443 | |
2e6f5ea4 | 1444 | @deffn {Scheme Procedure} sanitize-response request response body |
8db7e094 AW |
1445 | ``Sanitize'' the given response and body, making them appropriate for the |
1446 | given request. | |
1447 | ||
1448 | As a convenience to web handler authors, @var{response} may be given as | |
1449 | an alist of headers, in which case it is used to construct a default | |
1450 | response. Ensures that the response version corresponds to the request | |
1451 | version. If @var{body} is a string, encodes the string to a bytevector, | |
1452 | in an encoding appropriate for @var{response}. Adds a | |
1453 | @code{content-length} and @code{content-type} header, as necessary. | |
1454 | ||
1455 | If @var{body} is a procedure, it is called with a port as an argument, | |
1456 | and the output collected as a bytevector. In the future we might try to | |
1457 | instead use a compressing, chunk-encoded port, and call this procedure | |
1458 | later, in the @code{write-client} procedure. Authors are advised not to rely on | |
1459 | the procedure being called at any particular time. | |
2e6f5ea4 | 1460 | @end deffn |
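
For illustration, here is a sketch of a handler that uses both
conveniences at once, returning an alist of headers in place of a
response object and a procedure in place of a body. The name
@code{two-lines-handler} is hypothetical:

@example
(define (two-lines-handler request body)
  (values
   ;; An alist of headers stands in for a <response> record.
   '((content-type . (text/plain)))
   ;; A procedure stands in for the body; it is called with a
   ;; port, and whatever it writes becomes the response body.
   (lambda (port)
     (display "line one\n" port)
     (display "line two\n" port))))
@end example

Since the time at which the procedure is called is unspecified, it is
safest for it to depend only on values that do not change after the
handler returns.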
8db7e094 | 1461 | |
2e6f5ea4 | 1462 | @deffn {Scheme Procedure} write-client impl server client response body |
8db7e094 AW |
1463 | Write an HTTP response and body to @var{client}. If the server and |
1464 | client support persistent connections, it is the implementation's | |
1465 | responsibility to keep track of the client thereafter, presumably by | |
1466 | attaching it to the @var{server} argument somehow. | |
2e6f5ea4 | 1467 | @end deffn |
8db7e094 | 1468 | |
2e6f5ea4 | 1469 | @deffn {Scheme Procedure} close-server impl server |
8db7e094 AW |
1470 | Release resources allocated by a previous invocation of |
1471 | @code{open-server}. | |
2e6f5ea4 | 1472 | @end deffn |
8db7e094 | 1473 | |
5cdab8b8 AW |
1474 | Given the procedures above, it is a small matter to make a web server: |
1475 | ||
2e6f5ea4 | 1476 | @deffn {Scheme Procedure} serve-one-client handler impl server state |
8db7e094 | 1477 | Read one request from @var{server}, call @var{handler} on the request |
f4ec6877 | 1478 | and body, and write the response to the client. Return the new state |
8db7e094 | 1479 | produced by the handler procedure. |
2e6f5ea4 | 1480 | @end deffn |
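
To show how these pieces fit together, here is a rough sketch of a
server that handles a fixed number of requests and then shuts down. It
is not necessarily how @code{run-server} is written; the name
@code{serve-n-requests} is invented for this example, and the
@code{http} implementation and port 8080 are assumptions:

@example
(use-modules (web server))

;; Hypothetical helper: serve N requests with HANDLER, then close.
(define (serve-n-requests handler n)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '(#:port 8080))))
    (let lp ((i 0) (state '()))
      (if (< i n)
          ;; serve-one-client returns the new state, which we
          ;; thread through to the next iteration.
          (lp (+ i 1)
              (serve-one-client handler impl server state))
          (close-server impl server)))))
@end example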
8db7e094 | 1481 | |
2e6f5ea4 | 1482 | @deffn {Scheme Procedure} run-server handler [impl='http] [open-params='()] . state |
8db7e094 AW |
1483 | Run Guile's built-in web server. |
1484 | ||
1485 | @var{handler} should be a procedure that takes two or more arguments, | |
1486 | the HTTP request and request body, and returns two or more values, the | |
1487 | response and response body. | |
1488 | ||
f4ec6877 | 1489 | For examples, skip ahead to the next section, @ref{Web Examples}. |
8db7e094 AW |
1490 | |
1491 | The response and body will be run through @code{sanitize-response} | |
1492 | before sending back to the client. | |
1493 | ||
1494 | Additional arguments to @var{handler} are taken from @var{state}. | |
1495 | Additional return values are accumulated into a new @var{state}, which | |
1496 | will be used for subsequent requests. In this way a handler can | |
1497 | explicitly manage its state. | |
2e6f5ea4 | 1498 | @end deffn |
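
As a sketch of explicit state management, the following handler keeps a
request counter. The name @code{counting-handler} is made up for this
example. The initial state, @code{0}, is passed after
@var{open-params}; the handler receives it as a third argument and
returns the updated count as a third value:

@example
(use-modules (web server))

;; COUNT is the single piece of state threaded through requests.
(define (counting-handler request body count)
  (let ((count (+ count 1)))
    (values '((content-type . (text/plain)))
            (format #f "Request number ~a\n" count)
            ;; Extra return values become the new state.
            count)))

(run-server counting-handler 'http '() 0)
@end example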
8db7e094 | 1499 | |
f4ec6877 AW |
1500 | The default web server implementation is @code{http}, which binds to a |
1501 | socket, listening for requests on that port. | |
1502 | ||
1503 | @deffn {HTTP Implementation} http [#:host=#f] [#:family=AF_INET] [#:addr=INADDR_LOOPBACK] [#:port=8080] [#:socket] | |
1504 | The default HTTP implementation. We document it as a function with | |
1505 | keyword arguments, because that is precisely the way that it is -- all | |
1506 | of the @var{open-params} to @code{run-server} get passed to the | |
1507 | implementation's open function. | |
1508 | ||
1509 | @example | |
1510 | ;; The defaults: localhost:8080 | |
1511 | (run-server handler) | |
1512 | ;; Same thing | |
1513 | (run-server handler 'http '()) | |
1514 | ;; On a different port | |
1515 | (run-server handler 'http '(#:port 8081)) | |
1516 | ;; IPv6 | |
1517 | (run-server handler 'http '(#:family AF_INET6 #:port 8081)) | |
1518 | ;; Custom socket | |
1519 | (run-server handler 'http `(#:socket ,(sudo-make-me-a-socket))) | |
1520 | @end example | |
1521 | @end deffn | |
5cdab8b8 AW |
1522 | |
1523 | @node Web Examples | |
1524 | @subsection Web Examples | |
1525 | ||
1526 | Well, enough about the tedious internals. Let's make a web application! | |
1527 | ||
1528 | @subsubsection Hello, World! | |
1529 | ||
1530 | The first program we have to write, of course, is ``Hello, World!''. | |
1531 | This means that we have to implement a web handler that does what we | |
1532 | want. | |
1533 | ||
1534 | Now we define a handler, a function of two arguments and two return | |
1535 | values: | |
1536 | ||
1537 | @example | |
1538 | (define (handler request request-body) | |
1539 | (values @var{response} @var{response-body})) | |
1540 | @end example | |
1541 | ||
1542 | In this first example, we take advantage of a short-cut, returning an | |
1543 | alist of headers instead of a proper response object. The response body | |
1544 | is our payload: | |
1545 | ||
1546 | @example | |
1547 | (define (hello-world-handler request request-body) | |
f4ec6877 | 1548 | (values '((content-type . (text/plain))) |
5cdab8b8 AW |
1549 | "Hello World!")) |
1550 | @end example | |
1551 | ||
1552 | Now let's test it, by running a server with this handler. Load up the | |
1553 | web server module if you haven't yet done so, and run a server with this | |
1554 | handler: | |
1555 | ||
8db7e094 AW |
1556 | @example |
1557 | (use-modules (web server)) | |
5cdab8b8 | 1558 | (run-server hello-world-handler) |
8db7e094 AW |
1559 | @end example |
1560 | ||
5cdab8b8 AW |
1561 | By default, the web server listens for requests on |
1562 | @code{localhost:8080}. Visit that address in your web browser to | |
1563 | test. If you see the string, @code{Hello World!}, sweet! | |
8db7e094 | 1564 | |
5cdab8b8 | 1565 | @subsubsection Inspecting the Request |
e471a3ee | 1566 | |
5cdab8b8 AW |
1567 | The Hello World program above is a general greeter, responding to all |
1568 | URIs. To make a more exclusive greeter, we need to inspect the request | |
1569 | object, and conditionally produce different results. So let's load up | |
1570 | the request, response, and URI modules, and do just that. | |
e471a3ee | 1571 | |
5cdab8b8 AW |
1572 | @example |
1573 | (use-modules (web server)) ; you probably did this already | |
1574 | (use-modules (web request) | |
1575 | (web response) | |
1576 | (web uri)) | |
1577 | ||
1578 | (define (request-path-components request) | |
1579 | (split-and-decode-uri-path (uri-path (request-uri request)))) | |
1580 | ||
1581 | (define (hello-hacker-handler request body) | |
1582 | (if (equal? (request-path-components request) | |
1583 | '("hacker")) | |
f4ec6877 | 1584 | (values '((content-type . (text/plain))) |
5cdab8b8 AW |
1585 | "Hello hacker!") |
1586 | (not-found request))) | |
1587 | ||
1588 | (run-server hello-hacker-handler) | |
1589 | @end example | |
e471a3ee | 1590 | |
5cdab8b8 AW |
1591 | Here we see that we have defined a helper to return the components of |
1592 | the URI path as a list of strings, and used that to check for a request | |
1593 | to @code{/hacker/}. Then the success case is just as before -- visit | |
1594 | @code{http://localhost:8080/hacker/} in your browser to check. | |
1595 | ||
1596 | You should always match against URI path components as decoded by | |
1597 | @code{split-and-decode-uri-path}. The above example will work for | |
1598 | @code{/hacker/}, @code{//hacker///}, and @code{/h%61ck%65r}. | |
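
To see why, here is what the decoder returns for each of those paths.
Empty components are dropped and percent-escapes are decoded, so all
three yield the same list:

@example
(use-modules (web uri))

(split-and-decode-uri-path "/hacker/")    @result{} ("hacker")
(split-and-decode-uri-path "//hacker///") @result{} ("hacker")
(split-and-decode-uri-path "/h%61ck%65r") @result{} ("hacker")
@end example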
1599 | ||
1600 | But we forgot to define @code{not-found}! If you are pasting these | |
1601 | examples into a REPL, accessing any other URI in your web browser will | |
1602 | drop your Guile console into the debugger: | |
1603 | ||
1604 | @example | |
1605 | <unnamed port>:38:7: In procedure module-lookup: | |
1606 | <unnamed port>:38:7: Unbound variable: not-found | |
1607 | ||
1608 | Entering a new prompt. Type `,bt' for a backtrace or `,q' to continue. | |
1609 | scheme@@(guile-user) [1]> | |
1610 | @end example | |
1611 | ||
1612 | So let's define the function, right there in the debugger. As you | |
1613 | probably know, we'll want to return a 404 response. | |
1614 | ||
1615 | @example | |
1616 | ;; Paste this in your REPL | |
1617 | (define (not-found request) | |
1618 | (values (build-response #:code 404) | |
1619 | (string-append "Resource not found: " | |
2ebdf6b5 | 1620 | (uri->string (request-uri request))))) |
5cdab8b8 AW |
1621 | |
1622 | ;; Now paste this to let the web server keep going: | |
1623 | ,continue | |
1624 | @end example | |
1625 | ||
1626 | Now if you access @code{http://localhost:8080/foo/}, you get this error | |
1627 | message. (Note that some popular web browsers won't show | |
1628 | server-generated 404 messages, showing their own instead, unless the 404 | |
1629 | message body is long enough.) | |
1630 | ||
1631 | @subsubsection Higher-Level Interfaces | |
1632 | ||
1633 | The web handler interface is a common baseline that all kinds of Guile | |
1634 | web applications can use. You will usually want to build something on | |
1635 | top of it, however, especially when producing HTML. Here is a simple | |
1636 | example that builds up HTML output using SXML (@pxref{sxml simple}). | |
1637 | ||
1638 | First, load up the modules: | |
1639 | ||
1640 | @example | |
1641 | (use-modules (web server) | |
1642 | (web request) | |
1643 | (web response) | |
1644 | (sxml simple)) | |
1645 | @end example | |
1646 | ||
1647 | Now we define a simple templating function that takes a list of HTML | |
1648 | body elements, as SXML, and puts them in our super template: | |
1649 | ||
1650 | @example | |
1651 | (define (templatize title body) | |
1652 | `(html (head (title ,title)) | |
1653 | (body ,@@body))) | |
e471a3ee AW |
1654 | @end example |
1655 | ||
5cdab8b8 AW |
1656 | For example, the simplest Hello HTML can be produced like this: |
1657 | ||
1658 | @example | |
1659 | (sxml->xml (templatize "Hello!" '((b "Hi!")))) | |
1660 | @print{} | |
1661 | <html><head><title>Hello!</title></head><body><b>Hi!</b></body></html> | |
1662 | @end example | |
1663 | ||
1664 | Much better to work with Scheme data types than to work with HTML as | |
1665 | strings. Now we define a little response helper: | |
1666 | ||
1667 | @example | |
1668 | (define* (respond #:optional body #:key | |
1669 | (status 200) | |
1670 | (title "Hello hello!") | |
1671 | (doctype "<!DOCTYPE html>\n") | |
f4ec6877 AW |
1672 | (content-type-params '((charset . "utf-8"))) |
1673 | (content-type 'text/html) | |
5cdab8b8 AW |
1674 | (extra-headers '()) |
1675 | (sxml (and body (templatize title body)))) | |
1676 | (values (build-response | |
1677 | #:code status | |
1678 | #:headers `((content-type | |
1679 | . (,content-type ,@@content-type-params)) | |
1680 | ,@@extra-headers)) | |
1681 | (lambda (port) | |
1682 | (if sxml | |
1683 | (begin | |
1684 | (if doctype (display doctype port)) | |
1685 | (sxml->xml sxml port)))))) | |
1686 | @end example | |
1687 | ||
1688 | Here we see the power of keyword arguments with default initializers. By | |
1689 | the time the arguments are fully parsed, the @code{sxml} local variable | |
1690 | will hold the templated SXML, ready for sending out to the client. | |
1691 | ||
f4ec6877 AW |
1692 | Also, instead of returning the body as a string, @code{respond} gives a |
1693 | procedure, which will be called by the web server to write out the | |
1694 | response to the client. | |
5cdab8b8 AW |
1695 | |
1696 | Now, a simple example using this responder, which lays out the incoming | |
1697 | headers in an HTML table. | |
1698 | ||
1699 | @example | |
1700 | (define (debug-page request body) | |
1701 | (respond | |
1702 | `((h1 "hello world!") | |
1703 | (table | |
1704 | (tr (th "header") (th "value")) | |
1705 | ,@@(map (lambda (pair) | |
1706 | `(tr (td (tt ,(with-output-to-string | |
1707 | (lambda () (display (car pair)))))) | |
1708 | (td (tt ,(with-output-to-string | |
1709 | (lambda () | |
1710 | (write (cdr pair)))))))) | |
1711 | (request-headers request)))))) | |
1712 | ||
1713 | (run-server debug-page) | |
1714 | @end example | |
1715 | ||
1716 | Now if you visit any local address in your web browser, you will finally | |
1717 | see some HTML. | |
1718 | ||
1719 | @subsubsection Conclusion | |
e471a3ee | 1720 | |
5cdab8b8 AW |
1721 | Well, this is about as far as Guile's built-in web support goes, for |
1722 | now. There are many ways to make a web application, but hopefully by | |
1723 | standardizing the most fundamental data types, users will be able to | |
1724 | choose the approach that suits them best, while also being able to | |
1725 | switch between implementations of the server. This is a relatively new | |
1726 | part of Guile, so if you have feedback, let us know, and we can take it | |
1727 | into account. Happy hacking on the web! | |
e471a3ee | 1728 | |
8db7e094 AW |
1729 | @c Local Variables: |
1730 | @c TeX-master: "guile.texi" | |
1731 | @c End: |