@c -*-texinfo-*-
@c This is part of the GNU Guile Reference Manual.
@c Copyright (C) 2010 Free Software Foundation, Inc.
@c See the file guile.texi for copying conditions.

@node Web
@section @acronym{HTTP}, the Web, and All That
@cindex Web
@cindex WWW
@cindex HTTP

When Guile started back in the mid-nineties, the GNU system was still
focused on producing a good POSIX implementation. This is why Guile's
POSIX support is good, and has been so for a while.

But times change, and in a way these days the web is the new POSIX: a
standard and a motley set of implementations on which much computing is
done. So today's Guile also supports the web at the programming
language level, by defining common data types and operations for the
technologies underpinning the web: URIs, HTTP, and XML.

It is particularly important to define native web data types. Though
the web is text in motion, programming the web in text is like
programming with @code{goto}: muddy, and error-prone. Most current
security problems on the web are due to treating the web as text instead
of as instances of the proper data types.

In addition, common web data types help programmers to share code.

Well. That's all very nice and opinionated and such, but how do I use
the thing? Read on!

@menu
* URIs::                        Uniform Resource Identifiers.
* HTTP::                        The Hypertext Transfer Protocol.
* Requests::                    HTTP requests.
* Responses::                   HTTP responses.
* Web Handlers::                A simple web application interface.
* Web Server::                  Serving HTTP to the internet.
@end menu

@node URIs
@subsection Uniform Resource Identifiers

@example
(use-modules (web uri))
@end example

This module provides a data type for Uniform Resource Identifiers, as
defined in RFC 3986.

@defspec uri?
@end defspec

@defspec uri-scheme
@end defspec

@defspec uri-userinfo
@end defspec

@defspec uri-host
@end defspec

@defspec uri-port
@end defspec

@defspec uri-path
@end defspec

@defspec uri-query
@end defspec

@defspec uri-fragment
@end defspec

@defun build-uri scheme [#:userinfo] [#:host] [#:port] [#:path] [#:query] [#:fragment] [#:validate?]
Construct a URI object. If @var{validate?} is true, also run some
consistency checks to make sure that the constructed URI is valid.
@end defun

@defun declare-default-port! scheme port
Declare a default port for the given URI scheme.

Default ports are for printing URI objects: a default port is not
printed.
@end defun

@defun parse-uri string
Parse @var{string} into a URI object. Returns @code{#f} if the string
could not be parsed.
@end defun

@defun unparse-uri uri
Serialize @var{uri} to a string.
@end defun

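As an illustration, the following sketch parses a string, inspects a few
components, and serializes the object again. It assumes only that the
procedures behave as documented above; the URL is an arbitrary example.

@example
(use-modules (web uri))

;; parse-uri returns a URI object, or #f if the string cannot be parsed.
(define uri (parse-uri "http://www.gnu.org/software/guile/"))

(uri? uri)       ; true for a successfully parsed URI
(uri-host uri)   ; the host component, here "www.gnu.org"
(uri-path uri)   ; the path component, here "/software/guile/"

;; Serialize the URI object back to a string.
(unparse-uri uri)
@end example
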
@defun uri-decode str [#:charset]
Percent-decode the given @var{str}, according to @var{charset}.

Note that this function should not generally be applied to a full URI
string. For paths, use @code{split-and-decode-uri-path} instead. For
query strings, split the query on @code{&} and @code{=} boundaries, and
decode the components separately.

Note that percent-encoded strings encode @emph{bytes}, not characters.
There is no guarantee that a given byte sequence is a valid string
encoding. Therefore this routine may signal an error if the decoded
bytes are not valid for the given encoding. Pass @code{#f} for
@var{charset} if you want decoded bytes as a bytevector directly.
@end defun

@defun uri-encode str [#:charset] [#:unescaped-chars]
Percent-encode any character not in @var{unescaped-chars}.

Percent-encoding first writes out the given character to a bytevector
within the given @var{charset}, then encodes each byte as
@code{%@var{HH}}, where @var{HH} is the hexadecimal representation of
the byte.
@end defun

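A small round trip may make the two procedures more concrete. This is a
sketch that relies on the default @var{charset} and
@var{unescaped-chars}, and assumes that a space is not among the
characters left unescaped by default.

@example
(use-modules (web uri))

;; The space is expected to be written as a percent-encoded byte.
(define encoded (uri-encode "hello world"))

;; Decoding reverses the transformation and yields the original string.
(define decoded (uri-decode encoded))
@end example
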
@defun split-and-decode-uri-path path
Split @var{path} into its components, and decode each component,
removing empty components.

For example, @code{"/foo/bar/"} decodes to the two-element list,
@code{("foo" "bar")}.
@end defun

@defun encode-and-join-uri-path parts
URI-encode each element of @var{parts}, which should be a list of
strings, and join the parts together with @code{/} as a delimiter.
@end defun

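The two path procedures are roughly inverses of each other, up to the
empty components that splitting removes. A minimal sketch, using the
path from the example above:

@example
(use-modules (web uri))

;; Empty components are dropped, so this yields ("foo" "bar").
(define parts (split-and-decode-uri-path "/foo/bar/"))

;; Each part is encoded, and the parts are joined with "/".
(define joined (encode-and-join-uri-path parts))
@end example
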
@node HTTP
@subsection The Hypertext Transfer Protocol

@example
(use-modules (web http))
@end example

This module has a number of routines to parse textual
representations of HTTP data into native Scheme data structures.

It tries to follow RFCs fairly strictly (the road to perdition
being paved with compatibility hacks), though some allowances are
made for not-too-divergent texts (like a quality of .2 which should
be 0.2, etc).

@defspec header-decl?
@end defspec

@defspec make-header-decl
@end defspec

@defspec header-decl-sym
@end defspec

@defspec header-decl-name
@end defspec

@defspec header-decl-multiple?
@end defspec

@defspec header-decl-parser
@end defspec

@defspec header-decl-validator
@end defspec

@defspec header-decl-writer
@end defspec

@defun lookup-header-decl name
Return the @code{header-decl} object registered for the given @var{name}.

@var{name} may be a symbol or a string. Strings are mapped to headers in
a case-insensitive fashion.
@end defun

@defun declare-header! sym name [#:multiple?] [#:parser] [#:validator] [#:writer]
Define a parser, validator, and writer for the HTTP header, @var{name}.

@var{parser} should be a procedure that takes a string and returns a
Scheme value. @var{validator} is a predicate for whether the given
Scheme value is valid for this header. @var{writer} takes a value and a
port, and writes the value to the port.
@end defun

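For instance, a header whose value is a non-negative integer might be
declared as follows. This is only a sketch: @code{X-Request-Count} is a
made-up header, and the three procedures are the simplest ones that fit
the contracts described above.

@example
(use-modules (web http))

;; X-Request-Count is hypothetical, used here only for illustration.
(declare-header! 'x-request-count "X-Request-Count"
                 ;; Parser: takes a string, returns a Scheme value.
                 #:parser string->number
                 ;; Validator: accept only non-negative integers.
                 #:validator (lambda (val)
                               (and (integer? val) (>= val 0)))
                 ;; Writer: takes a value and a port.
                 #:writer (lambda (val port)
                            (display val port)))

;; The declaration can then be found by symbol or by string.
(lookup-header-decl 'x-request-count)
@end example
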
@defun read-header port
Reads one HTTP header from @var{port}. Returns two values: the header
name and the parsed Scheme value. May raise an exception if the header
was known but the value was invalid.

Returns @code{#f} for both values if the end of the message headers was
reached (i.e., a blank line).
@end defun

@defun parse-header name val
Parse @var{val}, a string, with the parser for the header named
@var{name}.

Returns two values, the header name and parsed value. If a parser was
found, the header name will be returned as a symbol. If a parser was not
found, both the header name and the value are returned as strings.
@end defun

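Since @code{parse-header} returns two values, a caller needs something
like @code{call-with-values} to receive them. The sketch below assumes
that @code{Content-Length} is among the headers already declared by the
module.

@example
(use-modules (web http))

(define parsed
  (call-with-values
      (lambda () (parse-header "Content-Length" "42"))
    (lambda (name val)
      ;; For a declared header, NAME is a symbol and VAL is the
      ;; parsed Scheme value; otherwise both are strings.
      (list name val))))
@end example
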
@defun valid-header? sym val
Returns a true value iff @var{val} is a valid Scheme value for the
header with name @var{sym}.
@end defun

@defun write-header name val port
Writes the given header name and value to @var{port}. If @var{name} is a
symbol, looks up a declared header and uses its writer. Otherwise the
value is written using @code{display}.
@end defun

@defun read-headers port
Read the headers of an HTTP message from @var{port}, returning them as
an ordered alist.
@end defun

@defun write-headers headers port
Write the given header alist to @var{port}. Does not write the final
@samp{\r\n}, as the user might want to add another header.
@end defun

@defun parse-http-method str [start] [end]
Parse an HTTP method from @var{str}. The result is an upper-case symbol,
like @code{GET}.
@end defun

@defun parse-http-version str [start] [end]
Parse an HTTP version from @var{str}, returning it as a major-minor
pair. For example, @code{HTTP/1.1} parses as the pair of integers,
@code{(1 . 1)}.
@end defun

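The results described above can be checked at the REPL with a short
sketch like this one:

@example
(use-modules (web http))

(define method (parse-http-method "GET"))         ; the symbol GET
(define version (parse-http-version "HTTP/1.1"))  ; the pair (1 . 1)
@end example
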
@defun parse-request-uri str [start] [end]
Parse a URI from an HTTP request line. Note that URIs in requests do not
have to have a scheme or host name. The result is a URI object.
@end defun

@defun read-request-line port
Read the first line of an HTTP request from @var{port}, returning three
values: the method, the URI, and the version.
@end defun

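One way to experiment with @code{read-request-line} without a network
connection is to feed it from a string port. The request line below is
an arbitrary example.

@example
(use-modules (web http))

;; Read a request line from a string port and collect the three
;; values into a list: method, URI object, and version.
(define request-line
  (call-with-values
      (lambda ()
        (call-with-input-string "GET /index.html HTTP/1.1\r\n"
          read-request-line))
    (lambda (method uri version)
      (list method uri version))))
@end example
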
@defun write-request-line method uri version port
Write the first line of an HTTP request to @var{port}.
@end defun

@defun read-response-line port
Read the first line of an HTTP response from @var{port}, returning three
values: the HTTP version, the response code, and the "reason phrase".
@end defun

@defun write-response-line version code reason-phrase port
Write the first line of an HTTP response to @var{port}.
@end defun

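Similarly, the writers can be tried out against a string port. This
sketch assumes that the version is given in the same major-minor pair
form that @code{parse-http-version} returns.

@example
(use-modules (web http))

;; Collect what write-response-line emits into a string.
(define status-line
  (call-with-output-string
    (lambda (port)
      (write-response-line '(1 . 1) 200 "OK" port))))
@end example
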

@node Requests
@subsection HTTP Requests

@example
(use-modules (web request))
@end example

@defspec request?
@end defspec

@defspec request-method
@end defspec

@defspec request-uri
@end defspec

@defspec request-version
@end defspec

@defspec request-headers
@end defspec

@defspec request-meta
@end defspec

@defspec request-port
@end defspec

@defun read-request port [meta]
Read an HTTP request from @var{port}, optionally attaching the given
metadata, @var{meta}.

As a side effect, sets the encoding on @var{port} to ISO-8859-1
(latin-1), so that reading one character reads one byte. See the
discussion of character sets in "HTTP Requests" in the manual, for more
information.
@end defun

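As a sketch, a request can be read from a string port standing in for a
client connection. The request text is made up for the example, and no
metadata is attached.

@example
(use-modules (web request))

;; A minimal request: request line, one header, and a blank line.
(define request
  (call-with-input-string
      "GET /hello HTTP/1.1\r\nHost: example.org\r\n\r\n"
    read-request))

(request-method request)   ; the method, as parsed from the request line
(request-version request)  ; the version, as a major-minor pair
@end example
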
@defun build-request [#:method] [#:uri] [#:version] [#:headers] [#:port] [#:meta] [#:validate-headers?]
Construct an HTTP request object. If @var{validate-headers?} is true,
the headers are each run through their respective validators.
@end defun

@defun write-request r port
Write the given HTTP request to @var{port}.

Returns a new request, whose @code{request-port} will continue writing
on @var{port}, perhaps using some transfer encoding.
@end defun

@defun read-request-body/latin-1 r
Reads the request body from @var{r}, as a string.

Assumes that the request port has ISO-8859-1 encoding, so that the
number of characters to read is the same as the
@code{request-content-length}. Returns @code{#f} if there was no request
body.
@end defun

@defun write-request-body/latin-1 r body
Write @var{body}, a string encodable in ISO-8859-1, to the port
corresponding to the HTTP request @var{r}.
@end defun

@defun read-request-body/bytevector r
Reads the request body from @var{r}, as a bytevector. Returns @code{#f}
if there was no request body.
@end defun

@defun write-request-body/bytevector r bv
Write @var{bv}, a bytevector, to the port corresponding to the HTTP
request @var{r}.
@end defun

@defun request-accept request [default='()]
@defunx request-accept-charset request [default='()]
@defunx request-accept-encoding request [default='()]
@defunx request-accept-language request [default='()]
@defunx request-allow request [default='()]
@defunx request-authorization request [default=#f]
@defunx request-cache-control request [default='()]
@defunx request-connection request [default='()]
@defunx request-content-encoding request [default='()]
@defunx request-content-language request [default='()]
@defunx request-content-length request [default=#f]
@defunx request-content-location request [default=#f]
@defunx request-content-md5 request [default=#f]
@defunx request-content-range request [default=#f]
@defunx request-content-type request [default=#f]
@defunx request-date request [default=#f]
@defunx request-expect request [default='()]
@defunx request-expires request [default=#f]
@defunx request-from request [default=#f]
@defunx request-host request [default=#f]
@defunx request-if-match request [default=#f]
@defunx request-if-modified-since request [default=#f]
@defunx request-if-none-match request [default=#f]
@defunx request-if-range request [default=#f]
@defunx request-if-unmodified-since request [default=#f]
@defunx request-last-modified request [default=#f]
@defunx request-max-forwards request [default=#f]
@defunx request-pragma request [default='()]
@defunx request-proxy-authorization request [default=#f]
@defunx request-range request [default=#f]
@defunx request-referer request [default=#f]
@defunx request-te request [default=#f]
@defunx request-trailer request [default='()]
@defunx request-transfer-encoding request [default='()]
@defunx request-upgrade request [default='()]
@defunx request-user-agent request [default=#f]
@defunx request-via request [default='()]
@defunx request-warning request [default='()]
@end defun

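Each accessor above takes an optional second argument, returned when the
header is absent. The following sketch builds a request without a
@code{User-Agent} header to show both forms; the request text and the
string @code{"unknown"} are made up for the example.

@example
(use-modules (web request))

(define request
  (call-with-input-string
      "GET /hello HTTP/1.1\r\nHost: example.org\r\n\r\n"
    read-request))

;; No User-Agent header was sent, so the default is returned.
(request-user-agent request)            ; #f, the documented default
(request-user-agent request "unknown")  ; the caller-supplied default
@end example
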
@defun request-absolute-uri r [default-host] [default-port]
@end defun



@node Responses
@subsection HTTP Responses

@example
(use-modules (web response))
@end example


@defspec response?
@end defspec

@defspec response-version
@end defspec

@defspec response-code
@end defspec

@defun response-reason-phrase response
Return the reason phrase given in @var{response}, or the standard reason
phrase for the response's code.
@end defun

@defspec response-headers
@end defspec

@defspec response-port
@end defspec

@defun read-response port
Read an HTTP response from @var{port}.

As a side effect, sets the encoding on @var{port} to ISO-8859-1
(latin-1), so that reading one character reads one byte. See the
discussion of character sets in "HTTP Responses" in the manual, for more
information.
@end defun

@defun build-response [#:version] [#:code] [#:reason-phrase] [#:headers] [#:port] [#:validate-headers?]
Construct an HTTP response object. If @var{validate-headers?} is true,
the headers are each run through their respective validators.
@end defun

@defun extend-response r k v . additional
Extend an HTTP response by setting additional HTTP headers @var{k},
@var{v}. Returns a new HTTP response.
@end defun

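A response can thus be built once and then extended without mutating the
original. The sketch below assumes that the keyword arguments not given
to @code{build-response} have usable defaults, and borrows the
@code{("text/plain")} form of the @code{content-type} value from the
``Hello, World!'' example in this chapter.

@example
(use-modules (web response))

;; Build a bare response, then derive a new one with an extra header.
(define base (build-response #:code 200))
(define with-type
  (extend-response base 'content-type '("text/plain")))

(response-code with-type)          ; the code carried over from BASE
(response-content-type with-type)  ; the header value set above
@end example
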
@defun adapt-response-version response version
Adapt the given response to a different HTTP version. Returns a new HTTP
response.

The idea is that many applications might just build a response for the
default HTTP version, and this procedure could handle a number of
programmatic transformations to respond to older HTTP versions (0.9 and
1.0). But currently this function is a bit heavy-handed, just updating
the version field.
@end defun

@defun write-response r port
Write the given HTTP response to @var{port}.

Returns a new response, whose @code{response-port} will continue writing
on @var{port}, perhaps using some transfer encoding.
@end defun

@defun read-response-body/latin-1 r
Reads the response body from @var{r}, as a string.

Assumes that the response port has ISO-8859-1 encoding, so that the
number of characters to read is the same as the
@code{response-content-length}. Returns @code{#f} if there was no
response body.
@end defun

@defun write-response-body/latin-1 r body
Write @var{body}, a string encodable in ISO-8859-1, to the port
corresponding to the HTTP response @var{r}.
@end defun

@defun read-response-body/bytevector r
Reads the response body from @var{r}, as a bytevector. Returns @code{#f}
if there was no response body.
@end defun

@defun write-response-body/bytevector r bv
Write @var{bv}, a bytevector, to the port corresponding to the HTTP
response @var{r}.
@end defun

@defun response-accept-ranges response [default=#f]
@defunx response-age response [default='()]
@defunx response-allow response [default='()]
@defunx response-cache-control response [default='()]
@defunx response-connection response [default='()]
@defunx response-content-encoding response [default='()]
@defunx response-content-language response [default='()]
@defunx response-content-length response [default=#f]
@defunx response-content-location response [default=#f]
@defunx response-content-md5 response [default=#f]
@defunx response-content-range response [default=#f]
@defunx response-content-type response [default=#f]
@defunx response-date response [default=#f]
@defunx response-etag response [default=#f]
@defunx response-expires response [default=#f]
@defunx response-last-modified response [default=#f]
@defunx response-location response [default=#f]
@defunx response-pragma response [default='()]
@defunx response-proxy-authenticate response [default=#f]
@defunx response-retry-after response [default=#f]
@defunx response-server response [default=#f]
@defunx response-trailer response [default='()]
@defunx response-transfer-encoding response [default='()]
@defunx response-upgrade response [default='()]
@defunx response-vary response [default='()]
@defunx response-via response [default='()]
@defunx response-warning response [default='()]
@defunx response-www-authenticate response [default=#f]
@end defun


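@node Web Handlers
@subsection Web Handlers

A web handler is a procedure that takes a request and produces a
response. @xref{Web Server}, for the arguments a handler receives and
the values it should return.
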
@node Web Server
@subsection Web Server

@code{(web server)} is a generic web server interface, along with a main
loop implementation for web servers controlled by Guile.

The lowest layer is the @code{<server-impl>} object, which defines a set
of hooks to open a server, read a request from a client, write a
response to a client, and close a server. These hooks (@code{open},
@code{read}, @code{write}, and @code{close}, respectively) are bound
together in a @code{<server-impl>} object. Procedures in this module
take a @code{<server-impl>} object, if needed.

A @code{<server-impl>} may also be looked up by name. If you pass the
@code{http} symbol to @code{run-server}, Guile looks for a variable named
@code{http} in the @code{(web server http)} module, which should be bound
to a @code{<server-impl>} object. Such a binding is made by instantiation
of the @code{define-server-impl} syntax. In this way the
@code{run-server} loop can automatically load other backends if
available.

The life cycle of a server goes as follows:

@enumerate
@item
The @code{open} hook is called, to open the server. @code{open} takes 0 or
more arguments, depending on the backend, and returns an opaque
server socket object, or signals an error.

@item
The @code{read} hook is called, to read a request from a new client.
The @code{read} hook takes one argument, the server socket. It
should return three values: an opaque client socket, the
request, and the request body. The request should be a
@code{<request>} object, from @code{(web request)}. The body should be a
string or a bytevector, or @code{#f} if there is no body.

If the read failed, the @code{read} hook may return @code{#f} for the
client socket, request, and body.

@item
A user-provided handler procedure is called, with the request
and body as its arguments. The handler should return two
values: the response, as a @code{<response>} record from @code{(web
response)}, and the response body as a string, bytevector, or
@code{#f} if not present. We also allow the response to be simply an
alist of headers, in which case a default response object is
constructed with those headers.

@item
The @code{write} hook is called with three arguments: the client
socket, the response, and the body. The @code{write} hook returns no
values.

@item
At this point the request handling is complete. For a loop, we
loop back and try to read a new request.

@item
If the user interrupts the loop, the @code{close} hook is called on
the server socket.
@end enumerate

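The life cycle above can be sketched in terms of the procedures
documented in the rest of this section. This is an illustration, not the
module's own loop: @code{serve-once} and @code{serve-n} are hypothetical
helpers, and the code assumes that @code{handle-request} returns three
values (the response, the response body, and the new state list).

@example
(use-modules (web server))

;; Hypothetical helper: one pass through the read, handle, and
;; write steps of the life cycle.
(define (serve-once handler impl server state)
  (call-with-values
      (lambda () (read-client impl server))
    (lambda (client request body)
      (if client
          ;; Assumption: handle-request returns the response, the
          ;; response body, and the new state list, in that order.
          (call-with-values
              (lambda () (handle-request handler request body state))
            (lambda (response body state)
              (write-client impl server client response body)
              state))
          ;; The read failed; keep the old state.
          state))))

;; Hypothetical driver: open, serve N clients, then close.
(define (serve-n handler n)
  (let* ((impl (lookup-server-impl 'http))
         (server (open-server impl '())))
    (let loop ((i 0) (state '()))
      (if (< i n)
          (loop (+ i 1) (serve-once handler impl server state))
          (close-server impl server)))))
@end example

In practice @code{serve-one-client} and @code{run-server} already
provide this behavior; the sketch only shows how the hooks fit together.
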
@defspec define-server-impl name open read write close
@end defspec

@defun lookup-server-impl impl
Look up a server implementation. If @var{impl} is a server
implementation already, it is returned directly. If it is a symbol, the
binding named @var{impl} in the @code{(web server @var{impl})} module is
looked up. Otherwise an error is signaled.

Currently a server implementation is a somewhat opaque type, useful only
for passing to other procedures in this module, like @code{read-client}.
@end defun

@defun open-server impl open-params
Open a server for the given implementation. Returns one value, the new
server object. The implementation's @code{open} procedure is applied to
@var{open-params}, which should be a list.
@end defun

@defun read-client impl server
Read a new client from @var{server}, by applying the implementation's
@code{read} procedure to the server. If successful, returns three
values: an object corresponding to the client, a request object, and the
request body. If any exception occurs, returns @code{#f} for all three
values.
@end defun

@defun handle-request handler request body state
Handle a given request, returning the response and body.

The response and response body are produced by calling the given
@var{handler} with @var{request} and @var{body} as arguments.

The elements of @var{state} are also passed to @var{handler} as
arguments, and may be returned as additional values. The new
@var{state}, collected from the @var{handler}'s return values, is then
returned as a list. The idea is that a server loop receives a handler
from the user, along with whatever state values the user is interested
in, allowing the user's handler to explicitly manage its state.
@end defun

@defun sanitize-response request response body
``Sanitize'' the given response and body, making them appropriate for
the given request.

As a convenience to web handler authors, @var{response} may be given as
an alist of headers, in which case it is used to construct a default
response. Ensures that the response version corresponds to the request
version. If @var{body} is a string, encodes the string to a bytevector,
in an encoding appropriate for @var{response}. Adds a
@code{content-length} and @code{content-type} header, as necessary.

If @var{body} is a procedure, it is called with a port as an argument,
and the output collected as a bytevector. In the future we might try to
instead use a compressing, chunk-encoded port, and call this procedure
later, in the @code{write-client} procedure. Authors are advised not to
rely on the procedure being called at any particular time.
@end defun

@defun write-client impl server client response body
Write an HTTP response and body to @var{client}. If the server and
client support persistent connections, it is the implementation's
responsibility to keep track of the client thereafter, presumably by
attaching it to the @var{server} argument somehow.
@end defun

@defun close-server impl server
Release resources allocated by a previous invocation of
@code{open-server}.
@end defun

@defun serve-one-client handler impl server state
Read one request from @var{server}, call @var{handler} on the request
and body, and write the response to the client. Returns the new state
produced by the handler procedure.
@end defun

@defun run-server handler [impl] [open-params] . state
Run Guile's built-in web server.

@var{handler} should be a procedure that takes two or more arguments,
the HTTP request and request body, and returns two or more values, the
response and response body.

For example, here is a simple "Hello, World!" server:

@example
(define (handler request body)
  (values '((content-type . ("text/plain")))
          "Hello, World!"))
(run-server handler)
@end example

The response and body will be run through @code{sanitize-response}
before sending back to the client.

Additional arguments to @var{handler} are taken from @var{state}.
Additional return values are accumulated into a new @var{state}, which
will be used for subsequent requests. In this way a handler can
explicitly manage its state.

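As a sketch of explicit state management, the handler below threads a
request counter through @var{state}. The names @code{counting-handler}
and @code{start-counter} are made up for the example, and the port
number is arbitrary.

@example
(use-modules (web server))

;; COUNT is the single element of the state: it arrives as a third
;; argument, and its successor is returned as a third value.
(define (counting-handler request body count)
  (values '((content-type . ("text/plain")))
          (string-append "Requests so far: " (number->string count))
          (+ count 1)))

;; Call (start-counter) to run the server with an initial count of 0.
;; The call does not return until the loop is interrupted.
(define (start-counter)
  (run-server counting-handler 'http '(#:port 8081) 0))
@end example
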
The default server implementation is @code{http}, which accepts
@var{open-params} like @code{(#:port 8081)}, among others. See "Web
Server" in the manual, for more information.
@end defun

@example
(use-modules (web server))
@end example


@c Local Variables:
@c TeX-master: "guile.texi"
@c End: