Commit | Line | Data |
---|---|---|
b0322a85 CE |
1 | '\" t |
2 | .\"<!-- Copyright 2001-2007 Double Precision, Inc. See COPYING for --> | |
3 | .\"<!-- distribution information. --> | |
d9898ee8 | 4 | .\" Title: rfc822 |
b0322a85 CE |
5 | .\" Author: Sam Varshavchik |
6 | .\" Generator: DocBook XSL Stylesheets v1.78.1 <http://docbook.sf.net/> | |
7 | .\" Date: 08/25/2013 | |
d9898ee8 | 8 | .\" Manual: Double Precision, Inc. |
b0322a85 CE |
9 | .\" Source: Courier Mail Server |
10 | .\" Language: English | |
d9898ee8 | 11 | .\" |
b0322a85 CE |
12 | .TH "RFC822" "3" "08/25/2013" "Courier Mail Server" "Double Precision, Inc\&." |
13 | .\" ----------------------------------------------------------------- | |
14 | .\" * Define some portability stuff | |
15 | .\" ----------------------------------------------------------------- | |
16 | .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
17 | .\" http://bugs.debian.org/507673 | |
18 | .\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html | |
19 | .\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |
20 | .ie \n(.g .ds Aq \(aq | |
21 | .el .ds Aq ' | |
22 | .\" ----------------------------------------------------------------- | |
23 | .\" * set default formatting | |
24 | .\" ----------------------------------------------------------------- | |
d9898ee8 | 25 | .\" disable hyphenation |
26 | .nh | |
27 | .\" disable justification (adjust text to left margin only) | |
28 | .ad l | |
b0322a85 CE |
29 | .\" ----------------------------------------------------------------- |
30 | .\" * MAIN CONTENT STARTS HERE * | |
31 | .\" ----------------------------------------------------------------- | |
d9898ee8 | 32 | .SH "NAME" |
b0322a85 | 33 | rfc822 \- RFC 822 parsing library |
d9898ee8 | 34 | .SH "SYNOPSIS" |
35 | .sp | |
d9898ee8 | 36 | .nf |
b0322a85 | 37 | #include <rfc822\&.h> |
d9898ee8 | 38 | |
b0322a85 | 39 | #include <rfc2047\&.h> |
d9898ee8 | 40 | |
b0322a85 | 41 | cc \&.\&.\&. \-lrfc822 |
d9898ee8 | 42 | .fi |
d9898ee8 | 43 | .SH "DESCRIPTION" |
44 | .PP | |
b0322a85 | 45 | The rfc822 library provides functions for parsing E\-mail headers in the RFC 822 format\&. This library also includes some functions to help with encoding and decoding 8\-bit text, as defined by RFC 2047\&. |
d9898ee8 | 46 | .PP |
47 | The format used by E\-mail headers to encode sender and recipient information is defined by | |
b0322a85 | 48 | \m[blue]\fBRFC 822\fR\m[]\&\s-2\u[1]\d\s+2 |
d9898ee8 | 49 | (and its successor, |
b0322a85 CE |
50 | \m[blue]\fBRFC 2822\fR\m[]\&\s-2\u[2]\d\s+2)\&. The format allows the actual E\-mail address and the sender/recipient name to be expressed together, for example: |
51 | John Smith <jsmith@example\&.com> | |
d9898ee8 | 52 | .PP |
53 | The main purposes of the rfc822 library are to: | |
54 | .PP | |
b0322a85 | 55 | 1) Parse a text string containing a list of RFC 822\-formatted addresses into its logical components: names and E\-mail addresses\&. |
d9898ee8 | 56 | .PP |
b0322a85 | 57 | 2) Access those individual components\&. |
d9898ee8 | 58 | .PP |
b0322a85 | 59 | 3) Allow some limited modifications of the parsed structure, and then convert it back into a text string\&. |
d9898ee8 | 60 | .SS "Tokenizing an E\-mail header" |
61 | .sp | |
b0322a85 | 62 | .if n \{\ |
d9898ee8 | 63 | .RS 4 |
b0322a85 | 64 | .\} |
d9898ee8 | 65 | .nf |
66 | struct rfc822t *tokens=rfc822t_alloc_new(const char *header, | |
67 | void (*err_func)(const char *, int, void *), | |
68 | void *func_arg); | |
69 | ||
70 | void rfc822t_free(tokens); | |
71 | .fi | |
b0322a85 | 72 | .if n \{\ |
d9898ee8 | 73 | .RE |
b0322a85 | 74 | .\} |
d9898ee8 | 75 | .PP |
76 | The | |
77 | \fBrfc822t_alloc_new\fR() function (supersedes | |
78 | \fBrfc822t_alloc\fR(), which is now obsolete) accepts an E\-mail | |
b0322a85 | 79 | \fIheader\fR, and parses it into individual tokens\&. This function allocates and returns a pointer to an |
d9898ee8 | 80 | rfc822t |
81 | structure, which is later used by | |
b0322a85 | 82 | \fBrfc822a_alloc\fR() to extract individual addresses from these tokens\&. |
d9898ee8 | 83 | .PP |
84 | The | |
85 | \fIerr_func\fR | |
b0322a85 | 86 | argument, if not NULL, is a pointer to a callback function\&. The function is called in the event that the E\-mail header is corrupted to the point that it cannot even be parsed\&. This is a rare instance \-\- most forms of corruption are still valid at least on the lexical level\&. The only time this error is reported is in the event of mismatched parentheses, angle brackets, or quotes\&. The callback function receives the |
d9898ee8 | 87 | \fIheader\fR |
88 | pointer, an index to the syntax error in the header string, and the | |
89 | \fIfunc_arg\fR | |
b0322a85 | 90 | argument\&. |
d9898ee8 | 91 | .PP |
92 | The semantics of | |
93 | \fIerr_func\fR | |
b0322a85 | 94 | are subject to change\&. It is recommended to leave this argument as NULL in the current version of the library\&. |
d9898ee8 | 95 | .PP |
d9898ee8 | 96 | \fBrfc822t_alloc\fR() returns a pointer to a dynamically\-allocated |
97 | rfc822t | |
b0322a85 | 98 | structure\&. A NULL pointer is returned if there\*(Aqs insufficient memory to allocate this structure\&. The |
d9898ee8 | 99 | \fBrfc822t_free\fR() function destroys the |
100 | rfc822t | |
b0322a85 CE |
101 | structure and frees all dynamically allocated memory\&. |
102 | .if n \{\ | |
d9898ee8 | 103 | .sp |
b0322a85 CE |
104 | .\} |
105 | .RS 4 | |
d9898ee8 | 106 | .it 1 an-trap |
107 | .nr an-no-space-flag 1 | |
108 | .nr an-break-flag 1 | |
109 | .br | |
b0322a85 CE |
110 | .ps +1 |
111 | \fBNote\fR | |
112 | .ps -1 | |
113 | .br | |
d9898ee8 | 114 | .PP |
115 | Until | |
116 | \fBrfc822t_free\fR() is called, the contents of | |
117 | \fIheader\fR | |
b0322a85 | 118 | MUST NOT be destroyed or altered in any way\&. The contents of |
d9898ee8 | 119 | \fIheader\fR |
120 | are not modified by | |
121 | \fBrfc822t_alloc\fR(); however, the | |
122 | rfc822t | |
123 | structure contains pointers to portions of the supplied | |
b0322a85 CE |
124 | \fIheader\fR, and they must remain valid\&. |
125 | .sp .5v | |
126 | .RE | |
d9898ee8 | 127 | .SS "Extracting E\-mail addresses" |
128 | .sp | |
b0322a85 | 129 | .if n \{\ |
d9898ee8 | 130 | .RS 4 |
b0322a85 | 131 | .\} |
d9898ee8 | 132 | .nf |
133 | struct rfc822a *addrs=rfc822a_alloc(struct rfc822t *tokens); | |
134 | ||
135 | void rfc822a_free(addrs); | |
136 | .fi | |
b0322a85 | 137 | .if n \{\ |
d9898ee8 | 138 | .RE |
b0322a85 | 139 | .\} |
d9898ee8 | 140 | .PP |
141 | The | |
142 | \fBrfc822a_alloc\fR() function returns a dynamically\-allocated | |
143 | rfc822a | |
144 | structure that contains individual addresses that were logically parsed from an | |
145 | rfc822t | |
b0322a85 | 146 | structure\&. The |
d9898ee8 | 147 | \fBrfc822a_alloc\fR() function returns NULL if there was insufficient memory to allocate the |
148 | rfc822a | |
b0322a85 | 149 | structure\&. The |
d9898ee8 | 150 | \fBrfc822a_free\fR() function destroys the |
151 | rfc822a | |
b0322a85 | 152 | structure and frees all associated dynamically\-allocated memory\&. The |
d9898ee8 | 153 | rfc822t |
154 | structure passed to | |
155 | \fBrfc822a_alloc\fR() must not be destroyed before | |
156 | \fBrfc822a_free\fR() destroys the | |
157 | rfc822a | |
b0322a85 | 158 | structure\&. |
d9898ee8 | 159 | .PP |
160 | The | |
161 | rfc822a | |
162 | structure has the following fields: | |
163 | .sp | |
b0322a85 | 164 | .if n \{\ |
d9898ee8 | 165 | .RS 4 |
b0322a85 | 166 | .\} |
d9898ee8 | 167 | .nf |
168 | struct rfc822a { | |
169 | struct rfc822addr *addrs; | |
170 | int naddrs; | |
171 | } ; | |
172 | .fi | |
b0322a85 | 173 | .if n \{\ |
d9898ee8 | 174 | .RE |
b0322a85 | 175 | .\} |
d9898ee8 | 176 | .PP |
177 | The | |
b0322a85 | 178 | \fInaddrs\fR |
d9898ee8 | 179 | field gives the number of |
180 | rfc822addr | |
181 | structures that are pointed to by | |
b0322a85 | 182 | \fIaddrs\fR, which is an array\&. Each |
d9898ee8 | 183 | rfc822addr |
184 | structure represents either an address found in the original E\-mail header, | |
b0322a85 | 185 | \fIor the contents of some legacy "syntactical sugar"\fR\&. For example, the following is a valid E\-mail header: |
d9898ee8 | 186 | .sp |
b0322a85 | 187 | .if n \{\ |
d9898ee8 | 188 | .RS 4 |
b0322a85 | 189 | .\} |
d9898ee8 | 190 | .nf |
b0322a85 | 191 | To: recipient\-list: tom@example\&.com, john@example\&.com; |
d9898ee8 | 192 | .fi |
b0322a85 | 193 | .if n \{\ |
d9898ee8 | 194 | .RE |
b0322a85 | 195 | .\} |
d9898ee8 | 196 | .PP |
197 | Typically, all of this, except for "To:", is tokenized by | |
198 | \fBrfc822t_alloc\fR(), then parsed by | |
b0322a85 | 199 | \fBrfc822a_alloc\fR()\&. "recipient\-list:" and the trailing semicolon are a legacy mailing list specification that is no longer in widespread use, but must still be accounted for\&. The resulting |
d9898ee8 | 200 | rfc822a |
201 | structure will have four | |
202 | rfc822addr | |
b0322a85 | 203 | structures: one for "recipient\-list:"; one for each address; and one for the trailing semicolon\&. Each |
d9898ee8 | 204 | rfc822addr |
205 | structure has the following fields: | |
206 | .sp | |
b0322a85 | 207 | .if n \{\ |
d9898ee8 | 208 | .RS 4 |
b0322a85 | 209 | .\} |
d9898ee8 | 210 | .nf |
211 | struct rfc822addr { | |
212 | struct rfc822token *tokens; | |
213 | struct rfc822token *name; | |
214 | } ; | |
215 | .fi | |
b0322a85 | 216 | .if n \{\ |
d9898ee8 | 217 | .RE |
b0322a85 | 218 | .\} |
d9898ee8 | 219 | .PP |
220 | If | |
b0322a85 CE |
221 | \fItokens\fR |
222 | is a null pointer, this structure represents some non\-address portion of the original header, such as "recipient\-list:" or a semicolon\&. Otherwise it points to a structure that represents the E\-mail address in tokenized form\&. | |
223 | .PP | |
224 | \fIname\fR | |
225 | either points to the tokenized form of a non\-address portion of the original header, or to a tokenized form of the recipient\*(Aqs name\&. | |
226 | \fIname\fR | |
227 | will be NULL if the recipient name was not provided\&. For the following address: | |
228 | Tom Jones <tjones@example\&.com> | |
d9898ee8 | 229 | \- the |
b0322a85 CE |
230 | \fItokens\fR |
231 | field points to the tokenized form of "tjones@example\&.com", and | |
232 | \fIname\fR | |
233 | points to the tokenized form of "Tom Jones"\&. | |
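The distinction between address entries and group-syntax entries can be sketched as follows. This is a minimal illustration, not library code: the structures are re-declared locally from the listings in this manual page so that the fragment compiles on its own (a real program would include rfc822.h, whose declarations may carry additional fields), and count_real_addresses() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the structures as printed in this manual page.
 * Assumption: a real program includes <rfc822.h> instead. */
struct rfc822token {
    struct rfc822token *next;
    int token;
    const char *ptr;
    int len;
};

struct rfc822addr {
    struct rfc822token *tokens;  /* NULL for "recipient-list:" or ";" */
    struct rfc822token *name;    /* NULL if no name was given */
};

struct rfc822a {
    struct rfc822addr *addrs;
    int naddrs;
};

/* Hypothetical helper: count the entries that carry an actual
 * E-mail address, skipping group-syntax entries. */
int count_real_addresses(const struct rfc822a *a)
{
    int i, n = 0;

    assert(a != NULL);
    for (i = 0; i < a->naddrs; i++)
        if (a->addrs[i].tokens != NULL)
            n++;
    return n;
}
```

For the "recipient\-list:" header shown earlier, such a loop would be expected to skip the first and last of the four entries and count the two addresses in between.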
d9898ee8 | 234 | .PP |
235 | Each | |
236 | rfc822token | |
237 | structure contains the following fields: | |
238 | .sp | |
b0322a85 | 239 | .if n \{\ |
d9898ee8 | 240 | .RS 4 |
b0322a85 | 241 | .\} |
d9898ee8 | 242 | .nf |
243 | struct rfc822token { | |
244 | struct rfc822token *next; | |
245 | int token; | |
246 | const char *ptr; | |
247 | int len; | |
248 | } ; | |
249 | .fi | |
b0322a85 | 250 | .if n \{\ |
d9898ee8 | 251 | .RE |
b0322a85 | 252 | .\} |
d9898ee8 | 253 | .PP |
254 | The | |
b0322a85 CE |
255 | \fInext\fR |
256 | pointer builds a linked list of all tokens in this name or address\&. The possible values for the | |
257 | \fItoken\fR | |
d9898ee8 | 258 | field are: |
259 | .PP | |
260 | 0x00 | |
261 | .RS 4 | |
b0322a85 | 262 | This is a simple atom \- a sequence of non\-special characters that is delimited by whitespace or special characters (see below)\&. |
d9898ee8 | 263 | .RE |
264 | .PP | |
265 | 0x22 | |
266 | .RS 4 | |
b0322a85 | 267 | The value of the ASCII quote \- this is a quoted string\&. |
d9898ee8 | 268 | .RE |
269 | .PP | |
b0322a85 | 270 | Open parenthesis: \*(Aq(\*(Aq |
d9898ee8 | 271 | .RS 4 |
b0322a85 | 272 | This is an old style comment\&. A deprecated form of E\-mail addressing uses \- for example \- "john@example\&.com (John Smith)" instead of "John Smith <john@example\&.com>"\&. This old\-style notation defined parenthesized content as arbitrary comments\&. The |
d9898ee8 | 273 | rfc822token |
274 | with | |
b0322a85 CE |
275 | \fItoken\fR |
276 | set to \*(Aq(\*(Aq is created for the contents of the entire comment\&. | |
d9898ee8 | 277 | .RE |
278 | .PP | |
b0322a85 | 279 | Symbols: \*(Aq<\*(Aq, \*(Aq>\*(Aq, \*(Aq@\*(Aq, and many others |
d9898ee8 | 280 | .RS 4 |
281 | The remaining possible values of | |
b0322a85 CE |
282 | \fItoken\fR |
283 | include all the characters in RFC 822 headers that have special significance\&. | |
d9898ee8 | 284 | .RE |
285 | .PP | |
286 | When a | |
287 | rfc822token | |
288 | structure does not represent a special character, the | |
b0322a85 CE |
289 | \fIptr\fR |
290 | field points to a text string giving its contents\&. The contents are NOT null\-terminated; the | |
291 | \fIlen\fR | |
292 | field contains the number of characters included\&. The macro rfc822_is_atom(token) indicates whether | |
293 | \fIptr\fR | |
d9898ee8 | 294 | and |
b0322a85 | 295 | \fIlen\fR |
d9898ee8 | 296 | are used for the given |
b0322a85 | 297 | \fItoken\fR\&. Currently |
d9898ee8 | 298 | \fBrfc822_is_atom\fR() returns true if |
b0322a85 CE |
299 | \fItoken\fR |
300 | is a zero byte, \*(Aq"\*(Aq, or \*(Aq(\*(Aq\&. | |
d9898ee8 | 301 | .PP |
b0322a85 CE |
302 | Note that it\*(Aqs possible that |
303 | \fIlen\fR | |
304 | might be zero\&. This happens with null addresses used as return addresses for delivery status notifications\&. | |
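Because the token text is not null\-terminated, any copy has to be bounded by len. The sketch below shows one way to do that. It is self\-contained and does not use the library: the structure is re-declared from the listing above, token_has_text() is a hypothetical stand-in that mirrors the behaviour described for rfc822_is_atom() (it is not the library macro), and token_text_dup() is an illustrative name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Local copy of the structure as printed in this manual page. */
struct rfc822token {
    struct rfc822token *next;
    int token;
    const char *ptr;
    int len;
};

/* Stand-in for rfc822_is_atom() as described above:
 * true for a zero byte, '"', or '('. */
static int token_has_text(int token)
{
    return token == 0 || token == '"' || token == '(';
}

/* Return a malloc()ed, null-terminated copy of one token's text,
 * or NULL if the token carries no text or memory runs out. */
char *token_text_dup(const struct rfc822token *t)
{
    char *copy;

    assert(t != NULL);
    if (!token_has_text(t->token) || t->len < 0)
        return NULL;
    copy = malloc((size_t)t->len + 1);
    if (copy == NULL)
        return NULL;
    if (t->len > 0)              /* len may legitimately be zero */
        memcpy(copy, t->ptr, (size_t)t->len);
    copy[t->len] = '\0';
    return copy;
}
```

The caller releases the returned buffer with free(3). The zero-length case is handled explicitly so that a null address yields an empty string rather than an error.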
d9898ee8 | 305 | .SS "Working with E\-mail addresses" |
306 | .sp | |
b0322a85 | 307 | .if n \{\ |
d9898ee8 | 308 | .RS 4 |
b0322a85 | 309 | .\} |
d9898ee8 | 310 | .nf |
311 | void rfc822_deladdr(struct rfc822a *addrs, int index); | |
312 | ||
313 | void rfc822tok_print(const struct rfc822token *list, | |
314 | void (*func)(char, void *), void *func_arg); | |
315 | ||
316 | void rfc822_print(const struct rfc822a *addrs, | |
317 | void (*print_func)(char, void *), | |
318 | void (*print_separator)(const char *, void *), void *callback_arg); | |
319 | ||
320 | void rfc822_addrlist(const struct rfc822a *addrs, | |
321 | void (*print_func)(char, void *), | |
322 | void *callback_arg); | |
323 | ||
324 | void rfc822_namelist(const struct rfc822a *addrs, | |
325 | void (*print_func)(char, void *), | |
326 | void *callback_arg); | |
327 | ||
328 | void rfc822_praddr(const struct rfc822a *addrs, | |
329 | int index, | |
330 | void (*print_func)(char, void *), | |
331 | void *callback_arg); | |
332 | ||
333 | void rfc822_prname(const struct rfc822a *addrs, | |
334 | int index, | |
335 | void (*print_func)(char, void *), | |
336 | void *callback_arg); | |
337 | ||
338 | void rfc822_prname_orlist(const struct rfc822a *addrs, | |
339 | int index, | |
340 | void (*print_func)(char, void *), | |
341 | void *callback_arg); | |
342 | ||
343 | char *rfc822_gettok(const struct rfc822token *list); | |
344 | char *rfc822_getaddrs(const struct rfc822a *addrs); | |
345 | char *rfc822_getaddr(const struct rfc822a *addrs, int index); | |
346 | char *rfc822_getname(const struct rfc822a *addrs, int index); | |
347 | char *rfc822_getname_orlist(const struct rfc822a *addrs, int index); | |
348 | ||
349 | char *rfc822_getaddrs_wrap(const struct rfc822a *, int); | |
350 | .fi | |
b0322a85 | 351 | .if n \{\ |
d9898ee8 | 352 | .RE |
b0322a85 | 353 | .\} |
d9898ee8 | 354 | .PP |
355 | These functions are used to work with individual addresses that are parsed by | |
b0322a85 | 356 | \fBrfc822a_alloc\fR()\&. |
d9898ee8 | 357 | .PP |
d9898ee8 | 358 | \fBrfc822_deladdr\fR() removes a single |
359 | rfc822addr | |
360 | structure, whose | |
361 | \fIindex\fR | |
362 | is given, from the address array in | |
b0322a85 CE |
363 | rfc822a\&. | |
364 | \fInaddrs\fR | |
365 | is decremented by one\&. | |
d9898ee8 | 366 | .PP |
d9898ee8 | 367 | \fBrfc822tok_print\fR() converts a tokenized |
368 | \fIlist\fR | |
369 | of | |
370 | rfc822token | |
b0322a85 CE |
371 | objects into a text string\&. The callback function, |
372 | \fIfunc\fR, is called one character at a time, for every character in the tokenized objects\&. An arbitrary pointer, | |
373 | \fIfunc_arg\fR, is passed unchanged as the additional argument to the callback function\&. | |
374 | \fBrfc822tok_print\fR() is not usually the most convenient and efficient function, but it has its uses\&. | |
d9898ee8 | 375 | .PP |
d9898ee8 | 376 | \fBrfc822_print\fR() takes an entire |
377 | rfc822a | |
b0322a85 | 378 | structure, and uses the callback functions to print the contained addresses, in their original form, separated by commas\&. The function pointed to by |
d9898ee8 | 379 | \fIprint_func\fR |
b0322a85 | 380 | is used to print each individual address, one character at a time\&. Between the addresses, the |
d9898ee8 | 381 | \fIprint_separator\fR |
b0322a85 | 382 | function is called to print the address separator, usually the string ", "\&. The |
d9898ee8 | 383 | \fIcallback_arg\fR |
b0322a85 | 384 | argument is passed along unchanged, as an additional argument to these functions\&. |
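A pair of callbacks matching the print_func and print_separator signatures might look like the sketch below. It is an illustration under stated assumptions: struct outbuf, outbuf_putc() and outbuf_puts() are hypothetical names, and the 256\-byte limit is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size accumulator, passed as callback_arg. */
struct outbuf {
    char data[256];
    size_t used;
    int truncated;   /* set when output did not fit */
};

/* Matches the void (*print_func)(char, void *) signature. */
void outbuf_putc(char c, void *arg)
{
    struct outbuf *b = arg;

    assert(b != NULL);
    if (b->used + 1 < sizeof(b->data)) {
        b->data[b->used++] = c;
        b->data[b->used] = '\0';   /* keep the buffer terminated */
    } else {
        b->truncated = 1;
    }
}

/* Matches the void (*print_separator)(const char *, void *) signature. */
void outbuf_puts(const char *s, void *arg)
{
    while (*s)
        outbuf_putc(*s++, arg);
}
```

With a zero-initialized struct outbuf named buf, these would be passed as rfc822_print(addrs, outbuf_putc, outbuf_puts, &buf), following the prototype in the synopsis above.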
d9898ee8 | 385 | .PP |
386 | The functions | |
387 | \fBrfc822_addrlist\fR() and | |
388 | \fBrfc822_namelist\fR() also print the contents of the entire | |
389 | rfc822a | |
b0322a85 CE |
390 | structure, but in a different way\&. |
391 | \fBrfc822_addrlist\fR() prints just the actual E\-mail addresses, not the recipient names or comments\&. Each E\-mail address is followed by a newline character\&. | |
392 | \fBrfc822_namelist\fR() prints just the names or comments, followed by newlines\&. | |
d9898ee8 | 393 | .PP |
394 | The functions | |
395 | \fBrfc822_praddr\fR() and | |
396 | \fBrfc822_prname\fR() are just like | |
397 | \fBrfc822_addrlist\fR() and | |
398 | \fBrfc822_namelist\fR(), except that they print a single name or address in the | |
399 | rfc822a | |
400 | structure, given its | |
b0322a85 | 401 | \fIindex\fR\&. The functions |
d9898ee8 | 402 | \fBrfc822_gettok\fR(), |
403 | \fBrfc822_getaddrs\fR(), | |
404 | \fBrfc822_getaddr\fR(), and | |
405 | \fBrfc822_getname\fR() are equivalent to | |
406 | \fBrfc822tok_print\fR(), | |
407 | \fBrfc822_print\fR(), | |
408 | \fBrfc822_praddr\fR() and | |
b0322a85 CE |
409 | \fBrfc822_prname\fR(), but, instead of using a callback function pointer, these functions write the output into a dynamically allocated buffer\&. That buffer must be destroyed by |
410 | \fBfree\fR(3) after use\&. These functions will return a null pointer in the event of a failure to allocate memory for the buffer\&. | |
d9898ee8 | 411 | .PP |
d9898ee8 | 412 | \fBrfc822_prname_orlist\fR() is similar to |
413 | \fBrfc822_prname\fR(), except that it will also print the legacy RFC 822 group list syntax (which is also parsed by | |
b0322a85 CE |
414 | \fBrfc822a_alloc\fR())\&. |
415 | \fBrfc822_praddr\fR() will print an empty string for an index that corresponds to a group list name (or terminating semicolon)\&. | |
416 | \fBrfc822_prname\fR() will also print an empty string\&. | |
417 | \fBrfc822_prname_orlist\fR() will instead print either the name of the group list, or a single string ";"\&. | |
418 | \fBrfc822_getname_orlist\fR() will instead save it into a dynamically allocated buffer\&. | |
d9898ee8 | 419 | .PP |
420 | The function | |
421 | \fBrfc822_getaddrs_wrap\fR() is similar to | |
b0322a85 | 422 | \fBrfc822_getaddrs\fR(), except that the generated text is wrapped on or about the 73rd column, using newline characters\&. |
d9898ee8 | 423 | .SS "Working with dates" |
424 | .sp | |
b0322a85 | 425 | .if n \{\ |
d9898ee8 | 426 | .RS 4 |
b0322a85 | 427 | .\} |
d9898ee8 | 428 | .nf |
429 | time_t timestamp=rfc822_parsedt(const char *datestr); | |
430 | const char *datestr=rfc822_mkdate(time_t timestamp); | |
431 | void rfc822_mkdate_buf(time_t timestamp, char *buffer); | |
432 | .fi | |
b0322a85 | 433 | .if n \{\ |
d9898ee8 | 434 | .RE |
b0322a85 | 435 | .\} |
d9898ee8 | 436 | .PP |
437 | These functions convert between timestamps and dates expressed in the | |
438 | Date: | |
b0322a85 | 439 | E\-mail header format\&. |
d9898ee8 | 440 | .PP |
b0322a85 | 441 | \fBrfc822_parsedt\fR() returns the timestamp corresponding to the given date string (0 if there was a syntax error)\&. |
d9898ee8 | 442 | .PP |
b0322a85 CE |
443 | \fBrfc822_mkdate\fR() returns a date string corresponding to the given timestamp\&. |
444 | \fBrfc822_mkdate_buf\fR() writes the date string into the given buffer instead, which must be big enough to accommodate it\&. | |
d9898ee8 | 445 | .SS "Working with 8\-bit MIME\-encoded headers" |
446 | .sp | |
b0322a85 | 447 | .if n \{\ |
d9898ee8 | 448 | .RS 4 |
b0322a85 | 449 | .\} |
d9898ee8 | 450 | .nf |
451 | int error=rfc2047_decode(const char *text, | |
452 | int (*callback_func)(const char *, int, const char *, void *), | |
453 | void *callback_arg); | |
454 | ||
455 | extern char *str=rfc2047_decode_simple(const char *text); | |
456 | ||
457 | extern char *str=rfc2047_decode_enhanced(const char *text, | |
458 | const char *charset); | |
459 | ||
460 | void rfc2047_print(const struct rfc822a *a, | |
461 | const char *charset, | |
462 | void (*print_func)(char, void *), | |
463 | void (*print_separator)(const char *, void *), void *); | |
464 | ||
465 | ||
466 | char *buffer=rfc2047_encode_str(const char *string, | |
467 | const char *charset); | |
468 | ||
469 | int error=rfc2047_encode_callback(const char *string, | |
470 | const char *charset, | |
471 | int (*func)(const char *, size_t, void *), | |
472 | void *callback_arg); | |
473 | ||
474 | char *buffer=rfc2047_encode_header(const struct rfc822a *a, | |
475 | const char *charset); | |
476 | .fi | |
b0322a85 | 477 | .if n \{\ |
d9898ee8 | 478 | .RE |
b0322a85 | 479 | .\} |
d9898ee8 | 480 | .PP |
b0322a85 | 481 | These functions provide additional logic to encode or decode 8\-bit content in 7\-bit RFC 822 headers, as specified in RFC 2047\&. |
d9898ee8 | 482 | .PP |
b0322a85 | 483 | \fBrfc2047_decode\fR() is a basic RFC 2047 decoding function\&. It receives a pointer to some 7\-bit RFC 2047\-encoded text, and a callback function\&. The callback function is repeatedly called\&. Each time it\*(Aqs called it receives a piece of decoded text\&. The arguments are: a pointer to a text fragment, the number of bytes in the text fragment, and a pointer to the character set of the text fragment\&. The character set pointer is NULL for portions of the original text that are not RFC 2047\-encoded\&. |
d9898ee8 | 484 | .PP |
485 | The callback function also receives | |
b0322a85 CE |
486 | \fIcallback_arg\fR, as its last argument\&. If the callback function returns a non\-zero value, |
487 | \fBrfc2047_decode\fR() terminates, returning that value\&. Otherwise, | |
488 | \fBrfc2047_decode\fR() returns 0 after a successful decoding\&. | |
489 | \fBrfc2047_decode\fR() returns \-1 if it was unable to allocate sufficient memory\&. | |
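The callback contract described above can be sketched as follows. This is a minimal illustration, not library code: struct decoded and collect_fragment() are hypothetical names, the 512\-byte buffer is an arbitrary limit, and the return value 1 is simply one possible non\-zero value used to stop decoding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical collector, passed as callback_arg. */
struct decoded {
    char text[512];
    size_t used;
    int encoded_fragments;   /* fragments that carried a character set */
};

/* Matches int (*callback_func)(const char *, int, const char *, void *). */
int collect_fragment(const char *frag, int len, const char *charset,
                     void *arg)
{
    struct decoded *d = arg;

    assert(d != NULL && len >= 0);
    if (d->used + (size_t)len >= sizeof(d->text))
        return 1;   /* non-zero: decoding stops, caller sees this value */
    if (len > 0)
        memcpy(d->text + d->used, frag, (size_t)len);
    d->used += (size_t)len;
    d->text[d->used] = '\0';
    if (charset != NULL)     /* NULL means the fragment was not encoded */
        d->encoded_fragments++;
    return 0;
}
```

With a zero-initialized struct decoded named d, the call would take the form rfc2047_decode(text, collect_fragment, &d), following the prototype in the synopsis above. The fragment text is appended as raw bytes; no character set conversion is attempted in this sketch.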
d9898ee8 | 490 | .PP |
d9898ee8 | 491 | \fBrfc2047_decode_simple\fR() and |
492 | \fBrfc2047_decode_enhanced\fR() are alternatives to | |
b0322a85 CE |
493 | \fBrfc2047_decode\fR() which forgo a callback function and return the decoded text in a dynamically\-allocated memory buffer\&. The buffer must be | |
494 | \fBfree\fR(3)\-ed after use\&. | |
495 | \fBrfc2047_decode_simple\fR() discards all character set specifications, and merely decodes any 8\-bit text\&. | |
496 | \fBrfc2047_decode_enhanced\fR() is a compromise that avoids discarding all character set information\&. The local character set being used is specified as the second argument to | |
497 | \fBrfc2047_decode_enhanced\fR()\&. Any RFC 2047\-encoded text in a different character set will be prefixed by the name of the character set, in brackets, in the resulting output\&. | |
d9898ee8 | 498 | .PP |
d9898ee8 | 499 | \fBrfc2047_decode_simple\fR() and |
b0322a85 | 500 | \fBrfc2047_decode_enhanced\fR() return a null pointer if they are unable to allocate sufficient memory\&. |
d9898ee8 | 501 | .PP |
502 | The | |
503 | \fBrfc2047_print\fR() function is equivalent to | |
504 | \fBrfc822_print\fR(), followed by | |
b0322a85 | 505 | \fBrfc2047_decode_enhanced\fR() on the result\&. The callback functions are used in an identical fashion, except that they receive text that\*(Aqs already decoded\&. |
d9898ee8 | 506 | .PP |
507 | The function | |
508 | \fBrfc2047_encode_str\fR() takes a | |
509 | \fIstring\fR | |
510 | and | |
511 | \fIcharset\fR | |
512 | being the name of the local character set, then encodes any 8\-bit portions of | |
513 | \fIstring\fR | |
b0322a85 | 514 | using RFC 2047 encoding\&. |
d9898ee8 | 515 | \fBrfc2047_encode_str\fR() returns a dynamically\-allocated buffer with the result, which must be |
b0322a85 | 516 | \fBfree\fR(3)\-ed after use, or NULL if there was insufficient memory to allocate the buffer\&. |
d9898ee8 | 517 | .PP |
518 | The function | |
519 | \fBrfc2047_encode_callback\fR() is similar to | |
b0322a85 CE |
520 | \fBrfc2047_encode_str\fR() except that the callback function is repeatedly called to receive the encoded string\&. Each invocation of the callback function receives a pointer to a portion of the encoded text, the number of characters in this portion, and | |
521 | \fIcallback_arg\fR\&. | |
d9898ee8 | 522 | .PP |
523 | The function | |
524 | \fBrfc2047_encode_header\fR() is basically equivalent to | |
525 | \fBrfc822_getaddrs\fR(), followed by | |
526 | \fBrfc2047_encode_str\fR()\&. | |
527 | .SS "Working with subjects" | |
528 | .sp | |
b0322a85 | 529 | .if n \{\ |
d9898ee8 | 530 | .RS 4 |
b0322a85 | 531 | .\} |
d9898ee8 | 532 | .nf |
533 | char *basesubj=rfc822_coresubj(const char *subj); | |
534 | ||
535 | char *basesubj=rfc822_coresubj_nouc(const char *subj); | |
536 | .fi | |
b0322a85 | 537 | .if n \{\ |
d9898ee8 | 538 | .RE |
b0322a85 | 539 | .\} |
d9898ee8 | 540 | .PP |
b0322a85 | 541 | This function takes the contents of the subject header, and returns the "core" subject header that\*(Aqs used in the specification of the IMAP THREAD function\&. This function is designed to strip all subject line artifacts that might\*(Aqve been added in the process of forwarding or replying to a message\&. Currently, |
d9898ee8 | 542 | \fBrfc822_coresubj\fR() performs the following transformations: |
543 | .PP | |
544 | Whitespace | |
545 | .RS 4 | |
b0322a85 | 546 | Leading and trailing whitespace is removed\&. Consecutive whitespace characters are collapsed into a single whitespace character\&. All whitespace characters are replaced by a space\&. |
d9898ee8 | 547 | .RE |
548 | .PP | |
549 | Re:, (fwd) [foo] | |
550 | .RS 4 | |
b0322a85 | 551 | These artifacts (and several others) are removed from the subject line\&. |
d9898ee8 | 552 | .RE |
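The whitespace transformation listed above can be illustrated with a short stand-alone sketch. This is not the library's implementation: collapse_whitespace() is a hypothetical helper that performs only the whitespace step (it does not strip "Re:" or similar artifacts), and it assumes the C locale's definition of whitespace via isspace(3).

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: trim leading and trailing whitespace and
 * collapse each internal run of whitespace into a single space.
 * Returns a malloc()ed string, or NULL if memory runs out. */
char *collapse_whitespace(const char *subj)
{
    char *out, *w;
    int pending_space = 0;

    assert(subj != NULL);
    out = malloc(strlen(subj) + 1);   /* result is never longer than input */
    if (out == NULL)
        return NULL;
    w = out;
    for (; *subj; subj++) {
        if (isspace((unsigned char)*subj)) {
            /* remember the run, but only once text has been written */
            pending_space = (w != out);
        } else {
            if (pending_space)
                *w++ = ' ';
            pending_space = 0;
            *w++ = *subj;
        }
    }
    *w = '\0';   /* a trailing run is dropped because it is never emitted */
    return out;
}
```

As with the library functions, the caller is responsible for releasing the returned buffer with free(3).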
553 | .PP | |
b0322a85 | 554 | Note that this function does NOT do MIME decoding\&. In order to implement IMAP THREAD, it is necessary to call something like |
d9898ee8 | 555 | \fBrfc2047_decode\fR() before calling |
b0322a85 | 556 | \fBrfc822_coresubj\fR()\&. |
d9898ee8 | 557 | .PP |
558 | This function returns a pointer to a dynamically\-allocated buffer, which must be | |
b0322a85 | 559 | \fBfree\fR(3)\-ed after use\&. |
d9898ee8 | 560 | .PP |
d9898ee8 | 561 | \fBrfc822_coresubj_nouc\fR() is like |
b0322a85 | 562 | \fBrfc822_coresubj\fR(), except that the subject is not converted to uppercase\&. |
d9898ee8 | 563 | .SH "SEE ALSO" |
564 | .PP | |
b0322a85 CE |
565 | \m[blue]\fB\fBrfc2045\fR(3)\fR\m[]\&\s-2\u[3]\d\s+2, |
566 | \m[blue]\fB\fBreformail\fR(1)\fR\m[]\&\s-2\u[4]\d\s+2, | |
567 | \m[blue]\fB\fBreformime\fR(1)\fR\m[]\&\s-2\u[5]\d\s+2\&. | |
568 | .SH "AUTHOR" | |
569 | .PP | |
570 | \fBSam Varshavchik\fR | |
571 | .RS 4 | |
572 | Author | |
573 | .RE | |
8d138742 | 574 | .SH "NOTES" |
d9898ee8 | 575 | .IP " 1." 4 |
576 | RFC 822 | |
577 | .RS 4 | |
8d138742 | 578 | \%http://www.rfc-editor.org/rfc/rfc822.txt |
d9898ee8 | 579 | .RE |
580 | .IP " 2." 4 | |
581 | RFC 2822 | |
582 | .RS 4 | |
8d138742 | 583 | \%http://www.rfc-editor.org/rfc/rfc2822.txt |
d9898ee8 | 584 | .RE |
585 | .IP " 3." 4 | |
586 | \fBrfc2045\fR(3) | |
587 | .RS 4 | |
b0322a85 | 588 | \%[set $man.base.url.for.relative.links]/rfc2045.html |
d9898ee8 | 589 | .RE |
590 | .IP " 4." 4 | |
591 | \fBreformail\fR(1) | |
592 | .RS 4 | |
b0322a85 | 593 | \%[set $man.base.url.for.relative.links]/reformail.html |
d9898ee8 | 594 | .RE |
595 | .IP " 5." 4 | |
596 | \fBreformime\fR(1) | |
597 | .RS 4 | |
b0322a85 | 598 | \%[set $man.base.url.for.relative.links]/reformime.html |
d9898ee8 | 599 | .RE |