Commit | Line | Data |
---|---|---|
e015f748 CE |
1 | The Webalizer - A web server log file analysis tool |
2 | Copyright 1997-2011 by Bradford L. Barrett | |
3 | ||
4 | Distributed under the GNU GPL. See the files "COPYING" and | |
5 | "Copyright" supplied with the distribution for additional info. | |
6 | ||
7 | ||
8 | What is The Webalizer? | |
9 | ---------------------- | |
10 | ||
11 | The Webalizer is a web server log file analysis program which produces | |
12 | usage statistics in HTML format for viewing with a browser. The results | |
13 | are presented in both columnar and graphical format, which facilitates | |
14 | interpretation. Yearly, monthly, daily and hourly usage statistics are | |
15 | presented, along with the ability to display usage by site, URL, referrer, | |
16 | user agent (browser), search string, entry/exit page, username and country | |
17 | (some information is only available if supported and present in the log | |
18 | files being processed). Processed data may also be exported into most | |
19 | database and spreadsheet programs that support tab delimited data formats. | |
20 | ||
21 | The Webalizer supports CLF (common log format) log files, as well as | |
22 | Combined log formats as defined by NCSA and others, and variations | |
23 | of these which it attempts to handle intelligently. In addition, The | |
24 | Webalizer supports wu-ftpd xferlog (FTP) formatted logs, squid proxy logs | |
25 | and W3C extended format logs. | |
26 | ||
27 | Gzip compressed logs may be used as input directly. Any log filename | |
28 | that ends with a '.gz' extension will be assumed to be in gzip format and | |
29 | uncompressed on the fly as it is being read. The Webalizer now also has | |
30 | the ability to handle BZip2 compressed logs, if enabled at compile time. | |
31 | Similar to gzipped logs, any log filename that ends with a '.bz2' will be | |
32 | assumed to be in bzip2 format and uncompressed on the fly as it is being | |
33 | read. | |
34 | ||
35 | For sites that do not enable hostname lookups (DNS resolution) on their | |
36 | web servers (and have only IP addresses in their logs), The Webalizer | |
37 | provides its own internal DNS lookup capability as well as geolocation | |
38 | services (GeoDB). The optional GeoIP library from MaxMind Inc. is also | |
39 | supported and may be used instead of the native GeoDB database. | |
40 | ||
41 | A utility program, "The Webalizer (DNS) Cache file Manager", or 'wcmgr' | |
42 | is also provided which allows the creation and manipulation of the DNS | |
43 | cache files used and produced by the webalizer. See the file DNS.README | |
44 | for additional information regarding DNS support. | |
45 | ||
46 | This documentation applies to The Webalizer Version 2.21 | |
47 | ||
48 | Running the Webalizer | |
49 | --------------------- | |
50 | ||
51 | The Webalizer was designed to be run from a Unix command line prompt or | |
52 | as a cron job. There are several command line options which will modify | |
53 | the results it produces, and configuration files can be used as well. | |
54 | The format of the command line is: | |
55 | ||
56 | webalizer [options ...] [log-file] | |
57 | ||
58 | Where 'options' can be one or more of the supported command line | |
59 | switches described below. 'log-file' is the name of the log file | |
60 | to process (see below for more detailed information). If a dash | |
61 | ("-") is specified for the log-file name, STDIN will be used. | |
62 | ||
63 | ||
64 | Once executed, the general flow of the program follows: | |
65 | ||
66 | o A default configuration file is scanned for. A file named | |
67 | 'webalizer.conf' is searched for in the current directory, and if | |
68 | found, its configuration data is parsed. If the file is not | |
69 | present in the current directory, the file '/etc/webalizer.conf' | |
70 | is searched for and, if found, is used instead. | |
71 | ||
72 | o Any command line arguments given to the program are parsed. This | |
73 | may include the specification of a configuration file, which is | |
74 | processed at the time it is encountered. | |
75 | ||
76 | o If a log file was specified, it is opened and made ready for | |
77 | processing. If no log file was given, or the filename '-' is | |
78 | specified on the command line, STDIN is used for input. | |
79 | ||
80 | o If an output directory was specified, the program does a 'chdir' to | |
81 | that directory in preparation for generating output. If no output | |
82 | directory was given, the current directory is used. | |
83 | ||
84 | o If a non-zero number of DNS Children processes were specified, they | |
85 | will be started, and the specified log file will be processed, | |
86 | either creating or updating the specified DNS cache file. | |
87 | ||
88 | o If no hostname was given, the program attempts to get the hostname | |
89 | using a uname system call. If that fails, 'localhost' is used. | |
90 | ||
91 | o A history file is searched for. This file keeps previous month | |
92 | totals used on the main index.html page. The default file is | |
93 | named 'webalizer.hist' and is kept in the specified output directory, | |
94 | but the name may be changed using the "HistoryName" configuration file | |
95 | keyword. | |
96 | ||
97 | o If incremental processing was specified, a data file is searched for | |
98 | and loaded if found, containing the 'internal state' data of the | |
99 | program at the end of a previous run. The default file is named | |
100 | 'webalizer.current' and is kept in the specified output directory, but | |
101 | the name may be changed using the "IncrementalName" configuration file keyword. | |
102 | ||
103 | o Main processing begins on the log file. If the log spans multiple | |
104 | months, a separate HTML document is created for each month. | |
105 | ||
106 | o After main processing, the main 'index.html' page is created, which | |
107 | has totals by month and links to each month's HTML document. | |
108 | ||
109 | o A new history file is saved to disk, which includes totals generated | |
110 | by The Webalizer during the current run. | |
111 | ||
112 | o If incremental processing was specified, a data file is written that | |
113 | contains the 'internal state' data at the end of this run. | |
114 | ||
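
The default configuration file search described in the first step above can be sketched as follows. This is only an illustration of the documented search order, not The Webalizer's actual code; the function name find_default_config and its etc_dir parameter are hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_default_config(cwd: Path, etc_dir: Path = Path("/etc")) -> Optional[Path]:
    """Return the first default configuration file found, or None.

    Mirrors the documented order: 'webalizer.conf' in the current
    directory first, then '/etc/webalizer.conf'. The etc_dir parameter
    is an assumption added so the lookup can be exercised in a test.
    """
    for candidate in (cwd / "webalizer.conf", etc_dir / "webalizer.conf"):
        if candidate.is_file():
            return candidate
    return None
```

A configuration file named on the command line is handled separately, at the time its option is encountered.
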
115 | ||
116 | Incremental Processing | |
117 | ---------------------- | |
118 | ||
119 | Version 1.2x of The Webalizer adds incremental run capability. Simply | |
120 | put, this allows processing large log files by breaking them up into | |
121 | smaller pieces, and processing these pieces instead. What this means | |
122 | in real terms is that you can now rotate your log files as often as you | |
123 | want, and still be able to produce monthly usage statistics without the | |
124 | loss of any detail. This is accomplished by saving and restoring all | |
125 | relevant internal data to a disk file between runs. Doing so allows the | |
126 | program to 'start where it left off' so to speak, and allows the | |
127 | preservation of detail from one run to the next. | |
128 | ||
129 | Some special precautions need to be taken when using the incremental | |
130 | run capability of The Webalizer. Configuration options should not be | |
131 | changed between runs, as that could cause corruption of the internal | |
132 | stored data. For example, changing the MangleAgents level will cause | |
133 | different representations of user agents to be stored, producing invalid | |
134 | results in the user agents section of the report. If you need to change | |
135 | configuration options, do it at the end of the month after normal | |
136 | processing of the previous month and before processing the current month. | |
137 | You may also want to delete the 'webalizer.current' file as well (or | |
138 | whatever name was specified using the "IncrementalName" configuration | |
139 | option). | |
140 | ||
141 | The Webalizer also attempts to prevent data duplication by keeping | |
142 | track of the timestamp of the last record processed. This timestamp | |
143 | is then compared to current records being processed, and any records | |
144 | that were logged previous to that timestamp are ignored. This, in | |
145 | theory, should allow you to re-process logs that have already been | |
146 | processed, or process logs that contain a mix of processed/not yet | |
147 | processed records, and not produce duplication of statistics. The | |
148 | only time this may break is if you have duplicate timestamps in two | |
149 | separate log files... any records in the second log file that do have | |
150 | the same timestamp as the last record in the previous log file processed, | |
151 | will be discarded as if they had already been processed. There are | |
152 | lots of ways to prevent this however, for example, stopping the web | |
153 | server before rotating logs will prevent this situation. This setup | |
154 | also necessitates that you always process logs in chronological order, | |
155 | otherwise data loss will occur as a result of the timestamp compare. | |
156 | ||
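
The timestamp comparison described above might look roughly like the sketch below. It assumes, based on the text, that a record is skipped when its timestamp is earlier than or equal to the last timestamp saved by the previous run. The names filter_new_records and last_seen are hypothetical and do not come from The Webalizer's source.

```python
from datetime import datetime
from typing import Iterable, List, Optional, Tuple

Record = Tuple[datetime, str]  # (timestamp, raw log line)


def filter_new_records(
    records: Iterable[Record], last_seen: Optional[datetime]
) -> Tuple[List[Record], Optional[datetime]]:
    """Drop records already covered by a previous run.

    last_seen is the timestamp of the last record processed in the
    previous run (None on a first run). Records at or before it are
    discarded, which is why a record in a new log that shares the
    timestamp of the previous log's final record is lost.
    """
    kept: List[Record] = []
    newest = last_seen
    for ts, line in records:
        if last_seen is not None and ts <= last_seen:
            continue  # treated as already processed
        kept.append((ts, line))
        if newest is None or ts > newest:
            newest = ts
    return kept, newest
```

Feeding an older log after a newer one makes every record compare as already processed, which is the data loss the text warns about.
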
157 | ||
158 | Output Produced | |
159 | --------------- | |
160 | ||
161 | The Webalizer produces several reports (html) and graphics for each | |
162 | month processed. In addition, a summary page is generated for the | |
163 | current and previous months (up to 12), a history file is created and, | |
164 | if incremental mode is used, the current month's processed data is saved. | |
165 | The exact location and names of these files can be changed using | |
166 | configuration files and command line options. The files produced | |
167 | (with their default names) are: | |
168 | ||
169 | index.html - Main summary page (extension may be changed) | |
170 | usage.png - Yearly graph displayed on the main index page | |
171 | usage_YYYYMM.html - Monthly summary page (extension may be changed) | |
172 | usage_YYYYMM.png - Monthly usage graph for specified month/year | |
173 | daily_usage_YYYYMM.png - Daily usage graph for specified month/year | |
174 | hourly_usage_YYYYMM.png - Hourly usage graph for specified month/year | |
175 | site_YYYYMM.html - All sites listing (if enabled) | |
176 | url_YYYYMM.html - All urls listing (if enabled) | |
177 | ref_YYYYMM.html - All referrers listing (if enabled) | |
178 | agent_YYYYMM.html - All user agents listing (if enabled) | |
179 | search_YYYYMM.html - All search strings listing (if enabled) | |
180 | webalizer.hist - Previous month history (may be changed) | |
181 | webalizer.current - Incremental Data (may be changed) | |
182 | site_YYYYMM.tab - tab delimited sites file | |
183 | url_YYYYMM.tab - tab delimited urls file | |
184 | ref_YYYYMM.tab - tab delimited referrers file | |
185 | agent_YYYYMM.tab - tab delimited user agents file | |
186 | user_YYYYMM.tab - tab delimited usernames file | |
187 | search_YYYYMM.tab - tab delimited search string file | |
188 | ||
189 | The yearly (index) report shows statistics for a 12 month period, and | |
190 | links to each month. The monthly report has detailed statistics for | |
191 | that month with additional links to any URLs and referrers found. | |
192 | The various totals shown are explained below. | |
193 | ||
194 | Hits | |
195 | ||
196 | Any request made to the server that is logged is considered a 'hit'. | |
197 | The requests can be for anything... html pages, graphic images, audio | |
198 | files, CGI scripts, etc... Each valid line in the server log is | |
199 | counted as a hit. This number represents the total number of requests | |
200 | that were made to the server during the specified report period. | |
201 | ||
202 | Files | |
203 | ||
204 | Some requests made to the server require that the server then send | |
205 | something back to the requesting client, such as an HTML page or graphic | |
206 | image. When this happens, it is considered a 'file' and the files | |
207 | total is incremented. The relationship between 'hits' and 'files' can | |
208 | be thought of as 'incoming requests' and 'outgoing responses'. | |
209 | ||
210 | Pages | |
211 | ||
212 | Pages are, well, pages! Generally, any HTML document, or anything | |
213 | that generates an HTML document, would be considered a page. This | |
214 | does not include the other stuff that goes into a document, such as | |
215 | graphic images, audio clips, etc... This number represents the number | |
216 | of 'pages' requested only, and does not include the other 'stuff' that | |
217 | is in the page. What actually constitutes a 'page' can vary from | |
218 | server to server. The default action is to treat anything with the | |
219 | extension '.htm', '.html' or '.cgi' as a page. A lot of sites will | |
220 | probably define other extensions, such as '.phtml', '.php3' and '.pl' | |
221 | as pages as well. Some people consider this number as the number of | |
222 | 'pure' hits... I'm not sure if I totally agree with that viewpoint. | |
223 | Some other programs (and people :) refer to this as 'Pageviews'. | |
224 | ||
225 | Sites | |
226 | ||
227 | Each request made to the server comes from a unique 'site', which can | |
228 | be referenced by a name or ultimately, an IP address. The 'sites' | |
229 | number shows how many unique IP addresses made requests to the server | |
230 | during the reporting time period. This DOES NOT mean the number of | |
231 | unique individual users (real people) that visited, which is impossible | |
232 | to determine using just logs and the HTTP protocol (however, this | |
233 | number might be about as close as you will get). | |
234 | ||
235 | Visits | |
236 | ||
237 | Whenever a request is made to the server from a given IP address | |
238 | (site), the amount of time since a previous request by the address | |
239 | is calculated (if any). If the time difference is greater than a | |
240 | pre-configured 'visit timeout' value (or the address has never made a request before), | |
241 | it is considered a 'new visit', and this total is incremented (both | |
242 | for the site, and the IP address). The default timeout value is 30 | |
243 | minutes (can be changed), so if a user visits your site at 1:00 in | |
244 | the afternoon, and then returns at 3:00, two visits would be registered. | |
245 | Note: in the 'Top Sites' table, the visits total should be discounted | |
246 | on 'Grouped' records, and thought of as the "Minimum number of visits" | |
247 | that came from that grouping instead. Note: Visits only occur on | |
248 | PageType requests, that is, for any request whose URL is one of the | |
249 | 'page' types defined with the PageType and PagePrefix option, and not | |
250 | excluded by the OmitPage option. Due to the limitation of the HTTP | |
251 | protocol, log rotations and other factors, this number should not be | |
252 | taken as absolutely accurate, rather, it should be considered a pretty | |
253 | close "guess". | |
254 | ||
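
As a rough illustration of the visit rule described above, the sketch below counts visits from a stream of page requests. It assumes the input has already been reduced to PageType requests and that timestamps are plain seconds. The function name count_visits is hypothetical.

```python
from typing import Dict, Iterable, Tuple

VISIT_TIMEOUT = 1800  # default visit timeout: 30 minutes, in seconds


def count_visits(
    page_requests: Iterable[Tuple[str, int]], timeout: int = VISIT_TIMEOUT
) -> int:
    """Count visits from (site, timestamp_in_seconds) page requests.

    A request starts a new visit when the site has not been seen
    before, or when more than `timeout` seconds have passed since
    that site's previous request.
    """
    last_request: Dict[str, int] = {}
    visits = 0
    for site, ts in page_requests:
        previous = last_request.get(site)
        if previous is None or ts - previous > timeout:
            visits += 1
        last_request[site] = ts
    return visits
```

With the default timeout, a site that requests a page at 1:00 PM (46800 seconds into the day) and again at 3:00 PM (54000) registers two visits, matching the example in the text.
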
255 | KBytes | |
256 | ||
257 | The KBytes (kilobytes) value shows the amount of data, in KB, that | |
258 | was sent out by the server during the specified reporting period. This | |
259 | value is generated directly from the log file, so it is up to the | |
260 | web server to produce accurate numbers in the logs (some web servers | |
261 | do stupid things when it comes to reporting the number of bytes). In | |
262 | general, this should be a fairly accurate representation of the amount | |
263 | of outgoing traffic the server had, regardless of the web server's | |
264 | reporting quirks. | |
265 | ||
266 | Note: A kilobyte is 1024 bytes, not 1000 :) | |
267 | ||
268 | Top Entry and Exit Pages | |
269 | ||
270 | The Top Entry and Exit tables give a rough estimate of what URLs | |
271 | are used to enter your site, and what the last pages viewed are. | |
272 | Because of limitations in the HTTP protocol, log rotations, etc... | |
273 | these numbers should be considered a good "rough guess" of the actual | |
274 | numbers, but they will give a good indication of the overall trend in | |
275 | where users come into, and exit, your site. | |
276 | ||
277 | ||
278 | Command Line Options | |
279 | -------------------- | |
280 | ||
281 | The Webalizer supports many different configuration options that will | |
282 | alter the way the program behaves and generates output. Most of these | |
283 | can be specified on the command line, while some can only be specified | |
284 | in a configuration file. The command line options are listed below, | |
285 | with references to the corresponding configuration file keywords. | |
286 | ||
287 | -------------------------------------------------------------------------- | |
288 | ||
289 | General Options | |
290 | --------------- | |
291 | ||
292 | -h Display all available command line options and exit program. | |
293 | ||
294 | -v Be Verbose. This will cause the program to print additional | |
295 | information at run time. It is the same as specifying | |
296 | "Quiet no", "ReallyQuiet no" and "Debug yes" config options. | |
297 | ||
298 | -V Display the program version and exit. Additional program | |
299 | specific information will be displayed if 'verbose' mode is | |
300 | also used (e.g. '-vV'), which can be useful when submitting | |
301 | bug reports. | |
302 | ||
303 | -d Display additional 'debugging' information for errors and | |
304 | warnings produced during processing. This normally would | |
305 | not be used except to determine why you are getting all those | |
306 | errors and want to see the actual data. Normally The | |
307 | Webalizer will just tell you it found an error, not the | |
308 | actual data. This option will display the data as well. | |
309 | Config file keyword: Debug | |
310 | ||
311 | -F Specify the log file type to process. Normally, the | |
312 | Webalizer expects to find a valid CLF or Combined format | |
313 | web server log file. This option allows you to process | |
314 | wu-ftpd xferlogs, squid and W3C formatted web logs as well. | |
315 | Values can be either 'clf', 'ftp', 'squid' or 'w3c' with | |
316 | 'clf' being the default. Only the first character needs | |
317 | to be specified (eg: -Fs will process a squid log). | |
318 | Config file keyword: LogType | |
319 | ||
320 | -f Fold out of sequence log records back into analysis, by | |
321 | treating them as if they were the same date/time as the | |
322 | last good record. Normally, out of sequence log records | |
323 | are ignored. If you run apache, don't worry about this. | |
324 | Config file keyword: FoldSeqErr | |
325 | ||
326 | -i Ignore history file. USE WITH CAUTION. This causes The | |
327 | Webalizer to ignore any existing history file produced from | |
328 | previous runs and generate its output from scratch. The | |
329 | effect will be as if The Webalizer is being run for the | |
330 | first time and any previous statistics will be lost (although | |
331 | the HTML documents, if any, will not be deleted) on the main | |
332 | index.html (yearly) web page. | |
333 | Config file keyword: IgnoreHist | |
334 | ||
335 | -b Ignore incremental data file. USE WITH CAUTION. This causes | |
336 | The Webalizer to ignore any existing incremental (state) data | |
337 | file produced by previous runs. By ignoring the incremental | |
338 | data file, all previous processing for the current month will | |
339 | be lost, and those logs must be re-processed. | |
340 | Config file keyword: IgnoreState | |
341 | ||
342 | -p Preserve state (incremental processing). This allows the | |
343 | processing of partial logs in increments. At the end of | |
344 | the program, all relevant internal data is saved, so that | |
345 | it may be restored the next time the program is run. This | |
346 | allows sites that must rotate their logs more than once a | |
347 | month to still be able to use The Webalizer, and not worry | |
348 | about having to gather and feed an entire month's logs to | |
349 | the program at the end of the month. See the section on | |
350 | "Incremental Processing" above for additional information. | |
351 | The default is to not perform incremental processing. Use | |
352 | this command line option to enable the feature. | |
353 | Config file keyword: Incremental | |
354 | ||
355 | -q Quiet mode. Normally, The Webalizer will produce various | |
356 | messages while it runs letting you know what it's doing. | |
357 | This option will suppress those messages. It should be | |
358 | noted that this WILL NOT suppress errors and warnings, which | |
359 | are output to STDERR. | |
360 | Config file keyword: Quiet | |
361 | ||
362 | -Q ReallyQuiet mode. This allows suppression of _all_ messages | |
363 | generated by The Webalizer, including warnings and errors. | |
364 | Useful when The Webalizer is run as a cron job. | |
365 | Config file keyword: ReallyQuiet | |
366 | ||
367 | -T Display timing information. The Webalizer keeps track of the | |
368 | time it begins and ends processing, and normally displays the | |
369 | total processing time at the end of each run. If quiet mode | |
370 | (-q or 'Quiet yes' in configuration file) is specified, this | |
371 | information is not displayed. This option forces the display | |
372 | of timing totals if quiet mode has been specified, otherwise | |
373 | it is redundant and will have no effect. | |
374 | Config file keyword: TimeMe | |
375 | ||
376 | -c file This option specifies a configuration file to use. Configuration | |
377 | files allow greater control over how The Webalizer behaves, and | |
378 | there are several ways to use them. As of version 0.98, The | |
379 | Webalizer searches for a default configuration file in the | |
380 | current directory named "webalizer.conf", and if not found, | |
381 | will search in the /etc/ directory for a file of the same name. | |
382 | In addition, you may specify a configuration file to use with | |
383 | this command line option. | |
384 | ||
385 | -n name This option specifies the hostname for the reports generated. | |
386 | The hostname is used in the title of all reports, and is also | |
387 | prepended to URLs in the reports. This allows The Webalizer | |
388 | to be run on log files for 'virtual' web servers or web servers | |
389 | that are different than the machine the reports are located on, | |
390 | and still allows clicking on the URLs to go to the proper | |
391 | location. If a hostname is not specified, either on the | |
392 | command line or in a configuration file, The Webalizer attempts | |
393 | to determine the hostname using a 'uname' system call. If this | |
394 | fails, "localhost" will be used as the hostname. | |
395 | Config file keyword: HostName | |
396 | ||
397 | -o dir This option specifies the output directory for the reports. | |
398 | If not specified here or in a configuration file, the current | |
399 | default directory will be used for output. | |
400 | Config file keyword: OutputDir | |
401 | ||
402 | -x name This option allows the generated pages to have an extension | |
403 | other than '.html', which is the default. Do not include the | |
404 | leading period ('.') when you specify the extension. | |
405 | Config file keyword: HTMLExtension | |
406 | ||
407 | -P name Specify the file extensions for 'pages'. Pages (sometimes | |
408 | called 'PageViews') are normally html documents and CGI | |
409 | scripts that display the whole page, not just parts of it. | |
410 | Some systems will need to define a few more, such as 'phtml', | |
411 | 'php3' or 'pl' in order to have them counted as well. The | |
412 | default is 'htm*' and 'cgi' for web logs and 'txt' for ftp. | |
413 | Config file keyword: PageType | |
414 | ||
415 | -O name Specify URLs which are not counted as 'pages'. Requests | |
416 | matching one of these URLs will not be counted as a page, even | |
417 | if they have an extension matching one of the PageTypes defined | |
418 | above or have no extension at all. | |
419 | Config file keyword: OmitPage | |
420 | ||
421 | -t name This option specifies the title string for all reports. This | |
422 | string is used, in conjunction with the hostname (if not blank) | |
423 | to produce the actual title. If not specified, the default of | |
424 | "Usage Statistics for" will be used. | |
425 | Config file keyword: ReportTitle | |
426 | ||
427 | -Y Suppress Country graph. Normally, The Webalizer produces | |
428 | country statistics in both Graph and Columnar forms. This | |
429 | option will suppress the Country Graph from being generated. | |
430 | Config file keyword: CountryGraph | |
431 | ||
432 | -G Suppress hourly graph. Normally, The Webalizer produces | |
433 | hourly statistics in both Graph and Columnar forms. This | |
434 | option will suppress the Hourly Graph only from being generated. | |
435 | Config file keyword: HourlyGraph | |
436 | ||
437 | -H Suppress Hourly statistics. Normally, The Webalizer produces | |
438 | hourly statistics in both Graph and Columnar forms. This | |
439 | option will suppress the Hourly Statistics table only from | |
440 | being generated. | |
441 | Config file keyword: HourlyStats | |
442 | ||
443 | -K num Specify how many months should be displayed in the main index | |
444 | (yearly summary) table. Default is 12 months. Can be set to | |
445 | anything between 12 and 120 months (1 to 10 years). | |
446 | Config file keyword: IndexMonths | |
447 | ||
448 | -k num Specify how many months should be displayed in the main index | |
449 | (yearly summary) graph. Default is 12 months. Can be set to | |
450 | anything between 12 and 72 months (1 to 6 years). | |
451 | Config file keyword: GraphMonths | |
452 | ||
453 | -L Disable Graph Legends. The color coded legends displayed on | |
454 | the in-line graphs can be disabled with this option. The | |
455 | default is to display the legends. | |
456 | Config file keyword: GraphLegend | |
457 | ||
458 | -l num Graph Lines. Specify the number of background reference | |
459 | lines displayed on the in-line graphics produced. The default | |
460 | is 2 lines, but can range anywhere from zero ('0') for | |
461 | no lines, up to 20 lines (looks funny!). | |
462 | Config file keyword: GraphLines | |
463 | ||
464 | -P name Page type. This is the extension of files you consider to | |
465 | be pages for Pages calculations (sometimes called 'pageviews'). | |
466 | The default is 'htm*' and 'cgi' (plus whatever HTMLExtension | |
467 | you specified if it is different). Don't use a period! | |
468 | ||
469 | -m num Specify a 'visit timeout'. Visits are calculated by looking at | |
470 | the time difference between the current and last request made | |
471 | by a specific host. If the difference is greater than the | |
472 | visit timeout value, the request is considered a new visit. | |
473 | This value is specified in number of seconds. The default | |
474 | is 30 minutes (1800). | |
475 | Config file keyword: VisitTimeout | |
476 | ||
477 | -M num Mangle user agent names. Normally, The Webalizer will keep | |
478 | track of the user agent field verbatim. Unfortunately, there are | |
479 | a ton of different names that user agents go by, and the field | |
480 | also reports other items such as machine type and OS used. For | |
481 | example, Netscape 4.03 running on Windows 95 will report a | |
482 | different string than Netscape 4.03 running on Windows NT, so even | |
483 | though they are the same browser type, they will be considered | |
484 | as two totally different browsers by The Webalizer. For that | |
485 | matter, Netscape 4.0 running on Windows NT will report different | |
486 | names if one is run on an Alpha and the other on an Intel | |
487 | processor! Internet Exploder is even worse, as it reports itself | |
488 | as if it were Netscape and you have to search the given string a | |
489 | little deeper to discover that it is really MSIE! In order to | |
490 | consolidate generic browser types, this option will cause The | |
491 | Webalizer to 'mangle' the user agent field, attempting to | |
492 | consolidate generic browser types. There are 6 levels that can be | |
493 | specified, each producing different levels of detail. Level 5 | |
494 | displays only the browser name (MSIE or Mozilla) and the major | |
495 | version number. Level 4 will also display the minor version | |
496 | number (single decimal place). Level 3 will display the minor | |
497 | version number to two decimal places. Level 2 will add any | |
498 | sub-level designation (such as Mozilla/3.01Gold or MSIE 3.0b). | |
499 | Level 1 will also attempt to add the system type. The default | |
500 | Level 0 will disable name mangling and leave the user agent | |
501 | field unmodified, producing the greatest amount of detail. | |
502 | Configuration file keyword: MangleAgents | |
503 | ||
504 | -g num This option allows you to specify the level of domain name | |
505 | grouping to be performed. The numeric value represents the | |
506 | level of grouping, and can be thought of as the 'number of | |
507 | dots' to be displayed. The default value of 0 disables any | |
508 | domain name grouping. | |
509 | Configuration file keyword: GroupDomains | |
510 | ||
511 | -D name This allows the specification of a DNS Cache file name. This | |
512 | filename MUST be specified if you have dns lookups enabled | |
513 | (using the -N command line switch or DNSChildren configuration | |
514 | keyword). The filename is relative to the default output | |
515 | directory if an absolute path is not specified (ie: starts | |
516 | with a leading '/'). This option is only available if DNS | |
517 | support was enabled at compile time, otherwise an 'Invalid | |
518 | Keyword' error will be generated. See the DNS.README file | |
519 | for additional information regarding DNS lookups. | |
520 | Configuration file keyword: DNSCache | |
521 | ||
522 | -N num Number of DNS child processes to use for reverse DNS lookups. | |
523 | If specified, a DNSCache name MUST be specified also. If you | |
524 | do not wish a DNS cache file to be generated, specify a value | |
525 | of zero ('0') to disable it. This does not prevent using an | |
526 | existing cache file, only the generation of one at run time. | |
527 | See the DNS.README file for additional information. | |
528 | Configuration file keyword: DNSChildren | |
529 | ||
530 | -j Enable native GeoDB geolocation services. | |
531 | Configuration file keyword: GeoDB | |
532 | ||
533 | -J name Specify an alternate GeoDB database filename to use. This | |
534 | shouldn't normally be needed. If used, the filename 'name' | |
535 | is relative to the output directory being used unless an | |
536 | absolute path is specified (ie: starts with a leading '/'). | |
537 | Configuration file keyword: GeoDBDatabase | |
538 | ||
539 | -w Enable GeoIP support if it is available. | |
540 | Configuration file keyword: GeoIP | |
541 | ||
542 | -W name Specify an alternate GeoIP database filename to use. This | |
543 | shouldn't normally be needed. If used, the filename 'name' | |
544 | is relative to the specified output directory unless an | |
545 | absolute name is given (ie: starts with a leading '/'). | |
546 | Configuration file keyword: GeoIPDatabase | |
547 | ||
548 | -z name Specify location of the country flag graphics and enable | |
549 | their display in the top country table. The directory name | |
550 | is relative to the output directory unless an absolute path | |
551 | is specified (ie: starts with a leading '/'). | |
552 | Configuration file keyword: FlagDir | |
553 | ||
554 | Hide Options | |
555 | ------------ | |
556 | ||
557 | The following options take a string argument to use as a comparison | |
558 | for matching. Except for the IndexAlias option, the string argument | |
559 | can be plain text, or plain text that either starts or ends with the | |
560 | wildcard character '*'. | |
561 | ||
562 | For Example: | |
563 | ||
564 | Given the string "yourmama/was/here", the arguments "was", "*here" and | |
565 | "your*" will all produce a match. | |
566 | ||
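The matching rule described above can be sketched in a few lines of
Python. This is only an illustration of the rule as this README states
it, not The Webalizer's own code; the function name hide_match is
hypothetical.

```python
def hide_match(pattern: str, value: str) -> bool:
    """Illustrative matcher for the rule described in this README."""
    if pattern.startswith("*"):
        # Leading wildcard: the remaining text must match the end of the value
        return value.endswith(pattern[1:])
    if pattern.endswith("*"):
        # Trailing wildcard: the remaining text must match the start of the value
        return value.startswith(pattern[:-1])
    # Plain text: the string may appear anywhere in the value
    return pattern in value


target = "yourmama/was/here"
for pat in ("was", "*here", "your*", "*was", "here*"):
    print(pat, hide_match(pat, target))
```

Under this reading, the three arguments from the example above match,
while "*was" and "here*" do not, since the string neither ends with
"was" nor starts with "here".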
567 | ||
568 | -a name This option allows hiding of user agents (browsers) from the | |
569 | "Top User Agents" table in the report. This option really | |
570 | isn't too useful as there are a zillion different names that | |
571 | current browsers go by, depending where they were obtained, | |
572 | however you might have some particular user agents that hit | |
573 | your site a lot that you would like to exclude from the list. | |
574 | You must have a web server that includes user agents in its | |
575 | log files for this option to be of any use. In addition, it | |
576 | is also useless if you disable the user agent table in the | |
577 | report (see the -A command line option or "TopAgents" | |
578 | configuration file keyword). You can specify as many of these | |
579 | as you want on the command line. The wildcard character '*' | |
580 | can be used either in front of or at the end of the string. | |
581 | (ie: Mozilla/4.0* would match anything that starts with the | |
582 | string "Mozilla/4.0"). | |
583 | Config file keyword: HideAgent | |
584 | ||
585 | -r name This option allows hiding of referrers from the "Top Referrer" | |
586 | table in the report. Referrers are URLs, either on your own | |
587 | local site or a remote site, that referred the user to a URL | |
588 | on your web server. This option is normally used to hide | |
589 | your own server from the table, as your own pages are usually | |
590 | the top referrers to your own pages (well, you get the idea). | |
591 | You must have a web server that includes referrer information | |
592 | in the log files for this option to be of any use. In addition, | |
593 | it is also useless if you disable the referrers table in the | |
594 | report (see the -R command line option or "TopReferrers" | |
595 | configuration file keyword). You can specify as many of these | |
596 | as you like on the command line. | |
597 | Config file keyword: HideReferrer | |
598 | ||
599 | -s name This option allows hiding of sites from the "Top Sites" table | |
600 | in the report. Normally, you will only want to hide your own | |
601 | domain name from the report, as it usually is one of the top | |
602 | sites to visit your web server. This option is of no use if | |
603 | you disable the top sites table in the report (see the -S | |
604 | command line option or "TopSites" configuration file option). | |
605 | Config file keyword: HideSite | |
606 | ||
607 | -X This causes all individual sites to be hidden, which results | |
608 | in only grouped sites being displayed in the report. | |
609 | Config file keyword: HideAllSites | |
610 | ||
611 | -u name This option allows hiding of URLs from the "Top URLs" table | |
612 | in the report. Normally, this option is used to hide images, | |
613 | audio files and other objects your web server dishes out that | |
614 | would otherwise clutter up the table. This option is of no | |
615 | use if you disable the top URLs table in the report (see the | |
616 | -U command line option or "TopURLs" configuration file keyword). | |
617 | Config file keyword: HideURL | |
618 | ||
619 | -I name This option allows you to specify additional index.html aliases. | |
620 | The Webalizer usually strips the string 'index.*' from URLs | |
621 | before processing (unless disabled using the 'DefaultIndex' | |
622 | config option), which has the effect of turning a URL such | |
623 | as /somedir/index.html into just /somedir/ which is really the | |
624 | same URL and should be treated as such. This option allows you | |
625 | to specify _additional_ strings that are to be treated the same | |
626 | way. Use with care, improper use could cause unexpected results. | |
627 | For example, if you specify the alias string of 'home', a URL | |
628 | such as /somedir/homepages/brad/home.html would be converted | |
629 | into just /somedir/ which probably isn't what was intended. | |
630 | This option is useful if your web server uses a default | |
631 | index page other than the standard 'index.html' or 'index.htm', | |
632 | such as 'home.html' or 'homepage.html'. The string specified | |
633 | is searched for _anywhere_ in the URL, so "home.htm" would | |
634 | turn both "/somedir/home.htm" and "/somedir/home.html" into | |
635 | just "/somedir/". Wildcards are _not_ allowed on this one. | |
636 | Config file keyword: IndexAlias | |
637 | ||
638 | Table Size Options | |
639 | ------------------ | |
640 | ||
641 | -e num This option specifies the number of entries to display in the | |
642 | "Top Entry Pages" table. To disable the table, use a value of | |
643 | zero (0). | |
644 | Config file keyword: TopEntry | |
645 | ||
646 | -E num This option specifies the number of entries to display in the | |
647 | "Top Exit Pages" table. To disable the table, use a value of | |
648 | zero (0). | |
649 | Config file keyword: TopExit | |
650 | ||
651 | -A num This option specifies the number of entries to display in the | |
652 | "Top User Agents" table. To disable the table, use a value of | |
653 | zero (0). | |
654 | Config file keyword: TopAgents | |
655 | ||
656 | -C num This option specifies the number of entries to display in the | |
657 | "Top Countries" table. To disable the table, use a value of | |
658 | zero (0). | |
659 | Config file keyword: TopCountries | |
660 | ||
661 | -R num This option specifies the number of entries to display in the | |
662 | "Top Referrers" table. To disable the table, use a value of | |
663 | zero (0). | |
664 | Config file keyword: TopReferrers | |
665 | ||
666 | -S num This option specifies the number of entries to display in the | |
667 | "Top Sites" table. To disable the table, use a value of | |
668 | zero (0). | |
669 | Config file keyword: TopSites | |
670 | ||
671 | -U num This option specifies the number of entries to display in the | |
672 | "Top URLs" table. To disable the table, use a value of | |
673 | zero (0). | |
674 | Config file keyword: TopURLs | |
675 | ||
676 | -------------------------------------------------------------------------- | |
677 | ||
678 | ||
679 | CONFIGURATION FILES | |
680 | ------------------- | |
681 | ||
682 | The Webalizer allows configuration files to be used in order to simplify | |
683 | life for all. There are several ways that configuration files are accessed | |
684 | by the Webalizer. When The Webalizer first executes, it looks for a | |
685 | default configuration file named "webalizer.conf" in the current directory, | |
686 | and if not found there, will look for "/etc/webalizer.conf". In addition, | |
687 | configuration files may be specified on the command line with the '-c' | |
688 | option. There are lots of different ways you can combine the use of | |
689 | configuration files and command line options to produce various results. | |
690 | The Webalizer always looks for and reads configuration options from a | |
691 | default configuration file before doing anything else. Because of this, | |
692 | you can override options found in the default file by use of additional | |
693 | configuration files specified on the command line or command line options | |
694 | themselves. If you specify a configuration file on the command line, you | |
695 | can override options in it by additional command line options which follow. | |
696 | For example, most users will likely want to create the default file | |
697 | /etc/webalizer.conf and place options in it to specify the hostname, log | |
698 | file, table options, etc... At the end of the month when a different log | |
699 | file is to be used (the end of month log), you can run The Webalizer as | |
700 | usual, but put the different filename on the end of the command line, which | |
701 | will override the log file specified in the configuration file. It should | |
702 | be noted that you cannot override some configuration file options by the | |
703 | use of command line arguments. For example, if you specify "Quiet yes" in | |
704 | a configuration file, you cannot override this with a command line argument, | |
705 | as the command line option only _enables_ the feature (-q option). | |
706 | ||
707 | The configuration files are standard ASCII text files that may be created | |
708 | or edited using any standard editor. Blank lines and lines that begin | |
709 | with a pound sign ('#') are ignored. Any other lines are considered to | |
710 | be configuration lines, and have the form "Keyword Value", where the | |
711 | 'Keyword' is one of the currently available configuration keywords defined | |
712 | below, and 'Value' is the value to assign to that particular option. Any | |
713 | text found after the keyword up to the end of the line is considered the | |
714 | keyword's value, so you should not include anything after the actual value | |
715 | on the line that is not actually part of the value being assigned. The | |
716 | file "sample.conf" provided with the distribution contains lots of useful | |
717 | documentation and examples as well. It should be noted that you do not | |
718 | have to use any configuration files at all, in which case, default values | |
719 | will be used (which should be sufficient for most sites). | |
720 | ||
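The file format described above is simple enough to sketch as a small
Python reader. This is an illustration of the stated rules only, not the
parser The Webalizer uses. The function name, the sample values and the
choice to ignore leading whitespace are assumptions.

```python
def parse_config(text: str) -> list[tuple[str, str]]:
    """Return (keyword, value) pairs in file order (keywords may repeat)."""
    entries = []
    for line in text.splitlines():
        stripped = line.strip()  # assumption: leading whitespace is ignored
        if not stripped or stripped.startswith("#"):
            continue  # blank lines and '#' comment lines are skipped
        # Everything after the keyword, up to end of line, is the value
        parts = stripped.split(None, 1)
        keyword = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries


# Sample values below are illustrative only
SAMPLE = """\
# Lines beginning with '#' are ignored, as are blank lines

LogFile /var/log/httpd/access_log
ReportTitle Usage Statistics for
HideURL *.gif
HideURL *.jpg
Quiet yes # this trailing text becomes part of the value
"""

for keyword, value in parse_config(SAMPLE):
    print(f"{keyword!r}: {value!r}")
```

The last sample line shows why nothing should follow the actual value:
the text after "yes" is not treated as a comment, so it ends up in the
value assigned to the keyword.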
721 | -------------------------------------------------------------------------- | |
722 | ||
723 | General Configuration Keywords | |
724 | ------------------------------ | |
725 | ||
726 | LogFile This defines the log file to use. It should be a fully qualified | |
727 | name (ie: contain the path), but relative names will work as | |
728 | well. If not specified, the logfile defaults to STDIN. | |
729 | ||
730 | LogType This specifies the log file type being used. Normally, The | |
731 | Webalizer processes web logs in either CLF or Combined format. | |
732 | You may also process wu-ftpd xferlog formatted logs, squid | |
733 | proxy logs or W3C formatted web logs by setting the appropriate | |
734 | type using this keyword. Values may be either 'clf', 'ftp', | |
735 | 'squid' or 'w3c'. Ensure that you specify the proper file type, | |
736 | otherwise you will be presented with a long stream of 'invalid | |
737 | record' messages when the Webalizer is run ;) | |
738 | Command line argument: -F | |
739 | ||
740 | OutputDir This defines the output directory to use for the reports. If | |
741 | it is not specified, the current directory is used. | |
742 | Command line argument: -o | |
743 | ||
744 | HistoryName Allows specification of a history path/filename if desired. | |
745 | The default is to use the file named 'webalizer.hist', kept | |
746 | in the normal output directory (OutputDir above). Any name | |
747 | specified is relative to the normal output directory unless | |
748 | an absolute path name is given (ie: starts with a '/'). | |
749 | ||
750 | ReportTitle This specifies the title to use for the generated reports. | |
751 | It is used in conjunction with the hostname (unless blank) | |
752 | to produce the final report titles. If not defined, the | |
753 | default of "Usage Statistics for" is used. | |
754 | Command line argument: -t | |
755 | ||
756 | HostName This defines the hostname. The hostname is used in the | |
757 | report title as well as being prepended to URLs in the | |
758 | "Top URLs" table. This allows The Webalizer to be run | |
759 | on "virtual" web servers, or servers that do not reside | |
760 | on the local machine, and allows clicking on the URL to | |
761 | go to the right place. If not specified, The Webalizer | |
762 | attempts to get the hostname via a 'uname' system call, | |
763 | and if that fails, will default to "localhost". | |
764 | Command line argument: -n | |
765 | ||
766 | UseHTTPS Causes the links in the 'Top URLs' table to use 'https://' | |
767 | instead of the default 'http://' prefix. Not much use if | |
768 | you run a mix of secure/insecure servers on your machine. | |
769 | Only useful if you run the analysis on a secure server's | |
770 | logs, and want the links in the table to work properly. | |
771 | ||
772 | HTAccess Enables the creation of a default .htaccess file in the | |
773 | output directory. If enabled, the file will be created | |
774 | (with a single "DirectoryIndex" directive), unless one | |
775 | already exists. The default is 'no', which disables the | |
776 | creation of any .htaccess files. | |
777 | ||
778 | Quiet This allows you to enable or disable informational messages | |
779 | while it is running. The values for this keyword can be | |
780 | either 'yes' or 'no'. Using "Quiet yes" will suppress these | |
781 | messages, while "Quiet no" will enable them. The default | |
782 | is 'no' if not specified, which will allow The Webalizer | |
783 | to display informational messages. It should be noted that | |
784 | this option has no effect on Warning or Error messages that | |
785 | may be generated, as they go to STDERR. | |
786 | Command line argument: -q | |
787 | ||
788 | ReallyQuiet This allows all generated output to be suppressed, including | |
789 | warning and error messages. The values for this keyword | |
790 | can be either 'yes' or 'no', with 'no' being the default. | |
791 | Command line argument: -Q | |
792 | ||
793 | TimeMe This allows you to display timing information regardless of | |
794 | any "quiet mode" specified. Useful only if you did in fact | |
795 | tell The Webalizer to be quiet either by using the -q command | |
796 | line option or the "Quiet" keyword, otherwise timing stats | |
797 | are normally displayed anyway. Values may be either 'yes' | |
798 | or 'no', with the default being 'no'. | |
799 | Command line argument: -T | |
800 | ||
801 | GMTTime This keyword allows timestamps to be displayed in GMT (UTC) | |
802 | time instead of local time. Normally The Webalizer will | |
803 | display timestamps in the time-zone of the local machine | |
804 | (ie: PST or EDT). This keyword allows you to specify the | |
805 | display of timestamps in GMT (UTC) time instead. Values | |
806 | may be either 'yes' or 'no'. Default is 'no'. | |
807 | ||
808 | Debug This tells The Webalizer to display additional information | |
809 | when it encounters Warnings or Errors. Normally, The | |
810 | Webalizer will just tell you it found a bad record or | |
811 | field. This option will enable the display of the actual | |
812 | data that produced the Warning or Error as well. Useful | |
813 | only if you start getting lots of Warnings or Errors and | |
814 | want to determine the cause. Values may be either 'yes' | |
815 | or 'no', with the default being 'no'. | |
816 | Command line argument: -d | |
817 | ||
818 | IgnoreHist This suppresses the reading of a history file. USE WITH | |
819 | EXTREME CAUTION as the history file is how The Webalizer | |
820 | keeps track of previous months. The effect of this option | |
821 | is as if The Webalizer was being run for the very first | |
822 | time, and any previous data is discarded. Values may be | |
823 | either 'yes' or 'no', with the default being 'no'. | |
824 | Command line argument: -i | |
825 | ||
826 | IgnoreState This suppresses the reading of an existing incremental | |
827 | data file. USE WITH EXTREME CAUTION! By ignoring an | |
828 | existing incremental data file, all previous processing | |
829 | for the current month will be lost, and those logs must | |
830 | be re-processed. Values may be 'yes' or 'no', with the | |
831 | default being 'no'. | |
832 | Command line argument: -b | |
833 | ||
834 | FoldSeqErr Allows log records that are out of sequence to be folded | |
835 | back into the analysis, by treating them as if they had | |
836 | the same date/time as the last good record. Normally, | |
837 | out of sequence log records are simply ignored. If you | |
838 | run apache, don't worry about this. | |
839 | ||
840 | VisitTimeout Set the 'visit timeout' value. Visits are determined by | |
841 | looking at the time difference between the current and last | |
842 | request made by a specific site. If the difference in time | |
843 | is greater than the visit timeout value, the request is | |
844 | considered a new visit. The value is in number of seconds, | |
845 | and defaults to 30 minutes (1800). | |
846 | Command line argument: -m | |
847 | ||
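The visit rule described for VisitTimeout can be sketched as follows.
This is a simplified illustration rather than The Webalizer's code. The
function name and sample data are hypothetical, and it assumes that the
first request seen from a site starts that site's first visit.

```python
def count_visits(requests, timeout=1800):
    """Count visits per site from (site, unix_timestamp) pairs in log order."""
    last_seen = {}  # site -> timestamp of its most recent request
    visits = {}     # site -> number of visits counted so far
    for site, ts in requests:
        prev = last_seen.get(site)
        # A new visit starts on the first request from a site, or when the
        # gap since that site's last request exceeds the timeout value
        if prev is None or ts - prev > timeout:
            visits[site] = visits.get(site, 0) + 1
        last_seen[site] = ts
    return visits


sample = [
    ("10.0.0.1", 0),
    ("10.0.0.1", 600),   # 600 seconds later: same visit
    ("10.0.0.2", 700),
    ("10.0.0.1", 2500),  # 1900 seconds later: new visit
    ("10.0.0.1", 2600),
]
print(count_visits(sample))
```

With the default of 1800 seconds, the sample yields two visits for
10.0.0.1 and one for 10.0.0.2.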
848 | PageType Allows you to define the 'page' type extension. Normally, | |
849 | people consider HTML and CGI scripts as 'pages'. This | |
850 | option allows you to specify what extensions you consider | |
851 | a page. Default is 'htm*' and 'cgi' for web logs, and | |
852 | 'txt' for ftp logs. | |
853 | Command line argument: -P | |
854 | ||
855 | PagePrefix Allows all requests with a specified prefix to be considered | |
856 | as 'pages'. Useful if you want everything under /documents to be | |
857 | treated as pages no matter what their extension is. Also | |
858 | useful if you have cgi-scripts with PATH_INFO. | |
859 | ||
860 | OmitPage Allows specified URLs to not be counted as pages under any | |
861 | circumstance, even if they have an extension matching a | |
862 | PageType or PagePrefix as defined above. | |
863 | ||
864 | GraphLegend Enable/disable the display of color coded legends on the | |
865 | produced graphs. Default is 'yes', to display them. | |
866 | Command line argument: -L | |
867 | ||
868 | GraphLines Specify the number of background reference lines to display | |
869 | on produced graphs. The default is 2. To disable the use | |
870 | of background lines, use zero ('0'). | |
871 | Command line argument: -l | |
872 | ||
873 | IndexMonths Specify the number of months to display in the main index | |
874 | (yearly summary) table. Default is 12 months. Can be set | |
875 | to anything between 12 and 120 months (1 to 10 years). | |
876 | Command line argument: -K | |
877 | ||
878 | YearHeaders Enable/disable the display of year headers in the main index | |
879 | (yearly summary) table. If enabled, year headers will be | |
880 | shown when the table is displaying more than 16 months worth | |
881 | of data. Values can be 'yes' or 'no'. Default is 'yes'. | |
882 | ||
883 | GraphMonths Specify the number of months to display in the main index | |
884 | (yearly summary) graph. Default is 12 months. Can be set | |
885 | to anything between 12 and 72 months (1 to 6 years). | |
886 | Command line argument: -k | |
887 | ||
888 | CountryGraph This keyword is used to either enable or disable the creation | |
889 | and display of the Country Usage graph. Values may be either | |
890 | 'yes' or 'no', with the default being 'yes'. | |
891 | Command line argument: -Y | |
892 | ||
893 | CountryFlags Enables or disables the display of flags in the top country | |
894 | table. If enabled, the default directory 'flags' directly | |
895 | under the output directory will be used unless a different | |
896 | path is specified with the 'FlagDir' option below. | |
897 | Command line argument: -zflags | |
898 | ||
899 | FlagDir Specifies the location of flag graphics. If not specified, | |
900 | the default is in the 'flags' directory directly under the | |
901 | output directory being used for the reports. If specified, | |
902 | the display of flags will be enabled by default. | |
903 | Command line argument: -z | |
904 | ||
905 | DailyGraph This keyword is used to either enable or disable the creation | |
906 | and display of the Daily Usage graph. Values may be either | |
907 | 'yes' or 'no', with the default being 'yes'. | |
908 | ||
909 | DailyStats This keyword is used to either enable or disable the creation | |
910 | and display of the Daily Usage statistics table. Values may | |
911 | be either 'yes' or 'no', with the default being 'yes'. | |
912 | ||
913 | HourlyGraph This keyword is used to either enable or disable the creation | |
914 | and display of the Hourly Usage graph. Values may be either | |
915 | 'yes' or 'no', with the default being 'yes'. | |
916 | Command line argument: -G | |
917 | ||
918 | HourlyStats This keyword is used to either enable or disable the creation | |
919 | and display of the Hourly Usage statistics table. Values may | |
920 | be either 'yes' or 'no', with the default being 'yes'. | |
921 | Command line argument: -H | |
922 | ||
923 | IndexAlias This allows additional 'index.html' aliases to be defined. | |
924 | Normally, The Webalizer scans for and strips the string | |
925 | "index." from URLs before processing them (unless disabled | |
926 | using the DefaultIndex config option below). This turns a | |
927 | URL such as /somedir/index.html into just /somedir/ which | |
928 | is really the same URL. This keyword allows _additional_ | |
929 | names to be treated in the same fashion for sites that use | |
930 | different default names, such as "home.html". The string | |
931 | is scanned for anywhere in the URL, so care should be used | |
932 | if and when you define additional aliases. For example, | |
933 | if you were to use an alias such as 'home', the URL | |
934 | /somedir/homepages/brad/home.html would be turned into just | |
935 | /somedir/ which probably isn't the intended result. Instead, | |
936 | you should have specified 'home.htm' which would correctly | |
937 | turn the URL into /somedir/homepages/brad/ like intended. | |
938 | It should also be noted that specified aliases are scanned | |
939 | for in EVERY log record... A bunch of aliases will noticeably | |
940 | degrade performance as each record has to be scanned for | |
941 | every alias defined. You don't have to specify 'index.' as | |
942 | it is always the default (unless disabled with the config | |
943 | option "DefaultIndex" described below). | |
944 | Command line argument: -I | |
945 | ||
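The alias handling described for IndexAlias can be sketched in Python as
shown below. This follows only the behaviour stated in this README (the
string is searched for anywhere in the URL, and the URL is cut at that
point). The real implementation may apply additional checks, and the
function and parameter names are hypothetical.

```python
def strip_index(url, aliases=(), default_index=True):
    """Cut a URL at the first index name found in it (illustrative only)."""
    names = (["index."] if default_index else []) + list(aliases)
    for name in names:
        pos = url.find(name)  # searched for anywhere in the URL
        if pos != -1:
            return url[:pos]  # drop the index name and everything after it
    return url


print(strip_index("/somedir/index.html"))
print(strip_index("/somedir/homepages/brad/home.html", ["home"]))
print(strip_index("/somedir/homepages/brad/home.html", ["home.htm"]))
```

The second call shows the pitfall described above: the alias 'home'
matches inside "homepages" and reduces the URL to /somedir/, while
'home.htm' produces /somedir/homepages/brad/ as intended.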
946 | DefaultIndex This option is used to enable/disable the use of "index." as | |
947 | a default index name to be stripped from the end of a URL. | |
948 | Most sites should not need to use this option, however some | |
949 | may find it useful, particularly those whose default index | |
950 | file name is something different, or those sites that use | |
951 | 'index.php' or similar URLs to generate dynamic content. | |
952 | This option does not affect any of the names that may be | |
953 | defined using the IndexAlias option, and those names will | |
954 | still function as described. Values may be 'yes' or 'no', | |
955 | with 'yes' being the default. | |
956 | ||
957 | MangleAgents The MangleAgents keyword specifies the level of user agent | |
958 | name mangling, if any. There are 6 levels that may be specified, | |
959 | each producing a different level of detail displayed. Level 5 | |
960 | displays only the browser name (MSIE or Mozilla) and the major | |
961 | version number. Level 4 adds the minor version (single | |
962 | decimal place). Level 3 adds the minor version to two decimal | |
963 | places. Level 2 will also add any sub-level designation | |
964 | (such as Mozilla/3.01Gold or MSIE 3.0b). Level 1 will also | |
965 | attempt to add the system type. The default level 0 will | |
966 | leave the user agent field unmodified and produces the | |
967 | greatest amount of detail. | |
968 | Command line argument: -M | |
969 | ||
970 | SearchEngine This keyword allows specification of search engines and | |
971 | their query strings. Search strings are obtained from | |
972 | the referrer field in the record, and in order to work | |
973 | properly, the Webalizer needs to know what query strings | |
974 | different search engines use. The SearchEngine keyword allows | |
975 | you to specify the search engine and its query string | |
976 | to parse the search string from. The line is formatted | |
977 | as: "SearchEngine engine-string query-string" where | |
978 | 'engine-string' is a substring for matching the search | |
979 | engine with, such as "yahoo.com" or "altavista". The | |
980 | 'query-string' is the unique query string that is added | |
981 | to the URL for the search engine, such as "search=" or | |
982 | "MT=" with the actual search strings appended to the | |
983 | end. There is no command line option for this keyword. | |
984 | ||
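How an engine-string and query-string pair might be applied to a
referrer can be sketched as follows. This is a rough illustration, not
The Webalizer's parser. The engine names and referrer below are made up,
and ending the search string at the next '&' and decoding it with
unquote_plus are assumptions.

```python
from urllib.parse import unquote_plus


def search_string(referrer, engines):
    """Return the search string found in a referrer, or None."""
    for engine, query in engines:
        if engine not in referrer:
            continue  # the referrer does not belong to this engine
        pos = referrer.find(query)
        if pos == -1:
            continue  # engine matched, but its query string is absent
        # Assumption: the search string runs up to the next '&', if any
        raw = referrer[pos + len(query):].split("&", 1)[0]
        return unquote_plus(raw)
    return None


# Hypothetical pairs, in the order "engine-string query-string"
ENGINES = [("search.example.com", "search="), ("find.example.org", "MT=")]

ref = "http://search.example.com/results?search=web+log+analysis&page=2"
print(search_string(ref, ENGINES))
```

A plain substring search for the query-string can also match inside a
longer parameter name, so a sketch like this is only a starting point.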
985 | SearchCaseI The SearchCaseI option specifies if search strings should | |
986 | be lowercased (case insensitive) or not. Since most | |
987 | search engines use case insensitive searches (ie: a | |
988 | search for "Hello" is the same as "HELLO" or "hello"), | |
989 | converting to lowercase will improve keyword accuracy, | |
990 | which is the default. If desired, case sensitivity can | |
991 | be forced with this option. The value can be 'yes' or | |
992 | 'no', with 'yes' (case insensitive) being the default. | |
993 | ||
994 | Incremental This allows incremental processing to be enabled or disabled. | |
995 | Incremental processing allows processing partial logs without | |
996 | the loss of detail data from previous runs in the same month. | |
997 | This feature saves the 'internal state' of the program so that | |
998 | it may be restored in following runs. See the section above | |
999 | titled "Incremental Processing" for additional information. | |
1000 | The value may be 'yes' or 'no', with the default being 'no'. | |
1001 | Command line argument: -p | |
1002 | ||
1003 | IncrementalName | |
1004 | Allows specification of the incremental data filename if | |
1005 | desired. Normally, the file named "webalizer.current" is | |
1006 | used, kept in the standard output directory. If specified, | |
1007 | filenames are relative to the standard output directory, | |
1008 | unless an absolute name is given (ie: starts with '/'). | |
1009 | ||
1010 | StripCGI Determines if CGI variables should be stripped from the | |
1011 | end of URLs or not. Normally, these variables are removed | |
1012 | from URLs to improve accuracy, however some sites may wish | |
1013 | to keep them preserved (particularly on highly dynamic | |
1014 | sites). Values may be either 'yes' or 'no', with 'yes' | |
1015 | being the default. | |
1016 | ||
1017 | TrimSquidURL Allows squid log URLs to be reduced in granularity by | |
1018 | truncating them after a specified number of '/' path | |
1019 | separators after the http:// portion. A value of 1 will | |
1020 | cause all URLs to be summarized by domain only. The | |
1021 | default value is zero (0), which leaves URLs unmodified. | |
1022 | ||
1023 | DNSCache Specifies the DNS cache filename. This name is relative | |
1024 | to the default output directory unless an absolute name | |
1025 | is given (ie: starts with '/'). See the DNS.README file | |
1026 | for additional information. | |
1027 | Command line argument: -D | |
1028 | ||
1029 | DNSChildren The number of DNS child processes to run in order to | |
1030 | create/update the DNS cache file. If specified, the DNS | |
1031 | cache filename must also be specified (see above). Use | |
1032 | a value of zero ('0') to disable. See the DNS.README | |
1033 | file for additional information. | |
1034 | Command line argument: -N | |
1035 | ||
1036 | CacheIPs Specifies if unresolved addresses should also be cached | |
1037 | in the DNS database. If enabled, unresolved IP addresses | |
1038 | will be stored along with resolved addresses. This may | |
1039 | be useful on some sites that have lots of unresolved IPs | |
1040 | visiting so they are not looked up each time the program | |
1041 | is run. Values may be 'yes' or 'no'. Default is 'no'. | |
1042 | ||
1043 | CacheTTL Specifies the Time To Live (TTL) value for cached DNS | |
1044 | entries in days. Default value is 7 (1 week). Can be | |
1045 | any value between 1 and 100. | |
1046 | ||
1047 | GeoDB Controls the use of the native GeoDB geolocation services | |
1048 | provided by The Webalizer. Values may be 'yes' or 'no' | |
1049 | with 'no' being the default. | |
1050 | Command line argument: -j | |
1051 | ||
1052 | GeoDBDatabase Specifies an alternate GeoDB database filename to use. | |
1053 | This is relative to the output directory being used unless | |
1054 | an absolute path is given (ie: starts with a '/'). | |
1055 | Command line argument: -J | |
1056 | ||
1057 | GeoIP Controls the use of GeoIP geolocation services. If The | |
1058 | Webalizer was compiled with GeoIP support, it is used by | |
1059 | default. Values may be 'yes' or 'no'. Default is 'yes'. | |
1060 | Command line argument: -w | |
1061 | ||
1062 | GeoIPDatabase Specifies an alternate GeoIP database filename to use. | |
1063 | This name is relative to the default output directory | |
1064 | unless an absolute name is given (ie: starts with '/'). | |
1065 | Command line argument: -W | |
1066 | ||
1067 | ||
1068 | Top Table Keywords | |
1069 | ------------------ | |
1070 | ||
1071 | TopAgents This allows you to specify how many "Top" user agents are | |
1072 | displayed in the "Top User Agents" table. The default | |
1073 | is 15. If you do not want to display user agent statistics, | |
1074 | specify a value of zero (0). The display of user agents | |
1075 | will only work if your web server includes this information | |
1076 | in its log file (ie: a combined log format file). | |
1077 | Command line argument: -A | |
1078 | ||
1079 | AllAgents Will cause a separate HTML page to be generated for all | |
1080 | normally visible User Agents. A link will be added to | |
1081 | the bottom of the "Top User Agents" table if enabled. | |
1082 | Value can be either 'yes' or 'no', with 'no' being the | |
1083 | default. | |
1084 | ||
1085 | TopCountries This allows you to specify how many "Top" countries are | |
1086 | displayed in the "Top Countries" table. The default is | |
1087 | 30. If you want to disable the countries table, specify | |
1088 | a value of zero (0). | |
1089 | Command line argument: -C | |
1090 | ||
1091 | TopReferrers This allows you to specify how many "Top" referrers are | |
1092 | displayed in the "Top Referrers" table. The default is | |
1093 | 30. If you want to disable the referrers table, specify | |
1094 | a value of zero (0). The display of referrer information | |
1095 | will only work if your web server includes this information | |
1096 | in its log file (ie: a combined log format file). | |
1097 | Command line argument: -R | |
1098 | ||
1099 | AllReferrers Will cause a separate HTML page to be generated for all | |
1100 | normally visible Referrers. A link will be added to the | |
1101 | "Top Referrers" table if enabled. Value can be either | |
1102 | 'yes' or 'no', with 'no' being the default. | |
1103 | ||
1104 | TopSites This allows you to specify how many "Top" sites are | |
1105 | displayed in the "Top Sites" table. The default is 30. | |
1106 | If you want to disable the sites table, specify a value | |
1107 | of zero (0). | |
1108 | Command line argument: -S | |
1109 | ||
1110 | TopKSites Identical to TopSites, except for the 'by KByte' table. | |
1111 | Default is 10. No command line switch for this one. | |
1112 | ||
1113 | AllSites Will cause a separate HTML page to be generated for all | |
1114 | normally visible Sites. A link will be added to the | |
1115 | bottom of the "Top Sites" table if enabled. Value can | |
1116 | be either 'yes' or 'no', with 'no' being the default. | |
1117 | ||
1118 | TopURLs This allows you to specify how many "Top" URLs are | |
1119 | displayed in the "Top URLs" table. The default is 30. | |
1120 | If you want to disable the URLs table, specify a value | |
1121 | of zero (0). | |
1122 | Command line argument: -U | |
1123 | ||
1124 | TopKURLs Identical to TopURLs, except for the 'by KByte' table. | |
1125 | Default is 10. No command line switch for this one. | |
1126 | ||
1127 | AllURLs Will cause a separate HTML page to be generated for all | |
1128 | normally visible URLs. A link will be added to the | |
1129 | bottom of the "Top URLs" table if enabled. Value can | |
1130 | be either 'yes' or 'no', with 'no' being the default. | |
1131 | ||
1132 | TopEntry Allows you to specify how many "Top Entry Pages" are | |
1133 | displayed in the table. The default is 10. If you | |
1134 | want to disable the table, specify a value of zero (0). | |
1135 | Command line argument: -e | |
1136 | ||
1137 | TopExit Allows you to specify how many "Top Exit Pages" are | |
1138 | displayed in the table. The default is 10. If you | |
1139 | want to disable the table, specify a value of zero (0). | |
1140 | Command line argument: -E | |
1141 | ||
1142 | TopSearch Allows you to specify how many "Top Search Strings" are | |
1143 | displayed in the table. The default is 20. If you | |
1144 | want to disable the table, specify a value of zero (0). | |
1145 | Only works if using combined log format (ie: contains | |
1146 | referrer information). | |
1147 | ||
1148 | TopUsers This allows you to specify how many "Top" usernames are | |
1149 | displayed in the "Top Usernames" table. Usernames are | |
1150 | only available if you use http authentication on your | |
1151 | web server, or when processing wu-ftpd xferlogs. The | |
1152 | default value is 20. If you want to disable the Username | |
1153 | table, specify a value of zero (0). | |
1154 | ||
1155 | AllUsers Will cause a separate HTML page to be generated for all | |
1156 | normally visible usernames. A link will be added to the | |
1157 | bottom of the "Top Usernames" table if enabled. Value | |
1158 | can be either 'yes' or 'no', with 'no' being the default. | |
1159 | ||
1160 | AllSearchStr Will cause a separate HTML page to be generated for all | |
1161 | normally visible Search Strings. A link will be added | |
1162 | to the bottom of the "Top Search Strings" table if | |
1163 | enabled. Value can be either 'yes' or 'no', with 'no' | |
1164 | being the default. | |
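
As a rough illustration, a configuration file fragment using some of the
'Top' and 'All' keywords described above might look like the following.
The numbers are arbitrary examples, not recommendations.

```
# Show 20 sites in the "Top Sites" table and link to a full listing
TopSites     20
AllSites     yes

# Show 50 URLs, keep the 'by KByte' table at its default size
TopURLs      50
TopKURLs     10
AllURLs      yes

# A value of zero disables a table entirely
TopSearch    0
```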
1165 | ||
1166 | ||
1167 | Hide Object Keywords | |
1168 | -------------------- | |
1169 | ||
1170 | These keywords allow you to hide user agents, referrers, sites, URLs | |
1171 | and usernames from the various "Top" tables. The values for these keywords | |
1172 | are the same as those used in their command line counterparts. You | |
1173 | can specify as many of these as you want without limit. Refer to the | |
1174 | section above on "Command Line Options" for a description of the string | |
1175 | formatting used as the value. Values cannot exceed 80 characters in | |
1176 | length. | |
1177 | ||
1178 | HideAgent This allows specified user agents to be hidden from the | |
1179 | "Top User Agents" table. Not very useful, since there are | |
1180 | a zillion different names that browsers go by today, | |
1181 | but could be useful if there is a particular user agent | |
1182 | (ie: robots, spiders, real-audio, etc..) that hits your | |
1183 | site frequently enough to make it into the top user agent | |
1184 | listing. This keyword is useless if 1) your log file does | |
1185 | not provide user agent information or 2) you disable the | |
1186 | user agent table. | |
1187 | Command line argument: -a | |
1188 | ||
1189 | HideReferrer This allows you to hide specified referrers from the | |
1190 | "Top Referrers" table. Normally, you would only specify | |
1191 | your own web server to be hidden, as it is usually the | |
1192 | top generator of references to your own pages. Of course, | |
1193 | this keyword is useless if 1) your log file does not include | |
1194 | referrer information or 2) you disable the top referrers | |
1195 | table. | |
1196 | Command line argument: -r | |
1197 | ||
1198 | HideSite This allows you to hide specified sites from the "Top | |
1199 | Sites" table. Normally, you would only specify your own | |
1200 | web server or other local machines to be hidden, as they | |
1201 | are usually the highest hitters of your web site, especially | |
1202 | if their browsers' home pages point to it. | |
1203 | Command line argument: -s | |
1204 | ||
1205 | HideAllSites This allows hiding all individual sites from the display, | |
1206 | which can be useful when a lot of groupings are being | |
1207 | used (since grouped records cannot be hidden). It is | |
1208 | particularly useful in conjunction with the GroupDomains | |
1209 | feature, but can be useful in other situations as well. | |
1210 | Value can be either 'yes' or 'no', with 'no' the default. | |
1211 | Command line argument: -X | |
1212 | ||
1213 | HideURL This allows you to hide URLs from the "Top URLs" table. | |
1214 | Normally, this is used to hide items such as graphic files, | |
1215 | audio files or other 'non-html' files that are transferred | |
1216 | to the visiting user. | |
1217 | Command line argument: -u | |
1218 | ||
1219 | HideUser This allows you to hide Usernames from the "Top Usernames" | |
1220 | table. Usernames are only available if you use http based | |
1221 | authentication on your web server. | |
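
A short sketch of how the Hide* keywords might appear in a configuration
file is shown below. The host names and patterns are hypothetical
(example.com is a placeholder), and each value uses a leading or trailing
'*' wildcard.

```
# Hide our own machines from the "Top Sites" table
HideSite      *.example.com

# Hide references generated by our own pages
HideReferrer  http://www.example.com/*

# Hide graphic files from the "Top URLs" table
HideURL       *.gif
HideURL       *.jpg
HideURL       *.png
```

Each keyword may be repeated as many times as needed; hidden items are
only removed from the 'Top' tables, not from the totals.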
1222 | ||
1223 | ||
1224 | Group Object Keywords | |
1225 | --------------------- | |
1226 | ||
1227 | The Group* keywords allow object grouping based on Site, URL, Referrer, | |
1228 | User Agent and Usernames. Combined with the Hide* keywords, you can | |
1229 | customize exactly what will be displayed in the 'Top' tables. For example, | |
1230 | to only display totals for a particular directory, use a GroupURL and | |
1231 | HideURL with the same value (ie: '/help/*'). Group processing is only | |
1232 | done after the individual record has been fully processed, so name mangling | |
1233 | and site total updates have already been performed. Because of this, groups | |
1234 | are not counted in the main site total (as that would cause duplication). | |
1235 | Groups can be displayed in bold and shaded as well. Grouped records are | |
1236 | not, by default, hidden from the report. This allows you to display a | |
1237 | grouped total, while still being able to see the individual records, even | |
1238 | if they are part of the group. If you want to hide the detail records, | |
1239 | follow the Group* directive with a Hide* one using the same value. There | |
1240 | are no command line switches for these keywords. The Group* keywords also | |
1241 | accept an optional label to be displayed instead of the actual value used. | |
1242 | This label should be separated from the value by at least one whitespace | |
1243 | character, such as a space or tab character. If the match string contains | |
1244 | whitespace (spaces or tabs), the string should be quoted, using either | |
1245 | single or double quotes. See the sample configuration file for examples. | |
1246 | ||
1247 | GroupReferrer Allows grouping Referrers. Can be handy for some of the | |
1248 | major search engines that have multiple host names a | |
1249 | referral could come from. | |
1250 | ||
1251 | GroupURL This keyword allows grouping URLs. Useful for grouping | |
1252 | complete directory trees. | |
1253 | ||
1254 | GroupSite This keyword allows grouping Sites. Most often used for | |
1255 | grouping top level domains and unresolved IP addresses | |
1256 | for local dial-ups, etc... | |
1257 | ||
1258 | GroupAgent Groups User Agents. A handy example of how you could use | |
1259 | this one is to use "Mozilla" and "MSIE" as the values for | |
1260 | GroupAgent and HideAgent keywords. Make sure you put the | |
1261 | "MSIE" one first. | |
1262 | ||
1263 | GroupDomains Allows automatic grouping of domains. The numeric value | |
1264 | represents the level of grouping, and can be thought of | |
1265 | as 'the number of dots' to display. A 1 will display | |
1266 | second level domains only (xxx.xxx), a 2 will display | |
1267 | third level domains (xxx.xxx.xxx) etc... The default | |
1268 | value of 0 disables any domain grouping. | |
1269 | Command line argument: -g | |
1270 | ||
1271 | GroupUser Allows grouping of usernames. Combined with a group | |
1272 | name, this can be handy for displaying statistics on | |
1273 | a particular group of users without displaying their | |
1274 | real usernames. | |
1275 | ||
1276 | GroupShading Allows shading of table rows for groups. Value can be | |
1277 | 'yes' or 'no', with the default being 'yes'. | |
1278 | ||
1279 | GroupHighlight Allows bolding of table rows for groups. Value can be | |
1280 | 'yes' or 'no', with the default being 'yes'. | |
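
The grouping rules above can be sketched in a configuration file roughly
as follows. All values and labels are made-up examples ("Example Crawler"
is a hypothetical user agent, example.com a placeholder domain).

```
# Show one total for the /help/ tree and hide its individual URLs
GroupURL      /help/*             Help pages
HideURL       /help/*

# Group and hide browsers; the MSIE entry comes first
GroupAgent    MSIE                Internet Explorer
HideAgent     MSIE
GroupAgent    Mozilla             Mozilla compatible
HideAgent     Mozilla

# A match string containing whitespace is quoted; the label follows it
GroupAgent    "Example Crawler*"  Example crawler

# Group a domain under a label, but keep its individual sites visible
GroupSite     *.example.com       Example Corp

# Automatically group sites by second level domain (xxx.xxx)
GroupDomains  1
```

Where a Hide* line follows a Group* line with the same value, only the
grouped total is displayed; without it, both the group and its member
records appear in the table.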
1281 | ||
1282 | ||
1283 | Ignore/Include Object Keywords | |
1284 | ------------------------------ | |
1285 | ||
1286 | These keywords allow you to completely ignore log records when generating | |
1287 | statistics, or to force their inclusion regardless of ignore criteria. | |
1288 | Records can be ignored or included based on site, URL, user agent, referrer | |
1289 | and username. Be aware that by choosing to ignore records, the generated | |
1290 | statistics become skewed, making it impossible to produce | |
1291 | an accurate representation of load on the web server. These keywords | |
1292 | behave identically to the Hide* keywords above, where the value can have | |
1293 | a leading or trailing wildcard '*'. These keywords, like the Hide* ones, | |
1294 | have an absolute limit of 80 characters for their values. These keywords | |
1295 | do not have any command line switch counterparts, so they may only be | |
1296 | specified in a configuration file. It should also be pointed out that | |
1297 | using the Ignore/Include combination to selectively exclude an entire | |
1298 | site while including a particular 'chunk' is _extremely_ inefficient, | |
1299 | and should be avoided. Try grep'ing the records into a separate file | |
1300 | and processing that file instead. | |
1301 | ||
1302 | IgnoreSite This allows specified sites to be completely ignored from | |
1303 | the generated statistics. | |
1304 | ||
1305 | IgnoreURL This allows specified URLs to be completely ignored from | |
1306 | the generated statistics. One use for this keyword would | |
1307 | be to ignore all hits to a 'temporary' directory where | |
1308 | development work is being done, but is not accessible to | |
1309 | the outside world. | |
1310 | ||
1311 | IgnoreReferrer This allows records to be ignored based on the referrer | |
1312 | field. | |
1313 | ||
1314 | IgnoreAgent This allows specified User Agent records to be completely | |
1315 | ignored from the statistics. Maybe useful if you really | |
1316 | don't want to see all those hits from MSIE :) | |
1317 | ||
1318 | IgnoreUser This allows specified username records to be completely | |
1319 | ignored from the statistics. Usernames can only be used | |
1320 | if you use http authentication on your server. | |
1321 | ||
1322 | IncludeSite Force the record to be processed based on hostname. This | |
1323 | takes precedence over the Ignore* keywords. | |
1324 | ||
1325 | IncludeURL Force the record to be processed based on URL. This takes | |
1326 | precedence over the Ignore* keywords. | |
1327 | ||
1328 | IncludeReferrer Force the record to be processed based on referrer. | |
1329 | This takes precedence over the Ignore* keywords. | |
1330 | ||
1331 | IncludeAgent Force the record to be processed based on user agent. | |
1332 | This takes precedence over the Ignore* keywords. | |
1333 | ||
1334 | IncludeUser Force the record to be processed based on username. | |
1335 | Usernames are only available if you use http based | |
1336 | authentication on your server. This takes precedence over | |
1337 | the Ignore* keywords. | |
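
As a small sketch, the Ignore* and Include* keywords might be combined
like this. The directory and user names are hypothetical.

```
# Drop all hits to a development directory from the statistics
IgnoreURL    /temp/*

# Drop records for test accounts (requires http authentication)
IgnoreUser   testuser*

# Still process one part of the ignored directory; Include* wins
IncludeURL   /temp/release-notes*
```

This kind of narrow exception is different from ignoring an entire site
and including a small piece of it, which remains the inefficient case
described above.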
1338 | ||
1339 | ||
1340 | Dump Object Keywords | |
1341 | -------------------- | |
1342 | ||
1343 | The Dump* Keywords allow text files to be generated that can then be used | |
1344 | for import into most database, spreadsheet and other external programs. | |
1345 | The file is a standard tab delimited text file, meaning that each column | |
1346 | is separated by a tab (0x09) character. A header record may be included | |
1347 | if required, using the 'DumpHeader' keyword. Since these files contain | |
1348 | all records that have been processed, including normally hidden records, | |
1349 | an alternate location for the files can be specified using the 'DumpPath' | |
1350 | keyword, otherwise they will be located in the default output directory. | |
1351 | ||
1352 | DumpPath Specifies an alternate location for the dump files. The | |
1353 | default output location will be used otherwise. The value | |
1354 | is the path portion to use, and normally should be an | |
1355 | absolute path (ie: has a leading '/' character), however | |
1356 | relative path names can be used as well, and will be | |
1357 | relative to the output directory location. | |
1358 | ||
1359 | DumpExtension Allows the dump filename extensions to be specified. The | |
1360 | default extension is "tab", but it may be changed with | |
1361 | this option. | |
1362 | ||
1363 | DumpHeader Allows a header record to be written as the first record | |
1364 | of the file. Value can be either 'yes' or 'no', with | |
1365 | the default being 'no'. | |
1366 | ||
1367 | DumpSites Dump tab delimited sites file. Value can be either 'yes' | |
1368 | or 'no', with the default being 'no'. The filename used | |
1369 | is site_YYYYMM.tab (YYYY=year, MM=month). | |
1370 | ||
1371 | DumpURLs Dump tab delimited url file. Value can be either 'yes' or | |
1372 | 'no', with the default being 'no'. The filename used is | |
1373 | url_YYYYMM.tab (YYYY=year, MM=month). | |
1374 | ||
1375 | DumpReferrers Dump tab delimited referrer file. Value can be either | |
1376 | 'yes' or 'no', with the default being 'no'. Filename | |
1377 | used is ref_YYYYMM.tab (YYYY=year, MM=month). Referrer | |
1378 | information is only available if present in the log | |
1379 | file (ie: combined web server log). | |
1380 | ||
1381 | DumpAgents Dump tab delimited user agent file. Value can be either | |
1382 | 'yes' or 'no', with the default being 'no'. Filename | |
1383 | used is agent_YYYYMM.tab (YYYY=year, MM=month). User | |
1384 | agent information is only available if present in the | |
1385 | log file (ie: combined web server log). | |
1386 | ||
1387 | DumpUsers Dump tab delimited username file. Value can be either | |
1388 | 'yes' or 'no', with the default being 'no'. Filename | |
1389 | used is user_YYYYMM.tab (YYYY=year, MM=month). The | |
1390 | username data is only available if processing a wu-ftpd | |
1391 | xferlog or http authentication is used on the web server | |
1392 | and that information is present in the log. | |
1393 | ||
1394 | DumpSearchStr Dump tab delimited search string file. Value can be | |
1395 | either 'yes' or 'no', with the default being 'no'. | |
1396 | Filename used is search_YYYYMM.tab (YYYY=year, MM=month). | |
1397 | The search string data is only available if referrer | |
1398 | information is present in the log being processed and | |
1399 | recognized search engines were found and processed. | |
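
For illustration, a configuration fragment that turns on a few of the
dump files could look like the following. The path is a hypothetical
example; any directory writable by the user running The Webalizer
should do.

```
# Write dump files somewhere other than the HTML output directory
DumpPath       /var/lib/webalizer/dumps

# Include a header record and keep the default "tab" extension
DumpHeader     yes
DumpExtension  tab

# Select which dump files to produce
DumpSites      yes
DumpURLs       yes
DumpReferrers  yes
```

With these settings, a run covering March 2011 would be expected to
produce files named site_201103.tab, url_201103.tab and ref_201103.tab
in the dump directory, following the YYYYMM naming described above.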
1400 | ||
1401 | ||
1402 | ||
1403 | HTML Generation Keywords | |
1404 | ------------------------ | |
1405 | ||
1406 | These keywords allow you to customize the HTML code that The Webalizer | |
1407 | produces, such as adding a corporate logo or links to other web pages. | |
1408 | You can specify as many of these keywords as you like, and they will be | |
1409 | used in the order that they are found in the file. Values cannot exceed | |
1410 | 80 characters in length, so you may have to break long lines up into two | |
1411 | or more lines. There are no command line counterparts to these keywords. | |
1412 | ||
1413 | HTMLExtension Allows generated pages to use something other than the | |
1414 | default 'html' extension for the filenames. Do not | |
1415 | include the leading period ('.') when you specify the | |
1416 | extension. | |
1417 | Command line argument: -x | |
1418 | ||
1419 | HTMLPre Allows code to be inserted at the very beginning of the | |
1420 | HTML files. Defaults to the standard HTML 3.2 DOCTYPE | |
1421 | record. Be careful not to include any HTML here, as it | |
1422 | is inserted _before_ the <HTML> tag in the file. Use it | |
1423 | for server-side scripting capabilities, such as php3, to | |
1424 | insert scripting files and other directives. | |
1425 | ||
1426 | HTMLHead Allows you to insert HTML code between the <HEAD></HEAD> | |
1427 | block. There is no default. Useful for adding scripts | |
1428 | to the HTML page, such as Javascript or php3, or even | |
1429 | just for adding a few META tags to the document. | |
1430 | ||
1431 | HTMLBody This keyword defines HTML code to be placed immediately | |
1432 | after the <HEAD> section of the report, just before the | |
1433 | title and "summary period/generated on" lines. If used, | |
1434 | the first HTMLBody line MUST include a <BODY> tag. Put | |
1435 | whatever else you want in subsequent lines, but keep in | |
1436 | mind the placement of this code in relation to the title | |
1437 | and other aspects of the web page. Some typical uses | |
1438 | are to change the page colors and possibly add a corporate | |
1439 | logo (graphic) in the top right. If not specified, a | |
1440 | default <BODY> tag is used that defines page color, text | |
1441 | color and link colors (see "sample.conf" file for example). | |
1442 | ||
1443 | HTMLPost This keyword defines HTML code that is placed after the | |
1444 | title and "summary period/generated on" lines, just before | |
1445 | the initial horizontal rule <HR> tag. Normally this keyword | |
1446 | isn't needed, but is provided in case you included a large | |
1447 | graphic or some other weird formatting tag in the HTMLBody | |
1448 | section that needs to be cleaned up or terminated before the | |
1449 | main report section. | |
1450 | ||
1451 | HTMLTail This keyword defines HTML code that is placed at the bottom | |
1452 | right side of the report. It is inserted in a <TABLE> section | |
1453 | between table data <TD>..</TD> tags, and is top and right | |
1454 | aligned within the table. Normally this keyword is used to | |
1455 | provide a link back to your home page or insert a small | |
1456 | graphic at the bottom right of the page. | |
1457 | ||
1458 | HTMLEnd This allows insertion of closing code, at the very end of | |
1459 | the page. The default is to put the closing </BODY> and | |
1460 | </HTML> tags. If specified, you _must_ specify these tags | |
1461 | yourself. | |
1462 | ||
1463 | LinkReferrer This specifies if the referrers listed in the top referrer | |
1464 | table should be displayed as plain text, or as a link to the | |
1465 | referrer. Values can be either 'yes' or 'no', with 'no' | |
1466 | being the default. | |
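
A minimal sketch of the HTML keywords in use is shown below. The image
path, colors and link target are placeholders, and each value is kept
under the 80 character limit by splitting the HTMLBody code over two
lines.

```
# Extra tags for the <HEAD></HEAD> block
HTMLHead  <META NAME="author" CONTENT="Example Webmaster">

# The first HTMLBody line carries the <BODY> tag; later lines follow it
HTMLBody  <BODY BGCOLOR="#E8E8E8" TEXT="#000000" LINK="#0000FF">
HTMLBody  <IMG SRC="/images/logo.png" ALT="Example logo" ALIGN="right">

# Link back to the home page at the bottom right of the report
HTMLTail  <A HREF="/">Home</A>

# If HTMLEnd is given, the closing tags must be supplied by hand
HTMLEnd   </BODY></HTML>

# Display referrers as links instead of plain text
LinkReferrer  yes
```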
1467 | ||
1468 | ||
1469 | Graph Color Commands | |
1470 | -------------------- | |
1471 | ||
1472 | These keywords allow altering the colors used in the various graphs | |
1473 | produced by the Webalizer. The value is specified as a standard HTML | |
1474 | RGB hexadecimal color string, without the leading '#' character. The | |
1475 | value is case insensitive. If not specified, the default color shown | |
1476 | will be used. | |
1477 | ||
1478 | ColorHit Color used for 'Hits'. Default is '00805C' (green) | |
1479 | ||
1480 | ColorFile Color used for 'Files'. Default is '0040FF' (blue) | |
1481 | ||
1482 | ColorSite Color used for 'Sites'. Default is 'FF8000' (orange) | |
1483 | ||
1484 | ColorKbyte Color used for 'KBytes'. Default is 'FF0000' (red) | |
1485 | ||
1486 | ColorPage Color used for 'Pages'. Default is '00E0FF' (cyan) | |
1487 | ||
1488 | ColorVisit Color used for 'Visits'. Default is 'FFFF00' (yellow) | |
1489 | ||
1490 | ColorMisc Color used for miscellaneous titles in various 'Top' | |
1491 | tables (not graphs). Default is '00E0FF' (cyan) | |
1492 | ||
1493 | PieColor1 Pie Chart color #1. Default is '800080' (purple) | |
1494 | ||
1495 | PieColor2 Pie Chart color #2. Default is '80FFC0' (lt. green) | |
1496 | ||
1497 | PieColor3 Pie Chart color #3. Default is 'FF00FF' (lt. purple) | |
1498 | ||
1499 | PieColor4 Pie Chart color #4. Default is 'FFC080' (tan) | |
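
As an example, the color keywords take a bare six digit hex value. The
fragment below simply restates three of the defaults listed above,
written in lower case to show that case does not matter.

```
# RGB hex strings, no leading '#'
ColorHit     00805c
ColorKbyte   ff0000
PieColor1    800080
```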
1500 | ||
1501 | ||
1502 | -------------------------------------------------------------------------- | |
1503 | ||
1504 | ||
1505 | Notes on Web Log Files | |
1506 | ---------------------- | |
1507 | ||
1508 | The Webalizer supports CLF log formats, which should work for just | |
1509 | about everyone. If you want User Agent or Referrer information, you | |
1510 | need to make sure your web server supplies this information in its | |
1511 | log file, and in a format that the Webalizer can understand. While | |
1512 | The Webalizer will try to handle many of the subtle variations in | |
1513 | log formats, some will not work at all. Most web servers output | |
1514 | CLF format logs by default. For Apache, in order to produce the | |
1515 | proper log format, add the following to the httpd.conf file: | |
1516 | ||
1517 | LogFormat "%h %l %u %t \"%r\" %s %b \"%{Referer}i\" \"%{User-agent}i\"" | |
1518 | ||
1519 | This instructs the Apache web server to produce a 'combined' log | |
1520 | that includes the referrer and user agent information on the end of | |
1521 | each record, enclosed in quotes (This is the standard recommended | |
1522 | by both Apache and NCSA). Netscape and other web servers have | |
1523 | similar capabilities to alter their log formats. (note: the above | |
1524 | works for apache servers up to V1.2. V1.3 and higher now have additional | |
1525 | ways to specify log formats... refer to included documentation). | |
1526 | ||
1527 | Notes on FTP Log Files | |
1528 | ---------------------- | |
1529 | ||
1530 | The Webalizer supports ftp logs produced by wu-ftpd, proftpd and others, | |
1531 | as a standard 'xferlog'. To process an ftp log, you must either use the | |
1532 | -Ff command line option or have "LogType ftp" in your configuration file. | |
1533 | It is recommended that you create a separate configuration file for ftp | |
1534 | analysis, since the values used for your web server will most likely not | |
1535 | be suited for ftp log analysis (ie: page types, hostname, etc.. should | |
1536 | be different). | |
1537 | ||
1538 | Because of the difference in web and ftp logs, there are a few limitations: | |
1539 | ||
1540 | o Because there is no concept of a 'response code' in the ftp world, response | |
1541 | codes are restricted to either 200 (OK) or 206 (Partial Content), based | |
1542 | on the completion status found in xferlog (for wu-ftpd, 'i'=incomplete | |
1543 | and will generate a 206, 'c'=complete and will generate a 200). If your | |
1544 | ftp server doesn't supply the completion status, all requests will be | |
1545 | assigned a response code of 200. This allows the usage graph to display | |
1546 | all transfer requests (hits), and how many of those completed successfully | |
1547 | (files - ie: 200 response codes). | |
1548 | ||
1549 | o Page totals won't accurately reflect reality, since there isn't really | |
1550 | the concept of a 'page' with regard to ftp services. I have found that | |
1551 | setting the PageType value to "README", "FIRST", etc... seems to work | |
1552 | fairly well however, and will give a pretty good indication of how | |
1553 | many 'non-binary' files were requested. Of course, the content of your | |
1554 | ftp site will be different, so your results may vary. | |
1555 | ||
1556 | o Visit totals also won't accurately reflect reality, since visits are | |
1557 | triggered on PageType requests (see above). What you usually wind up | |
1558 | with is visits=sites in most cases. | |
1559 | ||
1560 | o Entry/Exit pages will not be calculated for ftp logs. | |
1561 | ||
1562 | o For obvious reasons, referrers and user agents are not supported. | |
1563 | ||
1564 | o You _cannot_ analyze both web and ftp logs at the same time; they must | |
1565 | be done separately in different runs. | |
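
A separate configuration file for ftp analysis, as recommended above,
might start out something like the sketch below. The host name is a
placeholder, and the PageType values are only the ones suggested in the
notes; the right choices depend on what your ftp site actually holds.

```
# Treat the input as a wu-ftpd style xferlog
LogType   ftp

# Use the ftp server's name in the report, not the web server's
HostName  ftp.example.com

# Count requests for these 'non-binary' files as pages
PageType  README
PageType  FIRST
```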
1566 | ||
1567 | ||
1568 | Notes on Referrers | |
1569 | ------------------ | |
1570 | ||
1571 | Referrers are weird critters... They take many shapes and forms, which makes | |
1572 | them much harder to analyze than a typical URL, which at least has some | |
1573 | standardization. What is contained in the referrer field of your log | |
1574 | files varies depending on many factors, such as what site did the referral, | |
1575 | what type of system it comes from and how the actual referral was generated. | |
1576 | Why is this? Well, because a user can get to your site in many ways... They | |
1577 | may have your site bookmarked in their browser, they may simply type your | |
1578 | site's URL into their browser, they could have clicked on a link on some | |
1579 | remote web page or they may have found your site from one of the many search | |
1580 | engines and site indexes found on the web. The Webalizer attempts to deal | |
1581 | with all this variation in an intelligent way by doing certain things to | |
1582 | the referrer string which makes it easier to analyze. Of course, if your | |
1583 | web server doesn't provide referrer information, you probably don't really | |
1584 | care and are asking yourself why you are reading this section... | |
1585 | ||
1586 | Most referrers will take the form of "http://somesite.com/somepage.html", | |
1587 | which is what you will get if the user clicks on a link somewhere on the | |
1588 | web in order to get to your site. Some will be a variation of this, and | |
1589 | look something like "file:/some/such/sillyname", which is a reference from | |
1590 | an HTML document on the user's local machine. Several variations of this can | |
1591 | be used, depending on what type of system the user has, if he/she is on | |
1592 | a local network, the type of network, etc... To complicate things even | |
1593 | more, dynamic HTML documents and HTML documents that are generated by | |
1594 | CGI scripts or external programs produce lots of extra information which | |
1595 | is tacked on to the end of the referrer string in an almost infinite number | |
1596 | of ways. If the user just typed your URL into their browser or clicked on | |
1597 | a bookmark, there won't be any information in the referrer field, and it will | |
1598 | take the form "-". | |
1599 | ||
1600 | In order to handle all these variations, The Webalizer parses the referrer | |
1601 | field in a certain way. First, if the referrer string begins with "http", | |
1602 | it assumes it is a normal referral and converts the "http://" and following | |
1603 | hostname to lowercase in order to simplify hiding if desired. For example, | |
1604 | the referrer "HTTP://WWW.MyHost.Com/This/Is/A/HTML/Document.html" will become | |
1605 | "http://www.myhost.com/This/Is/A/HTML/Document.html". Notice that only the | |
1606 | "http://" and hostname are converted to lower case... The rest of the | |
1607 | referrer field is left alone. This follows standard convention, as the | |
1608 | actual method (HTTP) and hostname are always case insensitive, while the | |
1609 | document name portion is case sensitive. | |
1610 | ||
1611 | Referrers that came from search engines, dynamic HTML documents, CGI | |
1612 | scripts and other external programs usually tack on additional information | |
1613 | that was used to create the page. A common example of this can be found | |
1614 | in referrals that come from search engines and site indexes common on the | |
1615 | web. Sometimes, these referrer URLs can be several hundred characters | |
1616 | long and include all the information that the user typed in to search for | |
1617 | your site. The Webalizer deals with this type of referrer by stripping | |
1618 | off all the query information, which starts with a question mark '?'. | |
1619 | The Referrer "http://search.yahoo.com/search?p=usa%26global%26link" will | |
1620 | be converted to just "http://search.yahoo.com/search". | |
1621 | ||
1622 | When a user comes to your site by using one of their bookmarks or by | |
1623 | typing in your URL directly into their browser, the referrer field is | |
1624 | blank, and looks like "-". Most sites will get more of these referrals | |
1625 | than any other type. The Webalizer converts this type of referral into | |
1626 | the string "- (Direct Request)". This is done in order to make it easier | |
1627 | to hide via a command line option or configuration file option. This is | |
1628 | because the character "-" is a valid character elsewhere in a referrer | |
1629 | field, and if not turned into something unique, could not be hidden without | |
1630 | possibly hiding other referrers that shouldn't be. | |
1631 | ||
1632 | ||
1633 | Notes on Character Escaping | |
1634 | --------------------------- | |
1635 | ||
1636 | The HTTP protocol defines certain ways that URLs can look and behave. To | |
1637 | some extent, referrer fields follow most of the same conventions. Character | |
1638 | escaping is a technique by which non-printable or other non-ASCII (and even | |
1639 | some ASCII) characters can be used in a URL. This is done by placing the | |
1640 | Hexadecimal value of the character in the URL, preceded by a percent sign '%'. | |
1641 | Since Hex values are made up of ASCII characters, any character can be | |
1642 | escaped to ensure only printable ASCII characters are present in the URL. | |
1643 | Some systems take this concept to the extreme and escape all sorts of stuff, | |
1644 | even characters that don't need to be escaped. To deal with this, The | |
1645 | Webalizer will un-escape URLs and referrers before they are processed. For | |
1646 | example, the URL "/www.webalizer.org/%7Efoo/bar.html" is the same URL as | |
1647 | "/www.webalizer.org/~foo/bar.html", a very common form of a URL to access | |
1648 | users' web pages. If the URLs were not un-escaped, they would be treated as | |
1649 | two separate documents, even though they are really one and the same. | |
1650 | ||
1651 | ||
1652 | Search String Analysis | |
1653 | ---------------------- | |
1654 | ||
1655 | The Webalizer will do a minimal analysis on referrer strings that | |
1656 | it finds, looking for well known search string patterns. Most of | |
1657 | the major search engines are supported, such as Yahoo!, Altavista, | |
1658 | Lycos, etc... Unfortunately, search engines are always changing | |
1659 | their internal/CGI query formats, new search engines are coming on | |
1660 | line every day, and the ability to detect _all_ search strings is | |
1661 | nearly impossible. However, it should be accurate enough to give | |
1662 | a good indication of what users were searching for when they stumbled | |
1663 | across your site. Note: as of version 1.31, search engines can now | |
1664 | be specified within a configuration file. See the sample.conf file | |
1665 | for examples of how to specify additional search engines. | |
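
As a sketch only, and assuming the SearchEngine keyword takes the form
used in the sample.conf file (a string to match in the referrer's host
name, followed by the query variable that carries the search terms), an
entry might look like the following. The second engine is hypothetical;
check sample.conf for the exact syntax supported by your version.

```
# Match string       Query variable holding the search terms
SearchEngine  yahoo.com            p=
SearchEngine  search.example.com   q=
```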
1666 | ||
1667 | ||
1668 | ||
1669 | Notes on Visits/Entry/Exit Figures | |
1670 | ---------------------------------- | |
1671 | ||
1672 | The majority of data analyzed and reported on by The Webalizer is | |
1673 | as accurate and correct as possible based on the input log file. | |
1674 | However, due to the limitation of the HTTP protocol, the use of | |
1675 | firewalls, proxy servers, multi-user systems, the rotation of your | |
1676 | log files, and a myriad of other conditions, some of these numbers | |
1677 | cannot be calculated with absolute accuracy. In particular, | |
1678 | Visits, Entry Pages and Exit Pages are subject to random errors | |
1679 | due to the above and other conditions. The reason for this is | |
1680 | twofold, 1) Log files are finite in size and time interval, and | |
1681 | 2) There is no way to distinguish multiple individual users apart | |
1682 | given only an IP address. Because log files are finite, they have | |
1683 | a beginning and ending, which can be represented as a fixed time | |
1684 | period. There is no way of knowing what happened previous to this | |
1685 | time period, nor is it possible to predict future events based on | |
1686 | it. Also, because it is impossible to distinguish individual users | |
1687 | apart, multiple users that have the same IP address all appear to | |
1688 | be a single user, and are treated as such. This is most common where | |
1689 | corporate users sit behind a proxy/firewall to the outside world, | |
1690 | and all requests appear to come from the same location (the address | |
1691 | of the proxy/firewall itself). Dynamic IP assignment (used with | |
1692 | dial-up Internet accounts) also presents a problem, since the same | |
1693 | user will appear to come from multiple places. | |
1694 | ||
1695 | For example, suppose two users visit your server from XYZ company, | |
1696 | which has its network connected to the Internet by a proxy server | |
1697 | 'fw.xyz.com'. All requests from the network look as though they | |
1698 | originated from 'fw.xyz.com', even though they were really initiated | |
1699 | from two separate users on different PCs. The Webalizer would | |
1700 | see these requests as from the same location, and would record only | |
1701 | 1 visit, when in reality, there were two. Because entry and exit | |
1702 | pages are calculated in conjunction with visits, this situation | |
1703 | would also only record 1 entry and 1 exit page, when in reality, | |
1704 | there should be 2. | |
1705 | ||
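As a rough illustration of the proxy example above, the following shell
sketch counts visits from a simplified list of '<seconds> <address>'
records. It is not The Webalizer's actual code: the count_visits name,
the input format and the 1800 second (30 minute) visit timeout are all
assumptions made for the example.

```shell
#!/bin/sh
# Sketch only. Reads lines of the form "<epoch-seconds> <address>" on stdin.
# A request starts a new visit when its address has not been seen before,
# or was last seen more than TIMEOUT seconds earlier (1800 is assumed).
count_visits() {
    awk -v timeout="${1:-1800}" '
        !($2 in last) || ($1 - last[$2]) > timeout { visits++ }
        { last[$2] = $1 }
        END { print visits + 0 }
    '
}

# Two people behind the same proxy; all four requests carry one address.
printf '%s\n' \
    '1000 fw.xyz.com' \
    '1010 fw.xyz.com' \
    '1100 fw.xyz.com' \
    '1130 fw.xyz.com' | count_visits
```

The four requests come from two people, but because they share the
address 'fw.xyz.com' the sketch reports a single visit.
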
1706 | As another example, say a single user at XYZ company is surfing | |
1707 | around your website. They arrive at 11:52pm the last day of | |
1708 | the month, and continue surfing until 12:30am, which is now a | |
1709 | new day (in a new month). Since a common practice is to rotate | |
1710 | (save then clear) the server logs at the end of the month, you | |
1711 | now have the user's visit logged in two different files (current | |
1712 | and previous months). Because of this (and the fact that the | |
1713 | Webalizer clears history between months), the first page the | |
1714 | user requests after midnight will be counted as an entry page. | |
1715 | This is unavoidable, since it is the first request seen from that | |
1716 | particular IP address in the new month. | |
1717 | ||
1718 | For the most part, the numbers shown for visits, entry and exit | |
1719 | pages are pretty good 'guesses', even though they may not be 100% | |
1720 | accurate. They do provide a good indication of overall trends, | |
1721 | and shouldn't be far enough off from the real numbers to matter much. | |
1722 | You should probably consider them as the 'minimum' amount possible, | |
1723 | since the actual (real) values should be equal to or greater than | |
1724 | those reported in most cases. | |
1725 | ||
1726 | ||
1727 | Exporting Webalizer Data | |
1728 | ------------------------ | |
1729 | ||
1730 | The Webalizer now has the ability to dump all object tables to tab | |
1731 | delimited ASCII text files, which can then be imported into most | |
1732 | popular database and spreadsheet programs. The files are not normally | |
1733 | produced, as on some sites they could become quite large, and are only | |
1734 | enabled by the use of the Dump* configuration keywords. The filename | |
1735 | extensions default to '.tab', but may be changed using the | |
1736 | 'DumpExtension' keyword. Since this data contains all items, even | |
1737 | those normally hidden, it may not be desirable to have them located | |
1738 | in the output directory where they may be visible to normal web users. | |
1739 | For this reason, the 'DumpPath' configuration keyword is available, | |
1740 | and allows the placement of these files somewhere outside the normal | |
1741 | web server document tree. An optional 'header' record may be written | |
1742 | to these files as well, and is useful when the data is to be imported | |
1743 | into a spreadsheet; databases will not normally need the header. If | |
1744 | enabled, the header is simply the column names as the first record of | |
1745 | the file, tab separated. | |
1746 | ||
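Once a dump file exists, ordinary text tools can read it. The sketch
below lists the busiest records from a tab delimited file that was
written with the header record enabled. The top_rows name, the sample
data and the column layout (a numeric count in the first column) are
assumptions for illustration only; check the header of your own dump
files for the real column order.

```shell
#!/bin/sh
# top_rows FILE [N]
# Skip the header record, sort on the first (numeric) column in
# descending order, and print the first N records (default 10).
top_rows() {
    tail -n +2 "$1" | sort -t "$(printf '\t')" -k1,1nr | head -n "${2:-10}"
}

# Demonstration with made-up data in a temporary file.
demo=$(mktemp)
printf 'Hits\tURL\n12\t/index.html\n97\t/docs/\n5\t/faq.html\n' > "$demo"
top_rows "$demo" 2
rm -f "$demo"
```

Because the files are plain tab separated text, the same data can be
loaded by a spreadsheet or database import without any conversion.
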
1747 | ||
1748 | Log files and The Webalizer | |
1749 | --------------------------- | |
1750 | ||
1751 | Most sites will choose to have The Webalizer run from cron at specified | |
1752 | intervals. Care should be taken to ensure that data is not lost as a | |
1753 | result of log file rotations. A suggested practice is to rotate your | |
1754 | web server logs at the end of each month as close to midnight as possible, | |
1755 | then have The Webalizer process the 'end of month' log file before running | |
1756 | statistics on the new, current log. On our systems, a shell script called | |
1757 | 'rotate_logs' is run at midnight at the end of each month. This script file | |
1758 | looks like: | |
1759 | ||
1760 | ------------------------- file: rotate_logs ------------------------------ | |
1761 | #!/bin/sh | |
1762 | ||
1763 | # halt the server | |
1764 | kill `cat /var/lib/httpd/logs/httpd.pid` | |
1765 | ||
1766 | # define backup names | |
1767 | OLD_ACCESS_LOG=/var/lib/httpd/logs/old/access_log.`date +%y%m%d-%H%M%S` | |
1768 | OLD_ERROR_LOG=/var/lib/httpd/logs/old/error_log.`date +%y%m%d-%H%M%S` | |
1769 | ||
1770 | # make end of month copy for analyzer | |
1771 | cp /var/lib/httpd/logs/access_log /var/lib/httpd/logs/access_log.backup | |
1772 | ||
1773 | # move files to archive directory | |
1774 | mv /var/lib/httpd/logs/access_log "$OLD_ACCESS_LOG" | |
1775 | mv /var/lib/httpd/logs/error_log "$OLD_ERROR_LOG" | |
1776 | ||
1777 | # restart web server | |
1778 | /usr/sbin/httpd | |
1779 | ||
1780 | # compress the archived files | |
1781 | /bin/gzip "$OLD_ACCESS_LOG" | |
1782 | /bin/gzip "$OLD_ERROR_LOG" | |
1783 | ------------------------- end of file ------------------------------------ | |
1784 | ||
1785 | This script first stops the web server using a 'kill' command. Apache | |
1786 | keeps the PID of the server in the file httpd.pid, so we use it as the | |
1787 | argument for the kill. Next, it defines some names for the backup files, | |
1788 | which are basically the name of the files with the date and time appended | |
1789 | to the end of them. It then makes a copy of the access log with '.backup' | |
1790 | appended in the log directory, moves the current log files to an archive | |
1791 | directory (/var/lib/httpd/logs/old) and restarts the server. This setup | |
1792 | allows the web server to be down for the minimum amount of time needed, | |
1793 | which is important for busy sites. If you don't want to stop the server, | |
1794 | you can remove the initial 'kill' command, and replace the '/usr/sbin/httpd' | |
1795 | line with a "kill -1 `cat /var/lib/httpd/logs/httpd.pid`" command instead. | |
1796 | On most web servers, this will cause a restart of the server and create | |
1797 | the new log files in the process. | |
1798 | ||
1799 | At this point, we have made copies of the previous month's logs, the web | |
1800 | server is going about its business as usual, and we have all the time in | |
1801 | the world to do any additional processing we want. The last two | |
1802 | lines of the script compress the archived logs using the GNU zip program | |
1803 | (gzip). Remember, we still have a copy of the log which we can now run | |
1804 | The Webalizer on without having to do any further processing. | |
1805 | ||
1806 | Next, we define two crontab entries. The first runs the above 'rotate_logs' | |
1807 | script at midnight at the end of the month. The second runs The Webalizer | |
1808 | on the '.backup' log file created above at 5 minutes after midnight. This | |
1809 | gives other end of month processing jobs a chance to run so we don't bog | |
1810 | the system down too much. If you have lots of end of month stuff going on, | |
1811 | you can change the timing to suit your needs. The crontab entries look | |
1812 | something like: | |
1813 | ||
1814 | ------------------------- crontab entries -------------------------------- | |
1815 | # Rotate web server logs and run monthly analysis | |
1816 | 0 0 1 * * /usr/local/adm/rotate_logs | |
1817 | 5 0 1 * * /usr/bin/webalizer -Q /var/lib/httpd/logs/access_log.backup | |
1818 | ------------------------- end of crontab --------------------------------- | |
1819 | ||
1820 | As you can see, the log rotations occur at midnight, and the analysis | |
1821 | is done at 5 minutes after. Once you verify that The Webalizer ran | |
1822 | successfully, the access_log.backup file can be deleted as it isn't | |
1823 | needed any more. If you need to re-run the analysis, you still have | |
1824 | the compressed archive copy that the shell script created. In order | |
1825 | for the above analysis to work properly, you should have already | |
1826 | created an /etc/webalizer.conf configuration file suitable for your | |
1827 | site, or otherwise specify configuration options or a configuration | |
1828 | file on the crontab command line above. | |
1829 | ||
1830 | If you want The Webalizer to be run more often than once a month, you | |
1831 | can specify additional crontab entries to do this as well. Care should | |
1832 | be taken however to ensure that The Webalizer is not running when the | |
1833 | end of month processing above occurs, or unpredictable results may | |
1834 | happen (such as an inability to rotate the logs due to a file lock). | |
1835 | The easiest way is to run it on the half hour with a crontab entry like: | |
1836 | ||
1837 | 30 * * * * /usr/bin/webalizer | |
1838 | ||
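If the half-hourly run could still overlap the end of month job on your
system, one way to keep the two apart is a simple lock. The following is
a minimal sketch and not something shipped with The Webalizer: the
run_locked name and the lock directory are hypothetical, and it relies
on mkdir failing when the directory already exists.

```shell
#!/bin/sh
# run_locked LOCKDIR COMMAND [ARGS...]
# mkdir either creates LOCKDIR or fails, so only one caller at a time
# gets to run its command. If the lock is already held, the command is
# skipped and a non-zero status is returned.
# Note: a killed job leaves LOCKDIR behind; remove it by hand in that case.
run_locked() {
    lockdir=$1
    shift
    if mkdir "$lockdir" 2>/dev/null; then
        "$@"
        status=$?
        rmdir "$lockdir"
        return "$status"
    fi
    echo "lock $lockdir is held, skipping: $*" >&2
    return 1
}

# Demonstration with a harmless command in place of the real job.
run_locked "${TMPDIR:-/tmp}/webalizer-demo.lock.$$" echo "analysis would run here"
```

Both the 'rotate_logs' script and the half-hourly analysis would need to
go through the same lock directory for this to have any effect.
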
1839 | ||
1840 | Reverse DNS Lookups | |
1841 | ------------------- | |
1842 | ||
1843 | The Webalizer fully supports both IPv4 and IPv6 DNS lookups, and | |
1844 | maintains a cache of those lookups to reduce processing the same | |
1845 | addresses in subsequent runs. The cache file can be created at | |
1846 | run-time, or may be created before running the webalizer using either | |
1847 | the stand alone 'webazolver' program, or The Webalizer (DNS) Cache | |
1848 | file Manager program 'wcmgr'. In order to perform reverse lookups, | |
1849 | a DNS Cache file must be specified, either on the command line or in | |
1850 | a configuration file. In order to create/update the cache file at | |
1851 | run-time, the number of DNS Children must also be specified, and can | |
1852 | be anything between 1 and 100. This specifies the number of child | |
1853 | processes to be forked, each of which will perform network DNS | |
1854 | queries in order to look up the addresses and update the cache. | |
1855 | Cached entries that are older than a specified TTL (time to live) | |
1856 | will be expired, and if encountered again in a log, will be looked | |
1857 | up at that time in order to 'freshen' them (verify the name is still | |
1858 | the same and update its timestamp). The default TTL is 7 days, but it | |
1859 | may be set to anything between 1 and 100 days. Using the 'wcmgr' | |
1860 | program, entries may also be marked as 'permanent', in which case | |
1861 | they will persist (with an infinite TTL) in the cache until manually | |
1862 | removed. See the file DNS.README for additional information. | |
1863 | ||
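As an illustration of the command line form, the sketch below builds
(but does not run) a webalizer command that names a DNS cache file with
-D and a number of DNS children with -N, after checking that the number
falls in the 1 to 100 range described above. The dns_command helper and
the 'dns_cache.db' file name are made up for the example.

```shell
#!/bin/sh
# dns_command CACHE_FILE CHILDREN LOG_FILE
# Prints a webalizer command line that enables run-time reverse lookups.
# CHILDREN must be a whole number between 1 and 100.
dns_command() {
    cache=$1
    children=$2
    log=$3
    case $children in
        ''|*[!0-9]*)
            echo "children must be a number" >&2
            return 1 ;;
    esac
    if [ "$children" -lt 1 ] || [ "$children" -gt 100 ]; then
        echo "children must be between 1 and 100" >&2
        return 1
    fi
    echo "webalizer -D $cache -N $children $log"
}

dns_command dns_cache.db 10 /var/lib/httpd/logs/access_log
```

The same two settings can be given in a configuration file instead; see
the sample.conf file for the keyword names.
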
1864 | ||
1865 | Geolocation Lookups | |
1866 | ------------------- | |
1867 | ||
1868 | The Webalizer has the ability to perform geolocation lookups on IP | |
1869 | addresses using either its own internal GeoDB database or optionally | |
1870 | the GeoIP database from MaxMind, Inc. (www.maxmind.com). If used, | |
1871 | unresolved addresses will be searched for in the database and their | |
1872 | country of origin will be returned if found. This actually produces | |
1873 | more accurate Country information than DNS lookups, since the DNS | |
1874 | address space has additional gTLDs that do not necessarily map to | |
1875 | a specific country (such as '.net' and '.com'). It is possible to | |
1876 | use both DNS lookups and geolocation lookups at the same time, which | |
1877 | will cause any addresses that could not be resolved using DNS lookups | |
1878 | to then be looked up in the database, greatly reducing the number of | |
1879 | 'Unknown/Unresolved' entries in the generated reports. The native | |
1880 | GeoDB geolocation database provided by The Webalizer fully supports | |
1881 | IPv4 and IPv6 lookups, is updated regularly, and is the preferred | |
1882 | geolocation method for use with The Webalizer. The most current | |
1883 | version of the database can be obtained from our ftp site. | |
1884 | ||
1885 | ||
1886 | Language Support | |
1887 | ---------------- | |
1888 | ||
1889 | Version 1.0x of The Webalizer added language support. This | |
1890 | support is only provided at compile time in the form of an | |
1891 | include file containing all the strings used by The Webalizer. | |
1892 | The source distribution contains all language files that were | |
1893 | available at the time, with English being the default as | |
1894 | that is the only human language I speak fluently, and me | |
1895 | Espanol es muy malo. Several people have already indicated | |
1896 | the desire to do translations into various languages, and as | |
1897 | I receive the language files, I will make them available via | |
1898 | ftp at ftp://ftp.mrunix.net/pub/webalizer/lang. Unless there | |
1899 | happens to be a binary distribution in the language you need, | |
1900 | you will need to grab the source distribution and compile the | |
1901 | program yourself. See the file INSTALL that comes in the source | |
1902 | distribution for information on how to use a language other than | |
1903 | English. | |
1904 | ||
1905 | It should also be noted that the GD graphics library, used to | |
1906 | produce the in-line graphics in the output HTML, doesn't | |
1907 | support extended character sets, so if you are translating | |
1908 | the language file, you will no doubt encounter this problem. | |
1909 | ||
1910 | New: You can now specify the language to use when you are building | |
1911 | the program from source, using the configure script. Just add | |
1912 | --with-language=language_name , where 'language_name' is the | |
1913 | name of a valid language file in the /lang/ directory. For | |
1914 | example, --with-language=french will build using French as | |
1915 | the default language. You should consult the INSTALL file | |
1916 | for additional information on building the program from source. | |
1917 | ||
1918 | ||
1919 | Known Issues | |
1920 | ------------ | |
1921 | ||
1922 | o Memory Usage. The Webalizer makes liberal use of memory for internal | |
1923 | data structures during analysis. Lack of real physical memory will | |
1924 | noticeably degrade performance by doing lots of swapping between memory | |
1925 | and disk. One user who had a rather large log file noticed that The | |
1926 | Webalizer took over 7 hours to run with only 16 Meg of memory. Once | |
1927 | memory was increased, the time was reduced to a few minutes. | |
1928 | ||
1929 | ||
1930 | o Performance. The Hide*, Group*, Ignore*, Include* and IndexAlias | |
1931 | configuration options can cause a performance decrease if lots of | |
1932 | them are used. The reason for this is that every log record must | |
1933 | be scanned for each item in each list. For example, if you are | |
1934 | Hiding 20 objects, Grouping 20 more, and Ignoring 5, each record | |
1935 | is scanned, at most, 46 times (20+20+5 + an IndexAlias scan). | |
1936 | On really large log files, this can have a profound impact. It | |
1937 | is recommended that you use as few of these configuration | |
1938 | options as you can, as it will greatly improve performance. | |
1939 | ||
1940 | ||
1941 | Final Notes | |
1942 | ----------- | |
1943 | ||
1944 | A lot of time and effort went into making The Webalizer and ensuring that | |
1945 | the results are as accurate as possible. If you find any abnormalities or | |
1946 | inconsistent results, bugs, errors, omissions or anything else that doesn't | |
1947 | look right, please let me know so I can investigate the problem or correct | |
1948 | the error. This goes for the minimal documentation as well. Suggestions | |
1949 | for future versions are also welcome and appreciated. |