Imported Upstream version 2.23.05
1 --------------------------------------------------------------------
2 2.23-xx changes from 2.21-xx
3 --------------------------------------------------------------------
5 Fixes:
6 o Fix sporadic eol problem with some IIS/W3C logs
8 o Fix compiler directive syntax error (broke some 64 bit systems)
10 Changes/Additions:
11 o Modest speed improvements in hash table code
13 --------------------------------------------------------------------
14 2.21-xx changes from 2.20-xx
15 --------------------------------------------------------------------
17 Fixes:
18 o Added missing memory deallocation call in DNS lookup code.
20 o Minor fixes to configure script
22 Changes/Additions:
23 o Added "YearTotals" config option for main index page totals
25 o Rename local stricmp() function to ouricmp() to prevent name
26 conflict on systems that happen to provide it already.
28 --------------------------------------------------------------------
29 2.20-xx changes from 2.01-xx
30 --------------------------------------------------------------------
32 Fixes:
33 o Fixed problem with timing totals.
35 o Fixed referrer linking to avoid possible xss injection.
37 o Fixed month change detection error that caused incorrect report
38 dates when logs had a 'gap' longer than a year.
40 o Fixed buffer overrun possibility in parsing code and user agent
41 mangle logic.
43 o Added symbolic link checks for file I/O to prevent possible
44 privilege escalation exploits. Disallows reading from or writing
45 to any file that is a symlink. Thanks to Julien Danjou.
47 o Added code to preserve the history and incremental data files in
48 the event of a crash before writing to them completely. Thanks
49 to Robert Millan for the idea and initial code.
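The last two fixes above (symlink checks and crash-safe data files) can be sketched roughly as follows. This is an illustration in Python, not the program's own C code; the helper name safe_write is hypothetical, and it assumes a POSIX-like system where a rename within one directory replaces the old file in a single step.

```python
import os
import tempfile


def safe_write(path, data):
    """Write data to path without following symlinks or clobbering on failure.

    Hypothetical helper for illustration only.
    """
    # Refuse to touch anything that is a symbolic link.
    if os.path.islink(path):
        raise PermissionError("refusing to write through symlink: " + path)

    # Write to a temporary file in the same directory so the final
    # rename stays on one filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Replace the old file only once the new one is complete.
        os.replace(tmp_path, path)
    except BaseException:
        # Any failure leaves the previous file untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process dies before the rename, the previous history or incremental file is still intact; only the temporary file is lost.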
51 Changes/Additions:
52 o Added native geolocation services, which fully support both IPv4
53 and IPv6 lookups. Adds the configuration keywords 'GeoDB' and
54 'GeoDBDatabase' along with the '-j' and '-J' command line options.
56 o Added 'wcmgr', "The Webalizer (DNS) Cache file Manager" to the
57 distribution to provide cache file maintenance. See the supplied
58 man page for a description and usage information.
60 o Changed history code and main index page to allow for more than
61 12 months of reports to be displayed. Added the config keywords
62 'IndexMonths' (-K command line option), 'GraphMonths' (-k command
63 line option) and 'YearHeaders' to control how index is displayed.
65 o Changed Berkeley DB code to use current 4.x APIs.
67 o Added support for bzip2 compressed log files (.bz2) as a compile
68 time option (--enable-bz2). If enabled, bzipped files will be
69 decompressed automatically during processing.
71 o Added support for W3C formatted logs. Based on code submitted
72 by Klaus Reimer.
74 o Added GeoIP support as compile time option (--enable-geoip). Adds
75 'GeoIP' and 'GeoIPDatabase' config keywords, '-w' and '-W'
76 command line options.
78 o Added IPv6 support. Based on initial code by Jose Carlos Meneiros
79 and modified to support Solaris and other problematic platforms.
81 o Added 'CacheIPs' config option to allow saving unresolved addresses
82 in the DNS cache.
84 o Added 'CacheTTL' config option which allows the DNS cache time to
85 live (TTL) value to be specified at run-time.
87 o Added 'SearchCaseI' config option to specify if search strings
88 should be treated as case insensitive or not. The default value,
89 'yes', causes search strings to be treated as case insensitive.
91 o Added 'HTAccess' config option. Allows writing a default .htaccess
92 file to the output directory.
94 o Added ability to display flags in the top country table. Adds the
95 config keywords 'CountryFlags' and 'FlagDir', and -z command line
96 option.
98 o Added 'StripCGI' config option to configure how CGI variables on
99 the end of URLs are treated (can now be stripped or left in place).
101 o Added 'DefaultIndex' config option to enable/disable the use of
102 "index." as a default index name to be stripped from the end of URLs.
104 o Added 'TrimSquidURL' config option to allow squid log URLs to be
105 reduced in granularity by a user definable amount. Thanks to code
106 submitted by Stuart Gall.
108 o Added 'OmitPage' config option (and the '-O' command line switch)
109 to prevent specified URLs from being counted as pages even if they
110 otherwise would be. Thanks to code submitted by Adam Morton.
112 o Added 'IgnoreState' config option (and the -b command line switch)
113 to allow ignoring any existing incremental data file (similar to
114 the IgnoreHist/-i option).
116 o Changed logic to always generate summary report (index.html),
117 even if no records were processed.
119 o Added color support to allow changing graph colors. Based on the
120 Webalizer-usecolor code submitted by Benoit Rouits. Adds 11 new
121 config options, see the README file for complete descriptions.
123 o Added language 'lang=' specification in generated HTML files.
125 o Added 'LinkReferrer' config option to allow/disallow links in the
126 top referrers table.
128 o Added 'PagePrefix' config option to allow URL prefix matches to
129 be counted as pages, regardless of file extension or type. Thanks
130 to code submitted by Remco Van de Meent.
132 o Enabled large file support (LFS) to support logs greater than 2GB
133 in size on systems that support LFS. Also increased the size of
134 most internal counters to handle larger sites.
136 o Minor changes to generated HTML output
138 o Updated language files country codes for current IANA TLDs
140 o Changed the meaning of the -v command line switch. It now
141 causes verbose information to be displayed at run-time
142 (Informational and Debug messages).
144 o Changed Group* config options to allow a quoted string for
145 the match string. This allows spaces to be embedded in the
146 string.
148 o Changed log record parsing logic to allow spaces in URLs.
150 o Made configuration keywords, boolean configuration values
151 (yes/no), and log file types case insensitive. Also fixed
152 defaults for invalid values to reflect documented defaults.
154 o Changed configure script to use --sysconfdir to specify the
155 location of the default webalizer.conf configuration file.
156 Also added support for DESTDIR during install to aid binary
157 package builds.
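As a rough illustration of the 'StripCGI' and 'DefaultIndex' options described above, the sketch below shows one way such URL normalization could work. It is Python rather than the program's C, the function and parameter names are hypothetical, and the exact matching rules are assumptions, not taken from the source.

```python
def normalize_url(url, strip_cgi=True, default_index=True):
    """Return url with CGI variables and a trailing 'index.*' name removed.

    Illustrative only: the parameters mirror the StripCGI and
    DefaultIndex keywords, but the logic itself is an assumption.
    """
    # Separate the path from any CGI variables after the first '?'.
    path, sep, query = url.partition("?")

    if default_index:
        # Strip a final path component that starts with "index.".
        head, slash, tail = path.rpartition("/")
        if slash and tail.startswith("index."):
            path = head + "/"

    if strip_cgi or not sep:
        return path
    # CGI variables are left in place when stripping is disabled.
    return path + "?" + query
```

With both options enabled, "/docs/index.html?x=1" and "/docs/" are counted as the same URL.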
159 --------------------------------------------------------------------
160 2.01-xx changes from 1.30-04
161 --------------------------------------------------------------------
163 Fixes:
164 o Fix possible obscure buffer overflow bug in DNS resolver code
166 o Added additional extended character fixes
168 o Let code accept partial content response codes along with 200's
170 o Added code to catch blank hostnames (yes, they have been found!)
171 Will convert them into 'Unknown'
173 o Security fix for cross-site scripting vulnerability found by
174 Flavio Veloso
176 o Fixed a TOTAL_RC off by one error, which would prevent the last
177 response code from being saved when using incremental mode.
179 o Fixed possible segfault condition in MangleAgent code on
180 some malformed user agent names.
182 o Fixed DNS to prevent hangs on blank and malformed hostnames.
184 o Fixed problem calculating visits. Changed timestamps to use
185 seconds since epoch (1/1/1970) which results in more accurate
186 analysis. Also changed normal out of sequence code to handle
187 up to 1 hour of 'slop' in the timestamps. This changed the
188 semantics of the VisitTimeout and -m configuration options, as
189 the values are now specified in number of seconds.
191 o Fixed hostname lowercase problem (wasn't) when using DNS lookups.
193 o Fixed problem with incremental datafile which could cause a read
194 error under certain circumstances (removes control characters).
195 Also changed code to now abort on a read error.
197 o Fixed problem with hash table node creation where objects that
198 were exactly the maximum length would wind up leaving a garbage
199 byte at the end of the memory space allocated. This was causing
200 some very infrequent and widely different problems.
202 o Fixed problem where country graph could be produced incorrectly
203 if using a non-english language and the country name overlapped
204 the pie chart.
206 o Found and fixed a problem with a possible 32-bit wrap around
207 problem using incremental mode on large sites. The problem
208 would cause the KBytes data on large groups to become inaccurate.
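The visit calculation fix above (timestamps as seconds since the epoch, with the timeout given in seconds) can be pictured with a small sketch. This is Python for illustration only; the function name is hypothetical, and the rule that a visit starts when the gap exceeds the timeout is an assumption about the logic, not a copy of it.

```python
def count_visits(records, visit_timeout=1800):
    """Count visits from (hostname, epoch_seconds) pairs in log order.

    A host starts a new visit when more than visit_timeout seconds
    have passed since its previous request (assumed rule).
    """
    last_seen = {}  # hostname -> timestamp of its most recent request
    visits = 0
    for host, timestamp in records:
        previous = last_seen.get(host)
        if previous is None or timestamp - previous > visit_timeout:
            visits += 1
        last_seen[host] = timestamp
    return visits
```

Working in plain seconds makes the comparison a single subtraction, which is why the semantics of VisitTimeout changed with this fix.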
210 Changes/Additions:
211 o Modified configure to allow specification of the default config
212 directory. If not given, will use /etc (/etc/webalizer.conf).
214 o Added DailyGraph and DailyStats configuration options to enable
215 or disable the Daily usage graph and stats table from output.
217 o Improved visit calculation logic to reduce 'false' counts generated
218 by external image referrals.
220 o Added reverse DNS lookup capability. This adds the command
221 line switches -D and -N, and configuration keywords "DNSCache"
222 and "DNSChildren". See the DNS.README for additional info.
223 Based in part on code submitted by Henning P. Schmiedehausen.
226 o Added ability to dump Sites, URLs, Referrers, User Agents,
227 Usernames and Search Strings to tab delimited files, suitable
228 for import into most database and spreadsheet programs. The
229 location of this file may be specified using the "DumpPath"
230 configuration keyword, allowing the data to be kept someplace
231 outside the web server's document tree. The configuration
232 keywords "DumpSites", "DumpURLs", "DumpReferrers", "DumpAgents",
233 "DumpUsers" and "DumpSearchStr" have been added to control the
234 file dumps. Column headers can be included in the file with
235 the "DumpHeader" keyword. Dump filename extensions may be
236 specified using the "DumpExtension" keyword (default is .tab).
238 o Added username analysis, based on usernames found in the log,
239 and only available if username information is present in the
240 log (ie: http authentication or wu-ftpd xferlog). The keywords
241 'GroupUser', 'HideUser', 'IgnoreUser', 'IncludeUser', 'AllUsers',
242 and 'TopUsers' have been added to the configuration file code.
243 This change also modified the format of the incremental data file.
245 o Added the ability to display ALL sites, URLs, Referrers,
246 User Agents and Search Strings on a separate HTML page from
247 the normal statistics page. This adds the configuration
248 keywords 'AllSites', 'AllURLs', 'AllReferrers', 'AllAgents'
249 and 'AllSearchStr', which can have either a "yes" or "no"
250 value (default is "no"). Will add a "View All..." link to
251 the bottom of the appropriate "Top" table if enabled.
253 o Added support for squid proxy logs, thanks to code submitted
254 by Steinar H. Gunderson. To use
255 squid logs, specify a LogType of 'squid' in the configuration
256 file. This also changed the behaviour of the '-F' command
257 line switch, which now requires a second argument of either
258 'clf', 'ftp' or 'squid'.
260 o Completely modified the way the various TOP tables are handled
261 and sorted, which now allows extremely large top tables without
262 any performance degradation. Previously, tables greater than
263 a few hundred elements produced a noticeable performance penalty
264 during processing.
266 o Added the ability to group domains automatically and to hide
267 individual host names from the report, using the 'GroupDomains'
268 and 'HideAllSites' configuration keywords (-g and -X command
269 line options). Domain Grouping is configurable as to the level
270 of grouping (second level domain, third, etc...). HideAllSites
271 forces only grouped site records to be displayed if any. Based
272 on ideas/code by Michael Klemme. This changes
273 the behaviour of the '-g' switch, which previously was used to
274 force the use of GMT time for reports.
276 o Added user configurable search engine specification, used for
277 search string analysis. This adds the 'SearchEngine' keyword
278 in configuration files. Based on idea/code by Alexey Kizilov.
280 o Changed code to use the latest version of GD which supports PNG
281 images instead of GIF images. Also included changes in configure
282 script to ensure the presence of the libpng and libz libraries.
284 o Added ability to override log file to STDIN by use of '-' on
285 the command line.
287 o Added gzipped logfile support. The program will automatically
288 detect logfiles with a '.gz' extension and uncompress on the
289 fly. Uses gz file support of zlib, since it's required for
290 our gd/png stuff anyway. Please note that using gzipped logs
291 will incur a small performance penalty.
293 o Minor changes to search string code to increase accuracy. This
294 also removes a previous condition that would occasionally cause
295 search strings to incorrectly be counted twice or to be counted
296 as different search strings when only differing by a space.
298 o Minor changes to URL parse code to allow additional characters.
299 Also changed unescape code to properly handle extended chars.
301 o Major changes to hash table node format for reduced memory usage.
302 Instead of fixed size strings, the new format will dynamically
303 allocate string memory and use pointers to existing table data
304 under certain circumstances. The memory savings is significant
305 and will be greatly noticed with large sites. Because of these
306 changes, the formatting of the incremental data file had to be
307 changed, therefore it is incompatible with previous versions.
309 o Major code reorganization and cleanup. This was to facilitate
310 future development and make things more manageable.
312 o Usual documentation updates for new features/functions.
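To picture the 'GroupDomains' level setting described above, here is a minimal Python sketch. The function name is hypothetical, and the mapping of level 1 to second level domains, level 2 to third level, and so on is an assumption based on the wording above; numeric addresses are simply left alone.

```python
def group_domain(hostname, level=1):
    """Return the domain a hostname would be grouped under.

    Assumed mapping: level 1 keeps the last two labels (second level
    domain), level 2 the last three, and so on. Hostnames that cannot
    be grouped are returned unchanged.
    """
    labels = hostname.lower().split(".")
    keep = level + 1

    # Grouping disabled, or nothing left to strip from the front.
    if level < 1 or len(labels) <= keep:
        return hostname.lower()

    # Leave numeric IPv4 addresses alone; they have no domain part.
    if all(label.isdigit() for label in labels):
        return hostname

    return ".".join(labels[-keep:])
```

For example, "www.shop.example.com" groups under "example.com" at level 1 and under "shop.example.com" at level 2.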
314 --------------------------------------------------------------------
315 1.30-xx changes from 1.22-06
316 --------------------------------------------------------------------
318 Fixes:
319 o Fixed minor bug that would allow incorrect site totals for the
320 first day of the month under certain conditions.
322 Changes/Additions:
323 o Added Top Entry and Exit Page tables. Added configuration file
324 keywords TopEntry (-e command line) and TopExit (-E command line)
325 to specify the number of entries to display for each table. The
326 default for both is 10. See README for additional information.
328 o Added 'Group' labels. Allows display of a specified label for
329 grouped entries (in 'Top' tables). Based on patch submitted
330 by Oliver Graf. See sample.conf for
331 examples.
333 o Added 'Visits' totals. The length of time that constitutes a
334 'visit' can be set using the VisitTimeout configuration keyword
335 (-m command line option). The value must be given in HHMMSS
336 format; leading zeros may be omitted. Default is 30 minutes (3000).
338 o Added 'Pages' totals, based on user specified extensions. Changes
339 made to generated graphs as well. Configuration keyword PageType
340 (and command line -P switch) allows specification of extensions
341 to use (defaults to 'htm*' and 'cgi'). Also called "pageviews".
343 o Added Search String analysis. Keyword 'TopSearch' defines how
344 many of the top search strings to display. Default is 20. Can
345 be disabled by using zero (0).
347 o Added native support for ftp logs (xferlog ala wu-ftpd). Added
348 'LogType' configuration file keyword (-F command line option)
349 to specify log type. Values can be either 'web' or 'ftp', with
350 the default of 'web'.
352 o Changed graphs to handle pages and visits totals. Also added
353 color coded legends, which can be disabled using the GraphLegend
354 configuration keyword (-L command line option). Default is to
355 display them.
357 o Added background lines to graphs. Default is 2 lines, and can
358 be set to any number using the GraphLines configuration keyword
359 (-l command line option). Can use anywhere from none (0) to
360 twenty lines. They will be drawn in all but the country graph.
362 o Added CountryGraph configuration file keyword (-Y command line
363 option) to enable/disable display of country usage pie chart.
365 o Added FoldSeqErr keyword (-f command line option). Normally,
366 the program will ignore log records that are out of sequence
367 (chronological order). This option lets them be folded into
368 the analysis anyway, as if they were the same date/time as the
369 last good record. Apache users can safely ignore :)
371 o Added additional 'Top' tables for SITES and URLs, sorted by
372 KBytes instead of hits. Two new configuration file keywords,
373 TopKSites and TopKURLs, can be used to specify the number of
374 entries for each (zero to disable). Default for both is 10.
376 o Added additional calculations for max/avg files, pages, visits
377 and KBytes in monthly statistics.
379 o Updated generated HTML code to fully comply with the HTML 4.0
380 Transitional spec. DOCTYPE header reflects this change as well.
382 o Changed code to use 4 digit years in filenames. Purely for the
383 Y2K phobes who couldn't deal with only two digits (even though
384 it was _purely_ for humans, the program couldn't care less).
385 Unfortunately, this means that you will have to rename previous
386 month files to the new format. Not a big deal if you plan on
387 re-running all your logs to take advantage of the new features.
389 o Major changes to both history file and incremental file formats
390 to handle additional totals (pages/visits data). As a result,
391 this version is INCOMPATIBLE with previous versions. See the
392 file README.FIRST for important information on upgrading.
394 o Language files and documentation updated for new functions.
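As a worked example of the HHMMSS format used by VisitTimeout in this release, the Python sketch below converts such a value to seconds. The function name is hypothetical and the padding rule is an assumption drawn from the note that leading zeros can be omitted.

```python
def hhmmss_to_seconds(value):
    """Convert an HHMMSS value (leading zeros optional) to seconds."""
    # Restore any omitted leading zeros, e.g. "3000" -> "003000".
    digits = str(value).zfill(6)
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError("expected up to six digits in HHMMSS form")

    hours = int(digits[0:2])
    minutes = int(digits[2:4])
    seconds = int(digits[4:6])
    return hours * 3600 + minutes * 60 + seconds
```

The default of 3000 reads as 00:30:00, which is 30 minutes or 1800 seconds.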
396 --------------------------------------------------------------------
397 1.22-xx changes from 1.20-11
398 --------------------------------------------------------------------
400 Fixes:
401 o Fixed bug in country total generation. Caused country table
402 to show bogus entries if logs contain hostnames that were not
403 fully qualified (ie: don't have the domain name/TLD portion).
405 o Changed/fixed incremental data I/O routines to better detect and
406 handle error conditions. This involved some minor incremental
407 data file format changes as well. Fixes problem large sites were
408 having where random tables were getting munged.
410 o Fixed record parse code to better detect and strip query portion
411 from URLs and Referrer strings.
413 o Fixed segfault condition when more than MAX_CTRY entries were
414 specified for the "Top Countries" table.
416 Changes/Additions:
417 o Added code to detect negative byte transfer sizes in logs (another
418 netscape server kludge :) Could cause KByte xfer sizes to become
419 corrupt.
421 o Several small changes (mostly ifdef/endif's) to make code compile
422 clean 'out-of-the-box' across more platforms (ala SunOS). Also
423 added a GNU autoconf 'configure' script which helps a bit as well.
425 o Added Include* keywords. Allows forcing the inclusion of specified
426 log records. Takes precedence over counterpart Ignore* keywords.
428 o Added HTMLPre, HTMLBody, HTMLEnd and HTMLExtension keywords, and
429 changed behaviour of HTMLHead keyword. Previous versions need
430 only change the 'HTMLHead' keyword in existing files to 'HTMLBody'
431 to upgrade. Thanks to Colin Viebrock for
432 the idea and code examples.
434 o Changed mangle agent code to support Opera and other browsers.
435 Also updated response codes to IETF HTTP/1.1 Rev 6 draft.
436 Thanks to Yves Lafon for these.
438 o Added HistoryName and IncrementalName keywords, which allow the
439 specification of the history and incremental data filenames.
441 o Added UseHTTPS keyword, which allows using 'https://' instead
442 of 'http://' for links to URLs in the 'Top URLs' table. Also
443 added check for URLs that already have the protocol specified
444 (such as on virtual web and proxy servers), and to use unmodified
445 if found (will only force to lowercase for matching).
447 o Added code to ignore out-of-sequence log records.
449 o Added code to force hostnames to lowercase (was causing country skew).
451 o Disabled display of blank (zero hit) days at start of daily stat table.
453 o Added records per second calculation to timing totals.
455 o ALT= tags now use translated strings instead of forcing english.
457 o Updated documentation for new functions/features.
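The precedence of Include* over Ignore* keywords noted above can be sketched as follows. This is illustrative Python with a hypothetical function name, and it uses plain substring matching as a simplifying assumption; the real matching rules may differ.

```python
def should_process(value, include_patterns, ignore_patterns):
    """Decide whether a log record field passes the Include*/Ignore* filters.

    Substring matching is an assumption made to keep the sketch short.
    """
    # Include* takes precedence: a match forces the record in.
    if any(pattern in value for pattern in include_patterns):
        return True
    # Otherwise an Ignore* match drops the record.
    if any(pattern in value for pattern in ignore_patterns):
        return False
    # Records matching neither list are processed normally.
    return True
```

This lets a broad Ignore rule (for example a whole directory) be combined with a narrow Include rule for the few entries that should still be counted.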
459 --------------------------------------------------------------------
460 1.20-xx changes from 1.12-10
461 --------------------------------------------------------------------
463 Fixes:
464 o Modified record parse routine to not touch stuff between quotes
465 ("). Was causing problems parsing some malformed request fields.
467 o Fixed memory leak in MangleAgent code, and relocated to eliminate
468 un-necessary processing (causing segfault on some machines).
470 Changes/Additions:
471 o Changed transfer totals on host/url structures to support large
472 groupings (such as *.gif) on heavily hit servers. Hopefully, this
473 should cure the 32bit overflow problem large sites were having.
475 o Changed daily transfer totals to support transfers greater than
476 roughly 4.2 gigabytes a day.
478 o Added some missing HTML tags and altered the way totals are
479 calculated on the 'Top' tables (to correct for grouped records).
481 o Added incremental run capability (-p command line option or
482 "Incremental" configuration file keyword).
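The "roughly 4.2 gigabytes" figure above is the point where an unsigned 32-bit byte counter wraps (2^32 = 4,294,967,296). The short Python sketch below imitates that wrap; the function names are hypothetical and the code only models the arithmetic, not the program's actual counters.

```python
UINT32_MODULUS = 2 ** 32  # 4,294,967,296 bytes, roughly 4.2 gigabytes


def add_u32(total, nbytes):
    """Add to a counter that wraps like an unsigned 32-bit integer."""
    return (total + nbytes) % UINT32_MODULUS


def add_wide(total, nbytes):
    """Add to a counter wide enough that it does not wrap here."""
    return total + nbytes


def daily_total(transfers, add):
    """Sum a day's transfer sizes using the given counter behavior."""
    total = 0
    for nbytes in transfers:
        total = add(total, nbytes)
    return total
```

Five transfers of 1,000,000,000 bytes total 5,000,000,000 with the wide counter, but the 32-bit version silently reports 705,032,704.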
484 --------------------------------------------------------------------
485 1.1x-xx changes from 1.00-05
486 --------------------------------------------------------------------
488 Fixes:
489 o Re-wrote the Group* logic, fixing a bug that allowed hiding of
490 objects when they shouldn't be.
492 o Fixed broken IgnoreReferrer code.
494 o Modified config parse code to handle extended characters.
496 o Misc. minor bug fixes/changes. Added a missing fclose.
498 o Cleaned up generated HTML.
500 o Fixed duplicate warnings on large referrer fields.
502 o Fixed country table bug adding grouped records to totals.
504 Changes/Additions:
505 o Added GroupSite, GroupReferrer and GroupAgent keywords to round
506 out the Group* configuration options.
508 o Added GroupShading and GroupHighlight keywords to allow selective
509 highlight and shading on grouped rows in table.
511 o Removed the '-L' command line option. Groupings can now only
512 be specified from a configuration file. Language files changed
513 to reflect change.
515 o Added '-V' command line option (identical to '-v') for version.
517 o Added additional language support. Language files will be marked
518 /* New for 1.1 */ where changes have been made.
520 o Various rewrites to streamline the code, accommodate the new
521 group options and make things easier down the road when I implement
522 incremental (partial log) processing.
524 o Usual README and CHANGES documentation updates.
526 --------------------------------------------------------------------
527 1.00-xx changes from 0.99-06
528 --------------------------------------------------------------------
530 Fixes:
531 o Modify record parser so that spaces in usernames (auth field)
532 don't cause record to be skipped (w/'Bad Record' message).
534 o Included various error conditions that were being ignored in
535 the timing statistics ('bad records' value) totals.
537 Changes/Additions:
538 o Added GMTTime (-g) option to force display of timestamps in
539 GMT (UTC) time instead of local timezone.
541 o Added GroupURL (-L) option for grouping of URLs as if they
542 were a single object. See README for details.
544 o Language support in the form of a language specific header
545 file containing all strings used by The Webalizer. English
546 file is used by default unless changed. Support for other
547 languages will be distributed as I receive them.
549 --------------------------------------------------------------------
550 0.99-xx changes from 0.98-16
551 --------------------------------------------------------------------
553 0.99 is mostly a bug-fix release, with a few added extra goodies.
555 Fixes:
556 o Fixed monthly total transfer size (silent) overflow problem.
558 o Fixed the numerous fprintf format errors. Only seemed to wreak havoc
559 on non-intel machines though.
561 o Fixed core dump condition on certain machines when using stdin for
562 input.
564 o Fixed floating point code that caused divide by zero errors on some
565 platforms (most noticeably on SCO OpenServer).
567 o Netscape server kludges: Added code to deal with Netscape log header
568 record gracefully. Also added workaround for timestamp error where
569 Netscape sometimes makes a day have 0-24 hours instead of 0-23. The
570 Webalizer will now treat anything greater than 23 as 0.
572 o Resized some fixed field sizes to gain memory usage improvements.
574 Changes/Additions:
575 o Ignore* config keywords added. This allows you to completely ignore
576 certain log records based on site name, URL, user agent or referrer.
577 * Use will cause inaccurate statistics results. See documentation.
579 o ReallyQuiet config keyword (-Q command line option) added. Causes
580 The Webalizer to suppress _all_ messages. Useful for cron jobs.
582 o Removed the "Sites" total at the bottom of the summary by month.
583 The total for sites is a useless number and produces a misleadingly
584 high value which detracts from the accuracy of the other totals.
586 o Updated README and CHANGES