Webalizer Configuration

Introduction

The Webalizer web stats package is useful in tracking usage of a web-site. The Webalizer is a fast, free web server log file analysis program available from http://www.mrunix.net/webalizer/. It produces highly detailed, easily configurable usage reports in HTML format, for viewing with a standard web browser.

For a general definition see http://en.wikipedia.org/wiki/Webalizer, although http://www.mrunix.net/webalizer/ is still the best place to go for full information.

The following guidance might be useful in helping to configure Webalizer on an Apache Web-server running on Linux.

My Webalizer Configuration

Because I have a number of virtual host names on the same physical server, each writing access/error information to its own log files, I want to be able to generate Webalizer output for each individual virtual web-server. After experimentation I arrived at the following set of configuration file(s), which allowed me to monitor a number of virtual servers on the same physical server.

To allow this I followed the general instructions in http://www.mrunix.net/webalizer/faq.html, point #17. A fragment of an individual webalizer.conf file in the /etc/webalizer directory is shown below (this is based on, but separate from, the /etc/webalizer.conf file):

# cd /etc/webalizer
# more PublicWebSite-webalizer.conf
#
# Sample Webalizer configuration file
# Copyright 1997-2000 by Bradford L. Barrett (brad@mrunix.net)
# ...
#
# 06/06/2007 add revised log for site access log
# LogFile      /var/log/httpd/access_log
LogFile        /var/log/httpd/PublicWebSite-access_log

# LogType defines the log type being processed.  Normally, the Webalizer
# expects a CLF or Combined web server log as input.  Using this option,
# you can process ftp logs as well (xferlog as produced by wu-ftp and
# others), or Squid native logs.  Values can be 'clf', 'ftp' or 'squid',
# with 'clf' the default.

#LogType        clf

# OutputDir is where you want to put the output files.  This
# should be a full path name, however relative ones might work as well.
# If no output directory is specified, the current directory will be used.

# 06/06/2007 add revised log for site access log
# OutputDir    /var/www/usage
OutputDir      /var/www/usage/PublicWebSite
...

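In fact I found that this did not quite do it for me. I had the following further modifications to make that were not mentioned in http://www.mrunix.net/webalizer/faq.html. Without these further additions I found that I was not getting the statistics generated as expected, even though I knew the log files for the individual virtual web servers were being updated correctly. As far as I can tell, giving these settings as bare file names keeps each virtual server's history and incremental data in its own OutputDir, rather than all sites sharing the files under /var/lib/webalizer.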

...
# 12/06/2007 revise to put history in current OutputDir
# HistoryName     /var/lib/webalizer/webalizer.hist
HistoryName     webalizer.hist
...
...
# 12/06/2007 revise to put in current OutputDir
# IncrementalName /var/lib/webalizer/webalizer.current
IncrementalName webalizer.current
...

Finally for good measure:

...
# 06/06/2007 revised
HostName       www.domain.name
...
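
The per-site edits above could also be scripted rather than made by hand for each virtual host. The following is only a minimal sketch, assuming the stock /etc/webalizer.conf is used as a template and that every site follows the same naming pattern as PublicWebSite above. The function name make_site_conf and its arguments are hypothetical and not part of Webalizer.

```shell
#!/bin/bash
# Sketch only: derive a per-site conf file from a template conf file.
# make_site_conf is a hypothetical helper, not part of Webalizer.
# Usage: make_site_conf TEMPLATE SITE HOSTNAME DEST_DIR
make_site_conf() {
    local template=$1 site=$2 host=$3 dest_dir=$4
    local out="$dest_dir/$site-webalizer.conf"

    {
        # Copy the template, dropping any active (uncommented) lines
        # for the five keywords that are overridden per site.
        grep -Ev '^[[:space:]]*(LogFile|OutputDir|HistoryName|IncrementalName|HostName)[[:space:]]' "$template"

        # Append the per-site values. The log and output paths follow
        # the layout used in this article (an assumption for other sites).
        printf 'LogFile         /var/log/httpd/%s-access_log\n' "$site"
        printf 'OutputDir       /var/www/usage/%s\n' "$site"
        printf 'HistoryName     webalizer.hist\n'
        printf 'IncrementalName webalizer.current\n'
        printf 'HostName        %s\n' "$host"
    } > "$out"
}
```

With the function loaded, a call such as "make_site_conf /etc/webalizer.conf PublicWebSite www.domain.name /etc/webalizer" would write /etc/webalizer/PublicWebSite-webalizer.conf. The output directory itself still has to be created separately.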

Regular Web-Stats Generation

On CentOS I found that there is a cron.daily job scheduled each night/morning at 04:02, which runs the script 00webalizer. However, I was getting no web stats generated as a result. It turned out that the logrotate job, carried out by the same cron.daily run, was rotating the log files before 00webalizer could finish generating the web stats HTML pages. So I left the default webalizer.conf file to generate its stats as per normal, but for all my virtual hosts I created a specific crontab job to run Webalizer against the bespoke conf files.

# crontab -l
...
# executing this here, rather than cron.daily, because logrotate is
# deleting (rotating) the files before they can be used
50 03 * * * /usr/local/etc/webalizer.sh
...
#

Where webalizer.sh is:

# more /usr/local/etc/webalizer.sh
#! /bin/bash
# update access statistics for the web site

# if [ -s /var/log/httpd/access_log ] ; then
#     /usr/bin/webalizer
# fi

# 07/06/2007 altered to include per-virtual host conf files in the sub-directory
# 07/06/2007 while still executing the original conf file as per above
# 10/06/2007 not working from cron.daily: logrotate is rotating the logs before they can be actioned

# run Webalizer once per conf file; quote $i so unusual file names are safe
for i in /etc/webalizer/*.conf; do
    echo "$i"
    /usr/bin/webalizer -c "$i"
done

exit 0
#

This did the trick! All (seven) virtual host log files were individually analysed and separate web-stats pages made available.
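
The "only run if the log is non-empty" test that is commented out in the script above can be carried over to the per-site loop. The following is a minimal sketch, assuming each conf file has one active LogFile line and that the log path contains no spaces. The helper names conf_value, run_site and run_all, and the WEBALIZER variable, are hypothetical and not part of Webalizer.

```shell
#!/bin/bash
# Sketch only: run Webalizer per conf file, skipping sites whose log
# file is missing or empty. Helper names are illustrative.

# Program to run; defaults to the path used in this article.
WEBALIZER=${WEBALIZER:-/usr/bin/webalizer}

# Print the value of the first active (uncommented) KEYWORD line in CONF.
conf_value() {
    local keyword=$1 conf=$2
    awk -v k="$keyword" '$1 == k { print $2; exit }' "$conf"
}

# Run Webalizer for one conf file if its log file exists and has content.
run_site() {
    local conf=$1 log
    log=$(conf_value LogFile "$conf")
    if [ -n "$log" ] && [ -s "$log" ]; then
        "$WEBALIZER" -c "$conf"
    else
        echo "skipping $conf: no usable log" >&2
    fi
}

# Process every *.conf file in the given directory.
run_all() {
    local conf
    for conf in "$1"/*.conf; do
        [ -e "$conf" ] || continue   # the glob matched nothing
        run_site "$conf"
    done
}
```

In webalizer.sh the loop would then be replaced by "run_all /etc/webalizer". A site whose log has just been rotated away is skipped with a message on stderr instead of being handed to Webalizer with an empty log.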


General Links

The following general links are useful references when setting up Webalizer:


URL                                 Summary/Description
http://www.mrunix.net/webalizer/    Home of the Webalizer free web server log file analysis program
http://www.mysql-apache-php.com/    Quick Linux Server Installation