Included is a (modified) version of the core script of our setup.

Features of this setup:
- logging through cronolog solves log rotation problems.
- log files are compressed after processing to save storage.
- fast resolving via jdresolve
- daily processing
- if a daily run is missed (due to an outage, for example), the next run
  catches up automatically.
- optional vhost support:
  - new vhosts are added automatically
  - vhost configurations can be customized (ignore_referrer and such)

Not solved in this setup:
- can't (yet) merge logfiles from multiple web servers
- can only handle as many vhosts as file descriptors (no fd pool yet)
- can't handle more-than-daily log rotation yet. It should be easy to
  adapt the scripts for this, though.
- bzip2 compression instead of gzip ;)
- log splitting is done as root right now. Only modlogan is run as a
  different user via sudo.
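
To illustrate the missing fd pool mentioned above: the sketch below keeps a
bounded number of vhost log files open and closes the least recently used one
when the limit is reached. It is written in Python for brevity and is not part
of modlogrun.pl; the class name HandlePool and the default limit of 64 are
assumptions made for this example.

```python
from collections import OrderedDict


class HandlePool:
    """Keep at most `limit` files open; reopen in append mode on demand.

    Hypothetical helper, not taken from modlogrun.pl.
    """

    def __init__(self, limit=64):
        self.limit = limit
        self.handles = OrderedDict()  # path -> open file, least recent first

    def write(self, path, text):
        fh = self.handles.pop(path, None)
        if fh is None:
            # Make room by closing the least recently used handle.
            if len(self.handles) >= self.limit:
                _, oldest = self.handles.popitem(last=False)
                oldest.close()
            fh = open(path, "a")
        self.handles[path] = fh  # re-insert as most recently used
        fh.write(text)

    def close(self):
        for fh in self.handles.values():
            fh.close()
        self.handles.clear()
```

Because evicted files are reopened in append mode, the number of vhosts is no
longer tied to the number of available file descriptors, at the cost of some
extra open/close calls.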

Preparation:
- Log files should be in "full" format (including vhost) if you want
  to use the vhost splitting feature.
- cronolog needs to be set up with the filename mask
  "%Y/%m/access.log-%Y.%m.%d" (or change the source accordingly). You need
  a directory for each month, because the script stops searching for older
  logs as soon as a month directory doesn't exist.
- old logs can be split accordingly with cronosplit (from cronolog)
- install jdresolve and set up a location for its database
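
As a quick check of the filename mask, the following sketch expands it for a
given day. cronolog uses strftime-style specifiers, so Python's strftime gives
the same result for %Y, %m and %d. The function name log_path and the log
directory /var/log/apache are assumptions made for this example.

```python
from datetime import date

# Filename mask from the preparation notes above.
MASK = "%Y/%m/access.log-%Y.%m.%d"


def log_path(logdir, day):
    """Return the path the mask expands to for the given day."""
    return f"{logdir}/{day.strftime(MASK)}"


print(log_path("/var/log/apache", date(2003, 7, 4)))
# prints /var/log/apache/2003/07/access.log-2003.07.04
```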

Processing procedure:
- Apache logs through cronolog to logdir/%Y/%m/access.log-%Y.%m.%d
- a cronjob runs each day and searches for unresolved log files.
- "access.log-*" files are resolved into "resolved.log-*"
- "access.log-*" are gzipped after resolving
- "resolved.log-*" files are (optionally) split into logfiles for vhosts
- new vhosts are added automatically by reproducing the top-level
  directory structure and using modlogan.conf.template for the config
- the newly generated "resolved.log-*" files are (for each vhost) run
  through modlogan to generate statistics
- after processing the "resolved.log-*" files are gzipped.
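
The search for unresolved log files, including the catch-up after missed runs,
can be sketched roughly as follows. This is a Python illustration of the
procedure described above, not the code from modlogrun.pl. It assumes that
every uncompressed "access.log-*" file is still unresolved and that the log of
the current day is skipped because it is still being written; the function
name pending_logs is made up for this example.

```python
import os
from datetime import date


def pending_logs(logdir, today):
    """Return uncompressed access logs older than today, oldest first."""
    current = f"access.log-{today:%Y.%m.%d}"  # assumed to be still in use
    year, month = today.year, today.month
    found = []
    while True:
        month_dir = os.path.join(logdir, f"{year:04d}", f"{month:02d}")
        if not os.path.isdir(month_dir):
            break  # stop at the first missing month directory
        for name in os.listdir(month_dir):
            if (name.startswith("access.log-")
                    and not name.endswith(".gz")
                    and name != current):
                found.append(os.path.join(month_dir, name))
        # Step back one month.
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    # Paths sort chronologically because of the YYYY/MM/...-YYYY.MM.DD layout.
    return sorted(found)
```

A gap in the month directories ends the search, which is why each month needs
its own directory even if it holds no logs.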

Directory structure (basic, can be adapted):
logdir/200x/xx/access.log-200x.xx.xx       unresolved logfile
logdir/200x/xx/access.log-200x.xx.xx.gz    processed unresolved logfile
logdir/200x/xx/resolved.log-200x.xx.xx     resolved, unprocessed logfile
logdir/200x/xx/resolved.log-200x.xx.xx.gz  resolved + processed logfile
logdir/modlogan.conf                       modlogan configuration file
logdir/modlogan/                           modlogan output
logdir/modlogan-state/                     modlogan state files

Optional:
logdir/vhosts/modlogan.conf.template       configuration template
logdir/vhosts/hostname/200x/xx/resolved*   logfiles for this vhost
logdir/vhosts/hostname/modlogan.conf       modlogan configuration
logdir/vhosts/hostname/modlogan/           modlogan vhost output
logdir/vhosts/hostname/modlogan-state/     modlogan state files
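
To show how a resolved log line maps onto the vhost layout above, here is a
small Python sketch. It is not taken from modlogrun.pl and assumes that the
vhost name is the first whitespace-separated field of each line in the "full"
log format; the function name vhost_target is made up for this example.

```python
import os


def vhost_target(logdir, resolved_path, line):
    """Return the per-vhost file a resolved log line would be appended to.

    Returns None for empty lines or suspicious vhost names.
    """
    fields = line.split(None, 1)
    if not fields:
        return None
    vhost = fields[0].lower()  # assumption: vhost is the first field
    if "/" in vhost or vhost.startswith("."):
        return None  # never let a log line choose a path outside vhosts/
    # Keep the year/month/filename part of the source path.
    rel = os.path.relpath(resolved_path, logdir)
    return os.path.join(logdir, "vhosts", vhost, rel)
```

For example, a line starting with "www.example.org" read from
logdir/2003/07/resolved.log-2003.07.03 ends up in
logdir/vhosts/www.example.org/2003/07/resolved.log-2003.07.03.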


Usage:
- adapt the path names in the first lines of the file. Make sure the userid
  that modlogan runs as exists and has write permissions where needed.
  Check the jdresolve database location as well.
- set up the "modlogan" and "modlogan-state" directories
- if you have vhosts: set up the "vhosts" dir and modlogan.conf.template
  (in the example template, adapt the include and output paths!)
- run "modlogrun.pl logdir" if you don't want vhost splitting or
- run "modlogrun.pl logdir vhosts" if you want vhost splitting.
- contribute improvements to this code.

Enjoy,
Erich Schubert <modlogan@vitavonni.de>
Drinsama IT Services GmbH, Ottobrunn, Germany
