# $Id: README,v 1.39 2002/05/01 21:06:32 jerome Exp $
ScanErrLog v2.01 - May 1st 2002
(C) Jerome Alet <alet@librelogiciel.com> 2000-2002
You're welcome to redistribute this software under the
terms of the GNU General Public Licence version 2.0
or, at your option, any later version.

You can read the complete GNU GPL in the file COPYING
which should come along with this software, or visit
the Free Software Foundation's web site at http://www.fsf.org

WARNINGS:
=========

Sample reports are not distributed anymore, because it's easy to test
ScanErrLog online at:

        http://www.librelogiciel.com/software/ScanErrLog/try

Since version 1.5, ScanErrLog requires the jaxml XML generation Python
module. You can download its latest version freely from:

        http://www.librelogiciel.com/software/

You need at least jaxml-2.22.

To be able to produce the report in PDF format, you have to install
the ReportLab Python module.
You can download it freely from:

        http://www.reportlab.com/

The latest official release of ReportLab, 1.13 at this time, works just fine,
but any older version may only work partially or not at all.

---------------------------------------------------------------
Nota Bene: Since 2.00 you don't need the jahtml module anymore.
---------------------------------------------------------------

COMPLETE INSTALL:
=================

WARNING: steps 1, 2 and 4 are now mandatory. Step 3 is optional.

1 - If you don't have Distutils installed (e.g. python version <= 1.5.2)
then first download it from:

        http://www.python.org/sigs/distutils-sig/

then follow the installation instructions for Distutils and install
it on your system.

2 - If you don't have the jaxml module installed, then download its
latest version from:

        http://www.librelogiciel.com/software/

then follow the installation instructions for jaxml and install
it on your system.

3 - If you don't have the ReportLab module installed, and you want
to produce reports in PDF format, then download its latest version from:

        http://www.reportlab.com/

then follow the installation instructions for ReportLab and install
it on your system.

4 - Download the latest ScanErrLog version from:

        http://www.librelogiciel.com/software/

Extract it:

        gzip -dc scanerrlog-x.xx.tar.gz | tar -xf -

        where x.xx is scanerrlog's latest version number.

Go to scanerrlog's directory:

        cd scanerrlog-x.xx

Just type:

        python setup.py install

You may need to be logged in with sufficient privileges (e.g. root).

This will generally install scanerrlog.py in /usr/local/bin or
an equivalent path depending on your system.

If you want to launch ScanErrLog as a CGI script, please consider
looking at the ScanErrLog.html file included in this package to
see a sample HTML form to do it. Then you may want to copy
scanerrlog.py to your web server's cgi-bin directory and allow the
execution of python CGI scripts. Refer to your web server's
documentation for details.

You can launch scanerrlog.py directly from the command line, run it
as a CGI script, or import it into your own Python program and use
(or subclass) the ApacheErrorLog class it defines. In the latter case,
make sure that scanerrlog.py is in your Python path before importing
it (e.g. call sys.path.append('/usr/local/bin') before the
import scanerrlog statement).
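
The import step can be sketched as follows. This is only an
illustration: /usr/local/bin is the example path given above and may
differ on your system, and since this README does not describe the
constructor or methods of ApacheErrorLog, the sketch only checks that
the class is present.

```python
import sys

# Assumption: setup.py installed scanerrlog.py here; adjust as needed.
SCRIPT_DIR = '/usr/local/bin'

# Make the directory holding scanerrlog.py visible to the import system.
if SCRIPT_DIR not in sys.path:
    sys.path.append(SCRIPT_DIR)

try:
    import scanerrlog
except ImportError:
    scanerrlog = None
    print('scanerrlog.py was not found in %s' % SCRIPT_DIR)
else:
    # ApacheErrorLog is the class named in this README. Its arguments
    # are not documented here, so no instance is created.
    print('ApacheErrorLog found: %s' % hasattr(scanerrlog, 'ApacheErrorLog'))
```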

You can test ScanErrLog online at:

        http://www.librelogiciel.com/software/ScanErrLog/try

Voilà!

HINTS:
======

Producing the same report in different formats is now quicker than
before, thanks to the --continue option:

    * launch ScanErrLog on your error_log file with the --continue option.
    * then for each new format you want of the same report, just
      launch ScanErrLog with the --continue option on an empty file
      in the same directory as the error_log file.

This makes ScanErrLog parse the error_log file only once, yet produce
the same report in as many formats as you want, saving processing time and CPU.
Note however that due to the use of the QuickSort algorithm, which is not a
stable sort, messages with the same number of occurrences may be ordered
differently from one pass to another.
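
The two steps above could be scripted roughly as follows. This is a
sketch under assumptions, not part of ScanErrLog: the helper names,
the empty_log file name and the report.<format> output names are
hypothetical, and scanerrlog.py is assumed to be on your PATH. Only
the --continue, --format and --outputfile options come from this
README.

```python
import os
import subprocess

def continue_commands(error_log, formats, script='scanerrlog.py'):
    """Build one command line per format (hypothetical helper).

    Only the first command names the real error_log; the others name
    an empty file in the same directory so the saved statistics are
    reused instead of reparsing the log.
    """
    log_dir = os.path.dirname(os.path.abspath(error_log))
    empty = os.path.join(log_dir, 'empty_log')  # hypothetical file name
    commands = []
    for index, fmt in enumerate(formats):
        source = error_log if index == 0 else empty
        commands.append([script, '--continue', '--format', fmt,
                         '--outputfile', 'report.' + fmt, source])
    return empty, commands

def run_all(error_log, formats):
    """Create the empty file, then run each command in turn."""
    empty, commands = continue_commands(error_log, formats)
    open(empty, 'w').close()
    for command in commands:
        subprocess.check_call(command)
```

For example, run_all('/var/log/httpd/error_log', ['html', 'pdf'])
would parse the log once for the HTML report and reuse the saved
statistics for the PDF one.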

DOCUMENTATION:
==============

ScanErrLog v2.01 (C) 2000-2002 Jerome Alet & Free Software Foundation

This Python module allows people to parse Apache error_log files from
one of several possible sources (filename, stdin, Python file object),
and present their data in decreasing number of occurrences of error
messages.
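
The idea can be illustrated with a short standalone sketch. It is not
ScanErrLog's actual code: it applies the default regexp documented
under the --pattern option, counts what follows the matched prefix,
and sorts by decreasing count. The function name and the sample lines
are invented for the illustration, and ties are broken alphabetically
here only to keep the output deterministic.

```python
import re
from collections import Counter

# Default regexp quoted under the --pattern option of this README.
DEFAULT_PATTERN = re.compile(
    r'^(httpd: |\B)\[([^\[\]]+)\] \[([^\[\]]+)\] (?:\[([^\[\]]+)\] )?')

def count_messages(lines, pattern=DEFAULT_PATTERN):
    """Return (message, count) pairs, most frequent first."""
    counts = Counter()
    for line in lines:
        match = pattern.match(line)
        if match is None:
            continue  # not an Apache logged message, e.g. CGI output
        # The real message starts where the matched prefix ends.
        counts[line[match.end():].rstrip('\n')] += 1
    # Sort by decreasing count, then by message text for stable output.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

# Invented sample lines, for illustration only.
SAMPLE = [
    '[Wed May 01 21:06:32 2002] [error] [client 127.0.0.1] '
    'File does not exist: /var/www/html/favicon.ico\n',
    '[Wed May 01 21:07:10 2002] [notice] caught SIGTERM, shutting down\n',
    'Traceback (most recent call last):\n',
    '[Wed May 01 21:08:45 2002] [error] [client 127.0.0.1] '
    'File does not exist: /var/www/html/favicon.ico\n',
]

for message, count in count_messages(SAMPLE):
    print('%5d  %s' % (count, message))
```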

This is particularly useful if you want to quickly solve the most
annoying problems web surfers encounter visiting your site.

If you run this module directly, it will parse each file whose name was
passed on the command line.

If you don't pass any argument on the command line, then scanerrlog will
read an error_log from stdin if you've piped some file or command to its
standard input, or it will print its documentation if you've not.

You can also use it as a CGI script, but you will not be able to
modify the pattern and outputfile used, and the input filename
must not begin with / or contain .. in its name, all for
security reasons. The names you may use for your CGI variables
are: continue, date, withoutheader, title, limit, exclude, format and
inputfile.
If continue, date or withoutheader exist in your form, these options
will be set to TRUE whatever value they have. See ScanErrLog.html for
a sample form to launch ScanErrLog as a CGI script.
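
The CGI rules above can be pictured with the following sketch. Both
function names are hypothetical and this is not the code ScanErrLog
runs; it only restates the two rules (filename restrictions, and flags
that are TRUE as soon as they are present) in Python, with the form
modelled as a plain dictionary.

```python
def is_acceptable_inputfile(name):
    """Reject names that begin with / or contain .. (hypothetical helper)."""
    return not name.startswith('/') and '..' not in name

def boolean_options(form):
    """A flag is set as soon as its name is present in the form,
    whatever value was submitted with it (hypothetical helper)."""
    return {name: name in form
            for name in ('continue', 'date', 'withoutheader')}

print(is_acceptable_inputfile('logs/error_log'))   # relative name
print(is_acceptable_inputfile('/etc/passwd'))      # begins with /
print(is_acceptable_inputfile('../error_log'))     # contains ..
print(boolean_options({'date': '0', 'title': 'My report'}))
```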


e.g.:

    ./scanerrlog.py

prints scanerrlog's documentation (what you are reading now)


    ./scanerrlog.py /var/log/httpd/error_log /var/log/httpd/error_log.1

will read data from the specified files.


    ./scanerrlog.py </var/log/httpd/error_log

will read data from standard input.

You can pass some options on the command line:

options:
        -c | --continue         useful if you want to parse the same file
                                many times (e.g. every week): the current
                                state and statistics of the file are saved
                                in a file named ScanErrLog.stats in the
                                same directory, so you don't have to reparse
                                the beginning of the file each time. You
                                should use this option either to tell
                                ScanErrLog to save the statistics or to reuse
                                the saved ones.
                                Without this option the file is completely
                                parsed again, even if you've got an old
                                statistics file saved in the same directory.
                                WARNING: this option is incompatible with
                                the parsing of multiple files.
        -d | --date             include in the final report the date when
                                each message appeared for the last time.
                                this option is mutually exclusive with
                                the --pattern option.
        -e | --exclude e        e is a slash separated list of
                                message severities. All messages with
                                a severity listed in e are excluded
                                from the final report. By default all
                                messages are included. For example,
                                e can be: info/debug to exclude all
                                messages whose severity is info or
                                debug.
        -f | --format f         output format for the report, f can be
                                any of:
                                    'html', 'pdf', 'text', 'xml'
                                the default format is 'html'.
        -h | --help             displays this help screen.
        -l | --limit lim        selects messages only if their number of
                                occurrences equals or exceeds lim.
                                lim's default value is 1, meaning all
                                messages are included in the final report.
        -n | --nocumulate       don't cumulate counts for all the files
                                passed on the command line. the old
                                -c | --cumulate option is now the default.
                                if the following option -o is not used,
                                then -n implies -w because all reports
                                will be in the same file (stdout).
        -o | --outputfile f     save the report in the file f.
                                if -n is used, then the filename will
                                be n.f where n is an integer incremented
                                for each new file and starting at 1.
        -p | --pattern regexp   select only the lines which match regexp.
                                the default regexp is:
        ^(httpd: |\B)\[([^\[\]]+)\] \[([^\[\]]+)\] (?:\[([^\[\]]+)\] )?
                                which selects all Apache logged messages,
                                but not errors from CGI scripts for example.
                                to work correctly, your regexp should consume
                                all characters from the beginning of the
                                error line up to the beginning of the real
                                error message.
                                this option is mutually exclusive with
                                the --date option.
        -t | --title t          sets the report title.
        -v | --version          displays ScanErrLog's version number.
        -w | --withoutheader    suppress the header of the HTML report.
                                useful if you want to include the report
                                directly into another HTML document.


Warning: some options may not work with all report formats.
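
As an illustration of how the --exclude and --limit options combine,
here is a minimal sketch. It is not ScanErrLog's implementation: the
function names and the (severity, message, count) tuples are
assumptions made for the example.

```python
def parse_exclude(value):
    """Turn a slash separated list such as 'info/debug' into a set."""
    return {severity for severity in value.split('/') if severity}

def select_messages(entries, exclude='', limit=1):
    """Keep entries whose severity is not excluded and whose number
    of occurrences equals or exceeds limit."""
    excluded = parse_exclude(exclude)
    return [(severity, message, count)
            for severity, message, count in entries
            if severity not in excluded and count >= limit]

# Invented entries, for illustration only.
ENTRIES = [
    ('error', 'File does not exist', 12),
    ('info', 'server started', 3),
    ('debug', 'child process exited', 40),
    ('warn', 'slow request', 1),
]

# Roughly what --exclude info/debug --limit 2 would leave in a report.
print(select_messages(ENTRIES, exclude='info/debug', limit=2))
```

With the defaults (nothing excluded, limit 1) every entry is kept,
which matches the documented default behaviour.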

A fifth possibility is to import this module into another python
program and use the ApacheErrorLog class it defines.

ScanErrLog comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it under
certain conditions; refer to the GNU General Public License for details.
You'll find the GNU GPL in the file COPYING which should have come along
with this software or at http://www.gnu.org

Please e-mail bugs to: alet@librelogiciel.com (Jerome Alet)
