
Lire Developer's Manual

Joost van Baal

Egon L. Willighagen

Francis J. Lacoste

   Copyright 2000, 2001, 2002, 2003, 2004 Stichting LogReport
   Foundation

   This manual is free software; you can redistribute it and/or modify it
   under the terms of the GNU General Public License as published by the
   Free Software Foundation; either version 2 of the License, or (at your
   option) any later version.

   This is distributed in the hope that it will be useful, but without
   any warranty; without even the implied warranty of merchantability or
   fitness for a particular purpose. See the GNU General Public License
   for more details.

   You should have received a copy of the GNU General Public License
   along with this manual (see COPYING); if not, check with
   http://www.gnu.org/copyleft/gpl.html or write to the Free Software
   Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111, USA.
   Revision History
   Revision 2.0.2 $Date: 2006/07/20 12:11:15 $
   $Id: dev-manual.dbx,v 1.87 2006/07/20 12:11:15 vanbaal Exp $
     _________________________________________________________________

   Table of Contents

   Preface

        What This Book Contains
        How Is This Book Organized?
        Conventions Used
        If You Don't Find Something In This Manual

   I. Lire Architecture

        1. Architecture Overview

              Lire's Design Patterns
              Log File Normalisation
              Log Analysis
              Report Generation
              Report Formatting and Other Post-Processing
              Going Further

   II. Using the Lire Framework

        2. Writing a New DLF Converter

              Prerequisites
              The common_syslog Log Format
              Creating the DLF Converter Skeleton
              Adding a Constructor
              The Meta-Data Methods

                    The DLF Converter Name
                    Providing Information To Users
                    Providing Information to the Framework
                    The Conversion Methods

              Registering Your DLF Converter with the Lire Framework
              DLF Converter API

        3. Writing a DLF Schema

              Designing the ftpproto schema

                    Creating The Schema File
                    Adding the Schema's Description
                    Defining the Schema's Fields
                    Installing The Schema

        4. Writing a New DLF Analyser

              Writing a Categoriser

                    Defining The Extended Schema 
                    Defining the Categoriser
                    Categoriser Configuration
                    Categoriser Implementation

              Writing an Analyser
              DLF Analyser API

        5. Writing a New Report

              Filter Specification

   III. Developer's Reference

        6. Lire Data Types

              Lire Textual Elements

                    title element
                    DocBook Elements
                    description element

        7. Common Textual Elements to All XML Formats 

              Lire Data Types Parameter Entities

                    Boolean Type
                    Integer Type
                    Number Type
                    String Type
                    Timestamp type
                    Time Type
                    Date Type
                    Duration Type
                    IP Type
                    Port Type
                    Hostname Type
                    URL Type
                    Email Type
                    Bytes Type
                    Filename Type
                    Field Type
                    Superservice Type
                    Related Types

        8. The Lire Report Configuration Specification Markup Language

              The Lire Report Configuration Specification Markup Language

                    config-spec element
                    summary element
                    Parameter Specifications Elements

        9. The Lire Report Configuration Markup Language

              The Lire Report Configuration Markup Language

                    config element
                    global element
                    param element

        10. The Lire DLF Schema Markup Language

              The Lire DLF Schema Markup Language

                    The dlf-schema element
                    extended-schema element
                    derived-schema element
                    field element

        11. The Lire Report Specification Markup Language

              The Lire Report Specification Markup Language

                    report-spec element
                    global-filter-spec element
                    display-spec element
                    param-spec element
                    param element
                    chart-configs element
                    Filter expression elements
                    Report Calculation Elements

        12. The Lire Report Markup Language

              The Report Markup Language

                    report element
                    Meta-information elements
                    section element
                    subreport element
                    missing-subreport element
                    table element
                    table-info element
                    group-info element
                    column-info element
                    group-summary element
                    group element
                    entry element
                    name element
                    value element
                    chart-configs element

   IV. Lire Developers' Conventions

        13. Contributing Code to Lire
        14. Developers' Toolbox

              Required Tools To Build From CVS
              Accessing Lire's CVS

                    CVS primer

              SourceForge
              Mailing Lists

        15. Coding Standards

              Shell Coding Standards
              Perl Coding Standards

        16. Making Lire "Test-infected"

              Unit Tests in Lire

                    PerlUnit

              Writing Tests
              Running Tests
              Some "Best Practices" on Unit Testing

        17. Commit Policy

              CVS Branches

                    Hands-on example
                    Naming, what it looks like
                    Creating a Branch
                    Accessing a Branch
                    Merging Branches on the Trunk

        18. Testing and debugging

              Test before releasing
              Test-installations and test-runs
              Using the Perl debugger on Lire code

        19. Making a Release

              Setting version in NEWS file, checking ChangeLog
              Tagging the CVS
              Building The Tarball
              Building The Debian Package
              Building The RPM Package
              Making sure the FreeBSD port gets updated
              Uploading The Release

                    The LogReport Webserver

              Advertising The Release

                    SourceForge
                    Freshmeat.net

        20. Website Maintenance

              Documentation on the LogReport Website

                    Publishing the DTD's

        21. Writing Documentation

              Plain Text
              Perl's Plain Old Documentation: maintaining manpages
              Docbook XML: Reference Books and Extensive User Manuals

   V. Implementation Details

        22. Adding a New Superservice in Lire's Distribution
        23. Issues with Report Merging
        24. Overview of Lire scripts
        25. Source Tree Layout

   Glossary

   List of Figures

   1.1. Log Processing in the Lire's Framework
   1.2. The Log Normalisation Process
   1.3. The Log Analysis Process
   1.4. Report Generation Process
   1.5. Processing of the XML Report Using The APIs

   List of Tables

   11.1. weekly overview

   List of Examples

   11.1. timeslot with 1d unit
   11.2. timeslot with 2m unit
   3. DNS DLF Excerpts

Preface

   Table of Contents

   What This Book Contains
   How Is This Book Organized?
   Conventions Used
   If You Don't Find Something In This Manual

   Log file analysis is both an essential and a tedious part of system
   administration. It is essential because it is the best way of
   profiling the usage of the services installed on the network. It is
   tedious because programs generate a lot of data, and tools to report
   on this data are often unavailable or incomplete. When such tools
   exist, they are generally specific to one product, which means that
   you can't compare, e.g., your Qmail(TM) and Exim(TM) mail servers.

   Lire is a software package developed by the Stichting LogReport
   Foundation to generate useful reports from raw log files of various
   network programs. Multiple programs are supported for various types of
   network services. Lire also supports various output formats for the
   generated reports.

What This Book Contains

   This book is the Lire Developer's Manual. Its purpose is to present
   Lire as a log analysis framework. To this end, it describes the
   architecture and design of Lire and contains comprehensive
   instructions on how to use it. Its intended audience is system
   administrators or programmers who want to extend Lire or want to
   understand its internals.

   There is another book, the Lire User's Manual, which describes how to
   install, configure and use Lire as an "off-the-shelf" log analyzer.
   Its intended audience is system administrators who want to install and
   use Lire to gather information about the services operating on their
   network.

How Is This Book Organized?

   This book is divided into five parts. Part I gives an overview of the
   architecture and design of Lire.

   You will find in Part II information on extending Lire. In this part,
   you will learn how to add a new DLF format to Lire, write log file
   converters and add reports for a superservice.

   Part III is a reference section which gives comprehensive details
   about the various XML formats used by Lire and gives in-depth
   descriptions of its various APIs.

   Part IV is targeted at developers who want to participate in Lire's
   development. It contains information about CVS access, coding
   conventions, tools needed to build from CVS, release management and
   other aspects important to those who are part of the Lire
   development team. Furthermore, it gives some information on how to
   contribute code to Lire as an external party.

   Finally, Part V contains various implementation details that may be
   interesting to people wanting to learn more about Lire internals.

Conventions Used

If You Don't Find Something In This Manual

   You can report typos, incorrect grammar or any other editorial
   problems to <bugs@logreport.org>. We welcome readers' feedback. If you
   feel that certain parts of this manual aren't clear, are missing
   information or are lacking in any other aspect, please tell us. Of
   course, if you feel like writing the missing information yourself,
   we'll very happily accept your patch. We will make our best effort to
   improve this manual.

   Remember that there is another manual, the Lire User's Manual, which
   contains comprehensive information on how to install, use and
   configure Lire. It also contains reference information about all of
   Lire's standard reports and supported services.

   There are various public mailing lists for Lire's users. There is a
   general users' discussion list where you can find help on how to
   install and use Lire. You can subscribe to this list by sending an
   empty email with a subject of subscribe to
   <questions-request@logreport.org>. Email for the list should be sent
   to <questions@logreport.org>.

   You can keep track of new Lire releases by subscribing to the
   announcement mailing list. You can subscribe by sending an
   empty email with a subject of subscribe to
   <announcement-request@logreport.org>.

   Finally, if you're interested in Lire's development, there is a
   development mailing list to which you can subscribe by sending an
   empty email with a subject of subscribe to
   <development-request@logreport.org>. Email to the list should be sent
   to <development@logreport.org>.

   All posts on these lists are archived on a public website.

Lire Architecture

Chapter 1. Architecture Overview

   Table of Contents

   Lire's Design Patterns
   Log File Normalisation
   Log Analysis
   Report Generation
   Report Formatting and Other Post-Processing
   Going Further

   From a developer's point of view, Lire intends to be the universal log
   analysis framework. To this end, it provides a reliable, complete
   framework upon which to build log analysis and reporting solutions.
   Lire, the tool, is a proof of the versatility and extensibility of the
   framework, as it is able to produce reports for many of the services
   that run in today's heterogeneous networks in a variety of output
   formats.

   As a framework, Lire is the best choice to replace all those
   home-grown scripts developed to produce reports from all the log files
   from the little-known products or custom-developed programs that run
   on your system. Leveraging the Lire framework will make those scripts
   a lot more versatile without making them much more complicated to
   develop. It will be easier to add new reports or to support multiple
   report formats.

   Figure 1.1. Log Processing in the Lire's Framework

   The Lire framework divides log analysis into four processes, shown
   in Figure 1.1, "Log Processing in the Lire's Framework":
    1. Log Normalisation. The first process normalises logs from
       different products into a generic format that can be shared by all
       products that have similar functionality. For example, log files
       from products as different as Apache(TM) and Microsoft Internet
       Information Server(TM) will be transformed into an identical
       format.
    2. Log Analysis. In the analysis process, other information is
       created, inferred or extracted from the normalised data. For
       example, an analyser in the www superservice infers the browser
       used by the client from the referrer information.
    3. Report Generation. The third process generates a report from the
       normalised and analysed data. This process is done by a generic
       report engine that computes the report based on specifications
       describing what and how the information should appear in the
       report. The report is generated in a generic XML format.
    4. Report Post-processing and Formatting. The last process converts
       the generic report into a specific format such as ASCII, PDF or
       HTML. Other kinds of post-processing (such as chart generation)
       can also be accomplished in this stage.

   Before going into a more detailed description of each of these
   processes, we'll introduce some of the common design patterns that
   you'll find throughout the Lire framework.

Lire's Design Patterns

   At the center of each of these processes is an XML-based file format.
   Having things specified in data files makes the framework easier to
   extend. For example, the reports are built using a generic report
   builder which finds the instructions on how to build the reports in
   XML files. This makes it easy to add new information to a report:
   you just have to write an XML file. The fact that there are a lot of
   tools to process XML files is also an interesting aspect. For
   example, emacs lovers will appreciate the help that its psgml module
   gives them in writing report specifications.

   Another important aspect is that we tried to interoperate with and
   build upon other standards while defining our XML formats. The best
   illustration of this is that in all the XML file formats that Lire
   uses, a DocBook subset is used for all elements related to narrative
   descriptions.

   Another common aspect you'll encounter is that each of these processes
   and XML file formats come with an API to manipulate them, making it
   easy to add functionalities at each processing stage. APIs are also a
   good thing because, even if in theory an open file format somewhat
   constitutes an API, having libraries that provide convenient access to
   the file formats makes it a lot easier to write new components
   providing new functionalities.

Log File Normalisation

   Figure 1.2. The Log Normalisation Process

   The first process of the Lire log analysis framework is the log file
   normalisation process, summarized in Figure 1.2, "The Log
   Normalisation Process". This process is centered around the DLF
   concept, which is a kind of universal log format. DLF stands for
   Distilled Log Format. The idea is that each product-specific log
   file is transformed into a log format that can be common to all the
   products providing similar functionality. In Lire's terminology, a
   class of applications providing similar functionality (e.g. MTAs
   supplying email) is called a superservice. Still in Lire's
   terminology, a service (e.g. postfix or sendmail) refers to a native
   log format that is converted into the superservice's DLF. One can
   view the DLF as a table where the rows are the logged events and the
   fields are the logged information related to each event.

   Since the information logged by an email server is totally different
   from that logged by a web server, each superservice should have its
   own data model. In Lire, the data model is called a DLF schema. The
   DLF schemas are defined in XML files using the DLF Schema Markup
   Language. The schema describes what fields are available for each
   logged event.

   One interesting aspect of Lire is that although the email DLF is used
   by all email servers, the email DLF data model isn't restricted to the
   lowest common denominator across the log formats supported by each
   email server. In Lire's architecture, the superservice's schema can
   represent the information logged by the most sophisticated product.
   When some part of the information isn't available in one log format,
   the DLF log file will not contain this information and the reports
   that need it won't be included.
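
   The idea of a DLF as a table whose fields may be missing for some
   log formats can be sketched in plain perl. This is only an
   illustration: the field names, the records and the
   report_is_available helper are hypothetical and are not part of the
   Lire API, and the real framework may decide which reports to include
   differently.

```perl
use strict;
use warnings;

# Two DLF-like records for a hypothetical email schema, one per logged
# event. The second comes from a log format that does not log the delay.
my @records = (
    { time => 1052557990, from_domain => 'example.org', size => 2048, delay => 3 },
    { time => 1052557995, from_domain => 'example.net', size => 512,  delay => undef },
);

# A report can be computed only if every field it needs is defined
# in every record.
sub report_is_available {
    my ( $needed_fields, $records ) = @_;
    foreach my $record (@$records) {
        foreach my $field (@$needed_fields) {
            return 0 unless defined $record->{$field};
        }
    }
    return 1;
}

foreach my $report ( [ 'message size', 'size' ], [ 'delivery delay', 'delay' ] ) {
    my ( $title, @needed ) = @$report;
    my $status = report_is_available( \@needed, \@records ) ? 'included' : 'skipped';
    print "$title report: $status\n";
}
```

   Here the report that only needs the size field is kept, while the
   one that needs the delay field is skipped because one record lacks
   that information.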

   This architecture means that to support a new service, i.e. a new log
   format, in Lire you just need to write a plugin, called a DLF
   converter. This is just a simple perl module that parses the native
   log format and maps the information according to the schema.

Log Analysis

   After normalisation comes the analysis process. The responsibility
   of the analysis process is to extract, infer or derive other
   information from the logged data. Since the superservice's logged
   data is in a standard format, the analysers are generic in the sense
   that they can operate on all of the superservice's supported log
   formats, provided that the product logged the information required
   by the analyser. The analysis process is shown in Figure 1.3, "The
   Log Analysis Process".

   Figure 1.3. The Log Analysis Process

   Since each analyser can add information to a DLF or create a new one,
   each analyser generates data according to a special kind of schema.

   Lire's framework includes two kinds of analysers. The difference
   between the two resides in the mapping between the source data and
   the new data they generate. Extended analysers generate new data for
   each DLF record whereas derived analysers are used when the new data
   doesn't have a one-to-one mapping with the source data.

   The analysers produce data according to a data model which is
   specified in other DLF schemas. There are extended schemas and derived
   schemas. An extended schema simply adds new fields to the base
   superservice's schema. For example, in the web superservice's schema,
   a lot of information can be obtained from the referer field. From this
   information, it is possible to guess the user's browser, language or
   operating system. Those fields are specified in the www-referer
   extended schema; one analyser is responsible for extracting this
   information from the referer field.

   But sometimes the analysis cannot simply add information to each
   event record; an altogether different schema is then needed. For
   those cases, there is the derived schema. An example of the use of
   such a schema in the current Lire distribution is the analyser which
   creates user sessions based on the logged client IP address and user
   agent. This analyser defines the www-session derived schema.
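
   The difference between the two mappings can be sketched in plain
   perl. The record layout, the field names and both functions below
   are hypothetical and do not use the Lire analyser API; the session
   grouping is also simplified to the client host only, whereas the
   analyser described above also considers the user agent.

```perl
use strict;
use warnings;

# Hypothetical www-like records; the field names are illustrative only.
my @www = (
    { time => 100, client_host => '192.168.250.10', requested_page => '/' },
    { time => 105, client_host => '192.168.250.10', requested_page => '/images/logo.png' },
    { time => 110, client_host => '192.168.250.11', requested_page => '/' },
);

# Extended style: exactly one output record per input record, with the
# original fields plus a new one.
sub extend_records {
    my @in = @_;
    return map {
        +{ %$_, is_image => ( $_->{requested_page} =~ /\.(?:png|gif|jpg)$/ ? 1 : 0 ) }
    } @in;
}

# Derived style: records are grouped into sessions, so the output has
# its own set of fields and no one-to-one mapping with the input.
sub derive_sessions {
    my @in = @_;
    my %by_host;
    foreach my $record (@in) {
        my $session = $by_host{ $record->{client_host} } ||= {
            client_host => $record->{client_host},
            start       => $record->{time},
            requests    => 0,
        };
        $session->{requests}++;
        $session->{end} = $record->{time};
    }
    return map { $by_host{$_} } sort keys %by_host;
}

my @extended = extend_records(@www);
my @sessions = derive_sessions(@www);
printf "%d extended records, %d sessions\n", scalar @extended, scalar @sessions;
```

   The extended result always has as many records as the input, while
   the number of derived records depends on how the input is grouped.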

Report Generation

   Once you have all this data, it's time to generate some useful reports
   out of it. Lire's framework includes a generic report builder. What
   Lire calls a report is actually a collection of what one may
   understand as reports; Lire calls these subreports. For example, the
   proxy superservice's report will contain a subreport about the top
   visited sites, another subreport on the cache hit ratio, as well as
   several others. The subreports are defined using the Report
   Specification Markup Language. This markup language contains elements
   for several things: information regarding the schema on which the
   subreport operates; descriptions that should be included in the
   generated report to help in the interpretation of the data;
   parameters that can be used to modify the generated report (for
   example, to generate a top 20 subreport instead of a top 10); a
   filter that selects the records that will be used for the subreport;
   and finally the operations that make up the subreport: grouping,
   summing, counting, etc. The report markup language covers most simple
   needs and there is an extension element as well as an API that can be
   used to hook in more fancy computations. There are no subreport
   specifications in the current distribution that make use of this
   feature yet, however. You can see an overview of this process in
   Figure 1.4, "Report Generation Process".

   Figure 1.4. Report Generation Process

   The generated report is another XML file that uses another markup
   language, this time called the Lire Report Markup Language. An
   actual report contains the help descriptions from the report
   specifications, information on the subreport specifications used, as
   well as the actual subreport's data. Using another intermediary XML
   file as output format makes all sorts of things possible in the
   formatting and post-processing stage.

Report Formatting and Other Post-Processing

   The last process works with the generic XML report. Using a
   domain-specific XML format for the generated report makes it easy for
   the framework to support multiple output formats. Supporting a new
   output format is just a matter of writing a new module that processes
   the XML report file.

   Figure 1.5. Processing of the XML Report Using The APIs

   As shown in Figure 1.5, "Processing of the XML Report Using The
   APIs", you can also process the XML files using the APIs to the XML
   report format.

Going Further

   As you can see from this overview, the Lire framework provides a
   powerful architecture to use for your log analysis needs. The
   architecture provides extensibility from log normalisation to
   post-processing of the reports. Exactly how to use the framework is
   the topic of the next part.

Using the Lire Framework

   In this part, you'll learn how to leverage the Lire framework for
   your own log analysis needs. The most common use cases are developing
   a converter for a new log format and developing new reports.

   Chapter 2, Writing a New DLF Converter, explains how to write a
   converter for a new log format.

   The responsibility of the converter is to map the information
   contained in a log file to the data model of a specific DLF schema.
   When developing a converter for a log format which doesn't fall in
   the domain of one of the existing DLF schemas, you'll need to write a
   new one. This is the topic of Chapter 3, Writing a DLF Schema.

   Chapter 4, Writing a New DLF Analyser, gives information on how to
   write DLF analysers that can add data to the base log information.

   Chapter 5, Writing a New Report, gives some notes on how to develop
   new reports.

Chapter 2. Writing a New DLF Converter

   Table of Contents

   Prerequisites
   The common_syslog Log Format
   Creating the DLF Converter Skeleton
   Adding a Constructor
   The Meta-Data Methods

        The DLF Converter Name
        Providing Information To Users
        Providing Information to the Framework
        The Conversion Methods

   Registering Your DLF Converter with the Lire Framework
   DLF Converter API

   Before Lire can analyse and generate reports on the data contained in
   your various log files, that data must first be converted to a common
   data model. This is specifically the job of the DLF converter.

   So if you want to generate the same reports for your RealServer(TM)
   log files (currently unsupported) as for your web server, you only
   need to develop a DLF converter which maps the RealServer content to
   the www DLF schema.

Note

   If no existing DLF schema correctly represents the domain of your
   application's log file, it is easy to develop a new one. Consult
   Chapter 3, Writing a DLF Schema for the whole story.

   This chapter will show you, through an example, how to develop a new
   DLF converter for a rather useless log format: the common log format
   encapsulated in syslog. (It is useless because there are not many
   reasons to make your web server log its requests through syslog, and
   it would probably be simpler to just use the cut command to remove
   the syslog header.)

Note

   The doc/examples directory in the source distribution contains another
   commented example which could serve as a starting point for your
   converters.

Prerequisites

   Developing a new DLF converter requires some basic programming skills
   in perl. Although not strictly necessary, you should be familiar with
   perl's object-oriented programming model. If you aren't, you should
   read perltoot(1) before continuing.

The common_syslog Log Format

   The log format supported by our DLF converter is simply the standard
   Common Log Format supported by most web servers with a syslog header
   prepended to each line. Here is an example of what such a log file
   might contain:

May 10 11:13:10 hibou httpd[12344]: Apache/1.3.26 (Unix) Debian GNU/Linux \
  Embperl/1.3.3 PHP/4.1.2 mod_perl/1.26 configured -- resuming normal operations
May 10 11:13:11 hibou httpd[12345]: 192.168.250.10 - - \
  [10/May/2003:11:13:11 +0200] "GET / HTTP/1.1" 200 1523
May 10 11:13:12 hibou httpd[12346]: 192.168.250.10 - - \
  [10/May/2003:11:13:11 +0200] "GET /images/logo.png HTTP/1.1" 200 1201
May 10 11:13:12 hibou httpd[12348]: 192.168.250.10 - - \
  [10/May/2003:11:13:11 +0200] "GET /images/corner.png HTTP/1.1" 200 1021



   Remember that the outer layer is a syslog log file and could contain
   things other than the web server's requests. The first line in the
   example isn't a request record; it is what usually ends up in the
   "error_log", here a message about the server starting.
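
   To make the two layers concrete, here is a minimal sketch that
   splits such a line with plain regular expressions. It deliberately
   does not use Lire's own syslog parser. Both function names and both
   patterns are assumptions made for illustration, and the request is
   assumed to use the standard Common Log Format quoting, with the whole
   request line between double quotes.

```perl
use strict;
use warnings;

# Split a syslog line into its header fields and the logged message.
sub parse_syslog_line {
    my ($line) = @_;
    my ( $date, $host, $process, $pid, $content ) =
      $line =~ /^(\w{3}\s+\d+\s[\d:]{8})\s(\S+)\s([^\s\[:]+)(?:\[(\d+)\])?:\s(.*)$/
      or return;
    return { date => $date, hostname => $host, process => $process,
             pid => $pid, content => $content };
}

# Parse a Common Log Format request; returns nothing for other messages.
sub parse_common_log {
    my ($content) = @_;
    my ( $client, $ident, $user, $time, $request, $status, $size ) =
      $content =~ /^(\S+)\s(\S+)\s(\S+)\s\[([^\]]+)\]\s"([^"]*)"\s(\d{3})\s(\d+|-)$/
      or return;
    return { client => $client, user => $user, time => $time,
             request => $request, status => $status, size => $size };
}

my @lines = (
    'May 10 11:13:10 hibou httpd[12344]: Apache/1.3.26 (Unix) configured -- resuming normal operations',
    'May 10 11:13:11 hibou httpd[12345]: 192.168.250.10 - - [10/May/2003:11:13:11 +0200] "GET / HTTP/1.1" 200 1523',
);

foreach my $line (@lines) {
    my $sys = parse_syslog_line($line) or next;
    my $req = parse_common_log( $sys->{content} );
    print $req ? "request: $req->{request}\n" : "other message: $sys->{content}\n";
}
```

   The first sample line is recognised as a syslog record but not as a
   request, which is the case a converter has to handle separately.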

Creating the DLF Converter Skeleton

   Put simply, a DLF converter is a perl object which implements a set of
   predefined methods (aka an "interface" in the object-oriented jargon).

   Since a DLF converter is a perl object, it must be instantiated from a
   class. Classes in perl are defined in packages. We'll name the package
   which implements our converter MyConverters::SyslogCommonConverter. To
   create such a package, you need to create a file named
   MyConverters/SyslogCommonConverter.pm in a directory searched by perl.

Note

     * You can obtain perl's default search list by running the command
       perl -V.
     * This search list can be modified by setting the PERL5LIB
       environment variable.
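
   The search list is also available from within perl as the @INC
   array. The following sketch prints it after adding one directory
   with the lib pragma; the directory name is a made-up example, so
   substitute the location where you keep your converters.

```perl
use strict;
use warnings;

# Hypothetical directory holding MyConverters/SyslogCommonConverter.pm.
use lib '/home/user/lire-converters';

# @INC is the list of directories perl searches for modules.
print "$_\n" foreach @INC;
```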

   Here is a first cut of our DLF converter:

package MyConverters::SyslogCommonConverter;

use base qw/Lire::DlfConverter/;

1;



   The first line declares that the code is in the
   MyConverters::SyslogCommonConverter package. The second one specifies
   that objects in this package are subclasses of the Lire::DlfConverter
   package. The last line fulfills perl's requirement that packages
   return a true value once they are initialized.

   This is a complete, although useless, DLF converter. In fact, it
   isn't really complete: if you tried to register an instance of that
   class, you would get "unimplemented method" errors. Besides, we don't
   yet have a formal way to create an instance of our converter. This
   is our next task.

Adding a Constructor

   The Lire framework doesn't place any restrictions on your DLF
   converter constructor. In fact, the constructor isn't used by the
   framework at all, it will only be used by your DLF converter
   registration script (the section called "Registering Your DLF
   Converter with the Lire Framework").

   We will follow perl's convention of using a method named new for our
   constructor and of using a hash reference to hold our object's data.

   Here is our complete constructor:

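use Lire::Syslog;

# Constructor: blesses a hash reference and stores a syslog parser in it.
sub new {
    my $pkg = shift;

    my $self = bless {}, $pkg;

    # Reuse Lire's syslog parsing code for the outer layer of each line.
    $self->{syslog_parser} = Lire::Syslog->new();

    return $self;
}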



   Since our log format is based on syslog, we will reuse the syslog
   parsing code included in Lire. This is the reason we instantiate a
   Lire::Syslog object and save a reference to it in our constructor.

The Meta-Data Methods

   The Lire::DlfConverter interface requires two kinds of methods. First,
   it requires methods which provide information to the framework about
   your converter. Second, it requires methods which actually implement
   the conversion process. It is the former that this section
   documents.

The DLF Converter Name

   The name() method should return the name of our DLF converter. It is
   this name that is passed to the lr_log2report command. This name must
   be unique among all the registered converters and it should be
   restricted to alphanumerical characters (hyphens, periods and
   underscores can also be used).

   We will name our converter common_syslog:


sub name {
    return "common_syslog";
}
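
   The naming rule above can be checked mechanically. The following is a
   minimal sketch, not part of Lire: is_valid_converter_name() is a
   hypothetical helper, and the accepted character set is assumed from
   the description above rather than taken from the framework's code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the Lire API): returns true when
# $name only uses the characters the text above allows, that is
# alphanumerics, hyphens, periods and underscores.
sub is_valid_converter_name {
    my ( $name ) = @_;

    return 0 unless defined $name && length $name;
    return $name =~ /\A[A-Za-z0-9._-]+\z/ ? 1 : 0;
}

foreach my $name ( "common_syslog", "common syslog" ) {
    printf "%-15s %s\n", $name,
        is_valid_converter_name( $name ) ? "ok" : "invalid";
}
```

   Uniqueness among the registered converters cannot be checked this way;
   that remains the job of the framework at registration time.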



Providing Information To Users

   The next two required methods are used to give more verbose
   information on your converter to the users. The converter's title()
   and description() can be used to display information about your
   converter from the user interface or to generate documentation.

   The title() method should simply return a string:

sub title {
    return "Common Log Format embedded in Syslog DLF Converter";
}



   The description() method should return a DocBook fragment describing
   your converter and the log formats it supports. If you don't know
   DocBook, just restrict yourself to using para elements to make
   paragraphs:

sub description {
    return <<EOD;
<para>This DLF Converter extracts web server requests and error
information from a syslog file.
</para>
<para>The requests and errors should be logged under the
<literal>httpd</literal> program name. The errors are mapped to the
<type>syslog</type> schema, the requests are mapped to the
<type>www</type> schema.
</para>
<para>Syslog records from programs other than
<literal>httpd</literal> are ignored.
</para>
EOD
}
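
   If the description is assembled from plain text, the characters that
   are special in XML have to be escaped. Here is a minimal sketch of one
   way to do this; docbook_paras() is a hypothetical helper and not part
   of Lire. Because it escapes every angle bracket, it is only suitable
   for paragraphs without inline markup such as literal elements.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Lire): wraps each plain-text
# paragraph in a DocBook para element, escaping the three characters
# that are special in XML text content.
sub docbook_paras {
    my @paragraphs = @_;

    my %escape = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;' );
    my $xml = '';
    foreach my $text ( @paragraphs ) {
        ( my $escaped = $text ) =~ s/([&<>])/$escape{$1}/g;
        $xml .= "<para>$escaped</para>\n";
    }
    return $xml;
}

print docbook_paras(
    "Extracts requests & errors from a syslog file.",
    "Records from programs other than httpd are ignored.",
);
```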



Providing Information to the Framework

   Two other meta-data methods are used by the framework itself. The
   first one specifies the DLF schemas to which your DLF converter
   converts:

sub schemas {
    return ( "www", "syslog" );
}



   In our case, we are converting to the syslog and www schemas. As
   described in our converter's description, we will map the web
   server's error messages to the syslog schema and the request logs to
   the www schema. Alternatives would have been to map only the request
   information to the www schema, or to map all the non-request records
   to the syslog schema. The rationale behind the current choice (besides
   this being an example) is that it makes it convenient to process one
   log file to obtain a report containing the requests and errors from
   our web server. For that use case, it is best to ignore the records
   that aren't related to the web server.

   The other method affects how the conversion process will be handled.
   Lire offers two modes of conversion: the line-oriented one and the
   file-oriented one. (Both will be described in the next section.) If
   your log file is line-oriented (each line is one log record), like
   most log files are, you should use the line-oriented conversion mode:

sub handle_log_lines {
    return 1;
}



The Conversion Methods

   The actual conversion process is handled through three methods:
   init_dlf_converter(), finish_conversion() and either process_log_file()
   or process_log_line(), depending on the conversion mode (as determined
   by handle_log_lines()'s return value).

Conversion Initialization

   The method init_dlf_converter() will be called once before the log
   file is processed. It should be used to initialize the state of your
   converter. Since our DLF converter doesn't need any initialization and
   doesn't need any configuration, the method is simply empty:

sub init_dlf_converter {
    my ( $self, $process ) = @_;

    return;
}



   The $process parameter which is passed to all the processing methods
   is an instance of Lire::DlfConverterProcess. This is the object which
   is driving the conversion process and it defines several methods which
   you will use in the actual conversion process.

Conversion Finalization

   The method finish_conversion() will be called once after the log file
   has been completely processed. This method is mostly of use to
   stateful converters, that is, DLF converters which generate DLF
   records from more than one line. Since this is not our case, we simply
   leave the method empty:

sub finish_conversion {
    my ( $self, $process ) = @_;

    return;
}



The DLF Conversion Process

   Whether you are using the file-oriented or line-oriented conversion
   mode, the principles are the same. You extract information from the
   log file and create DLF records from it. Your DLF converter
   communicates with the framework by calling methods on the
   Lire::DlfConverterProcess object which is passed as parameter to your
   methods.

   Here is the complete code of our conversion method:

use Lire::Apache qw/parse_common/;

sub process_log_line {
    my ( $self, $process, $line ) = @_;

    # Parse the syslog envelope; errors raised by the parser are trapped.
    my $sys_rec = eval { $self->{syslog_parser}->parse( $line ) };
    if ( $@ ) {
        $process->error( $@, $line );
        return;
    } elsif ( $sys_rec->{process} ne 'httpd' ) {
        $process->ignore_log_line( $line, "not an httpd record" );
        return;
    } else {
        # Try the message as a Common Log Format request first.
        my $common_dlf = {};
        eval { parse_common( $sys_rec->{content}, $common_dlf ) };
        if ( $@ ) {
            # Not a request: treat it as a server error message.
            $sys_rec->{message} = $sys_rec->{content};
            $process->write_dlf( "syslog", $sys_rec );
        } else {
            $process->write_dlf( "www", $common_dlf );
        }
    }

    return;
}




   The first thing that should be noted is that in the line-oriented
   conversion mode, the method process_log_line() will be called once for
   each line in the log file.

   Secondly, the actual parsing of the line is done using two routines:
   the parse_common function and Lire::Syslog's parse method. They simply
   use regular expressions to extract the appropriate information from
   the line and put it in a hash reference. What is important is that
   they already use the schema's field names as key names.

   Finally, there are four different methods available on the $process
   object to report different kinds of information. Our example uses the
   first three:

   Reporting Error
          The example uses the eval statement to trap errors during the
          syslog record parsing. If the line cannot be parsed as a valid
          syslog record, it is an error and it is reported through the
          error() method. The first parameter is the error message and
          the second one is the line to which the error is associated.
          This last parameter is optional.

   Ignoring Information
          When the syslog event doesn't come from the httpd process, we
          ignore the line. Ignored lines are reported to the framework by
          using the ignore_log_line() method. The first parameter is the
          line which is ignored. The second optional parameter gives the
          reason why the line was ignored.

   Creating DLF Records
          Finally, DLF records are created by using the write_dlf()
          method. Its first parameter is the schema to which the DLF
          record complies. This schema must be one that is listed by your
          converter's schemas() method. The second parameter is the DLF
          data contained in a hash reference. The DLF record will be
          created by taking, for each field in the schema, the value
          stored under the same name in the hash. (In the syslog schema,
          the field which contains the actual log message is called
          message; this is why we assign the content value to the
          message key.) Missing fields or fields whose value is undef
          will contain the special LR_NA missing value marker. Keys in
          the hash that don't map to a schema field are simply ignored.

          In our example, we distinguish between the server's error
          message (mapped to the syslog schema) and the request
          information (mapped to the www schema) based on whether
          parse_common succeeded in parsing the line.

   Saving Log Line
          Another possibility, not shown in our example, is to ask that
          the line be saved for later processing. This is mostly of use
          to converters which maintain state between lines. In such
          cases, it often happens that related lines are missing from
          the end of the log file. You can then save the lines, and they
          will automatically be seen by the next run of your converter
          on the same DLF store. This option is only available in the
          line-oriented mode of conversion.
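
   The reporting methods described above can be illustrated with a small
   stand-in object, which is also handy for exercising a converter
   outside the framework. This is only a sketch under stated
   assumptions: MockProcess is a hypothetical class, not the real
   Lire::DlfConverterProcess, and the field lists are assumed subsets of
   the syslog and www schemas. Its write_dlf() mimics the field mapping
   described under "Creating DLF Records".

```perl
use strict;
use warnings;

# Stand-in for Lire::DlfConverterProcess, for illustration only. It
# records the calls a converter makes instead of writing to a DLF store.
package MockProcess;

# Assumed subsets of the real schemas' field lists.
my %FIELDS = (
    syslog => [ qw/timestamp hostname process message/ ],
    www    => [ qw/time client_host requested_page http_result/ ],
);

sub new { return bless { dlf => [], errors => [], ignored => [] }, shift }

sub error {
    my ( $self, $msg, $line ) = @_;
    push @{ $self->{errors} }, [ $msg, $line ];
    return;
}

sub ignore_log_line {
    my ( $self, $line, $reason ) = @_;
    push @{ $self->{ignored} }, [ $line, $reason ];
    return;
}

# Keep only the schema's fields; absent or undef values become LR_NA,
# and keys that are not schema fields are dropped.
sub write_dlf {
    my ( $self, $schema, $dlf ) = @_;

    my $fields = $FIELDS{$schema}
        or die "schema '$schema' is not declared\n";
    my %record;
    foreach my $field ( @$fields ) {
        $record{$field} = defined $dlf->{$field} ? $dlf->{$field} : 'LR_NA';
    }
    push @{ $self->{dlf} }, [ $schema, \%record ];
    return;
}

package main;

my $process = MockProcess->new();
$process->write_dlf( "syslog",
    { process => 'httpd', message => 'File not found', content => 'x' } );
$process->ignore_log_line( "some line", "not an httpd record" );

my ( $schema, $record ) = @{ $process->{dlf}[0] };
foreach my $field ( sort keys %$record ) {
    print "$schema.$field = $record->{$field}\n";
}
```

   Passing such an object as the $process argument of process_log_line()
   lets you inspect what your converter reported for a given line.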

File-Oriented Conversion

   The same principles apply when you are using the file-oriented mode of
   conversion. This mode will usually be used for binary log formats or
   formats which aren't line-oriented, like XML.

   For demonstration purposes, the following code could be added to
   transform our line-oriented converter into a file-oriented one:

sub handle_log_lines {
    return 0;
}

sub process_log_file {
    my ( $self, $process, $fh ) = @_;

    my $line;
    while ( defined( $line = <$fh> ) ) {
        chomp $line;
        $self->process_log_line( $process, $line );
    }
}




   The difference between the above code and using the line-oriented
   mode is that the framework won't be aware of the number of log lines
   processed, and your converter might have trouble processing log
   files which use a different line-ending convention than the host you
   are running on. The bottom line is that you should use the
   line-oriented conversion mode when your log format is line-oriented.
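
   If you nevertheless read lines yourself, the line-ending problem can
   be reduced by stripping the terminator explicitly instead of relying
   on chomp. The following is a minimal sketch: each_record() is a
   hypothetical helper, not part of Lire, and it only handles LF and CRLF
   endings.

```perl
use strict;
use warnings;

# Reads records from $fh and hands each one to $callback with its line
# terminator removed, whether the file uses LF or CRLF endings. chomp
# alone only removes the current value of $/ (normally "\n").
sub each_record {
    my ( $fh, $callback ) = @_;

    my $count = 0;
    while ( defined( my $line = <$fh> ) ) {
        $line =~ s/\r?\n\z//;
        $callback->( $line );
        $count++;
    }
    return $count;
}

# In-memory file handle with mixed line endings, for demonstration.
my $data = "first\r\nsecond\nthird";
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my @lines;
my $n = each_record( $fh, sub { push @lines, $_[0] } );
close $fh;
print "$n records: @lines\n";
```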

Registering Your DLF Converter with the Lire Framework

   We said earlier that DLF converters are perl objects which implement
   the Lire::DlfConverter interface. What we did was write a class which
   implements that interface. Creating the object from that class is
   the responsibility of the DLF converter registration script. This is
   simply a snippet of perl code which instantiates your object and
   registers it with the Lire::PluginManager:

use Lire::PluginManager;
use MyConverters::SyslogCommonConverter;

Lire::PluginManager->register_plugin(
            MyConverters::SyslogCommonConverter->new() );



   That's all there is to it, really. You put this snippet in a file
   named syslog_common_init in one of the directories listed in the
   plugins_init_path configuration variable.

Note

   Some other notes on this topic:
    1. The file can actually be named anything you want; a name of the
       form service_init just makes the purpose of the file clear.
    2. The initial value of plugins_init_path contains the
       directories sysconfdir/lire/plugins and HOME/.lire/plugins. You
       can change this list by using the lire tool.
    3. Your registration script can create and register more than one
       object.

   You can now generate a www report for log files in that format using
   the command lr_log2report common_syslog < file.log.

DLF Converter API

   The complete DLF Converter API documentation is included in POD format
   in the Lire distribution. It is usually formatted as man pages. You
   can always read it using the perldoc command.

   The documentation of the following packages should be consulted:
   Lire::DlfConverter(3), Lire::DlfConverterProcess(3) and
   Lire::PluginManager(3).

Chapter 3. Writing a DLF Schema

   Table of Contents

   Designing the ftpproto schema

        Creating The Schema File
        Adding the Schema's Description
        Defining the Schema's Fields
        Installing The Schema

   If you want to develop a DLF converter for an application whose
   logging data model isn't adequately represented by one of the existing
   DLF schemas, you'll need to develop a new one.

   If you are familiar with SQL, a DLF schema is similar to a table
   schema description. A DLF file can be seen as a table, where each log
   record is represented by a table row. Each log record in the same DLF
   schema shares the same fields.

Designing the ftpproto schema

   In this chapter, we will create a new schema for logging FTP
   sessions. That DLF schema could serve for an improved DLF converter
   for log files generated by Microsoft Internet Information Server(TM).
   Lire currently has a DLF converter for these log files, but the
   current ftp DLF schema is modelled after the xferlog log file, which
   only represents file transfers, whereas the log generated by Microsoft
   Internet Information Server(TM) contains more detailed information on
   the FTP session.

   Here is an example of such a log file:
#Software: Microsoft Internet Information Server 4.0
#Version: 1.0
#Date: 2001-11-29 00:01:32
#Fields: time c-ip cs-method cs-uri-stem sc-status
00:01:32 10.0.0.1 [56]created spacedat/091001092951LGW_Data.zip 226
00:01:32 10.0.0.1 [56]created spacedat/html/bx01g01.gif 226
00:01:32 10.0.0.1 [56]created spacedat/html/catlogo.gif 226
00:01:32 10.0.0.1 [56]QUIT - 226
00:03:32 10.0.0.1 [58]USER badm 331
00:03:32 10.0.0.1 [58]PASS - 230


   As you can see, this log file contains other information beyond the
   simple upload/download represented in the standard FTP schema. It
   contains a session identifier, the command executed, as well as the
   result code of the action. Our new schema should be able to represent
   these things.

Creating The Schema File

   To create a DLF schema, you have to create an XML file named after
   your schema identifier: ftpproto.xml. The schema identifier should be
   made of alphanumeric characters and is case sensitive. It shouldn't
   contain hyphens (-) or underscore characters (_). (The hyphen is used
   for a special purpose.)

   All DLF schemas start and end the same way:

<?xml version="1.0" encoding="ascii"?>
<!DOCTYPE lire:dlf-schema PUBLIC
  "-//LogReport.ORG//DTD Lire DLF Schema Markup Language V1.1//EN"
  "http://www.logreport.org/LDSML/1.1/ldsml.dtd">
<lire:dlf-schema xmlns:lire="http://www.logreport.org/LDSML/"
              superservice="ftpproto"
              timestamp="time">
<!-- Other elements will go here -->
</lire:dlf-schema>



   The first lines contain the usual XML and DOCTYPE declarations which
   you'll find in many XML documents. The real stuff starts at the
   lire:dlf-schema element. What is important for your schema are the
   values of the superservice and timestamp attributes. The first one
   contains your schema identifier. It is called "superservice" for
   historical reasons. The other one should contain the name of the
   field by which the records are ordered by event time. (See the section
   called "The Field Types" for more information.)

   The last line in the above excerpt will be the last thing in the file;
   it closes the lire:dlf-schema element.

Adding the Schema's Description

   The next things that go into the schema file are the schema's title
   and description. Both are intended for developers to read and should
   describe the scope of the schema:

 <!-- Starting lire:dlf-schema element was omitted -->

  <lire:title>DLF Schema for FTP Protocol</lire:title>

  <lire:description>
    <para>This DLF schema should be used for FTP servers that have
          detailed information on the FTP connection in their log
          files.
    </para>
    <para>Each record represents a command done by the client during
     the FTP session.
    </para>
  </lire:description>




   The content of the lire:description element is made of DocBook
   elements. If you don't know DocBook, you just need to know that
   paragraphs are delimited using para elements.

Defining the Schema's Fields

   The only remaining things in the schema definitions are the field
   specifications. Here is the definition of the first one:

  <lire:field name="time" type="timestamp" label="Timestamp">
    <lire:description>
      <para>This field contains the timestamp at which the command was
              issued.
      </para>
    </lire:description>
  </lire:field>



   As you can see, the fields are defined using the lire:field element
   which has three attributes:

   name
          This attribute contains the name of the field. This name should
          contain only alphanumeric characters. It can also make use of
          the underscore character.

   type
          This attribute contains the type of the field. The available
          types will be described shortly.

   label
          This should contain the column label that should be used by
          default in your report for data coming from this field. This
          label should be short but descriptive.

   The field's description is held in the lire:description element which
   contains DocBook markup. The field's description should be descriptive
   enough so that someone implementing a DLF converter for this schema
   knows what goes where.

The Field Types

   The main types available for fields are:

   timestamp
          This should be used for fields which contain a value indicating
          a particular point in time. All timestamp values are
          represented in the usual UNIX convention: number of seconds
          since January 1st 1970.

          Each DLF schema must contain at least one field of this kind
          and its name should be in the lire:dlf-schema's timestamp
          attribute.

   hostname
          This type should be used for fields which contain a hostname
          or IP address.

          It is important to mark such fields, because it will eventually
          be possible to resolve IP addresses to hostnames automatically.

   bool
          Type for boolean values.

   number
          Type for numeric values.

Important

          You shouldn't use this type when the values are limited in
          number and are semantically related to an enumeration, like
          result codes. You should use the string type for this.

          You should only use the number type for values which you'll
          want to report in classes instead of as individual values.

   bytes
          This type should be used for numeric values which are
          quantities in bytes. The more specific typing is useful for
          display purposes.

   duration
          This type should be used for numeric values which are
          quantities of time. The more specific typing is useful for
          display purposes.

   string
          This is the type which can be used for all other purposes.

Note

   If you read the specifications, you'll find other types which are
   used. These additional types don't bring anything over the basic ones
   defined above and you shouldn't use them.

   In addition to the time field defined above, here are the remaining
   field definitions which make up our complete ftpproto schema:

  <lire:field name="sessid" type="string" label="Session">
    <lire:description>
     <para>This field should contain an identifier that can be used
     to relate the commands done in the same FTP session. This
     identifier can be reused, but shouldn't be while the FTP session
     isn't closed.
     </para>
    </lire:description>
  </lire:field>

  <lire:field name="command" type="string" label="Command">
    <lire:description>
     <para>This field contains the FTP command executed. The FTP
      protocol command names (STOR, RETR, APPE, USER, etc.) should be used.
     </para>
    </lire:description>
  </lire:field>

  <lire:field name="result" type="string" label="Result">
    <lire:description>
     <para>This should contain the FTP result code after executing
     the command.
     </para>
    </lire:description>
  </lire:field>

  <lire:field name="cmd_args" type="string" label="Argument">
    <lire:description>
     <para>This field should contain the parameters to the FTP command.
     </para>
    </lire:description>
  </lire:field>

  <lire:field name="size" type="bytes" label="Bytes Transferred">
    <lire:description>
     <para>When the command involves a transfer, like for the RETR or STOR
      command, this field should contain the number of bytes transferred.
     </para>
    </lire:description>
  </lire:field>

  <lire:field name="elapsed" type="duration" label="Elapsed">
    <lire:description>
     <para>This field contains the number of seconds the command took
           to execute.
     </para>
    </lire:description>
  </lire:field>
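
   To make the field definitions concrete, here is a minimal sketch of
   how a converter might turn the data lines of the sample log shown
   earlier into ftpproto records. It is not taken from Lire:
   parse_ftp_line() and the %COMMAND_MAP table are hypothetical, the
   mapping of the "created" keyword to STOR is an assumption, and times
   are assumed to be in UTC. The sample log carries no transfer size or
   duration, so the size and elapsed keys are left out and those fields
   would end up as missing values.

```perl
use strict;
use warnings;
use Time::Local qw/timegm/;

# Assumption: the "created" keyword seen in the sample log stands for
# an upload, which the schema asks us to record as the STOR command.
my %COMMAND_MAP = ( created => 'STOR' );

# Parses one data line of the sample log. $date is the YYYY-MM-DD value
# taken from the "#Date:" header. Returns a hash reference keyed by
# ftpproto field names, or nothing when the line isn't a data line.
sub parse_ftp_line {
    my ( $date, $line ) = @_;

    my ( $time, $ip, $method, $arg, $status ) = split ' ', $line;
    return unless defined $status;
    my ( $sessid, $command ) = $method =~ /\A\[(\d+)\](\S+)\z/
        or return;
    $command = $COMMAND_MAP{$command} if exists $COMMAND_MAP{$command};

    my ( $year, $mon, $mday ) = split /-/, $date;
    my ( $hour, $min, $sec )  = split /:/, $time;

    # The schema defines no field for the client address, so $ip is
    # not stored.
    return {
        time     => timegm( $sec, $min, $hour, $mday, $mon - 1, $year ),
        sessid   => $sessid,
        command  => $command,
        cmd_args => ( $arg eq '-' ? undef : $arg ),
        result   => $status,
    };
}

my $dlf = parse_ftp_line( "2001-11-29",
    "00:03:32 10.0.0.1 [58]USER badm 331" );
foreach my $field ( sort keys %$dlf ) {
    my $value = defined $dlf->{$field} ? $dlf->{$field} : 'undef';
    print "$field=$value\n";
}
```

   Splitting on whitespace is enough for the sample lines, but it would
   break on file names containing spaces.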



Installing The Schema

   Making the new schema available to the Lire framework is pretty easy:
   just copy the file to one of the directories set in the
   lr_schemas_path configuration variable. By default, this variable
   contains the directories datadir/lire/schemas and HOME/.lire/schemas.
   Like all other configuration variables, its value can be changed using
   the lire tool.

   Since we want our schema to be available for other users as well, we
   will install it in the system directory:

# install -m 644 ftpproto.xml /usr/local/share/lire/schemas



   (In this case, Lire was installed under /usr/local.)

Chapter 4. Writing a New DLF Analyser

   Table of Contents

   Writing a Categoriser

        Defining The Extended Schema 
        Defining the Categoriser
        Categoriser Configuration
        Categoriser Implementation

   Writing an Analyser
   DLF Analyser API

   In Lire, a DLF Analyser is a plugin that can extract or derive data
   from other DLF data. The idea is that these analyses do not depend on
   the underlying log format; they can be done simply by using the
   data normalised in the DLF schema.

   For example, an analyser could assign a category based on the URL that
   was visited (like assigning the 'Public' or 'Private' category). This
   categorising operation doesn't depend on the log format but only on
   the presence of the requested_page field in the schema. This would be
   an example of a special kind of analyser, a Lire DLF Categoriser. This
   is a simpler analyser that can create new fields based on one DLF
   record.

Note

   The doc/examples directory in the source distribution contains the
   complete code for this categoriser.

   There is a more generic kind of analyser that creates data in another
   DLF stream based on arbitrary queries on the source DLF schema. An
   example of this kind is an analyser that constructs session summaries
   from the www requests. It reads the DLF records of the www DLF schema
   and creates www-user_session DLF records from them.

   Writing an analyser is similar to writing a DLF converter, so consult
   Chapter 2, Writing a New DLF Converter for the details concerning
   registration and using configuration.

Writing a Categoriser

   The simplest form of analyser is the categoriser. In this section, we
   will show an example of how to write a categoriser that uses regular
   expressions to assign a category to each www requested page.

Defining The Extended Schema

   A categoriser writes DLF in an extended schema. An extended schema is
   an extension of a base schema. If you are familiar with SQL, you can
   see it as an inner join with the main schema. That is, each record in
   the main schema will have the extension fields of the extended schema.

   In our case, our extended schema is very simple: it only adds one
   category field to the www schema.

   Defining an extended schema is identical to writing a DLF schema, with
   the exception that we use a different top-level element. You should
   consult Chapter 3, Writing a DLF Schema for all the details. Here is
   the extended schema that our categoriser will use:

<?xml version="1.0"?>
<!DOCTYPE lire:extended-schema PUBLIC
  "-//LogReport.ORG//DTD Lire DLF Schema Markup Language V1.1//EN"
  "http://www.logreport.org/LDSML/1.1/ldsml.dtd">
<lire:extended-schema id="www-category" base-schema="www"
 xmlns:lire="http://www.logreport.org/LDSML/">

 <lire:title>Category Extended Schema for WWW service</lire:title>

 <lire:description>
  <para>This is an extended schema for the WWW service which adds a
    category field based on the regexp matched by the requested_page.
  </para>
 </lire:description>

 <lire:field name="category" type="string" label="Category">
  <lire:description>
   <para>This field contains the page category.</para>
  </lire:description>
 </lire:field>
</lire:extended-schema>



   The difference with a regular DLF schema is that it starts with the
   extended-schema tag which has a base-schema attribute which should
   contain the DLF schema or derived DLF schema that is extended.

Defining the Categoriser

   Like a DLF converter, the categoriser is an object deriving from a
   base class which defines the categoriser interface. In the
   categoriser's case, that interface is Lire::DlfCategoriser. The
   categoriser also has to provide some meta-information to the
   framework. Here is the code for all of this:

package MyAnalysers::PageCategoriser;

use base qw/Lire::DlfCategoriser/;

sub new {
    return bless {}, shift;
}

sub name {
    return 'page-categoriser';
}

sub title {
    return "A page categoriser";
}

sub description {
    return "<para>A categoriser that assigns categories based on a map
    of regular expressions to categories.</para>";
}

sub src_schema {
    return "www";
}

sub dst_schema {
    return "www-category";
}




   The methods which differ from the DLF converter case are src_schema,
   which specifies the schema to which fields are added, and dst_schema,
   which gives the schema specifying the fields that will be added.

Categoriser Configuration

   Our categoriser will assign categories based on a mapping from regular
   expressions to category names. To be useful, this mapping should be
   configurable. Like all plugins in Lire, DLF categorisers can use the
   Lire Configuration Specification Markup Language to define the
   configuration data they use (see Chapter 8, The Lire Report
   Configuration Specification Markup Language for the full details). The
   convention is that if there is a parameter named yourname_properties,
   it is considered the configuration specification for the plugin
   yourname. This means that a little button will appear in the lire
   user interface so that the user can configure your plugin data.

   In our categoriser's case, we will define a list of records which will
   enable the user to define many pairs of regular expression and
   category name:

<?xml version="1.0"?>
<!DOCTYPE lrcsml:config-spec PUBLIC
  "-//LogReport.ORG//DTD Lire Report Configuration Specification Markup Language V1.0//EN"
  "http://www.logreport.org/LRCSML/1.1/lrcsml.dtd">
<lrcsml:config-spec xmlns:lrcsml="http://www.logreport.org/LRCSML/"
                    xmlns:lrcml="http://www.logreport.org/LRCML/">

 <lrcsml:list name="page-categoriser_properties">
  <lrcsml:summary>Page Categoriser Configuration</lrcsml:summary>

  <lrcsml:description>
   <para>This is a list of regexp that will be apply in this order
    along the category that should be applied when the regexp match.
   </para>
  </lrcsml:description>

  <lrcsml:record name="regex2category">
   <lrcsml:summary>The Regexp-Category Association</lrcsml:summary>
   <lrcsml:string name="regex">
    <lrcsml:summary>Regex</lrcsml:summary>
    <lrcsml:description>
     <para>The regular expression to test.</para>
    </lrcsml:description>
   </lrcsml:string>

   <lrcsml:string name="category">
    <lrcsml:summary>Category</lrcsml:summary>
    <lrcsml:description>
     <para>This field contains the category that should be assigned.</para>
    </lrcsml:description>
   </lrcsml:string>
  </lrcsml:record>
 </lrcsml:list>
 <lrcml:param name="page-categoriser_properties">
  <lrcml:param name="regex2category">
   <lrcml:param name="regex">.*</lrcml:param>
   <lrcml:param name="category">Unknown</lrcml:param>
  </lrcml:param>
 </lrcml:param>
</lrcsml:config-spec>



   This specification also sets a default list containing one catch-all
   regex with the category 'Unknown'. The user could add other values
   before that. An alternative implementation could define a field
   specifying the default category to assign when no regular expression
   matches.

Categoriser Implementation

   Two methods are needed to implement the categoriser. The first is an
   initialisation method called initialise. This method receives the
   configuration data entered by the user as its parameter.

   In our case, we will compile the regular expressions for faster
   processing later on:

sub initialise {
    my ( $self, $config ) = @_;

    foreach my $map ( @$config ) {
        $map->[0] = qr/$map->[0]/;
    }

    $self->{'categories'} = $config;
    return;
}



   The categorising is done in the categorise method. This method
   receives as parameter the DLF record to which the extended fields
   should be added. This DLF record is a hash reference containing one
   key for each of the fields defined in the source DLF schema. We simply
   assign the extended fields by adding new keys to the hash reference:

sub categorise {
    my ( $self, $dlf ) = @_;

    foreach my $map ( @{$self->{'categories'}} ) {
        if ( $dlf->{'requested_page'} =~ /$map->[0]/ ) {
            $dlf->{'category'} = $map->[1];
            return;
        }
    }
    return;
}
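
   The two methods can be tried outside the framework. The following is
   a stand-alone sketch under stated assumptions: DemoCategoriser is a
   hypothetical class without the Lire base class, and the configuration
   is assumed to arrive as an ordered list of [ regex, category ] pairs,
   as the array indices used above imply. Unlike the code above, it
   compiles a private copy of the configuration.

```perl
use strict;
use warnings;

# Stand-alone copy of the two methods, without the Lire base class, so
# that the matching logic can be exercised outside the framework.
package DemoCategoriser;

sub new { return bless {}, shift }

sub initialise {
    my ( $self, $config ) = @_;

    # Compile a private copy so the caller's data is left untouched.
    my @compiled;
    foreach my $map ( @$config ) {
        push @compiled, [ qr/$map->[0]/, $map->[1] ];
    }
    $self->{categories} = \@compiled;
    return;
}

sub categorise {
    my ( $self, $dlf ) = @_;

    # The first matching expression wins, so order matters.
    foreach my $map ( @{ $self->{categories} } ) {
        if ( $dlf->{requested_page} =~ $map->[0] ) {
            $dlf->{category} = $map->[1];
            return;
        }
    }
    return;
}

package main;

my $categoriser = DemoCategoriser->new();
$categoriser->initialise( [
    [ '^/private/', 'Private' ],
    [ '.*',         'Unknown' ],    # catch-all entry comes last
] );

foreach my $page ( '/private/report.html', '/index.html' ) {
    my $dlf = { requested_page => $page };
    $categoriser->categorise( $dlf );
    print "$page => $dlf->{category}\n";
}
```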



   That's all. As for the DLF converter, you'll need to register this
   analyser with the Lire::PluginManager (see the section called
   "Registering Your DLF Converter with the Lire Framework" for more
   information).

Writing an Analyser

   When a categoriser isn't sufficient for your needs, you can write a
   Lire::DlfAnalyser, which gets complete control over the analysis
   process. The main difference with a categoriser is that the dst_schema
   method will refer to a derived schema instead of an extended schema.

   The core of the analyser is the analyse method. It takes a reference
   to the store whose data will be analysed and a
   Lire::DlfAnalyserProcess callback object which should be used to write
   new DLF records and report errors. The method also receives the plugin
   configuration data. The analyser should create a Lire::DlfQuery to
   select the records necessary for its analysis.

   The doc/examples directory in the source distribution contains a
   boilerplate for writing an analyser.
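
   To illustrate how an analyser differs from a categoriser, here is a
   plain perl sketch of the kind of computation a session analyser
   performs: many source records go in, and a smaller number of new
   records come out. It does not use the Lire API. build_sessions() is a
   hypothetical function, the time and client_host keys are assumed to
   be present in the source records, and the output field names and the
   idle timeout are invented for the example.

```perl
use strict;
use warnings;

# Groups request records into per-client sessions. A new session starts
# when a client has been idle for more than $timeout seconds.
# Assumptions: records are hash references carrying 'time' (epoch
# seconds) and 'client_host' keys; the output field names are invented
# for this sketch and do not come from a real derived schema.
sub build_sessions {
    my ( $records, $timeout ) = @_;

    my ( %open, @sessions );
    foreach my $rec ( sort { $a->{time} <=> $b->{time} } @$records ) {
        my $host    = $rec->{client_host};
        my $session = $open{$host};
        if ( !$session
             || $rec->{time} - $session->{session_end} > $timeout )
        {
            $session = {
                client_host   => $host,
                session_start => $rec->{time},
                session_end   => $rec->{time},
                req_count     => 0,
            };
            $open{$host} = $session;
            push @sessions, $session;
        }
        $session->{session_end} = $rec->{time};
        $session->{req_count}++;
    }
    return \@sessions;
}

my $sessions = build_sessions( [
    { time => 0,    client_host => '10.0.0.1' },
    { time => 60,   client_host => '10.0.0.1' },
    { time => 30,   client_host => '10.0.0.2' },
    { time => 4000, client_host => '10.0.0.1' },
], 1800 );

foreach my $s ( @$sessions ) {
    print "$s->{client_host}: $s->{req_count} request(s) ",
        "from $s->{session_start} to $s->{session_end}\n";
}
```

   In a real analyser, the source records would come from a query on the
   store, and each session would be written as a DLF record through the
   process callback object.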

DLF Analyser API

   The complete DLF Analyser API documentation is included in POD format
   in the Lire distribution. It is usually formatted as man pages. You
   can always read it using the perldoc command.

   The documentation of the following packages should be consulted:
   Lire::DlfAnalyser(3), Lire::DlfAnalyserProcess(3),
   Lire::DlfCategoriser(3), Lire::DlfQuery(3) and Lire::PluginManager(3).

Chapter 5. Writing a New Report

   Table of Contents

   Filter Specification

   Writing a new report involves writing a report specification, e.g.
   /service/<superservice>/reports/top-foo-by-bar.xml, and adding this
   report along with possible configuration parameters to <service>.cfg.
   E.g., to create a new report, based upon email/from-domain.xml: copy
   the file /usr/local/etc/lire/email.cfg to ~/.lire/etc/email.cfg. Copy
   the file /usr/local/share/lire/reports/email/top-from-domain.xml to
   e.g. ~/.lire/reports/reports/email/from-domain.xml. Edit the last file
   to your needs, and enable it by listing it in your
   ~/.lire/etc/email.cfg.

   Beware! The name of the report generally consists of alphanumerics and
   '-', but the name of parameters may not contain any '-' characters. It
   generally consists of alphanumerics and '_' characters.

Filter Specification

   For now, you'll have to refer to the example filters as found in the
   current report specification files. We'll give one other example here:
   specifying a time range.

   Suppose you want to be able to report on only a specific time range.
   You could build a (possibly global and reused) filter like:

      <lire:filter-spec>
        <lire:and>
          <lire:ge arg1="$timestamp" arg2="$period_start"/>
          <lire:le arg1="$timestamp" arg2="$period_end"/>
        </lire:and>
      </lire:filter-spec>
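
   In perl terms, the intended time range test amounts to the following.
   This is only an illustration of the filter's logic, with a
   hypothetical in_period() function; it is not how Lire evaluates
   filter specifications.

```perl
use strict;
use warnings;

# Perl equivalent of the intended filter: keep a record when its
# timestamp lies within [ $start, $end ], both bounds included.
sub in_period {
    my ( $timestamp, $start, $end ) = @_;

    return ( $timestamp >= $start && $timestamp <= $end ) ? 1 : 0;
}

my ( $start, $end ) = ( 1_000_000_000, 1_000_086_400 );
foreach my $timestamp ( 999_999_999, 1_000_000_000, 1_000_086_401 ) {
    printf "%d %s\n", $timestamp,
        in_period( $timestamp, $start, $end ) ? "kept" : "filtered out";
}
```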



   To try out your new filter, install it as
   ~/.lire/filters/your-filter-name.xml. When lr_dlf2xml looks up a
   filter mentioned in the report configuration file, it looks first in
   ~/.lire/filters/, and then in .../share/lire/filters/.

Developer's Reference

   Table of Contents

   6. Lire Data Types

        Lire Textual Elements

              title element
              DocBook Elements
              description element

   7. Common Textual Elements to All XML Formats 

        Lire Data Types Parameter Entities

              Boolean Type
              Integer Type
              Number Type
              String Type
              Timestamp type
              Time Type
              Date Type
              Duration Type
              IP Type
              Port Type
              Hostname Type
              URL Type
              Email Type
              Bytes Type
              Filename Type
              Field Type
              Superservice Type
              Related Types

   8. The Lire Report Configuration Specification Markup Language

        The Lire Report Configuration Specification Markup Language

              config-spec element
              summary element
              Parameter Specifications Elements

   9. The Lire Report Configuration Markup Language

        The Lire Report Configuration Markup Language

              config element
              global element
              param element

   10. The Lire DLF Schema Markup Language

        The Lire DLF Schema Markup Language

              The dlf-schema element
              extended-schema element
              derived-schema element
              field element

   11. The Lire Report Specification Markup Language

        The Lire Report Specification Markup Language

              report-spec element
              global-filter-spec element
              display-spec element
              param-spec element
              param element
              chart-configs element
              Filter expression elements
              Report Calculation Elements

   12. The Lire Report Markup Language

        The Report Markup Language

              report element
              Meta-information elements
              section element
              subreport element
              missing-subreport element
              table element
              table-info element
              group-info element
              column-info element
              group-summary element
              group element
              entry element
              name element
              value element
              chart-configs element

Chapter 6. Lire Data Types

   Table of Contents

   Lire Textual Elements

        title element
        DocBook Elements
        description element

Lire Textual Elements

   This DTD module defines the elements that contain human-readable
   content in all the Lire DTDs.

   This module also imports some DocBook XML V4.1.2 elements for richer
   semantic tagging.

   This module is also namespace aware and will honor the setting of
   LIRE.pfx to scope its elements.

   The latest version of this module is 2.0 and its public identifier is
   -//LogReport.ORG//ELEMENTS Lire Textual Elements V2.0//EN.

<!--
    Make sure LIRE.pfx is defined. This declaration will be
    ignored if it was already defined.
                                                                   -->
<!ENTITY % LIRE.pfx          "lire:"                                 >

<!ENTITY % LIRE.title        "%LIRE.pfx;title"                       >
<!ENTITY % LIRE.description  "%LIRE.pfx;description"                 >


title element

   The title element contains a descriptive title.

   This element represents a title in Lire. It can be used to give a
   title to a report specification or to specify the title of a report
   or subreport.

   The content of this element should be localized.

   This element doesn't have any attributes.

<!ELEMENT %LIRE.title; (#PCDATA)                                     >



DocBook Elements

   The standard DocBook para, formalpara and admonition elements (note,
   tip, warning, important and caution) may be used, along with their
   content.


<!ENTITY % docbook-block.mix
           "para|formalpara|warning|tip|important|caution|note">

<!ENTITY % DocBookDTD PUBLIC
  "-//OASIS//DTD DocBook XML V4.1.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.1.2/docbookx.dtd">
%DocBookDTD;



description element

   The description element is used to describe an element. It can be used
   to describe DLF fields, describe a report specification or include
   descriptions in the generated reports.

   This element can contain one or more of the block-level DocBook
   elements we use.

   The content of this element should be localized.

   This element doesn't have any attributes.

<!ELEMENT %LIRE.description; (%docbook-block.mix;)+>
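
   As a minimal sketch of how the title and description elements might
   appear together, assuming the default lire: prefix set by LIRE.pfx
   (the title and paragraph text are invented for illustration only):

```xml
<!-- Title: plain character data only -->
<lire:title>Top Sender Domains</lire:title>

<!-- Description: one or more DocBook block-level elements -->
<lire:description>
  <para>Lists the domains that sent the most messages.</para>
  <note>
    <para>Only delivered messages are counted.</para>
  </note>
</lire:description>
```

   Note that the DocBook elements inside description are written without
   the Lire prefix, since they come from the imported DocBook DTD.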



Chapter 7. Common Textual Elements to All XML Formats

   Table of Contents

   Lire Data Types Parameter Entities

        Boolean Type
        Integer Type
        Number Type
        String Type
        Timestamp type
        Time Type
        Date Type
        Duration Type
        IP Type
        Port Type
        Hostname Type
        URL Type
        Email Type
        Bytes Type
        Filename Type
        Field Type
        Superservice Type
        Related Types

Lire Data Types Parameter Entities

   This module contains the parameter entity declarations for the data
   types used by all Lire DTDs.

   All defined data types have a <type>.type parameter entity which
   defines their type as an XML type valid in an attribute declaration
   and a <type>.name parameter entity that declares their name.

   Additionally, this module declares <name>.types parameter entities
   that group related types together.

   The latest version of this module is 1.0 and its public identifier is
   -//LogReport.ORG//ENTITIES Lire Data Types V1.0//EN.

Boolean Type

   The bool type. It contains a boolean value, one of 0, 1, f, t, false,
   true, yes or no.

<!ENTITY % bool.type         "0 | 1 | f | t | false | true | yes | no">
<!ENTITY % bool.name         "bool"                                  >



Integer Type

   The int type can contain a positive or negative 32-bit integer.

<!ENTITY % int.type          "CDATA"                                 >
<!ENTITY % int.name          "int"                                   >



Number Type

   The number type can contain any number, either integral or floating
   point.

<!ENTITY % number.type       "CDATA"                                 >
<!ENTITY % number.name       "number"                                >



String Type

   The string type contains any displayable text string.

<!ENTITY % string.type       "CDATA"                                 >
<!ENTITY % string.name       "string"                                >



Timestamp type

   The timestamp type contains a time representation which holds both
   date and time information. It can be represented in UNIX epoch time.

<!ENTITY % timestamp.type    "CDATA"                                 >
<!ENTITY % timestamp.name    "timestamp"                             >



Time Type

   The time type contains a time representation which contains only the
   time of the day, not the date. For example, this data type can
   represent 12h00, 15:13:10, etc.

<!ENTITY % time.type         "CDATA"                                 >
<!ENTITY % time.name         "time"                                  >



Date Type

   The date type contains a time representation which contains only a
   date.

<!ENTITY % date.type         "CDATA"                                 >
<!ENTITY % date.name         "date"                                  >



Duration Type

   The duration type contains a quantity of time. For example: 5s, 30h,
   2days, 3w, 2M, 1y. (The authoritative list of supported duration types
   is coded in Lire::DataTypes::duration2sec.)

<!ENTITY % duration.type     "CDATA"                                 >
<!ENTITY % duration.name     "duration"                              >



IP Type

   The ip type contains an IPv4 address.

<!ENTITY % ip.type           "CDATA"                                 >
<!ENTITY % ip.name           "ip"                                    >



Port Type

   The port type contains a port as used in TCP to name the ends of
   logical connections. See also RFC 1700 and
   http://www.iana.org/numbers.htm. Port assignments are commonly listed
   in /etc/services on Unix systems.

<!ENTITY % port.type         "CDATA"                                 >
<!ENTITY % port.name         "port"                                  >



Hostname Type

   The hostname type contains a DNS hostname. (It can also contain the
   IPv4 address of the host.)

<!ENTITY % hostname.type     "NMTOKEN"                               >
<!ENTITY % hostname.name     "hostname"                              >



URL Type

   The url type represents a URL.

<!ENTITY % url.type          "CDATA"                                 >
<!ENTITY % url.name          "url"                                   >



Email Type

   The email type can be used to represent an email address.

<!ENTITY % email.type        "CDATA"                                 >
<!ENTITY % email.name        "email"                                 >



Bytes Type

   The bytes type can be used to represent a quantity of data (5m, 1.2g,
   300bytes, etc.).

<!ENTITY % bytes.type        "CDATA"                                 >
<!ENTITY % bytes.name        "bytes"                                 >



Filename Type

   The filename type can be used to represent the name of a file or
   directory.

<!ENTITY % filename.type     "CDATA"                                 >
<!ENTITY % filename.name     "filename"                              >



Field Type

Important

   This type should be considered internal to Lire and shouldn't be used
   as a parameter or DLF field type.

   The field type can contain a DLF field name. It is used in parameter
   specifications, for example to represent a choice of sort field.

<!ENTITY % field.type        "NMTOKEN"                               >
<!ENTITY % field.name        "field"                                 >



Superservice Type

Important

   This type should be considered internal to Lire and shouldn't be used
   as a parameter or DLF field type.

<!ENTITY % superservice.type "NMTOKEN"                               >
<!ENTITY % superservice.name "superservice"                          >



Related Types


<!ENTITY % basic.types       "%bool.name; | %int.name; |
                              %number.name; | %string.name;"         >
<!ENTITY % internet.types    "%email.name; | %url.name; |
                              %ip.name; | %hostname.name; |
                              %port.name;"                           >
<!ENTITY % misc.types        "%filename.name; | %bytes.name; "       >
<!ENTITY % time.types        "%date.name; | %time.name; |
                              %timestamp.name; | %duration.name;"    >

<!ENTITY % lire.types        "%basic.types; | %time.types; |
                              %internet.types; | %misc.types;"       >
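
   As a hedged illustration of how a DTD that includes this module might
   use these entities in attribute declarations: the example-field
   element and its attribute names below are hypothetical and are not
   part of any Lire DTD.

```xml
<!-- Hypothetical element, for illustration only. -->
<!ELEMENT example-field EMPTY>
<!ATTLIST example-field
          name     NMTOKEN          #REQUIRED
          type     (%lire.types;)   #REQUIRED
          default  %string.type;    #IMPLIED
          visible  (%bool.type;)    'true'>
```

   The <name>.types and bool.type entities expand to lists of
   alternatives, so they are wrapped in parentheses to form an
   enumerated attribute type. Entities such as string.type already
   expand to a complete attribute type (CDATA) and are used bare.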



Chapter 8. The Lire Report Configuration Specification Markup Language

   Table of Contents

   The Lire Report Configuration Specification Markup Language

        config-spec element
        summary element
        Parameter Specifications Elements

The Lire Report Configuration Specification Markup Language

   Document Type Definition for the Lire Report Configuration
   Specification Markup Language.

   This DTD defines a grammar that is used to specify the configuration
   parameters used by the Lire framework. Besides the framework
   parameters, this DTD can be used by extension writers to register
   their parameters with the framework. The configuration specifications
   are usually stored in prefix/share/lire/config-spec.

   Currently, Lire's configuration namespace is flat, which means that
   two different specification documents cannot define parameters with
   the same name.

   Elements of this DTD use the http://www.logreport.org/LRCSML/
   namespace, which is usually mapped to the lrcsml prefix.

   The latest version of this DTD is 1.1 and its public identifier is
   -//LogReport.ORG//DTD Lire Report Specification Markup Language
   V1.1//EN. Its canonical system identifier is
   http://www.logreport.org/LRCSML/1.1/lrcsml.dtd.

<!--
                                                                   -->

<!--                    Namespace prefix for validation using the
                        DTD                                        -->
<!ENTITY % LRCSML.xmlns.pfx    "lrcsml"                                >
<!ENTITY % LRCSML.pfx          "%LRCSML.xmlns.pfx;:"                   >
<!ENTITY % LRCSML.xmlns.attr.name "xmlns:%LRCSML.xmlns.pfx;"           >
<!ENTITY % LRCSML.xmlns.attr
  "%LRCSML.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRCSML/'">

<!ENTITY % LRCML.xmlns.pfx    "lrcml"                                 >
<!ENTITY % LRCML.pfx          "%LRCML.xmlns.pfx;:"                    >
<!ENTITY % LRCML.xmlns.attr.name "xmlns:%LRCML.xmlns.pfx;">
<!ENTITY % LRCML.xmlns.attr
  "%LRCML.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRCML/'">

<!-- For the modules which we are including                         -->
<!ENTITY % LIRE.pfx           "%LRCSML.pfx;"                          >



   This DTD uses the common lire-desc.mod module which is used to include
   a subset of DocBook in description and text elements.


<!ENTITY % lire-desc.mod PUBLIC
    "-//LogReport.ORG//ELEMENTS Lire Description Elements V2.0//EN"
    "lire-desc.mod">
%lire-desc.mod;



   Each configuration specification is an XML document which has one
   config-spec as its root element.

<!ENTITY % LRCSML.config-spec     "%LRCSML.pfx;config-spec"              >
<!ENTITY % LRCSML.summary         "%LRCSML.pfx;summary"                  >
<!ENTITY % LRCSML.boolean         "%LRCSML.pfx;boolean"                  >
<!ENTITY % LRCSML.integer         "%LRCSML.pfx;integer"                  >
<!ENTITY % LRCSML.string          "%LRCSML.pfx;string"                   >
<!ENTITY % LRCSML.dlf-schema      "%LRCSML.pfx;dlf-schema"               >
<!ENTITY % LRCSML.dlf-streams     "%LRCSML.pfx;dlf-streams"              >
<!ENTITY % LRCSML.dlf-converter   "%LRCSML.pfx;dlf-converter"            >
<!ENTITY % LRCSML.command         "%LRCSML.pfx;command"                  >
<!ENTITY % LRCSML.file            "%LRCSML.pfx;file"                     >
<!ENTITY % LRCSML.executable      "%LRCSML.pfx;executable"               >
<!ENTITY % LRCSML.directory       "%LRCSML.pfx;directory"                >
<!ENTITY % LRCSML.select          "%LRCSML.pfx;select"                   >
<!ENTITY % LRCSML.option          "%LRCSML.pfx;option"                   >
<!ENTITY % LRCSML.list            "%LRCSML.pfx;list"                     >
<!ENTITY % LRCSML.object          "%LRCSML.pfx;object"                   >
<!ENTITY % LRCSML.output-format   "%LRCSML.pfx;output-format"            >
<!ENTITY % LRCSML.plugin          "%LRCSML.pfx;plugin"                   >
<!ENTITY % LRCSML.record          "%LRCSML.pfx;record"                   >
<!ENTITY % LRCSML.reference       "%LRCSML.pfx;reference"                >
<!ENTITY % LRCSML.report-config   "%LRCSML.pfx;report-config"            >

<!ENTITY % LRCML.param            "%LRCML.pfx;param"                     >

<!ENTITY % LRCSML.summary         "%LRCSML.pfx;summary"                  >
<!ENTITY % types-spec           "%LRCSML.boolean;|%LRCSML.integer;|
                                 %LRCSML.string;|%LRCSML.dlf-schema;|
                                 %LRCSML.dlf-converter;|%LRCSML.dlf-streams;|
                                 %LRCSML.command;|%LRCSML.file;|
                                 %LRCSML.executable;|%LRCSML.directory;|
                                 %LRCSML.select;|%LRCSML.list;|%LRCSML.object;|
                                 %LRCSML.output-format;|
                                 %LRCSML.plugin;|%LRCSML.record;|%LRCSML.reference;
                                 |%LRCSML.report-config;
                                ">
<!ENTITY % common.mix           "(%LRCSML.summary;)?,(%LIRE.description;)?">
<!ENTITY % default              "(%LRCML.param;)?"                  >
<!ENTITY % common.mix.default   "(%common.mix;, %default;)"         >

<!ELEMENT %LRCML.param; (#PCDATA|%LRCML.param;)*                       >
<!ATTLIST %LRCML.param;
             name      NMTOKEN                             #REQUIRED
             value     CDATA                               #IMPLIED >



config-spec element

   Root element of a configuration specification document. It contains a
   list of parameter specifications.

   This element doesn't have any attributes.

<!ELEMENT %LRCSML.config-spec; ((%types-spec;)+)                       >
<!ATTLIST %LRCSML.config-spec;
             %LRCSML.xmlns.attr;
             %LRCML.xmlns.attr;                                         >
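
   A minimal sketch of a complete configuration specification document
   that follows the declarations above. The parameter name
   example_hostname, its section and its default value are made up for
   illustration; only the element and attribute names come from this
   DTD.

```xml
<?xml version="1.0"?>
<lrcsml:config-spec xmlns:lrcsml="http://www.logreport.org/LRCSML/"
                    xmlns:lrcml="http://www.logreport.org/LRCML/">
  <!-- One parameter specification; any number may follow -->
  <lrcsml:string name="example_hostname" section="example">
    <lrcsml:summary>Host name shown in report headers</lrcsml:summary>
    <lrcsml:description>
      <para>Name of the host the analysed logs come from.</para>
    </lrcsml:description>
    <!-- Optional default value, expressed as an LRCML param -->
    <lrcml:param name="example_hostname" value="localhost"/>
  </lrcsml:string>
</lrcsml:config-spec>
```

   The summary, description and default param children are all optional,
   but when present they must appear in this order.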



summary element

   This element is used for a short one-line description of the
   parameter's purpose. Use the description element for longer help text.

   This element doesn't have any attributes.

<!ELEMENT %LRCSML.summary;    (#PCDATA)                                >



Parameter Specifications Elements

Common Attributes

   These attributes are common to all parameter specification elements:

   name
          Contains the name of the parameter to which this specification
          applies.

   required
          Determines whether a valid value is required for the container
          to validate. Defaults to true.

   section
          This attribute can be used to set a menu section which can be
          used by configuration frontends to group parameters together.

   summary
          This attribute is equivalent to the summary element.

   obsolete
          This attribute can be used to mark a parameter as obsolete.
          Obsolete parameters will be removed from the specification in a
          future Lire release.


<!ENTITY % common.attr "
        name     NMTOKEN                                   #REQUIRED
        required NMTOKEN                                   '1'
        section  CDATA                                     #IMPLIED
        summary  CDATA                                     #IMPLIED
        obsolete NMTOKEN                                   '0'">
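
   As a small sketch of the common attributes in use, the fragment below
   declares an optional, obsolete boolean parameter. The parameter name,
   section and summary text are invented, and the example assumes that
   0 and 1 are read as false and true, mirroring the default values '1'
   and '0' in the declaration above.

```xml
<!-- Fragment: belongs inside a lrcsml:config-spec root element -->
<lrcsml:boolean name="example_old_flag"
                required="0"
                obsolete="1"
                section="example"
                summary="Deprecated switch kept for compatibility"/>
```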



boolean element

   This element is used to define a boolean parameter which can take a
   yes or no value.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.boolean; (%common.mix.default;)                    >
<!ATTLIST %LRCSML.boolean;
 %common.attr;
                                                                     >



integer element

   This element is used to define an integer parameter.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.integer; (%common.mix.default;)                    >
<!ATTLIST %LRCSML.integer;
 %common.attr;
                                                                     >



string element

   This element is used to define a string parameter. These parameters
   can contain any value.

   This element can have a valid-re attribute which specifies a regular
   expression that the value must match.

<!ELEMENT %LRCSML.string; (%common.mix.default;)                    >
<!ATTLIST %LRCSML.string;
 %common.attr;
 valid-re CDATA                                         #IMPLIED
                                                                     >
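
   A short sketch of a string parameter restricted by valid-re. The
   parameter name and the pattern are illustrative assumptions; the
   pattern is written in the Perl style one would expect from a Perl
   framework, which this section does not state explicitly.

```xml
<!-- Fragment: belongs inside a lrcsml:config-spec root element -->
<lrcsml:string name="example_report_id" valid-re="^[a-z0-9_]+$">
  <lrcsml:summary>Identifier made of lowercase letters, digits and
    underscores</lrcsml:summary>
</lrcsml:string>
```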



dlf-converter element

   This element is used to select a registered DlfConverter.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.dlf-converter; (%common.mix.default;)              >
<!ATTLIST %LRCSML.dlf-converter;
 %common.attr;
                                                                     >



dlf-schema element

   This element is used to select an available DlfSchema.

   If the superservices attribute is set, only superservices can be
   selected.

<!ELEMENT %LRCSML.dlf-schema; (%common.mix.default;)                 >
<!ATTLIST %LRCSML.dlf-schema;
 %common.attr;
 superservices NMTOKEN                                            '0'
                                                                     >



dlf-streams element

   This element is used to configure Lire::DlfStream in Lire::DlfStore.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.dlf-streams; (%common.mix.default;)                >
<!ATTLIST %LRCSML.dlf-streams;
 %common.attr;
                                                                     >



command element

   This element is used to define a command parameter. To be accepted as
   valid, the parameter's value must either point to an executable file
   or name an executable file that exists in one of the directories
   listed in the PATH environment variable.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.command; (%common.mix.default;)                    >
<!ATTLIST %LRCSML.command;
 %common.attr;
                                                                     >



file element

   This element is used to define a file parameter. To be accepted as
   valid, the parameter's value must point to an existing file.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.file; (%common.mix.default;)                       >
<!ATTLIST %LRCSML.file;
 %common.attr;
                                                                     >



directory element

   This element is used to define a directory parameter. To be accepted
   as valid, the parameter's value must point to an existing directory.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.directory; (%common.mix.default;)                  >
<!ATTLIST %LRCSML.directory;
 %common.attr;
                                                                     >



executable element

   This element is used to define an executable parameter. To be accepted
   as valid, the parameter's value must point to an existing executable
   file.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.executable; (%common.mix.default;)                 >
<!ATTLIST %LRCSML.executable;
 %common.attr;
                                                                     >



select element

   This element is used to define a parameter for which the value is
   selected among a set of options. The allowed set of options is
   specified using option elements.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.select;     (%common.mix;,(%LRCSML.option;)+, %default;) >
<!ATTLIST %LRCSML.select;
 %common.attr;
                                                                     >



option element

   This element is used to define the valid values for a select
   parameter.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.option;     (%common.mix;)                           >
<!ATTLIST %LRCSML.option;
 %common.attr;
                                                                     >
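
   The select and option elements above could be combined as in the
   following sketch. The parameter and option names are hypothetical;
   the structure follows the content models shown above (optional
   summary and description, one or more options, optional default).

```xml
<!-- Fragment: belongs inside a lrcsml:config-spec root element -->
<lrcsml:select name="example_detail_level">
  <lrcsml:summary>How much detail to include</lrcsml:summary>
  <lrcsml:option name="brief">
    <lrcsml:summary>Totals only</lrcsml:summary>
  </lrcsml:option>
  <lrcsml:option name="full"/>
  <!-- Default value: should name one of the options above -->
  <lrcml:param name="example_detail_level" value="brief"/>
</lrcsml:select>
```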



list element

   This element is used to define a parameter that can contain an
   ordered set of values. The types of values which can be contained are
   specified using other parameter elements. The defined parameter can
   contain any number of parameters of the types specified by the child
   elements.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.list;       (%common.mix;,(%types-spec;)+,%default;) >
<!ATTLIST %LRCSML.list;
 %common.attr;
                                                                     >
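
   As an illustration, a list whose entries are strings might be
   specified as follows. The names example_hosts and example_host are
   assumptions made for this sketch.

```xml
<!-- Fragment: belongs inside a lrcsml:config-spec root element -->
<lrcsml:list name="example_hosts">
  <lrcsml:summary>Hosts to include in the report</lrcsml:summary>
  <!-- Type of value the list may hold, any number of times -->
  <lrcsml:string name="example_host"/>
</lrcsml:list>
```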



object element

   This element is used to define a parameter that will instantiate an
   object. The object will be instantiated by calling the
   "new_from_config()" class method defined in the package specified by
   the element's class attribute. The constructor will receive the hash
   instantiated from the parameter's components as its parameter.

   The label attribute can be used to specify the contained element that
   should be used to represent this object in lists.

<!ELEMENT %LRCSML.object;       (%common.mix;,(%types-spec;)+,%default;) >
<!ATTLIST %LRCSML.object;
 %common.attr;
        class    NMTOKEN                                    #REQUIRED
        label    NMTOKEN                                    #IMPLIED
                                                                     >
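
   A hedged sketch of an object specification. My::Example::Job is a
   hypothetical package that would have to provide the new_from_config()
   class method described above; the component names are also invented.

```xml
<!-- Fragment: belongs inside a lrcsml:config-spec root element -->
<lrcsml:object name="example_job"
               class="My::Example::Job"
               label="title">
  <lrcsml:summary>A scheduled example job</lrcsml:summary>
  <!-- Components: these become the hash passed to the constructor -->
  <lrcsml:string name="title"/>
  <lrcsml:integer name="max_records" required="0"/>
</lrcsml:object>
```

   Here label="title" names the title component as the one used to
   represent the object when it is shown in a list.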



output-format element

   This element is used to select an available OutputFormat.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.output-format; (%common.mix.default;)              >
<!ATTLIST %LRCSML.output-format;
 %common.attr;
                                                                     >



record element

   This element is used to define a parameter that holds record-like
   data.

   The label attribute can be used to specify the contained element that
   should be used to represent this record in lists.

<!ELEMENT %LRCSML.record;       (%common.mix;,(%types-spec;)+, %default;) >
<!ATTLIST %LRCSML.record;
 %common.attr;
        label    NMTOKEN                                    #IMPLIED
                                                                     >



reference element

   This element is used to select from an index. The index from which
   the available values are taken is specified in the index attribute.

<!ELEMENT %LRCSML.reference; (%common.mix.default;)                   >
<!ATTLIST %LRCSML.reference;
 %common.attr;
 index  CDATA                                             #REQUIRED
                                                                     >



report-config element

   This element is used to define a report configuration.

   This element doesn't have any specific attributes. Each superservice
   can define a default report configuration using this element with a
   name of superservice_default.

<!ELEMENT %LRCSML.report-config; (%common.mix.default;)              >
<!ATTLIST %LRCSML.report-config;
 %common.attr;
                                                                     >



plugin element

   This element is used to define a parameter for which the value is
   selected among a set of options. The allowed set of options is
   specified using option elements. The element will also contain
   additional parameters based on the selected value. The available
   parameters should be defined in a record or similar specification
   named name_properties. For example, the additional parameters for
   the option_1 option will be found in the specification named
   option_1_properties.

   This element doesn't have any specific attributes.

<!ELEMENT %LRCSML.plugin;     (%common.mix;,(%LRCSML.option;)+, %default;) >
<!ATTLIST %LRCSML.plugin;
 %common.attr;
                                                                     >
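
   The naming convention described above can be sketched as follows. The
   plugin name, option names and properties are hypothetical; the point
   is only that the record holding the extra parameters for the smtp
   option is named smtp_properties.

```xml
<!-- Fragment: both elements belong inside a lrcsml:config-spec root -->
<lrcsml:plugin name="example_mailer">
  <lrcsml:summary>How to send finished reports</lrcsml:summary>
  <lrcsml:option name="smtp"/>
  <lrcsml:option name="sendmail"/>
</lrcsml:plugin>

<!-- Extra parameters available when the smtp option is selected -->
<lrcsml:record name="smtp_properties">
  <lrcsml:string name="server"/>
  <lrcsml:integer name="port" required="0"/>
</lrcsml:record>
```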



Chapter 9. The Lire Report Configuration Markup Language

   Table of Contents

   The Lire Report Configuration Markup Language

        config element
        global element
        param element

The Lire Report Configuration Markup Language

   Document Type Definition for the Lire Report Configuration Markup
   Language.

   This DTD defines a grammar that is used to store the Lire
   configuration. The configuration is stored in one or more XML files.
   Parameters set in later configuration files override those set in
   earlier ones. The valid parameter names as well as their description
   and type are specified using configuration specification documents.

   Elements of this DTD use the http://www.logreport.org/LRCML/
   namespace, which is usually mapped to the lrcml prefix.

   The latest version of the DTD is 1.0 and its public identifier is
   -//LogReport.ORG//DTD Lire Report Specification Markup Language
   V1.0//EN. Its canonical system identifier is
   http://www.logreport.org/LRCML/1.0/lrcml.dtd.

<!--
                                                                   -->

<!--                    Namespace prefix for validation using the
                        DTD                                        -->
<!ENTITY % LRCML.xmlns.pfx    "lrcml"                                 >
<!ENTITY % xmlns.colon       ":"                                     >
<!ENTITY % LRCML.pfx          "%LRCML.xmlns.pfx;%xmlns.colon;"         >
<!ENTITY % LRCML.xmlns.attr.name "xmlns%xmlns.colon;%LRCML.xmlns.pfx;" >
<!ENTITY % LRCML.xmlns.attr
  "%LRCML.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRCML/'">

<!-- For the module which we are including                          -->
<!ENTITY % LIRE.pfx           "%LRCML.pfx;"                          >



   Each configuration file is an XML document which has one config as
   its root element.

<!ENTITY % LRCML.config          "%LRCML.pfx;config"                   >
<!ENTITY % LRCML.global          "%LRCML.pfx;global"                   >
<!ENTITY % LRCML.param           "%LRCML.pfx;param"                    >



config element

   Root element of a configuration document. It currently contains only
   one global element, which is used to hold the global configuration
   parameters.

   This element doesn't have any attributes.

<!ELEMENT %LRCML.config; (%LRCML.global;)                              >
<!ATTLIST %LRCML.config;
             %LRCML.xmlns.attr;                                       >



global element

   This element starts the global configuration data. (This is the only
   scope currently defined). It contains a list of param elements.

<!ELEMENT %LRCML.global; (%LRCML.param;)+                              >



param element

   This element contains the parameter's value. The parameter's name is
   defined in the name attribute.

   The value attribute can be used to store scalar values.

   When the parameter's type is a list, the values are stored in children
   param elements.

Warning

   This element has a mixed content type. We should probably use a value
   attribute to hold scalar values.

<!ELEMENT %LRCML.param; (#PCDATA|%LRCML.param;)*                       >
<!ATTLIST %LRCML.param;
             name      NMTOKEN                             #REQUIRED
             value     CDATA                               #IMPLIED >
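
   A minimal sketch of a configuration document that conforms to the
   declarations above, showing one scalar parameter and one list
   parameter. All parameter names and values are invented for
   illustration.

```xml
<?xml version="1.0"?>
<lrcml:config xmlns:lrcml="http://www.logreport.org/LRCML/">
  <lrcml:global>
    <!-- Scalar value held in the value attribute -->
    <lrcml:param name="example_hostname" value="localhost"/>

    <!-- List value: each entry is a child param element -->
    <lrcml:param name="example_hosts">
      <lrcml:param name="example_host" value="mail.example.org"/>
      <lrcml:param name="example_host" value="www.example.org"/>
    </lrcml:param>
  </lrcml:global>
</lrcml:config>
```

   Because param has mixed content, a scalar could also be written as
   character data inside the element instead of in the value attribute.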



Chapter 10. The Lire DLF Schema Markup Language

   Table of Contents

   The Lire DLF Schema Markup Language

        The dlf-schema element
        extended-schema element
        derived-schema element
        field element

The Lire DLF Schema Markup Language

   The Lire DLF Schema Markup Language (LDSML) is used to describe the
   fields used by DLF records of a specific schema like www, email or
   msgstore.

   DLF schemas are defined in one XML document that should be installed
   in one of the directories included in the schema path (usually
   HOME/.lire/schemas and prefix/share/lire/schemas). This document must
   conform to the LDSML DTD which is described here. Elements of that
   DTD are defined in the namespace http://www.logreport.org/LDSML/ which
   will usually be mapped to the lire prefix (although other prefixes may
   be used).

   The latest version of that DTD is 1.1 and its public identifier is
   -//LogReport.ORG//DTD Lire DLF Schema Markup Language V1.1//EN.
   Its canonical system identifier is
   http://www.logreport.org/LDSML/1.1/ldsml.dtd.

<!--                    Namespace prefix for validation using the
                        DTD                                        -->
<!ENTITY % LIRE.xmlns.pfx    "lire"                                  >
<!ENTITY % LIRE.pfx          "%LIRE.xmlns.pfx;:"                     >
<!ENTITY % LIRE.xmlns.attr.name "xmlns:%LIRE.xmlns.pfx;"             >
<!ENTITY % LIRE.xmlns.attr
  "%LIRE.xmlns.attr.name; CDATA #FIXED
                        'http://www.logreport.org/LDSML/'">



   This DTD uses the common modules lire-types.mod which defines the data
   types recognized by Lire and lire-desc.mod which is used to include a
   subset of DocBook in description and text elements.


<!ENTITY % lire-types.mod PUBLIC
    "-//LogReport.ORG//ENTITIES Lire Data Types V1.0//EN"
    "lire-types.mod">
%lire-types.mod;

<!ENTITY % lire-desc.mod PUBLIC
    "-//LogReport.ORG//ELEMENTS Lire Description Elements V2.0//EN"
    "lire-desc.mod">
%lire-desc.mod;



   The top-level element in XML documents describing a DLF schema will be
   either a dlf-schema, extended-schema or derived-schema depending on
   the schema's type. A DLF schema is used as the base schema for one
   superservice. For example, the DLF schema of the www superservice is
   named www. An extended schema is used to define additional fields
   whose values are to be computed by an analyser.

   Extended schemas are named after the schema which they extend. For
   example, the www-attack extended schema adds an attack field which
   contains, if any, the "attack" that was attempted in that request.

   Derived schemas are used by another type of analyser, which defines an
   entirely different schema. Whereas in the extended schema the new
   fields will be added to all the DLF records of the base schema, the
   derived schema will create new DLF records based on the DLF records of
   the base schema. An example of this is the www-session schema, which
   computes users' session information based on the web requests
   contained in the www schema. As with extended schemas, derived
   schemas are named after the base schema from which they are derived.

   The fields that make up each schema are defined using field elements.

<!-- Prefixed names declaration.                                   -->
<!ENTITY % LIRE.dlf-schema   "%LIRE.pfx;dlf-schema"                  >
<!ENTITY % LIRE.extended-schema "%LIRE.pfx;extended-schema"          >
<!ENTITY % LIRE.derived-schema  "%LIRE.pfx;derived-schema"           >
<!ENTITY % LIRE.field        "%LIRE.pfx;field"                       >



The dlf-schema element

   The dlf-schema element is used to define the base schema of a
   superservice. It contains optional title and description elements
   followed by field elements describing the schema structure.

   The title is an optional text string that will be used in the
   documentation that can be generated automatically from the schema
   definition. The description element should describe what is
   represented by each DLF record (one web request, one email delivery,
   one firewall event, etc.).

   dlf-schema's attributes

   superservice
          This required attribute contains the name of the superservice
          described by this schema. This will also be used as the base
          schema's identifier.

   timestamp
          This required attribute contains the name of the field which
          contains the event's official timestamp. This field will be
          used to sort the DLF records for timegroup and timeslot report
          operations.



<!ELEMENT %LIRE.dlf-schema;  ( (%LIRE.title;)?, (%LIRE.description;)?,
                               (%LIRE.field;)+ )                     >
<!ATTLIST %LIRE.dlf-schema;
             superservice    %superservice.type;           #REQUIRED
             timestamp       IDREF                         #REQUIRED
             %LIRE.xmlns.attr;                                       >
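
   A minimal base schema document could look like the sketch below. The
   example superservice and its field names are hypothetical, the type
   names are those mentioned elsewhere in this manual, and the use of a
   DocBook para element inside description is an assumption about the
   subset provided by lire-desc.mod.

```xml
<?xml version="1.0"?>
<!DOCTYPE lire:dlf-schema PUBLIC
  "-//LogReport.ORG//DTD Lire DLF Schema Markup Language V1.1//EN"
  "http://www.logreport.org/LDSML/1.1/ldsml.dtd">
<!-- Hypothetical base schema for an imaginary "example" superservice. -->
<lire:dlf-schema superservice="example" timestamp="time"
                 xmlns:lire="http://www.logreport.org/LDSML/">
  <lire:title>DLF schema for the example superservice</lire:title>
  <lire:description>
    <para>Each DLF record represents one request handled by the
    service.</para>
  </lire:description>

  <!-- The timestamp attribute above refers to this field by name. -->
  <lire:field name="time" type="timestamp"/>
  <lire:field name="elapsed" type="duration"/>
  <lire:field name="size" type="bytes"/>
  <lire:field name="status" type="int"/>
</lire:dlf-schema>
```

   Because timestamp is declared as an IDREF and the name of a field as
   an ID, a validating parser will reject a schema whose timestamp
   attribute does not name one of its own fields.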




extended-schema element

   This is the root element of an extended DLF Schema. Extended-schema
   defines additional fields that will be added to the base schema. It
   contains an optional title, an optional description and one or more
   field specifications.

   extended-schema's attributes

   id
          This required attribute contains the identifier of that schema.
          This identifier should be composed of the superservice's name
          followed by a hyphen (-) and then a word describing the
          extended schema.

   base-schema
          This required attribute contains the identifier of the schema
          that is extended.

   required-fields
          This optional attribute contains a space-delimited list of
          field names that must be available in the base schema for the
          analyser to do its job. If any of the listed fields is missing
          in the DLF, extended fields for the base schema cannot be
          computed.

   module
          This required attribute contains the name of the analyser that
          is used to compute the extended fields. This is a Perl module
          that should be installed in Perl's library path.



<!ELEMENT %LIRE.extended-schema;
                             ( (%LIRE.title;)?, (%LIRE.description;)?,
                               (%LIRE.field;)+ )                     >
<!ATTLIST %LIRE.extended-schema;
             id              NMTOKEN                       #REQUIRED
             base-schema     NMTOKEN                       #REQUIRED
             module          NMTOKEN                       #REQUIRED
             required-fields NMTOKENS                      #IMPLIED
             %LIRE.xmlns.attr;                                       >
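
   The following sketch shows what an extended schema might look like.
   The schema identifier, the Perl module name and the field names are
   all hypothetical and only illustrate how the attributes fit together.

```xml
<!-- Hypothetical extended schema adding one computed field to the
     "example" base schema. -->
<lire:extended-schema id="example-category" base-schema="example"
                      module="MyAnalysers::ExampleCategory"
                      required-fields="status size"
                      xmlns:lire="http://www.logreport.org/LDSML/">
  <lire:title>Category extension for the example schema</lire:title>
  <lire:description>
    <para>Adds a category computed from the status and size of each
    record.</para>
  </lire:description>
  <lire:field name="category" type="int"/>
</lire:extended-schema>
```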



derived-schema element

   This is the root element of a derived DLF Schema. The difference
   between a normal schema and a derived schema is that the data is
   generated from another DLF instead of a log file.

   derived-schema's attributes

   id
          This required attribute contains the identifier of that schema.
          This identifier should be composed of the superservice's name
          followed by a hyphen (-) and then a word describing the
          derived schema.

   base-schema
          This required attribute contains the identifier of the schema
          from which this derived schema's data is derived.

   required-fields
          This optional attribute contains a space-delimited list of
          field names that must be available in the base schema for the
          analyser to do its job. If any of the listed fields is missing
          in the DLF, the derived records cannot be computed.

   module
          This required attribute contains the name of the analyser that
          is used to compute the derived records. This is a Perl module
          that should be installed in Perl's library path.

   timestamp
          This required attribute contains the name of the field which
          contains the event's official timestamp. This field will be
          used to sort the DLF records for timegroup and timeslot report
          operations.


<!ELEMENT %LIRE.derived-schema;
                             ( (%LIRE.title;)?, (%LIRE.description;)?,
                               (%LIRE.field;)+ )                     >
<!ATTLIST %LIRE.derived-schema;
             id              NMTOKEN                       #REQUIRED
             base-schema     NMTOKEN                       #REQUIRED
             module          NMTOKEN                       #REQUIRED
             required-fields NMTOKENS                      #IMPLIED
             timestamp       IDREF                         #REQUIRED
             %LIRE.xmlns.attr;                                       >
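
   A derived schema differs from an extended schema mainly in that it
   declares its own timestamp field. The sketch below uses a hypothetical
   identifier, module name and field names.

```xml
<!-- Hypothetical derived schema: one record per session, computed
     from the records of the "example" base schema. -->
<lire:derived-schema id="example-session" base-schema="example"
                     module="MyAnalysers::ExampleSession"
                     required-fields="time elapsed"
                     timestamp="session_start"
                     xmlns:lire="http://www.logreport.org/LDSML/">
  <lire:title>Sessions derived from the example schema</lire:title>
  <lire:field name="session_start" type="timestamp"/>
  <lire:field name="session_length" type="duration"/>
  <lire:field name="request_count" type="int"/>
</lire:derived-schema>
```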



field element

   The field element is used to describe the fields of the schema. Each
   field is specified by its name and type. The field element may contain
   an optional description element which gives more information on the
   data contained in the field. The description should tell DLF converter
   implementors what should appear in that field.

   field's attributes

   name
          This required attribute contains the name of the field.

   type
          This required attribute contains the field's type.

   default

Warning

          This attribute is obsolete and will be removed in a future Lire
          release.

   label
          This optional attribute gives the label that should be used to
          display this field in reports. Defaults to the field's name
          when omitted.


<!ELEMENT %LIRE.field;       (%LIRE.description;)?                   >
<!ATTLIST %LIRE.field;
             name            ID                            #REQUIRED
             type            (%lire.types;)                #REQUIRED
             default         CDATA                         #IMPLIED
             label           CDATA                         #IMPLIED  >
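
   For instance, a field carrying both a label and a description might
   be declared as in this sketch. The field name, label and wording are
   hypothetical, and the para element is assumed to be part of the
   DocBook subset allowed in description.

```xml
<lire:field name="size" type="bytes" label="Transferred">
  <lire:description>
    <para>Number of bytes sent to the client, excluding protocol
    headers.</para>
  </lire:description>
</lire:field>
```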





Chapter 11. The Lire Report Specification Markup Language

   Table of Contents

   The Lire Report Specification Markup Language

        report-spec element
        global-filter-spec element
        display-spec element
        param-spec element
        param element
        chart-configs element
        Filter expression elements
        Report Calculation Elements

The Lire Report Specification Markup Language

   Document Type Definition for the Lire Report Specification Markup
   Language.

   This DTD defines a grammar that is used to specify reports that can be
   generated by Lire. Elements of this DTD use the
   http://www.logreport.org/LRSML/ namespace, which is usually mapped to
   the lire prefix.

   The latest version of that DTD is 2.1 and its public identifier is
   -//LogReport.ORG//DTD Lire Report Specification Markup Language
   V2.1//EN. Its canonical system identifier is
   http://www.logreport.org/LRSML/2.1/lrsml.dtd.

<!--
                                                                   -->

<!--                    Namespace prefix for validation using the
                        DTD                                        -->
<!ENTITY % LIRE.xmlns.pfx    "lire"                                  >
<!ENTITY % LIRE.pfx          "%LIRE.xmlns.pfx;:"                     >
<!ENTITY % LIRE.xmlns.attr.name "xmlns:%LIRE.xmlns.pfx;"             >
<!ENTITY % LIRE.xmlns.attr
  "%LIRE.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRSML/'">

<!ENTITY % LRCML.xmlns.pfx    "lrcml"                                 >
<!ENTITY % LRCML.pfx          "%LRCML.xmlns.pfx;:"                    >
<!ENTITY % LRCML.xmlns.attr.name "xmlns:%LRCML.xmlns.pfx;">
<!ENTITY % LRCML.xmlns.attr
  "%LRCML.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRCML/'">




   This DTD uses the common modules lire-types.mod which defines the data
   types recognized by Lire and lire-desc.mod which is used to include a
   subset of DocBook in description and text elements.


<!ENTITY % lire-types.mod PUBLIC
    "-//LogReport.ORG//ENTITIES Lire Data Types V1.0//EN"
    "lire-types.mod">
%lire-types.mod;

<!ENTITY % lire-desc.mod PUBLIC
    "-//LogReport.ORG//ELEMENTS Lire Description Elements V2.0//EN"
    "lire-desc.mod">
%lire-desc.mod;



   Each report specification is an XML document which has one report-spec
   as its root element. This DTD can also be used for filter
   specifications, which have one global-filter-spec as root element.

<!ENTITY % LIRE.report-spec     "%LIRE.pfx;report-spec"              >
<!ENTITY % LIRE.global-filter-spec "%LIRE.pfx;global-filter-spec">
<!ENTITY % LIRE.display-spec    "%LIRE.pfx;display-spec"             >
<!ENTITY % LIRE.param-spec      "%LIRE.pfx;param-spec"               >
<!ENTITY % LIRE.param           "%LIRE.pfx;param"                    >
<!ENTITY % LIRE.chart-configs   "%LIRE.pfx;chart-configs"            >
<!ENTITY % LRCML.param          "%LRCML.pfx;param"                   >
<!ENTITY % LIRE.filter-spec     "%LIRE.pfx;filter-spec"              >
<!ENTITY % LIRE.report-calc-spec "%LIRE.pfx;report-calc-spec"        >

<!ELEMENT %LRCML.param; (#PCDATA|%LRCML.param;)*                     >
<!ATTLIST %LRCML.param;
             name      NMTOKEN                             #REQUIRED
             value     CDATA                               #IMPLIED  >


report-spec element

   Root element of a report specification. It contains descriptive
   elements about the report specification (title, description). It
   contains the display elements that will be in the generated report
   (display-spec).

   It contains specification for the parameters that can be used to
   customize the report generated from this specification (param-spec).
   Finally, it contains elements to specify a filter expression which can
   be used to select a subset of the records (filter-spec) and the
   expression to build the report (report-calc-spec).

   report-spec's attributes

   superservice
          The name of the superservice for which this report is
          available (email, www, dns, etc.).

   schema
          The DLF schema used by the report. This defaults to the
          superservice's schema, but can be one of its derived or
          extended schemas.

   joined-schemas
          A whitespace-delimited list of additional schemas that will be
          joined for this report. This will make all fields defined in
          these schemas available to the operators. The schemas that can
          be joined depend on the specification's schema.

   id
          A unique identifier for the report specification.


<!ELEMENT %LIRE.report-spec;
                        (%LIRE.title;, %LIRE.description;,
                         (%LIRE.param-spec;)?, %LIRE.display-spec;,
                        (%LIRE.filter-spec;)?, (%LIRE.chart-configs;)?,
                        %LIRE.report-calc-spec;)
                                                                     >
<!ATTLIST %LIRE.report-spec;
             id             ID                              #REQUIRED
             superservice   %superservice.type;             #REQUIRED
             schema         NMTOKEN                         #IMPLIED
             joined-schemas NMTOKENS                        #IMPLIED
             %LIRE.xmlns.attr;
             %LRCML.xmlns.attr;                                       >
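
   Put together, a small report specification could look like the sketch
   below. The identifier, superservice and field names are hypothetical.
   The attributes used on the count element are an assumption, since its
   declaration is not shown in this section; the name attribute is given
   because names are required on aggregate functions.

```xml
<?xml version="1.0"?>
<!DOCTYPE lire:report-spec PUBLIC
  "-//LogReport.ORG//DTD Lire Report Specification Markup Language V2.1//EN"
  "http://www.logreport.org/LRSML/2.1/lrsml.dtd">
<lire:report-spec id="example-requests-by-status" superservice="example"
                  xmlns:lire="http://www.logreport.org/LRSML/"
                  xmlns:lrcml="http://www.logreport.org/LRCML/">
  <lire:title>Requests by status</lire:title>
  <lire:description>
    <para>Counts the requests seen for each status value.</para>
  </lire:description>

  <!-- Title shown in the generated report. -->
  <lire:display-spec>
    <lire:title>Requests by Status</lire:title>
  </lire:display-spec>

  <!-- No filter-spec: all records are used. -->
  <lire:report-calc-spec>
    <lire:group sort="-requests">
      <lire:field name="status"/>
      <lire:count name="requests"/>
    </lire:group>
  </lire:report-calc-spec>
</lire:report-spec>
```

   The children appear in the order required by the content model:
   title, description, the optional param-spec, display-spec, the
   optional filter-spec and chart-configs, then report-calc-spec.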



global-filter-spec element

   Root element of a filter specification. It contains descriptive
   elements about the filter specification (title, description). It
   contains the display elements that will be used when that filter is
   used in a generated report (display-spec). It contains specification
   for the parameters that can be used to customize the filter generated
   from this specification (param-spec). Finally, it contains an
   element to specify the filter expression which can be used to select
   a subset of the records (filter-spec).

   global-filter-spec's attributes

   superservice
          The name of the superservice for which this filter is
          available (email, www, dns, etc.).

   schema
          The DLF schema used by the filter. This defaults to the
          superservice's schema, but can be one of its derived or
          extended schemas.

   joined-schemas
          A whitespace-delimited list of additional schemas that will be
          joined for this filter. This will make all fields defined in
          these schemas available to the operators. The schemas that can
          be joined depend on the specification's schema.

   id
          A unique identifier for the filter specification.


<!ELEMENT %LIRE.global-filter-spec;
                        (%LIRE.title;, %LIRE.description;,
                         (%LIRE.param-spec;)?, %LIRE.display-spec;,
                        (%LIRE.filter-spec;))
                                                                     >

<!ATTLIST %LIRE.global-filter-spec;
             id             ID                              #REQUIRED
             superservice   %superservice.type;             #REQUIRED
             schema         NMTOKEN                         #IMPLIED
             joined-schemas NMTOKENS                        #IMPLIED
             %LIRE.xmlns.attr;                                       >



display-spec element

   This element contains the descriptive elements that will appear in the
   generated report.

   It contains one title and may contain one description, which will be
   used as a help message.

   This element has no attributes.

<!ELEMENT %LIRE.display-spec; (%LIRE.title;, (%LIRE.description;)?)  >



param-spec element

   This element contains the parameters that can be customized in this
   report specification.

   This element has no attributes.

<!ELEMENT %LIRE.param-spec; (%LIRE.param;)+                          >



param element

   This element contains the specification for a parameter that can be
   used to customize this report.

   This element can contain a description element which can be used to
   explain the parameter's purpose.

   It is an error to define a parameter with the same name as one of
   the superservice's fields.

   param's attributes

   name
          the name of the parameter.

   type
          the parameter's data type

   default
          the parameter's default value


<!ELEMENT %LIRE.param; (%LIRE.description;)?                         >
<!ATTLIST  %LIRE.param;
             name       ID                                 #REQUIRED
             type       (%lire.types;)                     #REQUIRED
             default    CDATA                              #IMPLIED  >
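
   As a sketch, a param-spec declaring a single integer parameter might
   read as follows. The parameter name and its description are
   hypothetical.

```xml
<lire:param-spec>
  <lire:param name="rows_to_show" type="int" default="10">
    <lire:description>
      <para>Maximum number of rows to include in the report.</para>
    </lire:description>
  </lire:param>
</lire:param-spec>
```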



chart-configs element

   This element contains one or more chart configurations that should be
   copied to the generated subreport. These chart configurations are
   specified using the Lire Report Configuration Markup Language.

   This element has no attributes.

<!ELEMENT %LIRE.chart-configs; (%LRCML.param;)+                      >



Filter expression elements


<!ENTITY % LIRE.eq      "%LIRE.pfx;eq"                               >
<!ENTITY % LIRE.ne      "%LIRE.pfx;ne"                               >
<!ENTITY % LIRE.gt      "%LIRE.pfx;gt"                               >
<!ENTITY % LIRE.ge      "%LIRE.pfx;ge"                               >
<!ENTITY % LIRE.lt      "%LIRE.pfx;lt"                               >
<!ENTITY % LIRE.le      "%LIRE.pfx;le"                               >
<!ENTITY % LIRE.and     "%LIRE.pfx;and"                              >
<!ENTITY % LIRE.or      "%LIRE.pfx;or"                               >
<!ENTITY % LIRE.not     "%LIRE.pfx;not"                              >
<!ENTITY % LIRE.match   "%LIRE.pfx;match"                            >
<!ENTITY % LIRE.value   "%LIRE.pfx;value"                            >

<!ENTITY % expr "%LIRE.eq; | %LIRE.ne; |
                 %LIRE.gt; | %LIRE.lt; | %LIRE.ge; | %LIRE.le; |
                 %LIRE.and; | %LIRE.or; | %LIRE.not; |
                 %LIRE.match; | %LIRE.value;"                        >


filter-spec element

   This element is used to select the subset of the records that will be
   used to generate the report. If this element is missing, all records
   will be used to generate the report.

   The content of this element is an expression element, which defines
   an expression that evaluates to true or false for each record. The
   subset used to generate the report consists of all records for which
   the expression evaluates to true.

   The values used to evaluate the expressions are either literals,
   values of parameters or values of fields of the record. Parameter and
   field references start with a $ followed by the name of the parameter
   or field. All other values are interpreted as literals.

   This element has no attributes.

<!ELEMENT %LIRE.filter-spec; (%expr;)                                >



value element

   This expression element evaluates to false if the 'value' attribute
   is undefined, the empty string or 0. It evaluates to true otherwise.

   value's attributes

   value
          The value that should be evaluated in a boolean context.


<!ELEMENT %LIRE.value;  EMPTY                                        >
<!ATTLIST %LIRE.value;
             value      CDATA                              #REQUIRED >



eq element


<!ELEMENT %LIRE.eq;     EMPTY                                        >
<!ATTLIST %LIRE.eq;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



ne element


<!ELEMENT %LIRE.ne;     EMPTY                                        >
<!ATTLIST %LIRE.ne;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



gt element


<!ELEMENT %LIRE.gt;     EMPTY                                        >
<!ATTLIST %LIRE.gt;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



ge element


<!ELEMENT %LIRE.ge;     EMPTY                                        >
<!ATTLIST %LIRE.ge;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



lt element


<!ELEMENT %LIRE.lt;     EMPTY                                        >
<!ATTLIST %LIRE.lt;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



le element


<!ELEMENT %LIRE.le;     EMPTY                                        >
<!ATTLIST %LIRE.le;
             arg1       CDATA                              #REQUIRED
             arg2       CDATA                              #REQUIRED >



match element

   The match expression element tries to match a POSIX 1003.2 extended
   regular expression against a value. It returns true if there is a
   match and false otherwise.

   match's attributes

   value
          The value which should be matched.

   re
          A POSIX 1003.2 extended regular expression.

   case-sensitive
          Whether the regular expression is case sensitive. Defaults to
          true.


<!ELEMENT %LIRE.match;  EMPTY                                        >
<!ATTLIST %LIRE.match;
             value          CDATA                          #REQUIRED
             re             CDATA                          #REQUIRED
             case-sensitive (%bool.type;)                  'true'    >



not element


<!ELEMENT %LIRE.not;    (%expr;)                                     >



and element


<!ELEMENT %LIRE.and;    (%expr;)+                                    >



or element


<!ELEMENT %LIRE.or;     (%expr;)+                                    >
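
   The expression elements above can be combined as in the following
   sketch of a filter-spec. The field names (status, size,
   requested_page) and the parameter name (min_size) are hypothetical;
   values starting with $ refer to a field or parameter, everything else
   is a literal.

```xml
<lire:filter-spec>
  <!-- Keep a record only if all three sub-expressions are true. -->
  <lire:and>
    <!-- $status is a field of the record, 200 is a literal. -->
    <lire:eq arg1="$status" arg2="200"/>
    <!-- $min_size would be a parameter declared in the param-spec. -->
    <lire:ge arg1="$size" arg2="$min_size"/>
    <!-- Exclude records whose page starts with /images/. -->
    <lire:not>
      <lire:match value="$requested_page" re="^/images/"/>
    </lire:not>
  </lire:and>
</lire:filter-spec>
```

   Since filter-spec and not each accept exactly one expression, several
   conditions have to be wrapped in an and or or element.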



Report Calculation Elements


<!ENTITY % LIRE.timegroup   "%LIRE.pfx;timegroup"                    >
<!ENTITY % LIRE.group       "%LIRE.pfx;group"                        >
<!ENTITY % LIRE.rangegroup  "%LIRE.pfx;rangegroup"                   >
<!ENTITY % LIRE.timeslot    "%LIRE.pfx;timeslot"                     >
<!ENTITY % LIRE.field       "%LIRE.pfx;field"                        >
<!ENTITY % LIRE.sum         "%LIRE.pfx;sum"                          >
<!ENTITY % LIRE.avg         "%LIRE.pfx;avg"                          >
<!ENTITY % LIRE.min         "%LIRE.pfx;min"                          >
<!ENTITY % LIRE.max         "%LIRE.pfx;max"                          >
<!ENTITY % LIRE.first       "%LIRE.pfx;first"                        >
<!ENTITY % LIRE.last        "%LIRE.pfx;last"                         >
<!ENTITY % LIRE.count       "%LIRE.pfx;count"                        >
<!ENTITY % LIRE.records     "%LIRE.pfx;records"                      >

<!-- Empty group operator                                          -->
<!ENTITY % LIRE.empty-ops   "%LIRE.sum; | %LIRE.avg; | %LIRE.count; |
                             %LIRE.min; | %LIRE.max; | %LIRE.first; |
                             %LIRE.last; | %LIRE.records;"           >

<!-- Group operations that are also aggregators                    -->
<!ENTITY % LIRE.nestable-aggr
                            "%LIRE.group; | %LIRE.timegroup; |
                             %LIRE.timeslot; | %LIRE.rangegroup;"    >

<!-- Group operations                                              -->
<!ENTITY % LIRE.group-ops   "%LIRE.empty-ops;| %LIRE.nestable-aggr;" >

<!-- Containers for group operations                               -->
<!ENTITY % LIRE.aggregator  "%LIRE.nestable-aggr;"                   >


report-calc-spec element

   This element describes the computation needed to generate the report.

   It contains one aggregator element.

   This element doesn't have any attributes.

<!ELEMENT %LIRE.report-calc-spec; (%LIRE.aggregator;)                >



Common Attributes

   All elements which will create a column in the resulting report have a
   label attribute that will be used as the column label. When this
   attribute is omitted, the name attribute content will be used as
   column label.

<!ENTITY % label.attr "label CDATA #IMPLIED">



   All operation elements may have a name attribute which can be used to
   reference that column. (It is required in the case of aggregate
   functions.) The primary usage is for controlling the sort order of the
   rows in the generated report.

<!ENTITY % name.attr      "name ID      #IMPLIED">
<!ENTITY % name.attr.req  "name ID      #REQUIRED">



group element

   The group element generates a report where records are grouped by some
   field values and aggregate statistics are computed on those groups of
   records.

   It contains the fields that should be used for grouping and the
   statistics that should be computed.

   The sort order in the report is controlled by the 'sort' attribute.

   group's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute. If omitted a default name will be
          generated.

   sort
          A whitespace-delimited list of field names that should be used
          to sort the records. Field names can be prefixed by - to
          specify reverse sort order; otherwise ascending sort order is
          used. A name can also refer to the name attribute of one of
          the statistics elements.

   limit
          Limits the number of records that will be in the generated
          report. It can be either a positive integer or the name of a
          user-supplied param.


<!ELEMENT %LIRE.group;      ((%LIRE.field;)+, (%LIRE.group-ops;)+)   >
<!ATTLIST %LIRE.group;
             %name.attr;
             sort       NMTOKENS                           #IMPLIED
             limit      CDATA                              #IMPLIED  >
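
   A group that reports the ten most frequent values of a field could be
   sketched as follows. The field and operation names are hypothetical,
   and the attributes of the count element are assumed, as its
   declaration is not part of this section.

```xml
<!-- Sort by descending count, then by status; keep ten rows. -->
<lire:group sort="-total_requests status" limit="10">
  <lire:field name="status"/>
  <lire:count name="total_requests" label="Requests"/>
</lire:group>
```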



timegroup element

   The timegroup element generates a report where records are grouped by
   time range (hour, day, etc.). Statistics are then computed on these
   records grouped by period.

   timegroup's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute. If omitted a default name will be
          generated.

   label
          Sets the column label that will be used for the column
          generated by this element. If omitted, a default label will be
          generated.

   field
          The name of the field which is used to group records. This
          should be a field of one of the time types (timestamp, date,
          time). It defaults to the default timestamp field if
          unspecified.

   period
          The time period over which records should be grouped. Valid
          periods look like hour, day, 1h, 30m, etc. It can also be the
          name of a user-supplied param.


<!ELEMENT %LIRE.timegroup;  (%LIRE.group-ops;)+                      >
<!ATTLIST %LIRE.timegroup;
             %name.attr;
             %label.attr;
             field      NMTOKEN                            #IMPLIED
             period     CDATA                             #REQUIRED  >
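
   As a sketch, a timegroup producing one row per day might be written
   as below. The operation name and labels are hypothetical, the count
   element's attributes are assumed, and the field attribute is omitted
   so that the schema's default timestamp field is used.

```xml
<lire:timegroup period="1d" label="Day">
  <lire:count name="requests_per_day" label="Requests"/>
</lire:timegroup>
```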



timeslot element

   The timeslot element generates a report where records are grouped
   according to a cyclic unit of time. The duration unit used won't fall
   over to the next higher unit. For example, this means that using a
   unit of 1d will generate a report where the stats will be by day of
   the week, 8h will generate a report by third of day, etc. The
   statistics are then computed over the records in the same timeslot.

   Example 11.1. timeslot with 1d unit

   Using a specification like:

<lire:timeslot unit="1d">
  ...
</lire:timeslot>



   would generate a report like:

   Table 11.1. weekly overview
   Sunday   ...
   Monday   ...
   Tuesday  ...
   ...      ...
   Saturday ...

   where data will be summed over all Sundays, Mondays, ..., and
   Saturdays found in the log.

   Example 11.2. timeslot with 2m unit

   Specifying unit="2m" would generate a line for every two months,
   giving a yearly view.

   timeslot's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute. If omitted a default name will be
          generated.

   label
          Sets the column label that will be used for the column
          generated by this element. If omitted, a default label will be
          generated.

   field
          The name of the field which is used to group records. This
          should be a field of one of the time types (timestamp, date,
          time). It defaults to the default 'timestamp' field if
          unspecified.

   unit
          The cyclic unit of time by which the records are aggregated.
          It can be any duration value (hour, day, 1h, 30m, etc.). It
          can also be the name of a user-supplied param.


<!ELEMENT %LIRE.timeslot;  (%LIRE.group-ops;)+                       >
<!ATTLIST %LIRE.timeslot;
             %name.attr;
             %label.attr;
             field      NMTOKEN                            #IMPLIED
             unit       CDATA                             #REQUIRED  >



rangegroup element

   The rangegroup element generates a report where records are grouped
   into distinct classes, each delimited by a range. This element can be
   used to aggregate continuous numeric values like duration or bytes.
   Statistics are then computed on the records grouped in each class.

   rangegroup's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute. If omitted a default name will be
          generated.

   label
          Sets the column label that will be used for the column
          generated by this element. If omitted, a default label will be
          generated.

   field
          The name of the field which is used to group records. This
          should be a field of a continuous numeric type (bytes,
          duration, int, number). Aggregation on time types should use
          the timegroup or timeslot element.

   range-start
          The starting index of the first class. Defaults to 0. This
          won't be used a the lower limit of the class. It is only used
          to specify relatively at which values the classes delimitation
          start. For example, if the range-start is 1, and the range-size
          is 5, a class ranging -4 to 0 will be created if values are in
          that range. It can be supplied in any continuous unit (i.e 10k,
          5m, etc.) This can also be the name of a user supplied param.

   range-size
          The size of each class. It can be supplied in any continuous
          unit (e.g. 10k or 5m). It can also be the name of a
          user-supplied param.

   min-value
          All values lower than this boundary value will be considered
          to be equal to this value. If this parameter isn't set, the
          ranges won't be bounded on the left side.

   max-value
          All values greater than this boundary value will be considered
          to be equal to this value. If this parameter isn't set, the
          ranges won't be bounded on the right side.

   size-scale
          The rate at which the class size scales from one class to the
          next. If it is different from 1, this will create a
          logarithmic distribution. For example, when this is set to 2,
          each successive class will be twice as large as the previous
          one: 0-9, 10-29, 30-69, etc.


<!ELEMENT %LIRE.rangegroup;  (%LIRE.group-ops;)+                     >
<!ATTLIST %LIRE.rangegroup;
             %name.attr;
             %label.attr;
             field          NMTOKEN                       #REQUIRED
             range-start    CDATA                         #IMPLIED
             range-size     CDATA                         #REQUIRED
             min-value      CDATA                         #IMPLIED
             max-value      CDATA                         #IMPLIED
             size-scale     CDATA                         #IMPLIED   >
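
   The way range-start, range-size, size-scale, min-value and max-value
   interact can be sketched in a few lines of Python. This is only an
   illustration of the semantics described above, not Lire's own
   implementation; the function name range_class is hypothetical, the
   upper bound is treated as exclusive, and the sketch does not try to
   guess how growing classes extend below range-start.

```python
import math

def range_class(value, range_size, range_start=0, size_scale=1,
                min_value=None, max_value=None):
    """Return (lower, upper) for the class holding value; upper is exclusive."""
    if size_scale < 1:
        raise ValueError("this sketch assumes size-scale >= 1")
    # min-value / max-value: out-of-bound values are treated as the boundary.
    if min_value is not None:
        value = max(value, min_value)
    if max_value is not None:
        value = min(value, max_value)
    if size_scale == 1:
        # Fixed-size classes aligned on range-start; classes below it exist too.
        index = math.floor((value - range_start) / range_size)
        lower = range_start + index * range_size
        return lower, lower + range_size
    if value < range_start:
        raise ValueError("growing classes below range-start are not covered here")
    # Growing classes: each one is size_scale times wider than the previous.
    lower, width = range_start, range_size
    while value >= lower + width:
        lower += width
        width *= size_scale
    return lower, lower + width
```

   With range-start 1 and range-size 5, range_class(-2, 5, range_start=1)
   gives the class covering -4 to 0. With range-size 10 and size-scale 2,
   the values 5, 25 and 30 fall in the classes 0-9, 10-29 and 30-69.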



field element

   This element references a DLF field whose value will be displayed in a
   separate column in the resulting report. It is used to specify the
   grouping fields in the group element and to specify the fields to
   output in the records element.

   field's attributes

   name
          The name of the DLF field that will be used as key for
          grouping.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.


<!ELEMENT %LIRE.field;  EMPTY                                        >
<!ATTLIST %LIRE.field;
             name   NMTOKEN                                 #REQUIRED
             %label.attr;                                            >



sum element

   The sum element sums the values of a field in the group.

   sum's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The field that should be summed.

   ratio
          This attribute can be used to display the sum as a ratio of the
          group or table total. If the attribute is set to group, the
          resulting value will be the ratio of the sum to the group's
          total sum. If the attribute is set to table, it will be
          expressed as a ratio of the total sum of the table. The default
          is none, which will not convert the sum to a ratio.

   weight
          This optional attribute can be used to create a weighted sum.
          It should contain a numerical DLF field name. The content of
          that field will be used to multiply each field value before
          summing them.


<!ELEMENT %LIRE.sum;    EMPTY                                        >
<!ATTLIST %LIRE.sum;
             %name.attr.req;
             %label.attr;
             ratio      (none | group |table)              'none'
             field      NMTOKEN                            #REQUIRED
             weight     NMTOKEN                            #IMPLIED >
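
   As a rough illustration of the weight and ratio attributes, the
   following Python sketch computes a weighted sum over a list of records
   and expresses one sum as a ratio of another. The record layout
   (dictionaries keyed by field name), the field names and both function
   names are assumptions made for this example. How Lire formats a ratio
   for display is not covered.

```python
def weighted_sum(records, field, weight=None):
    """Sum rec[field], multiplying each value by rec[weight] when given."""
    total = 0
    for rec in records:
        factor = rec[weight] if weight else 1
        total += rec[field] * factor
    return total

def as_ratio(part, whole):
    """Express a sum as a fraction of the group (or table) total."""
    return part / whole if whole else 0.0

# Hypothetical records: 'bytes' is summed, 'hits' is used as the weight.
group = [{"bytes": 10, "hits": 2}, {"bytes": 5, "hits": 4}]
plain = weighted_sum(group, "bytes")             # 10 + 5
weighted = weighted_sum(group, "bytes", "hits")  # 10*2 + 5*4
```

   With ratio set to group, the displayed value would correspond to
   as_ratio(plain, parent_group_total); with table, the table's total
   sum would be used as the denominator instead.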



avg element

   The avg element calculates the average of all values of a field in the
   group. The average will be computed either over the number of records,
   if the by-fields attribute is left empty, or over the number of
   different values found in the by-fields.

   avg's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The field that should be averaged. If left unspecified, the
          number of records will be counted.

   by-fields
          The fields that will be used to determine the count over which
          the average is computed.

   weight
          This optional attribute can be used to create a weighted
          average. It should contain a numerical DLF field name. The
          content of that field will be used to multiply each field value
          before summing them. It is that weighted sum that will be used
          to calculate the average.


<!ELEMENT %LIRE.avg;    EMPTY                                        >
<!ATTLIST %LIRE.avg;
             %name.attr.req;
             %label.attr;
             field      NMTOKEN                            #IMPLIED
             by-fields  NMTOKENS                           #IMPLIED
             weight     NMTOKEN                            #IMPLIED >
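
   The interplay between field, by-fields and weight can be made concrete
   with a small Python sketch. It follows the description above: the
   numerator is the (optionally weighted) sum of the field, or the record
   count when no field is given, and the denominator is either the number
   of records or the number of distinct by-fields value combinations. The
   function name and the record layout are assumptions for illustration
   only.

```python
def average(records, field=None, by_fields=(), weight=None):
    """Average of a field over records, or over distinct by-fields values."""
    if field is None:
        # No field: the number of records is what gets averaged.
        total = len(records)
    else:
        total = sum(rec[field] * (rec[weight] if weight else 1)
                    for rec in records)
    if by_fields:
        # Count distinct combinations of the by-fields values.
        n = len({tuple(rec[f] for f in by_fields) for rec in records})
    else:
        n = len(records)
    return total / n if n else 0.0

# Hypothetical records for two users.
recs = [
    {"user": "a", "bytes": 10},
    {"user": "a", "bytes": 20},
    {"user": "b", "bytes": 30},
    {"user": "b", "bytes": 40},
]
```

   Here average(recs, "bytes") is the mean per record, while
   average(recs, "bytes", by_fields=("user",)) is the mean per user and
   average(recs, by_fields=("user",)) is the number of records per user.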



max element

   The max element calculates the maximum value for a field in all the
   group's records.

   max's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The field for which the maximum value should be found.


<!ELEMENT %LIRE.max;    EMPTY                                        >
<!ATTLIST %LIRE.max;
             %name.attr.req;
             %label.attr;
             field      NMTOKEN                            #REQUIRED >



min element

   The min element calculates the minimum value for a field in all the
   group's records.

   min's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The field for which the minimum value should be found.


<!ELEMENT %LIRE.min;    EMPTY                                        >
<!ATTLIST %LIRE.min;
             %name.attr.req;
             %label.attr;
             field      NMTOKEN                            #REQUIRED >



first element

   The first element will display the value of one field of the first DLF
   record within its group. The sort order is controlled through the sort
   attribute.

   first's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The DLF field which will be displayed.

   sort
          Whitespace-delimited list of field names that should be used to
          sort the records. Field names can be prefixed by - to specify
          reverse sort order; otherwise ascending sort order is used. If
          this attribute is omitted, the records will be sorted in
          ascending order of the default timestamp field.


<!ELEMENT %LIRE.first;    EMPTY                                        >
<!ATTLIST %LIRE.first;
             %name.attr.req;
             %label.attr;
             field      NMTOKEN                            #REQUIRED
             sort       NMTOKENS                           #IMPLIED
             >



last element

   The last element will display the value of one field of the last DLF
   record within its group. The sort order is controlled through the sort
   attribute.

   last's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   field
          The DLF field which will be displayed.

   sort
          Whitespace-delimited list of field names that should be used to
          sort the records. Field names can be prefixed by - to specify
          reverse sort order; otherwise ascending sort order is used. If
          this attribute is omitted, the records will be sorted in
          ascending order of the default timestamp field.


<!ELEMENT %LIRE.last;    EMPTY                                        >
<!ATTLIST %LIRE.last;
             %name.attr.req;
             %label.attr;
             field      NMTOKEN                            #REQUIRED
             sort       NMTOKENS                           #IMPLIED
             >
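
   The sort attribute shared by the first and last elements can be read
   as a multi-key sort specification. The Python sketch below shows one
   way to interpret it, assuming records are dictionaries keyed by field
   name and that the default timestamp field is called time (both are
   assumptions for this example; the function names are hypothetical as
   well).

```python
def sort_records(records, sort_spec="time"):
    """Sort records by a whitespace-delimited list of field names.

    A leading '-' on a field name requests reverse order for that field.
    """
    ordered = list(records)
    # Apply keys from least to most significant; list.sort() is stable.
    for token in reversed(sort_spec.split()):
        reverse = token.startswith("-")
        name = token[1:] if reverse else token
        ordered.sort(key=lambda rec: rec[name], reverse=reverse)
    return ordered

def first_value(records, field, sort_spec="time"):
    """Value of field in the first record of the sorted group."""
    return sort_records(records, sort_spec)[0][field]

def last_value(records, field, sort_spec="time"):
    """Value of field in the last record of the sorted group."""
    return sort_records(records, sort_spec)[-1][field]

# Hypothetical group of records.
group = [
    {"time": 3, "url": "/c", "size": 1},
    {"time": 1, "url": "/a", "size": 2},
    {"time": 2, "url": "/b", "size": 2},
]
```

   With the default specification the records are ordered by time, so
   first_value(group, "url") is "/a" and last_value(group, "url") is
   "/c". With "-size -time" the largest, most recent record comes first.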



count element

   The count element counts the number of records in the group if the
   fields attribute is left empty. Otherwise, it will count the number of
   different values in the fields specified.

   count's attributes

   name
          An identifier that can be used to reference this operation from
          other elements. This name will most often be used in the
          parent's sort attribute.

   label
          Sets the label of the column generated by this element. If
          omitted, a default label will be generated.

   fields
          Which fields to count. If unspecified, all records in the group
          are counted. Otherwise, only the different values of these
          fields will be counted.

   ratio
          This attribute can be used to display the frequency as a ratio
          of the group or table total. If the attribute is set to group,
          the resulting value will be the ratio of the frequency to the
          group's total frequency. If the attribute is set to table, it
          will be expressed as a ratio of the total frequency of the
          table. The default is none, which will not convert the
          frequency to a ratio.


<!ELEMENT %LIRE.count;  EMPTY                                        >
<!ATTLIST %LIRE.count;
             %name.attr.req;
             %label.attr;
             ratio      (none | group |table)              'none'
             fields     NMTOKENS                           #IMPLIED  >
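
   The two counting modes can be illustrated as follows. This Python
   sketch assumes that "different values in the fields specified" means
   distinct combinations of the listed fields' values; that reading, the
   function name and the sample records are assumptions, not something
   the DTD itself states.

```python
def count(records, fields=()):
    """Count records, or distinct value combinations of the given fields."""
    if not fields:
        return len(records)
    return len({tuple(rec[f] for f in fields) for rec in records})

# Hypothetical records: three requests from two client hosts.
requests = [
    {"client_host": "10.0.0.1", "url": "/"},
    {"client_host": "10.0.0.1", "url": "/faq"},
    {"client_host": "10.0.0.2", "url": "/"},
]
total_requests = count(requests)
distinct_hosts = count(requests, ("client_host",))
```

   Here total_requests counts every record, while distinct_hosts only
   counts each client host once.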



records element

   The records element will put the content of selected fields in the
   report. This can be used in reports that show events matching certain
   criteria. The fields that will be included in the report for each
   record are specified by the field element.

   records's attribute

   fields
          Whitespace-delimited list of field names that should be
          included in the report.


<!ELEMENT %LIRE.records; EMPTY                                       >
<!ATTLIST %LIRE.records;
             fields      NMTOKENS                          #REQUIRED >



Chapter 12. The Lire Report Markup Language

   Table of Contents

   The Report Markup Language

        report element
        Meta-information elements
        section element
        subreport element
        missing-subreport element
        table element
        table-info element
        group-info element
        column-info element
        group-summary element
        group element
        entry element
        name element
        value element
        chart-configs element

The Report Markup Language

   Document Type Definition for the XML Lire Report Markup Language as
   generated by lr_dlf2xml.

   Elements of that DTD are defined in the namespace
   http://www.logreport.org/LRML/ which will usually be mapped to the
   lire prefix.

   The latest version of that DTD is 2.1 and its public identifier is
   -//LogReport.ORG//DTD Report Markup Language V2.1//EN. Its
   canonical system identifier is
   http://www.logreport.org/LRML/2.1/lrml.dtd.

<!--                    Namespace prefix for validation using the
                        DTD                                        -->
<!ENTITY % LIRE.xmlns.pfx    "lire"                                  >
<!ENTITY % LIRE.pfx          "%LIRE.xmlns.pfx;:"                     >
<!ENTITY % LIRE.xmlns.attr.name "xmlns:%LIRE.xmlns.pfx;"             >
<!ENTITY % LIRE.xmlns.attr
  "%LIRE.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRML/'">

<!ENTITY % LRCML.xmlns.pfx    "lrcml"                                >
<!ENTITY % LRCML.pfx          "%LRCML.xmlns.pfx;:"                   >
<!ENTITY % LRCML.xmlns.attr.name "xmlns:%LRCML.xmlns.pfx;"           >
<!ENTITY % LRCML.xmlns.attr
  "%LRCML.xmlns.attr.name; CDATA #FIXED 'http://www.logreport.org/LRCML/'">



   This DTD uses the common modules lire-types.mod, which defines the
   data types recognized by Lire, and lire-desc.mod, which is used to
   include a subset of DocBook in description and text elements.

<!-- Include needed modules -->
<!ENTITY % lire-types.mod PUBLIC
    "-//LogReport.ORG//ENTITIES Lire Data Types V1.0//EN"
    "lire-types.mod">
%lire-types.mod;

<!ENTITY % lire-desc.mod PUBLIC
    "-//LogReport.ORG//ELEMENTS Lire Description Elements V3.0//EN"
    "lire-desc.mod">
%lire-desc.mod;



   Each report is an XML document of which the top-level element is the
   report element. The report's data is contained in subreport elements
   (these hold the results of each report specification that was used to
   generate the report).

<!--                    Parameter entities which defines qualified
                        names of the elements                      -->
<!ENTITY % LIRE.report          "%LIRE.pfx;report"                   >
<!ENTITY % LIRE.section         "%LIRE.pfx;section"                  >
<!ENTITY % LIRE.subreport       "%LIRE.pfx;subreport"                >
<!ENTITY % LIRE.missing-subreport "%LIRE.pfx;missing-subreport"      >
<!ENTITY % LIRE.table           "%LIRE.pfx;table"                    >
<!ENTITY % LIRE.table-info      "%LIRE.pfx;table-info"               >
<!ENTITY % LIRE.group-info      "%LIRE.pfx;group-info"               >
<!ENTITY % LIRE.column-info     "%LIRE.pfx;column-info"              >
<!ENTITY % LIRE.group-summary   "%LIRE.pfx;group-summary"            >
<!ENTITY % LIRE.entry           "%LIRE.pfx;entry"                    >
<!ENTITY % LIRE.group           "%LIRE.pfx;group"                    >
<!ENTITY % LIRE.name            "%LIRE.pfx;name"                     >
<!ENTITY % LIRE.value           "%LIRE.pfx;value"                    >
<!ENTITY % LIRE.date            "%LIRE.pfx;date"                     >
<!ENTITY % LIRE.timespan        "%LIRE.pfx;timespan"                 >
<!ENTITY % LIRE.chart-configs   "%LIRE.pfx;chart-configs"            >
<!ENTITY % LRCML.param          "%LRCML.pfx;param"                   >

<!ELEMENT %LRCML.param; (#PCDATA|%LRCML.param;)*                       >
<!ATTLIST %LRCML.param;
             name      NMTOKEN                             #REQUIRED
             value     CDATA                               #IMPLIED >


report element

   A report starts with the report's meta-information: title, timespan
   and description.

   The report's actual data is contained in one or more subreports.

   report's attributes

   version
          The version of the DTD with which this report complies. New
          reports should use the value 2.1.


<!ELEMENT %LIRE.report;      ((%LIRE.title;)?, (%LIRE.date;)?,
                              (%LIRE.timespan;)?, (%LIRE.description;)?,
                              (%LIRE.section;)+)                     >
<!ATTLIST %LIRE.report;
             version         %number.type;                 #REQUIRED
             %LIRE.xmlns.attr;
             %LRCML.xmlns.attr;                                       >
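
   To give an idea of how a consumer might read such a document, the
   following Python sketch parses a small hand-written report with the
   standard xml.etree.ElementTree module and lists its section titles.
   The sample document is hypothetical and trimmed to the bare minimum;
   it assumes the title element from lire-desc.mod lives in the same
   namespace and uses the usual lire prefix.

```python
import xml.etree.ElementTree as ET

LRML_NS = "http://www.logreport.org/LRML/"

# Hypothetical, minimal report used only for illustration.
SAMPLE_REPORT = """\
<lire:report xmlns:lire="http://www.logreport.org/LRML/" version="2.1">
  <lire:title>Hypothetical report</lire:title>
  <lire:section>
    <lire:title>Hypothetical section</lire:title>
  </lire:section>
</lire:report>
"""

def section_titles(xml_text):
    """Return the report's version and the titles of its sections."""
    ns = {"lire": LRML_NS}
    root = ET.fromstring(xml_text)
    if root.tag != f"{{{LRML_NS}}}report":
        raise ValueError("top-level element is not an LRML report")
    titles = [section.findtext("lire:title", namespaces=ns)
              for section in root.findall("lire:section", ns)]
    return root.get("version"), titles
```

   ElementTree does not validate against the DTD, so this only checks
   the namespace and the name of the top-level element.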



Meta-information elements

date element

   The date element contains the date on which the report was generated.

   The content of this element should be the timestamp in a format
   suitable for display.

   date's attribute

   time
          The date in epoch time.


<!ELEMENT %LIRE.date;    (#PCDATA)                                   >
<!ATTLIST %LIRE.date;
             time        %number.type;                      #REQUIRED>



timespan element

   The timespan element contains the starting and ending date which
   delimits the period of the report.

   The content of this element should be formatted for display. The
   starting and ending times of the timespan can be read in epoch time
   from the attributes. The period attribute contains the timespan
   period.

   timespan's attributes

   period
          Optional attribute which contains the period for which the
          report was generated.

   start
          The start time of the timespan in epoch time.

   end
          The end time of the timespan in epoch time.


<!ELEMENT %LIRE.timespan;    (#PCDATA)                               >
<!ATTLIST %LIRE.timespan;
          period        (hourly|daily|weekly|monthly|yearly) #IMPLIED
          start         %number.type;                    #REQUIRED
          end           %number.type;                    #REQUIRED   >
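
   Since the time, start and end attributes hold epoch times, a program
   reading a report can rebuild proper timestamps from them instead of
   parsing the display text. A minimal Python sketch, assuming the
   attribute values hold whole seconds and that the result should be
   expressed in UTC (the function name is hypothetical):

```python
from datetime import datetime, timezone

def timespan_bounds(start, end):
    """Convert timespan start/end attribute values to UTC datetimes."""
    def to_utc(seconds):
        return datetime.fromtimestamp(int(seconds), tz=timezone.utc)
    return to_utc(start), to_utc(end)

# Attribute values as they would be read from the XML, i.e. as strings.
begin, finish = timespan_bounds("1153267200", "1153353600")
```

   For these sample values the two bounds are exactly one day apart,
   which would match a period attribute of daily.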




section element

   The section element groups related subreports together. The section's
   description will usually contain information about the filters that
   were applied in this section.

   It contains a title, a description if some global filters were applied
   and the section's subreports.

   This element doesn't have any attributes.



<!ELEMENT %LIRE.section;      ( %LIRE.title;, (%LIRE.description;)?,
                               (%LIRE.subreport;|%LIRE.missing-subreport;)*) >



subreport element

   The subreport element contains data for a certain report.

   It can contain meta-information elements, if they are different from
   those of the report.

   Examples of subreports for the email superservice are:
     * Message delay by relay in seconds.
     * Per hour traffic summary.
     * Top 10 messages delivery.
     * etc.

   The data is contained in a table element.

   If charts should be generated from the table's data, their
   configuration is contained in the chart-configs element.

   subreport's attributes

   id
          A unique identifier that can be used to link to this element.

   superservice
          The name of the superservice from which the report's data
          comes, e.g. email, www, dns, etc.

   type
          This is the name of the report specification that was used to
          generate this subreport.

   schemas
          A space delimited list of the schemas used by this subreport.


<!ELEMENT %LIRE.subreport;    ( %LIRE.title;, (%LIRE.description;)?,
                                %LIRE.table;, (%LIRE.chart-configs;)?) >
<!ATTLIST %LIRE.subreport;
             id             ID                              #REQUIRED
             superservice   %superservice.type;             #REQUIRED
             type           CDATA                           #REQUIRED
             schemas        NMTOKENS                        #REQUIRED >



missing-subreport element

   missing-subreport's attributes

   id
          A unique identifier that can be used to link to this element.

   superservice
          The name of the superservice from which the report's data
          comes, e.g. email, www, dns, etc.

   type
          This is the name of the report specification that was used to
          generate this subreport.

   schemas
          A space delimited list of the schemas used by this subreport.

   reason
          The reason why this subreport is missing.


<!ELEMENT %LIRE.missing-subreport; (EMPTY)                           >
<!ATTLIST %LIRE.missing-subreport;
             id             ID                              #IMPLIED
             superservice   %superservice.type;            #REQUIRED
             reason         CDATA                          #IMPLIED
             type           CDATA                          #REQUIRED
             schemas        NMTOKENS                       #REQUIRED >



table element

   The table element contains the data of the subreport. It starts with a
   table-info element which contains information on the columns defined
   in the subreport. Following the table structure, there is a
   group-summary element which contains values computed over all the
   records.

   A table element can contain the subreport data directly or the data
   can be subdivided into groups.

   An example of a subreport which would contain the data directly would
   be "messages per to-domain, top-10". This would contain ten entries,
   one for each to-domain.

   An example of a subreport which would contain data in groups would be
   "deliveries to users, per to-domain, top 30, top 5 users". It would
   contain 30 groups (one per to-domain) and each group would contain 5
   entries (one per user).

   Groups can be nested to arbitrary depth (but deep nesting is not
   recommended).

   table's attributes

   show
          The number of entries to display. By default all entries should
          be displayed.


<!ELEMENT %LIRE.table;        (%LIRE.table-info;, %LIRE.group-summary;,
                               (%LIRE.entry;)*) >
<!ATTLIST %LIRE.table;
              show         %int.type;                        #IMPLIED >
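
   The nesting of entries and groups lends itself to a recursive walk.
   The Python sketch below flattens a small hand-written table into
   (depth, names, values) rows. The sample is hypothetical and trimmed:
   the table-info element is left out for brevity, so it would not
   validate against the DTD, and the column names are made up.

```python
import xml.etree.ElementTree as ET

NS = {"lire": "http://www.logreport.org/LRML/"}

# Hypothetical table: one to-domain entry holding a group of two users.
SAMPLE_TABLE = """\
<lire:table xmlns:lire="http://www.logreport.org/LRML/">
  <lire:group-summary nrecords="5"/>
  <lire:entry>
    <lire:name col="to_domain">example.org</lire:name>
    <lire:group>
      <lire:group-summary nrecords="5"/>
      <lire:entry>
        <lire:name col="user">alice</lire:name>
        <lire:value col="deliveries">3</lire:value>
      </lire:entry>
      <lire:entry>
        <lire:name col="user">bob</lire:name>
        <lire:value col="deliveries">2</lire:value>
      </lire:entry>
    </lire:group>
  </lire:entry>
</lire:table>
"""

def walk_entries(parent, depth=0):
    """Yield (depth, names, values) for each entry, descending into groups."""
    for entry in parent.findall("lire:entry", NS):
        names = [name.text for name in entry.findall("lire:name", NS)]
        values = {value.get("col"): value.text
                  for value in entry.findall("lire:value", NS)}
        yield depth, names, values
        for group in entry.findall("lire:group", NS):
            yield from walk_entries(group, depth + 1)

rows = list(walk_entries(ET.fromstring(SAMPLE_TABLE)))
```

   Each nested group increases the depth by one, which a renderer could
   use to indent grouped entries under their parent entry.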



table-info element

   The table-info element contains information on the table structure. It
   contains one column-info element for each column defined. It will
   also contain one group-info element for every grouping operation used
   in the report specification.

   This element doesn't have any attributes.

<!ELEMENT %LIRE.table-info;   (%LIRE.column-info;|%LIRE.group-info;)+ >



group-info element

   The group-info element plays a similar role to the table-info element.
   It is used to group the columns defined by a particular subgroup.

   group-info's attributes

   name
          This attribute holds the name of the operation in the report
          specification which was responsible for the creation of this
          group data.

   row-idx
          Specifies the row index of the table header in which this
          group's categorical labels should be displayed.


<!ELEMENT %LIRE.group-info;   (%LIRE.column-info;|%LIRE.group-info;)+ >
<!ATTLIST %LIRE.group-info;
             name     NMTOKEN                              #REQUIRED
             row-idx  %int.type;                           #REQUIRED >



column-info element

   The column-info element describes a column of the table. It holds
   information related to display (label, class, col-start, col-end,
   col-width) as well as information needed to use the content of the
   column as input to other computations (type, name).

   The col-start, col-end and col-width attributes can be used to render
   the data in a grid.

   column-info's attributes

   name
          This attribute contains the name of the operation in the report
          specification which was used to generate the data in this
          column.

   type
          The Lire data type of this column.

   class
          This attribute can either be categorical or numerical.
          Categorical data is held in name elements and numerical data is
          held in value elements. Also, numerical columns will have a
          column-summary element associated with them.

   label
          This optional attribute contains the column's label. If
          omitted, the name attribute's content will be used.

   col-start
          The column number in which this column starts. The first column
          is column 0.

   col-end
          The column number in which this column ends. The first column
          is column 0. Spans are used to cover "padding columns" to
          indent grouped entries under their parent entry.

   col-width
          The suggested column width (in characters) to use for this
          column.

   max-chars
          The length of the longest entry in that column (this includes
          the label).

   avg-chars
          The average entry length in that column (this includes the
          label). This value is rounded up to the nearest integer.


<!ELEMENT %LIRE.column-info;        EMPTY                        >
<!ATTLIST %LIRE.column-info;
             name          NMTOKEN                      #REQUIRED
             class         (categorical|numerical)      #REQUIRED
             type          (%lire.types;)               #REQUIRED
             label         CDATA                        #IMPLIED
             col-start     %int.type;                   #REQUIRED
             col-end       %int.type;                   #REQUIRED
             col-width     %int.type;                   #IMPLIED
             max-chars     %int.type;                   #IMPLIED
             avg-chars     %int.type;                   #IMPLIED >
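
   As a rough sketch of how col-start and col-end could drive a grid
   layout, the following Python function turns column-info attributes
   into header cells with a starting column and a span. It assumes that
   col-end is inclusive, so a column covering positions 0 and 1 spans
   two grid columns; that reading, the function name and the sample
   columns are assumptions. The label falls back to the name attribute,
   as described above.

```python
def header_cells(columns):
    """Return (label, first_column, span) tuples ordered by first column."""
    cells = []
    for col in columns:
        start, end = int(col["col-start"]), int(col["col-end"])
        label = col.get("label") or col["name"]
        cells.append((label, start, end - start + 1))
    return sorted(cells, key=lambda cell: cell[1])

# Hypothetical column-info attributes, as read from the XML (strings).
columns = [
    {"name": "user", "col-start": "1", "col-end": "1"},
    {"name": "to_domain", "label": "Domain", "col-start": "0", "col-end": "1"},
    {"name": "count", "label": "Deliveries", "col-start": "2", "col-end": "2"},
]
cells = header_cells(columns)
```

   In this example the Domain column spans the padding column used to
   indent the user entries grouped under it.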



group-summary element

   The group-summary contains one value element for each column that
   contains numerical data. These elements will contain the statistics
   computed over all the DLF records which were processed by the group or
   the subreport.

   group-summary's attributes

   nrecords
          The number of DLF records that were processed by this group or
          subreport.

   missing-cases
          This attribute contains the number of LIRE_NOTAVAIL values
          found when computing the statistic. This number represents the
          number of records which didn't have the required information to
          group the records appropriately. If omitted or equal to 0, it
          means that all records had all the required information.

   row-idx
          Specifies the row index in the table at which the group's
          summary values should be displayed. If this attribute is
          omitted, the summary values won't be displayed.


<!ELEMENT %LIRE.group-summary; (%LIRE.value;)*                       >
<!ATTLIST %LIRE.group-summary;
             nrecords         %int.type;                   #REQUIRED
             missing-cases    %int.type;                    #IMPLIED
             row-idx          %int.type;                    #IMPLIED >



group element

   The group element can be used to logically subdivide a report. It's
   used for aggregate reports like messages per user per domain.

   It contains a group-summary element which contains the values for the
   whole group, followed by the entries that make up the group.

   Groups can be nested more than once, but too much nesting increases
   information clutter and isn't useful for the user.

   group's attributes

   id
          A unique identifier that can be used to link to this element.

   show
          The number of entries to display. By default all entries should
          be displayed.


<!ELEMENT %LIRE.group;        (%LIRE.group-summary;, (%LIRE.entry;)*)>
<!ATTLIST %LIRE.group;
             id           ID                                #IMPLIED
             show         %int.type;                        #IMPLIED >



entry element

   The entry contains the data from the report. It is similar to a row in
   a table, although one entry may represent several rows when it
   includes nested groups.

   The name elements contain categorical items of data like user name,
   email, browser type, url. Note that numeric ranges (like time period
   for example) are also considered categorical data items.

   The value elements contain numerical data which are the result of a
   descriptive statistical operation: message count, bytes transferred,
   average delay, etc.

   entry's attributes

   id
          A unique identifier that can be used to link to this element.

   row-idx
          Specifies the row index in the table at which this entry's name
          and value elements should be rendered. If this attribute is
          omitted, the entry won't be displayed.


<!--
                                                                   -->
<!ELEMENT %LIRE.entry;        (%LIRE.name;,
                                (%LIRE.name;|%LIRE.value;|%LIRE.group;)+)>
<!ATTLIST %LIRE.entry;
             id           ID                                #IMPLIED
             row-idx      %int.type;                        #IMPLIED >




name element

   The name element contains the value of a categorical data column. It
   is also used for numerical values that represent a class of values
   (like those produced by the rangegroup or timegroup operations).

   name's attributes

   id
          A unique identifier that can be used to link to this element.

   col
          The column's name. It should be the same as the one in the
          corresponding column-info element.

   value
          When the displayed format is different from the DLF
          representation, this attribute contains the DLF representation.

   range
          In some cases (like in reports generated by the timegroup,
          timeslot or rangegroup specification), this attribute will
          contain the range's length from the starting value which is in
          the 'value' attribute.


<!ELEMENT %LIRE.name;         (#PCDATA)                              >
<!ATTLIST %LIRE.name;
             id               ID                            #IMPLIED
             col              NMTOKEN                       #REQUIRED
             value            CDATA                         #IMPLIED
             range            %number.type;                 #IMPLIED >
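
   For names that stand for a class of values, the value and range
   attributes together give the class boundaries. A minimal Python
   sketch, assuming both attributes hold plain numbers when range is
   present and that the upper bound is simply value plus range (the
   function name and the sample attributes are hypothetical):

```python
def name_range(attrs):
    """Return (start, end) for a ranged name element, or None otherwise."""
    if "value" not in attrs or "range" not in attrs:
        return None
    start = float(attrs["value"])
    return start, start + float(attrs["range"])

# Hypothetical attributes of a name element produced by a rangegroup.
bounds = name_range({"col": "size", "value": "10", "range": "20"})
```

   A name element without a range attribute, such as a plain user name,
   yields None.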



value element

   The value element contains the value of a numerical column.

   value's attributes

   id
          A unique identifier that can be used to link to this element.

   col
          The column's name. It should be the same as the one in the
          corresponding column-info element.

   value
          Contains the value in numeric format. This is used when the
          value was scaled (1k, 5M, etc.)

   total
          For average values, this contains the total used to compute
          the average.

   n
          For average values, this contains the n value that was used to
          compute the average.

   missing-cases
          This attribute contains the number of LIRE_NOTAVAIL values
          found when computing the statistic. When omitted, it is assumed
          to have a value of 0, i.e. that the value was defined in each
          DLF record.


<!ELEMENT %LIRE.value;        (#PCDATA)                              >
<!ATTLIST %LIRE.value;
             id          ID                                 #IMPLIED
             col         NMTOKEN                            #REQUIRED
             missing-cases %int.type;                       #IMPLIED
             value       %number.type;                      #IMPLIED
             total       %number.type;                      #IMPLIED
             n           %number.type;                      #IMPLIED >
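
   A program that post-processes a report will usually want the numeric
   value rather than the display text. The Python sketch below shows one
   possible order of preference: recompute an average from total and n
   when both are present, otherwise use the value attribute, otherwise
   fall back to the element's text. This ordering, the function name and
   the sample values are assumptions for illustration.

```python
def numeric_value(attrs, text):
    """Best-effort numeric reading of a value element."""
    if "total" in attrs and "n" in attrs and float(attrs["n"]) != 0:
        # Average columns: recompute from the stored total and count.
        return float(attrs["total"]) / float(attrs["n"])
    if "value" in attrs:
        # Scaled display text (e.g. "5k"): use the unscaled attribute.
        return float(attrs["value"])
    return float(text)

# Hypothetical value elements: a scaled size, an average and a plain count.
size = numeric_value({"col": "bytes", "value": "5120"}, "5k")
delay = numeric_value({"col": "delay", "total": "90", "n": "4"}, "22.5")
hits = numeric_value({"col": "hits"}, "7")
```

   Keeping total and n around also allows averages from several reports
   to be merged correctly, by summing the totals and the n values before
   dividing.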



chart-configs element

   This element contains one or more chart configurations that should be
   generated from the table's data. These chart configurations are
   specified using the Lire Report Configuration Markup Language.

   This element has no attributes.

<!ELEMENT %LIRE.chart-configs; (%LRCML.param;)+                      >



Lire Developers' Conventions

   Table of Contents

   13. Contributing Code to Lire
   14. Developers' Toolbox

        Required Tools To Build From CVS
        Accessing Lire's CVS

              CVS primer

        SourceForge
        Mailing Lists

   15. Coding Standards

        Shell Coding Standards
        Perl Coding Standards

   16. Making Lire "Test-infected"

        Unit Tests in Lire

              PerlUnit

        Writing Tests
        Running Tests
        Some "Best Practices" on Unit Testing

   17. Commit Policy

        CVS Branches

              Hands-on example
              Naming, what it looks like
              Creating a Branch
              Accessing a Branch
              Merging Branches on the Trunk

   18. Testing and debugging

        Test before releasing
        Test-installations and test-runs
        Using the Perl debugger on Lire code

   19. Making a Release

        Setting version in NEWS file, checking ChangeLog
        Tagging the CVS
        Building The Tarball
        Building The Debian Package
        Building The RPM Package
        Making sure the FreeBSD port gets updated
        Uploading The Release

              The LogReport Webserver

        Advertising The Release

              SourceForge
              Freshmeat.net

   20. Website Maintenance

        Documentation on the LogReport Website

              Publishing the DTD's

   21. Writing Documentation

        Plain Text
        Perl's Plain Old Documentation: maintaining manpages
        Docbook XML: Reference Books and Extensive User Manuals

Chapter 13. Contributing Code to Lire

   The LogReport team invites you to contribute code to Lire. We're very
   happy with any code contributions which work for you: they will very
   likely make life easier for other people too! We ask you to consider
   a few points when writing code to be distributed with Lire.

   When adding new scripts, or extending and improving current Lire code,
   make sure you're working with the current Lire code. (When working
   with old code, the bug you're working on might be fixed already by
   somebody else.) You can get the current code by fetching our CVS from
   SourceForge, using the anonymously accessible pserver:
cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/logreport login


   When prompted for a password for anonymous, simply press the Enter
   key.
cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/logreport co service


   See also the instructions on the SourceForge website. Alternatively,
   you can peek at the Lire CVS using your web browser.

   When you'd like to change e.g. /usr/local/bin/lr_log2report, you'll
   have to hack on
   cvs/sourceforge/logreport/service/all/script/lr_log2report.in. This
   file gets converted to lr_log2report by running ./configure. Of
   course, when adding or extending scripts, be sure to update the
   script's manpage too.

   If you'd like the LogReport team to distribute your contribution, be
   sure to offer it to the team under a suitable software license. Refer
   to the Licensing section in the Lire FAQ for details.

   Once you've tested your script, you can send it to the LogReport
   development list at development@logreport.org. The LogReport team will
   be happy to ship your contribution with the next Lire release.

Chapter 14. Developers' Toolbox

   Table of Contents

   Required Tools To Build From CVS
   Accessing Lire's CVS

        CVS primer

   SourceForge
   Mailing Lists

Required Tools To Build From CVS

   To build the program from the CVS tree and make a tarball
   distribution, the following tools are needed:
     * DocBook XML 4.1.2
     * DocBook DSSSL stylesheets
     * autotools
     * Jade(TM) or OpenJade(TM)
     * lynx
     * GNU make
     * Perl's XML::Parser module
     * dia
     * epsffit
     * epstopdf
     * xsltproc
     * xmllint

   For Debian woody the packages are: docbook-utils,
   docbook-xml-stylesheets, autoconf, automake1.4, autotools-dev, jade,
   lynx, make and libxml-parser-perl.

   You need automake version 1.4. Building using automake 1.7 will very
   likely not work.

Accessing Lire's CVS

   Make sure you've got an account on SourceForge. Get yourself added to
   the logreport project. (Joost van Baal <joostvb@logreport.org> can do
   this for you.) Make sure your ssh public key is on the SourceForge
   server.

   A full backup of the complete LogReport CVS as hosted on SourceForge
   is made weekly and written to hibou:/data/backup/cvs/.

CVS primer

   If you have a Unix-like system, make sure you have this
            CVSROOT=:ext:cvs.sourceforge.net:/cvsroot/logreport
            CVS_RSH=ssh


   in your shell environment.

   Of course, you could do something like
          $ eval `ssh-agent`
          $ ssh-add


   to get a nice ssh-agent running.

   Now do something like
            $ cd ~/cvs-sourceforge/logreport
            $ cvs co service


   There are also repositories called 'docs' and 'package'. The former
   holds the web pages; the latter holds the package files for Debian
   GNU/Linux(TM) and other distributions.

   Files can then be edited and committed:
          $ vi somefile
          $ cvs commit somefile


   and get flamed ;)

   Subscribe yourself to the commit list (commit-request@logreport.org),
   to get all commit messages, along with unified diffs.

SourceForge

Mailing Lists

Chapter 15. Coding Standards

   Table of Contents

   Shell Coding Standards
   Perl Coding Standards

   Indentation should be four spaces. No tabs please.

   See also Message-Id: <1028238571.1085.185.camel@Arendt.Contre.COM> on
   the development mailing list for some rationale on coding standards.

Shell Coding Standards

   Shell scripts should run with -e. Shell scripts should be portable.
   Refer to http://doc.mdcc.cx/doc/autobook/html/autobook_208.html .
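
   A minimal sketch of a script skeleton that follows these rules. The
   count_lines function is a made-up example and not part of Lire; it
   only relies on constructs found in any POSIX shell.

```shell
#!/bin/sh
# Abort as soon as any command fails, as the standard asks.
set -e

# Hypothetical helper: print the number of lines in a file.
count_lines() {
    if [ ! -r "$1" ]; then
        echo "cannot read $1" >&2
        return 1
    fi
    # tr strips the padding some wc implementations add.
    wc -l < "$1" | tr -d ' '
}
```

   With set -e in effect, a failing count_lines call aborts the calling
   script unless the call is guarded by if or ||.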

Perl Coding Standards

   Perl scripts should use strict, and run with -w. Documentation should
   come in .pod format; documentation about script internals should be
   in Perl comments.

   Don't use & in function calls unless necessary.

   Split long lines using hard returns; try to respect the 72nd column
   margin (this is kind of a soft limit).

   Refer to the Lire::Program manpage for more details.

Chapter 16. Making Lire "Test-infected"

   Table of Contents

   Unit Tests in Lire

        PerlUnit

   Writing Tests
   Running Tests
   Some "Best Practices" on Unit Testing

   Soon after the release of Lire 1.2.1, unit tests were introduced in
   the source tree. Unit tests help development in several ways; the most
   important one being that you can make changes to code and run the unit
   tests to make sure that nothing was broken by those changes.

   You can find helpful resources on unit testing on the PerlUnit home
   page as well as on the home page of JUnit, which inspired it.

Unit Tests in Lire

PerlUnit

   Unit tests are written using the PerlUnit framework. You need to
   install version 0.24 or later of Test::Unit to run the unit tests.

Writing Tests

   General information on using the PerlUnit framework can be found in
   the Test::Unit man page. Information on writing individual test cases
   can be found in the Test::Unit::TestCase man page.

   Tests for individual modules should be defined in a tests::moduleTest
   package. You can omit the Lire:: prefix and you can inline
   intermediary package names. For example, the unit tests of the
   Lire::ExtendedDlfSchema module are in the tests::ExtendedDlfSchemaTest
   package and the tests of the Lire::Timegroup module are in the
   tests::TimegroupTest package.
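
   The naming convention can be sketched as a small shell function. It
   assumes that "inlining" intermediary package names means joining them
   without a separator; lire_test_package is a name made up for this
   example and does not exist in Lire.

```shell
#!/bin/sh
set -e

# Hypothetical helper: derive the test package name for a module.
# Drops the leading "Lire::", joins the remaining name parts, then
# adds the "tests::" prefix and the "Test" suffix.
lire_test_package() {
    printf '%s\n' "$1" |
        sed -e 's/^Lire:://' -e 's/:://g' \
            -e 's/^/tests::/' -e 's/$/Test/'
}

lire_test_package Lire::ExtendedDlfSchema  # tests::ExtendedDlfSchemaTest
lire_test_package Lire::Timegroup          # tests::TimegroupTest
```

   Under the stated assumption, a nested module such as
   Lire::DlfAnalysers::ReferrerCategoriser would map to
   tests::DlfAnalysersReferrerCategoriserTest.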

   The Lire::Tests namespace is reserved for extensions to the PerlUnit
   framework that will be used to provide "fixtures" and "assertions"
   that are of general use for common Lire extensions.

Note

   This section will be expanded as common patterns for writing unit
   tests for DLF converters, analyzers and other common Lire extensions
   are developed.

Running Tests

   To run tests, you use the TestRunner.pl script included with the
   PerlUnit distribution. You'll need to add the directory containing the
   Lire libraries to Perl's library path. For example, if you have
   TestRunner.pl in your ~/bin directory, you can run a test case from
   the top level source directory like this:
$ perl -Iall/lib ~/bin/TestRunner.pl tests::ExtendedDlfSchemaTest


   tests::ExtendedDlfSchemaTest can be replaced by your TestCase module.
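
   To run every test case in one go, the command above can be generated
   once per test module. This sketch assumes that the test modules live
   in files named tests/*Test.pm below the top level source directory,
   which this manual does not state; list_test_commands is a made-up
   name.

```shell
#!/bin/sh
set -e

# Print one TestRunner.pl command line for each tests/*Test.pm file
# below the current directory. $1 is the path to TestRunner.pl.
list_test_commands() {
    for file in tests/*Test.pm; do
        # Skip the unexpanded pattern when nothing matches.
        [ -f "$file" ] || continue
        echo "perl -Iall/lib $1 tests::$(basename "$file" .pm)"
    done
}
```

   From the top level source directory, piping the output of
   list_test_commands ~/bin/TestRunner.pl into sh runs the test cases
   one after another.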

Some "Best Practices" on Unit Testing

   This section lists some tips on how to make effective use of Unit
   tests in common development situations on Lire.

   Changing interface/implementation. Before changing a module's
   interface or implementation, make sure that the module has test cases
   and that it passes them. This way, rerunning the tests afterwards
   tells you whether your changes broke anything.

   Debugging. A good opportunity for writing tests is when bugs are
   reported. Before trying to chase the bug using the debugger or adding
   print statements, write a test case that will fail as long as the bug
   isn't fixed. This achieves two purposes: first, you'll know the bug
   is fixed as soon as the test passes; second, we now have a test case
   that will warn us if we regress and the bug reappears.

Chapter 17. Commit Policy

   Table of Contents

   CVS Branches

        Hands-on example
        Naming, what it looks like
        Creating a Branch
        Accessing a Branch
        Merging Branches on the Trunk

   Make sure your changes run on your own platform before committing. Try
   not to break things for other platforms though. Currently, Lire
   supported platforms are GNU/Linux (Debian GNU/Linux(TM), Red Hat
   Linux(TM), Mandrake Linux(TM)), FreeBSD(TM), OpenBSD(TM) and
   Solaris(TM).

   Documentation should be updated ASAP when new commits make it
   obsolete or incomplete.

CVS Branches

   When doing major architectural changes to Lire, branches in CVS are
   created to make it possible to continue to fix bugs and to add small
   enhancements to the stable version while development continues on the
   unstable version. This applies mainly to the service repository. The
   doc and package repositories generally don't need branching.

   BTW: A nice CVS tutorial is available in the Debian cvsbook package.

Hands-on example

   A branching gets announced. Be sure to have all your pending changes
   committed before the branching occurs. After a branch has been made,
   one can do this:
$ cd ~/cvs-sourceforge/logreport
$ mv service service-HEAD
$ cvs co -r lire-20010924 service
$ mv service service-lire-20010924


   or (with the same result)
$ mv service service-HEAD
$ cvs co -r lire-20010924 -d service-lire-20010924 service


   Now, when working on stuff which should be shipped in the coming
   release, one should work in service-lire-20010924. When working on
   stuff which is rather fancy and experimental, and which needs a lot of
   work to get stabilized, one should work in service-HEAD.

Naming, what it looks like

   Here is what branches schematically look like:

      release-20010629_1 ---> lire-unstable-20010703 ---> HEAD
            \
             \
         lire-20010630 ---> lire-stable-20010701



   In this diagram a branch named lire-20010630 was created from the
   release-20010629_1 tag. lire-unstable-20010703 is another tag on the
   trunk (the trunk is the main branch). HEAD isn't a real tag, it always
   points to latest version on the trunk.

Creating a Branch

   To create a branch, one runs the command cvs rtag -b -r release-tag
   branch-name module. Note that this command doesn't need a checked-out
   copy of the repository. For example, to create the
   release-20010629_1-bugfixes branch in the service repository, e.g. to
   backport bugfixes to version 20010629_1, one would use cvs rtag -b -r
   release-20010629_1 release-20010629_1-bugfixes service. When ready for
   release, this could get tagged as release-20010629_2.

   The release-tag should exist before creating the branch. In case you
   want to branch from HEAD, use -r HEAD. E.g. cvs rtag -b -r HEAD
   release_1_1-branch service. Once Lire 1.1 gets released, tag it as
   release_1_1.

Accessing a Branch

   To start working on a particular branch, you do cvs update -r
   branch-name. For example, to work on the release_1_1-branch branch,
   you do, in your checked-out copy, cvs update -r release_1_1-branch.
   This will update your copy to the release_1_1-branch version, and all
   future commits from that copy will go to that branch.

   Alternatively, you can also specify a branch when checking out a
   module using cvs co -r branch-name module. For example, you could
   check out the stable version of Lire by using cvs co -r
   release_1_1-branch service.

   To see if you are working on a particular branch, you can use the cvs
   status file command. For example, running cvs status NEWS could show:

===================================================================
File: NEWS              Status: Up-to-date

   Working revision:    1.74
   Repository revision: 1.74    /cvsroot/logreport/service/NEWS,v
   Sticky Tag:          lire-stable
   Sticky Date:         (none)
   Sticky Options:      (none)



   The branch is indicated by the Sticky Tag: keyword. If its value is
   (none), you are working on the trunk (HEAD).

   To work on the HEAD, you remove the sticky tag by using the command
   cvs update -A.

Merging Branches on the Trunk

   You can bring bug fixes and small enhancements that were made on a
   branch into the unstable version on the trunk by doing a merge. You do
   a merge by using the command cvs update -j branch-to-merge in your
   working directory of the trunk. Conflicts are resolved in the usual
   CVS way. For example, to merge the changes of the stable branch in the
   development branch, you would use cvs update -j lire-stable.

   You should tag the branch after each successful merge so that future
   changes can be easily merged. For example, after merging, you do in a
   checked out copy of the lire-stable branch: cvs tag
   lire-stable-merged-20010715. In this way, one week later we can merge
   the week's changes of the stable branch into the unstable branch by
   doing cvs update -j lire-stable-merged-20010715 -j lire-stable.

Chapter 18. Testing and debugging

   Table of Contents

   Test before releasing
   Test-installations and test-runs
   Using the Perl debugger on Lire code

Test before releasing

   One week before release the software should be tested on all supported
   platforms. In between releases the system gets tested on various
   platforms on an ad hoc basis. When testing, use the to-be-released
   tarball. Run make distcheck to generate such a tarball.

   Especially when changes to the Lire core have been made, the "test"
   superservice can be handy for quickly setting up tests of your code.
   See also the chapter on unit testing in this document.

Test-installations and test-runs

   We give some hints on various ways to debug the Lire code. One can
   make a test-install by extracting a tarball and running e.g.
$ ./configure --prefix=$HOME/local && make && make install
$ PATH=$HOME/local/bin:$PATH; export PATH
$ MANPATH=$HOME/local/share/man; export MANPATH


   One can do a test-run by executing:
$ echo 'some bug-triggering log line' | \
    lr_log2report -o xml <converter> > /tmp/report.xml
$ lr_xml2report -o txt /tmp/report.xml > /tmp/report.txt
$ $HOME/local/libexec/lire/convertors/combined2dlf \
    < /tmp/combined.log > /tmp/dlf



Using the Perl debugger on Lire code

   Please use the perl debugger: investing some time to learn it pays
   off really quickly. Here's a very tiny howto.

   Start the debugger as e.g.
perl -d `which lr_log2report` -o xml combined < tmp/log > /dev/null


   After starting the debugger, run "v" and "c lineno" to make sure all
   modules are loaded. Once that's done, you can fast-forward to a
   relevant routine using e.g. "c
   Lire::DlfAnalysers::ReferrerCategoriser::categorise". Now you can
   inspect variables and evaluate expressions by running e.g.
 DB<12> x $parsed_url->{'query'}


   Also, be sure to try the commands "s" and "r". These four commands
   are very likely enough to get your job done. (The "y" command might
   be useful too, though.) See perldebug(1) and perldebtut(1) for more
   information.

Chapter 19. Making a Release

   Table of Contents

   Setting version in NEWS file, checking ChangeLog
   Tagging the CVS
   Building The Tarball
   Building The Debian Package
   Building The RPM Package
   Making sure the FreeBSD port gets updated
   Uploading The Release

        The LogReport Webserver

   Advertising The Release

        SourceForge
        Freshmeat.net

   Before making an official Lire release, it should have been tested on
   all supported platforms. A release shouldn't be made unless Lire
   builds, installs and generates an ASCII report from all supported log
   files on all supported platforms. If this is not the case, the release
   should be delayed until this is fixed.

   Making a new release of Lire involves many steps:
    1. Writing the final version number in NEWS.
    2. Tagging the CVS tree.
    3. Building the "Standard" Lire tarball.
    4. Building the Debian GNU/Linux(TM) package.
    5. Building the RPM package.
    6. Making sure the FreeBSD package gets updated.
    7. Uploading the tarballs and making packages available.
    8. Advertising the release.

Setting version in NEWS file, checking ChangeLog

   In between releases, the NEWS file generally reads "version in cvs".
   This should of course be changed to e.g. "version 20011205".
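
   A small guard, run before tagging, can catch a NEWS file that still
   carries the placeholder. This is only a sketch: check_news is a
   made-up name, and the placeholder text is the one quoted above.

```shell
#!/bin/sh
set -e

# Fail when the given NEWS file still contains the placeholder that
# is used in between releases.
check_news() {
    if grep -i 'version in cvs' "$1" > /dev/null; then
        echo "$1 still reads 'version in cvs'" >&2
        return 1
    fi
}
```

   For example, check_news NEWS && cvs tag release-20011017 only tags
   the tree once the version has been filled in.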

   We maintain a ChangeLog file. Make sure the ChangeLog in the toplevel
   directory is not too big. If needed, split off a chunk and move it to
   doc/. The ChangeLog is autogenerated from the CVS commits, using the
   cvs2cl tool. One could e.g. run cvs2cl --prune --stdout -l "-d
   \>yesterday" -U ../CVSROOT/users.

Tagging the CVS

   Run e.g. cvs tag release-20011017.

Building The Tarball

    1. Start from a fresh copy by running the command make
       maintainer-clean-recursive in the directory where you checked out
       Lire's source code.
         a. Make sure that there are no tarballs in the extras
            subdirectory.
    2. Set the version and prepare the source tree by running the command
       ./bootstrap. (You can overwrite the pre-cooked version by doing
       e.g. echo `date +%Y%m%d`-R-f-jvb-1 > VERSION . Make sure your
       version hasn't got too many characters. Non-GNU tar chokes if
       pathnames in the archive are too long.)
    3. Generate Makefiles
         a. Run ./configure
    4. Build Lire and create the tarball by running the command make
       distcheck.
       This will build a tarball lire-version.tar.gz and then make sure
       that the content of this tarball can be built and installed. If
       that command fails, Lire isn't ready to be released. Fix the
       errors before making the release.
    5. Sign Lire's tarball with your private key. To do this with
       GnuPG(TM), run gpg --detach-sign --armor lire-version.tar.gz.
       A file lire-version.tar.gz.asc will be created. Publish this file
       together with the tarball. Now, people downloading the tarball can
       verify its integrity by downloading the .asc as well as your
       public key, and running gpg --verify lire-version.tar.gz.asc .
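
   The path length concern from step 2 can be checked on the finished
   tarball. This sketch lists names that do not fit the 100-byte name
   field of the original tar header, which leaves room for 99
   characters; long_paths is a made-up name, and the limit can be
   overridden if your target tar differs.

```shell
#!/bin/sh
set -e

# Print every line on stdin that is longer than $1 characters
# (default 99).
long_paths() {
    awk -v limit="${1:-99}" 'length($0) > limit'
}
```

   Feed it the archive listing, e.g. tar tzf lire-version.tar.gz |
   long_paths. No output means every path is short enough.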

Building The Debian Package

   This is a raw unformatted dump of what we did to build and upload the
   Lire .deb.
              $ cd ~/cvs-sourceforge/logreport/package/debian
              $ vi changelog

:r !date --rfc

              $ cd /usr/local/src/debian/lire/debian/20010219


   Run something like 'DIB_V=20020214 DIB_P=lire DIB_TARDIR=../archive/
   ./debian-install-build'. This does:
              $ cd /usr/local/src/debian/lire/debian/20010219
              $ cp \
  ~/cvs-sourceforge/logreport/service/lire-20010219.tar.gz .

              $ tar zxf lire-20010219.tar.gz
              $ cd lire/20010418
              $ mv lire-20010418 lire-20010418.orig
              $ tar zxf lire-20010418.tar.gz
              $ cd lire-20010418
              $ mkdir debian
              $ cp \
   ~/cvs-sourceforge/logreport/package/debian/[^C]* debian/


   Export the shell environment variable EMAIL, it should hold your email
   address, as it is to appear in the maintainers field of the package.
   (One could use 'dh_make --copyright gpl -s' on first time
   debianizing.) Build the .deb by running:
$ debuild 2>&1 | tee /tmp/build


   Check the .deb:
$ debc | less


   You might also want to test whether the Debianized sources build fine
   on other machines: copy diff.gz, orig.tar.gz and .dsc. Then do
$ dpkg-source -x lire_*.dsc
$ cd lire-version
$ dpkg-buildpackage -rfakeroot


   After having really tested it (dpkg -i, purge, etc.), optionally
   install it on any local apt-able websites you might have (Joost has
   one on http://mdcc.cx/debian/) and upload it to hibou's apt-able
   archive:
$ scp lire_20010418-1_all.deb \
 hibou.logreport.org:/var/www/logreport.org/pub/debian/dists/local/contrib/binary-all/admin/

$ scp lire_20010418*.gz \
 hibou.logreport.org:/var/www/logreport.org/pub/debian/dists/local/contrib/source/admin/
$ scp lire_20010418*.*s* \
 hibou.logreport.org:/var/www/logreport.org/pub/debian/dists/local/contrib/source/admin/


   Move the old debian stuff on hibou to hibou:/pub/archive/debian/ .
   Update the Packages file by running
$ cd /var/www/logreport.org/pub/debian
$ make


   To upload it to the official debian mirrors:
vanbaal@gelfand:/usr...src/debian/lire/20010418% date; \
  dupload lire_20010418-1_i386.changes
Thu Apr 19 14:27:38 CEST 2001
Uploading (ftp) to ftp.uk.debian.org:debian/UploadQueue/
[ job lire_20010418-1_i386 from lire_20010418-1_i386.changes New dpkg-dev, announcement will NOT be sent
 lire_20010418.orig.tar.gz, md5sum ok
 lire_20010418-1.diff.gz, md5sum ok
 lire_20010418-1_all.deb, md5sum ok
 lire_20010418-1.dsc, md5sum ok
 lire_20010418-1_i386.changes ok ]
Uploading (ftp) to uk (ftp.uk.debian.org)
 lire_20010418.orig.tar.gz 163.1 kB , ok (12 s, 13.59 kB/s)
 lire_20010418-1.diff.gz 32.6 kB , ok (3 s, 10.88 kB/s)
 lire_20010418-1_all.deb 222.4 kB , ok (16 s, 13.90 kB/s)
 lire_20010418-1.dsc 0.6 kB , ok (0 s, 0.60 kB/s)
 lire_20010418-1_i386.changes 1.2 kB , ok (1 s, 1.22 kB/s) ]


   Check ftp://ftp.uk.debian.org/debian/UploadQueue/

Building The RPM Package

Making sure the FreeBSD port gets updated

   Lire has been in the FreeBSD ports collection since August 21, 2002.
   Edwin Groothuis has built a FreeBSD port. Ask him if he's available
   for updating his port. Alternatively, Cédric Gross might be able to
   help. If not, the LogReport team should take care of it, and submit a
   Problem Report to the FreeBSD system, asking for inclusion of the
   updated port.

Uploading The Release

   To release a new distribution, publish the tarball in various places
   and send an announcement to the <announcement@logreport.org> mailing
   list, stating the most interesting new features. Furthermore, add a
   news item to the news list of the website. We'll describe how to
   upload the tarball to various places.

The LogReport Webserver

   Upload the tarball to the pub area on the LogReport server. The area
   is mirrored automagically by the download.logreport.org servers;
   updates are done every 6 hours. Upload like this:
 $ scp lire-20001211.tar.gz hibou.logreport.org:/var/www/logreport.org/pub/


   On hibou, do:

 $ cd /var/www/logreport.org/pub
 $ chown .www lire-20010525.tar.gz
 $ chmod g+w lire-20010525.tar.gz

 $ tar zxf lire-20001211.tar.gz
 $ rm current && ln -s lire-20001211 current
 $ rm current.tar.gz && ln -s lire-20001211.tar.gz current.tar.gz
 $ rm -rf lire-20001205
 $ mv lire-20001205.tar.gz archive


   Update the README.txt file by running:
 $ cd /var/www/logreport.org/pub
 $ ( echo \
   'current is the latest official release'; echo; ls -lF c* ) > README.txt


   Check the symlink to the documentation stuff in the tarball.

   Check if the stuff in http://logreport.org/pub/docs is still up to
   date.

Advertising The Release

SourceForge

   In order to release a distribution on SourceForge (SF), you log in
   with your SF account on the SF website. Once logged in, you go to the
   project webpage and choose Admin. Down at the bottom of that page is
   an [Edit/Add File Releases] link (click it).

   You are able to edit packages, like the Lire package in the LogReport
   project. To add a new release, choose [Add Release]. As a release
   name, use the date, like 20010407; assign it to the Lire package and
   then use the Create This Release button to make it effective.

   The next page shows 4 steps of which only one (step 2) is not
   straightforward. In that step you assign files to a release (.tar.gz,
   .deb, .rpm). These files should be uploaded to SF's Upload anonymous
   FTP site at ftp://upload.sourceforge.net/incoming/. Make sure the file
   is placed in the /incoming directory. Click Refresh View in Step 2 to
   add the files you uploaded to the FTP site. Check the files belonging
   to the release and click Add Files. In step 3, set Processor to any.
   Set file type to .deb and source.gz. Click update/refresh. Step 4:
   send notice. Done.

Freshmeat.net

   On Freshmeat.net, releases are not hosted; they only get announced.
   These announcements attract a lot of attention. The webpage for the
   Lire package can be found at http://freshmeat.net/projects/lire/.

   To announce a new release, go to the "Lire - development branch"
   webpage. Choose Add Release from the Project pull down menu in the
   light blue area. The rest is very straightforward.

Chapter 20. Website Maintenance

   Table of Contents

   Documentation on the LogReport Website

        Publishing the DTD's

   We give hints on how to upgrade the website: installing stuff from
   current CVS on http://logreport.org.

   Commits to the CVS tree of the website are automatically propagated to
   hibou. For more information on the markup language of the website, see
   the WJML documentation.

Documentation on the LogReport Website

   Be sure the links to stuff under /pub/current are still alive. E.g.
   the files TODO, dev-manual.html and user-manual.html are linked to.

Publishing the DTD's

   The DTD's are published as HTML on the website by using
   hibou:/usr/local/src/dtdparse/dtdparse-2.0b2-LogReportPatched.tar.gz,
   which is a patched version of Norman Walsh's dtdparse utility. Before
   the utility is run, make sure that the DocBook DTD is not included in
   the parsing process, because the DocBook DTD should not be published.
   This is done by changing the line:
<!ENTITY % load.docbookx     "INCLUDE"                               >

   into:
<!ENTITY % load.docbookx     "IGNORE"                               >

   The webpages are then generated with:
perl ~/dtdparse-2.0b2-patched/dtdparse.pl \
  --title "XML Lire Report Markup Language" --output lire.xml lire.dtd
perl ~/dtdparse-2.0b2-patched/dtdformat.pl --html lire.xml


   The resulting lire directory can be tarred, gzipped and unpacked
   again on hibou in the directory /var/www/logreport.org/pub/docs/dtd/.

   The other two DTD's are HTML-ized similarly, but remember to change
   the title when running dtdparse.pl.

Chapter 21. Writing Documentation

   Table of Contents

   Plain Text
   Perl's Plain Old Documentation: maintaining manpages
   Docbook XML: Reference Books and Extensive User Manuals

   Documentation which comes with the Lire tarball is maintained in four
   formats: plain text, Perl POD, DocBook XML and UML diagrams. We'll
   talk about all four of these here.

Plain Text

   Small files like README, NEWS, AUTHORS, doc/BUGS, and doc/TODO are
   traditionally maintained in plain text format. We adhere to this
   common practice.

Perl's Plain Old Documentation: maintaining manpages

   We use Perl's pod (plain old documentation) for manpages. Every file
   installed with Lire in /usr/bin/ must have a manpage. Every file
   installed in /usr/share/perl5/Lire/ and /usr/lib/lire/ should have a
   manpage. It would be nice if the files in /etc/lire/ were documented
   in manpages too. And perhaps for some files in /usr/share/lire/xml/,
   /usr/share/lire/reports/, /usr/share/lire/filters/ and
   /usr/share/lire/schemas/ manpages could be useful.

   Since the files in /usr/bin/ are commands, run by Lire users, the
   manpages describing these should focus on the user perspective.
   Describing the inner workings and implementations of the commands is
   less important than describing why someone would want to run the
   specific command. If there's need to make some remarks on the
   internals of these scripts, a section called DEVELOPERS could be added
   to the manpage. The perl modules installed in /usr/share/perl5/Lire/
   and the commands in /usr/lib/lire/ are not intended as interfaces for
   the user. Only people wanting to change or study the operation of Lire
   itself will interact with these files; therefore, the manpages should
   explain the inner workings and implementations of these files. The
   configuration files in /etc/lire/ might be changed by users. These
   should be properly documented: in manpages or in the Lire User's
   Manual.

Docbook XML: Reference Books and Extensive User Manuals

   The main documentation of the Lire project is done in DocBook XML
   4.1.2. E.g. this document is maintained in DocBook XML, as is the Lire
   User's Manual. The Lire User's Manual has more information about
   DocBook.

   After editing the Lire Developer's Manual or the Lire User's Manual,
   you should run make check-xml to make sure the document is still a
   valid DocBook document. You should fix any errors before committing
   your changes.

   If everything went right, documentation is built in txt, tex, html and
   pdf format by running make dist, or just make in doc/. We give some
   hints which might be helpful in case you have to build the
   documentation manually.

   To generate PDF:
          $ jade -t tex -d /path/to/DSSSL/docbook/print/docbook.dsl roadmap.xml
          $ pdfjadetex roadmap.tex

   The last step is actually done two or three times to resolve page
   numbers.

   To generate HTML:
          $ jade -t sgml -d html.dsl roadmap.xml


   For this, you can use the html.dsl in the doc/source directory. (If
   necessary, adjust it to reflect the location of your DSSSL
   stylesheets.) Use lynx to generate TXT output from HTML with:
            $ lynx -nolist -dump roadmap.html > roadmap.txt


Implementation Details

   Table of Contents

   22. Adding a New Superservice in Lire's Distribution
   23. Issues with Report Merging
   24. Overview of Lire scripts
   25. Source Tree Layout

Chapter 22. Adding a New Superservice in Lire's Distribution

   Integrating a new superservice in Lire's distribution involves
   several things:
    1. Making new directories in CVS:
          + /service/<superservice>/
          + /service/<superservice>/script/
          + /service/<superservice>/reports/
    2. Adding several files:
          + /service/<superservice>/Makefile.am
          + /service/<superservice>/reports/Makefile.am
          + /service/<superservice>/script/Makefile.am
          + /service/<superservice>/<superservice>.cfg
          + /service/<superservice>/<superservice>.xml This file
            specifies the DLF format of the superservice. Ideally, it
            should offer a place for each and every snippet of
            information which will ever be found in a logfile from a
            program which offers functionality defined by the
            superservice. This file should have documentation embedded;
            this will show up in this manual.
    3. Writing service plugins (2dlf scripts):
          + /service/<superservice>/script/<service>2dlf.in
    4. Adapting several files:
          + /service/configure.in (add the Makefiles and 2dlf script to
            AC_OUTPUT, to get them converted from <service>2dlf.in to
            <service>2dlf.)
          + /service/Makefile.am (add the superservice directory to
            SUBDIRS, so that make gets run there too, when called from
            the root source directory.)
          + /service/all/etc/address.cf (to make the new service known as
            a member of a superservice.)
    5. Updating the documentation:
          + User Manual: Chapter "Supported Applications".
          + Add manpages for the scripts.
    6. Updating the configuration, by writing a custom config spec or
       extending the current one, and by adding default values to the
       defaults configuration files.
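
   The files named in steps 2 and 3 can be checked mechanically. The
   following is a minimal Python sketch, not part of Lire (which is
   written in Perl): the function names expected_paths and missing_paths
   and the names "mysuper" and "myservice" are hypothetical, and paths
   are taken relative to the root of a checked-out source tree.

```python
from pathlib import Path


def expected_paths(superservice, services):
    """List the files from steps 2 and 3, relative to the source root."""
    base = Path("service") / superservice
    paths = [
        base / "Makefile.am",
        base / "reports" / "Makefile.am",
        base / "script" / "Makefile.am",
        base / f"{superservice}.cfg",
        base / f"{superservice}.xml",
    ]
    # One 2dlf script template per service plugin.
    paths += [base / "script" / f"{svc}2dlf.in" for svc in services]
    return paths


def missing_paths(root, superservice, services):
    """Return the expected files that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected_paths(superservice, services)
            if not (root / p).is_file()]


# Hypothetical names, checked against the current directory.
for path in missing_paths(".", "mysuper", ["myservice"]):
    print("missing:", path.as_posix())
```

   The sketch covers only the new files; the edits to existing files in
   step 4 and the documentation and configuration updates in steps 5 and
   6 still have to be made by hand.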

Chapter 23. Issues with Report Merging

   In some cases, a merged report doesn't display the right information.
   We outline some worst case scenarios, and justify our implementation.

   Suppose log file 1 ("requests" with "sizes") looks like:
   request size
   A       12
   B       11
   C       10

   while log file 2 looks like:
   request size
   D       3
   E       2
   F       1

   We report on the top 2 biggest requests, so the report from log 1
   looks like:
   request size
   A       12
   B       11

   while the report from log 2 would look like:
   request size
   D       3
   E       2

   Now we change the superservice.cfg file to list the top-4 biggest
   items. A naive merge would lead to:
   request size
   A       12
   B       11
   D       3
   E       2

   Of course, this should've been:
   request size
   A       12
   B       11
   C       10
   D       3

   This effect does not occur as long as the top limit stays at the same
   value. However, when the report sums values rather than listing
   distinct entries from the log, worse things can happen. Suppose we
   want to report on the total size by client. The logs look like:
   client size
   a      12
   b      11
   c      10

   and
   client size
   d      4
   e      4
   c      3

   The top-2 reports from these logs would look like:
   client size
   a      12
   b      11

   client size
   d      4
   e      4

   After naively merging, one would get:
   client size
   a      12
   b      11

   In fact, the complete report should look like:
   client size
   c      13
   a      12

   Luckily, the Lire merging algorithm is not this naive: the XML reports
   store a few more records than are strictly needed. This heuristic
   leads to sane merged reports in most cases, but because it is only a
   heuristic, it offers no watertight guarantee.

   See the description of the guess_extra_entries routine in the
   Lire::Group manpage for more implementation details.
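
   The summing scenario above can be reproduced with a short Python
   sketch. It is an illustration only, not Lire's (Perl) implementation:
   the function names top_n, report and merge are hypothetical, and
   keeping exactly one extra entry per report is an arbitrary choice that
   does not reproduce what guess_extra_entries computes.

```python
from collections import Counter


def top_n(totals, n):
    """Return the n largest (client, size) pairs, biggest first."""
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def report(log, keep):
    """Sum sizes per client in one log and keep the `keep` largest totals."""
    totals = Counter()
    for client, size in log:
        totals[client] += size
    return top_n(totals, keep)


def merge(reports, limit):
    """Merge stored reports by summing the entries they still contain."""
    totals = Counter()
    for rep in reports:
        for client, size in rep:
            totals[client] += size
    return top_n(totals, limit)


log1 = [("a", 12), ("b", 11), ("c", 10)]
log2 = [("d", 4), ("e", 4), ("c", 3)]

# Naive: each report stores exactly the 2 entries it displays.
naive = merge([report(log1, 2), report(log2, 2)], 2)
# Padded: each report stores one extra entry beyond the displayed 2.
padded = merge([report(log1, 3), report(log2, 3)], 2)

print(naive)   # [('a', 12), ('b', 11)]
print(padded)  # [('c', 13), ('a', 12)]
```

   With one extra stored entry, client c survives in both reports and the
   merged total of 13 is recovered. A client ranked lower than the stored
   entries in every log would still be lost, which is why the approach
   remains a heuristic.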

Chapter 24. Overview of Lire scripts

   This chapter gives an overview of the main scripts involved. lr_spoold
   is the engine behind a Lire Online Responder. lr_log2report is the
   main Lire command line interface. The lr_log2xml command is a helper
   script. The lr_xml2report command can be used by the user to merge XML
   reports. The lr_sql2report command is not yet fully integrated in the
   Lire system. The lr_rawmail2mail command manages a Lire client setup.
   The lr_cron script is fired off by cron, in a cron-driven setup.
 lr_spoold
 |
 \_ lr_check_service
 \_ lr_spool
    |
    \_ lr_processmail
       \_ lr_getbody
       |
       \_ lr_log2mail
 lr_log2report
 lr_log2xml
 lr_xml2report
 lr_rawmail2mail
 \_ lr_getbody
 \_ lr_deanonymize
 \_ lr_xml2mail

 lr_cron




   lr_spoold monitors a Maildir spool for each responder address.
   lr_processmail processes an email message with a compressed log file
   attached. Refer to the manpages for the gory details.

Chapter 25. Source Tree Layout

   Service-specific scripts should reside in
   $CVSROOT/service/<service>/script/. Configuration data should be in
   <service>/etc/, and service-specific documentation in <service>/doc/.

   Furthermore, in each subdirectory there should be a Makefile.am.

Glossary

   Definitions of particular terms used in Lire.

   DLF
          See Distilled Log Format.

   Distilled Log Format
          Example 3. DNS DLF Excerpts


1010912574 10.0.0.2 121.68.134.195.in-addr.arpa PTR recurs
1010912574 10.0.0.2 121.68.134.195.in-addr.arpa PTR recurs
1010912592 10.0.0.2 120.67.123.212.in-addr.arpa PTR recurs
1010912600 10.0.0.2 207.7.178.212.in-addr.arpa PTR recurs
1010912600 10.0.0.2 tr16.kennisnet.nl A recurs
1010912616 10.0.0.2 120.67.123.212.in-addr.arpa PTR recurs
1010912630 10.0.0.2 207.7.178.212.rbl.maps.vix.com ANY recurs
1010912630 10.0.0.2 NLnet.nl ANY recurs



          This is the generic log format used by Lire to normalise the
          log files from different products.

          Currently, this normalised log is a simple ASCII format where
          each event is represented by one line. The information about
          the event is represented by fields separated by spaces. All
          non-printable ASCII characters are replaced by ?. Spaces in a
          field's value are replaced by _ (an underscore). Each line must
          have the same number of fields. A DLF file doesn't contain any
          header information. Example 3, "DNS DLF Excerpts" shows an
          excerpt of a DNS DLF file.

          See Also Superservice, DLF Schema.
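
          The encoding rules above can be sketched in a few lines of
          Python. This is an illustration only, not Lire's own (Perl)
          code: the names dlf_field, dlf_line and same_field_count are
          hypothetical, and treating every character outside printable
          ASCII (non-ASCII characters included) as non-printable is an
          assumption.

```python
def dlf_field(value):
    """Encode one field value: spaces become '_', anything outside
    printable ASCII becomes '?' (assumption: non-ASCII counts too)."""
    encoded = []
    for ch in value:
        if ch == " ":
            encoded.append("_")
        elif "!" <= ch <= "~":
            encoded.append(ch)
        else:
            encoded.append("?")
    return "".join(encoded)


def dlf_line(fields):
    """Join the encoded fields of one event into a single DLF line."""
    return " ".join(dlf_field(f) for f in fields)


def same_field_count(lines):
    """Check that every line has the same number of space-separated fields."""
    counts = {len(line.split(" ")) for line in lines}
    return len(counts) <= 1


lines = [
    dlf_line(["1010912600", "10.0.0.2", "tr16.kennisnet.nl", "A", "recurs"]),
    dlf_line(["1010912630", "10.0.0.2", "NLnet.nl", "ANY", "recurs"]),
]
print(lines[0])
print(same_field_count(lines))
```

          Because spaces inside a value are replaced before the fields
          are joined, splitting a line on single spaces recovers the
          field boundaries.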

   DLF Schema
          Information about the order of the fields in a DLF file, their
          types and what they represent is specified in the DLF's schema.
          Schemas are defined in XML files using the Lire DLF Schema
          Markup Language (LDSML). Lire offers an API (only in Perl for
          now) to programmatically access the information of a schema.

          Log files of many different products can share a common DLF
          schema, which makes Lire's reports easily comparable.

   Report
          A report is what is generated by Lire. It consists of several
          subreports. Those subreports can be grouped into sections. The
          report is computed from the DLF file (and not the native log
          file) based on a configuration file which describes the
          subreports that make up the final report along with their
          parameters. (Consult the Lire User's Manual section Customizing
          Lire for more information.)

   Service
          Put simply, a service is a specific application that produces
          log files. It is usually the case that one application will be
          equivalent to one service. For example, the mysql service is
          used to process MySQL(TM)'s log files.

          But more precisely, a service is a specific log format. For
          example, the common service can be used for all web servers
          that support the Common Log Format. Similarly, the welf service
          can be used to process firewall log files written using
          WebTrends Enhanced Log Format.

          In order to generate a report on it, the native log will be
          converted to the appropriate superservice's DLF schema.

   Subreport
          A subreport is a particular view on the DLF log's data.
          Subreports are defined in XML files using the Lire Report
          Specification Markup Language (LRSML). (Although it defines
          subreports, it is called a Report Specification because a
          report is made up of several subreports.) An example of a
          subreport is Requests by Hours of the Day.

          Subreports are defined for a particular DLF schema.

   Superservice
          A superservice is a collection of services that share the same
          DLF schema and report. It is used to group together
          applications (services) that offer the same kind of
          functionality.

          Lire currently supports eight superservices: database, dns,
          email, firewall, ftp, print, proxy, and www.
