This option is available with samhain version 2.5.0 and higher. To compile with support for this option, use the configure option:

        ./configure --enable-logfile-monitor

| ![[Note]](stylesheet-images/note.png) | PCRE library required | 
|---|---|
| This option requires the PCRE (Perl Compatible Regular Expressions) library. Many Linux distributions split library packages into a runtime package (required to run a dependent executable) and a development package (required to compile an executable). At least on the build host where samhain is compiled, the development package is required if you use this option. | 
This module enables samhain to monitor and analyze logfiles of other applications. Currently (samhain 2.5.0) the following logfile formats are supported:

- Syslog
- Apache (access and error log)
- Samba
- 'pacct' BSD-style process accounting (also available on Linux)

Logfile analysis always resumes from the point where the previous run ended; the pointer into the file is stored persistently on disk. Logfile rotation is handled automatically as long as the rotated logfile remains in the same directory and is not compressed. (Log rotation tools can usually be configured to compress only after the second rotation, which is advisable for unrelated reasons: the logging application may still have an open file pointer after logfile rotation.)
Logfile entries can be filtered with Perl-style regular expressions (filter rules). Regular expressions must match the whole logfile record. For efficiency, regular expressions can be grouped under a common regular expression, i.e. if the group expression fails to match, no RE in the group is tried. Furthermore, (groups of) regular expressions can be grouped by host, if the logfile(s) contain host information (such as host information in centralized syslog server logfiles, or virtual host information in Apache logfiles). Note that host->group->rule is supported (just as host->rule or group->rule), while group->host->rule isn't.
Each filtering rule (regular expression) is assigned to an output queue. Currently (samhain 2.5.0) queues only differ in the assigned severity of an event, but more options (per-queue mail addresses for alerts) are under development.
Filtering rules are processed in the order given in the configuration file, i.e. the first match wins.
| ![[Note]](stylesheet-images/note.png) | Blacklisting vs. whitelisting, and the 'trash' output queue | 
|---|---|
| Output queues are labelled. The label 'trash' is reserved and refers to the trash bin (no output, throw away log entries if the matching rule is assigned to the 'trash' queue). If a logfile entry does not match any rule, it is reported (i.e. the default is whitelisting known-good entries). To turn this into a blacklisting policy, simply add a catch-all rule at the end and assign it to the 'trash' queue. | 
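
The processing order described above (whole-record match, first match wins, unmatched entries reported unless a catch-all 'trash' rule exists) can be sketched in a few lines of Python. This is only an illustration of the semantics, not samhain's implementation: the rule table, the `classify` helper and the `unmatched` return value are hypothetical, and Python's `re` module stands in for PCRE.

```python
import re

# Hypothetical rule table: (queue_label, regex), in configuration-file order.
RULES = [
    ("trash", r"sshd: Accepted publickey for backup.*"),
    ("q1", r"sshd: Accepted publickey for .*"),
]


def classify(entry, rules, catch_all_trash=False):
    """Return the queue label for a log entry, or 'unmatched' if no rule applies."""
    if catch_all_trash:
        # A trailing catch-all rule assigned to 'trash' gives a blacklisting policy.
        rules = rules + [("trash", r".*")]
    for queue, pattern in rules:
        # fullmatch: the expression has to cover the whole record.
        if re.fullmatch(pattern, entry):
            return queue  # first match wins, later rules are not tried
    return "unmatched"  # without a catch-all rule, such entries get reported


print(classify("sshd: Accepted publickey for backup from 10.0.0.1", RULES))
print(classify("sshd: Accepted publickey for root from 10.0.0.1", RULES))
print(classify("kernel: oops", RULES))
print(classify("kernel: oops", RULES, catch_all_trash=True))
```

Swapping the two rules would send the 'backup' login to q1 as well, because the more general expression would then match first.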
Sometimes it is desirable to report on the fact that several events happened at a similar time, possibly in a particular order. As of version 2.6.1, samhain supports this in the following way:
First, individual events to be correlated need to be marked for keeping them, under an arbitrary user-defined label, for an arbitrary user-defined time. So the rule for matching an event has to be modified like this:
LogmonRule=KEEP(seconds,label):queue_label:(perl)regex matches a logfile entry against the provided regular expression, AND keeps it for the specified time in seconds, with the specified label. In other words, this rule is processed like any other rule, except that a memory of the event is also kept for the specified amount of time. So if you e.g. don't want a separate report for this individual event, just assign it to the trash queue.

To correlate events labelled label_one, label_two, etc., just build a regular expression that matches the labels, in the temporal order you want to check for. E.g. if the temporal order is irrelevant, you may want to match (label_one.*label_two)|(label_two.*label_one). Use this expression in a rule marked as CORRELATE(description), like this:

            LogmonRule=CORRELATE(description):queue_label:(perl)regex

| ![[Note]](stylesheet-images/note.png) | Old records in existing logfiles | 
|---|---|
| Because the 'keep' timeout is relative to the current time, correlation of old entries in logfiles (i.e. when, at startup, an existing logfile with old entries is scanned) will only work if you specify 'keep' timeouts that are long enough to cover the whole timespan from the first logfile record until now. | 
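
The interplay of KEEP and CORRELATE rules can be illustrated with a small Python sketch. It rests on assumptions that the text above does not spell out: that the labels of unexpired events are joined in temporal order (here with spaces) and that the CORRELATE expression is searched for within that string. The `KeptEvents` class and `correlate` function are hypothetical names, and Python's `re` module stands in for PCRE.

```python
import re


class KeptEvents:
    """Remember labelled events for a limited time (a stand-in for KEEP rules)."""

    def __init__(self):
        self._events = []  # list of (timestamp, expiry, label)

    def keep(self, label, seconds, now):
        """Store an event seen at time 'now' for 'seconds' seconds."""
        self._events.append((now, now + seconds, label))

    def labels(self, now):
        """Labels of unexpired events, oldest first, joined by spaces."""
        live = [(seen, label) for seen, expiry, label in self._events if expiry > now]
        return " ".join(label for _, label in sorted(live))


def correlate(pattern, kept, now):
    """True if the label sequence of the kept events matches the expression."""
    return re.search(pattern, kept.labels(now)) is not None


kept = KeptEvents()
kept.keep("event1", 120, now=0)
kept.keep("event2", 120, now=30)
either_order = r"(event1.*event2)|(event2.*event1)"
print(correlate(either_order, kept, now=60))   # both events still remembered
print(correlate(either_order, kept, now=130))  # event1 has expired by now
```

The second call also shows why the 'keep' timeout matters: once one of the events has expired, the expression can no longer match.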
To check whether a given event occurs at least once within some given interval, the rule for matching an event can be modified like this:
LogmonRule=MARK(seconds,description):queue_label:(perl)regex matches a logfile entry against the provided regular expression, AND checks whether it occurs at least once within the specified interval (seconds).

Processing of this rule will be no different than other rules otherwise, so if you e.g. only want a report for this event if it is missing, just assign it to the trash queue. However, in the latter case the severity for reporting the messages must be set separately with the LogmonMarkSeverity directive, because the 'trash' queue has no severity assigned:
LogmonMarkSeverity=severity sets the severity for reports on missing heartbeat messages if the messages themselves are assigned to the 'trash' queue (default: crit).

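
The heartbeat check behind a MARK rule amounts to remembering when the expression last matched and comparing that with the configured interval. A minimal Python sketch, assuming this simple "last seen" bookkeeping (the `Mark` class and its methods are hypothetical and not taken from the samhain sources):

```python
import re


class Mark:
    """Track whether an expected log entry shows up within a given interval."""

    def __init__(self, pattern, seconds, description, start):
        self.regex = re.compile(pattern)
        self.seconds = seconds
        self.description = description
        self.last_seen = start  # monitoring starts counting from here

    def feed(self, entry, now):
        """Record the time of every entry that matches the whole record."""
        if self.regex.fullmatch(entry):
            self.last_seen = now

    def missing(self, now):
        """True if no matching entry arrived within the last 'seconds' seconds."""
        return now - self.last_seen > self.seconds


mark = Mark(r".*backup finished.*", 3600, "nightly backup", start=0)
mark.feed("cron: backup finished ok", now=100)
print(mark.missing(now=3000))  # heartbeat seen recently
print(mark.missing(now=3701))  # more than 3600 seconds since the last heartbeat
```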
Samhain can automatically detect and report bursts of similar, repeated events in the monitored logfiles. Here, 'similar, repeated events' refers to events that differ only in details that can be expected to differ for events of the same kind: IP addresses, FQDNs, email addresses, and numbers. The event history goes back 12 minutes, and thus a report is triggered if the number of similar events within the last 12 minutes exceeds a given threshold (default: 24).
This feature is off by default. In order to switch it on, you need to set a reporting queue:
LogmonBurstQueue=queue sets the reporting queue for reporting bursts of similar log messages (default: don't report).

In addition, there are two more configurable parameters, one to set the triggering threshold (i.e. the number of messages within 12 minutes that need to be exceeded to raise an alert), and another one to indicate whether messages from the cron daemon should be considered as well (default: no):
LogmonBurstThreshold=number sets the number of repeated messages within 12 minutes that must be exceeded to report a burst of repeated messages (default: 24).

LogmonBurstCron=boolean determines whether bursts of repeated cron messages are reported as well (default: false).

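
One way to picture burst detection is to reduce each message to a "kind" by masking the variable details, and then to count messages of the same kind within a sliding 12 minute window. The Python sketch below follows that idea. It is an assumption about how such a detector could work, not a description of samhain's code: the normalizing expressions, the placeholder strings and the `BurstDetector` class are all hypothetical.

```python
import re
from collections import defaultdict, deque

WINDOW = 12 * 60  # length of the event history in seconds

# Order matters: emails before FQDNs, IP addresses before plain numbers.
_NORMALIZERS = [
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "<email>"),
    (re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b"), "<ip>"),
    (re.compile(r"\b[a-zA-Z][\w-]*(?:\.[a-zA-Z][\w-]*)+\b"), "<fqdn>"),
    (re.compile(r"\d+"), "<num>"),
]


def normalize(message):
    """Mask details that are expected to differ between similar events."""
    for regex, placeholder in _NORMALIZERS:
        message = regex.sub(placeholder, message)
    return message


class BurstDetector:
    def __init__(self, threshold=24):
        self.threshold = threshold
        self._seen = defaultdict(deque)  # normalized message -> timestamps

    def add(self, message, now):
        """Record a message; return True if its kind now exceeds the threshold."""
        times = self._seen[normalize(message)]
        times.append(now)
        while times and now - times[0] > WINDOW:
            times.popleft()  # forget events older than the history window
        return len(times) > self.threshold


detector = BurstDetector(threshold=3)
for i in range(5):
    msg = f"Failed password for root from 10.0.0.{i} port {4000 + i}"
    print(detector.add(msg, now=i))
```

With a threshold of 3, the fourth and fifth message report a burst, since all five reduce to the same normalized text.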
LogmonActive=boolean switches this module on or off (default: off).

LogmonSaveDir=/absolute/path sets the directory where checkpoint data for logfiles is stored (default: same as for database file).

LogmonClean=boolean deletes old checkpoint data that has been unmodified for 30 days or more (default: off).

LogmonInterval=seconds sets the interval for logfile checking (default: 10 seconds).

LogmonMarkSeverity=severity sets the severity for reports on missing heartbeat messages if the messages themselves are assigned to the 'trash' queue (default: crit).

LogmonBurstThreshold=number sets the number of repeated messages within 12 minutes that must be exceeded to report a burst of repeated messages (default: 24).

LogmonBurstQueue=queue sets the reporting queue for reporting bursts of similar log messages (default: don't report).

LogmonBurstCron=boolean determines whether bursts of repeated cron messages are reported as well (default: false).

LogmonDeadtime=seconds suppresses repeated reports of a correlated event within the given time (default: 60 seconds).

LogmonWatch=TYPE:path[:format] advises the module to monitor the logfile with the specified path, which is of type 'TYPE' (logfile types are uppercase). Some logfile types (e.g. Apache access logs) can be customized, and hence some format information must be provided.

| ![[Note]](stylesheet-images/note.png) | Do not quote the format | 
|---|---|
| Please note that it's neither required nor supported to add quotes around the format string. Likewise, quotes within the format should not be escaped. Wrong: LogmonWatch= Correct: LogmonWatch= |
Currently (samhain 2.6.4) the following logfile types are supported:

- SYSLOG: Standard UNIX style syslog files. Matching starts at the command (i.e. after the hostname). To select certain hostnames, place the rule under a LogmonHost directive (see below). If the LogmonHidePID option is used, the RE should not account for the process PID.
- APACHE: Apache (or compatible) webserver access and/or error logs. Required format information: either one of combined, common, or error (error log), or the Apache custom log format specification used (also '%{X-Forwarded-For}i' is recognized). The whole log line is matched. If there are virtual hosts (%v), then the LogmonHost directive will match the virtual host. In addition to the Apache format specifications, it is possible to insert a literal regular expression as RE{regex} (samhain 2.8.4+).
- SAMBA: Samba logfile format (multiline, timestamp and origin within samba source code on first line, log message on continuation lines). The RE will match the continuation line (with the log message) only.
- PACCT: BSD style process accounting (also available on Linux). This is a binary logfile. The module will build a text line like the 'last' command does, and match it against the RE.

  What is pacct good for? Note that pacct records contain only the executable name, not the arguments. This may look somewhat useless for shell accounts, but is quite useful for servers: how many different commands can e.g. postfix legitimately execute? Just a handful, indeed, and certainly none of them is /bin/sh! So if pacct says that the 'postfix' user has executed a shell, then this would be rather alarming...
- SHELL: A shell command. The full output on stdout will be read and matched. The PATH environment variable will be set to /sbin:/bin:/usr/sbin:/usr/bin:/usr/ucb, and the SHELL, IFS, and TZ variables will be defined. The command is executed via /bin/sh -c command.

LogmonHidePID=boolean is an option that only affects logfiles of type SYSLOG. It causes the PID to be stripped from the log line (before matching against the RE).

LogmonQueue=label:[interval]:(sum|report):severity[:alias] defines an output queue. Here, label is an arbitrary name which is used to assign rules to this queue; interval is the timespan over which messages are summarized if the queue is of type 'sum'; sum (summarize over some interval) and report (report each event separately and immediately) are the two queue types supported; and severity is the severity assigned to an event. Furthermore, it is optionally possible to specify an alias (which must be defined in the email configuration) to direct email for this rule to a specific list of recipients.

| ![[Note]](stylesheet-images/note.png) | |
|---|---|
| If you specify a list alias, email will still go to all defined email recipients unless filtered, e.g. with SetMailFilterNot = \[Logfile\]. That is, you may want to define recipients, filter them as above, and then define list aliases to be used in an event queue. See Section 4 for more information. | 
LogmonHost=(perl)regex causes the following rules to be applied only to entries for the matching host(s). It is ended implicitly by another LogmonHost directive, or explicitly by a LogmonEndHost directive.

LogmonEndHost explicitly ends a preceding LogmonHost directive.

LogmonGroup=(perl)regex causes the following rules to be applied only if the group regex matches (i.e. rules within the group are skipped if the group regex doesn't match). This can be used to improve the speed and efficiency of matching, e.g. by grouping regexes by a common prefix. A group is ended implicitly by another LogmonGroup directive, or explicitly by a LogmonEndGroup directive.

LogmonEndGroup explicitly ends a preceding LogmonGroup directive.

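
The efficiency gain from grouping comes from skipping all rules of a group after a single failed test of the group expression. The following Python sketch counts how many rule expressions are actually tried. It is an illustration only: the `CONFIG` structure and the `match_entry` function are hypothetical, whole-record matching of the group expression is an assumption, and Python's `re` module stands in for PCRE.

```python
import re

# Hypothetical layout: (group_regex or None, [(queue_label, rule_regex), ...])
CONFIG = [
    (r"pppd.*", [("q1", r"pppd:\s+primary.*"), ("q1", r"pppd:\s+secondary.*")]),
    (None, [("trash", r"cron.*")]),  # rules outside of any group
]


def match_entry(entry, config):
    """Return (queue_label, number_of_rule_regexes_tried) for one log entry."""
    tried = 0
    for group_regex, rules in config:
        # A failed group regex skips every rule inside the group.
        if group_regex is not None and not re.fullmatch(group_regex, entry):
            continue
        for queue, rule_regex in rules:
            tried += 1
            if re.fullmatch(rule_regex, entry):
                return queue, tried
    return None, tried


print(match_entry("pppd: secondary DNS address 10.0.0.2", CONFIG))
print(match_entry("cron: job started", CONFIG))
```

For the cron entry, the two pppd rules are never evaluated; only the group expression and the single ungrouped rule are tested.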
LogmonRule=queue_label:(perl)regex matches a logfile entry against the provided regular expression. If the expression matches, then captured subexpressions are replaced by '___', and the logfile entry is reported as specified for the queue referenced by queue_label. Non-captured subexpressions (i.e. subexpressions where the opening bracket is followed by '?:') are not replaced by '___', but reported literally.

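
The effect of capturing versus non-capturing subexpressions on the reported text can be tried out with a short Python sketch. It mimics the behaviour described above under the assumption that capture groups are not nested; the `mask_captures` helper is hypothetical, and Python's `re` module stands in for PCRE.

```python
import re


def mask_captures(pattern, entry, mask="___"):
    """Return the entry with each captured group replaced by the mask.

    Returns None if the expression does not match the whole entry.
    Assumes that capture groups are not nested.
    """
    match = re.fullmatch(pattern, entry)
    if match is None:
        return None
    spans = [match.span(i) for i in range(1, match.re.groups + 1)]
    result = entry
    # Replace from right to left so that earlier offsets stay valid.
    for start, end in sorted((s for s in spans if s != (-1, -1)), reverse=True):
        result = result[:start] + mask + result[end:]
    return result


entry = "sshd: Failed password for root from 10.0.0.1 port 22"
print(mask_captures(r"sshd: Failed password for (\S+) from (\S+).*", entry))
print(mask_captures(r"sshd: Failed password for (?:\S+) from (\S+).*", entry))
```

In the second call the user name is inside a non-capturing group and therefore stays visible in the report, while the address is still masked.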
LogmonRule=KEEP(seconds,label):queue_label:(perl)regex works as above, but additionally keeps the event label for seconds to perform event correlation.

LogmonRule=CORRELATE(description):queue_label:(perl)regex performs event correlation by matching the labels (as specified in KEEP rules) of a sequence of events against the given regular expression.

LogmonRule=MARK(seconds,description):queue_label:(perl)regex matches a logfile entry against the provided regular expression, AND checks whether it occurs at least once within the specified interval (seconds).


        [Logmon]
        #
        # Switch on the module
        #
        LogmonActive = yes

        # Check every second
        #
        LogmonInterval = 1

        # Strip PIDs from syslog messages
        #
        Logmonhidepid = true

        # Define a queue with severity 'crit'.
        # This is a 'report' queue, hence 'interval' (10)
        # will be ignored.
        #
        LogmonQueue = q1:10:report:crit

        # Define a second queue with severity 'alert'
        #
        LogmonQueue = q2:10:report:alert

        # Monitor /var/log/messages, which is a syslog file
        #
        LogmonWatch = SYSLOG:/var/log/messages

        # Monitor /var/log/samba/log.nmbd, which is a samba
        # logfile
        #
        LogmonWatch = SAMBA:/var/log/samba/log.nmbd

        # Monitor /var/log/apache2/access.log, which is
        # an Apache logfile in 'combined' format
        #
        LogmonWatch = APACHE:/var/log/apache2/access.log:combined

        # Monitor disks to check for full /dev/sda1
        #
        LogmonWatch = SHELL:df -h

        # Syslog messages for the pppd daemon
        #
        LogmonGroup = g1:pppd.*
        #
        # Rules in this group
        #
        LogmonRule = q1:pppd:\s+primary.*
        LogmonRule = q1:pppd:\s+secondary.*
        #
        LogmonEndGroup

        # Warn about disk /dev/sda1 nearly full (80% or more). Use a
        # non-capturing subexpression [the (?:8|9)] for the percentage full.
        #
        LogmonRule = q1:/dev/sda1\s+[0-9GM.]+\s+[0-9GM.]+\s+[0-9GM.]+\s+(?:8|9).%.*

        # Messages starting with WARNING (some samba stuff)
        #
        LogmonGroup = g2:WARNING.*
        LogmonRule = q2:.*interfaces.*
        LogmonEndGroup

        # Report on these events if happening within 120 seconds.
        # Set LogmonDeadtime to 120 seconds to avoid multiple reports.
        # Use the 'trash' queue for the keep rules to avoid reports on
        # the individual events.
        #
        LogmonRule = KEEP(120,event1):trash:sshd: Accepted publickey for root.*
        LogmonRule = KEEP(120,event2):trash:sshd: pam_unix\(sshd:session\).*
        LogmonRule = CORRELATE(root_login):q1:(event1.*event2)|(event2.*event1)
        LogmonDeadtime = 120

        # Throw away all non-matching entries. This amounts
        # to a blacklist policy (only report known bad).
        #
        # Usually considered bad practice!!! Use whitelisting!
        #
        # 'trash' is a built in queue, no definition needed.
        #
        LogmonRule = trash:.*