sendmail_bayes(8)          Expaminator          sendmail_bayes(8)



NAME
       sendmail_bayes - a Bayesian sendmail 'milter' for filtering spam

SYNOPSIS
       sendmail_bayes [-D level] [-X] [-h] [-v] [-c config-file]

DESCRIPTION
       sendmail_bayes is a Bayesian spam filter implemented as a
       sendmail 'milter'; it normally runs as a daemon.  Sendmail
       communicates with milters through either a unix-domain or
       an internet-domain socket, which must be specified in both
       sendmail's ".cf" file and sendmail_bayes's configuration
       file.

       Command-line options:

       -D Set the debug level; 'level' should be an integer in the
          range 0..50.  Higher values cause more internal variables
          to be dumped to stdout.  This is unrelated to the
          error-logging level 'DEBUG' set by 'loglevel'; '-D'
          should not be used during normal mail processing.

       -X Don't daemonise; useful only for debugging.

       -h Help; print the command-line options and exit.

       -v Version; print the version number and exit.

       -c Specify a configuration file.  If '-c config-file' is
          omitted, the pathname is taken from the environment
          variable 'SPAMCONFIG'.


CONFIGURATION
       The sendmail_bayes configuration file is composed of simple
       keyword-value pairs, one pair per line.  Keywords are not
       case-sensitive; keyword and value are separated by one or
       more spaces or tabs.  A comment symbol, '#', anywhere on a
       line causes all following text on that line to be ignored.
       If a keyword appears more than once, only its first
       occurrence in the file is used.
       This configuration file is shared by the database
       maintenance and testing utilities.
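
       The rules above can be sketched as a small parser.  This is
       only an illustration in Python, not the filter's own code;
       the function name and the sample values are hypothetical.

```python
def parse_config(text):
    """Parse keyword-value pairs; the first occurrence of a keyword wins."""
    settings = {}
    for raw_line in text.splitlines():
        # '#' anywhere on a line starts a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keyword and value are separated by one or more spaces or tabs.
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are not case-sensitive
        value = parts[1].strip() if len(parts) > 1 else ""
        # Keep only the first occurrence of each keyword.
        settings.setdefault(keyword, value)
    return settings


# Hypothetical sample input, for illustration only.
sample = (
    "# example values\n"
    "SpamLimit   0.999      # threshold\n"
    "guess\t0.4\n"
    "spamlimit   0.5        # ignored: a later duplicate\n"
)
config = parse_config(sample)
print(config)
```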

       approval_message <headername>
       optional; default is 'X-judged-non-spam'
       When a message has been judged  legitimate,  a  header  is
       added  to  it  containing  an  indication that it has been
       passed through the filter.  The host name,  which  may  be
       overridden by 'force_hostname', is appended automatically.

       force_hostname <hostname>
       optional; defaults to the host's fully qualified domain
       name.  'force_hostname' substitutes the specified string
       for the hostname in any headers the filter adds to the
       message.  (Currently, only one header is added: an
       'acceptance' header, if the message is judged not to be
       spam.)

       force_domainname <domain-name>
       optional; defaults to the host's DNS domain name.
       When a 'rcpt to:' command is received and a user-hash
       exists, the filter appends the host's domain to usernames
       which are not fully qualified before looking up the name
       in the user-hash.  'force_domainname' causes the specified
       string to be appended instead.

       guess <number>
       optional; defaults to 0.4
       The probability assigned to any word in a message that is
       not found in the probability hash.  Valid values are in
       the range 0..1.0.

       logfile <pathname>
       required.  Normally this is the same logfile used by
       sendmail.

       loglevel <name>
       optional; defaults to "INFO".
       Valid   values  are,  in  order  of  decreasing  severity,
       "EMERG", "ALERT", "CRIT",  "ERROR",  "WARNING",  "NOTICE",
       "INFO", and "DEBUG".

       number_to_consider <integer>
       optional; defaults to 100
       When all words in a message have been assigned a
       probability, the probabilities are sorted according to the
       absolute value of their difference from 0.5;
       'number_to_consider' is the number of the highest-ranking
       probabilities to use in the calculation of the "spam
       probability".  Low values for number_to_consider may result
       in unreliable judgements; high values impose a slightly
       higher CPU load.
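
       The ranking step can be sketched as follows.  This is a
       minimal Python illustration, not the filter's own code: the
       function name and sample data are hypothetical, and treating
       each distinct word once is an assumption.

```python
def select_interesting(words, probability_hash, guess=0.4,
                       number_to_consider=100):
    """Return the probabilities furthest from 0.5, most extreme first."""
    # Assumption: each distinct word is counted once.
    distinct = sorted(set(words))
    # Words missing from the hash receive the 'guess' probability.
    probs = [probability_hash.get(word, guess) for word in distinct]
    # Rank by absolute difference from the neutral value 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    return probs[:number_to_consider]


# Hypothetical probability hash and message words.
example_hash = {"viagra": 0.99, "meeting": 0.05, "the": 0.5}
example_words = ["viagra", "meeting", "the", "zyzzy", "the"]
top = select_interesting(example_words, example_hash,
                         guess=0.4, number_to_consider=3)
print(top)
```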

       probabilityhash <filename>
       required.  This is the name of the probability hash (or,
       more usually, the symbolic link to it) produced by
       'make_new_database'.  Only a simple filename is required;
       the directory is supplied by 'spamdatadir'.

       sendmail_listen <address-family>:<pathname | portnumber>
       required.   The internet-domain port or unix-domain socket
       to be used by the sendmail milter library  to  communicate
       with sendmail.
       For a unix domain socket, use:
       sendmail_listen  unix:/full/pathname/of/socket , or:
       sendmail_listen  local:/full/pathname/of/socket
       For an internet-domain connection, use:
       sendmail_listen  inet:portnumber
       This must, of course, match exactly the 'Xfilter'
       configuration line in sendmail.cf.
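
       As an illustration, a matching pair of settings for a
       unix-domain socket might look like the fragment below.  The
       filter name 'bayes' and the socket pathname are hypothetical,
       and any flags or timeouts a site adds to the sendmail.cf
       line are omitted.

```
# sendmail_bayes configuration file
sendmail_listen  local:/var/run/sendmail_bayes.sock

# sendmail.cf
Xbayes, S=local:/var/run/sendmail_bayes.sock
O InputMailFilters=bayes
```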

       spamdatadir <directory>
       required.  The directory containing the  probability  hash
       and the optional username hash.

       spamlimit <number>
       optional; defaults to 0.999
       The value which the estimated spam probability must exceed
       before a message is condemned and rejected as spam.
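
       The threshold test can be sketched as below.  This page does
       not state how the individual word probabilities are combined;
       the naive-Bayes combination used here, computed through
       log-odds to avoid underflow, is an assumption, and the
       function names are hypothetical.

```python
import math


def combined_probability(probs, eps=1e-6):
    """Combine word probabilities (assumed naive-Bayes rule)."""
    log_odds = 0.0
    for p in probs:
        # Clamp so that 0.0 and 1.0 do not produce infinite log-odds.
        p = min(max(p, eps), 1.0 - eps)
        log_odds += math.log(p) - math.log(1.0 - p)
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


def is_spam(probs, spamlimit=0.999):
    """A message is condemned only if the estimate exceeds spamlimit."""
    return combined_probability(probs) > spamlimit


print(is_spam([0.99, 0.99, 0.99]))
print(is_spam([0.99, 0.05, 0.4]))
```

       With the default spamlimit of 0.999, a few strongly spam-like
       words are needed before a message is rejected, which keeps
       the filter conservative.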

       user <username | userid>
       optional;  defaults to the userid of the  process  running
       the filter.
       If specified, the filter will setuid to this user before
       entering the milter library code.

       username_db <filename>
       optional; the user 'opt-in' database; no default.
       If no username hash is specified, the filter will be used
       on all messages, regardless of recipient.
       If one is specified, filtering will be done only for
       messages destined for names found in the hash.
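
       The opt-in decision, together with the domain qualification
       described under 'force_domainname', can be sketched as
       follows.  This is a Python illustration only; the function
       names are hypothetical, a Python set stands in for the
       username hash, and treating a name without '@' as "not fully
       qualified" is an assumption.

```python
def qualify(recipient, domain):
    """Append the domain to a recipient that is not fully qualified."""
    recipient = recipient.strip("<>")
    if "@" in recipient:
        return recipient
    return recipient + "@" + domain


def should_filter(recipient, user_hash, domain):
    """Decide whether a message for this recipient is filtered."""
    # No username hash configured: every message is filtered.
    if user_hash is None:
        return True
    # Otherwise only recipients found in the hash are filtered.
    return qualify(recipient, domain) in user_hash


# Hypothetical opt-in list and domain.
opted_in = {"alice@example.org"}
print(should_filter("alice", opted_in, "example.org"))
print(should_filter("bob", opted_in, "example.org"))
```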


ENVIRONMENT
       $SPAMCONFIG can be used to supply the full pathname of the
       configuration file.  (The '-c config-file' option will
       override $SPAMCONFIG.)
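
       The precedence between the two can be sketched as below.
       This is a Python illustration with a hypothetical function
       name; what the filter does when neither source is set is not
       stated here, so the sketch simply returns None.

```python
import os


def config_path(cli_value, environ=os.environ):
    """Return the configuration pathname; '-c' overrides $SPAMCONFIG."""
    if cli_value:
        return cli_value
    return environ.get("SPAMCONFIG")


# Hypothetical pathnames, for illustration only.
print(config_path("/etc/mail/bayes.conf", {"SPAMCONFIG": "/tmp/test.conf"}))
print(config_path(None, {"SPAMCONFIG": "/tmp/test.conf"}))
```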


FILES
       Required: a configuration file and a probability hash
       Optional (but recommended): a username hash


COPYRIGHT
       Copyright (c) 2002, J.B.Ward
       <bward2@users.sourceforge.net>




Expaminator                Nov.22,2002          sendmail_bayes(8)