sendmail_bayes(8)          Expaminator          sendmail_bayes(8)
NAME
       sendmail_bayes
SYNOPSIS
       sendmail_bayes [-D level] [-X] [-h] [-v] [-c config-file]
DESCRIPTION
       A Bayesian sendmail 'milter' for spam.
       sendmail_bayes  is a sendmail 'milter' which normally runs
       as a daemon.  Sendmail communicates with  milters  through
       either  a  unix  or  internet-domain socket, which must be
       specified in both sendmail's ".cf" file and sendmail_bayes's
       configuration file.
       Command-line options:
       -D Set the debug level; 'level' should be an integer in the
          range 0..50.  Higher values cause more internal variables
          to be dumped to stdout.  This is not associated with the
          error-logging level, ´debug´; '-D' should not be used
          during normal mail processing.
       -X Don't daemonise; useful only for debugging.
       -h Help; print the command-line options and exit.
       -v Version; print the version number and exit.
       -c Specify a configuration file.  If '-c config-file' is
          omitted, the environment variable 'SPAMCONFIG' is used.
CONFIGURATION
       sendmail_bayes'  configuration  file is composed of simple
       keyword-value pairs, one pair per line.  Keywords are  not
       case-sensitive;  keyword and value are separated by one or
       more spaces or tabs.  A comment symbol, '#', anywhere on a
       line causes all following text to be ignored.  sendmail_bayes
       stops scanning for a keyword at its first occurrence in the
       file, so later duplicates of that keyword are ignored.
       This configuration file is shared by the database maintenance
       and testing utilities.
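       The file format described above can be sketched in a few lines
       of Python.  This is only an illustration of the stated rules
       (comments, whitespace separation, case-insensitive keywords,
       first occurrence wins), not the filter's own parser; the
       function name and the sample values are hypothetical.

```python
def parse_config(text):
    """Parse keyword-value lines; the first occurrence of a keyword wins."""
    settings = {}
    for raw_line in text.splitlines():
        # A '#' anywhere on the line starts a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # Keyword and value are separated by one or more spaces or tabs.
        parts = line.split(None, 1)
        keyword = parts[0].lower()  # keywords are not case-sensitive
        value = parts[1].strip() if len(parts) > 1 else ""
        # Later duplicates are ignored, mirroring "first occurrence" scanning.
        settings.setdefault(keyword, value)
    return settings


# Hypothetical sample values, for illustration only.
example = """\
# sample configuration
logfile          /var/log/maillog
SpamLimit        0.999   # reject above this probability
spamlimit        0.5
sendmail_listen  inet:2526
"""
config = parse_config(example)
print(config["spamlimit"])
```

       In this sample the second 'spamlimit' line has no effect,
       because the keyword was already found earlier in the file.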
       approval_message <headername>
       optional; default is 'X-judged-non-spam'
       When a message has been judged  legitimate,  a  header  is
       added  to  it  containing  an  indication that it has been
       passed through the filter.  The host name,  which  may  be
       overridden by 'force_hostname', is appended automatically.
       force_hostname <hostname>
       optional; defaults to the host's  fully  qualified  domain
       name.   ´force_hostname´  substitutes the specified string
       for the hostname in any headers the  filter  adds  to  the
       message.  (Currently, only one header is added, an ´acceptance´
       header, if the message is judged not to be spam.)
       force_domainname <domain-name>
       optional; defaults to the host's DNS domain name.
       When a ´rcpt to:´ command is  received,  and  a  user-hash
       exists,  the  filter will append the host's domain to user
       names which are not fully qualified before looking up  the
       name  in the user-hash.  ´force_domainname´ will cause the
       specified string to be appended, instead.
       guess <number>
       optional; defaults to 0.4
       This supplies a probability to any word found in a message
       which cannot be found in the probability hash.  Valid values
       are in the range 0..1.0.
       logfile <pathname>
       required.  Normally, this will be the same logfile used by
       sendmail.
       loglevel <name>
       optional; defaults to "INFO".
       Valid   values  are,  in  order  of  decreasing  severity,
       "EMERG", "ALERT", "CRIT",  "ERROR",  "WARNING",  "NOTICE",
       "INFO", and "DEBUG".
       number_to_consider <integer>
       optional; defaults to 100
       When all words in a message have been assigned a probability,
       the probabilities are sorted according to the absolute value
       of their difference from 0.5; ´number_to_consider´ is the
       number of the highest-ranking probabilities to use in the
       calculation of "spam probability".  Low
       values for number_to_consider  may  result  in  unreliable
       judgements; high values impose a slightly higher cpu load.
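       The selection step can be sketched as follows.  This page does
       not state how the selected probabilities are combined, so the
       product formula below is an assumption (the usual naive-Bayes
       combination), as are the function name, the choice to score
       each distinct word once, and the sample hash.

```python
from math import prod


def spam_probability(words, probability_hash, guess=0.4, number_to_consider=100):
    """Estimate a spam probability from per-word probabilities."""
    # Words missing from the hash receive the 'guess' probability.
    # ASSUMPTION: each distinct word is scored once.
    probs = [probability_hash.get(word, guess) for word in set(words)]
    # Rank by distance from the neutral value 0.5, most decisive first.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    top = probs[:number_to_consider]
    # ASSUMPTION: naive-Bayes style combination; not taken from this page.
    spam = prod(top)
    ham = prod(1.0 - p for p in top)
    if spam + ham == 0.0:
        return 0.5  # contradictory evidence (a 0.0 and a 1.0 together)
    return spam / (spam + ham)


# Hypothetical probability hash, for illustration only.
sample_hash = {"viagra": 0.99, "meeting": 0.05, "free": 0.9}
message = ["free", "viagra", "meeting", "tomorrow"]
score = spam_probability(message, sample_hash, guess=0.4, number_to_consider=3)
print(score > 0.999)
```

       With 'number_to_consider' set to 3, the unknown word
       "tomorrow" (scored at the 'guess' value, 0.4) lies closest to
       0.5 and is therefore left out of the calculation.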
       probabilityhash <filename>
       required.  This is the name of the  probability  hash  (or
       more  usually,  the  symbolic  link  to  it)  produced  by
       ´make_new_database´.  Only a simple filename is  required;
       the directory is supplied by ´spamdatadir´.
       sendmail_listen <address-family>:<pathname | portnumber>
       required.   The internet-domain port or unix-domain socket
       to be used by the sendmail milter library  to  communicate
       with sendmail.
       For a unix domain socket, use:
       sendmail_listen  unix:/full/pathname/of/socket , or:
       sendmail_listen  local:/full/pathname/of/socket
       For an internet-domain connection, use:
       sendmail_listen  inet:portnumber
       This must, of course, match exactly the ´Xfilter´ configuration
       line in sendmail.cf.
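       As an illustration of the three accepted forms, the following
       sketch splits a 'sendmail_listen' value into an address family
       and an address.  It covers only the forms listed above; the
       function name and the sample socket path and port number are
       hypothetical.

```python
def parse_listen(spec):
    """Split '<address-family>:<pathname | portnumber>' into its parts."""
    family, separator, address = spec.strip().partition(":")
    if not separator or not address:
        raise ValueError("expected <address-family>:<address>: %r" % spec)
    if family in ("unix", "local"):
        # 'unix' and 'local' both name a unix-domain socket path.
        return ("unix", address)
    if family == "inet":
        # An internet-domain connection is given as a port number.
        return ("inet", int(address))
    raise ValueError("unknown address family: %r" % family)


print(parse_listen("local:/var/run/bayes.sock"))
print(parse_listen("inet:2526"))
```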
       spamdatadir <directory>
       required.  The directory containing the  probability  hash
       and the optional username hash.
       spamlimit <number>
       optional; defaults to 0.999
       The value which the estimated spam probability must exceed
       before the message is condemned and rejected as spam.
       user <username | userid>
       optional;  defaults to the userid of the  process  running
       the filter.
       If  found,  the  filter  will  setuid  to this user before
       entering the milter library code.
       username_db <filename>
       optional; the user ´opt-in´ database; no default.
       If no username hash is specified, the filter will be  used
       on all messages, regardless of recipient.
       If one is specified, filtering will be done only for messages
       destined for names found in the hash.
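       The opt-in rule, together with the domain qualification
       described under 'force_domainname', can be sketched as below.
       The username hash is modelled as a plain Python set, and a
       name is treated as fully qualified when it contains '@'; both
       are assumptions made for illustration, as are the function
       names and sample addresses.

```python
def qualify(recipient, domain):
    """Append the domain to a name that is not already fully qualified."""
    # ASSUMPTION: "fully qualified" means the name contains an '@'.
    if "@" in recipient:
        return recipient
    return recipient + "@" + domain


def should_filter(recipient, opt_in_names, domain):
    """Decide whether a message for this recipient is filtered."""
    if opt_in_names is None:
        # No username hash configured: filter every message.
        return True
    # Otherwise filter only for names present in the hash.
    return qualify(recipient, domain) in opt_in_names


# Hypothetical opt-in list and domain.
opted_in = {"alice@example.org"}
print(should_filter("alice", opted_in, "example.org"))
print(should_filter("bob", opted_in, "example.org"))
```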
ENVIRONMENT
       $SPAMCONFIG can be used to supply the full pathname of the
       configuration  file.   (The  ´-c  config-file´ option will
       override $SPAMCONFIG.)
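       The precedence between the two sources can be expressed as a
       short sketch.  The function name is hypothetical, and what the
       filter does when neither source is set is not described here,
       so the sketch simply returns None in that case.

```python
import os


def resolve_config_path(cli_value, environ=os.environ):
    """Pick the configuration file: '-c' first, then $SPAMCONFIG."""
    if cli_value:
        # A pathname given with '-c' overrides the environment.
        return cli_value
    # Fall back to the environment variable; None if it is unset.
    return environ.get("SPAMCONFIG")


# Hypothetical pathnames, for illustration only.
print(resolve_config_path("/etc/mail/bayes.conf", {"SPAMCONFIG": "/tmp/test.conf"}))
```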
FILES
       Required:  a configuration file and probability hash
       Optional (but recommended): a username hash
COPYRIGHT
       Copyright (c) 2002, J.B.Ward
       <bward2@users.sourceforge.net>
Expaminator                Nov.22,2002          sendmail_bayes(8)