create_probability_hash(1) Expaminator create_probability_hash(1)
NAME
create_probability_hash - create a word-probability hash
from a good-word hash and a spam-word hash
SYNOPSIS
Usage: create_probability_hash [-v] [-d] [-f] \
probability-hash good-hash spam-hash
DESCRIPTION
create_probability_hash creates a dictionary of all words
found in both the "normal" (or "good") word-hash and the
spam word-hash.
Currently, the probability assigned to each word is
calculated exactly as outlined in Paul Graham's "A Plan for
Spam", <http://www.paulgraham.com/spam.html>.
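As an illustration of the calculation described in "A Plan
for Spam", the following Python sketch computes the
probability for a single word. The function name and its
parameters are hypothetical and are not taken from
create_probability_hash itself. In particular, this page
does not say how the program obtains the number of good and
spam messages, so the sketch assumes the caller supplies
them as 'ngood' and 'nbad'.

```python
def graham_probability(good_count, spam_count, ngood, nbad):
    """Spam probability of one word, following "A Plan for Spam".

    good_count / spam_count: occurrences of the word in each corpus.
    ngood / nbad: number of messages in each corpus (assumed to be
    supplied by the caller; both must be greater than zero).
    Returns None for words seen too rarely to be scored.
    """
    g = 2 * good_count      # good occurrences are doubled to bias against false positives
    b = spam_count
    if g + b < 5:           # too little evidence: the word gets no probability
        return None
    good_ratio = min(1.0, g / ngood)
    bad_ratio = min(1.0, b / nbad)
    # Clamp to [0.01, 0.99] so no single word is treated as absolute proof.
    return max(0.01, min(0.99, bad_ratio / (good_ratio + bad_ratio)))
```

A word seen only in spam is scored 0.99, a word seen only
in good mail is scored 0.01, and a word with fewer than
five weighted occurrences is skipped.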
Command-line options:
-v be verbose; print a dot for every 1000 words
processed.
-d write debugging messages (frequency & probability of
each word)
-f 'force'; if 'probability-hash' already exists, delete
and re-create it.
probability-hash - probability hash to be created
good-hash - hash of all words in non-spam messages
spam-hash - hash of all words in spams
If a hashfile name is a bare file name (no directory part),
the value of the environment variable '$SPAMDIR' is
prepended to it. If '$SPAMDIR' is not set, the current
working directory is used. To use the current working
directory even when '$SPAMDIR' is set, use the form:
'./hashfile-name'
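The lookup rule above can be sketched in Python as follows.
This is an illustration only, not the program's own code:
the function name 'resolve_hash_path' is hypothetical, and
treating an empty '$SPAMDIR' the same as an unset one is an
assumption.

```python
import os

def resolve_hash_path(name, environ=None):
    """Return the path used for a hashfile name (hypothetical helper)."""
    env = os.environ if environ is None else environ
    # A "bare" file name has no directory component at all.
    if os.path.dirname(name) == "":
        spamdir = env.get("SPAMDIR")
        if spamdir:  # assumption: an empty value counts as unset
            return os.path.join(spamdir, name)
        return name  # resolved against the current working directory
    # Names such as './hashfile-name' are used exactly as given.
    return name
```

With SPAMDIR=/var/spam, 'good-hash' resolves to
'/var/spam/good-hash', while './good-hash' stays in the
current working directory.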
ENVIRONMENT
$SPAMDIR, as discussed above.
FILES
Required: A "normal" (non-spam) hash and a spam hash,
both created by 'create_word_hash'
SEE ALSO
make_new_database, create_word_hash
COPYRIGHT
Copyright (c) 2002, J.B.Ward
<bward2@users.sourceforge.net>
Expaminator Nov.29,2002 create_probability_hash(1)