gps - greylist policy service for postfix

1.005

Author:
Michael Moritz mimo/at/restoel.net
Last modified on
Thu Aug 30 16:27:02 2007

Introduction

gps, firstly, is an implementation of a greylist policy service for postfix. Greylisting is a concept to reduce the amount of UCE ('spam') by technical means. Tests on production systems show that greylisting is hugely effective against spam. Read more about greylisting on http://www.greylisting.org and http://projects.puremagic.com/greylisting/whitepaper.html

Secondly, gps takes greylisting one step further starting with version 1.0. Based on the experience of using greylisting in a production environment, gps comes with features that hugely reduce the problems of the original greylisting concept. These improvements make gps' greylisting usable for ISPs and big mail system setups.

gps' main features are:

  • Uses a database backend through DBI
  • Allows sharing one database between all mail servers
  • Supports various database types, tested with mysql, postgresql, sqlite
  • Written in C/C++, using STL and libstdc++
  • Good compromise between speed and safety
  • Logging via syslog
  • Whitelisting by client network addresses, recipient and sender email address
  • Pattern matching based whitelisting (regex): wl_pattern
  • Database maintenance through a customisable perl script run by crond
  • Supports weak greylisting
  • Unique method of reverse weak/light greylisting to correctly identify mail from mail relays
  • Confirmed to run on Linux and FreeBSD so far

Project Status

  • Version 1.x
    • Stable, more than two years of running on production mail system with two mail servers
  • Version 0.x
    • Tested on Debian, RedHat 9 and 7.2
    • Ongoing test on production mail system with two mailservers sharing one MySQL database

Installation

To build gps from source the following packages are required:

To build gps unpack the source tar ball (not if using SVN) and run configure and make. Since gps is under development you may have to do:

tar xvfz gps-<version>.tar.gz
[OR] 
tar xvfz gps-<version>.tar.gz
cd gps-X.X (or cd release-<version>)
make -f Makefile.cvs
./configure
make
make install
Alternatively, it is often possible to build gps manually by running (works on debian):
g++ -s -o gps configreader.cpp db.cpp main.cpp read.cpp cfg.cpp dbdefs.cpp wlcacheddb.cpp signals.cpp -ldbi -ldl 

Note:
If you get stuck with installing gps post your problem in the gps Forum

Configuration

First create an empty database for greylisting. How to do this depends on the database backend.

Example for mysql (replace the password secret with your own):

# mysql -p
> CREATE DATABASE greylist;
> GRANT ALL ON greylist.* TO 'greylist' IDENTIFIED BY 'secret';
> QUIT
Note:
etc/gps.pgsql.conf in the package contains a step by step example on how to do this in postgresql.
gps will create its Triplets table (and other tables) when it is run in mode=init.

Add gps to your master.cf and main.cf files as described in the postfix documentation under greylisting (taken from http://www.postfix.org/SMTPD_POLICY_README.html):

/etc/postfix/master.cf:
    policy  unix  -       n       n       -       -       spawn
      user=nobody argv=/usr/local/bin/gps /usr/local/etc/gps.conf

/etc/postfix/main.cf:
    smtpd_recipient_restrictions =
        ... 
        reject_unauth_destination 
        check_policy_service unix:private/policy 
        ...
    policy_time_limit = 3600
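
The policy delegation protocol described in SMTPD_POLICY_README is line based: postfix sends name=value attributes terminated by an empty line, and the policy service answers with one action= line followed by an empty line. The Python sketch below only illustrates that exchange. It is not gps code, and parse_request and format_response are hypothetical helper names.

```python
def parse_request(text):
    """Parse one policy request (name=value lines ended by an empty line)."""
    attrs = {}
    for line in text.splitlines():
        if not line:          # an empty line terminates the request
            break
        name, _, value = line.partition("=")
        attrs[name] = value
    return attrs


def format_response(action):
    """Build the reply: one action= line followed by an empty line."""
    return "action=" + action + "\n\n"


request = (
    "request=smtpd_access_policy\n"
    "protocol_state=RCPT\n"
    "sender=foo@blabla.org\n"
    "recipient=blabla@foo.org\n"
    "client_address=192.168.0.1\n"
    "\n"
)
attrs = parse_request(request)
# The greylisting triplet is built from these three attributes.
triplet = (attrs["sender"], attrs["recipient"], attrs["client_address"])
print(triplet)
print(format_response("defer_if_permit Service is unavailable"), end="")
```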

Syntax

gps [-v] configfile
Parameters:
-v enables verbose log messages
configfile your config file including path

Configuration File

The following options are used in the configuration file.
Note:
the keys and values are case sensitive.
Defaults, where given, are shown in parentheses.

Parameter         Possible values or range         Description                                       Depends on           Version
mode              normal | init | weak | reverse   Sets the greylisting mode
weakbytes         0 - 4 (3)                        Number of significant bytes of client IP address  mode=weak(|reverse)  0.92
dbtype            mysql | sqlite | pgsql           Database type
db_host           hostname or IP address           Database server                                   dbtype
db_username       username                         Database user name                                dbtype
db_password       password                         Database password                                 dbtype
db_dbname         database name                    Database name                                     dbtype
db_port           port number                      Database port                                     dbtype=pgsql         0.9
db_pgsql_options  Postgres options                 Postgres options                                  dbtype=pgsql         0.9
db_pgsql_tty      /dev/ttyX (/dev/null)            Postgres logging                                  dbtype=pgsql         0.9
db_sqlite_dbdir   path (permissions!)              SQLite database path                              dbtype=sqlite        0.9
timeout           seconds (3600=1 hour)            Greylisting timeout                               (mode=init)
wl_network        off | db | dbcached (off)        Network whitelisting mode                                              0.8
wl_recipient      off | db | dbcached (off)        Recipient whitelisting mode                                            0.8
wl_sender         off | db | dbcached (off)        Sender whitelisting mode                                               0.8
wl_pattern        off | db | dbcached (off)        Pattern matching whitelisting mode                                     0.91

mode

mode tells gps in which mode to run. Default is init

timeout

timeout sets the number of seconds that must pass before a new triplet of sender, recipient and client address is allowed through the greylisting system. The greylisting whitepaper suggests 3600 seconds (1 hour). A shorter timeout (e.g. 60 seconds) keeps users happy and is still very effective. Default is 3600.
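
As a rough illustration of how the timeout enters the greylisting decision, here is a minimal Python sketch. It is not the gps implementation: the in-memory dict stands in for the triplets table, check_triplet is a hypothetical name, and whether gps compares with "less than" or "less than or equal" at the boundary is an assumption.

```python
TIMEOUT = 3600  # seconds, same meaning as the timeout option


def check_triplet(seen, triplet, now, timeout=TIMEOUT):
    """Return the policy action for a (sender, recipient, client) triplet.

    seen maps each triplet to the time it was first seen.
    """
    first_seen = seen.get(triplet)
    if first_seen is None:
        seen[triplet] = now            # new triplet: remember it and defer
        return "defer_if_permit"
    if now - first_seen < timeout:
        return "defer_if_permit"       # retried too early: keep waiting
    return "dunno"                     # timeout has passed: let postfix decide


seen = {}
t = ("foo@blabla.org", "blabla@foo.org", "192.168.0.1")
print(check_triplet(seen, t, now=0))      # first attempt
print(check_triplet(seen, t, now=34))     # retry after 34 seconds
print(check_triplet(seen, t, now=3600))   # retry once the timeout has passed
```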

dbtype

dbtype sets the database type to use. This must be set to the same name libdbi expects. Currently, libdbi-drivers support mysql, pgsql (version 0.9+), sqlite (version 0.9+), msql (?), oracle (?). gps will exit and log a list of available drivers if the specified driver is not installed (this comes in handy for checking the libdbi installation). Default value is mysql.

db_<db parameter>

db_<db parameter> This is the list of parameters to be passed on to libdbi to make the database connection. The db parameters depend on the driver. See the example configuration file below and the included gps.conf, gps.sqlite.conf and gps.pgsql.conf for examples of how to use the different database backends. Example db parameters for dbtype=mysql:
db_host=localhost
db_username=gps
db_password=secret
db_dbname=greylist

Whitelisting

A good greylisting implementation should include several ways of whitelisting. Many mail systems do not conform to the SMTP specification, and some big ISPs use multiple mail servers on the same subnet. In the worst case mails get bounced back. Weak greylisting (sometimes called light greylisting) is one way to attack this problem; whitelisting costs less CPU and results in better spam reduction. In the 1.x series gps uses a better approach: the mode reverse solves the issues with mail relays and thus reduces the need to whitelist.

The following whitelisting options are provided by gps. They can be used in any combination. If you try to optimise your configuration, bear in mind that the whitelisting tables get processed before the triplets table.

Whitelisting Database Modes (version 0.8+)

All whitelisting modules can use different ways of storing data and looking up entries. The mode to use is set in the configuration file.
wl_<module>=<mode>
Supported modes are:
Parameters:
off Is the default. This whitelisting module is not used.
db (version 0.7b+) The whitelisting data is stored in a table with the name of the whitelisting module. If gps is run in mode=init it will check if the table exists and create it if necessary. Setting a module to db makes gps check every triplet against the whitelisting module's table before checking the main triplets table. Therefore, every enabled whitelisting module generates one more SQL query.
dbcached (version 0.8+) When gps is started it reads the module's whitelisting table and creates a memory cache of it which it uses to do subsequent lookups. This uses more memory than db and results in longer startup times, but means fewer SQL queries, and is - once initialised - much faster than db.
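
The difference between db and dbcached can be pictured with the following Python sketch. It is an illustration only and not how gps is implemented: it uses an in-memory SQLite database, and the column names address and comment are hypothetical, since the real table layout is not documented here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the column names are an assumption for this sketch.
conn.execute("CREATE TABLE recipient (address TEXT, comment TEXT)")
conn.execute("INSERT INTO recipient VALUES (?, ?)",
             ("bla@mydomain.com", "this user wants his spam"))


def lookup_db(address):
    """db mode: one SQL query per lookup."""
    row = conn.execute(
        "SELECT comment FROM recipient WHERE address = ?", (address,)
    ).fetchone()
    return row[0] if row else None


# dbcached mode: read the whole table once at startup ...
cache = dict(conn.execute("SELECT address, comment FROM recipient"))


def lookup_cached(address):
    """... then answer every lookup from memory, without SQL."""
    return cache.get(address)


print(lookup_db("bla@mydomain.com"))
print(lookup_cached("bla@mydomain.com"))
print(lookup_cached("other@mydomain.com"))
```

In this sketch, rows inserted after the cache has been built are only seen by lookup_db, which is the price paid for avoiding a query per lookup.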

wl_network

wl_network (version 0.7b+) sets the network whitelisting mode. With wl_network=db, gps checks the table network before anything else to see whether the network block of the client address has been whitelisted. To turn it off use wl_network=off. Default: off.

Example of adding a whitelisting entry in mysql

> use greylist;
> insert into network values ('192.168.0.','my home network');
> quit (or CTRL+D)
Note:
The last dot of the netblock is mandatory!
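
Why the trailing dot matters can be seen in a small Python sketch. It assumes that a netblock entry is compared as a plain string prefix of the client address. That is an assumption about gps, not something stated above, and is_whitelisted is a hypothetical name.

```python
def is_whitelisted(client_address, netblocks):
    """Return True if the address starts with any whitelisted netblock."""
    return any(client_address.startswith(block) for block in netblocks)


print(is_whitelisted("192.168.1.7", ["192.168.1."]))    # True
print(is_whitelisted("192.168.10.7", ["192.168.1."]))   # False
print(is_whitelisted("192.168.10.7", ["192.168.1"]))    # True: unintended match
```

Without the final dot, the entry for 192.168.1 would also cover 192.168.10.x to 192.168.19.x and 192.168.100.x to 192.168.199.x under this prefix assumption.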

wl_recipient, wl_sender

wl_recipient and wl_sender (version 0.8+) set the recipient and sender whitelisting modes. With wl_recipient=db, gps checks the table recipient before anything else to see whether the recipient address has been whitelisted. To turn it off use wl_recipient=off. wl_sender works the same way for the sender address. Default: off.

Example of adding a whitelisting entry in mysql

> use greylist;
> insert into recipient values ('bla@mydomain.com','this user wants his spam');
> quit (or CTRL+D)

wl_pattern

wl_pattern (version 0.91+) allows whitelisting based on regular expression matching.

The regular expressions in wl_pattern can, theoretically, be used to replace any of the other whitelisting modules. Furthermore, it can be used to implement complex whitelisting rules combining several conditions. Nevertheless, it should only be used if none of the other modules suit the task. It is much slower by itself and also because all its patterns will be tested against any incoming triplet. The other modules use database or string map based lookups. If wl_pattern has to be used this should be done by setting it to wl_pattern=dbcached thus reducing the number of database queries.

gps builds a text that expressions can be matched against for advanced whitelisting solutions. The format of the gps internal representation is:

s=someuser@yahoo.com
r=someuser@mydomain.org
c=216.145.54.171
h=mrout1.yahoo.com
If a pattern contains an h= line, gps does a reverse name lookup, which makes gps slower. If no pattern contains h=, the reverse lookup is skipped (it is a good idea to run nscd when reverse lookups are used). From version 1.x on gps also does the lookup when it is run in mode reverse. In that case the reverse name lookup has already been performed, so patterns with h= make no difference to performance. Tests also show that the effect of reverse lookups is not as bad as originally assumed.

In the above example the IP address resolves to one of Yahoo's servers. This pattern uses reverse name lookup and matches the example:

> insert into pattern values(".+^h=.*yahoo\.com.+$","yahoo");
Another example: this whitelists one of your mail domains completely
> insert into pattern values(".+^r=.*@someorg\.org.+$","someorg want all spam");
A more complex example for a common situation. A user has problems with receiving mail from someone particular. In this example we even know the sender's mail server's IP address -- well at least the first byte:
^s=user.+^r=myuser@mydomain.+^c=210
If you wanted to specify the user's full address it would look like this:
^s=user.+^r=myuser@mydomain\.org.+^c=210
Note:
the .+ after the org in the example is still required!
Since s=user is at the beginning do not use the leading .+ before the anchor ^
^s=sender@example\.com.+$
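
To try patterns out before inserting them, something like the following Python sketch may help. It is only an approximation: the regex engine gps uses is not stated here, so the sketch assumes that ^ and $ match at line boundaries and that . also matches a newline (Python's re.MULTILINE and re.DOTALL), and that each of the four lines ends with a newline. build_text and matches are hypothetical helper names.

```python
import re

FLAGS = re.MULTILINE | re.DOTALL  # assumption about the engine, see above


def build_text(sender, recipient, client, host):
    """Build the four-line text in the format shown above."""
    return "s=%s\nr=%s\nc=%s\nh=%s\n" % (sender, recipient, client, host)


def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return re.search(pattern, text, FLAGS) is not None


yahoo = r".+^h=.*yahoo\.com.+$"
user = r"^s=user.+^r=myuser@mydomain\.org.+^c=210"

text1 = build_text("someuser@yahoo.com", "someuser@mydomain.org",
                   "216.145.54.171", "mrout1.yahoo.com")
text2 = build_text("user@example.com", "myuser@mydomain.org",
                   "210.1.2.3", "mail.example.com")

print(matches(yahoo, text1))   # host name ends in yahoo.com
print(matches(yahoo, text2))   # host name is not a yahoo host
print(matches(user, text2))    # sender, recipient and first byte all fit
```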

weakbytes

weakbytes sets the number of significant bytes of the client address in weak greylisting mode. The default is 3.
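
A minimal Python sketch of what "significant bytes" means for an IPv4 client address follows. How gps stores the shortened address internally is not documented here, so the dotted string returned by the hypothetical weak_address function is an assumption.

```python
def weak_address(client_address, weakbytes=3):
    """Keep only the first weakbytes bytes of a dotted IPv4 address."""
    octets = client_address.split(".")
    return ".".join(octets[:weakbytes])


print(weak_address("192.168.0.1"))       # 192.168.0
print(weak_address("192.168.0.254"))     # 192.168.0
print(weak_address("192.168.0.1", 4))    # 192.168.0.1
```

With the default of 3, two mail servers on the same /24 subnet map to the same value and are therefore treated as the same client.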

Example configuration file:

mode=reverse
dbtype=mysql 
db_host=localhost
db_username=gps
db_password=secret
db_dbname=greylist
timeout=60
wl_recipient=dbcached
wl_network=db
wl_sender=off
wl_pattern=dbcached

To test gps and your configuration use the following command. Configuration errors will be logged to syslogd (facility mail). Also see Running.

./src/gps -v etc/gps.conf < tests/testinput4.txt
If everything is installed correctly a couple of "action=defer_if_permit" lines should be printed. If this is not the case check the mail log for errors. If you are stuck at this point post your configuration file, the relevant section of the mail log, and the versions of gps, libdbi and libdbi-drivers in the gps Forum.

Now wait for the number of seconds specified in timeout and run the same line again. It should return "action=dunno" lines. If it does, gps is ready.

Note:
If you plan to use gps in reverse mode (strongly recommended) then you must now clear out the triplet table. E.g.
> TRUNCATE TABLE `triplet`;
Again, check the log and post in the Forum if something goes wrong.

Example configuration files for postgres and SQLite are in the etc/ folder after unpacking gps. The gps.pgsql.conf contains step by step instruction on how to install and configure postgres on debian and how to create the greylist database and user.

Postgres outputs information on table creation and an error on creating the secondary index when run in mode=init. Nevertheless, it is useable after this.

Note:
If you get stuck with configuring gps post your problem in the gps Forum

Running

For running gps the requirements are:

gps logs its actions to the syslog mail facility. The output from a testrun is shown below:

Note:
Running gps in verbose mode (-v switch) generates a lot of log output and is only recommended for initialising and troubleshooting.
 mail gps[2225]: started (ver.: 0.8 built: Sep 14 2004 18:35:14)
 mail gps[2225]: reading config: /etc/gps.conf
 mail gps[2225]: config: prefix:  key: mode value=normal
 mail gps[2225]: config: prefix: db key: host value=localhost
 mail gps[2225]: config: prefix: db key: username value=greylist
 mail gps[2225]: config: prefix: db key: password value=
 mail gps[2225]: config: prefix: db key: dbname value=greylist
 mail gps[2225]: config: prefix:  key: timeout value=60
 mail gps[2225]: connecting to DB, using driver mysql
 mail gps[2225]: setting DB option: dbname to: greylist
 mail gps[2225]: setting DB option: host to: localhost
 mail gps[2225]: setting DB option: password to: 
 mail gps[2225]: setting DB option: username to: root
 mail gps[2225]: connected to DB
 mail gps[2225]: ok: 'foobar.tld' -> 'barfoo.tld', '1.2.3.4' (3, 152 secs)
 mail gps[2225]: action=dunno
 mail gps[2225]: new: 'foo@blabla.org' -> 'blabla@foo.org', '192.168.0.1'
 mail gps[2225]: action=defer_if_permit Service is unavailable
 mail gps[2225]: wait: 'foo@blabla.org' -> 'blabla@foo.org', '192.168.0.1' (0, 34 secs)
 mail gps[2225]: action=defer_if_permit Service is unavailable
 mail gps[2225]: disconnecting from DB
While gps is running it logs information about the records it receives from postfix. The typical path of a non-spam record is
  1. new: sender -> recipient, client_address|client_name
  2. wait: sender -> recipient, client_address|client_name (count, time_difference first seen)
  3. ok: sender -> recipient, client_address|client_name (count, time_difference last seen)

Parameters:
sender the sender's address
recipient the recipient's address
client_address the client's address
client_name the significant part of the resolved client name when run in mode reverse
count number of times triplet has been passed from postfix
time_difference interval between now and the record's time
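
For log analysis, the new/wait/ok lines can be picked apart with a regular expression. The Python sketch below is based only on the sample log lines shown above. LOG_RE and parse_log_line are hypothetical names, and other gps versions may format these lines differently.

```python
import re

LOG_RE = re.compile(
    r"(?P<state>new|wait|ok): '(?P<sender>[^']*)' -> '(?P<recipient>[^']*)', "
    r"'(?P<client>[^']*)'(?: \((?P<count>\d+), (?P<secs>\d+) secs\))?"
)


def parse_log_line(line):
    """Return a dict of fields from a gps triplet log line, or None."""
    m = LOG_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    # count and secs are absent on "new" lines and stay None there
    for key in ("count", "secs"):
        if fields[key] is not None:
            fields[key] = int(fields[key])
    return fields


line = (" mail gps[2225]: wait: 'foo@blabla.org' -> 'blabla@foo.org', "
        "'192.168.0.1' (0, 34 secs)")
print(parse_log_line(line))
```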
gps also logs information about whitelisting. The format of these messages is:
 mail gps[18838]: wl recipient: 'foobar.tld' -> 'bla@mydomain.com', '192.168.0.254': this user wants his spam
 
 mail gps[18452]: wl network: 'foobar.tld' -> 'bla@someorg.org', '192.168.0.254': my home network

Note:
If you have questions about running gps post in the gps Forum

Database Maintenance

The greylisting approach requires a level of database maintenance. This implementation uses an example perl script for database maintenance. This can be run from cron.
gps-maintain.pl [-v] [-delete] -eq|-lt count -age seconds configfile
A typical usage example would be:
/usr/local/bin/gps-maintain.pl -delete -eq 0 -age 18000 /usr/local/etc/gps.conf
This could be run hourly to delete entries that have not been received again within 5 hours.
/usr/local/bin/gps-maintain.pl -delete -age 3110400 /usr/local/etc/gps.conf
This could be run daily to delete entries that are older than 36 days (3110400 seconds).
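
The -age argument is a plain number of seconds. A small Python helper (hypothetical, not part of gps) can be used to double-check such values:

```python
HOUR = 60 * 60
DAY = 24 * HOUR


def age_seconds(hours=0, days=0):
    """Convert hours and days into the seconds value passed to -age."""
    return hours * HOUR + days * DAY


print(age_seconds(hours=5))    # 18000
print(age_seconds(days=36))    # 3110400
print(age_seconds(days=35))    # 3024000
```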

Todo and known bugs

Note:
If you think you found a bug or if you have improved gps post in the gps Forum

Credits

Changelog

Download

Since version 1.005 gps source code is hosted on sourceforge. There are now two (three) ways of getting gps.

The current development version is available from subversion on sourceforge

Older versions will be available from this site but mainly for archiving reasons. If you are using one of them consider upgrading to the most recent stable version.

Note:
Upgrading from 0.x versions to 1.x versions requires upgrading the database to the new database format. The distribution includes an upgrade script, gps-db-update.pl, which contains more information on how to perform the upgrade.

Older Releases

Discussion & Support

I have set up a publicly accessible forum where you can post bug reports, questions, and answers (preferably) around gps.

The gps Forum

Links

Let me know if you have a link to add here.
Generated on Tue Jul 24 16:36:53 2007 for gps by  doxygen 1.5.1