Rejection log sorting script
This shell script helps you find mail servers that are rejected by mistake. If your computer runs both a web server and a mail server, you can watch the rejection records in a web browser by installing the script, protected by a password, in a directory under the cgi-bin directory. You can also run it as a command.
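When the script is run from a terminal instead of through the HTTP daemon, its output still begins with the CGI header that it prints with echo ("Content-Type: text/plain" followed by a blank line). The following is a minimal sketch of a filter that drops that header. The function name strip_cgi_header and the file name rejectlog.cgi are assumptions made for illustration; they do not appear in the original script.

```shell
#!/bin/sh
# Hypothetical helper: remove the CGI header, that is, everything up to
# and including the first blank line, from the text on standard input.
strip_cgi_header() {
  sed '1,/^$/d'
}

# Usage, assuming the script was saved as ./rejectlog.cgi (hypothetical name):
#   sh ./rejectlog.cgi | strip_cgi_header | less
```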
Function
This script reads a Postfix mail log and extracts the records of rejections made by client restriction with the response code "450" (meaning "try again later"); rejections for other reasons are not extracted. It displays the records sorted so that retry accesses appear consecutively: accesses with the same client IP address, sender address and recipient address are grouped into one sequence, and records that differ in any of the three are separated by a blank line.
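To illustrate which records are picked up, the sketch below applies the same extended regular expression that the script uses in process (2) to three sample lines. The log lines, host names and addresses are made-up examples in the usual Postfix format, using IP addresses and domains reserved for documentation. The function names filter_client_450 and sample_log are assumptions, and grep -E is used here as the equivalent of egrep.

```shell
#!/bin/sh
# Hypothetical helper: keep only "450 ... Client host rejected" records,
# using the same pattern as process (2) of the script.
filter_client_450() {
  grep -E 'reject:.+ 450 .*Client host rejected:'
}

# Made-up sample records (not taken from a real log).
sample_log() {
  cat <<'EOF'
Jan 15 10:00:00 mx postfix/smtpd[1234]: NOQUEUE: reject: RCPT from unknown[192.0.2.10]: 450 4.7.1 Client host rejected: cannot find your hostname, [192.0.2.10]; from=<alice@example.org> to=<bob@example.com> proto=ESMTP helo=<mail.example.org>
Jan 15 10:00:05 mx postfix/smtpd[1235]: NOQUEUE: reject: RCPT from unknown[192.0.2.20]: 554 5.7.1 <unknown[192.0.2.20]>: Client host rejected: Access denied; from=<carol@example.net> to=<bob@example.com> proto=SMTP helo=<192.0.2.20>
Jan 15 10:00:09 mx postfix/smtpd[1236]: NOQUEUE: reject: RCPT from mail.example.net[192.0.2.30]: 450 4.1.1 <nobody@example.com>: Recipient address rejected: User unknown in local recipient table; from=<dave@example.net> to=<nobody@example.com> proto=ESMTP helo=<mail.example.net>
EOF
}

# Only the first record is a 450 rejection by client restriction.
# The second has code 554, and the third is rejected for its recipient.
sample_log | filter_client_450
```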
It also displays the following data at the end:
- access count: the total number of extracted accesses.
- estimated message count: the number of messages, counting accesses with the same client IP address and recipient address as one message. It should be close to the number of spam messages that would have been received without the anti-spam measure.
- retry sequence count: the number of sequences of retry accesses. When it is not 0, you should investigate the retry accesses.
Practical use
Legitimate mail servers always retry at appropriate intervals after a rejection with the response code "450". This script displays the records of those rejections as a sequence, which helps you find accesses from legitimate mail servers that should be listed in the white list.
If the accesses displayed in a sequence satisfy all of the following conditions, the client is probably a legitimate mail server which should be listed in the white list.
- The retry intervals are 1 minute or longer and at most 4 hours. (However, when one sender sends multiple messages to one recipient, the apparent intervals may be shorter.)
- The retries have continued for 30 minutes or more.
- The domain name of the sender address exists.
- The recipient address is correct.
Meanwhile, if the accesses displayed in a sequence fall under any of the following conditions, the client is probably or surely illegitimate.
- The accesses were repeated for a while at intervals shorter than 1 minute and then stopped.
- The HELO address is your mail server's IP address or the recipient's domain name.
- The HELO address changes.
If it is hard to decide whether the client is legitimate, list it on the white list first, and then remove it if the recipient complains of spam.
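The two timing conditions (retry intervals between 1 minute and 4 hours, and retries lasting 30 minutes or more) can be checked mechanically. The following is a minimal sketch and not part of the original script. The function name check_retry_timing and the labels "plausible" and "suspicious" are assumptions. It expects one retry sequence in the script's output format with the oldest record first, assumes that all records fall within the same month, and ignores the caveat about one sender sending multiple messages to one recipient. The sender domain, recipient address and HELO conditions still have to be judged by hand.

```shell
#!/bin/sh
# Hypothetical helper: read one retry sequence (lines of the form
# "Mon DD HH:MM:SS host [IP] from=<...> to=<...> helo=<...>", oldest first)
# and report whether its timing matches the two timing conditions.
# Assumption: all records of the sequence fall within the same month.
check_retry_timing() {
  awk '
  NF >= 3 {
    n++
    split($3, t, ":")
    now = (($2 * 24 + t[1]) * 60 + t[2]) * 60 + t[3]   # seconds within the month
    if (n == 1)
      first = now
    else {
      gap = now - prev
      if (n == 2 || gap < min) min = gap
      if (n == 2 || gap > max) max = gap
    }
    prev = now
  }
  END {
    if (n < 2) {
      print "no retry"
      exit
    }
    duration = prev - first
    # 60 s = 1 minute, 14400 s = 4 hours, 1800 s = 30 minutes
    verdict = (min >= 60 && max <= 14400 && duration >= 1800) ? "plausible" : "suspicious"
    printf "min_interval=%d max_interval=%d duration=%d %s\n", min, max, duration, verdict
  }'
}

# Example with made-up records: retries after 15 and then 30 minutes.
# Prints: min_interval=900 max_interval=1800 duration=2700 plausible
check_retry_timing <<'EOF'
Jan 15 10:00:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>
Jan 15 10:15:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>
Jan 15 10:45:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>
EOF
```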
Necessary configuration
You need to set the access mode of the mail log files so that the HTTP daemon can read them. On many systems, you can do this with the following commands:
chgrp nobody /var/log/maillog*
chmod g+r /var/log/maillog*
Alterations
- Mail log file names
This script reads the mail log files for last week and this week, which suits my system environment, where the mail log files are rotated every week. If necessary, change the mail log file names in the script according to your system environment or the period you want to watch.
- To suppress single access records
A record of an access that has not been retried is displayed as a single line, separated from the other records by blank lines. Such records can sometimes help you recognize a client as illegitimate even though it has apparently retried the transfer.
However, if too many single access records get in the way, you can suppress them. In the GAWK script which is the 11th process, rewrite the line:
Suppress_single_access_records=0
into:
Suppress_single_access_records=1
Even if you suppress single access records, the displayed access count, estimated message count and retry sequence count stay the same.
Shell script code
#!/bin/sh
echo "Content-Type: text/plain"
echo
echo "Mail rejection log (450 Client host rejected) - sorted"
echo
#
# (1) Input mail log.
#
cat /var/log/maillog.1 /var/log/maillog | \
#
# (2) Extract records indicating "450 Client host rejected".
#
egrep 'reject:.+ 450 .*Client host rejected:' | \
#
# (3) Extract essential items.
#
gawk '
{
  client=substr($0, match($0, /from [^]]+\]/)+5, RLENGTH-5)
  sub(/\[/, " [", client)
  sender=substr($0, match($0, /from=<[^>]*>/), RLENGTH)
  rcpt=substr($0, match($0, /to=<[^>]*>/), RLENGTH)
  helo=substr($0, match($0, /helo=<[^>]*>/), RLENGTH)
  printf "%s %2d %s %s %s %s %s\n", $1, $2, $3, client, sender, rcpt, helo
}
' | \
#
# (4) Convert month names into month numbers.
#
gawk '
BEGIN {
  month_num["Jan"]=1
  month_num["Feb"]=2
  month_num["Mar"]=3
  month_num["Apr"]=4
  month_num["May"]=5
  month_num["Jun"]=6
  month_num["Jul"]=7
  month_num["Aug"]=8
  month_num["Sep"]=9
  month_num["Oct"]=10
  month_num["Nov"]=11
  month_num["Dec"]=12
  max_month_num=0
}
{
  $1=month_num[$1]
  if ($1>max_month_num)
    max_month_num=$1
  else if ($1<max_month_num)
    $1+=12
  printf "%3d %2d %s %s %s %s %s %s\n", $1, $2, $3, $4, $5, $6, $7, $8
}
' | \
#
# (5) Sort according to IP address, sender address and recipient address.
#
sort -k 5,7 | \
#
# (6) Insert a blank line between records with a different triplet.
#
gawk '
BEGIN {
  prev_triplet=""
}
{
  if (prev_triplet!="") {
    if (prev_triplet!=$5 $6 $7)
      print ""
  }
  print
  prev_triplet=$5 $6 $7
}
' | \
#
# (7) Convert retry records in a sequence into one line.
#
gawk '
BEGIN {
  RS=""
}
{
  gsub(/\n/, "\036")
  print
}
' | \
#
# (8) Sort according to date and time.
#
sort -k 1,3 | \
#
# (9) Reconvert retry records in a sequence into multiple lines.
#
gawk '
{
  gsub(/\036/, "\n")
  print
  print ""
}
' | \
#
# (10) Reconvert month numbers into month names.
#
gawk '
BEGIN {
  month_name[1]="Jan"
  month_name[2]="Feb"
  month_name[3]="Mar"
  month_name[4]="Apr"
  month_name[5]="May"
  month_name[6]="Jun"
  month_name[7]="Jul"
  month_name[8]="Aug"
  month_name[9]="Sep"
  month_name[10]="Oct"
  month_name[11]="Nov"
  month_name[12]="Dec"
}
{
  if ($0!="") {
    $1=month_name[($1-1)%12+1]
    printf "%s %2d %s %s %s %s %s %s\n", $1, $2, $3, $4, $5, $6, $7, $8
  }
  else
    print ""
}
' | \
#
# (11) Output sorted records with counting.
#
gawk '
BEGIN {
  Suppress_single_access_records=0
  RS=""
  acc_count=0
  host_and_rcpt=""
  msg_count=0
  seq_count=0
}
{
  retry_count=gsub(/\n/, "\n")
  acc_count+=1+retry_count
  if (index(host_and_rcpt, $5 $7)==0) {
    ++msg_count
    host_and_rcpt=$5 $7 host_and_rcpt
  }
  if (retry_count>0)
    ++seq_count
  if (!(retry_count==0 && Suppress_single_access_records)) {
    print
    print ""
  }
}
END {
  print "access count =", acc_count, \
        ", estimated message count =", msg_count, \
        ", retry sequence count =", seq_count
}
'
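To make the three figures printed at the end concrete, the sketch below recomputes them from already-sorted output. It is a simplified stand-in for process (11), not the author's code: the function name count_summary is an assumption, an associative array replaces the script's string search with index(), the output wording is shortened, and the sample records are made up.

```shell
#!/bin/sh
# Hypothetical helper: recompute the three summary figures from sorted
# output, that is, 8-field records grouped into sequences that are
# separated by blank lines.
count_summary() {
  awk '
  BEGIN { RS = "" }                 # paragraph mode: one sequence per record
  {
    n = split($0, lines, "\n")      # number of accesses in this sequence
    acc_count += n
    key = $5 " " $7                 # client IP address and recipient address
    if (!(key in seen)) {
      seen[key] = 1
      msg_count++
    }
    if (n > 1)
      seq_count++
  }
  END {
    printf "access count = %d, estimated message count = %d, retry sequence count = %d\n", acc_count, msg_count, seq_count
  }'
}

# Made-up example: one sequence of three accesses and two single accesses.
# The second sequence shares client IP address and recipient with the first,
# so both are counted as one estimated message.
# Prints: access count = 5, estimated message count = 2, retry sequence count = 1
count_summary <<'EOF'
Jan 15 10:00:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>
Jan 15 10:15:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>
Jan 15 10:45:00 unknown [192.0.2.10] from=<alice@example.org> to=<bob@example.com> helo=<mail.example.org>

Jan 15 11:00:00 unknown [192.0.2.10] from=<carol@example.org> to=<bob@example.com> helo=<mail.example.org>

Jan 15 12:00:00 unknown [192.0.2.20] from=<dave@example.net> to=<bob@example.com> helo=<192.0.2.20>
EOF
```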