Posted by J65nko on 13th May 2008

1. Script to test whether an IP address has been listed in a DNSBL
  • 1.1 Introduction
  • 1.2 Manual DNSBL list check
  • 1.3 The 'blcheck' shell script
  • 1.4 Examples of correct and incorrect usage
  • 1.5 The role of 'sed'
  • 1.6 Explanation of the 'sed' regular expression
  • 1.7 Alternative approach for this rather long regex
1.1 Introduction

If you run a mail or web server, it is nice to know in time whether the IP address of your server has been added to a so-called DNSBL (DNS-based blacklist). Being listed can mean that one of your network boxes, or a site you host on your web server, has been compromised and is sending out spam.

Many administrators find out the hard way that their server has been blacklisted: customers or users complain that their mail is not being accepted by the recipients. Checking the mail logs then usually reveals a pointer to a URL which states something like this:
IP Address was found in the CBL.

It was detected at 2007-06-16 20:00 GMT (+/- 30 minutes), approximately
8 hours ago.

ATTENTION: This IP has an open web or socks proxy which is being
hijacked by the 'DMS' spam tool to send spam. This is usually due
to proxy trojans being installed on your IP (or a machine "behind"
this IP if it is a NAT gateway) via the vulnerabilities described
in the Microsoft MS06-040 security bulletin. Please see the top
news item on our home page for more information.

You need to patch your system, find then fix/remove the proxy, and
then contact the CBL at xxxxx@xxxx.org to remove this listing.
This is why every responsible system administrator should check on a regular basis whether their servers have been listed.

See http://en.wikipedia.org/wiki/DNSBL and the excellent http://www.spamhaus.org/dnsbl_function.html page for more information about these lists and their role in anti-spam policies.

1.2 Manual DNSBL list check

The organizations maintaining these lists have a page on their website where you can check if your server is on their list. For example http://www.spamhaus.org/lookup.lasso and http://www.spamcop.net/bl.shtml.

For a regular check, however, for example one run by 'cron', these facilities are not really helpful. A manual check of an IP address from the 'sh' command line also takes quite some work. Take as an example the IP address that sent me a spam message. For a blacklist check of this address you have to perform the following steps:
  1. Reverse the address to

  2. Append the name of the blacklist.

    For the 'zen.spamhaus' list, that results in ''

  3. Resolve the resulting name in DNS with a DNS tool
    $ dig
    ; <<>> DiG 9.3.2-P1 <<>>
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11694
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;        IN      A
    ;; ANSWER SECTION: 1800 IN A
    ;; Query time: 384 msec
    ;; SERVER:
    ;; WHEN: Sat Jun 16 13:42:10 2007
    ;; MSG SIZE  rcvd: 64
    The result means the address is on that list.

    A similar test but now for Spamcop:
    $ dig
    ; <<>> DiG 9.3.2-P1 <<>>
    ;; global options:  printcmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 16727
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0
    ;  IN      A
    ;; ANSWER SECTION: 2100 IN   A
    ;; Query time: 141 msec
    ;; SERVER:
    ;; WHEN: Sat Jun 16 13:58:33 2007
    ;; MSG SIZE  rcvd: 62
    Again the response is an address in the loopback range, meaning it has been listed.

If you are familiar with reverse name lookups, you will notice that the same mechanism is used here. Instead of appending '.in-addr.arpa.' to the reversed IP, you append the name of the blacklist.
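
The manual steps above can be sketched as two small shell functions. This is only an illustration: the names 'dnsbl_name' and 'dnsbl_lookup' are made up for this sketch, 192.0.2.1 is a documentation address rather than the spammer's address, and the octets are split with 'read' and IFS here instead of a regular expression.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the manual steps; not part of 'blcheck'.

# Steps 1 and 2: reverse the octets and append the name of the blacklist.
#   $1 = IPv4 address, $2 = blacklist zone (for example zen.spamhaus.org)
dnsbl_name() {
    echo "$1" | {
        # Split the address on the dots into four variables
        IFS=. read -r a b c d
        echo "$d.$c.$b.$a.$2"
    }
}

# Step 3: resolve the resulting name; no output means "not listed".
dnsbl_lookup() {
    dig +short -t a "$(dnsbl_name "$1" "$2")."
}

dnsbl_name 192.0.2.1 zen.spamhaus.org   # prints 1.2.0.192.zen.spamhaus.org
```

Calling 'dnsbl_lookup 192.0.2.1 zen.spamhaus.org' would then perform the actual DNS query with 'dig'. No input validation is done in this sketch.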

1.3 The 'blcheck' shell script

At the beginning of 2007 I saw an increase of spam in my Gmail spam folders. Because I wanted a convenient way to find out whether this junk originated from blacklisted senders, I wrote the following script.
# -- $Id: blcheck.xml,v 1.8 2007/06/17 23:38:00 j65nko Exp $ --

# Check if an IP address is listed on one of the following blacklists
# The format is chosen to make it easy to add or delete
# The shell will strip multiple whitespace

BLISTS="
    cbl.abuseat.org
    dnsbl.sorbs.net
    bl.spamcop.net
    zen.spamhaus.org
    combined.njabl.org
"

# simple shell function to show an error message and exit
#  $0  : the name of shell script, $1 is the string passed as argument
# >&2  : redirect/send the message to stderr

ERROR() {
  echo $0 ERROR: $1 >&2
  exit 2
}

# -- Sanity check on parameters
[ $# -ne 1 ] && ERROR 'Please specify a single IP address'

# -- if the address consists of 4 groups of minimal 1, maximal 3 digits, separated by '.'
# -- reverse the order
# -- if the address does not match these criteria the variable 'reverse' will be empty

reverse=$(echo $1 |
  sed -ne "s~^\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)$~\4.\3.\2.\1~p")

if [ "x${reverse}" = "x" ] ; then
      ERROR  "IMHO '$1' doesn't look like a valid IP address"
      exit 1
fi

# Assuming an IP address of 11.22.33.44 as parameter or argument

# If the IP address in $1 passes our crude regular expression check,
# the variable  ${reverse} will contain 44.33.22.11
# In this case the test will be:
#   [ "x44.33.22.11" = "x" ]
# This test will fail and the program will continue

# An empty '${reverse}' means that shell argument $1 doesn't pass our simple IP address check
# In that case the test will be:
#   [ "x" = "x" ]
# This evaluates to true, so the script will call the ERROR function and quit

# -- do a reverse ( address -> name) DNS lookup
REVERSE_DNS=$(dig +short -x $1)

echo IP $1 NAME ${REVERSE_DNS:----}

# -- cycle through all the blacklists
for BL in ${BLISTS} ; do

    # print the UTC date (without linefeed)
    printf "%s" "$(env TZ=UTC date "+%Y-%m-%d_%H:%M:%S_%Z")"

    # show the reversed IP and append the name of the blacklist
    printf "%-40s" " ${reverse}.${BL}."

    # use dig to lookup the name in the blacklist
    #echo "$(dig +short -t a ${reverse}.${BL}. |  tr '\n' ' ')"
    LISTED="$(dig +short -t a ${reverse}.${BL}.)"
    echo ${LISTED:----}
done

# --- EOT ------
The script has been rather heavily commented and is available for downloading. The regular expression used by 'sed' will be explained in detail in another section.

1.4 Examples of correct and incorrect usage

$ ./blcheck

IP NAME p4040-ipbf1108marunouchi.tokyo.ocn.ne.jp.
2007-06-17_01:11:06_UTC         ---
2007-06-17_01:11:12_UTC      ---

$ ./blcheck 

IP NAME fia99-2-100.dsl.mxposure.nl.
2007-06-17_21:01:42_UTC           ---
2007-06-17_21:01:42_UTC           ---
2007-06-17_21:01:42_UTC            ---
2007-06-17_21:01:43_UTC          ---
2007-06-17_21:01:43_UTC        ---

$ for X in; do ./blcheck $X; done

IP NAME cpe-24-209-96-220.woh.res.rr.com.
2007-06-17_01:18:31_UTC         ---

$ while true; do echo IP?; read IP; ./blcheck $IP; done

IP NAME 201-13-22-241.dsl.telesp.net.br.
IP NAME ;; connection timed out; no servers could be reached
2007-06-17_23:14:33_UTC        ---
2007-06-17_23:14:34_UTC     ---
Incorrect usage, with error message:
$ ./blcheck
./blcheck ERROR: Please specify a single IP address

$ ./blcheck]
./blcheck ERROR: IMHO ']' doesn't look like a valid IP address

$ ./blcheck
./blcheck ERROR: Please specify a single IP address
Incorrect usage (invalid octet 400), but no error message:
$ ./blcheck

2007-06-17_01:29:03_UTC 400.43.175.125.cbl.abuseat.org.        ---
2007-06-17_01:29:03_UTC 400.43.175.125.dnsbl.sorbs.net.        ---
2007-06-17_01:29:04_UTC 400.43.175.125.bl.spamcop.net.         ---
2007-06-17_01:29:04_UTC 400.43.175.125.zen.spamhaus.org.       ---
2007-06-17_01:29:04_UTC 400.43.175.125.combined.njabl.org.     ---
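
The last example slips through because the regular expression only counts digits and never looks at the value of an octet. A stricter check could look like the following sketch. The function name 'valid_ipv4' is made up for this illustration and is not part of 'blcheck'; 125.175.43.400 is the address whose reversed form appears in the output above.

```shell
#!/bin/sh
# Hypothetical stricter check; not part of the original script.

valid_ipv4() {
    # Shape check: four dot-separated groups of 1 to 3 digits
    echo "$1" | grep -Eq '^[0-9]{1,3}(\.[0-9]{1,3}){3}$' || return 1
    # Value check: every octet must be 255 or less
    echo "$1" | {
        IFS=. read -r a b c d
        [ "$a" -le 255 ] && [ "$b" -le 255 ] && [ "$c" -le 255 ] && [ "$d" -le 255 ]
    }
}

valid_ipv4 11.22.33.44    && echo "11.22.33.44 accepted"      # prints 11.22.33.44 accepted
valid_ipv4 125.175.43.400 || echo "125.175.43.400 rejected"   # prints 125.175.43.400 rejected
```

The shape check has to come first, so that the numeric comparisons only ever see digits.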
1.5 The role of 'sed'

To reverse the IP address the script uses 'sed'. The man page tersely describes this program as follows:
     The sed utility reads the specified files, or the standard input if no
     files are specified, modifying the input as specified by a list of
     commands.  The input is then written to the standard output.

     A single command may be specified as the first argument to sed.  Multiple
     commands may be specified separated by newlines or semicolons, or by
     using the -e or -f options.  All commands are applied to the input in the
     order they are specified regardless of their origin.
'sed' is one of the many text processing utilities which act as a filter. It takes input, applies some commands to that input and sends the result to the standard output.
reverse=$(echo $1 |
  sed -ne "s~^\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)$~\4.\3.\2.\1~p")
'reverse' is a variable which we fill with the output of a command.
reverse=$( command )
The '$( command )' construct is called command substitution and is the preferred alternative to the older construct which uses backticks:
reverse=`command`
The command in this case is echo $1 | sed -ne "......."

$1 is the IP address passed as argument to the 'blcheck' script and is echoed to standard output. The '|' pipe symbol causes it to be fed to 'sed' as standard input for processing.

The options used in calling 'sed':
-n            By default, each line of input is echoed to the standard output
              after all of the commands have been applied to it.  The -n option
              suppresses this behavior.

-e command    Append the editing commands specified by the command argument to
              the list of commands.
We only want to echo the reversed IP if the regular expression matches, so the -n option is used to suppress the default output. That in turn means we have to use the 'sed' 'p' command to force a print when the match and reversal have been successful.
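
The effect of '-n' and 'p' can be seen with a shortened pattern that swaps only two digit groups. This is a sketch for illustration: the function names 'swap_default' and 'swap_quiet' are made up, and single quotes are used around the 'sed' command here.

```shell
#!/bin/sh
# Illustration only: a shortened two-group version of the blcheck pattern.

# No -n and no 'p': every input line is echoed, whether it matched or not
swap_default() {
    echo "$1" | sed -e 's~^\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)$~\2.\1~'
}

# -n plus 'p': only lines on which the substitution succeeded are printed
swap_quiet() {
    echo "$1" | sed -ne 's~^\([0-9]\{1,3\}\)\.\([0-9]\{1,3\}\)$~\2.\1~p'
}

swap_default 11.22    # prints 22.11
swap_default bogus    # prints bogus (the unchanged input leaks through)
swap_quiet 11.22      # prints 22.11
swap_quiet bogus      # prints nothing
```

With the first variant the variable 'reverse' would contain the invalid input itself, so the empty-string test in the script could never detect a bad address.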

1.6 Explanation of the 'sed' regular expression
Starting at the beginning of the string ('^'), look for a sequence of at least 1 and at most 3 digits, and store or remember it in container \1.

Then do something similar for the remainder of the IP address, 3 times: look for a leading period '.' and a sequence of at least 1 and at most 3 digits, and store each of these digit groups (octets) in the containers \2, \3 and \4.

s 	: substitute/replace
~	: our own chosen delimiter denoting start of the subst. pattern
	: instead of the default "/"
^	: beginning of line (the shell will strip all leading whitespace )
\(	: start storing in the first container \1
[0-9]	: a character in the range 0-9, a digit
\{	: start of quantifier
1,3	: minimal one, maximal 3
\}	: end of quantifier
\) 	: end of storing in container \1
\.	: a literal period '.'
A "." is a so-called meta-character in regular expressions and stands for "any" character. Because we want to match a real literal '.' we have to escape it with a '\'.

The opposite is true for the quantifier indicators \{ and \} and the grouping indicators \( and \). In the basic regular expressions used by 'sed', a plain brace or parenthesis has no special meaning. To upgrade them to their special meaning, of quantifying and of start and stop storing, they need a leading "\".
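
A few quick experiments show how the backslash flips the meaning in both directions. The helper 'bre_match' is made up for this sketch; it prints 'match' when the given basic regular expression matches the input, and 'no match' otherwise.

```shell
#!/bin/sh
# Illustration only: escaping rules in sed's basic regular expressions.

# $1 = input string, $2 = basic regular expression (must not contain '/')
bre_match() {
    result=$(echo "$1" | sed -n "s/$2/match/p")
    echo "${result:-no match}"
}

bre_match '1x2'    '^1.2$'        # match: an unescaped '.' accepts any character
bre_match '1x2'    '^1\.2$'       # no match: '\.' only accepts a real period
bre_match '1.2'    '^1\.2$'       # match
bre_match '9{1,3}' '^9{1,3}$'     # match: unescaped braces are ordinary characters
bre_match '999'    '^9{1,3}$'     # no match
bre_match '999'    '^9\{1,3\}$'   # match: escaped braces form a quantifier
```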

\(	: start storing in next container \2
[0-9]	: a character in the range 0-9
\{	: start of quantifier
1,3	: minimal one, maximal 3
\}	: end of quantifier
\) 	: end of storing in container \2
\.	: a literal '.' escaped with '\'


\(	: start storing in next container \3
[0-9]	: a character in the range 0-9
\{	: start of quantifier
1,3	: minimal one, maximal 3
\}	: end of quantifier
\) 	: end of storing in container \3
\.	: a literal '.' escaped with '\'


\(	: start storing in next container \4
[0-9]	: a character in the range 0-9
\{	: start of quantifier
1,3	: minimal one, maximal 3
\}	: end of quantifier
\) 	: end of storing in container \4
$	: end of string
We now have matched the 4 groups of digits of the IP address in four containers: '\1', '\2', '\3' and '\4'

We only stored the digit groups, not the separating periods or dots. Now we substitute and rearrange them in reverse order and re-insert the periods.

Note that in the search or matching pattern, the period "." is a special character, that needs to be escaped if we want to match a real period. That doesn't apply to the substitution or replacement. Here a period is just a plain normal period.

~	: our custom delimiter, marking the end of the matching pattern, and start of
	  the substituting part
\4	: 4th digit group first
.	: a '.' has no special meaning in the replacement part, so it is not escaped with '\'
\3	: 3rd digit group
.	: a '.'
\2	: 2nd digit group
.	: a '.'
\1	: 1st digit group
~	: our custom delimiter, marking the end of the substituting part
p	: print the matched and substituted pattern
If the regular expression did not match, this substitution will not be done and nothing will be printed to standard output. So nothing would be stored in the variable 'reverse'.

1.7 Alternative approach for this rather long regex

Both 'awk' and its descendant 'perl' have a function called 'split'. From the 'awk' man page:
     split(s, a, fs)  Splits the string s into array elements a[1], a[2], ...,
                      a[n] and returns n.  The separation is done with the
                      regular expression fs or with the field separator FS if
                      fs is not given.  An empty string as field separator
                      splits the string into one array element per character.
So you could split on the dots separating the 4 IP address octets and store the octets in 4 array elements. This is left as an exercise for the reader.
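
One possible solution to this exercise, as a minimal sketch rather than a drop-in replacement: the function name 'reverse_ip_awk' and the array name 'octet' are made up here, and the range check on each octet is an addition that the 'sed' version does not have.

```shell
#!/bin/sh
# Hypothetical awk-based alternative to the sed expression in 'blcheck'.

reverse_ip_awk() {
    echo "$1" | awk '{
        # Split the line on the dots; n is the number of pieces
        n = split($0, octet, ".")
        if (n != 4) exit 1
        # Every piece must consist of digits only and be at most 255
        for (i = 1; i <= 4; i++)
            if (octet[i] !~ /^[0-9]+$/ || octet[i] > 255) exit 1
        print octet[4] "." octet[3] "." octet[2] "." octet[1]
    }'
}

reverse_ip_awk 11.22.33.44      # prints 44.33.22.11
reverse_ip_awk 125.175.43.400   # prints nothing, exit status 1
```

As with the 'sed' version, an empty result signals an invalid address, so the existing test on the variable 'reverse' would keep working.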

Both 'sed' and 'awk' are part of the base installation of all Unix-like operating systems, while 'perl' for example is not part of FreeBSD base and needs to be installed separately. This fact made me use 'sed' instead of 'perl' for this particular script.
Attached Files
File Type: sh blcheck.sh (2.2 KB, 837 views)
Last edited by J65nko; 13th May 2008 at 12:22 AM. Reason: typo