15th April 2010
Carpetsmoker
Real Name: Martin
Tcpdump Spy
 
Join Date: Apr 2008
Location: Netherlands
Posts: 2,243

Yes, you understand it correctly. This only matters if you actually care about your site showing up in Google.

Here are a few examples from my pf.conf:

Code:
table <badguys> persist
table <goodbots> persist file "/root/goodbots"

pass in on $if proto tcp from any to $ip2 port http keep state \
  (source-track max-src-conn 50 max-src-conn-rate 200/10 overload <badguys>)

pass in quick on $if proto tcp from <goodbots> to {$ip1, $ip2} port http
block drop in on $if from <badguys>
It is important to use -T expire if you don't want to ban people forever after they make too many requests. In your /etc/crontab, add something along the lines of:
Code:
# Don't ban people for more than n seconds
*       *       *       *       *       root    /sbin/pfctl -t badguys -T expire 5 > /dev/null 2>&1
This will ban people for roughly 5 to 65 seconds: cron runs the job at most once a minute, and each run removes entries that have been in the table for more than 5 seconds.
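
As a rough sketch of the timing, assuming cron's one-minute granularity and the expire value of 5 from the crontab line above (the variable names are only for illustration):

```shell
#!/bin/sh
# Illustrative arithmetic only; nothing here touches pf.
expire=5       # seconds passed to "pfctl -T expire"
interval=60    # cron starts the job at most once per minute

# An address is removed at the first cron run after it has been
# in the table for at least ${expire} seconds.
min_ban=${expire}
max_ban=$((expire + interval))

echo "ban lasts between ${min_ban} and ${max_ban} seconds"
```

If you want longer bans, raising the expire value is probably the simplest knob; the one-minute cron interval only adds slack on top of it.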

I make the file /root/goodbots with a simple shell script. Adjust the lists to your needs:

Code:
#!/bin/sh
#

lists="
http://iplists.com/google.txt
http://iplists.com/inktomi.txt
http://iplists.com/lycos.txt
http://iplists.com/infoseek.txt
http://iplists.com/altavista.txt
http://iplists.com/excite.txt
http://iplists.com/northernlight.txt
http://iplists.com/misc.txt
http://iplists.com/non_engines.txt
"

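out=/root/goodbots
tmp=$(mktemp /tmp/list.XXXXXX) || exit 1

# Start from an empty file so old entries don't pile up
: > "${out}"

for list in ${lists}; do
        # Skip a list that fails to download
        fetch -o "${tmp}" "${list}" || continue
        # Drop comment lines and empty lines
        grep -Ev '^(#|$)' "${tmp}" >> "${out}"
done

rm -f "${tmp}"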
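
To see what the filtering step does, here is a minimal sketch that runs a comment-and-blank-line filter over a small sample file. The sample contents are made up (addresses from the 192.0.2.0/24 documentation range), and I'm assuming the downloaded lists are plain text with one address or network per line and # comments:

```shell
#!/bin/sh
# Hypothetical sample input; the real lists come from the URLs above.
sample=$(mktemp) || exit 1
cat > "${sample}" <<'EOF'
# Example bot list (made-up addresses)
192.0.2.10

192.0.2.0/28
EOF

# Drop comment lines and empty lines, keep everything else
filtered=$(grep -Ev '^(#|$)' "${sample}")
echo "${filtered}"

rm -f "${sample}"
```

Only the two address lines should survive. Note that regenerating the file does not by itself change the table pf already has loaded; something like pfctl -t goodbots -T replace -f /root/goodbots after the script runs should bring the running table in line with the file.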
__________________
UNIX was not designed to stop you from doing stupid things, because that would also stop you from doing clever things.