26th August 2011
tomp
Real Name: Tom Purvis
Local Area Nitwit
 
Join Date: Aug 2011
Location: Colorado
Posts: 17
pf interfering with local LAN peer communication

After a couple of weeks of installing and configuring OpenBSD 4.9, learning about PF, and testing pf.conf rules on a closed test network, the boss and I stayed late last night to put it into place as our firewall, when we could do it without disrupting normal operations. It took about four hours to root out our gremlins and get all the critical stuff working. It was good. I left last night feeling like the past couple of weeks had been worth the furrowed brow.

We are a mail order outfit. My boss and I are the IT department. We run an application supplied by UPS called Worldship. We use a multi-node version of it that has an MS SQL Server DB at its core. The nodes talk to that central machine more or less as peers, and all of them talk to the mothership through the 'net.

We are in a busy season, and today, after I'd been here a couple of hours, I got word from the warehouse that the main Worldship box was working but that none of the peer nodes were. They threw an error about not being able to connect to the MS SQL Express DB. Now, this multi-node version of Worldship is pretty brittle. We used to use a single-node version and it was much less complicated, but we needed more nodes to scale our operation.

As it was failing I watched tcpdump and saw some blocked communications coming in from outside on various obscure ports, and I wrote rules to pass them as I saw them. (UPS says that you only need 1433/1434 and 443 open.) I was able to write pass rules to get all of those log entries to go away, and still the trouble persisted. I tried writing a quick rule or two to be sure that nothing was jacking around with internal LAN communications. No dice. Finally I actually commented out the "block log all" statement that is my first rule. When I ran pfctl in verbose mode, none of the output said "block". No. Dice.

We had no choice but to take the OpenBSD box out of the way and put things back as they were before 6 PM yesterday. Worldship worked on all nodes immediately.

Very disappointing.

While I was at lunch, muttering and grinding my teeth, it occurred to me that the NAT rule could be dicking around with my internal communications.

match out on $ext_if from $localnet nat-to ($ext_if)

It occurred to me that perhaps before the nat-to rule I should have a quick rule that says something like:

do not dick around quick on { $int_if, $localnet } rdr-to $go_foff_yourself

(or more seriously)

pass quick on $int_if inet from $localnet

Something like that anyway.
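
Spelled out as a pf.conf fragment, the idea above might look roughly like this. It is an untested sketch: the macro values are placeholders (assumptions, not the real config), and the rest of the ruleset is left out.

```
# Placeholder macros (assumed values; the real interfaces and subnet differ)
ext_if   = "em0"
int_if   = "em1"
localnet = "192.168.1.0/24"

# Default deny with logging, as the first rule
block log all

# "quick" stops rule evaluation here for matching packets,
# so LAN traffic on the internal interface never reaches the rules below
pass quick on $int_if inet from $localnet

# The existing NAT rule, unchanged; it only matches packets leaving $ext_if
match out on $ext_if from $localnet nat-to ($ext_if)
```

The only point of the fragment is the ordering: the quick pass rule sits ahead of the match rule. Whether that changes anything for the Worldship nodes is exactly the open question.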

I'm grumpy. I'm sure it shows in this post. If you are inclined to ignore the grumpy, I understand. But I'm hoping to have a pretty strong guess or two up my sleeve next time we break the whole network to get this %&#$&ing thing back in production. If you can add any ideas or throw me a clue, I'd appreciate it. Very much.

Last edited by tomp; 26th August 2011 at 07:55 PM.