#8
Old 19th May 2015
rtwingfield
Real Name: Ron Wingfield
Port Guard
 
Join Date: Oct 2008
Location: Little Rock, AR USA
Posts: 36
No Route to Host . . .really?

Quote:
Originally Posted by jggimi
Since it's not clear to me what's happening, I'll start with an ASCII picture. Is this correct?
Code:
{Internet via a /29} - [DSL gateway router] - {192.168.1.x/24} - [Servers and Workstations]
Mostly yes. I have a typical 8-block of "small business" IP addre$$e$ "on loan" from AT&T/SBC. The complement consists of the usual suspects: broadcast, loopback, gateway, and four "customer usable" addresses. (That totals seven; what am I missing, multicast . . .whatever?) This is all ported through a DSL router-modem with an integrated 4-port switch. The DSL "router" is dumbed down to essentially a bridge. Between this device and the subnetted LAN is a NetGear FVX538 VPN Firewall router with an integrated 8-port switch.
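
For what it's worth, the head count in an 8-address block can be checked with a few lines of Python. This is only a sketch: 203.0.113.0/29 is a documentation range standing in for the real AT&T/SBC allocation (an assumption, not my actual block), and describe() is just a helper name made up for the example.

```python
import ipaddress

# Hypothetical /29 taken from the RFC 5737 documentation range,
# NOT the real allocation discussed in this post.
block = ipaddress.ip_network("203.0.113.0/29")

def describe(net):
    """Return (total, network address, broadcast address, host addresses)."""
    hosts = list(net.hosts())  # hosts() leaves out the network and broadcast addresses
    return net.num_addresses, net.network_address, net.broadcast_address, hosts

total, network, broadcast, hosts = describe(block)
print(f"total addresses : {total}")
print(f"network address : {network}")
print(f"broadcast       : {broadcast}")
print(f"assignable hosts: {[str(h) for h in hosts]}")
```

If that is right, the eighth address is the network address itself; the loopback (127.0.0.1) is not part of the block at all, and the gateway normally takes one of the six assignable host addresses.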

Attached to the LAN are several server platforms and "Windoze" workstations, that include three FreeBSD boxes, an AS/400, and across town, yet another NetGear FVS318 router (attached to its own wireless ISP).

An Apache HTTP server is running on one of the aforementioned FreeBSD platforms, is attached to the local LAN, and hosts several virtual webhosts. (Keep in mind that these virtual hosts are all attached to the primary webserver's IP address.) These virtual hosts are part and parcel of the problem. Notably, the httpd.conf file, et al., has not changed. All webhosts were serving well before the upgrade of the nameserver to BIND v9.10.2, which runs on a separate FreeBSD platform, but on the same LAN. The problem now is that only one of the virtual hosts will serve, and it is not the primary webhost; however, the zone files are practically identical -- one works, the other does not.
The diagnostic complaint is "no route to host". Additionally, I'm seeing similar complaints from sendmail and qpopper.

Philosophically and certainly pragmatically, I agree with Cricket Liu and Paul Albitz's assertion that
"[the] worst problem with DNS is that despite its widespread use on the Internet, there's really very little documentation about managing and maintaining it. Most administrators on the Internet make do with the documentation their vendors see fit to provide and with whatever they can glean from following the Internet mailing lists and Usenet newsgroups on the subject.

This lack of documentation means that the understanding of an enormously important internet service -- one of the linchpins of today's Internet -- is either handed down from administrator to administrator like a closely guarded family recipe, or learned repeatedly by isolated programmers and engineers. New zone administrators suffer through the same mistakes made by countless others."

DNS and BIND, 5th edition, O'Reilly, May 2006, Preface p.xii.

OK, I'll stand down off my soap box. I just wanted to take an opportunity to vent . . .should I begin to sound ignorant of BIND and nameservice in general.

. . .continuing on topic:

Quote:
You cannot [reach] one or more of the addresses in your /29 block from your private network?
IP addresses are reachable. I can log in via ssh to the WAN address of the server in question and establish a console session from a Windows workstation attached to the LAN. (This also works from a remote WAN site.)

Quote:
You get "no route to host" when using your resolver, but you succeed when using the specific IP address within the /29 block?
Within the bounds of the LAN, hosts seem to be unreachable by name, as in "no route to host". Generally speaking, and depending on whether I am inside the LAN or outside on the WAN, I can ping IP addresses. For example, the name archaxis.net is unreachable, but I can ping its IP address. If I use the IP address of the primary http webhost in a browser, then the page serves; however, if I specify its name, then the request fails with a "no route to host" diagnostic.
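
One way to tell a lookup failure apart from a genuine routing failure is to perform the two steps separately. The following is a minimal sketch, not a definitive diagnostic: classify() and probe() are names invented for this example, and port 80 is assumed only because the hosts in question are web servers.

```python
import errno
import socket

def classify(exc):
    """Map an exception to the stage that failed."""
    # gaierror is a subclass of OSError, so it must be tested first.
    if isinstance(exc, socket.gaierror):
        return "name resolution failed"
    if isinstance(exc, OSError) and exc.errno == errno.EHOSTUNREACH:
        return "no route to host"
    return "other failure"

def probe(host, port=80, timeout=5.0):
    """Resolve host, then attempt a TCP connect; report which stage failed."""
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        return classify(exc)
    family, socktype, proto, _, sockaddr = infos[0]
    try:
        with socket.socket(family, socktype, proto) as s:
            s.settimeout(timeout)
            s.connect(sockaddr)
    except OSError as exc:
        return classify(exc)
    return f"connected to {sockaddr[0]}"

# Example call (needs network access, so it is left commented out):
# print(probe("archaxis.net"))
```

If probe() reports a resolution failure for a name while the same server answers by IP address, the trouble is in the lookup rather than in the routing table.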


Quote:
Are all other Internet addresses reachable through the resolver?
Other valid Internet websites such as google.com, forecast.weather.gov, and obviously this forum, daemonforums.org, are accessible via browser lookup, as well as by WAN IP address.

------------------------
I'm beginning to think that this is not a resolver problem per se. First, I'm not sure that the named is throwing the "no route to host" message. Often, it is difficult to determine from where or what process the message(s) originated. (BTW, this is where OS/400 really shines, but that argument is for another suitcase of beer). Just late last evening, I found similar examples in /var/named/named.log as follows:

Notice that it starts with a query to 151.164.1.11. This is one of AT&T/SBC's nameservers.

It continues with:

security: info: client 151.164.1.11#26716 (www.ar042swrcap.org): query 'www.ar042swrcap.org/A/IN' denied

. . .followed with

query-errors: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): query failed (REFUSED) for www.ar042swrcap.org/IN/A at query.c:6328

Another example:

queries: info: client 151.164.1.11#26716 (www.ar042swrcap.org): query: www.ar042swrcap.org IN A -EDC (192.168.1.73)
18-May-2015 21:33:08.135 security: info: client 151.164.1.11#26716 (www.ar042swrcap.org): query 'www.ar042swrcap.org/A/IN' denied
18-May-2015 21:33:08.135 query-errors: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): query failed (REFUSED) for www.ar042swrcap.org/IN/A at query.c:6328
18-May-2015 21:33:08.135 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): error
18-May-2015 21:33:08.135 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): send
18-May-2015 21:33:08.135 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): sendto
18-May-2015 21:33:08.136 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): senddone
18-May-2015 21:33:08.136 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): next
18-May-2015 21:33:08.136 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): endrequest



. . .then there is this example:


client: debug 3: client 151.164.1.11#45884: query
18-May-2015 21:50:24.584 queries: info: client 151.164.1.11#45884 (classxboats.com): query: classxboats.com IN A -EDC (192.168.1.73)
18-May-2015 21:50:24.584 security: info: client 151.164.1.11#45884 (classxboats.com): query 'classxboats.com/A/IN' denied
18-May-2015 21:50:24.584 query-errors: debug 3: client 151.164.1.11#45884 (classxboats.com): query failed (REFUSED) for classxboats.com/IN/A at query.c:6328
18-May-2015 21:50:24.584 client: debug 3: client 151.164.1.11#45884 (classxboats.com): error
18-May-2015 21:50:24.585 client: debug 3: client 151.164.1.11#45884 (classxboats.com): send
18-May-2015 21:50:24.585 client: debug 3: client 151.164.1.11#45884 (classxboats.com): sendto
18-May-2015 21:50:24.585 client: debug 3: client 151.164.1.11#45884 (classxboats.com): senddone
18-May-2015 21:50:24.585 client: debug 3: client 151.164.1.11#45884 (classxboats.com): next
18-May-2015 21:50:24.585 client: debug 3: client 151.164.1.11#45884 (classxboats.com): endrequest
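
Rather than eyeballing the log, the denied queries in excerpts like those above can be pulled out with a short script. This is a sketch under assumptions: the regular expression is fitted only to the "security: info" lines quoted in this post (other BIND versions may format them differently), and denied_queries() is a hypothetical helper name.

```python
import re
from collections import Counter

# Fitted to the "security: info ... denied" lines quoted in this post;
# it works with or without the leading timestamp.
DENIED = re.compile(
    r"security: info: client (?P<client>[\d.]+)#(?P<port>\d+) "
    r"\((?P<name>[^)]+)\): query '(?P<query>[^']+)' denied"
)

def denied_queries(lines):
    """Yield (client_ip, queried_name) for each denied query."""
    for line in lines:
        m = DENIED.search(line)
        if m:
            yield m.group("client"), m.group("name")

# Sample lines copied from the excerpts above.
sample = [
    "18-May-2015 21:33:08.135 security: info: client 151.164.1.11#26716 (www.ar042swrcap.org): query 'www.ar042swrcap.org/A/IN' denied",
    "18-May-2015 21:33:08.135 client: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): error",
    "18-May-2015 21:50:24.584 security: info: client 151.164.1.11#45884 (classxboats.com): query 'classxboats.com/A/IN' denied",
]

counts = Counter(denied_queries(sample))
for (client, name), n in counts.items():
    print(f"{client} was denied {n} time(s) asking for {name}")
```

To run it against the real file, pass an open file handle for the named log in place of the sample list.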


This leads me to think that the server is resolving, but something is interrupting the validation process. It appears that there is something going on at my upstream ISP's nameserver that is unhappy with my queries. This is new . . .never had this problem before I upgraded to BIND v9.10.2.

As I alluded to before, there is not a great plethora of documentation regarding named diagnostic and logged messages. Regardless, is the "no route to host" message a result of a failed query? If so, then perhaps this is not a resolving failure, but more of an ambiguous diagnostic resulting from a denied or failed query.

If this is so, then what would/should I do to avoid or correct the problem?

I have also observed several logged messages regarding "request is not signed". What's that about?

18-May-2015 22:20:08.439 security: debug 3: client 127.0.0.1#61061: request is not signed



Additionally, I have experimented with running BIND (named) without the /var/named/resolv.conf file. (Why do this? Cricket Liu and/or Paul Albitz have suggested that the table may not be needed in small-scope BIND environments.) Nevertheless, I saw log messages that stated something like "could not resolve" rather than "route to host not found". It was in the early hours this morning . . .2 or 3 AM . . .I did not sleep at all last night . . .I am tired and cannot think or recall how I happened onto the scenario. I'll try to recreate the situation, but for now, it's about 11 PM. I've got to get some sleep.

I'll edit this drama when I have more info.


*** EDIT *** 6:43 AM 19 May 2015

. . .everyone stand by -- I think I have discovered and corrected the issue, and it is associated with my upstream ISP (AT&T/SBC) denying recursive queries as per the following example:

query-errors: debug 3: client 151.164.1.11#26716 (www.ar042swrcap.org): query failed (REFUSED) for www.ar042swrcap.org/IN/A at query.c:6328


In my /var/namedb/named.conf file, I have removed the allow-recursion clause that for some reason I had allowed in my new named.conf file! Notice the tested date in the following code. Once I deactivated the function and restarted named, then BAM! suddenly sendmail and qpopper came alive, downloaded days of eMail, and bingo, my Apache virtual hosts are serving!

Code:
        recursion yes;        // DNS & BIND, 4th edition, p.283 & 322 ...tested Jan 25, 2013
        allow-recursion {     // Only relevant if recursion yes is present or defaulted.
                "internal";
        };

        // allow-query {      // DNS & BIND, p.315 ...tested  Jan 27, 2013
        //         "internal";
        //         "external";
        // };
        // testing with com-out to see if server can receive email.
        // ...and yes, this was the problem. (Allowing queries hoses SendMail.)
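
As I understand it, clauses like allow-recursion and allow-query take an address match list, and named compares the source address of each incoming query against it. Here is a rough sketch of that comparison in Python. It assumes the "internal" acl covers the 192.168.1.x LAN plus loopback (the acl definition is not shown in this post, so those networks are a guess), and it ignores the negation, key, and ordering rules that real address match lists support.

```python
import ipaddress

# Hypothetical contents of the "internal" acl; the real definition
# is not shown in this post.
INTERNAL = [
    ipaddress.ip_network("192.168.1.0/24"),
    ipaddress.ip_network("127.0.0.0/8"),
]

def matches_acl(client_ip, acl):
    """Return True if client_ip falls inside any network in the acl."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in acl)

for client in ("192.168.1.73", "127.0.0.1", "151.164.1.11"):
    verdict = "matches" if matches_acl(client, INTERNAL) else "does not match"
    print(f"{client} {verdict} the assumed internal acl")
```

Under that assumption, a query arriving from an outside address such as 151.164.1.11 would not match a list that names only "internal".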

(I'm going to post this, and come back to my discussion . . .probably should call it, "Ron's Rant".)

. . .continuing, there are still some oddities that I cannot explain. For example, why was one Apache virtual host allowed to serve successfully while all others were denied? In consideration that the associated zone files are essentially identical, there must be some unnoticed subtle nuance. Most important though, is my assumption that the nameserver is/was indeed resolving, only the queries were being blocked upstream, and the no route to host diagnostic messages were (and still are in some cases) a poor and generalized suggestion (probably generated by ICMP) that something is very wrong somewhere else.

As much as I like UNIX, this is where IBM's OS/400 shines. (Don't laugh, I've made a lot of money programming and supporting AS/400 users.) OS/400 supports a plethora of diagnostic and help messaging. The OS allows you to interrupt interactive processes and look at things like the program invocation stack. Messages usually have a second level of help that can provide tremendous detail about the primary subject. Interactive debug provides a multi-colored, step-by-step walk-through of the displayed source code while executing the actual code. Break points are easily set, and program variable values are easily displayed. But I digress; my point is that if indeed the no route to host diagnostic message is generated by ICMP, then the system could have been designed and written to throw a message something like:

"No route to host . . .but you know what, it's because your upstream ISP host screwed you -- they don't allow recursive queries! . . .and oh by the way, they are AT&T and their IP address is 123.456.789.10. Go cry."

You see, ICMP (. . .whatever) already knows what happened -- it made a yes/no-go decision somewhere, and if it evaluates responses to queries and determines DENIED or REFUSED, then tell me the same, rather than "no route to host". Don't make me spend two weeks grepping through obscure named logs. (If only I were king of the world.)

Last edited by rtwingfield; 19th May 2015 at 02:14 PM. Reason: . . .spelling and grammar.