[nsp-sec] dns issues?

Mike Lewinski mike at rockynet.com
Fri Feb 27 13:41:33 EST 2009


Florian Weimer wrote:

> It might be interesting to look at packet captures/traces.  Something
> like this can happen if the stub resolver picks an unlucky source port
> (such as 1434/UDP).  More speculation. 8-/

Here's a short example of what I was seeing: 206.168.220.23 is my 
customer's nagios host that is checking our availability, and 
206.168.229.20 is the query-source address for the anycasted resolver IP 
that he's querying:

11:50:01.186355 IP 206.168.220.23.41655 > 208.139.192.2.53:  5405+ A? www.visionlink.org. (36)
11:50:01.188296 IP 206.168.229.20.6666 > 4.71.222.18.53:  30212 [1au] A? www.visionlink.org. (47)
11:50:01.231203 IP 4.71.222.18.53 > 206.168.229.20.6666:  30212*- 1/2/3 A 66.84.12.194 (143)
11:50:01.232297 IP 208.139.192.2.53 > 206.168.220.23.41655:  5405 1/2/0 A 66.84.12.194 (100)
11:55:01.190208 IP 206.168.220.23.58803 > 208.139.192.2.53:  51414+ A? www.visionlink.org. (36)
11:55:06.191280 IP 206.168.220.23.58803 > 208.139.192.2.53:  51414+ A? www.visionlink.org. (36)
11:56:01.145814 IP 206.168.220.23.59671 > 208.139.192.2.53:  45763+ A? www.visionlink.org. (36)

When the service is up, nagios checks every five minutes; when it's 
down, every one minute. There's nothing notable in the system logs at 
the time of failure, and tcpdump didn't drop any packets while 
capturing. Examining the full dump reveals nothing else unusual at 
that time - other queries were being answered. There was little 
non-DNS traffic (the only thing I excluded in the capture was my own 
ssh IP). The A record they are querying for has a one-hour TTL, so 
the subsequent queries above should all have been answered from cache.
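
For reference, the capture was essentially unfiltered - the only 
exclusion was my own ssh session. Something along these lines, with a 
placeholder interface and ssh IP:

  tcpdump -n -i eth0 -s 0 -w dns-debug.pcap 'not host 192.0.2.10'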

rndc stats shows a high rate of failures, but these are mostly 
accounted for by non-client queries being dropped by policy. I've 
since ACL'd my resolver anycast IPs at the borders to stop that 
traffic completely, so that rndc stats will be more useful and BIND 
will spend less time refusing foreign queries. I've also increased 
our logging. But since the problem just disappeared on its own, my 
trail has gone cold. I'm tempted to write it off to "cosmic rays" ;)
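
The border filters are nothing fancy - just ACLs in front of the 
anycast address. Cisco-style syntax for illustration, with a 
placeholder customer prefix (208.139.192.2 is the anycast resolver):

  access-list 120 permit udp 192.0.2.0 0.0.0.255 host 208.139.192.2 eq domain
  access-list 120 permit tcp 192.0.2.0 0.0.0.255 host 208.139.192.2 eq domain
  access-list 120 deny   udp any host 208.139.192.2 eq domain
  access-list 120 deny   tcp any host 208.139.192.2 eq domain
  access-list 120 permit ip any any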

>> 2) Yesterday another customer discovered his own resolver cache was
>> poisoned, and his access to some web sites was being proxied through
>> vipertheripper.com
> 
> Have you been able to figure out how the cache was poisoned?

The customer was either running an old version of BIND or had simply 
neglected to configure source port randomization (not sure which, but 
the doxpara test results for his server were "Poor", so I sent him 
the URL for the secure BIND template and info about last year's 
issues).
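
If it was a config problem rather than an old binary, the usual 
culprit is a query-source statement that pins the port, which defeats 
randomization even on a patched BIND. Illustration only:

  // pinned port - defeats source port randomization:
  query-source address * port 53;

  // let named pick a random port for each query instead:
  query-source address *;

The command-line equivalent of the doxpara check is the DNS-OARC 
porttest, run through the resolver being tested:

  dig +short porttest.dns-oarc.net TXT @<his resolver IP>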

He did send me a couple of dig outputs, and I found it curious that 
someone would have any reason to poison an entry like clock.redhat.com.

  ns3# dig clock.redhat.com

  ; <<>> DiG 9.3.1 <<>> clock.redhat.com.
  ;; global options:  printcmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 38043
  ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 0

  ;; QUESTION SECTION:
  ;clock.redhat.com.              IN      A

  ;; ANSWER SECTION:
  clock.redhat.com.       102     IN      A       66.187.233.4

  ;; AUTHORITY SECTION:
  redhat.com.             102     IN      NS      ns3.redhat.com.
  redhat.com.             102     IN      NS      ns1.redhat.com.
  redhat.com.             102     IN      NS      ns2.redhat.com.

  ;; Query time: 5 msec
  ;; SERVER: 207.174.202.18#53(207.174.202.18)

vs:


  ns3# dig clock.redhat.com @63.211.239.4

  ; <<>> DiG 9.3.1 <<>> clock.redhat.com @63.211.239.4
  ; (1 server found)
  ;; global options:  printcmd
  ;; Got answer:
  ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 51209
  ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 0

  ;; QUESTION SECTION:
  ;clock.redhat.com.              IN      A

  ;; ANSWER SECTION:
  clock.redhat.com.       115     IN      A       115.126.5.200

  ;; AUTHORITY SECTION:
  clock.redhat.com.       79612   IN      NS      nx1.redhat.com.
  clock.redhat.com.       79612   IN      NS      nx2.redhat.com.

  ;; Query time: 3 msec
  ;; SERVER: 63.211.239.4#53(63.211.239.4)

[12:46] david: 115.126.5.200, second result, is the squid proxy
[12:46] david: resolves to vipertheripper.com
[12:47] david: then, after a restart of named it went away

By the time I started looking into this, vipertheripper.com was no 
longer resolving for me - the glue at the roots for their NS was 
missing.
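
For anyone who wants to repeat the check, the delegation and glue can 
be queried directly from one of the .com servers, e.g.:

  dig +norecurse vipertheripper.com NS @a.gtld-servers.net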

Mike

-- 
Rockynet.com
303-629-2860
AS13345



