[ednog] DNS server monitoring

Michael Sinatra michael at rancid.berkeley.edu
Mon Nov 28 17:29:39 EST 2005


What are EDU folks doing to monitor their nameservers?  I know I have 
posed this question before to individuals on this list, but I'd like to 
survey the group.  (I can summarize and post the summary if you just 
want to reply to me.)

The question is basically in two parts:

o Monitoring actual queries, such as via syslog or some other method. 
Do you do this and do you ship all of your queries to a central syslog 
server?  If not, how do you monitor DNS queries?  How long do you save 
the logged queries?  What kinds of trolling do you do (looking for 
naughty queries that might indicate compromise, botnets, etc.)?

o Aggregate statistics such as number and type of queries per second. 
Do you have any (say, RRD-backed) script that either monitors the server 
itself or goes through syslogs and generates aggregate statistics?  Do 
you use SNMP on your DNS servers?  Any issues with that?  (It might be 
useful to mention the OS you're running.)
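
To make the first bullet concrete, here is a minimal sketch of the kind 
of scan I have in mind.  It assumes BIND-style query log lines 
containing "client ADDR#PORT: query: NAME CLASS TYPE"; that line shape, 
the function name flag_clients, the watchlist, and the threshold are 
all illustrative assumptions, not anything a particular server is known 
to emit or require.

```python
import re
from collections import Counter

# Assumed fragment of a query log line (hypothetical sample):
#   client 192.0.2.10#32768: query: www.example.edu IN A +
CLIENT_QUERY_RE = re.compile(
    r"client\s+([^#\s]+)#\d+:\s+query:\s+(\S+)\s+\S+\s+(\S+)"
)


def flag_clients(lines, watchlist, max_queries):
    """Return (watched_hits, noisy).

    watched_hits maps client -> set of watched names it asked for.
    noisy maps client -> query count, for clients above max_queries.
    """
    watched_hits = {}
    counts = Counter()
    for line in lines:
        match = CLIENT_QUERY_RE.search(line)
        if match is None:
            continue  # not a query line in the assumed format
        client, qname, _qtype = match.groups()
        counts[client] += 1
        name = qname.lower().rstrip(".")
        # Match the watched domain itself or any name beneath it.
        if any(name == d or name.endswith("." + d) for d in watchlist):
            watched_hits.setdefault(client, set()).add(name)
    noisy = {c: n for c, n in counts.items() if n > max_queries}
    return watched_hits, noisy
```

The watchlist and threshold would have to come from local knowledge; 
the sketch only shows the bookkeeping.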

I'm leaning toward a regime where I would log all my queries to a 
dedicated syslog server, which would then have a script that would parse 
the raw logs and generate RRD graphs of aggregate query statistics.  Any 
gotchas you can think of?  (One that I know of is that the syslog server 
can't be configured to do reverse lookups using one of the DNS servers 
it's monitoring, or it will get into a rather nasty loop: it does a 
lookup for every query, which generates a query log, which generates a 
lookup, which generates a...)  I also plan to test how the syslog 
processes on the DNS servers deal with the syslog server going down for 
an extended period of time.  I don't think that should be a problem 
because they're just throwing UDP at the server...
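
The parsing step described above might look roughly like the sketch 
below.  It assumes syslog lines of the form "Mon DD HH:MM:SS host 
named[PID]: client ADDR#PORT: query: NAME CLASS TYPE" and 5-minute 
buckets; the line format, the bucket size, and the names aggregate and 
rates are assumptions for illustration only.  Feeding the resulting 
numbers into RRD would be a separate step and is not shown.

```python
import re
from collections import Counter, defaultdict

# Assumed line shape (hypothetical sample):
#   Nov 28 17:29:39 ns1 named[1234]: client 192.0.2.10#32768: query: www.example.edu IN A +
QUERY_RE = re.compile(
    r"^(?P<mon>\w{3})\s+(?P<day>\d{1,2})\s+"
    r"(?P<h>\d{2}):(?P<m>\d{2}):(?P<s>\d{2})\s+"
    r"(?P<host>\S+)\s+named\[\d+\]:\s+"
    r"client\s+(?P<client>[^#\s]+)#\d+:\s+"
    r"query:\s+(?P<qname>\S+)\s+(?P<qclass>\S+)\s+(?P<qtype>\S+)"
)

BUCKET_SECONDS = 300  # assumed 5-minute aggregation step


def aggregate(lines):
    """Count queries per type in fixed-size time buckets.

    Returns {(month, day, bucket_start_second_of_day): Counter(qtype)}.
    """
    buckets = defaultdict(Counter)
    for line in lines:
        match = QUERY_RE.match(line)
        if match is None:
            continue  # skip anything that is not a query line
        secs = (int(match["h"]) * 3600
                + int(match["m"]) * 60
                + int(match["s"]))
        start = secs - secs % BUCKET_SECONDS
        key = (match["mon"], int(match["day"]), start)
        buckets[key][match["qtype"]] += 1
    return buckets


def rates(buckets):
    """Convert per-bucket counts into average queries per second."""
    return {
        key: {qtype: n / BUCKET_SECONDS for qtype, n in counts.items()}
        for key, counts in buckets.items()
    }
```

Classic syslog timestamps carry no year, so the bucket key here is only 
month, day, and second of day; a real script would need to handle that 
along with log rotation.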

So far, performance hasn't been an obvious problem, even with some of 
the syslog testing I have been doing.

Anyone have any suggestions for relevant off-the-shelf (open-source) 
tools that might help?


PS. Currently, I monitor queries locally on each box and then have a 
script that sshes into each box every 5 minutes and scoops up the 
queries and dumps them in a central location.  That's clunky for a 
variety of (probably obvious) reasons.
