[ednog] DNS server monitoring

Mon Nov 28 18:22:20 EST 2005

On Mon, 28 Nov 2005 14:29:39 -0800
Michael Sinatra <michael at rancid.berkeley.edu> wrote:

> o Monitoring actual queries, such as via syslog or some other method. 
> Do you do this and do you ship all of your queries to a central syslog
> server?  If not, how do you monitor DNS queries?  How long do you save
> the logged queries?  What kinds of trolling do you do (looking for 
> naughty queries that might indicate compromise, botnets, etc).

We log all queries (and other things) from our primary campus name
servers to a centralized logging facility.  Some logs remain local
to the server, which is helpful if the logs aren't getting to the
remote syslog server.

We tend to keep the logs for up to 3 months.  At current usage that
is about 2 GB uncompressed per day, though they compress nicely (to
around 250-300 MB per day).

I run some custom tools against a tail -f of the queries, looking
for the "naughty queries".  Based on some simple algorithms, the
ones that stand out get put into an incidents MySQL database, which
then gets imported into our custom security incident and network
status management system, primarily to be acted upon by local admin
contacts.
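
As a rough illustration of that kind of streaming check (this is not
the actual tooling described above, just a minimal sketch), the
function below flags log lines whose query name is unusually long.
The function name flag_long_names, the field position 9, and the
60-character threshold are all assumptions made for the example;
adjust them to your own log format.

```shell
# Hypothetical heuristic: print log lines whose query name is longer
# than a threshold.  Reads log lines on stdin.
#   $1 = field position of the query name (assumed default: 9)
#   $2 = maximum name length before a line is flagged (assumed: 60)
flag_long_names() {
    field="${1:-9}"
    max_len="${2:-60}"
    # fflush() keeps output timely when the input is a live tail -f
    awk -v f="$field" -v max="$max_len" \
        'length($f) > max { print; fflush() }'
}
```

It could be fed from a live log with something like
tail -f "$logfile" | flag_long_names 9 60.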

I also have a handful of scripts and tools that I run for more
in-depth analysis.  For example, to summarize the queries a host
makes, from a query log file (replace $9 with whatever position
the query name is in your logfiles):

  # set client_ip and logfile first; -wF matches the address literally
  grep -wF "${client_ip}" "${logfile}" |
       awk '{print $9}' |
       sort |
       uniq -c

Obviously DNS queries tell a lot about what a host (user) may be
doing, and for privacy reasons you may not want to examine data the
way the above script does.

I've also been looking at some advanced methods of finding anomalous
queries, which so far I'm finding hard and easy at the same time.  It
is easy to find some obvious anomalies, but so far I find it a bit
hard to automatically determine exactly what all those anomalies are
without further manual investigation (though some seem to be easy :-).
Hopefully I'll have more interesting things to say about this effort in
a few months.

For situations where remote logging or long-term storage is a problem,
one idea I've heard, but have not tried, is to send the query logs to
a FIFO (named pipe) and monitor that in real time when you need to.
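
Since I haven't tried this, take the following only as a minimal
sketch of the reading side.  The function names make_query_fifo and
watch_client are made up for the example, and how you point your name
server or syslog daemon at the pipe is not shown.  One thing to check
for your setup: what the writer does when nobody is reading the pipe
(it may block or drop messages, depending on the daemon).

```shell
# Create the named pipe if it does not already exist.
#   $1 = path of the pipe (hypothetical; choose your own)
make_query_fifo() {
    fifo="$1"
    [ -p "$fifo" ] || mkfifo -m 600 "$fifo"
}

# Read from the pipe and show only lines that mention one client.
#   $1 = path of the pipe, $2 = client IP address
watch_client() {
    fifo="$1"
    client_ip="$2"
    # -F: match the address literally; -w: do not match longer addresses
    grep -wF -- "$client_ip" < "$fifo"
}
```

Nothing is stored anywhere; you only see queries while watch_client
is running.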

> o Aggregate statistics such as number and type of queries per second.
> Do you have any (say, RRD-backed) script that either monitors the
> server itself or goes through syslogs and generates aggregate
> statistics?  Do you use SNMP on your DNS servers?  Any issues with
> that?  (It might be useful to mention the OS you're running.)

I do something real cheesy.  A cron job mails us a daily summary of
the previous day's logs.  The script is basically a bunch of regexes,
which you can find here (named-report):

  <http://aharp.ittns.northwestern.edu/software/>

That script summarizes a bunch of different types of BIND log messages
including some things about queries.  It also gives an hourly summary
graph of log messages.
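
To give a flavor of the hourly part, here is a minimal sketch (not
taken from named-report) that counts log lines per hour.  It assumes
traditional syslog timestamps such as "Nov 28 18:22:20", so that the
time is the third whitespace-separated field; the function name
hourly_counts is made up for the example.

```shell
# Count log lines per hour of the day.  Reads syslog lines on stdin
# and prints "HH count" pairs sorted by hour.
# Assumes the time (HH:MM:SS) is field 3, as in classic syslog output.
hourly_counts() {
    awk '{ split($3, t, ":"); count[t[1]]++ }
         END { for (h in count) printf "%s %d\n", h, count[h] }' |
        sort
}
```

Run from cron, something like hourly_counts < "$logfile" | mail ...
would give a crude text-only daily report.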

> I'm leaning toward a regime where I would log all my queries to a
> dedicated syslog server, which would then have a script that would
> parse the raw logs and generate RRD graphs of aggregate query
> statistics.  Any gotchas you can think of?  (One that I know of is
> that the syslog server can't be configured to do reverse lookups,
> using one of the DNS servers it's monitoring, or it will get into a
> rather nasty loop as it does a lookup for every query, which
> generates a query log, which generates a lookup, which generates
> a...)  I also plan to test how the syslog processes on the DNS

Why would the monitoring server need to do queries, unless you are
trying to resolve the things found in the query logs?  If you want
to do that, I suggest putting something near the DNS servers and
pulling out the data you need with a packet capture.  Building this
is a little more complex of course, but it provides far more
information than logs can.  I know you're familiar with the passive
DNS replication work from RUS-CERT, but for those who aren't, this is
a good example of the possibilities:

  <http://www.enyo.de/fw/software/dnslogger/>

> servers deal with the syslog server going down for an extended period
> of time.  I don't think that should be a problem because they're just
> throwing udp at the server...

Correct.  Once the DNS servers put the logs on the wire, they don't
care what happens to them.

> Anyone have any suggestions for relevant off-the-shelf (open-source) 
> tools that might help?

A lot of people prefer syslog-ng to the standard syslog daemons
that ship with most OSes.

John