[c-nsp] Network Software - Management/Performance
Phil Mayers
p.mayers at imperial.ac.uk
Mon Feb 19 12:47:25 EST 2007
Paul Stewart wrote:
>
> Anyways, just trolling for ideas.... oh, and prefer Linux support but not
> ruling Windows out entirely....
It has warts, but we run Nagios relatively successfully. The config is
(re)built and (re)loaded every 5 minutes (if changed).
This only works because our (in-house) registration system is considered
authoritative and the config is wholly derived from that database - no
nonsense like "periodic discovery" (translation: periodic erasing of
devices and their history because a loopback was renumbered) or such.
Errors in the database are considered mistakes by the ops staff and are
corrected by them (they are generally detected by auxiliary processes,
such as parsing the device configs and comparing them to the database).
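As a rough illustration of the "rebuild, and reload only if changed" idea, here is a minimal Python sketch. It is not our actual generator: the device record fields (name, address), the "generic-host" template name and the function names are all assumptions made up for the example.

```python
import os
import tempfile

# Assumed host template; "generic-host" is a placeholder template name.
HOST_TEMPLATE = """define host {{
    use        generic-host
    host_name  {name}
    address    {address}
}}
"""

def render_hosts(devices):
    """Render Nagios host definitions from database rows.

    Rows are sorted by name so the same database always yields the same text.
    """
    return "\n".join(
        HOST_TEMPLATE.format(name=d["name"], address=d["address"])
        for d in sorted(devices, key=lambda d: d["name"])
    )

def write_if_changed(path, text):
    """Write text to path atomically; return True only if the content changed."""
    try:
        with open(path, "r", encoding="utf-8") as fh:
            if fh.read() == text:
                return False  # nothing changed, so no reload is needed
    except FileNotFoundError:
        pass  # first run: fall through and create the file
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(text)
    os.replace(tmp, path)  # rename over the old file in one step
    return True
```

A cron job could call render_hosts() on the rows from the registration database every 5 minutes and reload Nagios only when write_if_changed() returns True. How the reload is triggered depends on the installation.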
Most of the work went into writing custom service checks (shell, perl or
python scripts) to poll things we were interested in, such as the
hierarchy of CPUs on a 6500.
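For anyone who hasn't written one, a custom check is just a program that prints one line of status and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). A minimal sketch in Python follows. The CPU names, the thresholds and the assumption that the readings have already been fetched (e.g. via SNMP) are all illustrative, not what our checks actually do.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}

def evaluate_cpus(readings, warn=70.0, crit=90.0):
    """Return (status, message) for a dict of CPU name -> percent busy.

    The overall status is the worst status of any single CPU.
    """
    if not readings:
        return UNKNOWN, "CPU UNKNOWN - no readings returned"
    worst = OK
    for pct in readings.values():
        if pct >= crit:
            worst = max(worst, CRITICAL)
        elif pct >= warn:
            worst = max(worst, WARNING)
    detail = ", ".join(
        f"{name}={pct:.0f}%" for name, pct in sorted(readings.items())
    )
    return worst, f"CPU {LABELS[worst]} - {detail}"

def run_check(readings):
    """Print the status line and exit with the plugin return code."""
    status, message = evaluate_cpus(readings)
    print(message)
    sys.exit(status)
```

With hypothetical readings such as {"rp": 12.0, "sp": 95.0}, evaluate_cpus() reports CRITICAL because one CPU is over the critical threshold.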
The major downside is that with a lot of devices (>1500) the load
average on the system can become significant: the checks are executed
in sub-processes, so there's a lot of fork/exec load.
Some of the more heavily-executed checks we run in a single-process
high-speed parallel poller and they are passed into Nagios with the
"passive service check" feature, but TBH it doesn't seem to gain much.
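To give an idea of the passive approach, here is a sketch of a single-process poller that formats its results as PROCESS_SERVICE_CHECK_RESULT external commands. It uses a thread pool for the parallelism, which is an assumption for the example rather than a description of our poller. poll_fn is a hypothetical callable supplied by the caller.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def format_passive_result(host, service, status, output, now=None):
    """Build one PROCESS_SERVICE_CHECK_RESULT external command line."""
    ts = int(time.time() if now is None else now)
    # The command is newline-terminated, so flatten newlines in the output.
    clean = output.replace("\n", " ")
    return f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;{host};{service};{status};{clean}\n"

def poll_all(targets, poll_fn, workers=32):
    """Poll every (host, service) target in one process, no fork/exec per check.

    poll_fn(host, service) must return (status, output); a raised exception
    is reported as UNKNOWN (3) instead of aborting the whole run.
    """
    def one(target):
        host, service = target
        try:
            status, output = poll_fn(host, service)
        except Exception as exc:
            status, output = 3, f"UNKNOWN - poll failed: {exc}"
        return format_passive_result(host, service, status, output)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as targets.
        return list(pool.map(one, targets))
```

The resulting lines are then written to the Nagios external command file (the path set by command_file in nagios.cfg), and the services concerned need passive checks enabled.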
We've got complementary bespoke systems to track ARP and FDB entries,
RRD graph interfaces and do config archiving and such. These can all be
accomplished by a combination of netdisco, cricket/cacti and rancid (we
needed features those tools didn't have).
Next on the list is SEC+syslog-ng for watching logs.
The point: My advice would be to use a combination of tools which are
each best at their individual jobs, rather than looking for one
monolithic package. Integration of the tools can be done in stages and
makes for good tasks for new staff to take on during quiet periods.
Add to which, every large package I've ever had the misfortune to be
involved with is absolute crap.