[c-nsp] Sampled netflow on 6500/7600
Richard A Steenbergen
ras at e-gerbil.net
Fri Jun 30 17:51:12 EDT 2006
Ok so, I'd like to see if I can get to the bottom of the netflow issues on
6500/7600s once and for all. Everyone I've talked to who is trying to do
netflow on those platforms and push any decent amount of traffic seems to
be running into the same issue, namely rapid netflow tcam exhaustion on
even 3BXLs with only a few gigs of ordinary traffic.
I've been able to BARELY scrape by, by setting the flowmask to destination
only and using super aggressive aging, so that "most" boxes are not at
100% tcam utilization "most" of the day. Even going to destination-source
on a lightly loaded box instantly fills the tcam.
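
A toy illustration of why the flowmask matters so much for table occupancy: the finer the key, the more distinct entries the same traffic needs. This is only a sketch in Python. The key functions are rough assumptions named after the flowmasks, the traffic is synthetic, and none of it models the real PFC hardware.

```python
from collections import namedtuple

Packet = namedtuple("Packet", "in_if src dst proto sport dport")

# Hypothetical key functions, loosely modelled on the flowmask names.
FLOWMASKS = {
    "destination": lambda p: (p.dst,),
    "destination-source": lambda p: (p.src, p.dst),
    "interface-full": lambda p: (p.in_if, p.src, p.dst, p.proto, p.sport, p.dport),
}

def entries_needed(packets, mask):
    """Count the distinct table entries the packets would create under a mask."""
    key = FLOWMASKS[mask]
    return len({key(p) for p in packets})

# Synthetic traffic: 50 clients, each opening 4 connections to each of 10 servers.
packets = [
    Packet("Gi1/1", f"10.0.0.{c}", f"192.0.2.{s}", 6, 1024 + n, 80)
    for c in range(50)
    for s in range(10)
    for n in range(4)
]

for mask in FLOWMASKS:
    print(mask, entries_needed(packets, mask))
```

With this made-up traffic mix the destination mask needs one entry per server, while the finer masks need one per address pair or per connection, which is the same blow-up described above.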
Trying to enable netflow sampling actually makes the situation worse. The
sampling appears to be done at export time instead of sample time, so
there is no help for the tcam. Also, enabling the time/packet based
sampling most frequently noted in the documentation forces the flowmask to
"interface-full", which exhausts all of the tcam 24/7 on a box that is
pushing even a few gigs of traffic. One big down-side to not being able to
enable sampling is the large volume of netflow traffic generated: you can
easily end up exceeding 100Mbps worth of netflow data by deploying a few
7600s, even with dest-only flowmasks (also the documentation notes that
data under the nexthop/egress ifindex fields may not be accurate with
dest-only flowmasks). I haven't tried netflow aggregation yet, but I've
heard it has the same problem, the aggregation is done at export time.
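
To put the export volume in perspective, here is a back-of-the-envelope sketch in Python. It assumes NetFlow v5 export (the post does not say which version is in use), full export datagrams, and it ignores Ethernet framing. The function names are made up for illustration.

```python
# NetFlow v5 datagram layout: 24-byte header plus up to 30 records of 48 bytes.
V5_HEADER_BYTES = 24
V5_RECORD_BYTES = 48
V5_MAX_RECORDS = 30
UDP_IP_BYTES = 28  # 20-byte IPv4 header + 8-byte UDP header

BYTES_PER_PACKET = UDP_IP_BYTES + V5_HEADER_BYTES + V5_MAX_RECORDS * V5_RECORD_BYTES

def export_mbps(records_per_sec):
    """Export bandwidth in Mbps, assuming every datagram carries 30 records."""
    packets_per_sec = records_per_sec / V5_MAX_RECORDS
    return packets_per_sec * BYTES_PER_PACKET * 8 / 1e6

def records_per_sec_for(mbps):
    """Inverse of export_mbps: flow records per second that fill a given rate."""
    packets_per_sec = mbps * 1e6 / (BYTES_PER_PACKET * 8)
    return packets_per_sec * V5_MAX_RECORDS

print(round(records_per_sec_for(100)))
```

Under those assumptions, 100Mbps of export corresponds to roughly 250,000 flow records per second summed across all exporting boxes.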
So, after digging through the documentation I found a little-referenced
command for random sampling which appears to be implemented in SXF and
SRA, "flow-sampler". So for example, using:
flow-sampler-map sampling
 mode random one-out-of 1000
int g#/#
 flow-sampler sampling
And of course removing the old "ip flow ingress" commands first, the
router does seem to be exporting netflow data but with no noticeable
reduction in tcam usage or amount of data being exported. Also, multiplying the
data received by 1000 leads to wild values, so I think it's safe to assume
that nothing is being sampled. All of the "ip flow ingress" commands on
the entire box have been removed for these tests, on both SXF4 and SRA.
The only feature difference in SRA appears to be that you can do
flow-sampler egress (and ip flow egress), but for this test it is ingress
only.
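
For anyone who wants to repeat the multiply-by-1000 check, a rough sketch in Python follows. The function name and the tolerance are made up for illustration. The byte counts are assumed to come from your collector and from interface counters covering the same interval.

```python
def sampling_looks_active(exported_bytes, interface_bytes, rate=1000, tolerance=0.5):
    """Return True if exported_bytes * rate is within tolerance of interface_bytes.

    exported_bytes:  bytes reported in flow records over some interval
    interface_bytes: bytes seen by interface counters over the same interval
    rate:            the N in "one-out-of N"
    tolerance:       allowed relative error (0.5 means within 50%)
    """
    if interface_bytes <= 0:
        raise ValueError("interface_bytes must be positive")
    ratio = (exported_bytes * rate) / interface_bytes
    return abs(ratio - 1.0) <= tolerance

# If sampling were working, the scaled export should land near the counters.
print(sampling_looks_active(1_000_000, 1_000_000_000))
# If nothing is sampled, the scaled export overshoots by about the sampling rate.
print(sampling_looks_active(1_000_000_000, 1_000_000_000))
```

The tolerance is deliberately loose, since random sampling is noisy over short intervals. An overshoot on the order of the sampling rate itself is what "wild values" would look like here.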
Ideas?
--
Richard A Steenbergen <ras at e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)