[c-nsp] Sampled netflow on 6500/7600

Richard A Steenbergen ras at e-gerbil.net
Fri Jun 30 17:51:12 EDT 2006


Ok so, I'd like to see if I can get to the bottom of the netflow issues on 
6500/7600s once and for all. Everyone I've talked to who is trying to do 
netflow on those platforms and push any decent amount of traffic seems to 
be running into the same issue, namely rapid netflow tcam exhaustion on 
even 3BXLs with only a few gigs of ordinary traffic.

I've been able to BARELY scrape by, by setting the flowmask to destination 
only and using super aggressive aging, so that "most" boxes are not at 
100% tcam utilization "most" of the day. Even going to destination-source 
on a lightly loaded box instantly fills the tcam.

Trying to enable netflow sampling actually makes the situation worse. The 
sampling appears to be done at export time rather than when flows are 
created, so it does nothing to help the tcam. Also, enabling the 
time/packet based sampling most frequently noted in the documentation 
forces the flowmask to "interface-full", which exhausts all of the tcam 
24/7 on a box that is pushing even a few gigs of traffic. One big 
downside to not being able to enable sampling is the large volume of 
netflow traffic generated: you can easily end up exceeding 100Mbps worth 
of netflow data by deploying a few 7600s, even with dest-only flowmasks 
(the documentation also notes that data under the nexthop/egress ifindex 
fields may not be accurate with dest-only flowmasks). I haven't tried 
netflow aggregation yet, but I've heard it has the same problem: the 
aggregation is done at export time.
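
To put the export volume in perspective, here is a rough back-of-envelope 
sketch. It assumes NetFlow v5 export (24-byte header, 48-byte records, up 
to 30 records per packet) with every packet full, and counts only IPv4 and 
UDP headers on top of that. The function names and the flow rates you feed 
in are hypothetical, not measurements from these boxes.

```python
# Rough NetFlow v5 export bandwidth estimate (assumes full export packets).
V5_HEADER_BYTES = 24      # v5 export packet header
V5_RECORD_BYTES = 48      # one v5 flow record
V5_MAX_RECORDS = 30       # maximum records per v5 export packet
UDP_IP_BYTES = 28         # 20-byte IPv4 header + 8-byte UDP header

# Size of one completely full export packet at the IP layer.
FULL_PACKET_BYTES = V5_HEADER_BYTES + V5_MAX_RECORDS * V5_RECORD_BYTES + UDP_IP_BYTES


def export_mbps(flows_per_sec: float) -> float:
    """Approximate IP-layer export bandwidth in Mbps for a given flow expiry rate."""
    packets_per_sec = flows_per_sec / V5_MAX_RECORDS
    return packets_per_sec * FULL_PACKET_BYTES * 8 / 1e6


def flows_per_sec_for_mbps(mbps: float) -> float:
    """Inverse of export_mbps: flow expiry rate that produces the given bandwidth."""
    packets_per_sec = mbps * 1e6 / 8 / FULL_PACKET_BYTES
    return packets_per_sec * V5_MAX_RECORDS


if __name__ == "__main__":
    # Hypothetical question: how many expired flows per second make 100Mbps?
    print(round(flows_per_sec_for_mbps(100.0)))
```

Under these assumptions each exported flow costs just under 50 bytes on the 
wire, so 100Mbps of export corresponds to a few hundred thousand expired 
flows per second across all boxes combined.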

So, after digging through the documentation I found a little-referenced 
command for random sampling which appears to be implemented in SXF and 
SRA, "flow-sampler". So for example, using:

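! Define a sampler map named "sampling": pick 1 packet in 1000 at random
flow-sampler-map sampling
 mode random one-out-of 1000

! Apply the sampler map on the interface
int g#/#
 flow-sampler sampling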

With that in place, and of course with the old "ip flow ingress" commands 
removed first, the router does seem to be exporting netflow data but with 
no noticeable reduction in tcam usage or in the amount of data being 
exported. Also, multiplying the data received by 1000 leads to wild 
values, so I think it's safe to assume that nothing is being sampled. All 
of the "ip flow ingress" commands on 
the entire box have been removed for these tests, on both SXF4 and SRA. 
The only feature difference in SRA appears to be that you can do 
flow-sampler egress (and ip flow egress), but for this test it is ingress 
only.
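
The "multiply by 1000" check can be made a bit more explicit. The sketch 
below is only an illustration of the reasoning, not a tool I have run 
against these routers: it compares the byte total seen by the collector 
with the byte total from interface counters over the same window. The 
function name, the tolerance, and the inputs are all assumptions.

```python
def sampling_verdict(exported_bytes: float, interface_bytes: float,
                     rate: int, tolerance: float = 0.5) -> str:
    """Guess whether 1-in-`rate` sampling is really being applied.

    exported_bytes:  sum of byte counts in the flow records received
    interface_bytes: bytes actually forwarded, taken from interface counters
    rate:            configured sampling rate (1000 for "one-out-of 1000")
    tolerance:       hypothetical slack, since random sampling is noisy
    """
    if interface_bytes <= 0 or rate <= 1:
        raise ValueError("need positive interface_bytes and rate > 1")

    ratio = exported_bytes / interface_bytes
    expected = 1.0 / rate

    # Sampling works: exported data is roughly 1/rate of the real traffic.
    if abs(ratio - expected) <= tolerance * expected:
        return "sampled"
    # Sampling is a no-op: exported data is roughly all of the real traffic,
    # so scaling it by `rate` overshoots by about a factor of `rate`.
    if abs(ratio - 1.0) <= tolerance:
        return "unsampled"
    return "inconclusive"


if __name__ == "__main__":
    # Hypothetical numbers: collector sees about as much as the interface did.
    print(sampling_verdict(950_000_000, 1_000_000_000, 1000))
```

If the exported totals already match the interface counters before any 
scaling, the scaled values come out roughly 1000 times too high, which is 
the kind of wild result described above.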

Ideas?

-- 
Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

