[c-nsp] Weird Multicast microburst amplification issue

Matthew Huff mhuff at ox.com
Tue Dec 13 09:49:27 EST 2011


I agree that the 10G->1G neck-down is probably the culprit amplifying the bursty nature of the packets. At our colo at the NYSE in Mahwah we implemented HP's FlexFabric in the blade chassis instead of doing what you are doing with the Aristas, and it's working fine. We will probably have to do the same at our core datacenter.

http://h18004.www1.hp.com/products/quickspecs/13127_div/13127_div.html

Basically you connect N 10G uplinks into the FlexFabric and you can control how the bandwidth is allocated to each blade.

I was hoping to avoid having to do anything, but it looks like the data rates are overrunning the port buffers. I was hoping that the 6500/Sup720 with a 6748 would handle more than 120Mbps / 12k pps of multicast, but it doesn't look like it.



----
Matthew Huff             | 1 Manhattanville Rd
Director of Operations   | Purchase, NY 10577
OTA Management LLC       | Phone: 914-460-4039
aim: matthewbhuff        | Fax:   914-460-4139


> -----Original Message-----
> From: cisco-nsp-bounces at puck.nether.net [mailto:cisco-nsp-
> bounces at puck.nether.net] On Behalf Of Jeff Bacon
> Sent: Tuesday, December 13, 2011 9:30 AM
> To: cisco-nsp at puck.nether.net
> Subject: Re: [c-nsp] Weird Multicast microburst amplification issue
> 
> > It definitely looks like a classic microburst output buffer overflow
> > problem, but with a Sup720 and a 6748 module, I haven't seen this at
> > this volume. Ticker volume has peaked recently, and that might
> > contribute to it. It appears to start happening with more than
> > 120Mbps and/or 12,000 pps output on the port. Other than moving to
> > 10G, I don't see any solutions.
> > Given the 6748 buffer size, I'm surprised it's overrunning it at this
> > volume.
> 
> It could very well be. The port buffers on a 6748 are only about so
> big, after all.
> 
> The amplification factor may come from the simple fact that you have a
> 10G pipe between the switches. "But the input is only coming in at 1G!"
> you say. Yes, but it's then being intermingled on the 10G pipe,
> probably on a 6708/6716 with 200MB port buffers, after going through
> the replication engine. (Ingress mode? Egress? Shouldn't matter
> though.) So while in theory the traffic should be coming through the
> 10G at 1G rates, it isn't necessarily, and you have to consider the
> possibility that you are, yes, facing the ol' 10G->1G neck-down
> problem. 1000 packets @ 1500 bytes == 1.5 MB == buffer go boom.
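
As a rough illustration of the neck-down arithmetic above, here is a minimal Python sketch of how long a 1G egress port could absorb a sustained 10G burst before its buffer overflows. The 1.5 MB buffer figure is taken from the paragraph above, not from a datasheet; the function name and the assumption of a line-rate burst with no other traffic on the port are hypothetical simplifications.

```python
# Back-of-envelope: how long can a slower egress port absorb a faster burst?
# All figures are illustrative assumptions, not measured values.

def burst_tolerance(buffer_bytes, in_bps, out_bps, frame_bytes=1500):
    """Return (seconds until overflow, frames arrived by then) for a line-rate burst."""
    if in_bps <= out_bps:
        # Egress drains at least as fast as ingress fills: buffer never overflows.
        return float("inf"), float("inf")
    fill_rate = (in_bps - out_bps) / 8          # net bytes/sec accumulating in the buffer
    seconds = buffer_bytes / fill_rate
    frames = seconds * in_bps / 8 / frame_bytes  # frames received before overflow
    return seconds, frames

# Assumed 1.5 MB egress buffer, 10G in, 1G out, 1500-byte frames.
secs, frames = burst_tolerance(1_500_000, 10e9, 1e9)
print(f"overflow after ~{secs * 1e6:.0f} us, ~{frames:.0f} frames into the burst")
```

Under these assumptions the buffer lasts on the order of a millisecond, which is far below the resolution of typical interface counters. That is why the average rate on the port can look harmless while drops still occur.
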
> 
> If the packets are large, you also have serialization delay to
> consider. A 1500-byte packet that takes 12 micros to get out the 1G
> pipe only takes 1.2 micros to come in the 10G pipe. Multiply.
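
The serialization point can be checked with a few lines of Python. This is only a sketch: it assumes a 1500-byte frame and ignores preamble, inter-frame gap and any other per-frame overhead, and the function name is made up for illustration.

```python
# Serialization delay: time to clock one frame onto the wire at a given link speed.
# Ignores preamble and inter-frame gap (a simplifying assumption).

def serialization_delay_us(frame_bytes, link_bps):
    """Microseconds needed to transmit frame_bytes at link_bps."""
    return frame_bytes * 8 / link_bps * 1e6

FRAME = 1500  # bytes, assumed "large packet" size

t_10g = serialization_delay_us(FRAME, 10e9)  # arrival time per frame on the 10G side
t_1g = serialization_delay_us(FRAME, 1e9)    # departure time per frame on the 1G side

# During a back-to-back burst, this many frames arrive for every frame that leaves.
ratio = t_1g / t_10g

print(f"10G: {t_10g:.1f} us/frame, 1G: {t_1g:.1f} us/frame, ratio {ratio:.0f}:1")
```

So for each frame the 1G port manages to send, roughly ten can arrive from the 10G side, and the other nine have to sit in the egress buffer.
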
> 
> I'm not going to point at any of these and say "that's it" - but I can
> see where it can happen, as annoying as crap as it might be. Someone
> suggested running a 1G pipe between the switches to see whether the
> problem went away - I suspect that is what they were pointing at.
> 
> I've been moving hosts off the 6500s and onto 10G off Aristas fed off
> the 6500s. Let the 6500 drive the WAN, the Aristas handle fan-out. I
> am actually sitting here debating swapping out a pair of VS720s for
> Sup2T kit. Not because the hardware is working particularly hard
> as-is (by the stats, they've got life downright good), but because
> sledgehammer overkill seems to be about the only safe option in dealing
> with these kinds of flows. I know it will take me 3 months to swap the
> sups and cards out, so it might be better to start now, however little
> thrilled I am at forking over $20k more per switch than I had
> originally intended. (The VS720 parts swapped out would be used to
> populate some new chassis in a DR/test facility. The original idea was
> just to buy VS720 parts, but my vendor came up with better prices on
> Sup2T kit than I'd seen even a couple of months ago, so now it's just in
> the range of "argh...maybe..." instead of "no, way too much".)
> 
> Or to quote one of my employees, "who knows what MOAR will be asked for
> next..." - or, when will OPRA blow the cap again?
> 
> I suspect you just helped me decide.
> 
> Grrrr.
> -bacon
> 


