[c-nsp] ingress vs egress queues

Nick Hilliard nick at foobar.org
Fri Mar 4 08:33:52 EST 2011


On 04/03/2011 12:29, Phil Mayers wrote:
> Now I'm curious though - how on earth can you queue on a cut-thru switch?
> If you queue, you're storing the packet and are no longer cut-thru?

basically, yes.  Just as on a store-n-forward switch, if the next hop on 
your fabric is unable to accept a packet for whatever reason, the packet 
needs to be either buffered or dropped.  So on a cut-thru switch, if the 
destination port D is busy with input from port A and a packet arrives on 
port B destined for D, you need some mechanism to store that packet; 
otherwise you end up dropping it.  You could use ethernet flow control, of 
course, but flow control causes head-of-line blocking, which is 
particularly evil and should be avoided at all costs.
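To make that choice concrete, here's a minimal Python sketch of the 
decision an ingress port has to make (the port names, buffer size and 
simple FIFO model are my own illustrative assumptions, not how the N5k 
actually implements it):

    from collections import deque

    BUFFER_LIMIT = 3072   # assumed per-port buffer capacity, in packets

    class IngressPort:
        """Models the buffer-or-drop choice when the egress port is busy."""
        def __init__(self, name):
            self.name = name
            self.buffer = deque()   # packets waiting for a busy egress port
            self.dropped = 0

        def forward(self, packet, egress_busy):
            if not egress_busy:
                return "cut-thru"            # forwarded directly, never buffered
            if len(self.buffer) < BUFFER_LIMIT:
                self.buffer.append(packet)   # store until the egress port frees up
                return "buffered"
            self.dropped += 1                # buffer full: tail drop
            return "dropped"

    # port B receives a packet for D while D is busy with traffic from A
    port_b = IngressPort("B")
    print(port_b.forward(b"\x00" * 64, egress_busy=True))   # -> "buffered"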

> Do you have a pointer to any docs on this?

There are several ways of implementing buffering on a cut-thru switch.  The 
N55k mechanism is documented here:

http://www.cisco.com/en/US/prod/collateral/switches/ps9441/ps9670/ps11215/white_paper_c11-622479.html

The n5k/n55k implements virtual output queues.  It's quite an elegant and 
conceptually simple method of dealing with the issue, but there are other 
mechanisms too.
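To illustrate the concept (a toy model of the idea only, not the N5k's 
actual implementation): each ingress port keeps a separate queue per egress 
port, so a congested destination only backs up its own queue instead of 
blocking traffic heading to other ports.

    from collections import deque

    NUM_PORTS = 20   # assumed port count, matching the 20 x 10G example below

    class VOQIngressPort:
        """Virtual output queueing: one queue per egress port on each ingress port."""
        def __init__(self):
            self.voq = {egress: deque() for egress in range(NUM_PORTS)}

        def enqueue(self, packet, egress):
            # a congested egress port only fills its own queue; packets for
            # other egress ports are unaffected (no head-of-line blocking)
            self.voq[egress].append(packet)

        def dequeue(self, egress):
            # called when the fabric scheduler grants this ingress port
            # access to the given egress port
            return self.voq[egress].popleft() if self.voq[egress] else None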

Regardless of the method used, on a cut-thru switch, most packets are not 
buffered at all, whereas on a store-n-forward, all packets are buffered. 
This means that cut-thru switches usually have surprisingly tiny buffers. 
E.g. the N5k has per-port buffers of 480KB, or 9.6MB per chassis for 20 x 
10G ports.  Other 10G chipsets use even less: Broadcom and Fulcrum based 
boxes normally share a 2MB buffer across 24 x 10G ports.
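For a rough sense of scale, a back-of-the-envelope calculation based on the 
figures above (assuming the shared buffer were split evenly, which real 
chipsets don't strictly do):

    n5k_per_port_kb = 480                   # 480KB per port
    n5k_chassis_kb  = n5k_per_port_kb * 20  # 20 x 10G ports -> 9600KB (~9.6MB)
    shared_kb       = 2 * 1024              # 2MB shared across 24 x 10G ports
    per_port_kb     = shared_kb / 24        # ~85KB per port if split evenly

    print(n5k_chassis_kb, "KB per chassis vs roughly",
          round(per_port_kb), "KB per port")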

Most of the cut-thru boxes also use a cell-based architecture rather than a 
complete packet-based one.  Each incoming packet is assigned one or more 
cells for storage, depending on the size of the original packet.

For the N5k, the cell size is 160 bytes, which means that a packet of up to 
160 bytes will take one cell, while a packet of 161 to 320 bytes will take 
two.  This can dramatically affect buffer usage during switch operation if 
you end up with lots of packets of the "wrong size", so to speak.

As an aside, this turned out to be one of the more interesting 
methodological choices in last year's Nexus 5k vs Arista 7124 lab test, 
supervised by Miercom:

http://www.cisco.com/web/strategy/docs/finance/Miercom_N5K_Arista_1Apr2010.pdf

You'll note that the maximum packet size used is 128 bytes.  If they had 
increased this to 161 bytes, the N5k would have behaved quite differently 
in many of the tests.

There were several other curious methodological choices in this test too 
(e.g. when, in reality, are you ever going to send a co-ordinated burst of 
traffic exactly the size of an N5k port buffer from 23 input ports to a 
single output port?), but a discussion of that is better done over a beer.

Nick
