[c-nsp] ASR9k Bundle QoS in 6.0.1

Wed Jun 8 04:53:07 EDT 2016

Hi,

You may want to check out BRKSPG-2904 (http://d2zmdbbm9feqrf.cloudfront.net/2014/usa/pdf/BRKSPG-2904.pdf) - around slides 14 to 17. There is probably a more recent version also.

My understanding is that central arbitration has a few benefits, such as zero packet loss when there is an RSP failure (because every request is sent to _both_ arbiters at the same time, so when one RSP fails there is only one answer back, not two). Thus, the packet in flight is still transmitted without loss. This is critical when you have very sensitive BFD over MPLS tunnels traversing the chassis. Otherwise, the core device may not 'detect' the failed RSP/arbiter quickly enough to maintain data-plane-forwarded BFD relationships between up/downstream nodes.
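
As a rough illustration of the "request goes to both arbiters" idea, here is a toy Python sketch. The class and function names are made up for this example, and it is only a model of the behaviour as I understand it from the slides, not of the actual ASR9k fabric implementation:

```python
from dataclasses import dataclass


@dataclass
class Arbiter:
    """Hypothetical stand-in for the arbiter on one RSP."""
    name: str
    alive: bool = True

    def grant(self, request_id: int):
        # A failed arbiter never answers; a live one grants the request.
        return (self.name, request_id) if self.alive else None


def request_fabric_access(arbiters, request_id):
    """Send the same request to every arbiter and keep the grants that come back."""
    grants = [a.grant(request_id) for a in arbiters]
    return [g for g in grants if g is not None]


rsp0, rsp1 = Arbiter("RSP0"), Arbiter("RSP1")

both_up = request_fabric_access([rsp0, rsp1], request_id=1)

rsp0.alive = False  # RSP0 fails before it can answer the next request
one_up = request_fabric_access([rsp0, rsp1], request_id=2)

# Two answers while both RSPs are up, one answer after the failure.
print(len(both_up), len(one_up))
```

In this model the line card never has to notice that RSP0 is gone: a single grant is enough to send the packet, which is the "one answer back, not two" case.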

The second benefit, relating to my original question :) - is that backpressure from a congested egress port can be applied 'globally' to all ports on the chassis, via the virtual output queues. This should (in my opinion) open the door to centralised egress queueing, because it seems to me that the pieces are all there. Maybe it is now in place from 6.0.1 onwards, so that bundles can support it across all member ports.

One drawback is obviously that both RSPs are required for the full bandwidth per slot to be available, but that helps sell dual RSPs ;)

I've still not had anyone say either way whether the feature from my original post, as shown in the tech docs, is indeed something new or just a re-worded bit of commercial spin.

Cheers :)

Robert Williams
Custodian Data Centre
Email: Robert at CustodianDC.com
http://www.CustodianDC.com

-----Original Message-----
From: Saku Ytti [mailto:saku at ytti.fi]
Sent: 08 June 2016 08:20
To: Adam Vitkovsky <Adam.Vitkovsky at gamma.co.uk>
Cc: Robert Williams <Robert at CustodianDC.com>; cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] ASR9k Bundle QoS in 6.0.1

On 8 June 2016 at 05:22, Adam Vitkovsky <Adam.Vitkovsky at gamma.co.uk> wrote:
> Now that I look at the length of this post I'm thinking I should write
> some blog on this, start with basics and build up reader's knowledge
> step by step before mixing all building blocks together into complex concepts.
> As throwing some nuggets here and there like this might actually
> introduce even more confusion.

Please do. The justification offered for a centralised arbiter is failover. I'm not sure I understand this.

LC0 => RSP0 => LC1
LC0 => RSP1 => LC1

LC0 wants to send a cell to LC1, and it can reach LC1 through either RSP.

Now what failure scenario will require us to have a centralised arbiter view? Let's assume LC0 decides to use RSP0 for a given cell going to LC1, but RSP0 dies after receiving the request and before giving the grant. How could we solve it:

a) with central arbitration
- RSP1 will detect RSP0 failure, and will proceed to give fabric grants for pending requests

b) without central arbitration
- LC0 will detect RSP0 failure, and will proceed to resend pending fabric requests to RSP1

What did we win by adding this central arbitration complexity? It does not seem to be inherent in failover. I didn't think about multicast replication yet, but maybe there is some solid argument for it.
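
The two options can be laid side by side in a toy Python sketch. The function names and event strings are invented for illustration, and the final grant step in (b) is an assumption about what happens after the resend, so treat it as a sketch of the argument rather than of real fabric behaviour:

```python
def failover_with_central_arbitration(pending):
    """Option (a): RSP1 detects the RSP0 failure and grants the pending requests."""
    events = ["RSP1 detects RSP0 failure"]
    events += [f"RSP1 grants request {r}" for r in pending]
    return events


def failover_without_central_arbitration(pending):
    """Option (b): LC0 detects the RSP0 failure and resends its requests to RSP1."""
    events = ["LC0 detects RSP0 failure"]
    events += [f"LC0 resends request {r} to RSP1" for r in pending]
    # Assumption: RSP1 then grants each resent request as usual.
    events += [f"RSP1 grants request {r}" for r in pending]
    return events


pending = [1, 2]  # requests RSP0 received but never granted
a = failover_with_central_arbitration(pending)
b = failover_without_central_arbitration(pending)

for label, events in (("a", a), ("b", b)):
    print(label, events)
```

Both sequences start with a failure-detection step; in this model the only difference is the extra resend per pending request in (b).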

--
  ++ytti