[c-nsp] ASR9k Bundle QoS in 6.0.1

Wed Jun 8 07:16:42 EDT 2016

Hi,

>> My understanding is that the central arbitration has a few benefits, such as zero packet loss when there is an RSP failure (because every request is sent to _both_ arbiters at the same time, so when one RSP fails there is only one answer back, not two). Thus, the packet in flight is still transmitted without loss. Critical when you have very sensitive BFD over MPLS tunnels traversing the chassis. There may not be a quick enough 'detection' of the failed RSP/arbiter on the core device to maintain data plane forwarded BFD relationships between up/downstream nodes.

> But is that really so? Why can't we do this with 100% separated fabrics and distributed arbiters? The linecard could resend the fabric request if it does not receive a reply within n microseconds, to handle failover with no loss.

I believe the fabrics are already 100% separated (hence you lose 50% of the capacity when an RSP fails). I cannot vouch for the distributed-arbiter element. However, a central point with a view of the entire VOQ for a given interface sounds like the better idea to me, because it can (at least) get the high-priority elements to the front of the queue regardless of which input NP they arrive on. As an example, take 10G and 1G input ports on separate NPs and a 1G output port on a third NP. The 10G port has a burst of low-priority traffic while the 1G input port has a stream of high-priority traffic, both heading towards the 1G output. With separate arbiters, the egress NP would need to signal to both arbiters that it is 'full' instead of just one, so you still end up duplicating control signalling, just in a different area. I'd love to hear from someone with a deep understanding of how this works; the documentation stops short of anything more specific about all this.
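To make the 10G/1G example concrete, here is a toy sketch in Python of what I mean by a central point with a view of the whole VOQ. It is purely illustrative and is not how the ASR9k arbiter is actually implemented: the class name, the port labels and the two-level priority are all my own assumptions.

```python
import heapq
from itertools import count

# Hypothetical priority values: the lower number is granted first.
HIGH, LOW = 0, 1

class CentralArbiter:
    """Toy model: one priority-ordered request queue per egress VOQ."""

    def __init__(self):
        self._voq = {}       # egress port -> heap of (priority, seq, ingress NP)
        self._seq = count()  # arrival order, used as a tie-breaker

    def request(self, ingress_np, egress_port, priority):
        """An ingress NP asks for fabric access towards an egress port."""
        heap = self._voq.setdefault(egress_port, [])
        heapq.heappush(heap, (priority, next(self._seq), ingress_np))

    def grant(self, egress_port):
        """Grant the best pending request, or None if the VOQ is empty."""
        heap = self._voq.get(egress_port)
        if not heap:
            return None
        priority, _, ingress_np = heapq.heappop(heap)
        return ingress_np, priority

arb = CentralArbiter()
# A burst of low-priority requests from the 10G input arrives first...
for _ in range(3):
    arb.request("NP0-10G", "NP2-1G", LOW)
# ...followed by one high-priority request from the 1G input.
arb.request("NP1-1G", "NP2-1G", HIGH)

grants = [arb.grant("NP2-1G") for _ in range(4)]
print(grants)
```

Because both ingress NPs feed the same queue, the high-priority request from the 1G input is granted ahead of the earlier low-priority burst. With one arbiter per ingress NP, that ordering would need some extra coordination between them.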

>> The second benefit, relating to my original question :) - is that backpressure from a congested egress port can be applied 'globally' to all ports on the chassis, via the virtual output queues. This should (in my opinion) open up the door for centralised egress queueing because it seems to me that the pieces are all there, and, maybe, it is now in place from 6.0.1 onwards so that bundles may support it across all member ports.

> I don't think the arbiter sees the traffic at a higher precision than the NPU; I don't think it can discriminate between actual WAN ports.

Again, the lack of detail in the documentation on the arbiter functionality somewhat limits my understanding of this. I believe(d) it has a concept of three colours (High, Normal and Normal-Low) per output queue, per _CLASS_. If that were driven by an "I'm ready for Normal and High for VOQ XXX" type of feedback from an egress NP towards the arbiter, it could (to me) easily be used to drive an egress rate limit, at least per class?
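As a rough sketch of the feedback idea (again in Python, and again hypothetical: the names, the level strings and the "ready set" mechanism are my assumptions, not documented arbiter behaviour), the egress NP tells the arbiter which levels it will currently accept for a VOQ, and the arbiter only grants requests at those levels:

```python
from collections import deque

# Assumed level names, listed in the order the arbiter serves them.
LEVELS = ("high", "normal", "normal-low")

class FeedbackArbiter:
    """Toy model: grants are gated by per-VOQ, per-level egress feedback."""

    def __init__(self):
        self.queues = {}  # (voq, level) -> deque of pending request tags
        self.ready = {}   # voq -> set of levels the egress NP accepts

    def set_ready(self, voq, levels):
        """Egress NP feedback: 'I'm ready for these levels on this VOQ'."""
        self.ready[voq] = set(levels)

    def request(self, voq, level, tag):
        self.queues.setdefault((voq, level), deque()).append(tag)

    def grant(self, voq):
        """Grant the highest pending level that the egress NP accepts."""
        accepted = self.ready.get(voq, set())
        for level in LEVELS:
            queue = self.queues.get((voq, level))
            if queue and level in accepted:
                return level, queue.popleft()
        return None  # nothing pending, or everything pending is held back

fb = FeedbackArbiter()
fb.set_ready("VOQ1", ["high", "normal"])  # egress NP is congested
fb.request("VOQ1", "normal-low", "c")
fb.request("VOQ1", "normal", "b")
fb.request("VOQ1", "high", "a")

first = fb.grant("VOQ1")
second = fb.grant("VOQ1")
held = fb.grant("VOQ1")      # the normal-low request stays queued
fb.set_ready("VOQ1", LEVELS)  # congestion clears
released = fb.grant("VOQ1")
```

The normal-low request is held back until the egress NP reports that it is ready for that level again, which is the per-class limiting effect I have in mind.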

Cheers :)

Robert Williams
Custodian Data Centre
Email: Robert at CustodianDC.com
http://www.CustodianDC.com