[c-nsp] NCS-5001 - MPLS L3VPN Issue

James Bensley jwbensley at gmail.com
Tue Feb 2 10:32:38 EST 2016


On 2 February 2016 at 15:09, Adam Vitkovsky <Adam.Vitkovsky at gamma.co.uk> wrote:
> Are you running 5+ by any chance?
>
>> It’s been years since IOS-XR was released on ASR9000's, no excuse now for
>> basic features still not working. The TAC responses aren’t helpful either;
>
> I'm sorry to hear that as I have very positive experiences solving cases with XR team in Europe.
> And if the guy on the line did not know how to solve some hardcore problem he would get me the SME, a gentleman who designed the particular technology on XR so we could have a private techtorial for couple of hours to get it solved.

So TAC responses for "something is broken, we need help" are good, and
for config issues great; XR TAC is way better than "regular" TAC IMO.
I'm talking about fresh new bugs. TAC look, agree it's a bug, and it
has to get punted to the BU. Now the pace of help slows way down,
because even if the TAC case is a P1, whatever BU has been roped in
seems ignorant of any sense of priority.

>> things like "running an Inter-AS MPLS Option B and BGP-LU at the same time
>> is not supported" - So we can have labelled VPN routes, or labelled GRT
>> routes but not both? In this day and age! Someone once said to us “Inter-AS
>> MPLS Opt C isn’t supported at all” - which we were running on the PE/ASBR
>> under investigation. We’ve had bucket loads of issues/TAC cases (we are still
>> opening TAC cases at a decent rate).

> That's striking.

It's ridiculous is what it is.

> I don't know about that as I have been running labelled-unicast and vpnv4 AFs in the lab just fine.
> So which part of the Inter-AS MPLS OptC is not supposed to work according to TAC please?

I think the problem was actually that we were running Opt C and Opt B
(so LU path between RRs and VPNv4 between ASBRs, basically migrating
from one option to the other in a staged approach). During this period
we had some issues that are still present after the migration, I
believe (mostly label recycling issues).

Label recycling issues, let's talk about that. Jesus christ, I've seen
a lot of those. TAC don't seem to be able to replicate it, however
we've had it on two separate networks, both times as soon as we lifted
the boxes off 4.3.4 default to 4.3.4 + latest SP, and the boxes are
running Inter-AS Opt B. I'm now building a lab to try and reliably
replicate it for myself so I can kick TAC's arse into fixing it.

Any routers running 4.3.4 default, as soon as they were lifted off
default to SP4/6/8/10 they have all encountered some form of bug,
every time (BGP processes stuck at 50% CPU, label recycling errors,
line cards rebooting etc).

5.1.3 + latest SP is the most stable we have found. Once, some routers
came with 5.1.2 out of the box, which we would upgrade to 5.1.3 once
they were deployed in the DC (once they were racked etc, whereas we
would normally
upgrade pre-shipping to DC) we had some PHY bugs with a DWDM mux. Not
to mention we hit that famous SSH bug on those too with SSH crashing.

> My stance on this is that I'd beaten the kit to death in the lab anyways before deployment so even if Cisco would swear there are no bug I'd do my own scrutiny.
> On the other hand having to report elemental bugs sucks. But is it the case on x.x.3 or x.x.4 version of the code please or was I just lucky or ignorant?

What we often hear from TAC is "$this feature is supported and so is
$that feature, but $this and $that together is not" - they are good in
that they will look into the problem anyway, however telling us that
$this + $that is not supported on the same box makes the boxes
unusable for their intended market?!

James.

