[c-nsp] ASR1002 -- interface stops passing IPv4 traffic?

Fri May 19 03:32:19 EDT 2017

Hi John,

Unfortunately we have seen several examples of this on the ASR9k platform with partial NP microcode lockups. Almost every case relates to us using more than 2 or 3 'features' per physical interface (like Sampled Netflow + Mirror First 64 bytes + Sub Interfaces + CBQoS). These may not apply to you, but after countless TAC cases and following various leads I can confirm there are many different new and exciting ways to break it with each IOS XR release. We also had examples of breakage in only IPv4 or only IPv6, or only sub-interfaces... and so on.

In all cases, the rest of the platform (forwarding, routing and in some cases even hardware BFD) continued to believe the port was up/up. In fact, in several cases even the port mirroring itself continued to work - but none of the traffic on the mirror was actually being forwarded. All of our examples were repeatable with the correct combination of features. For other readers, all of these issues were in the 5.3.3 train and at the very beginning of the 5.3.4 train (mostly fixed in 5.3.4 SMUs).

So, purely on experience with the 9k platform (XR obviously), this sounds to me like you are hitting a bug. TAC may be your best option here, or start stripping out features until the problem stops, then add them back in the same order you removed them (which is what we did, as TAC was taking too long).
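
As a rough illustration of that strip-and-restore approach, here is a minimal Python sketch. It is not a tool we used; the function name `strip_then_restore`, the `port_fails` callback and the feature labels are all hypothetical, and in practice the "test" is a human applying config and checking whether the port forwards.

```python
from typing import Callable, List, Sequence, Tuple


def strip_then_restore(
    features: Sequence[str],
    port_fails: Callable[[Sequence[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Remove features until the port works, then add them back.

    `port_fails` is a hypothetical check (assumption): it takes the list of
    features currently configured and returns True if the port stops
    forwarding with that combination.

    Returns (removed, re_added): the features removed before the port
    recovered, and the features re-added up to and including the one
    that reproduced the failure.
    """
    active = list(features)
    removed: List[str] = []

    # Phase 1: strip features one at a time until forwarding recovers.
    while active and port_fails(active):
        removed.append(active.pop())

    # Phase 2: add them back in the same order they were removed,
    # stopping at the first addition that breaks the port again.
    re_added: List[str] = []
    for feature in removed:
        active.append(feature)
        re_added.append(feature)
        if port_fails(active):
            break

    return removed, re_added


if __name__ == "__main__":
    # Illustrative labels only; the failing combination is made up.
    configured = ["sub-interfaces", "cbqos", "netflow", "mirror"]
    fails = lambda cfg: "cbqos" in cfg and "netflow" in cfg
    print(strip_then_restore(configured, fails))
```

The last entry in the second list is the feature whose addition reproduced the failure alongside whatever was still configured, which is the combination worth handing to TAC.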

Not a massive amount of help, but thought I'd share!

Cheers,

Robert Williams
Custodian Data Centre
Email: Robert at CustodianDC.com
http://www.CustodianDC.com

-----Original Message-----
From: cisco-nsp [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of John Osmon
Sent: 19 May 2017 05:44
To: Cisco Network Service Providers <cisco-nsp at puck.nether.net>
Subject: [c-nsp] ASR1002 -- interface stops passing IPv4 traffic?

I've never found an IOS device I couldn't tame with the help of Usenet and then Google.  However, I'm new to the ASR1000 and IOS-XE, and I'm running into something I've never seen before.

I've got GigE ports that will pass traffic, and then suddenly stop.
The interface still shows up/up, but you can't even ping the local interface from the router itself.

We can restore traffic by moving the config to another port, but the "dead" port stays dead.  We've tried shut/no shut, new SFPs, and new configs -- but the port still won't work.

Interestingly, the port *DOES* work with IPv6 -- but not IPv4.  This router doesn't use IPv6, but when I put an address on the interface, it is pingable.

If you apply an IPv4 /24 to the dead interface, the routing table shows the /24 as a "connected" network, and shows a "local" /32 for the address in use -- but the address is not pingable.
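
For readers less familiar with those two entries, the following Python sketch derives the "connected" prefix and the "local" host route that an interface address would normally produce. The address 192.0.2.1/24 is a documentation-range placeholder and `expected_routes` is a hypothetical helper, neither comes from the router in question. The point is only that the routing table described above looks exactly as it should, so the fault appears to be in forwarding rather than in the control plane.

```python
import ipaddress
from typing import Dict


def expected_routes(interface_cidr: str) -> Dict[str, str]:
    """Return the connected and local routes implied by an interface address."""
    iface = ipaddress.ip_interface(interface_cidr)
    # "Connected" route: the subnet the interface address belongs to.
    connected = iface.network
    # "Local" route: a host route (/32 for IPv4) for the address itself.
    local = ipaddress.ip_network(f"{iface.ip}/{iface.ip.max_prefixlen}")
    return {"connected": str(connected), "local": str(local)}


if __name__ == "__main__":
    # Placeholder address (assumption), not taken from the affected router.
    print(expected_routes("192.0.2.1/24"))
    # prints {'connected': '192.0.2.0/24', 'local': '192.0.2.1/32'}
```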

The only thing we've found in common between the ports is that they were connected to eBGP peers.  We've had three events, on ports connected to two different providers.

My next step is to get to the colo and move one of the "dead" ports to a spanned port switch and start sniffing the line.

Any suggestions would be appreciated.  Hardware in use includes:
   ASR1000-ESP10
   ASR1002-RP1
   SPA-8X1GE-V2

The problem has occurred on both built-in and SPA-8X1GE-V2 ports, with both multi-mode and GE-T transceivers.
