[c-nsp] ASR920 stops routing unexpectedly
Nathan Ward
cisco-nsp at daork.net
Wed May 11 08:54:51 EDT 2016
> On 11/05/2016, at 21:48, Eric Van Tol <eric at atlantech.net> wrote:
>
> Hi all,
> I am now on my third day with TAC on this problem and they are driving me up a wall. I have an ASR-920-24SZ-M that has been in service for almost a year, on 3.15.0S running ISIS, BFD on one link, BGP, LDP, MPLS. Shortly after midnight this past Saturday, it stopped routing for no apparent reason. As we have it connected to an ethernet OOB network, I was able to get in it to take a look.
>
> The first thing we see is that BFD on one of the upstream links times out. Then *every* ISIS session on the router goes down and it stops processing ISIS updates. All the interfaces were up/up and doing a shut/no shut on them did not bring ISIS back up. Unfortunately, I wasn't thoughtful enough to do a simple ping across them to see if they were processing *anything*. The only way to recover appeared to be a reboot.
>
> I'd chalk that up to a "network anomaly", but the following night it happened again right around the same time. Again, in my rush to get everything back up and running, I had to reboot it before I could gather much information (i.e. no ping, like an idiot).
Hi Eric,
I had exactly the same problem on, I think, an ASR-920-4SZ-A: twice in a few days, within a week of putting it in the network. We pulled it straight back out again. TAC RMA'd it, but we never put it back into the network in the same place.
We were doing a small number (10ish) of L2VPN terminations, so MPLS+L2VPN in on 2 discrete core-facing 10Gs, and 802.1Q-tagged (1 tag per L2VPN) Ethernet out on 2 discrete 10Gs.
I can’t remember for sure, but I think we were doing BFD, ISIS, and LDP. I don’t think we were doing RSVP, and definitely not BGP. Pretty sure we couldn’t ping across it, but I’d have to check. The far ends of the MPLS interfaces were ASR9001s, which ended up being where TAC wanted to focus, which was silly because they were two different ASR9001s. I think we could actually see the rx counter increasing on the ASR920 but the packets never made it anywhere. Again, I’d have to check. Reboot was the only way out, I definitely remember that.
We now have the boxes doing 10G breakout to subrate peering interfaces, not using any L2VPN services, carrying about the same amount of traffic, and they seem to work well like that.
I note that the RMAd one has the static mounting point on the back in a slightly different place, not sure if that means they refreshed something.
I have debug info from ages ago if anyone is interested, and I think we shared the TAC case number on this list as well actually.
Here we go, poke around here, and let me know if you want any more info: http://marc.info/?l=cisco-nsp&m=144524503928911&w=2
--
Nathan Ward