[c-nsp] ASR1K forwarding failures on 10G SPA's
Sascha E. Pollok
sp at iphh.net
Fri Oct 7 04:21:42 EDT 2016
Cc: Back to list for documentation purposes
Hi Stephen,
Sorry for the late reply. Yes, there is a new IOS XE version in which the
NTP bug is fixed. Otherwise, they claim an NTP access-list might help as a workaround.
The bug is:
CSCva52489 Input queue wedge. NTP packets destined to networks configured
on router
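
For anyone checking their own boxes for this wedge, here is a minimal Python
sketch (not part of the original exchange) that parses the "Input queue"
counter line from "show interfaces" output and flags the symptom described
further down in this thread, where the queue size exceeds its maximum. The
names parse_input_queue and looks_wedged are hypothetical, and the
size-greater-than-max check is only a heuristic, not a confirmed signature
of CSCva52489.

```python
import re

# Matches the counter line printed by "show interfaces", for example:
# "Input queue: 376/375/872/0 (size/max/drops/flushes); Total output drops: 0"
INPUT_QUEUE_RE = re.compile(r"Input queue:\s*(\d+)/(\d+)/(\d+)/(\d+)")


def parse_input_queue(line):
    """Return size/max/drops/flushes as a dict, or None if the line does not match."""
    m = INPUT_QUEUE_RE.search(line)
    if m is None:
        return None
    size, maximum, drops, flushes = (int(g) for g in m.groups())
    return {"size": size, "max": maximum, "drops": drops, "flushes": flushes}


def looks_wedged(counters):
    """Heuristic (assumption): the queue holds more packets than its maximum."""
    return counters["size"] > counters["max"]


# Sample line taken from the output quoted later in this thread.
sample = "Input queue: 376/375/872/0 (size/max/drops/flushes); Total output drops: 0"
counters = parse_input_queue(sample)
print(counters, looks_wedged(counters))
```

A queue that is merely full (size equal to max) is not flagged; only the
"one packet over the maximum" pattern is.
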
Good luck!
Sascha
On 04.10.2016 at 23:58, Stephen Fulton wrote:
> Hi Sascha,
>
> That's it exactly!
>
> Input queue: 376/375/872/0 (size/max/drops/flushes); Total output drops: 0
>
> (I had cleared the interface a few moments prior).
>
> Thanks for the SR. Was your case resolved?
>
> -- Stephen
>
> On 2016-10-04 5:15 PM, Sascha E. Pollok wrote:
>> (Not replying to the list but all folks who joined the discussion)
>>
>> Hi Stephen,
>>
>> Were the drops high, or the input queue? What we saw before was that the queue was
>> filled to exactly one packet more than the maximum. Our case was SR ***.
>>
>> Let me know if this matches your case.
>>
>> Cheers
>> Sascha
>>
>> On 04.10.2016 at 15:55, Stephen Fulton wrote:
>>> Gentlemen,
>>>
>>> Interesting, I checked this morning and the input drops were very high, despite being
>>> cleared 12 hours ago on a router no longer in production. If anyone has a TAC case they
>>> can reference (privately or otherwise) I'd appreciate it, as I have a TAC case open now.
>>> I'll wait on updating IOS-XE from 3.16.3.S until TAC is ready.
>>>
>>> Thanks,
>>>
>>> -- Stephen
>>>
>>> On 2016-10-04 1:57 AM, Sascha Pollok wrote:
>>>> Exactly. OP might try to raise hold-queue xx in on those interfaces. If
>>>> it solves the problem temporarily (!) he found it.
>>>>
>>>> If so, show buffers input-interface should give a hint.
>>>> The NTP bug came up pretty recently (2 months or so?) so it could
>>>> actually be the cause.
>>>>
>>>> -Sascha
>>>>
>>>> On 4 October 2016 07:45:36, Mark Tees <marktees at gmail.com> wrote:
>>>>
>>>>> That sounds like what I experienced in ASR920 land recently with bad
>>>>> packets filling up interface input queues causing a wedge.
>>>>>
>>>>> When it happens check the interface input queues and save the output.
>>>>>
>>>>> The resolution for us so far has been tight CoPP with discards, iACLs,
>>>>> and the like to only allow things towards the boxes that are as
>>>>> trusted as possible.
>>>>>
>>>>> On Tuesday, 4 October 2016, Sascha Pollok <sp at iphh.net> wrote:
>>>>>
>>>>> Just to make sure: latest IOS XE version? It's not the NTP
>>>>> processing bug filling up interface queues? How does the input
>>>>> queue look on the affected interfaces?
>>>>>
>>>>> Cheers
>>>>> Sascha
>>>>>
>>>>>
>>>>> On 4 October 2016 05:33:39, Stephen Fulton
>>>>> <sf at lists.esoteric.ca> wrote:
>>>>>
>>>>> ISIS adjacencies drop, and BGP sessions on neighboring
>>>>> devices drop as well.
>>>>>
>>>>> Issue just reoccurred.
>>>>>
>>>>> -- Stephen
>>>>>
>>>>> On 2016-10-03 10:59 PM, Scott Granados wrote:
>>>>>
>>>>> Anything logged while this happens?
>>>>>
>>>>> On Oct 3, 2016, at 10:52 PM, Stephen Fulton
>>>>> <sf at lists.esoteric.ca> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> I have run into a number of forwarding failure events
>>>>> on ASR1K's with 10G SPA's. These have occurred across
>>>>> a range of IOS-XE versions, using various ROMMON
>>>>> versions and across two different ASR1K platforms
>>>>> (1002's and 1004's). Multiple SPA's have been
>>>>> replaced, IOS-XE versions and ROMMON versions upgraded
>>>>> and in the case of the ASR1004's, SIP's replaced (both
>>>>> SIP10 and SIP40's). TAC cases have been opened
>>>>> several times.
>>>>>
>>>>> What occurs is forwarding across an interface fails
>>>>> completely. The easiest way to find it is the lack of
>>>>> ARP entries on the interface/sub-interface, due to
>>>>> time-outs, but traffic is still attempting to traverse
>>>>> the interface. When I ping the IP address associated
>>>>> with the failed interface, it fails. ARP resolution
>>>>> of any neighbors fails, and neighboring devices on the
>>>>> same broadcast domain cannot reach it - though will
>>>>> see its MAC in the ARP table.
>>>>>
>>>>> In all cases, ISIS and MPLS were configured on the
>>>>> interfaces. BFD has been on some, not on others.
>>>>>
>>>>> I recently learned of another organization that
>>>>> saw the same behavior on an ASR1006 with 10G SPA's.
>>>>> SPA's and SIP's were replaced and the last advice they
>>>>> received from TAC was that if it occurred again the
>>>>> chassis would need to be replaced. It did, but they chose
>>>>> not to replace the chassis and simply stopped using
>>>>> 10G entirely.
>>>>>
>>>>> Has anyone else seen this?
>>>>>
>>>>> -- Stephen
>>>>