[c-nsp] Nexus 7000 - vPC during NX-OS upgrade (ISSU)

Chris Evans chrisccnpspam2 at gmail.com
Thu Jan 27 17:44:20 EST 2011


Lincoln,

We've had two major bugs, and potentially a third, pop up this last week
that have been root caused to ISSU. They may be a result of how we got here:
the platforms shipped with pre-4.2 software and were ultimately ISSU'ed to
4.2(4). Based on the bugs we've found so far, our HTTS team has recommended
that during the install of the devices we reboot them fully after the code
upgrade and not use ISSU if possible.

-One bug that was identified during our ECATS testing was a netflow-related
issue. If I remember correctly, stats do not get exported correctly after
ISSU until you reboot the device. This testing was done back in Q2 of 2010,
so my memory is a bit fuzzy.

-Last week we found the netflow daemon on one of our 7Ks crashing randomly.
We do not have an RCA for this one yet; the DEs are working on it but have
not been able to reproduce it so far. This is currently being blamed on ISSU
as well, but that is not confirmed.

-The biggest bug we've found in relation to ISSU is a nasty one that has hit
us twice now on different 7Ks. Essentially the device black-holes ingress
traffic on certain ports. The fix is to shut/no shut the port/port-channel,
reboot the module(s) or ultimately reboot the whole box. It was RCA'ed to a
register on the ASIC that was set incorrectly, which disables forwarding for
the port. The only way to identify it was through some internal commands;
the normal commands that average users rely on don't show anything
incorrect. In the first instance traffic didn't work from the get go. The
second one hit when a policy-based routing policy was updated during a
change, weeks after the first occurrence. Both times it caused a big
outage. So that has driven fear into us, as it can just hit out of the blue
and we have no way to really validate the device post-change.
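
For reference, the port-level workaround is just an interface bounce. A rough
sketch, assuming the affected port sits in port-channel 10 (a placeholder
number, not one of our real interfaces):

```
configure terminal
interface port-channel 10
  shutdown
  no shutdown
end
```

If the bounce does not restore forwarding, the next steps are the module
reload and then the full box reboot mentioned above.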

With those bugs mentioned, you can understand why our HTTS team has
recommended we NOT use ISSU for now.

I'll have to get the bug IDs and get back to the list. Lincoln, I will reply
tomorrow from my work email account with names of our HTTS team so you can
speak with them.

Thanks

Chris

On Thu, Jan 27, 2011 at 5:21 PM, Lincoln Dale <ltd at cisco.com> wrote:

> On 27/01/2011, at 10:19 PM, Manu Chao wrote:
>
> > I need to upgrade (ISSU) multiples N7K Dual Supervisor running vPC
> > domains from NX-OS 4.2(6) to 5.1(1a).
>
> ISSU from 4.2(6) to 5.1(1a) is non-disruptive.  you should be able to
> upgrade with no disruption to service.
>
> having said that, always carefully read the release notes posted on
> cisco.com for a given release.
> it may be that an upgrade between two releases requires you to do
> something.
> e.g. see
> <http://www.cisco.com/en/US/docs/switches/datacenter/sw/5_x/nx-os/release/notes/51_nx-os_release_note.html#wp293013>
>
> you can certainly run vPC with one vPC peer switch being a different NX-OS
> release to the other vPC peer switch.
>
> On 27/01/2011, at 11:52 PM, Chris Evans wrote:
> > Cisco has advised us to not use issu when possible.. we have had a few
> > weird bugs from it after the fact..  we are running 4.2(4)..
>
> please send me details (off list) of who at "Cisco" that advised you of
> this.  its not accurate.
>
>
> On 28/01/2011, at 12:09 AM, Ryan West wrote:
> > I'm sure the release notes say it, but the 4.x to 5.x major requires a
> > full reload.  I spent a lot of time tracking down a BPDU rate limiting
> > issue only to find the customer had ISSU'd from 4 to 5 and did not reload.
>
> an upgrade from 4.2.x to 5.x does not require a reload.  my guess is that
> the release-notes were not followed as it does talk about some very specific
> things in upgrade/downgrade considerations.
>
>
> cheers,
>
> lincoln.
>
>

