[c-nsp] CSR1000v + ASR1000 Code Upgrade Pleasure...
Mark Tinka
mark at tinka.africa
Wed Oct 13 00:52:48 EDT 2021
Hi all.
I thought I'd share our recent experiences, per subject, just in case
others run into the same problems.
So... we finally decided to try 17.3(4a)MD on the CSR1000v, after years
of happy operation on older code. Good Lord, what a drama!
At first, we couldn't figure out why iBGP sessions to all Cisco boxes
would not come up. Then we realized it was because IS-IS to them would
not come up. Then we realized it was because the BFD sessions would not
come up. But even after removing BFD, IS-IS remained down.
After 3 days of searching, we finally landed on CSCuz58508. In case you
don't have CCO access, it is the same issue as described here:
https://community.cisco.com/t5/cisco-cloud-service-router-csr/b00ocg4q4e-csr-1000v-16-3-1a-can-t-set-mtu-on-gig-interface/td-p/3054853
This was even more confusing for us, because our interface driver on
VMware ESXi is vmxnet3.
The bug ID suggests the problem is fixed in 16.3(2) and 16.4(1). So to
be safe, we tested 16.12(5)MD, which allowed us to configure jumbo
frames, but that turned out to be purely cosmetic. In the background,
the box was silently dropping packets. We found this out when we tried
to copy files to the node, and the transfer would just hang without any
feedback. Removing the jumbo frame configuration allowed the files to
come through.
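For anyone wanting to catch this kind of silent drop before a file copy
hangs, a do-not-fragment ping sized to the configured MTU is a quick
check. Below is a minimal sketch in Python, assuming a Linux host with
iputils ping (its -M do, -s and -c options). The function names and the
1500/9000 MTU values are illustrative only, not taken from our setup.

```python
import subprocess
from typing import Dict, Iterable, List

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu: int) -> int:
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    if mtu < IPV4_HEADER + ICMP_HEADER:
        raise ValueError("MTU too small for an ICMP echo: %d" % mtu)
    return mtu - IPV4_HEADER - ICMP_HEADER


def build_ping_command(host: str, mtu: int, count: int = 3) -> List[str]:
    """Build an iputils ping command with the DF bit set (-M do)."""
    return [
        "ping",
        "-M", "do",                            # prohibit fragmentation
        "-c", str(count),                      # number of echo requests
        "-s", str(icmp_payload_for_mtu(mtu)),  # payload sized to fill the MTU
        host,
    ]


def probe(host: str, mtus: Iterable[int] = (1500, 9000)) -> Dict[int, bool]:
    """Return {mtu: True/False} depending on whether full-size pings succeed."""
    results = {}
    for mtu in mtus:
        proc = subprocess.run(
            build_ping_command(host, mtu),
            capture_output=True,
            text=True,
        )
        results[mtu] = proc.returncode == 0
    return results
```

Calling probe() against a node's address returns one entry per MTU. If
the standard-size ping succeeds while the jumbo-size one fails, even
though the interface claims to accept the larger MTU, that points to the
same sort of silent dropping described above.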
We noticed that nodes still running 3.17(0)S did not have any issues
with IS-IS, BFD or MTU. However, this code was only ever released as
an ED train (and to be fair, we've been having dodgy issues with it in
recent years), so we decided to "downgrade" to 3.16(9)S. That is
actually an upgrade from 3.17(0)S, since the 3.16 train is an MD
release whose latest release dates from March 2019, vs. July 2017 for
3.17(4)SED.
With that, no more MTU issues, BFD and IS-IS are happy, iBGP is happy.
We definitely won't be wasting any more time trying to make Denali,
Gibraltar, Fuji, Everest or Amsterdam work on our CSR1000v complement.
Needless to say, moving the ASR1000 platform to 17.3 has also come with
its own avenue of pleasure, what with the mess that the ROMMON, CPLD and
FPGA upgrades are. What the documentation says and what happens in
real life are two very different things. It has taken us a week to come
up with our own working procedure to upgrade just one box, and it is
worse if it's a dual-RP system.
Mark.