[c-nsp] 3750: SNMP-3-INPUT_QFULL_ERR, ssh session dies, show tech support fails, switch stack crashes on reload

Jeff Kell jeff-kell at utc.edu
Mon May 5 22:46:28 EDT 2014


On 5/5/2014 11:10 AM, Darren O'Connor wrote:
> Never seen it myself, but googling around brings up a few things.
>
> Did this recently start? Any other switch on the same code having the same issues or not? Generally if five different devices all start having the same issue an external issue is to blame. Maybe your SNMP server is sending a particular packet that this IOS code doesn't like?
>
> Have you tried restarting SNMP itself on the switch?

Are these stacks of more than two switches?  And are they the original
3750Gs, or something else?

We have had recurring problems with a 4-stack of 3750-48Gs that, for
various reasons, ends up with MALLOC errors (out of memory), after which
you can no longer establish an SSH, Telnet, or even serial console
connection; every attempt returns "%Low on memory, try again later".

This started with the 12.2 train and has continued into the 15.x train.
We are NOT yet on the latest-and-greatest. As our account rep explained
it, the problem is a result of adding "bells and whistles" to the IOS
while these original 3750s are already memory constrained.  Supposedly
this was addressed in the most recent 15.x release, which is more
"conservative" about memory utilization.  However, our stack is
presently "stuck" in the "Low on memory, try again later" state and will
require a hard reload (power cycle).  Supposedly this only affects
stacks of > 2 switches.  The last time we power cycled the current
stack, it lasted about an hour before running out of memory again.
The switches continue to forward packets (thankfully) but you can't do
anything with them at all.  We plan an update to the latest 15.x release
at the next maintenance window, but since this stack powers one of our
primary server farms (top-of-racks), we can't just arbitrarily power
cycle them.
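
As an illustration only: with memory draining in about an hour after a
reload, it can help to estimate how long a stack has left before it
locks out management access. The Python sketch below fits a straight
line to free-memory readings and extrapolates to zero. It assumes you
already collect (time, free bytes) readings by some means while the
switch still answers; the function name and the sample values are
hypothetical, and a linear fit is only a rough model of a leak.

```python
from typing import Optional, Sequence, Tuple


def seconds_until_exhausted(
    samples: Sequence[Tuple[float, float]],
) -> Optional[float]:
    """Estimate seconds from the last sample until free memory hits zero.

    samples: (seconds_since_start, free_bytes) pairs, oldest first.
    Returns None when there is too little data or free memory is not
    shrinking. Hypothetical helper, not part of any Cisco tooling.
    """
    n = len(samples)
    if n < 2:
        return None

    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None  # all readings share one timestamp

    # Least-squares slope in bytes per second.
    slope = sum((t - mean_t) * (f - mean_f) for t, f in samples) / var_t
    if slope >= 0:
        return None  # free memory is flat or growing

    intercept = mean_f - slope * mean_t
    t_zero = -intercept / slope  # time at which the fitted line reaches 0
    return max(0.0, t_zero - samples[-1][0])


if __name__ == "__main__":
    # Made-up readings taken every 10 minutes after a reload.
    readings = [(0, 6000.0), (600, 5000.0), (1200, 4000.0)]
    remaining = seconds_until_exhausted(readings)
    print(f"estimated seconds left: {remaining:.0f}")
```

With the made-up readings above the estimate comes out to about 2400
seconds (40 minutes), which is the kind of early warning that would let
you schedule a reload rather than discover the lockout afterwards.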

TAC has been less than useful.  This started over a year ago, and it
seems to recur more often in the 15.x train.

If this sounds familiar, I can provide some case numbers of past
attempts to remedy this.  Previously a power cycle would clear it up for
a few months; on the 15.x train that is down to hours.

Jeff


