[c-nsp] Problems with 7500 router crashing

Ben Crocker ben at hamsterjam.net
Sat Jul 24 03:08:26 EDT 2004


Which commands did you use?

Did you try either of these?

ipc cache
ip cef linecard ipc memory

We had similar problems that went one step further: we were running
out of XDRs as well. These two commands seemed to fix our problems; I
think we set them to 5k and 10k respectively.
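
As a rough sketch, this is how those two settings would be entered. The
values are only the "5k and 10k" I recalled above, so treat them as
assumptions and check the units and permitted ranges for your IOS
release before applying anything:

```
! Global configuration on the RSP. The values below are assumptions
! based on the "5k and 10k" recollection, not tested recommendations.
configure terminal
 ipc cache 5000
 ip cef linecard ipc memory 10000
end
```

Afterwards, watch the log to see whether the %SYS-2-MALLOCFAIL and
%FIB-3-FIBDISABLE messages for the VIP stop appearing.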

I think it's actually not the size of your FIB that causes the problem
but the number of changes being made to it. CEF IPC (Inter-Process
Communication) is running out of memory because it has to deal with
more updates from the other processes than it can handle. If it didn't
disable CEF, you would have inconsistent routing information on the
VIP.


On Jul 22, 2004, at 4:52 PM, Olav Langeland wrote:

> Hi,
>
> we have a problem with one of our border routers apparently crashing
> randomly. Our setup is 2 uplinks with one Cisco 7513 for each uplink,
> doing full eBGP up and iBGP between the routers, nothing more fancy.
> They share a HSRP IP on the inside, so all traffic goes to router1
> before it either goes to router2 or internet. The hardware is/was 100%
> identical, yet router2 has been stable as a rock. We have had crashes
> with both IOS 12.2 and 12.3, so it doesn't seem version related.
>
> Router1 crashed about a year ago, didn't get any logs or find anything
> interesting when we got it back up. It has been stable since, until
> recently it crashed several times late one Friday so I ended up
> switching it off. We got some syslogs this time, bits of it included
> below. We borrowed a Cisco 7505 chassis and changed a RSP card, but
> noticed the router rebooted once a couple of days ago with more or less
> the same error. Most of the crashes have been simple reboots; other
> times the CPU went to 100%, forcing a reboot.
>
> Did some checking with the Output Interpreter, didn't help much. It
> listed some bug IDs that didn't seem related and also the always helpful
> "The failure was caused by a software defect. Note that this is a bus
> error crash and can also be hardware related." ...
> I found some pages on cisco.com regarding the %FIB-3-FIBDISABLE error,
> relates to IPC running out of memory. I did what one page suggested and
> increased the allocated memory, didn't help. Is all this caused by
> faulty hardware (replacing the VIP4-80 in slot1 with a VIP4-5 this
> weekend and switching back to 7513 chassis), hardware that can't cope
> with traffic, or an IOS configuration issue?
>
> Any hints appreciated!
>
> More info:
> --show ver--
> System returned to ROM by bus error at PC 0x4051C618, address 0x5A7
> --end--
>
> --show stacks--
> Stack trace from system failure:
> FP: 0x42B748B8, RA: 0x4051C618
> FP: 0x42B748F8, RA: 0x402F16EC
> FP: 0x42B74960, RA: 0x402E9034
> FP: 0x42B749A8, RA: 0x40272F50
> --end--
>
> --begin syslog--
> CET: %IPC-5-SLAVELOG: VIP-SLOT1:
> %SYS-2-MALLOCFAIL: Memory allocation of 65556 bytes failed from
> 0x60113CF8, alignment 16
> Pool: Processor  Free: 92976  Cause: Memory fragmentation
> Alternate Pool: None  Free: 0  Cause: No Alternate pool
> -Process= "CEF IPC Background", ipl= 2, pid= 36
> -Traceback= 60118D80 60119EF8 60113D00 603F9D10 603FA598 603FA820
> 603DC7B8 603E1388 603E1C9C 603E8510 603F5680 603EED94 603EF07C 603EF40C
> 603EFAA4
> %IPC-5-SLAVELOG: VIP-SLOT1:
> %SYS-2-MALLOCFAIL: Memory allocation of 65556 bytes failed from
> 0x60113CF8, alignment 16
> Pool: Processor  Free: 41916  Cause: Not enough free memory
> Alternate Pool: None  Free: 0  Cause: No Alternate pool
> -Process= "CEF IPC Background", ipl= 0, pid= 36
> -Traceback= 60118D80 60119EF8 60113D00 603F9C00 603DC380 603E1B98
> 603E8510 603F5680 603EED94 603EF07C 603EF40C 603EFAA4
> %FIB-3-FIBDISABLE: Fatal error, slot 1: no memory
> %IPC-5-SLAVELOG: VIP-SLOT1:
> %SYS-2-MALLOCFAIL: Memory allocation of 65556 bytes failed from
> 0x60113CF8, alignment 16
> Pool: Processor  Free: 82196  Cause: Memory fragmentation
> Alternate Pool: None  Free: 0  Cause: No Alternate pool
> -Process= "CEF IPC Background", ipl= 0, pid= 36
> --end--
>
> --show tech--
> ------------------ show controllers cbus ------------------
>   slot0: VIP4-50 RM5271, hw 2.02, sw 22.20, ccb F800FF10, cmdq E8000080,
> vps 8192
>     software loaded from system
>     IOS (tm) VIP Software (SVIP-DW-M), Version 12.3(5b), RELEASE
> SOFTWARE (fc1)
>     ROM Monitor version 103.0
>     POS0/0/0, applique is SONET
>       gfreeq E8000170, lfreeq E8000180 (4512 bytes)
>       rxlo 4, rxhi 81, rxcurr 4, maxrxcurr 43
>       txq E8001A00, txacc E8001A02 (value 80), txlimit 81
>     FastEthernet0/1/0, addr 0000.0c62.e308 (bia 0000.0c62.e308)
>       gfreeq E8000150, lfreeq E8000188 (1600 bytes)
>       rxlo 4, rxhi 530, rxcurr 0, maxrxcurr 0
>       txq E8001A08, txacc E8001A0A (value 0), txlimit 530
>   slot1: VIP4-80 RM7000, hw 2.01, sw 22.20, ccb F800FF20, cmdq E8000088,
> vps 8192
>     software loaded from system
>     IOS (tm) VIP Software (SVIP-DW-M), Version 12.3(5b), RELEASE
> SOFTWARE (fc1)
>     ROM Monitor version 103.0
>     GigabitEthernet1/0/0, addr 0000.0c62.e320 (bia 0000.0c62.e320)
>       gfreeq E8000150, lfreeq E8000190 (1600 bytes)
>       rxlo 4, rxhi 795, rxcurr 6, maxrxcurr 369
>       txq E8001A48, txacc E8001A4A (value 529), txlimit 530
> --end--
>
>
> /olav langeland
###############################################
"Beer is proof that god loves us and wants us to be happy"

Benjamin Franklin

ben at hamsterjam.net
http://www.hamsterjam.net

###############################################


