[nsp] sudden OSPF failures

jlewis at lewis.org
Tue Sep 23 17:10:35 EDT 2003


On our network of roughly 40 routers and 50 access-servers (all Cisco), OSPF
for one of our smaller remote POPs began breaking down last night.  This
POP has existed for years in its own OSPF area, connected via frame relay
to a router in area 0.

The POP consists of a 2501 and 4 AS5248's.  The 5200's are in area 1000.
The 2501 has its serial0.1 in area 0, connecting back to one of our core
routers (a 7206 running 12.2(14)S3), and has e0 in area 1000.

The 2501 was running 11.0(22), but I upgraded it to 12.0(27) last night
while troubleshooting this.  The 5248's run 11.3(11)aAA.  Each AS5248 has
a dial pool of a /27 and /28 (just enough IPs for the channelized T1's on
each box).  Last night, for unknown reasons, the dial pool routes for
AS5248-2 started flapping, even when watched from the POP's 2501.  sh ip
ro blah (for either of AS5248-2's dial pools) on the 2501 would
alternately show the /27|/28 routes, or the /17 they're part of coming
from our core.  When the /27|/28 routes were active, they might last a few
seconds to a minute or so...then disappear for a similar time.  If I shut
down the T1 controllers on AS5248-2, routing instability ceased and the
/27|/28 routes propagated properly.  When unshut, as soon as dialup users
started hitting AS5248-2, the instability returned.  I didn't see any
significant ethernet errors, but as we were using a very old 1912 switch,
we swapped it out for a 2924xl.  That made no difference.
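
For anyone following along, the alternation between the /27|/28 and the /17
is what longest-prefix matching would be expected to produce when the
more-specific routes come and go.  Below is a minimal Python sketch of that
lookup.  The prefixes are made up (the real pools and the real /17 are not
given here), and best_route is a hypothetical helper, not anything IOS
exposes.

```python
import ipaddress


def best_route(table, dest):
    """Return the most specific prefix in `table` that covers `dest`, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in table if addr in net]
    # Longest prefix wins: a /27 or /28 beats the covering /17 while it exists.
    return max(matches, key=lambda net: net.prefixlen, default=None)


# Hypothetical addressing, chosen only for illustration.
aggregate = ipaddress.ip_network("10.1.0.0/17")   # covering route from the core
pool_27 = ipaddress.ip_network("10.1.5.0/27")     # one dial pool on the access server
pool_28 = ipaddress.ip_network("10.1.5.32/28")    # the other dial pool

stable = [aggregate, pool_27, pool_28]   # more-specifics present
flapped = [aggregate]                    # more-specifics withdrawn

print(best_route(stable, "10.1.5.10"))   # 10.1.5.0/27
print(best_route(flapped, "10.1.5.10"))  # 10.1.0.0/17
```

With the more-specifics withdrawn, the same lookup falls back to the /17
learned from the core, which would match what the route lookups on the 2501
were showing during a flap.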

I thought this could perhaps be due to the size of our OSPF routing table
(~1300 routes), so I converted the remote POP into an NSSA area, reducing
the number of routes seen in that POP to around 150.  That seemed to work
for a while (30-60m), but then random routes from the POP ceased to
propagate beyond the core router that POP's 2501 connects to.

At the moment, I seem to have things working again, with the remote POP 
ethernet in its own NSSA area, the 2501's s0.1 in area 0, and all the 
dialup pool subnets static routed to each 5248 from the POP's 2501.

Anyone seen stuff like this before?  We've got several other small POPs 
consisting of similar setups (2501's and AS5248's) not having this 
problem.  Most of our network is in area 0...only a few POPs are split off 
in their own areas.

----------------------------------------------------------------------
 Jon Lewis *jlewis at lewis.org*|  I route
 Senior Network Engineer     |  therefore you are
 Atlantic Net                |  
_________ http://www.lewis.org/~jlewis/pgp for PGP public key_________