[c-nsp] weird BGP stuff
Paul Stewart
paul at paulstewart.org
Tue Jun 22 19:36:38 EDT 2010
Hey folks...
I'm looking for a second set of eyes here ;) We have a pair of 7606 boxes that
have been handling hundreds of BGP sessions for a long time now with no
problems (well, performance, but I'll leave that alone).
We added a Juniper MX480 into the mix recently and now seem to have a
routing issue that I can't pinpoint.
Here's a quick rundown to get started: a remote site that is reachable
from other providers and, we're confident, should be reachable from us:
traceroute to 216.166.249.148 (216.166.249.148), 30 hops max, 40 byte packets
1 dis1-rtr-mb-vl10.nexicom.net (216.168.115.177) 0.468 ms 0.477 ms 0.543 ms
2 core2-rtr-to-ge4-12-vl4.nexicom.net (98.124.0.226) 8.803 ms 8.866 ms 8.941 ms
3 * * *
4 * * *
5 * * *
6 * * *
7 * * *
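For anyone comparing several traces like this one, here is a minimal Python sketch that picks out the last hop that actually answered. It assumes traceroute output with one hop per line in the shape shown above; the sample string, the regular expression, and the function name `last_responding_hop` are illustrative choices, not part of any tool.

```python
import re

# Sample text shaped like the traceroute above, one hop per line (an assumption).
SAMPLE = """\
1 dis1-rtr-mb-vl10.nexicom.net (216.168.115.177) 0.468 ms 0.477 ms 0.543 ms
2 core2-rtr-to-ge4-12-vl4.nexicom.net (98.124.0.226) 8.803 ms 8.866 ms 8.941 ms
3 * * *
4 * * *
"""

# Matches "<hop number> <hostname> (<IPv4 address>)"; lines of "* * *" do not match.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+) \((\d+\.\d+\.\d+\.\d+)\)", re.MULTILINE)


def last_responding_hop(text):
    """Return (hop_number, hostname, ip) of the last hop that replied, or None."""
    hops = [(int(n), name, ip) for n, name, ip in HOP_RE.findall(text)]
    return hops[-1] if hops else None


result = last_responding_hop(SAMPLE)
print(result)
```

The last responding hop only shows where replies stop. The drop itself may be at that router or at the one after it, so the result narrows the search without settling it.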
So dis1 is a 6500 and core2 in this case is one of the BGP-speaking 7606s I was
talking about. Traffic stops either at 98.124.0.226 or at the next hop; it's
unclear which. Using this destination as an example, I jump onto core2 and do a
lookup:
core2-rtr-to#sh ip bgp 216.166.249.148
BGP routing table entry for 216.166.248.0/21, version 315975
Paths: (2 available, best #1, table Default-IP-Routing-Table)
Advertised to update-groups:
11 13 17 18 19 22 23
6939 22561
209.51.163.145 from 98.124.59.17 (76.75.100.59)
Origin IGP, localpref 100, valid, internal, best
Community: 11666:1000 11666:1006
6939 22561
209.51.163.145 from 98.124.59.25 (76.75.100.59)
Origin IGP, localpref 100, valid, internal
Community: 11666:1000 11666:1006
You'll see two paths, both valid and both learned from an iBGP neighbour. The
advertising peer, 98.124.59.17, is valid and reachable; the BGP next hop shown
for both paths is 209.51.163.145.
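To keep the BGP next hop and the advertising peer apart when reading output like this, the following Python sketch parses the path lines and confirms that the destination falls inside the /21. It is an illustration only: the sample text is trimmed from the output above, and the regular expressions and the helper name `parse_paths` are assumptions, not part of any Cisco tooling.

```python
import ipaddress
import re

# Trimmed copy of the "sh ip bgp" output above, used as sample input.
SAMPLE = """\
BGP routing table entry for 216.166.248.0/21, version 315975
Paths: (2 available, best #1, table Default-IP-Routing-Table)
6939 22561
209.51.163.145 from 98.124.59.17 (76.75.100.59)
Origin IGP, localpref 100, valid, internal, best
6939 22561
209.51.163.145 from 98.124.59.25 (76.75.100.59)
Origin IGP, localpref 100, valid, internal
"""

# "<next hop> from <peer address> (<peer router ID>)"
PATH_RE = re.compile(
    r"^\s*(?P<next_hop>\d+\.\d+\.\d+\.\d+) from "
    r"(?P<peer>\d+\.\d+\.\d+\.\d+) \((?P<router_id>\d+\.\d+\.\d+\.\d+)\)",
    re.MULTILINE,
)
PREFIX_RE = re.compile(r"routing table entry for (\S+),")


def parse_paths(text):
    """Return (prefix, list of path dicts) from 'show ip bgp <addr>' style text."""
    prefix = ipaddress.ip_network(PREFIX_RE.search(text).group(1))
    paths = [m.groupdict() for m in PATH_RE.finditer(text)]
    return prefix, paths


prefix, paths = parse_paths(SAMPLE)
dest = ipaddress.ip_address("216.166.249.148")
print(f"{dest} in {prefix}: {dest in prefix}")
for p in paths:
    print(f"next hop {p['next_hop']} via peer {p['peer']} (router ID {p['router_id']})")
```

Both paths carry the same next hop, 209.51.163.145, learned from two different peer addresses with the same router ID. So the address to check for reachability from core2 is the next hop, in addition to the peers.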
If I run a traceroute directly on the core2 7606 box I get timeouts:
core2-rtr-to#traceroute 216.166.249.148
Type escape sequence to abort.
Tracing the route to 216-166-249-148.clec.peknil.commercial.madisonriver.net (216.166.249.148)
1 * * *
2 * *
Finally, on the MX480 where this transit provider connects, I run a traceroute
and it's perfect:
paul@core1.toronto1> traceroute 216.166.249.148
traceroute to 216.166.249.148 (216.166.249.148), 30 hops max, 40 byte packets
1 gige-g2-20.core1.tor1.he.net (209.51.163.145) 0.458 ms 0.401 ms 0.294 ms
2 10gigabitethernet1-2.core1.nyc5.he.net (72.52.92.165) 21.863 ms 22.573 ms 24.961 ms
3 10gigabitethernet1-4.core1.nyc1.he.net (72.52.92.153) 27.827 ms 18.939 ms 25.197 ms
4 198.32.160.19 (198.32.160.19) 16.381 ms 16.543 ms 16.427 ms
5 bb-nycmny83-jx9-02-ae0-0.core.centurytel.net (208.110.248.114) 27.572 ms 16.578 ms 16.591 ms
MPLS Label=521136 CoS=0 TTL=1 S=1
6 bb-chcgilwu-jx9-02-ae4-0.core.centurytel.net (208.110.248.69) 38.239 ms 38.107 ms 38.254 ms
MPLS Label=570289 CoS=0 TTL=1 S=1
7 bb-mrghmoqa-jx9-02-xe-1-1-0.core.lightcore.net (206.51.69.45) 60.820 ms 45.567 ms 45.416 ms
MPLS Label=656386 CoS=0 TTL=1 S=1
8 bb-peknilxd-jm1-01-ge-0-1-0-298.core.lightcore.net (206.51.69.238) 51.356 ms 51.256 ms 51.440 ms
9 peknil-coe-ci7507-01.grics.net (64.40.75.4) 54.189 ms 53.656 ms 54.102 ms
10 209-102-183-102.nworla.commercial.madisonriver.net (209.102.183.102) 63.918 ms 60.269 ms 60.593 ms
So why is it failing from the Cisco to the Juniper? I'm pulling my hair
(what I have left) out on this ... and it's only happening to a handful of
routes that we are aware of so far....
Thanks,
Paul