[c-nsp] weird BGP stuff

Rodney Dunn rodunn at cisco.com
Tue Jun 22 22:45:58 EDT 2010


Often it's the return path that's failing. When you run the traceroute 
from another device you change the source address, so that source may 
have a valid return path even though the original one does not.
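
As a rough illustration of why the source matters, here is a minimal 
Python sketch. The far-end routing table, the /30 and the .146 address 
are all hypothetical; it only shows that the reply is routed on the 
probe's source address, so two sources probing the same destination can 
get different results.

```python
import ipaddress

# Hypothetical routing table on the far end: prefix -> label.
# Assumption for illustration only: the remote side has a route back to
# the transit link but nothing covering the 7606's source address.
REMOTE_ROUTES = {
    ipaddress.ip_network("209.51.163.144/30"): "via-transit",
}

def return_path(source_ip, routes=REMOTE_ROUTES):
    """Longest-prefix match on the probe's source (the reply's destination)."""
    addr = ipaddress.ip_address(source_ip)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None  # no route back: replies are lost, traceroute shows * * *
    best = max(matches, key=lambda net: net.prefixlen)
    return routes[best]

# 98.124.0.226 is core2 in the thread; 209.51.163.146 is a made-up
# address standing in for the MX480 side of the transit link.
for src in ("98.124.0.226", "209.51.163.146"):
    print(src, "->", return_path(src) or "no return route")
```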

I'm not sure what code you're running, but you could look at the Mini 
Protocol Analyzer, which gives you an inline capture so you can watch 
for the forward and reverse packets:

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SX/configuration/guide/mpa.html

You mentioned equal cost paths, so you can look at 'sh mls cef 
exact-route srcip dstip' to see which path the hardware would forward down.
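
To make the equal-cost point concrete: with per-flow load sharing the 
path is picked from a hash of the flow's addresses, so one source and 
destination pair can consistently land on a broken path while another 
pair works. The Python sketch below is only an illustration. The CRC32 
hash is an assumption and is not the algorithm the hardware uses, and 
the two iBGP neighbour addresses serve purely as path labels, which is 
why the exact-route output is the thing to trust on the box.

```python
import zlib

# Hypothetical equal-cost paths, labelled with the two iBGP neighbour
# addresses from the thread (labels only, not confirmed next hops).
PATHS = ["98.124.59.17", "98.124.59.25"]

def pick_path(src_ip, dst_ip, paths=PATHS):
    """Pick one path per (src, dst) flow; the same flow always maps to the same path."""
    key = f"{src_ip}->{dst_ip}".encode()
    return paths[zlib.crc32(key) % len(paths)]

dst = "216.166.249.148"
for src in ("216.168.115.177", "98.124.0.226"):
    print(src, "->", dst, "uses", pick_path(src, dst))
```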

You then need to check the downstream neighbor with some form of packet 
capture (NetFlow, ACLs, etc.) to find out where it's truly lost.

Rodney



On 6/22/10 7:36 PM, Paul Stewart wrote:
> Hey folks...
>
>
>
> I'm looking for a second set of eyes here ;)  Have a pair of 7606 boxes that
> have been handling 100's of BGP sessions for a long time now with no
> problems (well, performance but I'll leave that alone).
>
>
>
> We added a Juniper MX480 into the mix recently and now seem to be having a
> routing issue that I can't seem to pinpoint where it's occurring.
>
>
>
> Here's a quick rundown to get started of a remote site that is reachable
> from other providers and should be reachable from us, we're confident:
>
>
>
> traceroute to 216.166.249.148 (216.166.249.148), 30 hops max, 40 byte
> packets
>
>   1  dis1-rtr-mb-vl10.nexicom.net (216.168.115.177)  0.468 ms  0.477 ms
> 0.543 ms
>
>   2  core2-rtr-to-ge4-12-vl4.nexicom.net (98.124.0.226)  8.803 ms  8.866 ms
> 8.941 ms
>
>   3  * * *
>
>   4  * * *
>
>   5  * * *
>
>   6  * * *
>
>   7  * * *
>
>
>
> So dis1 is a 6500 and core2 in this case is one of the BGP-speaking 7606s I
> was talking about.  Traffic just stops at 98.124.0.226 or at the next hop -
> it's unclear which.  Using this destination as an example, I jump onto core2
> and do a lookup:
>
>
>
> core2-rtr-to#sh ip bgp 216.166.249.148
>
> BGP routing table entry for 216.166.248.0/21, version 315975
>
> Paths: (2 available, best #1, table Default-IP-Routing-Table)
>
>    Advertised to update-groups:
>
>       11         13         17         18         19         22         23
>
>    6939 22561
>
>      209.51.163.145 from 98.124.59.17 (76.75.100.59)
>
>        Origin IGP, localpref 100, valid, internal, best
>
>        Community: 11666:1000 11666:1006
>
>    6939 22561
>
>      209.51.163.145 from 98.124.59.25 (76.75.100.59)
>
>        Origin IGP, localpref 100, valid, internal
>
>        Community: 11666:1000 11666:1006
>
>
>
> You'll see two paths, both valid and both from an iBGP neighbour.  The next
> hop of 98.124.59.17 is valid and reachable.
>
>
>
> If I run a traceroute directly on the core2 7606 box I get timeouts:
>
>
>
> core2-rtr-to#traceroute 216.166.249.148
>
>
>
> Type escape sequence to abort.
>
> Tracing the route to 216-166-249-148.clec.peknil.commercial.madisonriver.net
> (216.166.249.148)
>
>
>
>    1  *  *  *
>
>    2  *  *
>
>
>
> Finally, on the MX480 where this transit provider connects, I do a
> traceroute and it's perfect:
>
>
>
> paul@core1.toronto1> traceroute 216.166.249.148
>
> traceroute to 216.166.249.148 (216.166.249.148), 30 hops max, 40 byte
> packets
>
>   1  gige-g2-20.core1.tor1.he.net (209.51.163.145)  0.458 ms  0.401 ms  0.294
> ms
>
>   2  10gigabitethernet1-2.core1.nyc5.he.net (72.52.92.165)  21.863 ms  22.573
> ms  24.961 ms
>
>   3  10gigabitethernet1-4.core1.nyc1.he.net (72.52.92.153)  27.827 ms  18.939
> ms  25.197 ms
>
>   4  198.32.160.19 (198.32.160.19)  16.381 ms  16.543 ms  16.427 ms
>
>   5  bb-nycmny83-jx9-02-ae0-0.core.centurytel.net (208.110.248.114)  27.572
> ms  16.578 ms  16.591 ms
>
>       MPLS Label=521136 CoS=0 TTL=1 S=1
>
>   6  bb-chcgilwu-jx9-02-ae4-0.core.centurytel.net (208.110.248.69)  38.239 ms
> 38.107 ms  38.254 ms
>
>       MPLS Label=570289 CoS=0 TTL=1 S=1
>
>   7  bb-mrghmoqa-jx9-02-xe-1-1-0.core.lightcore.net (206.51.69.45)  60.820 ms
> 45.567 ms  45.416 ms
>
>       MPLS Label=656386 CoS=0 TTL=1 S=1
>
>   8  bb-peknilxd-jm1-01-ge-0-1-0-298.core.lightcore.net (206.51.69.238)
> 51.356 ms  51.256 ms  51.440 ms
>
>   9  peknil-coe-ci7507-01.grics.net (64.40.75.4)  54.189 ms  53.656 ms
> 54.102 ms
>
> 10  209-102-183-102.nworla.commercial.madisonriver.net (209.102.183.102)
> 63.918 ms  60.269 ms  60.593 ms
>
>
>
>
>
> So why is it failing from the Cisco to the Juniper?  I'm pulling my hair
> (what I have left) out on this ... and it's only happening to a handful of
> routes that we are aware of so far....
>
>
>
> Thanks,
>
>
>
> Paul
>
>
>
> _______________________________________________
> cisco-nsp mailing list  cisco-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-nsp
> archive at http://puck.nether.net/pipermail/cisco-nsp/

