[outages] Daily packet loss/routing issue (Layer42/Verizon/Deutsche Telekom/Comcast)
Matthew Petach
mpetach at netflight.com
Sun Oct 27 02:21:01 EDT 2013
I've been hating on Comcast this past week, because
they keep losing connectivity to the outside world in
Palo Alto for 3-5 minutes at a time. When it happens,
I can traceroute towards places like Google and get
5 or 6 hops through Comcast, at which point it
consistently dies at 529bryant, like this:
mpetach@markbestgrew-lm:~% traceroute www.youtube.com
traceroute: Warning: www.youtube.com has multiple addresses; using 74.125.239.130
traceroute to youtube-ui.l.google.com (74.125.239.130), 64 hops max, 52 byte packets
1 rtr (172.16.0.1) 1.395 ms 0.958 ms 0.727 ms
2 76.102.48.1 (76.102.48.1) 113.310 ms 62.540 ms 58.273 ms
3 te-0-1-0-1-ur05.santaclara.ca.sfba.comcast.net (68.87.196.113) 9.965 ms 11.765 ms 9.556 ms
4 te-1-1-0-13-ar01.sfsutro.ca.sfba.comcast.net (69.139.199.106) 11.827 ms
te-1-1-0-12-ar01.sfsutro.ca.sfba.comcast.net (69.139.199.94) 15.437 ms
te-1-1-0-11-ar01.sfsutro.ca.sfba.comcast.net (69.139.199.90) 13.214 ms
5 he-1-6-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.90.157) 13.895 ms 18.774 ms 16.012 ms
6 pos-0-3-0-0-pe01.529bryant.ca.ibone.comcast.net (68.86.87.142) 15.177 ms 17.322 ms 15.498 ms
7 * * *
8 * * *
9 * * *
10 * * *
11 * * *
Running pings shows large-scale packet loss to Google:
Request timeout for icmp_seq 23270
Request timeout for icmp_seq 23271
Request timeout for icmp_seq 23272
64 bytes from 74.125.239.133: icmp_seq=23273 ttl=50 time=75.800 ms
64 bytes from 74.125.239.133: icmp_seq=23274 ttl=50 time=141.421 ms
64 bytes from 74.125.239.133: icmp_seq=23275 ttl=50 time=521.741 ms
64 bytes from 74.125.239.133: icmp_seq=23276 ttl=50 time=253.848 ms
64 bytes from 74.125.239.133: icmp_seq=23277 ttl=50 time=174.845 ms
64 bytes from 74.125.239.133: icmp_seq=23278 ttl=50 time=162.240 ms
64 bytes from 74.125.239.133: icmp_seq=23279 ttl=50 time=57.788 ms
64 bytes from 74.125.239.133: icmp_seq=23280 ttl=50 time=226.018 ms
64 bytes from 74.125.239.133: icmp_seq=23281 ttl=50 time=134.495 ms
64 bytes from 74.125.239.133: icmp_seq=23282 ttl=50 time=72.048 ms
64 bytes from 74.125.239.133: icmp_seq=23283 ttl=50 time=229.693 ms
64 bytes from 74.125.239.133: icmp_seq=23284 ttl=50 time=208.210 ms
64 bytes from 74.125.239.133: icmp_seq=23285 ttl=50 time=184.294 ms
64 bytes from 74.125.239.133: icmp_seq=23286 ttl=50 time=71.336 ms
64 bytes from 74.125.239.133: icmp_seq=23287 ttl=50 time=48.964 ms
64 bytes from 74.125.239.133: icmp_seq=23288 ttl=50 time=273.211 ms
64 bytes from 74.125.239.133: icmp_seq=23289 ttl=50 time=37.991 ms
64 bytes from 74.125.239.133: icmp_seq=23290 ttl=50 time=203.306 ms
64 bytes from 74.125.239.133: icmp_seq=23291 ttl=50 time=291.084 ms
64 bytes from 74.125.239.133: icmp_seq=23292 ttl=50 time=518.671 ms
64 bytes from 74.125.239.133: icmp_seq=23293 ttl=50 time=199.055 ms
^C
--- google.com ping statistics ---
23294 packets transmitted, 12496 packets received, 46.4% packet loss
round-trip min/avg/max/stddev = 32.792/120.653/3023.649/147.378 ms
mpetach@markbestgrew-lm:~%
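
For what it's worth, the loss figure in that summary can be re-derived
from the two counters. A minimal Python sketch, assuming the summary
line is in the BSD/macOS ping format shown above; the function name
loss_percent is made up for illustration:

```python
import re

# Summary line copied from the ping output above.
summary = "23294 packets transmitted, 12496 packets received, 46.4% packet loss"

def loss_percent(line):
    """Return (sent, received, loss %) computed from a ping summary line."""
    m = re.search(r"(\d+) packets transmitted, (\d+) packets received", line)
    if m is None:
        raise ValueError("not a ping summary line")
    sent, received = int(m.group(1)), int(m.group(2))
    return sent, received, 100.0 * (sent - received) / sent

sent, received, loss = loss_percent(summary)
print(f"{sent - received} of {sent} probes lost ({loss:.1f}%)")
```

That works out to 10798 of 23294 probes lost, matching the 46.4% that
ping reported.
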
I was hoping I wouldn't have to order another T1 to the
new house, but it looks like that might be my only
option for stable connectivity. :(
Matt
On Fri, Oct 25, 2013 at 8:49 PM, Jeremy Chadwick <jdc at koitsu.org> wrote:
> Here's another one of these incredibly annoying situations that baffle
> me (and of course figuring out who's responsible is not something I have
> the power to do).
>
> Been watching this one happen every day, usually between the hours of
> 1700 and 2000 PDT (UTC-0700). The packet loss and latency last anywhere
> from 30 to 75 minutes. From today around 1915:
>
> src IP: 204.8.213.80
> dst IP: 76.102.14.35
>
> Host                                            Loss%  Snt   Last    Avg  Best   Wrst  StDev
>  1. ---                                          0.0%  161   21.5    1.4   0.3   59.3    6.0
>  2. ---                                          0.0%  160    0.4    1.2   0.3   41.4    4.6
>  3. ---                                          0.0%  160    0.3    2.2   0.2   44.5    6.5
>  4. ---                                          0.0%  160    1.5    3.3   1.2   47.2    6.7
>  5. xe0-1-0-1.core1.sv8.layer42.net              0.0%  160    2.2    2.1   1.9    2.4    0.1
>  6. 193.159.165.73                               0.0%  160    4.9    4.1   1.9    8.1    1.3
>  7. 80.156.163.154                              24.5%  160  1000.  933.0   6.3  1738.  624.0
>  8. pos-1-12-0-0-cr01.sanjose.ca.ibone.comcast  27.7%  160  987.5  932.7   7.2  1706.  647.7
>  9. he-0-6-0-0-ar01.sfsutro.ca.sfba.comcast.ne  26.4%  160  981.4  917.1   7.2  1719.  642.9
> 10. te-0-4-0-10-ur05.santaclara.ca.sfba.comcas  30.8%  160  980.0  895.1   9.0  1710.  636.9
> 11. te-6-0-acr03.santaclara.ca.sfba.comcast.ne  17.0%  160  967.3  614.0   6.9  1224.  518.1
> 12. c-76-102-14-35.hsd1.ca.comcast.net          18.9%  160  970.2  609.1  16.2  1229.  517.8
>
>
> src IP: 76.102.14.35
> dst IP: 204.8.213.80
>
> Host                                             Loss%  Snt  Rcv   Last    Avg  Best   Wrst
>  1. gw.home.lan                                   0.0%  563  563    0.3    0.2   0.2    1.9
>  2. 76.102.12.1                                   0.0%  563  563    8.4    8.8   7.9   22.8
>  3. te-0-2-0-5-ur06.santaclara.ca.sfba.comcast.   0.0%  563  563    8.7    8.9   8.1   23.1
>  4. te-1-1-0-9-ar01.oakland.ca.sfba.comcast.net   0.0%  563  563   13.5   12.4   9.7   30.2
>  5. be-90-ar01.sfsutro.ca.sfba.comcast.net        0.0%  563  563   12.5   13.7  11.1   41.5
>  6. he-3-8-0-0-cr01.sanjose.ca.ibone.comcast.ne   0.0%  563  563   13.2   15.0  12.5   25.0
>  7. be-14-pe02.11greatoaks.ca.ibone.comcast.net   0.0%  562  562   16.5   16.5  15.6   43.1
>  8. 23-30-206-94-static.hfc.comcastbusiness.net   0.0%  562  562   16.4   22.5  15.6   40.2
>  9. 0.xe-2-0-8.XL3.SJC7.ALTER.NET                 0.7%  562  558   20.3   29.6  15.3  124.7
> 10. 0.so-4-0-0.XL1.SJC1.ALTER.NET                 1.1%  562  556   26.6   22.9  16.8  174.2
>     0.so-7-0-0.XL1.SJC1.ALTER.NET
> 11. POS1-0.XR1.SJC1.ALTER.NET                    23.0%  562  432  196.6  150.0  17.8  459.6
> 12. 193.ATM7-0.GW4.SJC1.ALTER.NET                24.6%  562  424  1133.  669.3  17.9  1835.
> 13. ???
>
> What interests me: hops #6 and #7 shown in the first mtr:
>
> 193.159.165.73 = AS3320 = Deutsche Telekom AG
> 80.156.163.154 = AS3320 = Deutsche Telekom AG
>
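
(Inline note on reading the first mtr report: the usual rule of thumb
is that loss at an intermediate hop only matters if it carries through
to the destination, since routers often rate-limit the ICMP replies
they generate themselves. A rough Python sketch of that check, with the
loss figures copied from the first report; the function name and the 1%
threshold are assumptions for illustration, not anything mtr provides.)

```python
# (hop number, host, loss %) copied from the first mtr report.
hops = [
    (1, "---", 0.0), (2, "---", 0.0), (3, "---", 0.0), (4, "---", 0.0),
    (5, "xe0-1-0-1.core1.sv8.layer42.net", 0.0),
    (6, "193.159.165.73", 0.0),
    (7, "80.156.163.154", 24.5),
    (8, "pos-1-12-0-0-cr01.sanjose.ca.ibone.comcast", 27.7),
    (9, "he-0-6-0-0-ar01.sfsutro.ca.sfba.comcast.ne", 26.4),
    (10, "te-0-4-0-10-ur05.santaclara.ca.sfba.comcas", 30.8),
    (11, "te-6-0-acr03.santaclara.ca.sfba.comcast.ne", 17.0),
    (12, "c-76-102-14-35.hsd1.ca.comcast.net", 18.9),
]

def first_persistent_loss(hops, threshold=1.0):
    """Return the first hop from which every later hop, including the
    destination, shows loss above threshold; None if there is none."""
    first = None
    for hop in hops:
        if hop[2] > threshold:
            if first is None:
                first = hop
        else:
            first = None  # loss did not carry through; discard earlier hops
    return first

print(first_persistent_loss(hops))
```

On these numbers it points at hop 7 (80.156.163.154), the first hop
where loss appears and never clears on the way to the destination.
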
> Using Layer42's own looking glass long after the issue was over:
>
> lg-west>show ip bgp 76.102.14.35
> BGP routing table entry for 76.96.0.0/11, version 4070745
> Paths: (1 available, best #1, table default)
> Not advertised to any peer
> Refresh Epoch 1
> 8121 3320 7922
> 69.36.239.8 from 69.36.239.8 (69.36.239.8)
> Origin IGP, metric 301, localpref 100, valid, external, best
> rx pathid: 0, tx pathid: 0x0
>
> lg-west>traceroute 76.102.14.35
> Type escape sequence to abort.
> Tracing the route to c-76-102-14-35.hsd1.ca.comcast.net (76.102.14.35)
> VRF info: (vrf in name/id, vrf out name/id)
> 1 69.36.229.177 [AS 8121] 1 msec 1 msec 1 msec
> 2 xe0-0-0-4.core1.sv8.layer42.net (65.50.198.161) [AS 8121] 3 msec 2 msec 2 msec
> 3 193.159.165.73 [AS 3320] 4 msec 2 msec 3 msec
> 4 80.156.163.154 [AS 3320] 4 msec 5 msec 3 msec
> 5 pos-1-12-0-0-cr01.sanjose.ca.ibone.comcast.net (68.86.87.141) [AS 7922] 7 msec 6 msec 4 msec
> 6 he-0-5-0-0-ar01.sfsutro.ca.sfba.comcast.net (68.86.91.46) [AS 7922] 6 msec 5 msec 8 msec
> 7 te-0-4-0-6-ur05.santaclara.ca.sfba.comcast.net (69.139.198.169) [AS 7922] 5 msec 6 msec 7 msec
> 8 te-6-0-acr03.santaclara.ca.sfba.comcast.net (68.86.249.66) [AS 7922] 5 msec 6 msec 5 msec
> 9 c-76-102-14-35.hsd1.ca.comcast.net (76.102.14.35) [AS 7922] 14 msec 18 msec 14 msec
>
> Two final things to note:
>
> 1. When the packet loss/latency issue is not occurring (e.g. presently),
> the paths are still the same. If packets were actually going from
> Sunnyvale (Layer42) to Germany, I'd expect latency consistently far
> higher than 4-7 ms.
>
> 2. The source/destination of 204.8.213.80 is a box at my place of work;
> the border of that network intentionally filters. We do peer with
> Layer42 directly, but I have not pursued this with them yet (as said
> initially, it's hard for me to accurately point fingers).
>
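
(Inline note on point 1: the latency argument can be put in rough
numbers. A back-of-the-envelope Python sketch, assuming Frankfurt as a
stand-in for "Germany", approximate city coordinates, a straight
great-circle fiber path, and light in fiber travelling at about
200,000 km/s. All of those are my assumptions, not measurements.)

```python
import math

def great_circle_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Haversine distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Approximate coordinates (assumed): Sunnyvale, CA and Frankfurt, DE.
SUNNYVALE = (37.37, -122.04)
FRANKFURT = (50.11, 8.68)

FIBER_KM_PER_S = 200_000.0  # roughly 2/3 of c; an approximation

distance_km = great_circle_km(*SUNNYVALE, *FRANKFURT)
min_rtt_ms = 2 * distance_km / FIBER_KM_PER_S * 1000.0

print(f"distance ~{distance_km:.0f} km, RTT floor ~{min_rtt_ms:.0f} ms")
```

Under those assumptions the round-trip floor is on the order of 90 ms,
and real paths are longer than the great circle. The 4-7 ms seen in the
looking-glass traceroute is consistent with the AS3320 hops sitting in
the Bay Area rather than in Germany.
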
> --
> | Jeremy Chadwick jdc at koitsu.org |
> | UNIX Systems Administrator http://jdc.koitsu.org/ |
> | Making life hard for others since 1977. PGP 4BD6C0CB |