[outages] Cox -> nLayer connectivity issues
Jake Mertel
jake at nobistech.net
Wed Dec 19 20:22:08 EST 2012
I can confirm that things are looking much better from here as well. In total we had 2 client reports of related issues, and 1 client has confirmed that it has cleared. arin.net is now loading at the same speed through nLayer<->Cox<->ARIN as it does on other connections to which I have access.
-----Original Message-----
From: outages-bounces at outages.org [mailto:outages-bounces at outages.org] On Behalf Of Jeremy Chadwick
Sent: Wednesday, December 19, 2012 5:41 PM
To: Cary Wiedemann
Cc: outages at outages.org
Subject: Re: [outages] Cox -> nLayer connectivity issues
Likewise, if folks need a west-coast destination to test ICMP against (but not UDP or TCP), you can use mine: 206.125.172.42. The VPS provider (arpnetworks) peers with Mzima, which peers with nLayer.
Earlier I was noticing that packets destined to 192.149.252.75 would consistently (100% of the time) not elicit an ICMP time-exceeded response from Cox's router (hop #7 below), but packets destined to 192.149.252.76 did elicit ICMP time-exceeded.
As of a few minutes ago, that behaviour has changed. How things look right now:
$ traceroute -n -P icmp 192.149.252.75
traceroute to 192.149.252.75 (192.149.252.75), 64 hops max, 72 byte packets
1 206.125.172.41 10.156 ms 4.368 ms 1.405 ms
2 67.199.135.101 8.529 ms 0.638 ms 0.711 ms
3 69.174.121.74 6.212 ms 1.863 ms 1.847 ms
4 69.31.127.129 0.702 ms 0.696 ms 0.465 ms
5 69.31.127.138 2.067 ms 2.010 ms 1.954 ms
6 69.31.127.230 0.695 ms 0.772 ms 2.848 ms
7 68.1.1.5 70.971 ms 104.494 ms 76.750 ms
8 * * *
9 * * *
10 98.172.152.14 80.197 ms 72.742 ms 82.425 ms
11 192.149.252.131 72.426 ms 72.590 ms 82.698 ms
12 192.149.252.75 82.766 ms 73.078 ms 72.656 ms
$ traceroute -n -P icmp 192.149.252.76
traceroute to 192.149.252.76 (192.149.252.76), 64 hops max, 72 byte packets
1 206.125.172.41 4.025 ms 20.726 ms 4.288 ms
2 67.199.135.101 8.299 ms 0.667 ms 0.442 ms
3 69.174.121.74 4.435 ms 1.767 ms 1.699 ms
4 69.31.127.129 0.470 ms 0.710 ms 0.490 ms
5 69.31.127.138 1.940 ms 1.996 ms 1.860 ms
6 69.31.127.230 0.675 ms 0.719 ms 0.714 ms
7 68.1.1.7 70.829 ms 70.958 ms 70.773 ms
8 * * *
9 * * *
10 98.172.152.14 72.634 ms 105.463 ms 89.318 ms
11 192.149.252.131 82.523 ms 72.406 ms 72.402 ms
12 192.149.252.76 82.755 ms 73.740 ms 72.935 ms
How they looked before (for packets destined to 192.149.252.75), and again, this was 100% reproducible (skipping right to TTL 6):
$ traceroute -n -f 6 -P icmp 192.149.252.75
traceroute to 192.149.252.75 (192.149.252.75), 64 hops max, 72 byte packets
6 69.31.127.230 0.759 ms 0.785 ms 0.717 ms
7 * * *
8 * * *
9 * * *
10 98.172.152.14 78.109 ms 74.391 ms 81.120 ms
^C
I can only speculate about what transpired there (possibly some device whose load-balancing hash algorithm was misbehaving?), and maybe that's related to the problem being reported in this thread.
Unsure.
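
To illustrate the kind of failure being speculated about, here is a toy sketch (not a description of how Cox's or nLayer's equipment actually works). If a device spreads flows across parallel links by hashing the source/destination pair, and one member link silently drops traffic, then some destination addresses fail 100% of the time while neighbouring addresses work fine. The hash (CRC32), the link count, the choice of bad link, and the source address 192.0.2.10 are all assumptions invented for the example.

```python
import ipaddress
import zlib


def pick_link(src: str, dst: str, n_links: int) -> int:
    """Toy per-flow hash: the same (src, dst) pair always selects the same link.

    CRC32 is a stand-in chosen for the example; real devices use their own hashes.
    """
    key = ipaddress.ip_address(src).packed + ipaddress.ip_address(dst).packed
    return zlib.crc32(key) % n_links


def reachable(src: str, dst: str, n_links: int, bad_links: set) -> bool:
    """A flow gets through only if its hash does not land on a broken link."""
    return pick_link(src, dst, n_links) not in bad_links


def survey(src: str, dsts: list, n_links: int, bad_links: set) -> dict:
    """Map each destination to True (works) or False (consistently dropped)."""
    return {dst: reachable(src, dst, n_links, bad_links) for dst in dsts}


if __name__ == "__main__":
    SRC = "192.0.2.10"  # hypothetical source (documentation address range)
    DSTS = ["192.149.252.75", "192.149.252.76"]  # targets from the traces above
    N_LINKS = 4  # assumed number of parallel links
    # Assume the link that .75 happens to hash onto is the broken one.
    BAD = {pick_link(SRC, DSTS[0], N_LINKS)}
    for dst, ok in survey(SRC, DSTS, N_LINKS, BAD).items():
        print(dst, "ok" if ok else "dropped every time")
```

Because the hash is deterministic, retrying never helps a flow that lands on the bad link, which would match the "100% reproducible" behaviour for one address while an adjacent address stays clean. Whether that is what really happened here is unknown.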
--
| Jeremy Chadwick jdc at koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |
On Wed, Dec 19, 2012 at 07:17:53PM -0500, Cary Wiedemann wrote:
> All,
>
> I knew I should have checked this list before opening tickets far and
> wide. I've been experiencing this issue since before 5:30pm EST and
> just wanted to report that ICMP *IS* affected for me, but only for
> certain IP addresses. TCP seems to be intermittently affected.
>
> I host a server at InfoRelay with network 69.169.88.16/28. From a Cox
> Communications optical internet circuit I can ping 69.169.88.20 and
> .21, but not .22, .23, or .24.
>
> chantilly-asa# ping 69.169.88.20
> Type escape sequence to abort.
> Sending 5, 100-byte ICMP Echos to 69.169.88.20, timeout is 2 seconds:
> !!!!!
> Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/10 ms
>
> chantilly-asa# ping 69.169.88.21
> Type escape sequence to abort.
> Sending 5, 100-byte ICMP Echos to 69.169.88.21, timeout is 2 seconds:
> !!!!!
> Success rate is 100 percent (5/5), round-trip min/avg/max = 1/2/10 ms
>
> chantilly-asa# ping 69.169.88.22
> Type escape sequence to abort.
> Sending 5, 100-byte ICMP Echos to 69.169.88.22, timeout is 2 seconds:
> ?????
> Success rate is 0 percent (0/5)
>
> chantilly-asa# ping 69.169.88.23
> Type escape sequence to abort.
> Sending 5, 100-byte ICMP Echos to 69.169.88.23, timeout is 2 seconds:
> ?????
> Success rate is 0 percent (0/5)
>
> A good trace looks like this (first hop obscured):
>
> C:\>tracert 69.169.88.20
>
> Tracing route to schneller.carywiedemann.com [69.169.88.20] over a
> maximum of 30 hops:
>
> 1 1 ms 1 ms 1 ms wsip-174-000-000-000-dc.dc.cox.net [174.000.000.000]
> 2 1 ms 1 ms 1 ms mrfddsrj01gex070003.rd.dc.cox.net [68.100.0.141]
> 3 2 ms 2 ms 2 ms 68.1.4.139
> 4 11 ms 6 ms 4 ms xe-5-0-7.ar1.iad1.us.nlayer.net [69.31.10.81]
> 5 203 ms 204 ms 209 ms as33597.xe-3-0-5-304.ar1.iad1.us.nlayer.net [69.31.10.70]
> 6 3 ms 3 ms 3 ms cr1.iad1.inforelay.net [66.231.176.9]
> 7 4 ms 5 ms 3 ms cr1.iad4.inforelay.net [66.231.177.66]
> 8 3 ms 3 ms 3 ms schneller.carywiedemann.com [69.169.88.20]
>
> While a trace to 69.169.88.22 dies after hop 3:
> C:\>tracert 69.169.88.22
>
> Tracing route to fairfaxunderground.com [69.169.88.22] over a maximum
> of 30 hops:
>
> 1 1 ms 1 ms 1 ms wsip-174-000-000-000.dc.dc.cox.net [174.000.000.000]
> 2 1 ms 1 ms 1 ms mrfddsrj01gex070003.rd.dc.cox.net [68.100.0.141]
> 3 * 2 ms 2 ms 68.1.4.139
> 4 * * * Request timed out.
> 5 * * * Request timed out.
> 6 * * * Request timed out.
> 7 * * * Request timed out.
>
> Although TCP connections still work, they're highly intermittent. I
> haven't had a successful ICMP echo reply from 69.169.88.22 or
> 69.169.88.23 from a Cox connection via nLayer for several hours.
>
> I'm asking both Cox and InfoRelay to depeer from nLayer.
>
> Feel free to use my server as an ICMP target.
>
> - Cary
>
> On Wed, Dec 19, 2012 at 6:53 PM, Corey Quinn <corey at sequestered.net> wrote:
>
> > Also in LA here.
> >
> > traceroute to arin.net (192.149.252.76), 30 hops max, 60 byte packets
> > 1 10.201.69.1 (10.201.69.1) 0.227 ms 0.233 ms 0.232 ms
> > 2 * * *
> > 3 xe-7-2-0.mpr1.lax112.us.above.net (64.125.170.97) 1.424 ms 1.396 ms 1.366 ms
> > 4 above-cox-1.lax12.us.above.net (64.125.13.10) 1.331 ms above-cox-2.lax12.us.above.net (64.125.13.14) 1.411 ms above-cox-1.lax12.us.above.net (64.125.13.10) 1.386 ms
> > 5 mrfddsrj02-ae0.0.rd.dc.cox.net (68.1.1.7) 67.585 ms mrfddsrj01-ae0.0.rd.dc.cox.net (68.1.1.5) 67.667 ms 67.783 ms
> > 6 * * *
> > 7 * * *
> > 8 wsip-98-172-152-14.dc.dc.cox.net (98.172.152.14) 79.017 ms 69.115 ms 69.117 ms
> > 9 * * *
> > 10 * * *
> > 11 * * *
> >
> >
> >
> > On Dec 19, 2012, at 3:50 PM, Jake Mertel <jake at nobistech.net> wrote:
> >
> > Something else that just clicked: I have been having a number of
> > issues reaching arin.net today from one of my servers in Los Angeles
> > that uses nLayer as its upstream. Request response times are between
> > 20 and 40 seconds, as opposed to 2 to 4 seconds on our office
> > connection. Looking at my trace from LA, we are going
> > LA<->Cox<->ARIN.
> >
> > C:\Users\jake>tracert arin.net
> >
> > Tracing route to arin.net [192.149.252.76] over a maximum of 30 hops:
> >
> > 1 <1 ms 1 ms <1 ms v403.er01.lax.ubiquity.io [72.37.224.129]
> > 2 1 ms 5 ms 1 ms xe-1-0-3.ar1.lax2.us.nlayer.net [69.31.127.45]
> > 3 <1 ms <1 ms <1 ms ae1-80g.cr1.lax1.us.nlayer.net [69.31.127.129]
> > 4 2 ms 5 ms 2 ms ae2-50g.ar1.lax1.us.nlayer.net [69.31.127.142]
> > 5 <1 ms <1 ms <1 ms as22773.ae12.ar1.lax1.us.nlayer.net [69.31.127.230]
> > 6 70 ms 111 ms 70 ms mrfddsrj01-ae0.0.rd.dc.cox.net [68.1.1.5]
> > 7 * * * Request timed out.
> > 8 * * * Request timed out.
> > 9 72 ms 73 ms 82 ms wsip-98-172-152-14.dc.dc.cox.net [98.172.152.14]
> > 10 72 ms 72 ms 82 ms host-252-131.arin.net [192.149.252.131]
> > 11 * * * Request timed out.
> > 12 * * * Request timed out.
> > 13 * * * Request timed out.
> > 14 * * * Request timed out.
> > 15 * * * Request timed out.
> >
> >
> > *From:* outages-bounces at outages.org
> > [mailto:outages-bounces at outages.org] *On Behalf Of *Jake Mertel
> > *Sent:* Wednesday, December 19, 2012 4:45 PM
> > *To:* 'Brandon Whaley'; 'outages at outages.org'
> > *Subject:* Re: [outages] Cox -> nLayer connectivity issues
> >
> > We have received a report of similar issues today. The client has
> > servers with us in several locations where we use nLayer and/or
> > PacketExchange, and his monitoring system is on a network that uses
> > Cox as its preferred upstream. He shut down his Cox upstream and
> > didn't have any issues reaching the servers over his backup
> > provider. The issues were sporadic and did not affect all protocols:
> > ICMP pings worked and snmpwalk was fine, but UDP traces were dying
> > somewhere on the reverse path. This seems very similar to what you
> > are seeing.
> >
> > *From:* outages-bounces at outages.org
> > [mailto:outages-bounces at outages.org] *On Behalf Of *Brandon Whaley
> > *Sent:* Wednesday, December 19, 2012 4:25 PM
> > *To:* outages at outages.org
> > *Subject:* [outages] Cox -> nLayer connectivity issues
> >
> > We've been seeing intermittent TCP/UDP connectivity issues from Cox
> > Communications in Virginia to any location that routes over nLayer.
> > UDP traceroutes are fine, but DNS lookups time out for minutes at a
> > time, then work again for ~5 minutes before repeating the problem.
> > ICMP is never affected during the outages.
> >
> > traceroute to 198.46.80.1 (198.46.80.1), 30 hops max, 60 byte packets
> > 1 router36f24c.local (192.168.14.1) 0.560 ms 0.531 ms 0.753 ms
> > 2 wsip-174-77-92-169.hr.hr.cox.net (174.77.92.169) 2.532 ms 2.581 ms 3.009 ms
> > 3 172.21.224.153 (172.21.224.153) 3.995 ms 4.067 ms 4.111 ms
> > 4 172.21.249.101 (172.21.249.101) 4.185 ms 4.396 ms 4.561 ms
> > 5 172.21.249.73 (172.21.249.73) 4.916 ms 4.900 ms 5.128 ms
> > 6 172.21.249.18 (172.21.249.18) 5.517 ms 5.124 ms 5.043 ms
> > 7 ip-216-54-33-22.coxfiber.net (216.54.33.22) 210.486 ms 210.477 ms 210.466 ms
> > 8 68.1.4.139 (68.1.4.139) 221.950 ms 222.460 ms 232.268 ms
> > 9 * xe-5-0-7.ar1.iad1.us.nlayer.net (69.31.10.81) 226.332 ms 226.866 ms
> > 10 as54641.xe-9-0-1.ar1.iad1.us.nlayer.net (69.31.31.42) 223.718 ms 225.083 ms 225.744 ms
> > 11 198.46.80.1 (198.46.80.1) 225.706 ms 226.670 ms 226.698 ms
> >
> > Is anyone from Cox on the list who can investigate/contact me?
> >
> > --
> > Best Regards,
> > Brandon W.
> > _______________________________________________
> > Outages mailing list
> > Outages at outages.org
> > https://puck.nether.net/mailman/listinfo/outages