[outages] Chicago-area Level3 to Amazon AWS EAST-1

Sarlas, George gsarlas at irhythmtech.com
Thu Sep 27 17:00:26 EDT 2012


Jeremy, 

Thank you for the traceroute link.  It was very informative.  To answer your re-route question, I manually entered some routes on my firewall to redirect traffic to my backup ISP's line.

That said, our service seems to be back to normal.  While I haven't gotten an update on the trouble ticket I have opened with my ISP, another person on this mailing list had reported that his (possibly related) issue had been resolved (a Level3 network issue in Washington).  So I tested again, and now I'm up and running.

Thanks to everyone for their input.


-george


-----Original Message-----
From: Jeremy Chadwick [mailto:jdc at koitsu.org] 
Sent: Thursday, September 27, 2012 2:06 PM
To: Sarlas, George
Cc: outages at outages.org
Subject: Re: [outages] Chicago-area Level3 to Amazon AWS EAST-1

George,

For the future: you need to provide traceroutes from both directions.
Most routing these days is asymmetric.  Heavily covered here:

http://www.nanog.org/meetings/nanog47/presentations/Sunday/RAS_Traceroute_N47_Sun.pdf

But now that you've "re-routed traffic" (I assume you simply de-announced and de-peered with your AS3356 peering point?  Or did you physically down an interface?), the reverse-path traceroute (from 75.101.163.221 to wherever you did the original TCP ping from) won't necessarily be the same as when you were being impacted.

TL;DR -- always provide traceroutes from both directions, and always perform this *before* making any routing/peering changes.
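
For anyone who wants to repeat the TCP-level check from each end before touching routes, a minimal sketch in Python might look like the following. This is not the tcping tool used in the original report, only a rough approximation of the same idea (time a TCP connect to port 443). The names tcp_probe, summarize and run_probes are made up for illustration, and the 2-second timeout and 1-second interval are assumptions.

```python
import socket
import time


def tcp_probe(host, port, timeout=2.0):
    """Time one TCP connect. Returns milliseconds, or None on failure."""
    start = time.perf_counter()
    try:
        # create_connection performs the full TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return (time.perf_counter() - start) * 1000.0
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return None


def summarize(results):
    """Summarize a list of probe results (ms floats, None = failed)."""
    ok = [r for r in results if r is not None]
    sent = len(results)
    failed = sent - len(ok)
    return {
        "sent": sent,
        "failed": failed,
        "loss_pct": 100.0 * failed / sent if sent else 0.0,
        "min_ms": min(ok) if ok else None,
        "max_ms": max(ok) if ok else None,
        # Average over successful connections only.
        "avg_ms": sum(ok) / len(ok) if ok else None,
    }


def run_probes(host, port, count=10, interval=1.0):
    """Send `count` probes, `interval` seconds apart, and summarize them."""
    results = []
    for _ in range(count):
        results.append(tcp_probe(host, port))
        time.sleep(interval)
    return summarize(results)
```

Something like run_probes("75.101.163.221", 443) would then be run from both sides. Note that summarize averages over successful connections only, so for the ten probes quoted below (7 ok, 3 failed) it would give roughly 29.8 ms, not the 20.884 ms the tool printed. That printed figure is lower than the tool's own minimum and looks like the sum divided by all ten probes.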

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Thu, Sep 27, 2012 at 06:50:34PM +0000, Sarlas, George wrote:
> Since approx. 4am CT, we've been seeing 15-30% dropped packets for traffic going from our Chicago Level3 connection to Amazon's EAST-1 data center.  Our San Francisco office (also using Level3) isn't experiencing a problem.  I've re-routed this traffic to our backup Internet connection (Airlogic) for now, and that traffic is getting through just fine.  Anyone else seeing this problem?  Ticket already opened with Level3, waiting to hear back.  Thank you.
> 
> Here's the route my traffic is taking (timeouts at the bottom are normal; ICMP gets blocked from that point on).
> 
> Tracing route to zioreports.com [75.101.163.221]
> over a maximum of 30 hops:
> 
>   1     1 ms     1 ms     1 ms  ge-6-2-107.car4.Chicago1.Level3.net [4.71.102.161]
>   2     *        *        *     Request timed out.
>   3    16 ms    22 ms    25 ms  ae-5-5.ebr2.Chicago2.Level3.net [4.69.140.194]
>   4    19 ms    24 ms    25 ms  ae-6-6.ebr2.Washington12.Level3.net [4.69.148.145]
>   5     *        *        *     Request timed out.
>   6    16 ms    18 ms    23 ms  ae-82-82.csw3.Washington1.Level3.net [4.69.134.154]
>   7    16 ms    16 ms    17 ms  ae-3-80.edge2.Washington1.Level3.net [4.69.149.142]
>   8    17 ms    17 ms    16 ms  AMAZON.COM.edge2.Washington1.Level3.net [4.79.22.74]
>   9     *        *        *     Request timed out.
>  10    17 ms    18 ms    17 ms  72.21.222.139
>  11    18 ms    18 ms    18 ms  216.182.224.17
>  12     *        *        *     Request timed out.
>  13     *        *        *     Request timed out.
>  14     *        *        *     Request timed out.
>  15     *        *        *     Request timed out.
>  16     *        *        *     Request timed out.
>  17     *        *        *     Request timed out.
>  18     *        *     ^C
> 
> Here are the results of a TCPing to port 443:
> 
> 2012:09:27 09:50:02 Probing 75.101.163.221:443/tcp - Port is open - time=20.926ms
> 2012:09:27 09:50:03 Probing 75.101.163.221:443/tcp - Port is open - time=31.168ms
> 2012:09:27 09:50:05 Probing 75.101.163.221:443/tcp - Socket is not connected (10057) - time=2012.341ms
> 2012:09:27 09:50:07 Probing 75.101.163.221:443/tcp - Port is open - time=31.358ms
> 2012:09:27 09:50:08 Probing 75.101.163.221:443/tcp - Socket is not connected (10057) - time=2012.448ms
> 2012:09:27 09:50:10 Probing 75.101.163.221:443/tcp - Port is open - time=30.985ms
> 2012:09:27 09:50:12 Probing 75.101.163.221:443/tcp - Port is open - time=31.224ms
> 2012:09:27 09:50:13 Probing 75.101.163.221:443/tcp - Port is open - time=32.065ms
> 2012:09:27 09:50:15 Probing 75.101.163.221:443/tcp - Socket is not connected (10057) - time=2012.375ms
> 2012:09:27 09:50:17 Probing 75.101.163.221:443/tcp - Port is open - time=31.115ms
> 
> Ping statistics for 75.101.163.221:443
>      10 probes sent.
>      7 successful, 3 failed.
> Approximate trip times in milli-seconds (successful connections only):
>     Minimum = 20.926ms, Maximum = 32.065ms, Average = 20.884ms
> 
> ----
> George Sarlas
> Manager, IT Operations
> iRhythm Technologies, Inc.
> 
> 2 Marriott Dr.
> Lincolnshire, IL 60069
> 
> email: gsarlas at irhythmtech.com
> phone: 224-543-4253
> 

> _______________________________________________
> Outages mailing list
> Outages at outages.org
> https://puck.nether.net/mailman/listinfo/outages






