[outages] softlayer issues - monitis ?

Charles Sprickman spork at bway.net
Wed Aug 5 13:44:35 EDT 2015


On Aug 5, 2015, at 10:59 AM, James Milko <jmilko at bandwidth.com> wrote:

> What is everyone's network path to Softlayer?  We just had an issue where traffic in one direction, from one network would fail.  The really interesting bit is that our SIP traffic was failing but ICMP worked fine.

NAC was the only place where I saw ICMP failing to get through. Yes, that was odd: nagios alerted on host down, but http and other checks were OK, and traceroute still showed a response on the last hop. From NAC the path is pretty simple, since NAC and SL peer with each other:

 3  0.e1-1.tbr1.mmu.nac.net (209.123.10.18)  0.352 ms  0.324 ms  0.358 ms
 4  0.e1-2.tbr1.tl9.nac.net (209.123.10.130)  1.356 ms  1.312 ms  10.229 ms
 5  0.e2-1.pr2.tl9.nac.net (209.123.11.142)  1.101 ms  1.196 ms  1.104 ms
 6  te1-7.bbr01.tl01.nyc01.networklayer.com (198.32.160.27)  1.354 ms  1.446 ms  1.478 ms
 7  ae7.bbr02.tl01.nyc01.networklayer.com (173.192.18.177)  1.731 ms  1.696 ms  1.859 ms
 8  ae1.bbr01.eq01.chi01.networklayer.com (173.192.18.132)  22.718 ms  21.803 ms  22.602 ms
 9  ae20.bbr01.eq01.dal03.networklayer.com (173.192.18.136)  40.673 ms  40.780 ms  39.721 ms
10  ae5.dar01.dal05.networklayer.com (173.192.18.215)  40.550 ms  40.576 ms  40.515 ms
11  po1.fcr02.sr02.dal05.networklayer.com (173.192.118.139)  40.958 ms  40.768 ms  41.833 ms

Same in both directions.
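
For anyone who wants to compare forward and reverse traces like the one above without eyeballing them, here is a minimal Python sketch. It parses Unix-style traceroute hop lines and collapses them into the ordered list of networks crossed. The names parse_hops and networks are made up for illustration, and treating the last two hostname labels as "the network" is an assumption and only a rough heuristic. It compares networks rather than IPs because each direction normally shows different router interfaces.

```python
import re

# One hop line of Unix traceroute output: hop number, hostname, (IP), RTTs.
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)\s+(.*)$")
RTT_RE = re.compile(r"([\d.]+)\s*ms")

# Hops 3-11 of the NAC -> SoftLayer trace quoted above.
SAMPLE = """\
 3  0.e1-1.tbr1.mmu.nac.net (209.123.10.18)  0.352 ms  0.324 ms  0.358 ms
 4  0.e1-2.tbr1.tl9.nac.net (209.123.10.130)  1.356 ms  1.312 ms  10.229 ms
 5  0.e2-1.pr2.tl9.nac.net (209.123.11.142)  1.101 ms  1.196 ms  1.104 ms
 6  te1-7.bbr01.tl01.nyc01.networklayer.com (198.32.160.27)  1.354 ms  1.446 ms  1.478 ms
 7  ae7.bbr02.tl01.nyc01.networklayer.com (173.192.18.177)  1.731 ms  1.696 ms  1.859 ms
 8  ae1.bbr01.eq01.chi01.networklayer.com (173.192.18.132)  22.718 ms  21.803 ms  22.602 ms
 9  ae20.bbr01.eq01.dal03.networklayer.com (173.192.18.136)  40.673 ms  40.780 ms  39.721 ms
10  ae5.dar01.dal05.networklayer.com (173.192.18.215)  40.550 ms  40.576 ms  40.515 ms
11  po1.fcr02.sr02.dal05.networklayer.com (173.192.118.139)  40.958 ms  40.768 ms  41.833 ms
"""


def parse_hops(text):
    """Return a list of (hop, hostname, ip, [rtt_ms, ...]) tuples.

    Header lines and hops that only show '* * *' do not match and are skipped.
    """
    hops = []
    for line in text.splitlines():
        m = HOP_RE.match(line)
        if not m:
            continue
        rtts = [float(x) for x in RTT_RE.findall(m.group(4))]
        hops.append((int(m.group(1)), m.group(2), m.group(3), rtts))
    return hops


def networks(hops):
    """Collapse hops into the ordered list of networks they belong to.

    Assumption: the last two labels of the reverse-DNS name identify the
    operator (e.g. 'nac.net'). Hops without usable rDNS will be mislabeled.
    """
    seen = []
    for _, host, _, _ in hops:
        domain = ".".join(host.split(".")[-2:])
        if not seen or seen[-1] != domain:
            seen.append(domain)
    return seen


if __name__ == "__main__":
    print(networks(parse_hops(SAMPLE)))
```

To check symmetry, run the same thing on the reverse trace and compare networks(forward) with networks(reverse)[::-1]. Hops that are all "* * *" are skipped, so silent routers do not count as a path difference.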

It also fixed itself at 11:52 eastern.

I know long ago SL noted that they had all sorts of nifty IDS stuff in place.  I’m not sure they still do, but if they do, maybe it went nuts trying to “mitigate” something that didn’t need mitigating.  Always weird when you lose one particular protocol (me ICMP, you UDP and possibly only particular ports).

Charles

> 
> We noticed that our transit of choice flipped last night, so we manually flipped it back and the issue resolved.
> 
> 2828 3356 36351 - no SIP in one direction
> 174 1299 36351 - Works fine
> 
> We've fired tickets off to everyone involved hoping someone finds something.
> 
> James Milko
> Architect, Network Engineering
> 900 Main Campus Drive
> Raleigh, NC 27606
> Bandwidth
> 
> On Wed, Aug 5, 2015 at 10:37 AM, Justin Head via Outages <outages at outages.org> wrote:
> I've got a VM with Linode in Dallas (SoftLayer) running a monitoring system and it's seeing connectivity issues over SNMP (UDP) to basically all locations. This has been on and off all morning since at least 5am CDT.
> --
> Justin Head
> Owner / Chief Operating Officer
> CubedHost, LLC
> https://cubedhost.com
> On 8/5/15 9:23 AM, Charles Sprickman via Outages wrote:
>> I haven’t had time to look at this and I’m going out the door right now, but I’ve had some odd alerts on a softlayer-hosted box. It has multiple IPs and from nac.net in NJ, I am able to ping some IPs but not others.
>> 
>> I won’t have time to poke at it more until this afternoon (eastern) though.
>> 
>> Charles
>> 
>> Izzy Goldstein - TeleGo via Outages <outages at outages.org> wrote:
>> 
>>> seems like monitis is having some issues?
>>> 
>>> 
>>> can't load the web page www.monitis.com
>>> 
>>> 
>>> here is a traceroute output
>>> 
>>> 
>>> [root@pbx81 ~]# traceroute monitis.com
>>> traceroute to monitis.com (208.76.244.202), 30 hops max, 60 byte packets
>>>  1  209.200.42.35 (209.200.42.35)  0.516 ms  0.568 ms  0.549 ms
>>>  2  csc180.gsc.webair.net (173.239.28.37)  0.678 ms  0.812 ms  0.795 ms
>>>  3  es0.nyc4.webair.net (173.239.0.25)  1.211 ms  1.335 ms  1.317 ms
>>>  4  208.178.245.149 (208.178.245.149)  0.896 ms  1.031 ms  1.009 ms
>>>  5  * * *
>>>  6  * * *
>>>  7  COLO4-DALLA.ear1.Dallas1.Level3.net (8.9.232.74)  35.522 ms  35.481 ms  35.475 ms
>>>  8  * * *
>>>  9  72.249.131.13 (72.249.131.13)  38.657 ms  38.888 ms  39.113 ms
>>> 10  67.208.119.94-static.reverse.crucialx.net (67.208.119.94)  35.806 ms  36.365 ms  36.474 ms
>>> 11  monitis.com (208.76.244.202)  35.439 ms  35.247 ms  35.596 ms
>>> [root@pbx81 ~]#
>>> 
>>> 
>>> 
>>> Tracing route to monitis.com [208.76.244.202]
>>> over a maximum of 30 hops:
>>> 
>>>   1    <1 ms    <1 ms    <1 ms  Wireless_Broadband_Router.home [192.168.1.1]
>>>   2     *        *        *     Request timed out.
>>>   3    10 ms     8 ms     9 ms  451be041.cst.lightpath.net [65.19.98.65]
>>>   4    11 ms     9 ms    11 ms  ool-4353dda4.dyn.optonline.net [67.83.221.164]
>>>   5     9 ms    11 ms    10 ms  451be075.cst.lightpath.net [65.19.99.117]
>>>   6     *        *        *     Request timed out.
>>>   7     *        *        *     Request timed out.
>>>   8     *        *        *     Request timed out.
>>>   9     *        *        *     Request timed out.
>>>  10     *        *        *     Request timed out.
>>>  11     *        *        *     Request timed out.
>>>  12     *        *        *     Request timed out.
>>>  13     *        *        *     Request timed out.
>>>  14     *        *        *     Request timed out.
>>>  15     *        *        *     Request timed out.
>>>  16     *        *        *     Request timed out.
>>>  17     *        *        *     Request timed out.
>>>  18     *        *        *     Request timed out.
>>>  19     *        *        *     Request timed out.
>>>  20     *        *        *     Request timed out.
>>>  21     *        *        *     Request timed out.
>>>  22     *        *        *     Request timed out.
>>>  23     *        *        *     Request timed out.
>>>  24     *        *        *     Request timed out.
>>>  25     *        *        *     Request timed out.
>>>  26     *        *        *     Request timed out.
>>>  27     *        *        *     Request timed out.
>>>  28     *        *        *     Request timed out.
>>>  29     *        *        *     Request timed out.
>>>  30     *        *        *     Request timed out.
>>> 
>>> Trace complete.
>>> 
>>> 
>>> 
>>> -- 
>>> 
>>> Izzy Goldstein
>>> 
>>> Main:   (212) 477.1000
>>> Fax:     (212) 477.8900
>>> 
>>> igoldstein at telego.com
>>> 
>>> _______________________________________________
>>> Outages mailing list
>>> Outages at outages.org
>>> https://puck.nether.net/mailman/listinfo/outages
