[outages] HE FMT2 down, various network foo emanating

Jeremy Chadwick outages at jdc.parodius.com
Sat Sep 26 06:19:19 EDT 2009


This should probably go to outages-discuss, but I have to say something.

These kinds of outages were regular occurrences at HE Fremont (FMT2 had
not been built yet, but I doubt it matters).  Here are a few choice
examples -- and these are just a few -- of recurring problems which
never got addressed.

I will never forget how packets with a Comcast source, HE Fremont
destination, went through an AT&T "device" somewhere in the mix which
would regularly burp/flake out and HE couldn't do anything about it.  No
alternate routes were propagated, so literally Comcast->HE was hard
down.  Single point of failure.

I will never forget how packets with an SBC DSL (now AT&T) source, HE
Fremont destination, went through Telia (a Swedish ISP with no North
American NOC) for a single hop, and how Telia would drop their BGP
session or sever their link for whatever reason.  Like the above, no
alternate routes were propagated, so literally SBC->HE was hard down.
Likewise, individuals in Sweden who used Telia also saw the same thing
going from Telia Sweden->HE Fremont.  Single point of failure.

I will never forget the Cisco router which would reboot every 4-5 months
for no reason... a problem which went on for literally years, and all I
was ever told was "well it's back up now" and "we have a case open with
Cisco".  Single point of failure.

I will never forget how HE refused to use VLANs to segregate customers
on a layer 2 level, instead preferring some strange layer 3
implementation.  When we witnessed an unexpected massive (7-8Mbit/sec)
increase in inbound traffic, only to find that the destination IPs of
these packets were for another customer in a completely different
netblock/area of the Fremont facility, we were told by support "that's
impossible".  Full tcpdump captures were given, and we were told "this
makes no sense, this can't happen".  4-5 hours later, we were told the
root cause was "a customer who had misconfigured their load balancer".
Right.

I will never forget the two separate times there were full-scale power
outages both caused by "UPS maintenance".  Gas generators?  They have
them, but when I asked why they didn't kick in, I was told "we don't
know".  When I asked if there would be a follow-up investigation as
to why that didn't happen, so the issue wouldn't recur, I was told
"probably".

I will never forget the "maintenances" that we were never told of,
because scheduled maintenances are not announced to customers.  I was
left with the impression they're done on a whim rather than scheduled.

Safe to say when our contract ended, we left.

I will not recommend Hurricane Electric to anyone who wants co-location
with reliable, redundant connectivity.  I *will* recommend them for
cheap "I just need a 1U box stuck somewhere and don't really care about
quality" co-location, although compared to some of their competitors,
they're actually more expensive.

Also, if HE or HE customers read this and want to flame/argue/do
burn-outs in a Pacer over this -- don't bother.  I won't be responding
to any mails.  Why?  Because all outages/incidents witnessed, including
the above, were sent to our account rep when we were asked "why aren't
you renewing?", and I received no response past that point.

I'll leave you with this: if FMT2 had redundancy, then how were things
hard down for over an hour?  Think about it.

-- 
| Jeremy Chadwick                                   jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

On Sat, Sep 26, 2009 at 01:17:22AM -0700, Scott Howard wrote:
> It's back up as of a few seconds ago.
> 
>   Scott.
> 
> 
> On Sat, Sep 26, 2009 at 1:13 AM, Scott Howard <scott at doc.net.au> wrote:
> 
> > Outage started at 11:37pm Pacific, give or take a minute.
> >
> > I spoke to them at about 11:43pm and they were aware there was an issue,
> > but didn't know the cause.  Since then it's been impossible to get through
> > to anything but voicemail :(
> >
> >   Scott
> >
> >
> >
> > On Sat, Sep 26, 2009 at 1:09 AM, George Herbert <george.herbert at gmail.com> wrote:
> >
> >> Hurricane Electric has a datacenter outage at FMT2, cause as yet
> >> unspecified, but there are SF bay area and possibly wider network
> >> issues emanating from it at low to moderate intensity.
> >>
> >> Out around 12:15am PST Sat morning.
> >>
> >
> >

> _______________________________________________
> outages mailing list
> outages at outages.org
> https://puck.nether.net/mailman/listinfo/outages



