[Outages-discussion] [outages] IPv6 tunnels in FRA1 on HE.net down?
Jeremy Chadwick
jdc at koitsu.org
Tue May 14 19:38:31 EDT 2013
(Moved to -discussion given the tone/nature of my mail)
On Tue, May 14, 2013 at 03:50:40PM -0700, Constantine A. Murenin wrote:
> For what it is worth, further details about the issue have surfaced.
> I found a friend who also has a tunnel on tserv1.fra1.he.net., and he
> has been running smokeping to various IPv4 and IPv6 resources for
> quite a while.
>
> According to several of his smokeping reports, it can be concluded
> that this very outage occurred during 14T18:00/05 and 14T18:45/50; but
> we've also noticed that there was another, 6 hour (yes, 6 hour) outage
> a day earlier, ~13T12 to ~13T18 (which corresponds to Monday early to
> late morning Pacific Time).
>
> I've contacted he.net again this time around, and they said that
> they're trying to hunt some obscure kernel bug that is causing these
> issues.
>
> The tunnelbroker.net is a free service, but to have a 6 hour outage,
> clearly spanning 1/4th of a whole day, is absolutely ridiculous. I'm
> stunned that IPv6 connectivity of tserv1.fra1.he.net. is, apparently,
> still not monitored, even though it's known to be having these issues.
> ???
So don't use Hurricane Electric. Really. I stopped using them (for
co-location (that means paid service, as if it matters)) years ago for
reasons I've already discussed on the list in the past:
https://puck.nether.net/pipermail/outages/2009-September/001547.html
http://puck.nether.net/pipermail/outages-discussion/2011-May/000225.html
There are other brokers you can choose from:
http://en.wikipedia.org/wiki/List_of_IPv6_tunnel_brokers
> Alternatively, it is, of course, possible that some engineer has been
> troubleshooting the root cause of this issue for those whole 5 or 6
> hours on Sunday/Monday night; but I find that somewhat hard to
> believe; more like it got busted, and no one responsible knew about it
> being busted for most of the time that it was.
>
> Even more troubling, is that they don't even publish any reports about
> these extended outages.
>
> For tserv1.fra1.he.net. end users: if you can `ping6 ordns.he.net`
> (it runs on tserv itself, try $(host `dig +short -6 @ordns.he.net
> whoami.akamai.net`)), but cannot `ping6 ns4.he.net`, then it most
> likely means that tserv1.fra1.he.net IPv6-connectivity is down again,
> and you must open a ticket with HE.net ASAP. Perhaps someone should
> set up a smokeping with automatic emails to support at he.net?
What if HE routers begin filtering (or rate-limiting) ingress ICMPv6
ECHO to their nameservers and your pings never make it there? What if
HE nameservers begin dropping (or rate-limiting) ingress ICMPv6 ECHO, or
have their kernels set to never send ICMPv6 ECHO_REPLY? Find a better
destination (i.e. a destination you have full control over).
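A rough sketch of that kind of check, assuming a far-side host you control.
The hostname myhost.example.net is a placeholder, and the function names
(probe, classify, check_tunnel) are made up for illustration. ordns.he.net
is used as the near-side target only because the quoted mail says it runs
on the tserv itself. On newer Linux systems ping6 may need to be replaced
with "ping -6".

```shell
# Targets are assumptions: override them via the environment.
NEAR_TARGET="${NEAR_TARGET:-ordns.he.net}"       # on the tunnel server, per the quoted mail
FAR_TARGET="${FAR_TARGET:-myhost.example.net}"   # placeholder: a host YOU control, beyond the tunnel

# probe HOST: succeed if HOST answers at least one of three ICMPv6 echo requests.
probe() {
    ping6 -c 3 "$1" >/dev/null 2>&1
}

# classify NEAR_STATUS FAR_STATUS: map two exit statuses (0 = answered) to a verdict.
classify() {
    if [ "$1" -eq 0 ] && [ "$2" -eq 0 ]; then
        echo "ok"
    elif [ "$1" -eq 0 ]; then
        echo "tunnel-server-up-but-no-onward-ipv6"
    elif [ "$2" -eq 0 ]; then
        # Onward IPv6 works, so the near target is probably just ignoring ICMPv6.
        echo "near-target-not-answering-icmp"
    else
        echo "no-ipv6-connectivity"
    fi
}

# check_tunnel: probe both targets and print the verdict.
check_tunnel() {
    probe "$NEAR_TARGET"; near=$?
    probe "$FAR_TARGET"; far=$?
    classify "$near" "$far"
}
```

Calling check_tunnel prints one of the four verdicts. Because you control
the far target, a failed probe there can't be explained away by someone
else's ICMPv6 filtering policy, and the third verdict covers the case where
the near target stops answering echo requests while the tunnel itself is
fine.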
Regarding the "automatic Emails" to support at he.net -- do not do this,
especially with something like smokeping**.
Take your hands off the keyboard and step back for a moment. Now think
about the repercussions/ramifications of what you've proposed. Think
about what kind of reaction you will get if such was implemented. Think
about how any online presence would react. Think about how this could
affect other people, or affect their existing support model. Think
about how this could affect you as well (how you will be viewed or
judged).
I cannot tell you how many times over the years (I've been online since
roughly 1990) I've witnessed online resources becoming hindered or
"intentionally retarded" as a result of *one single person* doing
something hasty (either in retaliation/reaction or because they thought
they were being clever). I surely can't be the only one who remembers
when SMTP "just worked" (the idea/concept of spam didn't exist,
everything was congruous, with person/place X trusting person/place Y
due to folks being good-natured) -- but this applies to so many other
things than just SMTP...
So please don't do this; try a different approach if you're annoyed with
HE. (My words don't come lightly, as I'm the type of person who prefers
to fix actual issues rather than avoid them, but sometimes avoidance is
the better (less stressful) choice, and can sometimes speak louder than
fixes...)
** -- Smokeping is still one of the most worthless "tools" I have ever
seen used in the wild. It's pointless, telling the viewer absolutely
nothing about the real cause of an outage. Periodic mtr/traceroute,
written to a log file, is more useful than smokeping (not to mention
that it doesn't involve horrible resource-chewing crap like rrdtool).
All smokeping is good for is making pretty gradients (which I can do in
MS Paint).
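For what it's worth, the "periodic traceroute written to a log file" idea
can be sketched roughly as below. The target hostname, log path, interval
and function names (stamp, trace_once, trace_forever) are all placeholders
of mine, and traceroute6 is assumed to be installed (it may be spelled
"traceroute -6" on some systems).

```shell
# All three settings are assumptions: override them via the environment.
TARGET="${TARGET:-myhost.example.net}"        # placeholder: a host you control
LOGFILE="${LOGFILE:-$HOME/tunnel-trace.log}"  # placeholder log location
INTERVAL="${INTERVAL:-300}"                   # seconds between traces

# stamp: prefix every line read from stdin with one UTC timestamp,
# taken when the function starts, so each trace is easy to find in the log.
stamp() {
    ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
    while IFS= read -r line; do
        printf '%s %s\n' "$ts" "$line"
    done
}

# trace_once: run a single numeric (no DNS lookups) IPv6 traceroute
# and append its timestamped output, errors included, to the log.
trace_once() {
    traceroute6 -n "$TARGET" 2>&1 | stamp >> "$LOGFILE"
}

# trace_forever: repeat trace_once every INTERVAL seconds until killed.
trace_forever() {
    while :; do
        trace_once
        sleep "$INTERVAL"
    done
}
```

Run trace_forever in the background, or call trace_once from cron instead.
When something breaks, the log shows the hop at which the path stopped
answering, which is the detail a latency graph leaves out.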
--
| Jeremy Chadwick jdc at koitsu.org |
| UNIX Systems Administrator http://jdc.koitsu.org/ |
| Mountain View, CA, US |
| Making life hard for others since 1977. PGP 4BD6C0CB |