[outages] Wikipedia suffers global outage.

virendra rode virendra.rode at gmail.com
Wed Mar 24 17:23:57 EDT 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

http://techblog.wikimedia.org/2010/03/global-outage-cooling-failure-and-dns/

Text version:

Due to an overheating problem in our European data center, many of our
servers turned off to protect themselves. As this affected all access to
Wikipedia and other projects from European users, we were forced to move
all user traffic to our Florida cluster, for which we have a standard
quick failover procedure in place that changes our DNS entries.

However, shortly after we performed this failover, it turned out that
the failover mechanism was broken, causing DNS resolution of Wikimedia
sites to stop working globally. This problem was quickly resolved, but
unfortunately it may take up to an hour before access is restored for
everyone, due to DNS caching.

We apologize for the inconvenience this has caused.

Update: Unfortunately, for many, this outage seems to have lasted longer
than an hour. It appears that many ISPs’ DNS resolvers do not honor the
so-called Negative Cache TTL that we send (1 hour), and instead use a
longer value. We have circumvented this problem by renaming the affected
DNS record to something else.
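
As a rough illustration of the caching behavior described in the update,
here is a minimal Python sketch of a resolver's negative cache. It is a
toy model, not Wikimedia's setup or any real resolver's code: the class,
the record names, and the 6-hour floor are hypothetical. Only the 1-hour
negative TTL comes from the text above. The rule that the negative TTL is
the lesser of the SOA record's TTL and its MINIMUM field is from RFC 2308.

```python
from dataclasses import dataclass, field

HOUR = 3600  # seconds


@dataclass
class NegativeCache:
    """Toy model of a resolver's negative (NXDOMAIN) cache. Hypothetical."""

    min_ttl: int = 0  # floor a non-honoring resolver might impose (assumption)
    entries: dict = field(default_factory=dict)  # name -> expiry time (seconds)

    def store_nxdomain(self, name, soa_ttl, soa_minimum, now):
        # RFC 2308: negative TTL = min(SOA record TTL, SOA MINIMUM field).
        negative_ttl = min(soa_ttl, soa_minimum)
        # A resolver that does not honor the advertised value stretches it.
        effective_ttl = max(negative_ttl, self.min_ttl)
        self.entries[name] = now + effective_ttl

    def is_cached_negative(self, name, now):
        expiry = self.entries.get(name)
        return expiry is not None and now < expiry


# One resolver honors the 1-hour negative TTL; the other enforces a
# longer floor (6 hours is an arbitrary value chosen for illustration).
honoring = NegativeCache()
stretching = NegativeCache(min_ttl=6 * HOUR)
for cache in (honoring, stretching):
    cache.store_nxdomain("old-record.example.org", HOUR, HOUR, now=0)

after = 2 * HOUR  # two hours after the failed lookup was cached
honoring_blocked = honoring.is_cached_negative("old-record.example.org", after)
stretching_blocked = stretching.is_cached_negative("old-record.example.org", after)
# The cache is keyed by name, so a renamed record has no negative entry.
renamed_blocked = stretching.is_cached_negative("new-record.example.org", after)
```

Under these assumptions, the stretching resolver still answers from its
negative cache after two hours, while a lookup for the renamed record is
not affected, which is consistent with the workaround described above.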




regards,
/virendra
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFLqoKPpbZvCIJx1bcRAsEMAJ45YWISm+0lPQNuEUT3mLkzSrDn6gCeNZBZ
GpGEHc45ATzP+OVuekugSCY=
=p3AM
-----END PGP SIGNATURE-----


