[outages] Facebook

Terry t0psecret at yahoo.com
Mon Dec 10 18:58:31 EST 2012


Yeah, I think my result was a red herring. a.ns.facebook.com and b.ns.facebook.com still can't resolve the A record for star.facebook.com, despite things seemingly being back to normal now. The NS record is what's key and by the time I looked at it, it was fixed.

Why some people feel the need to get so clever with DNS is beyond me. How about just resolving the A records directly from the facebook.com NS servers, instead of via a CNAME to another group of DNS servers? Would that be so difficult? Then you're shocked when there's an outage.


________________________________
 From: Jeremy Chadwick <jdc at koitsu.org>
To: Terry <t0psecret at yahoo.com> 
Cc: Richard Mahoney <richard.mahoney at tracesmart.co.uk>; Corey Quinn <corey at sequestered.net>; "outages at outages.org" <outages at outages.org> 
Sent: Monday, December 10, 2012 6:40 PM
Subject: Re: [outages] Facebook
 
I could have provided dig +trace output but this is shorter and reads
easier.

It looks like records get looked up as follows (and I'm excluding the
root server lookups, i.e. . --> .com --> facebook.com):

facebook.com.           147814  IN      NS      b.ns.facebook.com.
facebook.com.           147814  IN      NS      a.ns.facebook.com.

And the A records:

a.ns.facebook.com.      172573  IN      A       69.171.239.12
b.ns.facebook.com.      172573  IN      A       69.171.255.12

The SOA for facebook.com (domain itself) hasn't been changed since
2012/12/07 (if the SOA serial is truly kept in line with the YYYYMMDD
model).
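
For anyone who wants to check the YYYYMMDD reading of a serial
themselves, here is a minimal Python sketch. It assumes the common
YYYYMMDDnn convention (date plus a two-digit revision counter); the
serial values below are hypothetical and not taken from this thread,
and serial_to_date is just an illustrative helper name.

```python
from datetime import date
from typing import Optional


def serial_to_date(serial: int) -> Optional[date]:
    """Read an SOA serial as YYYYMMDDnn; return None if it does not fit."""
    text = str(serial)
    if len(text) != 10:
        return None
    try:
        # First eight digits are the date; the last two are a revision counter.
        return date(int(text[:4]), int(text[4:6]), int(text[6:8]))
    except ValueError:
        # Not a valid calendar date, so the zone likely uses another scheme.
        return None


# Hypothetical serials, for illustration only.
print(serial_to_date(2012120700))  # 2012-12-07
print(serial_to_date(1355180311))  # None (looks like a Unix timestamp)
```

A None result only says the serial does not parse as a date; plenty of
zones use plain counters or timestamps instead.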

69.171.239.12 when queried for any records for www.facebook.com
results in a CNAME response to star.facebook.com.  It's probably named
"star" to indicate asterisk (*):

www.facebook.com.       338     IN      CNAME   star.facebook.com.
star.facebook.com.      1238    IN      NS      glb2.facebook.com.
star.facebook.com.      1238    IN      NS      glb1.facebook.com.

And the A records:

glb1.facebook.com.      3038    IN      A       69.171.239.10
glb2.facebook.com.      3038    IN      A       69.171.255.10

glb obviously stands for "global load balancer", though I have no idea
what device they use (F5s, Citrix Netscalers, Alteons (god forbid), or
something home-grown).
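
The lookup path described above can be modelled in a few lines of
Python. This is a toy sketch, not a resolver: the record table is
copied from the output quoted in this message (TTLs dropped),
servers_to_ask is a made-up helper, and the delegation logic is
simplified to "use the name's own NS set if it has one, otherwise the
parent zone's".

```python
# Records as quoted in this message, TTLs omitted.
RECORDS = {
    ("facebook.com.", "NS"): ["a.ns.facebook.com.", "b.ns.facebook.com."],
    ("a.ns.facebook.com.", "A"): ["69.171.239.12"],
    ("b.ns.facebook.com.", "A"): ["69.171.255.12"],
    ("www.facebook.com.", "CNAME"): ["star.facebook.com."],
    ("star.facebook.com.", "NS"): ["glb1.facebook.com.", "glb2.facebook.com."],
    ("glb1.facebook.com.", "A"): ["69.171.239.10"],
    ("glb2.facebook.com.", "A"): ["69.171.255.10"],
}


def servers_to_ask(name, zone="facebook.com."):
    """Return (final_name, [(nameserver, ip), ...]) that must answer for name."""
    seen = set()
    # Follow CNAME aliases, guarding against loops.
    while (name, "CNAME") in RECORDS and name not in seen:
        seen.add(name)
        name = RECORDS[(name, "CNAME")][0]
    # Simplification: a name with its own NS set is treated as delegated.
    owner = name if (name, "NS") in RECORDS else zone
    servers = [(ns, RECORDS[(ns, "A")][0]) for ns in RECORDS[(owner, "NS")]]
    return name, servers


final_name, servers = servers_to_ask("www.facebook.com.")
print(final_name)
for ns, ip in servers:
    print(ns, ip)
```

The point of the model: the A record for www.facebook.com depends on
the glb pair answering, not only on a.ns and b.ns, so there are two
sets of servers that can break the lookup.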

Given the below analysis from Terry, it looks to me like:

a) one or both of their load balancers may have been overloaded briefly
   and did not respond to DNS queries (or possibly something at layer 2
   or layer 3 was affecting this)
b) one or more of the nameservers *behind* glb[12].facebook.com were
   overloaded or broken in some way, or layer 2/3 was responsible for
   breakage (between glbs and nameservers)
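
One way to start separating "did not respond at all" from "responded,
but with an error or an empty answer" is to probe each nameserver
directly and look at what comes back. A rough, stdlib-only Python
sketch follows. The function names are hypothetical, it speaks UDP
only (no retries, no TCP fallback), and it reads just the header of
the reply, so treat it as a starting point rather than a diagnostic
tool.

```python
import socket
import struct


def build_query(name, qtype=1, qid=0x1234):
    """Encode a minimal DNS query (class IN, recursion not requested)."""
    header = struct.pack("!HHHHHH", qid, 0, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    # qtype 1 is A; the trailing 1 is class IN.
    return header + labels + b"\x00" + struct.pack("!HH", qtype, 1)


def summarize(response):
    """Pull the rcode and section counts out of a DNS response header."""
    _qid, flags, _qd, an, ns, _ar = struct.unpack("!HHHHHH", response[:12])
    return {"rcode": flags & 0x000F, "answers": an, "authority": ns}


def probe(server_ip, name, timeout=2.0):
    """Send one UDP query; return a header summary, or None if nothing usable came back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server_ip, 53))
            response, _addr = sock.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable networks
            return None
    if len(response) < 12:
        return None
    return summarize(response)
```

Calling probe("69.171.239.10", "star.facebook.com.") and the same for
69.171.255.10 would cover the glb pair as listed in this thread; those
addresses date from 2012 and may no longer serve DNS. A None result
fits the "not responding" case, while rcode 2 (SERVFAIL) or rcode 0
with zero answers means the server did answer but had nothing useful
to say.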

The only people who know for certain are -- yup -- the Facebook folks.

And naturally this is me doing my testing from a single source, so it's
possible they use anycast to distribute some of their load, in which
case the above analysis (though speculative) is still correct, except
that the actual devices/networks involved would be different.

You're welcome.  :-)

-- 
| Jeremy Chadwick                                  jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Mon, Dec 10, 2012 at 03:24:28PM -0800, Terry wrote:
> Still broke here. Silly CNAMEs.
> 
> ~ > nslookup
> > server a.ns.facebook.com
> Default server: a.ns.facebook.com
> Address: 69.171.239.12#53
> 
> > www.facebook.com
> Server:         a.ns.facebook.com
> Address:        69.171.239.12#53
> www.facebook.com        canonical name = star.facebook.com.
> 
> 
> > star.facebook.com
> Server:         a.ns.facebook.com
> Address:        69.171.239.12#53
> 
> Non-authoritative answer:
> *** Can't find star.facebook.com: No answer
> 
> 
> ________________________________
>  From: Richard Mahoney <richard.mahoney at tracesmart.co.uk>
> To: Corey Quinn <corey at sequestered.net>; "outages at outages.org" <outages at outages.org> 
> Sent: Monday, December 10, 2012 6:21 PM
> Subject: Re: [outages] Facebook
>  
> 
>  
> Seems to be resolving again now on Virgin Media (UK). Guess it was just a hiccup.
>
> PS C:\Windows\system32> nslookup www.facebook.com
> Server:  (removed)
> Address:  (removed)
>
> Non-authoritative answer:
> Name:    star.facebook.com
> Addresses:  2a03:2880:2110:9f02:face:b00c:0:4
>           69.171.247.20
> Aliases:  www.facebook.com
>
> Kind regards
>
> Richard Mahoney, CEH
> Systems Administrator
> Tracesmart
> T 029 2067 8534    M 07714 486543    E richard.mahoney at tracesmart.co.uk
> www.tracesmartcorporate.co.uk    www.traceiq.co.uk
> Global Reach  Dunleavy Drive  Cardiff  CF11 0SN
> Follow us on Twitter
> ISO/IEC 27001 CERTIFICATE: GB 10/81945
> We are proud to sponsor missingpeople.org.uk
> This email and any attachments are confidential to Tracesmart Ltd and are solely for use by the intended recipient. If you are not the intended recipient you must not disclose, copy or distribute its contents to any other person nor make use of its contents in any way. If you have received this email in error please forward a copy to info at tracesmart.co.uk and remove it from your system. This email and any attachments have been scanned for the presence of computer viruses. Neither Tracesmart Ltd nor the sender accepts any responsibility for computer viruses once this email has been transmitted. The content of this message may contain personal views, which are not the views of Tracesmart Ltd, unless specifically stated. Tracesmart may monitor email traffic data and also the content of email for the purposes of security and staff training. Tracesmart Ltd is a company registered in England & Wales with company registration number 3827062 whose registered office is at Global Reach, Dunleavy Drive, Cardiff CF11 0SN.  Our Data Protection Number is Z708281X and our Consumer Credit Licence Number is 565961.
>
> From: outages-bounces at outages.org [mailto:outages-bounces at outages.org] On Behalf Of Corey Quinn
> Sent: 10 December 2012 23:15
> To: outages at outages.org
> Subject: Re: [outages] Facebook
>
> Can you be a bit more specific?  "Works for me."
>
> cquinn at quinntel ~ % dig facebook.com               5344 15:14:37 Mon 12-10-2012
>
> ; <<>> DiG 9.9.1-P2 <<>> facebook.com
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63691
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 2, ADDITIONAL: 3
>
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;facebook.com.                  IN      A
>
> ;; ANSWER SECTION:
> facebook.com.           7200    IN      A       66.220.152.16
> facebook.com.           7200    IN      A       69.171.224.32
> facebook.com.           7200    IN      A       173.252.100.16
> facebook.com.           7200    IN      A       69.171.229.16
> facebook.com.           7200    IN      A       173.252.101.16
> facebook.com.           7200    IN      A       66.220.158.16
>
> ;; AUTHORITY SECTION:
> facebook.com.           139086  IN      NS      a.ns.facebook.com.
> facebook.com.           139086  IN      NS      b.ns.facebook.com.
>
> ;; ADDITIONAL SECTION:
> a.ns.facebook.com.      139086  IN      A       69.171.239.12
> b.ns.facebook.com.      139086  IN      A       69.171.255.12
>
> ;; Query time: 50 msec
> ;; SERVER: 10.201.1.103#53(10.201.1.103)
> ;; WHEN: Mon Dec 10 15:14:40 2012
> ;; MSG SIZE  rcvd: 204
>
> On Dec 10, 2012, at 3:12 PM, Richard Mahoney <richard.mahoney at tracesmart.co.uk> wrote:
> 
> 
> Seeing DNS issues for Facebook here.
> Anyone else?
>
> _______________________________________________
> Outages mailing list
> Outages at outages.org
> https://puck.nether.net/mailman/listinfo/outages