[outages] Facebook

Jeremy Chadwick jdc at koitsu.org
Mon Dec 10 19:13:16 EST 2012


There *is* no A record for star.facebook.com on a.ns/b.ns -- www.facebook.com
is only a CNAME to it, and star.facebook.com is in turn delegated via its
own NS records, which induces per-record NS lookups.  So this is normal.
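
As a rough illustration of that lookup chain, here is a toy Python sketch.
It is not a real resolver: ZONES and resolve_a are hypothetical names, the
records are the ones quoted further down in this thread, and it assumes the
glbs answer with the 69.171.247.20 address that showed up in the nslookup
output below.

```python
# Toy model of the records quoted in this thread; NOT a real DNS resolver.
# Assumption: glb1/glb2 return the A record seen via a recursive resolver.
ZONES = {
    "a.ns.facebook.com.": {
        ("www.facebook.com.", "CNAME"): ["star.facebook.com."],
        ("star.facebook.com.", "NS"): ["glb1.facebook.com.", "glb2.facebook.com."],
    },
    "glb1.facebook.com.": {("star.facebook.com.", "A"): ["69.171.247.20"]},
    "glb2.facebook.com.": {("star.facebook.com.", "A"): ["69.171.247.20"]},
}


def resolve_a(name, server, max_steps=8):
    """Follow CNAMEs and NS delegations until an A record turns up."""
    trail = []
    for _ in range(max_steps):
        records = ZONES.get(server, {})
        if (name, "A") in records:
            trail.append(f"{server} answers A for {name}")
            return records[(name, "A")], trail
        if (name, "CNAME") in records:
            target = records[(name, "CNAME")][0]
            trail.append(f"{server}: {name} is a CNAME for {target}")
            name = target
            continue
        if (name, "NS") in records:
            delegate = records[(name, "NS")][0]
            trail.append(f"{server}: {name} is delegated to {delegate}")
            server = delegate
            continue
        break  # this server knows nothing more about the name
    return [], trail


addresses, trail = resolve_a("www.facebook.com.", "a.ns.facebook.com.")
for step in trail:
    print(step)
print(addresses)
```

The point of the sketch is that a.ns never holds an A record for
star.facebook.com at all; asking it directly for one and getting nothing
back is expected, and the answer only appears once the query reaches the
delegated servers.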

As for your 2nd question -- this opens a very large Pandora's box when
it comes to DNS engineering and operational deployments.  I can think of
quite a few reasons to keep the load balancers for the "main"
facebook.com domain and ns.facebook.com bits separate from the actual LBs
which return A records for the webservers that serve web content.
Effectively what you're proposing is just to have lots of LBs handling
the DNS queries for everything, rather than segregate/delineate things
a bit more.  I can assure you there are justifications for this, but
since I don't work at Facebook, I can't provide those.

What I've seen today is not that complex of a setup, though depending on
what LBs they use, I might not particularly enjoy looking at those
configurations.  But what's shown here, at least if using a Citrix
NetScaler, isn't that complex.

All that said -- it is absolutely possible for them to remove use of the
star.facebook.com CNAME and just have a series of LBs respond with
appropriate A records when faced with a www.facebook.com A record query.

However, given that there's quite a lot of crapola-nonsense that falls
under the facebook.com domain (I'm talking about the equivalent of a
content delivery network, their image hosting stuff, and God knows what
else today -- I haven't used Facebook since January 2011), due to "Web
2.0" nutballs always wanting to make a mess of things ( ;-) ), I'm not
too surprised by the above.

-- 
| Jeremy Chadwick                                   jdc at koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |

On Mon, Dec 10, 2012 at 03:58:31PM -0800, Terry wrote:
> Yeah, I think my result was a red herring. a.ns.facebook.com and b.ns.facebook.com still can't resolve the A record for star.facebook.com, despite things seemingly being back to normal now. The NS record is what's key and by the time I looked at it, it was fixed.
> 
> Why some people feel the need to get so clever with DNS is beyond me. How about just resolving the A records directly from the facebook.com NS servers, instead of via a CNAME to another group of DNS servers? Would that be so difficult? Then you're shocked when there's an outage.
> 
> 
> ________________________________
>  From: Jeremy Chadwick <jdc at koitsu.org>
> To: Terry <t0psecret at yahoo.com> 
> Cc: Richard Mahoney <richard.mahoney at tracesmart.co.uk>; Corey Quinn <corey at sequestered.net>; "outages at outages.org" <outages at outages.org> 
> Sent: Monday, December 10, 2012 6:40 PM
> Subject: Re: [outages] Facebook
>  
> I could have provided dig +trace output but this is shorter and reads
> easier.
> 
> It looks like records get looked up as follows (and I'm excluding the
> root server lookups, i.e. . --> .com --> facebook.com):
> 
> facebook.com.           147814  IN      NS      b.ns.facebook.com.
> facebook.com.           147814  IN      NS      a.ns.facebook.com.
> 
> And the A records:
> 
> a.ns.facebook.com.      172573  IN      A       69.171.239.12
> b.ns.facebook.com.      172573  IN      A       69.171.255.12
> 
> The SOA for facebook.com (domain itself) hasn't been changed since
> 2012/12/07 (if the SOA serial is truly kept in line with the YYYYMMDD
> model).
> 
> 69.171.239.12 when queried for any records for www.facebook.com
> results in a CNAME response to star.facebook.com.  It's probably named
> "star" to indicate asterisk (*):
> 
> www.facebook.com.       338     IN      CNAME   star.facebook.com.
> star.facebook.com.      1238    IN      NS      glb2.facebook.com.
> star.facebook.com.      1238    IN      NS      glb1.facebook.com.
> 
> And the A records:
> 
> glb1.facebook.com.      3038    IN      A       69.171.239.10
> glb2.facebook.com.      3038    IN      A       69.171.255.10
> 
> glb obviously stands for "global load balancer", though I have no idea
> what device they use (F5s, Citrix Netscalers, Alteons (god forbid), or
> something home-grown).
> 
> Given the below analysis from Terry, it looks to me like:
> 
> a) one or both of their load balancers may have been overloaded briefly
>    and did not respond to DNS queries (or possibly something at layer 2
>    or layer 3 was affecting this)
> b) one or more of the nameservers *behind* glb[12].facebook.com were
>    overloaded or broken in some way, or layer 2/3 was responsible for
>    breakage (between glbs and nameservers)
> 
> The only people who know for certain are -- yup -- the Facebook folks.
> 
> And naturally this is me doing my testing from a single source, so it's
> possible they use anycast to distribute some of their load, in which
> case the above analysis (despite being speculative) is still correct,
> except what actual devices/networks are involved would be different.
> 
> You're welcome.  :-)
> 
> -- 
> | Jeremy Chadwick                                   jdc at koitsu.org |
> | UNIX Systems Administrator                http://jdc.koitsu.org/ |
> | Mountain View, CA, US                                            |
> | Making life hard for others since 1977.             PGP 4BD6C0CB |
> 
> On Mon, Dec 10, 2012 at 03:24:28PM -0800, Terry wrote:
> > Still broke here. Silly CNAMEs.
> > 
> > ~ > nslookup
> > > server a.ns.facebook.com
> > Default server: a.ns.facebook.com
> > Address: 69.171.239.12#53
> > 
> > > www.facebook.com
> > Server:         a.ns.facebook.com
> > Address:        69.171.239.12#53
> > www.facebook.com        canonical name = star.facebook.com.
> > 
> > 
> > > star.facebook.com
> > Server:         a.ns.facebook.com
> > Address:        69.171.239.12#53
> > 
> > Non-authoritative answer:
> > *** Can't find star.facebook.com: No answer
> > 
> > 
> > ________________________________
> > From: Richard Mahoney <richard.mahoney at tracesmart.co.uk>
> > To: Corey Quinn <corey at sequestered.net>; "outages at outages.org" <outages at outages.org> 
> > Sent: Monday, December 10, 2012 6:21 PM
> > Subject: Re: [outages] Facebook
> > 
> > Seems to be resolving again now on Virgin Media (UK). Guess it was just a hiccup.
> > 
> > PS C:\Windows\system32> nslookup www.facebook.com
> > Server:  (removed)
> > Address:  (removed)
> > 
> > Non-authoritative answer:
> > Name:    star.facebook.com
> > Addresses:  2a03:2880:2110:9f02:face:b00c:0:4
> >           69.171.247.20
> > Aliases:  www.facebook.com
> > 
> > Kind regards
> > 
> > Richard Mahoney, CEH
> > Systems Administrator
> > Tracesmart
> > T 029 2067 8534    M 07714 486543    E richard.mahoney at tracesmart.co.uk
> > www.tracesmartcorporate.co.uk    www.traceiq.co.uk
> > Global Reach  Dunleavy Drive  Cardiff  CF11 0SN
> > Follow us on Twitter
> > ISO/IEC 27001 CERTIFICATE: GB 10/81945
> > We are proud to sponsor missingpeople.org.uk
> > This email and any attachments are confidential to Tracesmart Ltd and are solely for use by the intended recipient. If you are not the intended recipient you must not disclose, copy or distribute its contents to any other person nor make use of its contents in any way. If you have received this email in error please forward a copy to info at tracesmart.co.uk and remove it from your system. This email and any attachments have been scanned for the presence of computer viruses. Neither Tracesmart Ltd nor the sender accepts any responsibility for computer viruses once this email has been transmitted. The content of this message may contain personal views, which are not the views of Tracesmart Ltd, unless specifically stated. Tracesmart may monitor email traffic data and also the content of email for the purposes of security and staff training. Tracesmart Ltd is a company registered in England & Wales with company registration number 3827062 whose registered office is at Global Reach, Dunleavy Drive, Cardiff CF11 0SN. Our Data Protection Number is Z708281X and our Consumer Credit Licence Number is 565961.
> > 
> > From: outages-bounces at outages.org [mailto:outages-bounces at outages.org] On Behalf Of Corey Quinn
> > Sent: 10 December 2012 23:15
> > To: outages at outages.org
> > Subject: Re: [outages] Facebook
> > 
> > Can you be a bit more specific?  "Works for me."
> > 
> > cquinn at quinntel ~ % dig facebook.com               5344 15:14:37 Mon 12-10-2012
> > 
> > ; <<>> DiG 9.9.1-P2 <<>> facebook.com
> > ;; global options: +cmd
> > ;; Got answer:
> > ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63691
> > ;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 2, ADDITIONAL: 3
> > 
> > ;; OPT PSEUDOSECTION:
> > ; EDNS: version: 0, flags:; udp: 4096
> > ;; QUESTION SECTION:
> > ;facebook.com.                  IN      A
> > 
> > ;; ANSWER SECTION:
> > facebook.com.           7200    IN      A       66.220.152.16
> > facebook.com.           7200    IN      A       69.171.224.32
> > facebook.com.           7200    IN      A       173.252.100.16
> > facebook.com.           7200    IN      A       69.171.229.16
> > facebook.com.           7200    IN      A       173.252.101.16
> > facebook.com.           7200    IN      A       66.220.158.16
> > 
> > ;; AUTHORITY SECTION:
> > facebook.com.           139086  IN      NS      a.ns.facebook.com.
> > facebook.com.           139086  IN      NS      b.ns.facebook.com.
> > 
> > ;; ADDITIONAL SECTION:
> > a.ns.facebook.com.      139086  IN      A       69.171.239.12
> > b.ns.facebook.com.      139086  IN      A       69.171.255.12
> > 
> > ;; Query time: 50 msec
> > ;; SERVER: 10.201.1.103#53(10.201.1.103)
> > ;; WHEN: Mon Dec 10 15:14:40 2012
> > ;; MSG SIZE  rcvd: 204
> > 
> > On Dec 10, 2012, at 3:12 PM, Richard Mahoney <richard.mahoney at tracesmart.co.uk> wrote:
> > 
> > 
> > Seeing DNS issues for Facebook here.
> > Anyone else?
> > _______________________________________________
> > Outages mailing list
> > Outages at outages.org
> > https://puck.nether.net/mailman/listinfo/outages