[VoiceOps] Bandwidth - Monday Outage

Peter Beckman beckman at angryox.com
Tue Sep 28 15:25:30 EDT 2021


On Tue, 28 Sep 2021, Ryan Delgrosso wrote:

> Yep, except that
>
> A: Bandwidth had to know this is a when not an if. In today's internet if 
> your company can be considered critical infra, you will be attacked. The more 
> likely scenario is the technical staff knew this but the MBA types said they
> were paranoid delusions and denied the project budget.

  They might have planned for a certain scale, but if they are getting hit
  with 100s of Gigabits or Terabits of traffic, they probably are not in a
  situation where the cost of having that infrastructure was reasonable.

  Bandwidth likely does not have multiple 10Tb links with multiple carriers.

> B: I believe they need to be drawing national attention to this to highlight 
> what a steaming dumpster fire much of the critical infra really is. Mostly 
> because it's designed to maximize quarterly earnings, not stay working in the
> face of adversity.

  Until things are attacked, people are willfully ignorant. Proactive Red Team
  attacks on infrastructure are really the best way to find out from someone
  on your side where your infrastructure is vulnerable. But you gotta wanna
  know where your vulnerabilities are and be willing to pay to find them.
  Capitalism beats out rational thought.

> C: I'm absolutely sympathetic to their plight having been through a crippling 
> DDOS in a past life which spurred the complete redesign of the entire network 
> into sacrificial pods with more robust transport, and a triage runbook to 
> keep the most things available in the face of an insurmountable onslaught.

  Yup. It's hard to find, hire, and keep engaged people who know how to
  mitigate DDoS attacks at the level that these attacks are occurring. It's
  gotta be multiple Tbps IMHO. I'll be disappointed if it was a 1Gbps
  sustained issue that took them down; I sure hope not.

> D: Why is the discussion not yet turning to the fact that all major eyeball 
> networks in the US still don't implement BCP38 as a matter of laziness (or 
> above MBA reasons), and this is what allows these attacks to happen. The 
> telco guys are being held to the STIR/SHAKEN standard over robocalling but 
> for decades the major US ISPs could have implemented network policies that
> would break the chain of DDOS escalation and don't because they can't be
> bothered to.

  It seems to take huge failures to get companies to change, and for people
  to change. Once the incident passes, fixing it for the future becomes a
  low-priority task again. Urgent vs Important is a real struggle.

> I once gave a talk on DDOS at a Carrier fraud association task force meeting 
> (cfca.org) and had representatives from every major US eyeball network in the 
> room and asked the above question and the overwhelming answer I got is
> "leadership doesn't feel it's a worthwhile risk/reward to implement".

  Because it's not worth preventing until it hurts financially.

  Maybe the DDoS actors are really just trying to get more companies to
  improve their networks and are just a bunch of white hats forcing
  companies to do better.

  OK, probably not.

  The good news is that BW likely will have some excellent infrastructure
  improvements over the next few weeks/months that will increase my
  confidence in them. Hopefully. This is the first major ongoing issue I've
  seen with BW in 6 years.

  Outages happen. Mistakes get made. You either trust your vendor to get it
  right or you leave and hope the new one is better, without any of the
  trust you had built up.

---------------------------------------------------------------------------
Peter Beckman                                                  Internet Guy
beckman at angryox.com                                https://www.angryox.com/
---------------------------------------------------------------------------

