[j-nsp] What's the best way to announce an IP range in BGP? Doesn't physically exist anywhere.

Pavel Lunin plunin at senetsy.ru
Tue Jun 26 07:39:33 EDT 2012


25.06.2012 16:06, Scott T. Cameron:
>
>     1. First, sorry for writing this once again, but it's just not the
>     case.
>     Any more or less smart stateful device, whether SRX or anything else,
>     must not create session states for packets falling under a discard
>     route. And SRX does not, I checked. Filling up the session table is
>     caused by either a bug or (rather) a design/config mistake.
>
>
>
> I'm not sure I agree with this assessment.
>
> The SRX is very quick at disposing of invalid sessions, generally.
>  However, it is easily susceptible to DDOS if you let it reach the
> session table.
>
> Here's some quick POC code:
>
> http://pastebin.com/FjgavSwn
>
> You can run this against some non-operational IPs, but present via,
> say, discard route in your config.  You will see the invalid sessions
> rise dramatically via 'show sec flow sess sum'.

Hm.

The test itself is not that complex, and I've just checked it: I wrote a
10.1.1.0/24 discard route and tried to ping an address in it with the -f option.
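
For reference, the whole setup is just this on the SRX, plus a flood ping
from a host behind ge-0/0/0.6 (10.0.0.17 in the trace below); the target is
an arbitrary address inside the discarded prefix:

  set routing-options static route 10.1.1.0/24 discard

and on the test host:

  ping -f 10.1.1.20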

Yes, the number of invalid sessions increases in the "show sec flow
sess sum" output, but here is what happens in flow-trace:

> Jun 26 14:13:25
> 14:13:25.049336:CID-0:RT:<10.0.0.17/2031->10.1.1.20/6175;1> matched
> filter 1:
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT:packet [84] ipid = 0, @4092e624
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT:---- flow_process_pkt: (thd
> 2): flow_ctxt type 14, common flag 0x0, mbuf 0x4092e400, rtbl_idx = 0
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT: flow process pak fast ifl 71
> in_ifp ge-0/0/0.6
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT: 
> ge-0/0/0.6:10.0.0.17->10.1.1.20, icmp, (8/0)
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT: find flow: table 0x4354f9b8,
> hash 55925(0xffff), sa 10.0.0.17, da 10.1.1.20, sp 2031, dp 6175,
> proto 1, tok 6
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT:  no session found, start
> first path. in_tunnel - 0, from_cp_flag - 0
> Jun 26 14:13:25 14:13:25.049336:CID-0:RT:  flow_first_create_session
> Jun 26 14:13:25 14:13:25.049774:CID-0:RT:  flow_first_in_dst_nat: in
> <ge-0/0/0.6>, out <N/A> dst_adr 10.1.1.20, sp 2031, dp 6175
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:  chose interface ge-0/0/0.6
> as incoming nat if.
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:flow_first_rule_dst_xlate:
> DST no-xlate: 0.0.0.0(0) to 10.1.1.20(6175)
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:flow_first_routing: vr_id 0,
> call flow_route_lookup(): src_ip 10.0.0.17, x_dst_ip 10.1.1.20, in ifp
> ge-0/0/0.6, out ifp N/A sp 2031, dp 6175, ip_proto 1, tos 0
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:Doing DESTINATION addr
> route-lookup
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:  packet dropped, no route to
> dest
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:flow_first_routing: DEST
> route-lookup failed, dropping pkt and not creating session nh: 0
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT:  flow find session returns
> error.
> Jun 26 14:13:25 14:13:25.049814:CID-0:RT: ----- flow_process_pkt rc
> 0x7 (fp rc -1)

Yes, this eats some central point resources on SRX-HE, or their software
analogue on branch SRX. Of course, some resources are also consumed for
internal NPC-SPC forwarding, etc. But I'd say it's rather CPU cycles,
not session table memory. And indeed, if we issue "show security flow
session destination-prefix 10.1.1.0/24", there are no records in the table.

However, it is still less resource-intensive than, say, processing
packets for which there is an explicit (or implicit) deny policy. And
that is what stateful firewalls were invented for.
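
For comparison, this is the kind of explicit deny such a packet would
otherwise have to be evaluated against once it passes the route lookup
(zone and policy names are illustrative):

  set security policies from-zone untrust to-zone trust policy DENY-ALL match source-address any
  set security policies from-zone untrust to-zone trust policy DENY-ALL match destination-address any
  set security policies from-zone untrust to-zone trust policy DENY-ALL match application any
  set security policies from-zone untrust to-zone trust policy DENY-ALL then deny

A packet hitting a discard route is dropped before it ever gets to zone
and policy matching.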

So, I agree, any stateful device has a greater potential of being DoSed
than a router, simply because this risk is proportional to the number of
elementary operations the device performs on a packet, and, by
definition, a stateful device is more fragile here.

But, again, it does not depend on the firewall's routing role.  It is a
general cost of using a stateful device.

> Malicious user aside, a legitimate application trying to hit an
> invalid IP would give the same result.
A real-life example is even simpler: any source NAT box (CGN, enterprise,
whatever). It has, say, a NAT pool, which is also defined as a static
discard route and announced upstream using some routing protocol (it does
not matter which). All the pool IPs are used for NAT, so there are no
unused addresses. The users/subscribers behind the NAT use torrents, and,
of course, there are lots of outside hosts trying to establish connections
with your subscribers behind the NAT. And the same thing happens: packets
come to the NAT device and fall under the discard route.
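
A minimal sketch of such a setup, with illustrative names and 192.0.2.0/24
standing in for the public pool:

  set security nat source pool NAT-POOL address 192.0.2.0/24
  set security nat source rule-set TRUST-TO-UNTRUST from zone trust
  set security nat source rule-set TRUST-TO-UNTRUST to zone untrust
  set security nat source rule-set TRUST-TO-UNTRUST rule SUBSCRIBERS match source-address 10.0.0.0/8
  set security nat source rule-set TRUST-TO-UNTRUST rule SUBSCRIBERS then source-nat pool NAT-POOL
  set routing-options static route 192.0.2.0/24 discard
  set policy-options policy-statement EXPORT-NAT-POOL term POOL from route-filter 192.0.2.0/24 exact
  set policy-options policy-statement EXPORT-NAT-POOL term POOL then accept
  set protocols bgp group UPSTREAM export EXPORT-NAT-POOL

Every unsolicited packet towards the pool then hits the discard route
exactly as in the trace above.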

Yep, it's a good idea to use devices that perform route lookups in
dedicated hardware for massive-scale NAT deployments, plus some
additional techniques like stateless filters and early aging. And common
sense, as Stefan said.
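
Roughly, on the SRX side that could look like this (thresholds, prefix and
interface are illustrative; the filter simply drops unsolicited inbound TCP
SYNs towards the pool before they reach the flow module):

  set security flow aging early-ageout 20
  set security flow aging high-watermark 90
  set security flow aging low-watermark 80
  set firewall family inet filter NAT-POOL-GUARD term NO-INBOUND-SYN from destination-address 192.0.2.0/24
  set firewall family inet filter NAT-POOL-GUARD term NO-INBOUND-SYN from protocol tcp
  set firewall family inet filter NAT-POOL-GUARD term NO-INBOUND-SYN from tcp-initial
  set firewall family inet filter NAT-POOL-GUARD term NO-INBOUND-SYN then discard
  set firewall family inet filter NAT-POOL-GUARD term DEFAULT then accept
  set interfaces ge-0/0/1 unit 0 family inet filter input NAT-POOL-GUARD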

But it has nothing to do with routing (sorry :)

Moreover, imagine you don't have a static discard route for the NAT pool
and no dynamic routing is used at all. Just a static route for the NAT
pool on the upstream device, pointing to the NAT device, and a static
default on the NAT device towards the uplink. What will happen? A packet
will still come to the NAT device, which will perform a route lookup (the
same operations we saw above), but it will find a route: the default. That
means it will go ahead and check the zones, policies, etc. (more resources
consumed). A much funnier thing happens if you have (for some reason) a
permitting policy from untrust to untrust (or however you call it): the
packet gets forwarded back towards the uplink. On an MS-PIC-based NAT box
this will also give you a loop, if you're not careful.
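
A sketch of the two routes that create this situation (addresses are
illustrative):

  # upstream device: the whole pool is statically routed at the NAT box
  set routing-options static route 192.0.2.0/24 next-hop 198.51.100.2

  # NAT device: no discard for the pool, only a default towards the uplink
  set routing-options static route 0.0.0.0/0 next-hop 198.51.100.1

With an untrust-to-untrust permit in place, a packet for an unused pool
address can just bounce between the two boxes until its TTL expires.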


