[j-nsp] A low number of Firewall filters reducing the bandwidth capacity.

Saku Ytti saku at ytti.fi
Mon Aug 26 11:23:46 EDT 2024


You don't have a counter on the discard terms? Then we can't be entirely
sure your terms aren't responsible for the drops. Please add counters to
the discard terms.
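
For example, something along these lines (filter and term names are just
placeholders, adapt them to your generated config):

  set firewall family inet filter <filter-name> term <term-name> then count <term-name>-drops
  set firewall family inet filter <filter-name> term <term-name> then discard

and then watch the counter with 'show firewall filter <filter-name>'.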


You can view the global NPU load percentage from the PFE CLI, and to a
degree the load of individual PPEs as well.
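
You get there with something like this (fpc0 is just an example slot; the
exact show commands underneath vary per Trio generation and release, so
treat this as a rough sketch):

  start shell pfe network fpc0

or non-interactively from the CLI:

  request pfe execute target fpc0 command "..."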

I'd say:

- let's ensure it's not the filter dropping the packets, as requested above
- find out where the drops are reported (interface extensive counters,
PFE stream counters, QoS counters, NPU exception counters, MQ FI/FO
counters ...); a few starting points below
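
Off the top of my head, assuming an MX/Trio box (interface and FPC names
are placeholders):

  show firewall                       (all filter counters, incl. the discard counters)
  show interfaces extensive <ifl>     (input/output errors and drops)
  show interfaces queue <ifl>         (per-queue tail/RED drops)
  show pfe statistics traffic         (PFE-level hardware drop counters)

and from the PFE shell, the NPU exception counters, something like
'show jnh 0 exceptions terse' (exact syntax depends on the Trio generation).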

On Mon, 26 Aug 2024 at 16:43, Gustavo Santos <gustkiller at gmail.com> wrote:
>
> Awesome, thanks for the info!
> Rules are like the one below.
> After adjusting the detection engine to handle /24 networks instead of /32 hosts, the issue is gone.
> As you said, the issue was not caused by PPS: the attack traffic was only about 30 Mpps, and with the rules
> adjusted to /24 networks there were no more dropped packets at the PFE.
>
> Do you know how to check the PPE information that would show what may have happened?
>
>
> Below is a sample of the rules that were generated (about 300 of them, pushed via NETCONF, caused the slowdown).
>
>       set term e558d83516833f77dea28e0bd5e65871-match from destination-address 131.0.245.143/32
>       set term e558d83516833f77dea28e0bd5e65871-match from protocol 6
>       set term e558d83516833f77dea28e0bd5e65871-match from source-port 443
>       set term e558d83516833f77dea28e0bd5e65871-match from packet-length 32-63
>       set term e558d83516833f77dea28e0bd5e65871-match from tcp-flags "syn & ack & !fin & !rst & !psh"
>       set term e558d83516833f77dea28e0bd5e65871-match then count Corero-auto-block-e558d83516833f77dea28e0bd5e65871-match port-mirror next term
>       set term e558d83516833f77dea28e0bd5e65871-action from destination-address 131.0.245.143/32
>       set term e558d83516833f77dea28e0bd5e65871-action from protocol 6
>       set term e558d83516833f77dea28e0bd5e65871-action from source-port 443
>       set term e558d83516833f77dea28e0bd5e65871-action from packet-length 32-63
>       set term e558d83516833f77dea28e0bd5e65871-action from tcp-flags "syn & ack & !fin & !rst & !psh"
>       set term e558d83516833f77dea28e0bd5e65871-action then count Corero-auto-block-e558d83516833f77dea28e0bd5e65871-discard discard
>
>
> Em dom., 25 de ago. de 2024 às 02:36, Saku Ytti <saku at ytti.fi> escreveu:
>>
>> The RE and LC CPU have nothing to do with this; you'd need to check
>> the Trio PPE congestion levels to figure out whether you're running out
>> of cycles for ucode execution.
>>
>> This might improve your performance:
>> https://www.juniper.net/documentation/us/en/software/junos/cli-reference/topics/ref/statement/firewall-fast-lookup-filter.html
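>>
>> If I recall the hierarchy correctly, it sits under the filter itself,
>> roughly like this (double-check the doc above for your release):
>>
>>   set firewall family inet filter <filter-name> fast-lookup-filter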
>>
>> There is also old and new Trio ucode, the new one being 'hyper mode', but
>> this may already be on by default, depending on your release. Hyper mode
>> should give a bit more PPS.
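>>
>> If you want to check or enable it, I believe it is roughly this (it comes
>> with feature caveats, so check the docs for your release first):
>>
>>   show forwarding-options hyper-mode
>>   set forwarding-options hyper-mode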
>>
>> There is precious little information available: what exactly are your
>> filters doing, what kind of PPS are you pushing through the Trio
>> experiencing this, and where are you seeing the drops? If you are
>> dropping, the drops are absolutely accounted for somewhere.
>>
>> Unless you are really pushing very heavy PPS, I have difficulty seeing
>> 100 sensible FW rules impacting performance. I'm not saying it is
>> impossible, but I suspect there is a lot more going on here. We'd need
>> to deep-dive into the rules, the PPE configuration and the load.
>>
>> On Sat, 24 Aug 2024 at 23:35, Gustavo Santos via juniper-nsp
>> <juniper-nsp at puck.nether.net> wrote:
>> >
>> > Hi,
>> >
>> > We have noticed that when a not-so-large number of firewall filter terms
>> > is generated and pushed via NETCONF into a triplet of MX10003 edge routers,
>> > we start receiving customer complaints. These issues seem to be related to
>> > the router's FPC limiting overall network traffic. To resolve the problem,
>> > we simply deactivate the ephemeral configuration database that contains the
>> > rules, which removes all of them, and the traffic flow returns to normal.
>> > We typically observe this problem with more than 100 rules; with a smaller
>> > number of rules we don't experience the same issue, even with much larger
>> > attacks. Is there any known limitation or bug that could cause this type
>> > of issue?
>> >
>> > As it is a customer traffic issue, I didn't have time to check FPC memory
>> > or the FPC shell. I only checked the routing engine and FPC CPU, and they
>> > are both fine (under 50% FPC and under 10% RE).
>> >
>> > Any thoughts?
>> >
>> > Regards.
>> > _______________________________________________
>> > juniper-nsp mailing list juniper-nsp at puck.nether.net
>> > https://puck.nether.net/mailman/listinfo/juniper-nsp
>>
>>
>>
>> --
>>   ++ytti



-- 
  ++ytti

