[j-nsp] Weird SRX flow timeout issue

Mon Nov 12 15:49:29 EST 2012

Just as a bit of a follow on…

Earlier I made the assertion that I was using a route-based VPN. Of course, I wasn't (and thanks, Cooper, for pointing that out to me).

We removed the inactivity-timeout definition, tried disabling TCP SYN checking in tunnel, etc., but none of these restored the correct behaviour that we had before yesterday; sessions were still closing.

I managed to catch what was happening in the flow session trace: the 20 second window was expiring, and then we were seeing an 1800 second session for about 1 second, which then disappeared, even though the tcpdump trace showed that it should not have been closed.

I then removed the additional policy statements that matched the application-specific traffic, and presto: everything is back to normal, albeit with the (short) 30 minute timeouts, but traffic is at least flowing again.

So two things I'm thinking from here:

1) Ask the customer to implement the PostgreSQL keepalive options (which are recommended for Windows, whereas I believe they are a purely Linux environment) and see if that holds the sessions open.

2) See if we can move to a route-based VPN, as the problem I was talking about appears to come from the intersection of two very similar policies.
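
For option 1, a minimal sketch of what the client side might look like, assuming the application connects through libpq (for example via psycopg2). The `keepalives*` names are libpq connection parameters; the host, database name, helper function names and the 600/30/5 values are hypothetical and only for illustration. The 1800 second default mirrors the 30 minute session timeout mentioned above.

```python
# Sketch only: build a libpq connection string with TCP keepalives enabled,
# and check that the keepalive idle time is shorter than the firewall's
# session timeout. Host, dbname and the default values are assumptions.

def keepalive_dsn(host, dbname, idle_s=600, interval_s=30, count=5):
    """Return a libpq-style connection string with TCP keepalives turned on."""
    params = {
        "host": host,
        "dbname": dbname,
        "keepalives": 1,                    # enable client-side TCP keepalives
        "keepalives_idle": idle_s,          # seconds of idle before the first probe
        "keepalives_interval": interval_s,  # seconds between unanswered probes
        "keepalives_count": count,          # lost probes before the link is declared dead
    }
    return " ".join(f"{key}={value}" for key, value in params.items())


def keepalives_beat_timeout(idle_s, session_timeout_s=1800):
    """True if a probe is sent before the firewall session would idle out."""
    return idle_s < session_timeout_s


if __name__ == "__main__":
    print(keepalive_dsn("db.example.com", "app"))
    print(keepalives_beat_timeout(600))
```

The resulting string could be passed to a libpq-based driver's connect call. The server-side equivalents in postgresql.conf are `tcp_keepalives_idle`, `tcp_keepalives_interval` and `tcp_keepalives_count`.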

Thanks,
Andrew

--
Andrew Yager, Managing Director   (MACS Snr CP BCompSc MCP MCE JNCIA-Junos)
Real World Technology Solutions Pty Ltd  - IT people you can trust
ph: 1300 798 718 or (02) 9037 0500
fax: (02) 9037 0591 mob: 0405 152 568
http://www.rwts.com.au/

On 13/11/2012, at 7:34 AM, Tim Eberhard <xmin0s at gmail.com> wrote:

> The SRX's behavior is to reset the timeout on a session whenever any
> packet passes over it: keepalive, data packet, whatever. As
> long as it matches that session it will reset the timeout to the
> default value and start decrementing again. So I'm not sure what you
> mean when you say it is dropping TCP sessions with active TCP keepalives.
> 
> I've never had a problem where an application sent keepalives more
> often than the default timeout (say the timeout is 30 minutes and
> keepalives are every 10 minutes). That session can then last as long
> as it wants. This is expected behavior.
> 
> -Tim Eberhard
> 
> On Mon, Nov 12, 2012 at 1:43 PM, Benny Amorsen <benny+usenet at amorsen.dk> wrote:
>> Tim Eberhard <xmin0s at gmail.com> writes:
>> 
>>> While I haven't read this entire thread, it's worth mentioning that
>>> this is a correct statement. TCP connections (by default) must be
>>> initiated by a standard 3-way handshake. You can disable this by
>>> turning off tcp-syn-checking under security -> flow.
>>> 
>>> I wouldn't recommend it however, as enforcing proper TCP state is
>>> always a good security practice.
>> 
>> Enforcing proper TCP state is certainly good security practice. Dropping
>> a TCP session with active TCP keepalives is simply buggy and wrong.
>> 
>> That does not have anything to do with the 3-way handshake or
>> tcp-syn-checking which should be on.
>> 
>> 
>> /Benny