[f-nsp] multiple service failover

David Miller syslog at d.sparks.net
Tue Jul 14 10:26:33 EDT 2009


Hi All;

I've got a situation that I'm not sure of the best way to handle.

I have a pair of servers that are able to run the same application.  
Caching issues make it weird though.  The application writes to the 
database when updates come in, and (of course) updates its own internal 
cache.  The servers don't update each other, however, nor do they get 
updates from the database any time other than at startup.  "startup" in 
this case is defined as the first query that hits tomcat.

What this means is that I want to run off S1 as long as it's running.
If S1 becomes unavailable, I want to run off S2 until I come back and
fix things.  lb-pri-servers takes care of that part.
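
To spell out the behaviour I mean, here is a minimal Python sketch of
"stay on the backup until someone resets it".  It is only a model of the
desired outcome, not ServerIron configuration and not a description of
how lb-pri-servers works internally; the class and method names are made
up for illustration.

```python
class StickyFailover:
    """Serve from the primary until it fails, then stay on the backup
    until an operator resets it (hypothetical model, not device config)."""

    def __init__(self, primary="S1", backup="S2"):
        self.primary = primary
        self.backup = backup
        self.failed_over = False

    def select(self, primary_healthy):
        # Once the primary has been seen down, latch onto the backup.
        # The primary's cache may be stale afterwards, so a recovered
        # primary is not trusted automatically.
        if not primary_healthy:
            self.failed_over = True
        return self.backup if self.failed_over else self.primary

    def reset(self):
        # Manual step: the operator has fixed and restarted the primary.
        self.failed_over = False
```

The point of the latch is that S1 coming back on its own does not pull
traffic back; only the manual reset does.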

Here's the complicated part.  The servers also accept SSL for user
authentication.  I need http and ssl to fail over to S2 as a pair: both
or neither.  We recently had application response issues where S1 was
very slow to respond, and http failed over but ssl did not.  This broke
customers' ability to authenticate.  We're terminating ssl on the
ServerIron 4Gs and talking plaintext on port 443 to the server.  Apache
is listening on 80 and 443, expecting plaintext on both.

We've played with boolean health checks, but I haven't implemented 
them.  I'm concerned about separate health checks because nothing 
different is being tested.  It seems like a race condition would still 
exist where http would fail over, but a second later the ssl check would 
pass.  What we need is for a single failed health check to down the 
server and fail all services over to the backup.
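
To make the race concrete, here is a small Python sketch comparing
per-service decisions with a single shared decision.  It is only an
illustration of the logic, not ServerIron syntax; the function names and
the probe dictionary are assumptions for the example.

```python
def route_independent(check_results):
    """Each service fails over on its own check result."""
    return {svc: ("S1" if ok else "S2") for svc, ok in check_results.items()}


def route_coupled(check_results):
    """Any failed check moves every service to the backup together."""
    target = "S1" if all(check_results.values()) else "S2"
    return {svc: target for svc in check_results}


# S1 is slow: the http probe timed out, but the ssl probe a moment
# later happened to pass.
probes = {"http": False, "ssl": True}

print(route_independent(probes))  # {'http': 'S2', 'ssl': 'S1'}
print(route_coupled(probes))      # {'http': 'S2', 'ssl': 'S2'}
```

The first result is the split we saw in production; the second is what
I'm after.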

If it makes things easier, we can just skip the ssl test.  A single test 
for http is adequate for making sure apache is responding on the server.


Suggestions very welcome.

--- David






