[c-nsp] PIX 6.3.3 Failover Problem

ChrisSerafin chris at chrisserafin.com
Tue Jan 27 10:47:04 EST 2009


I have a pair of PIX 525s running 6.3.3 (old, I know; downtime constraints 
are preventing an upgrade or hardware swap-out) that just started 
throwing failover errors.

I saw this in the logs from the time of the failure:
Jan 23 15:39:33 elm-pix-1 Jan 23 2009 15:39:33: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 23 15:39:34 elm-pix-1 Jan 23 2009 15:39:34: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 23 15:39:34 elm-pix-1 Jan 23 2009 15:39:34: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 23 15:39:35 elm-pix-1 Jan 23 2009 15:39:35: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 23 15:39:49 elm-pix-2 Jan 23 2009 15:39:49: %PIX-1-709006: 
(Secondary) End Configuration Replication (STB)
Jan 23 15:39:49 elm-pix-1 Jan 23 2009 15:39:49: %PIX-1-709004: (Primary) 
End Configuration Replication (ACT)
Jan 23 15:41:30 elm-pix-1 Jan 23 2009 15:41:30: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 23 15:41:44 elm-pix-2 Jan 23 2009 15:41:44: %PIX-1-709006: 
(Secondary) End Configuration Replication (STB)
Jan 23 15:41:44 elm-pix-1 Jan 23 2009 15:41:44: %PIX-1-709004: (Primary) 
End Configuration Replication (ACT)
Jan 23 18:26:34 elm-pix-2 Jan 23 2009 18:26:34: %PIX-1-105005: 
(Secondary) Lost Failover communications with mate on interface 1
Jan 23 18:26:34 elm-pix-2 Jan 23 2009 18:26:34: %PIX-1-105008: 
(Secondary) Testing Interface 1
Jan 23 18:26:45 elm-pix-1 Jan 23 2009 18:26:45: %PIX-1-103005: (Primary) 
Other firewall reporting failure.

Then after getting to the unit and unplugging and reconnecting the 
failover cable, I saw this:
Jan 27 07:25:36 elm-pix-1 Jan 27 2009 07:25:36: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 27 07:25:50 elm-pix-2 Jan 27 2009 07:25:50: %PIX-1-709006: 
(Secondary) End Configuration Replication (STB)
Jan 27 07:25:50 elm-pix-1 Jan 27 2009 07:25:50: %PIX-1-709004: (Primary) 
End Configuration Replication (ACT)
Jan 27 09:20:47 elm-pix-2 Jan 27 2009 09:20:47: %PIX-1-101004: 
(Secondary) Failover cable not connected (other unit)
Jan 27 09:20:51 elm-pix-1 Jan 27 2009 09:20:51: %PIX-1-101003: 
(Secondary) Failover cable not connected (this unit)
*Jan 27 09:21:17 elm-pix-2 Jan 27 2009 09:21:17: %PIX-1-101001: 
(Secondary) Failover cable OK.
Jan 27 09:21:21 elm-pix-1 Jan 27 2009 09:21:21: %PIX-1-101001: (Primary) 
Failover cable OK.*
Jan 27 09:21:37 elm-pix-1 Jan 27 2009 09:21:37: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 27 09:21:51 elm-pix-2 Jan 27 2009 09:21:51: %PIX-1-709006: 
(Secondary) End Configuration Replication (STB)
Jan 27 09:21:51 elm-pix-1 Jan 27 2009 09:21:51: %PIX-1-709004: (Primary) 
End Configuration Replication (ACT)
Jan 27 09:23:37 elm-pix-1 Jan 27 2009 09:23:37: %PIX-1-709003: (Primary) 
Beginning configuration replication: Send to mate.
Jan 27 09:23:51 elm-pix-2 Jan 27 2009 09:23:51: %PIX-1-709006: 
(Secondary) End Configuration Replication (STB)
Jan 27 09:23:51 elm-pix-1 Jan 27 2009 09:23:51: %PIX-1-709004: (Primary) 
End Configuration Replication (ACT)
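
To make the timing in these excerpts easier to read, here is a minimal Python sketch that rejoins the wrapped syslog lines and pairs each 709003 (begin replication) with the next 709004 (end replication on the active unit). The regex, the function names and the SAMPLE text are illustrative assumptions based only on the log format shown above, not on any Cisco tool.

```python
import re
from datetime import datetime

# Start of one record as it appears in the excerpts above, e.g.
# "elm-pix-1 Jan 27 2009 09:21:37: %PIX-1-709003: (Primary)"
RECORD = re.compile(
    r"(?P<host>\S+) (?P<stamp>\w{3} +\d+ \d{4} \d\d:\d\d:\d\d): "
    r"%PIX-(?P<sev>\d)-(?P<msgid>\d{6}): \((?P<role>\w+)\)"
)

def parse(text):
    """Return (time, host, msgid, role) tuples, tolerating wrapped lines."""
    joined = " ".join(text.split())  # undo mail-client line wrapping
    events = []
    for m in RECORD.finditer(joined):
        when = datetime.strptime(m["stamp"], "%b %d %Y %H:%M:%S")
        events.append((when, m["host"], m["msgid"], m["role"]))
    return events

def replication_runs(events):
    """Pair the first 709003 of a run with the next 709004; return (start, seconds)."""
    runs, started = [], None
    for when, _host, msgid, _role in events:
        if msgid == "709003" and started is None:
            started = when
        elif msgid == "709004" and started is not None:
            runs.append((started, int((when - started).total_seconds())))
            started = None
    return runs

# Sample copied from the excerpt above (hypothetical variable name).
SAMPLE = """\
Jan 27 09:21:37 elm-pix-1 Jan 27 2009 09:21:37: %PIX-1-709003: (Primary)
Beginning configuration replication: Send to mate.
Jan 27 09:21:51 elm-pix-2 Jan 27 2009 09:21:51: %PIX-1-709006:
(Secondary) End Configuration Replication (STB)
Jan 27 09:21:51 elm-pix-1 Jan 27 2009 09:21:51: %PIX-1-709004: (Primary)
End Configuration Replication (ACT)
Jan 27 09:23:37 elm-pix-1 Jan 27 2009 09:23:37: %PIX-1-709003: (Primary)
Beginning configuration replication: Send to mate.
Jan 27 09:23:51 elm-pix-2 Jan 27 2009 09:23:51: %PIX-1-709006:
(Secondary) End Configuration Replication (STB)
Jan 27 09:23:51 elm-pix-1 Jan 27 2009 09:23:51: %PIX-1-709004: (Primary)
End Configuration Replication (ACT)
"""

if __name__ == "__main__":
    for start, seconds in replication_runs(parse(SAMPLE)):
        print(f"replication started {start:%H:%M:%S}, took {seconds} s")
```

On the 09:21 and 09:23 entries this reports 14 seconds per run, with the two runs starting exactly two minutes apart.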

I can then run 'wr standby' on the primary, but I do NOT see the 
'starting to sync' message. The 'sh failover' output is below, followed 
by the failover config:
ELM-PIX525-1(config)# sh fail
Failover On
Cable status: Normal
Reconnect timeout 0:00:00
Poll frequency 15 seconds
failover replication http
Last Failover at: 09:14:49 CST Fri Mar 28 2008
        This host: Primary - Active
                Active time: 26707815 (sec)
                Interface outside (65.166.254.2): Normal
                Interface inside (10.200.1.249): Normal
                Interface EDMZ1 (172.30.1.1): Normal
                Interface EDMZ2 (0.0.0.0): Link Down (Shutdown)
                Interface MGT (10.200.1.125): Link Down (Waiting)
                Interface intf5 (172.27.0.1): Normal
        Other host: Secondary - Standby (Failed)
                Active time: 0 (sec)
                Interface outside (65.166.254.3): Normal
                Interface inside (10.200.1.250): Normal
                Interface EDMZ1 (172.30.1.3): Normal
                Interface EDMZ2 (172.31.1.3): Link Down (Shutdown)
                Interface MGT (10.200.1.126): Link Down (Waiting)
                Interface intf5 (172.27.0.2): Normal
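
For comparing the two units at a glance, a small Python sketch along these lines could pull the unit state and per-interface status out of the 'sh fail' text. It assumes the layout shown above ("This host:" / "Other host:" followed by indented "Interface name (ip): status" lines); the function names and the shortened SAMPLE are hypothetical.

```python
import re

# Line shapes taken from the 'sh fail' output above.
HOST = re.compile(r"(This|Other) host: (\S+) - (.+)")
IFACE = re.compile(r"Interface (\S+) \(([\d.]+)\): (.+)")

def failover_summary(text):
    """Map 'This'/'Other' to role, state and {interface: (ip, status)}."""
    units, current = {}, None
    for line in text.splitlines():
        line = line.strip()
        if (m := HOST.match(line)):
            current = m.group(1)
            units[current] = {"role": m.group(2), "state": m.group(3),
                              "interfaces": {}}
        elif current and (m := IFACE.match(line)):
            units[current]["interfaces"][m.group(1)] = (m.group(2), m.group(3))
    return units

def not_normal(units):
    """List (unit, interface, status) for every interface not reported Normal."""
    return [(unit, name, status)
            for unit, info in units.items()
            for name, (_ip, status) in info["interfaces"].items()
            if status != "Normal"]

# Shortened copy of the output above (hypothetical variable name).
SAMPLE = """\
        This host: Primary - Active
                Active time: 26707815 (sec)
                Interface inside (10.200.1.249): Normal
                Interface MGT (10.200.1.125): Link Down (Waiting)
        Other host: Secondary - Standby (Failed)
                Active time: 0 (sec)
                Interface inside (10.200.1.250): Normal
                Interface MGT (10.200.1.126): Link Down (Waiting)
"""

if __name__ == "__main__":
    summary = failover_summary(SAMPLE)
    for unit, info in summary.items():
        print(unit, info["role"], info["state"])
    for row in not_normal(summary):
        print(row)
```

This only restates what the output already says; it is meant to make it easy to diff the two units when the output is captured repeatedly.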

failover
failover timeout 0:00:00
failover poll 15
failover replication http
failover ip address outside xx.xx.254.3
failover ip address inside 10.200.1.250
failover ip address EDMZ1 172.30.1.3
failover ip address EDMZ2 172.31.1.3
failover ip address MGT 10.200.1.126
failover ip address intf5 172.27.0.2
failover link intf5




Thanks for any help....

Chris Serafin
Chris at chrisserafin.com








