[c-nsp] PIX 6.3.3 Failover Problem (Solved)
ChrisSerafin
chris at chrisserafin.com
Tue Jan 27 17:36:41 EST 2009
This resolved the issue:
- Use the failover reset command on the primary (active) unit to recover
the standby from the failed state.
http://cisco.com/en/US/docs/security/pix/pix63/command/reference/df.html#wp1029143
- Reload the standby unit if failover reset does not help.
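
For anyone who wants to check for this condition from a script before
reaching for failover reset, here is a rough Python sketch. It only looks
at the "This host:" / "Other host:" lines of 'sh failover' output like the
one quoted below and flags a standby marked (Failed). The function names
and the sample text are my own illustration, not anything from Cisco, and
it assumes the output format matches what this 6.3.3 pair printed.

```python
import re

# Matches the per-unit header lines seen in the quoted 'sh failover' output, e.g.
#   "Other host: Secondary - Standby (Failed)"
# Assumption: other PIX releases may format these lines differently.
HOST_LINE = re.compile(
    r"^(This|Other) host:\s*(Primary|Secondary)\s*-\s*(\S+)(?:\s*\((\w+)\))?"
)


def parse_failover_units(text):
    """Return {'this': {...}, 'other': {...}} parsed from 'sh failover' text."""
    units = {}
    for raw in text.splitlines():
        # Tolerate mail-quoted lines ("> ...") and leading indentation.
        line = raw.lstrip("> ").rstrip()
        match = HOST_LINE.match(line)
        if match:
            which, role, state, flag = match.groups()
            units[which.lower()] = {
                "role": role,
                "state": state,
                "failed": flag == "Failed",
            }
    return units


def standby_needs_reset(text):
    """True when the peer unit is reported as (Failed)."""
    other = parse_failover_units(text).get("other")
    return bool(other and other["failed"])


# Hypothetical sample, trimmed from the output quoted in this thread.
SAMPLE = """\
This host: Primary - Active
Active time: 26707815 (sec)
Other host: Secondary - Standby (Failed)
Active time: 0 (sec)
"""

if __name__ == "__main__":
    if standby_needs_reset(SAMPLE):
        print("Standby is in the failed state; try 'failover reset' on the active unit.")
```

The script only reads text; it does not log in to the PIX or issue any
commands, so the reset (or the reload of the standby) is still a manual step.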
ChrisSerafin wrote:
> I have a pair of PIX 525s running 6.3.3 (old, I know; downtime is
> preventing an upgrade or hardware swap-out) that just decided to start
> throwing failover errors.
>
> I saw this in the logs from the time of the failure:
> Jan 23 15:39:33 elm-pix-1 Jan 23 2009 15:39:33: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 23 15:39:34 elm-pix-1 Jan 23 2009 15:39:34: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 23 15:39:34 elm-pix-1 Jan 23 2009 15:39:34: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 23 15:39:35 elm-pix-1 Jan 23 2009 15:39:35: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 23 15:39:49 elm-pix-2 Jan 23 2009 15:39:49: %PIX-1-709006:
> (Secondary) End Configuration Replication (STB)
> Jan 23 15:39:49 elm-pix-1 Jan 23 2009 15:39:49: %PIX-1-709004:
> (Primary) End Configuration Replication (ACT)
> Jan 23 15:41:30 elm-pix-1 Jan 23 2009 15:41:30: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 23 15:41:44 elm-pix-2 Jan 23 2009 15:41:44: %PIX-1-709006:
> (Secondary) End Configuration Replication (STB)
> Jan 23 15:41:44 elm-pix-1 Jan 23 2009 15:41:44: %PIX-1-709004:
> (Primary) End Configuration Replication (ACT)
> Jan 23 18:26:34 elm-pix-2 Jan 23 2009 18:26:34: %PIX-1-105005:
> (Secondary) Lost Failover communications with mate on interface 1
> Jan 23 18:26:34 elm-pix-2 Jan 23 2009 18:26:34: %PIX-1-105008:
> (Secondary) Testing Interface 1
> Jan 23 18:26:45 elm-pix-1 Jan 23 2009 18:26:45: %PIX-1-103005:
> (Primary) Other firewall reporting failure.
>
> Then after getting to the unit and unplugging and reconnecting the
> failover cable, I saw this:
> Jan 27 07:25:36 elm-pix-1 Jan 27 2009 07:25:36: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 27 07:25:50 elm-pix-2 Jan 27 2009 07:25:50: %PIX-1-709006:
> (Secondary) End Configuration Replication (STB)
> Jan 27 07:25:50 elm-pix-1 Jan 27 2009 07:25:50: %PIX-1-709004:
> (Primary) End Configuration Replication (ACT)
> Jan 27 09:20:47 elm-pix-2 Jan 27 2009 09:20:47: %PIX-1-101004:
> (Secondary) Failover cable not connected (other unit)
> Jan 27 09:20:51 elm-pix-1 Jan 27 2009 09:20:51: %PIX-1-101003:
> (Secondary) Failover cable not connected (this unit)
> *Jan 27 09:21:17 elm-pix-2 Jan 27 2009 09:21:17: %PIX-1-101001:
> (Secondary) Failover cable OK.
> Jan 27 09:21:21 elm-pix-1 Jan 27 2009 09:21:21: %PIX-1-101001:
> (Primary) Failover cable OK.*
> Jan 27 09:21:37 elm-pix-1 Jan 27 2009 09:21:37: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 27 09:21:51 elm-pix-2 Jan 27 2009 09:21:51: %PIX-1-709006:
> (Secondary) End Configuration Replication (STB)
> Jan 27 09:21:51 elm-pix-1 Jan 27 2009 09:21:51: %PIX-1-709004:
> (Primary) End Configuration Replication (ACT)
> Jan 27 09:23:37 elm-pix-1 Jan 27 2009 09:23:37: %PIX-1-709003:
> (Primary) Beginning configuration replication: Send to mate.
> Jan 27 09:23:51 elm-pix-2 Jan 27 2009 09:23:51: %PIX-1-709006:
> (Secondary) End Configuration Replication (STB)
> Jan 27 09:23:51 elm-pix-1 Jan 27 2009 09:23:51: %PIX-1-709004:
> (Primary) End Configuration Replication (ACT)
>
> I can then do a 'wr standby' on the primary, BUT I DO NOT see the
> 'starting to sync' message, and I get the output below from
> 'sh failover'. The failover config follows as well:
> ELM-PIX525-1(config)# sh fail
> Failover On
> Cable status: Normal
> Reconnect timeout 0:00:00
> Poll frequency 15 seconds
> failover replication http
> Last Failover at: 09:14:49 CST Fri Mar 28 2008
> This host: Primary - Active
> Active time: 26707815 (sec)
> Interface outside (65.166.254.2): Normal
> Interface inside (10.200.1.249): Normal
> Interface EDMZ1 (172.30.1.1): Normal
> Interface EDMZ2 (0.0.0.0): Link Down (Shutdown)
> Interface MGT (10.200.1.125): Link Down (Waiting)
> Interface intf5 (172.27.0.1): Normal
> Other host: Secondary - Standby (Failed)
> Active time: 0 (sec)
> Interface outside (65.166.254.3): Normal
> Interface inside (10.200.1.250): Normal
> Interface EDMZ1 (172.30.1.3): Normal
> Interface EDMZ2 (172.31.1.3): Link Down (Shutdown)
> Interface MGT (10.200.1.126): Link Down (Waiting)
> Interface intf5 (172.27.0.2): Normal
>
> failover
> failover timeout 0:00:00
> failover poll 15
> failover replication http
> failover ip address outside xx.xx.254.3
> failover ip address inside 10.200.1.250
> failover ip address EDMZ1 172.30.1.3
> failover ip address EDMZ2 172.31.1.3
> failover ip address MGT 10.200.1.126
> failover ip address intf5 172.27.0.2
> failover link intf5
>
> Thanks for any help....
>
> Chris Serafin
> Chris at chrisserafin.com