[j-nsp] Odd behaviour testing failover

Jared Gull jmgull at yahoo.com
Wed Sep 7 23:41:32 EDT 2005


Gordon,

I haven't done much GRES testing in the 7.3 release,
but I did in earlier releases and never saw this
behavior.  Have you been able to reproduce this, or was
it seen only once?  If it happened only once, I
would say the HDD does sound suspect.

Jared

--- Gordon Smith <gsmith at wxc.co.nz> wrote:

> Hi Jared,
> 
> Yes to everything  :-)
> 
> REs are both RE2.0
> BIOS is version 1.2
> 
> Cheers,
> Gordon
> 
> 
> > -----Original Message-----
> > From: Jared Gull [mailto:jmgull at yahoo.com] 
> > Sent: Thursday, 8 September 2005 3:06 p.m.
> > To: Gordon Smith; juniper-nsp at puck.nether.net
> > Subject: Re: [j-nsp] Odd behaviour testing failover
> > 
> > Gordon,
> > 
> > Are both REs running the same version?  Also, are the REs the
> > same revision (i.e. RE2.0/3.0)?
> > 
> > Jared
> > 
> > --- Gordon Smith <gsmith at wxc.co.nz> wrote:
> > 
> > > Hi all,
> > > 
> > > I'm seeing odd RE behaviour on an M20 while testing RE failover.
> > > From the log entries, I'm guessing that a snapshot of the current route
> > > table entries gets tarballed and passed to the second RE to use while
> > > the control plane re-establishes adjacencies with other devices.
> > > 
> > > Problem is, the tarball can't be found by the RE, and transit traffic
> > > fails until it re-meshes.
> > > 
> > > Has anyone come across this before?
> > > If so, how do I get this to behave? Graceful failover is enabled on
> > > this box (JUNOS 7.3R1.6)
> > > 
> > > Also seeing kernel page faults in the logs on the backup RE during
> > > failover. Hard disk failing?
> > > 
> > > 
> > > 
> > > Sep  8 14:36:35  jcore2 chassisd[74828]: CHASSISD_SNMP_TRAP7: SNMP trap generated: Fru Online (jnxFruContentsIndex 9, jnxFruL1Index 2, jnxFruL2Index 0, jnxFruL3Index 0, jnxFruName Routing Engine 1, jnxFruType 6, jnxFruSlot 2)
> > > Sep  8 14:36:35  jcore2 craftd[2599]: attempt to delete alarm not in list
> > > Sep  8 14:36:35  jcore2 craftd[2599]: forwarding display request to chassisd: type = 4, subtype = 44
> > > Sep  8 14:36:41  jcore2 rshd[75266]: root@re1 as root: cmd='rcp -T -t /var/db/dcd.snmp_ix+'
> > > Sep  8 14:36:42  jcore2 rshd[75298]: root@re1 as root: cmd='mv /var/db/dcd.snmp_ix+ /var/db/dcd.snmp_ix'
> > > Sep  8 14:36:59  jcore2 dumpd: Core and context for rpd saved in /var/tmp/rpd.core-tarball.4.tgz
> > > Sep  8 14:37:00  jcore2 dumpd: tar: rpd.info.4: Cannot stat: No such file or directory tar: Error exit delayed from previous errors
> > > Sep  8 14:37:00  jcore2 dumpd: Unable to create core tarball /var/tmp/rpd.core-tarball.4.tgz
> > > Sep  8 14:37:00  jcore2 dumpd: tar: rpd.info.4: Cannot stat: No such file or directory tar: Error exit delayed from previous errors
> > > Sep  8 14:37:00  jcore2 dumpd: Unable to create core tarball /var/tmp/rpd.core-tarball.4.tgz
> > > 
> > > 
> > > 
> > > Sep  8 14:36:24  jcore2 /kernel: Trapframe Register Dump:
> > > Sep  8 14:36:24  jcore2 /kernel: eax: 00000000   ecx: 085cf000   edx: 085fabb0   ebx: 00000012
> > > Sep  8 14:36:24  jcore2 /kernel: esp: bfbff800   ebp: bfbffc28   esi: 085c63a0   edi: 0864c000
> > > Sep  8 14:36:24  jcore2 /kernel: eip: 0812f706   eflags: 00010206
> > > Sep  8 14:36:24  jcore2 /kernel: cs: 001f   ss: 002f   ds: bfbf002f   es: 8868002f
> > > Sep  8 14:36:24  jcore2 /kernel: fs: 864002f   trapno: 0000000c   err: 00000004
> > > Sep  8 14:36:25  jcore2 /kernel: Page table info for PC address 0x812f706: PDE = 0x189a5067, PTE = 2c65425
> > > Sep  8 14:36:25  jcore2 /kernel: Dumping 16 bytes starting at PC address 0x812f706:
> > > Sep  8 14:36:25  jcore2 /kernel: 80 b8 10 02 00 00 00 75 0d 80 bd e7 fb ff ff 03
> > > Sep  8 14:36:25  jcore2 /kernel: BAD_PAGE_FAULT: pid 62594 (rpd), uid 0: pc 0x812f706 got a read fault at 0x210, x86 fault flags = 0x4
> > > 
> > > 
> > > _______________________________________________
> > > juniper-nsp mailing list juniper-nsp at puck.nether.net
> > > http://puck.nether.net/mailman/listinfo/juniper-nsp
> > > 



	
		

