[j-nsp] Routing stopped due to failed hard disk?

C. Hagel nanog at lordkron.net
Sun May 8 17:00:05 EDT 2005


	I have seen this before. Unfortunately, there isn't much you can
do to prevent it.  Looking at the messages you pasted, they only say
that the HDD failed a SMART test. I really doubt that it had anything to
do with running "mirror-flash-on-disk".  I know of quite a few companies
that are doing this, and it isn't a cause of this type of problem.  I am
shocked that the HDD failure took down the entire router. The one case
where I could see an HDD crash causing a router crash is if the router
was running from "alternate-media", or to be more specific, from the HDD.
	I would suggest you work with JTAC to get the RE RMA'd to
replace that faulty HDD.
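
	As a rough way to catch this kind of warning earlier, a saved
copy of the messages log could be scanned for smartd lines like the two
you pasted. The Python below is only a sketch: the regex is modeled on
those two lines and nothing else, and the names find_smartd_errors and
ERROR_HINTS are made up for illustration, not anything Junos provides.

```python
import re

# Matches syslog lines of the shape seen in the quoted log, e.g.
#   "May  6 01:46:15 maribor2-lo0.ipv4 smartd[2597]: <message>"
SMARTD_LINE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2}\s\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+smartd\[(?P<pid>\d+)\]:\s+(?P<msg>.*)$"
)

# Assumption: substrings treated as signs of trouble. Both are taken
# from the two log lines in this thread; extend as needed.
ERROR_HINTS = ("Resource temporarily unavailable", "Non zero return")


def find_smartd_errors(lines):
    """Return (timestamp, host, message) for smartd lines that look like errors."""
    hits = []
    for line in lines:
        m = SMARTD_LINE.match(line.strip())
        if m and any(hint in m.group("msg") for hint in ERROR_HINTS):
            hits.append((m.group("ts"), m.group("host"), m.group("msg")))
    return hits
```

Feeding it the lines of a log file (for example, list(open(path)) on
a copy pulled off the router) returns one tuple per suspicious smartd
entry, which could then be turned into an alert before the disk goes
away completely.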



On Fri, 6 May 2005 07:47:02 +0200 (CEST)
Blaz Zupan <blaz at inlimbo.org> wrote:

BZ> We had a complete catastrophe last night. All routing through one of the main 
BZ> POPs stopped. Rebooting the M10i that is servicing the POP fixed the problem, 
BZ> but after the reboot it seems like the hard disk is no longer recognized and 
BZ> the system is running purely off of flash:
BZ> 
BZ> ad0: 245MB <SanDisk SDCFB-256> [980/16/32] at ata0-master using PIO4
BZ> rd0: ATA SW-RAID configuring 1 subdisks
BZ> rd0: mirrordisk #ad/0x1000a not found, mirroring disabled
BZ> rd0: stripe 0: subdisk rad0 mirrordisk -
BZ> Mounting root from ufs:/dev/rd0s1a
BZ> 
BZ> We have "system mirror-flash-on-disk" configured on all boxes. In retrospect 
BZ> this seems like a bad idea because apparently this caused the trouble. Looking 
BZ> at our logs, I can see this:
BZ> 
BZ> May  6 01:46:15 maribor2-lo0.ipv4 smartd[2597]: atareadsmartvalues: ioctl: Resource temporarily unavailable
BZ> May  6 01:46:15 maribor2-lo0.ipv4 smartd[2597]: checkdevices: Non zero return from atacheckdevice
BZ> 
BZ> Ten minutes later, everything stopped. So apparently the hard disk failed 
BZ> and took the whole box with it, which is not good. Unfortunately, the box 
BZ> currently has no redundancy other than the power supply; in particular, it 
BZ> has no redundant routing engine (not my decision).
BZ> 
BZ> Has anybody seen something like this? Is mirror-flash-on-disk a good idea?
BZ> 
BZ> Blaz Zupan,  Medinet d.o.o, Trzaska 85, SI-2000 Maribor, Slovenia
BZ> E-mail: blaz at amis.net, Tel: +386 2 320 6320, Fax: +386 2 320 6325
BZ> _______________________________________________
BZ> juniper-nsp mailing list juniper-nsp at puck.nether.net
BZ> http://puck.nether.net/mailman/listinfo/juniper-nsp


-- 
C. Hagel           <nanog at lordkron.net>
JNCIP #103
--



