[cisco-voip] 7828H3 freezing up issue - manual reboot needed

Lelio Fulgenzi lelio at uoguelph.ca
Thu Feb 3 14:54:39 EST 2011


i've always been wary of starting up a server with two inconsistent drives in place. it's likely due to my ignorance of how the system decides which drive to mirror onto which. 

if it were me, i would shut the system down, pull out the bad drive, power it up, answer the prompts that mark the drive bad but not the array, and then insert the drive after the system has come up. then the system knows darn well that you want to rebuild from the running drive. 

that's just me. it's what we did for years and we never had a problem. the drives are hot pluggable, so as far as i know it's a supported mechanism for drive repair. not for "backing up" though. ;) 

--- 
Lelio Fulgenzi, B.A. 
Senior Analyst (CCS) * University of Guelph * Guelph, Ontario N1G 2W1 
(519) 824-4120 x56354 (519) 767-1060 FAX (JNHN) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 
Cooking with unix is easy. You just sed it and forget it. 
- LFJ (with apologies to Mr. Popeil) 


----- Original Message -----
From: "Erick B." <erickbee at gmail.com> 
To: "Wes Sisk" <wsisk at cisco.com> 
Cc: "voip puck" <cisco-voip at puck.nether.net> 
Sent: Thursday, February 3, 2011 2:48:49 PM 
Subject: Re: [cisco-voip] 7828H3 freezing up issue - manual reboot needed 

Regarding the hard drive swap, on this model and version of CUCM... 

TAC is telling us to power down the server, swap the HD, and power 
it back on, and it will rebuild on its own. 

Just not finding any concrete cisco docs that say that, and finding 
some other discussions that say a single drive failure may end up in a 
rebuild. 

The show hardware output shows the questionable drive is always verifying... 
and S.M.A.R.T. says No on one of the drives, so something is up with drive 
1. 

---- 

admin:show hardware 

HW Platform : 7828H3 
Processors : 1 
Type : Intel(R) Xeon(R) CPU 3050 @ 2.13GHz 
CPU Speed : 2133 
Memory : 6144 MBytes 
Object ID : 1.3.6.1.4.1.9.1.901 
OS Version : UCOS 4.0.0.0-9 

RAID Version : 
RAID Firmware Version: N/A 
RAID BIOS Version: Not supported 

BIOS Information : 
Vendor: HP 
Version: W04 
Release Date: 06/10/2008 

RAID Details : 
Controllers found: 1 

---------------------------------------------------------------------- 
Controller information 
---------------------------------------------------------------------- 
Controller Status : OK 
Channel description : SATA 
Defunct disk drive count : 0 
Logical drives/Failed/Degraded : 1/0/0 

---------------------------------------------------------------------- 
Logical drive information 
---------------------------------------------------------------------- 
Logical drive number 1 
Logical drive name : Device 1 
Status of logical drive : Optimal 
RAID level : 1 
Size : 238290 MB 
Number of chunks : 2 
Drive(s) (Channel,Device) : 0,1 0,2 

---------------------------------------------------------------------- 
Physical device information 
---------------------------------------------------------------------- 
Channel #0: 
Transfer Speed : SATA 1.5 Gb/s 
Device #1 
Device is a Hard drive 
State : Online 
Transfer Speed : SATA 1.5 Gb/s 
Vendor : GB0250C8 
Model : 045 
Firmware : HPG6 
Serial number : 9SF0RKMW 
Size : 238418 MB 
Write Cache : Enabled (write-back) 
FRU : none 
S.M.A.R.T. : No 
Device #2 
Device is a Hard drive 
State : Online 
Transfer Speed : SATA 1.5 Gb/s 
Vendor : GB0250C8 
Model : 045 
Firmware : HPG6 
Serial number : 9SF0RLNW 
Size : 238418 MB 
Write Cache : Enabled (write-back) 
FRU : none 
S.M.A.R.T. : Yes 

Command completed successfully. 
Controllers found: 1 

Logical drive Task: 
Logical drive : 1 
Current operation : Verify 
Status : In Progress 
Percentage complete : 42 
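
For anyone who wants to track this over time, here is a minimal sketch that pulls the per-device S.M.A.R.T. value and the logical drive task fields out of saved `show hardware` text. It assumes the field labels are exactly as pasted above; the function names `smart_by_device` and `task_fields` are made up for illustration, and the sketch only reports the values without interpreting what "No" means on this controller.

```python
import re

# Trimmed excerpt of the `show hardware` output pasted in this message.
SAMPLE = """\
Device #1
Device is a Hard drive
State : Online
Serial number : 9SF0RKMW
S.M.A.R.T. : No
Device #2
Device is a Hard drive
State : Online
Serial number : 9SF0RLNW
S.M.A.R.T. : Yes

Logical drive Task:
Logical drive : 1
Current operation : Verify
Status : In Progress
Percentage complete : 42
"""


def smart_by_device(text):
    """Map each physical device number to its reported S.M.A.R.T. value."""
    result = {}
    device = None
    for raw in text.splitlines():
        line = raw.strip()
        header = re.match(r"Device #(\d+)$", line)
        if header:
            device = int(header.group(1))
            continue
        smart = re.match(r"S\.M\.A\.R\.T\.\s*:\s*(\S+)", line)
        if smart and device is not None:
            result[device] = smart.group(1)
    return result


def task_fields(text):
    """Return the key/value lines that follow 'Logical drive Task:'."""
    _, found, task = text.partition("Logical drive Task:")
    fields = {}
    if not found:
        return fields
    for raw in task.splitlines():
        key, colon, value = raw.partition(":")
        if colon:
            fields[key.strip()] = value.strip()
    return fields


print(smart_by_device(SAMPLE))
print(task_fields(SAMPLE))
```

Running it against output captured at intervals would show whether the Verify percentage ever completes or keeps restarting.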



On Wed, Feb 2, 2011 at 7:14 PM, Erick B. <erickbee at gmail.com> wrote: 
> Right, I am familiar with the IBM fw issue with this type of error. 
> 
> I just found out that an RTMT alert was raised for a 
> hardware failure on one of the drives (S.M.A.R.T.), so we are going to 
> replace that HD. The server only froze sometimes (this error didn't 
> happen all the time). 
> 
> Thanks for the feedback, as always Wes. 
> 
> On Wed, Feb 2, 2011 at 6:11 PM, Wes Sisk <wsisk at cisco.com> wrote: 
>> This is similar but distinctly separate from CSCti52867. In that 
>> investigation we learned that linux marks a file system read only if disk 
>> i/o is unresponsive even for very short amounts of time. Under windows the 
>> disk queues back up and eventually clear. Under linux the filesystem is 
>> re-mounted as readonly. 
>> 
>> So far we know this indicates a delay in disk access. In CSCti52867 with 
>> IBM servers that was due to a specific issue on a specific hard drive. 
>> 
>> Regards, 
>> Wes 
>> 
>> 
>> On 2/2/2011 6:45 PM, Erick B. wrote: 
>>> 
>>> Has anyone run into this? TAC believes it is a bad HD, which we're working 
>>> on, and I found a previous discussion mentioning bug id CSCsm25875, 
>>> maybe. But just checking here in case anyone else has run into this. 
>>> 
>>> The server is a CUCMBE with drives at firmware version HPG6, so it is 
>>> higher than the version with the FW issue. 
>>> 
>>> CUCM version is 7.0.2.20000-5 
>>> 
>>> What happens is the server is running, then phones stop working, 
>>> the web pages don't respond, and only a few commands work on the SSH CLI, 
>>> such as show hardware and show status. 
>>> 
>>> If we try a utils system restart from the CLI, it fails and says the 
>>> appliance failed to restart. So the only way to get it back running is to 
>>> pull the power. 
>>> 
>>> There are no core files found. 
>>> 
>>> When you log in via SSH, you see this.... 
>>> 
>>> Last login: 
>>> 
>>> java.io.FileNotFoundException: /var/log/active/platform/log/cli.bin 
>>> (Read-only file system) 
>>> at java.io.RandomAccessFile.open(Native Method) 
>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212) 
>>> at 
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.restoreIndex(ciscoRollingFileAppender.java:100) 
>>> at 
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.setFile(ciscoRollingFileAppender.java:43) 
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
>>> at 
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) 
>>> at 
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) 
>>> at java.lang.reflect.Method.invoke(Method.java:585) 
>>> at 
>>> org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:196) 
>>> at 
>>> org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:155) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.setParameter(DOMConfigurator.java:530) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:182) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:140) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:153) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:415) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:384) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:783) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:666) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:616) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:584) 
>>> at 
>>> org.apache.log4j.xml.DOMConfigurator.configure(DOMConfigurator.java:687) 
>>> at sdMain.main(sdMain.java:511) 
>>> java.lang.NullPointerException 
>>> at 
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.updateIndex(ciscoRollingFileAppender.java:117) 
>>> at 
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.nextFileName(ciscoRollingFileAppender.java:92) 
>>> at 
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.append(ciscoRollingFileAppender.java:74) 
>>> at 
>>> org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221) 
>>> at 
>>> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57) 
>>> at org.apache.log4j.Category.callAppenders(Category.java:187) 
>>> at org.apache.log4j.Category.forcedLog(Category.java:372) 
>>> at org.apache.log4j.Category.info(Category.java:674) 
>>> at sdMain.main(sdMain.java:525) 
>>> log4j:ERROR No output stream or file set for the appender named [CLI_LOG]. 
>>> 
>>> 
>>> Thanks 
>>> _______________________________________________ 
>>> cisco-voip mailing list 
>>> cisco-voip at puck.nether.net 
>>> https://puck.nether.net/mailman/listinfo/cisco-voip 
>> 
> 
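
The read-only remount described in the quoted thread can be checked on a generic Linux host where you have shell access (which may not be the case on the appliance CLI). The sketch below assumes the standard `/proc/mounts` layout of device, mount point, filesystem type, and options; the sample text and the function name `read_only_mounts` are hypothetical and only there to illustrate the idea.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains the 'ro' flag."""
    result = []
    for line in mounts_text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        mount_point, options = parts[1], parts[3]
        # Compare whole options so that e.g. 'errors=remount-ro' is not a hit.
        if "ro" in options.split(","):
            result.append(mount_point)
    return result


# Hypothetical sample in /proc/mounts format, not taken from the server.
SAMPLE = """\
/dev/sda1 / ext3 rw,errors=remount-ro 0 0
/dev/sda2 /data ext3 ro 0 0
proc /proc proc rw 0 0
"""

print(read_only_mounts(SAMPLE))
```

On a real host the same function could be fed the contents of `/proc/mounts` instead of the sample string.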
