[cisco-voip] 7828H3 freezing up issue - manual reboot needed
Ryan Ratliff
rratliff at cisco.com
Thu Feb 3 14:57:41 EST 2011
I'm confident that "rebuild" in this case refers to the RAID array, not the server OS.
These drives are not hot-swappable, hence the power-down. A hot insertion will only work in specific circumstances (likely when the array has already declared the drive dead, IIRC). Anything else is not supported by the manufacturer, and therefore not by Cisco.
-Ryan
On Feb 3, 2011, at 2:48 PM, Erick B. wrote:
Regarding the hard drive swap on this model and version of CUCM:
TAC is telling us to power down the server, swap the HD, and power
it back on, and it will rebuild on its own.
I'm just not finding any concrete Cisco docs that say that, and I'm finding
some other discussions that say a single drive failure may end up in a
rebuild.
show hardware shows the questionable drive is always verifying,
and S.M.A.R.T. says No on one of the drives, so something is up with drive
1.
----
admin:show hardware
HW Platform : 7828H3
Processors : 1
Type : Intel(R) Xeon(R) CPU 3050 @ 2.13GHz
CPU Speed : 2133
Memory : 6144 MBytes
Object ID : 1.3.6.1.4.1.9.1.901
OS Version : UCOS 4.0.0.0-9
RAID Version :
RAID Firmware Version: N/A
RAID BIOS Version: Not supported
BIOS Information :
Vendor: HP
Version: W04
Release Date: 06/10/2008
RAID Details :
Controllers found: 1
----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
Controller Status : OK
Channel description : SATA
Defunct disk drive count : 0
Logical drives/Failed/Degraded : 1/0/0
----------------------------------------------------------------------
Logical drive information
----------------------------------------------------------------------
Logical drive number 1
Logical drive name : Device 1
Status of logical drive : Optimal
RAID level : 1
Size : 238290 MB
Number of chunks : 2
Drive(s) (Channel,Device) : 0,1 0,2
----------------------------------------------------------------------
Physical device information
----------------------------------------------------------------------
Channel #0:
Transfer Speed : SATA 1.5 Gb/s
Device #1
Device is a Hard drive
State : Online
Transfer Speed : SATA 1.5 Gb/s
Vendor : GB0250C8
Model : 045
Firmware : HPG6
Serial number : 9SF0RKMW
Size : 238418 MB
Write Cache : Enabled (write-back)
FRU : none
S.M.A.R.T. : No
Device #2
Device is a Hard drive
State : Online
Transfer Speed : SATA 1.5 Gb/s
Vendor : GB0250C8
Model : 045
Firmware : HPG6
Serial number : 9SF0RLNW
Size : 238418 MB
Write Cache : Enabled (write-back)
FRU : none
S.M.A.R.T. : Yes
Command completed successfully.
Controllers found: 1
Logical drive Task:
Logical drive : 1
Current operation : Verify
Status : In Progress
Percentage complete : 42
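
For comparing the two drives without reading the whole dump by eye, here is a minimal Python sketch that pulls the S.M.A.R.T. value for each "Device #N" block out of saved show hardware text. The function name smart_flags and the trimmed sample are made up for illustration; the sketch only reports the values as printed and does not assume what Yes or No means on this controller.

```python
import re

def smart_flags(show_hardware_text):
    """Map each 'Device #N' number to the value on its 'S.M.A.R.T.' line."""
    flags = {}
    current = None
    for raw in show_hardware_text.splitlines():
        line = raw.strip()
        device = re.match(r"Device #(\d+)$", line)
        if device:
            # Start of a new physical device block.
            current = int(device.group(1))
            continue
        smart = re.match(r"S\.M\.A\.R\.T\.\s*:\s*(\S+)", line)
        if smart and current is not None:
            flags[current] = smart.group(1)
    return flags

# Trimmed sample built from the output above (only a few fields kept).
sample = """\
Device #1
State : Online
Serial number : 9SF0RKMW
S.M.A.R.T. : No
Device #2
State : Online
Serial number : 9SF0RLNW
S.M.A.R.T. : Yes
"""

flags = smart_flags(sample)
for device, value in sorted(flags.items()):
    print(f"Device #{device}: S.M.A.R.T. = {value}")
```

Run against the full capture, this gives a quick side-by-side of the two mirror members, which is handy when checking the output again after a drive swap.
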
On Wed, Feb 2, 2011 at 7:14 PM, Erick B. <erickbee at gmail.com> wrote:
> Right, I am familiar with the IBM FW issue with this type of error.
>
> I just found out that an RTMT alert was raised for a
> hardware failure on one of the drives (S.M.A.R.T.), so we are going to
> replace that HD. The server froze only sometimes (this error didn't
> happen all the time).
>
> Thanks for the feedback, as always Wes.
>
> On Wed, Feb 2, 2011 at 6:11 PM, Wes Sisk <wsisk at cisco.com> wrote:
>> This is similar but distinctly separate from CSCti52867. In that
>> investigation we learned that Linux marks a file system read-only if disk
>> I/O is unresponsive even for very short amounts of time. Under Windows the
>> disk queues back up and eventually clear. Under Linux the filesystem is
>> re-mounted as read-only.
>>
>> So far we know this indicates a delay in disk access. In CSCti52867 with
>> IBM servers that was due to a specific issue on a specific hard drive.
>>
>> Regards,
>> Wes
>>
>>
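
The read-only remount described above can be spotted on a generic Linux host by looking at the mount options. The following is a minimal Python sketch that assumes shell access and the standard /proc/mounts layout (device, mount point, filesystem type, options); the function name read_only_mounts and the sample lines are hypothetical and not taken from this server.

```python
def read_only_mounts(mounts_text):
    """Return mount points whose option list contains 'ro'."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result

# Hypothetical sample in /proc/mounts format, for illustration only.
sample = (
    "/dev/sda1 / ext3 rw,relatime 0 0\n"
    "/dev/sda2 /common ext3 ro,relatime 0 0\n"
)

print(read_only_mounts(sample))
```

On a live host the input would come from open("/proc/mounts").read(). A filesystem that is normally writable showing up in the result would be consistent with the "(Read-only file system)" error in the trace below.
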
>> On 2/2/2011 6:45 PM, Erick B. wrote:
>>>
>>> Has anyone run into this? TAC believes it is a bad HD, which we're working
>>> on, and I found a previous discussion mentioning bug ID CSCsm25875,
>>> maybe. But just checking here in case anyone else has run into this.
>>>
>>> The server is a CUCMBE with drives at firmware version HPG6, so it is
>>> higher than the version with the FW issue.
>>>
>>> CUCM version is 7.0.2.20000-5
>>>
>>> What happens is the server is running, then phones stop working, the
>>> web pages don't respond, and only a few commands work on the SSH CLI, such
>>> as show hardware and show status.
>>>
>>> If we try a utils system restart from the CLI, it fails and says the
>>> appliance failed to restart. So the only way to get it back running is to
>>> pull the power.
>>>
>>> There are no core files found.
>>>
>>> When you log in via SSH, you see this:
>>>
>>> Last login:
>>>
>>> java.io.FileNotFoundException: /var/log/active/platform/log/cli.bin
>>> (Read-only file system)
>>> at java.io.RandomAccessFile.open(Native Method)
>>> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>>> at
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.restoreIndex(ciscoRollingFileAppender.java:100)
>>> at
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.setFile(ciscoRollingFileAppender.java:43)
>>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>> at
>>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>> at
>>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>> at java.lang.reflect.Method.invoke(Method.java:585)
>>> at
>>> org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:196)
>>> at
>>> org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:155)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.setParameter(DOMConfigurator.java:530)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:182)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:140)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:153)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:415)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:384)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:783)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:666)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:616)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:584)
>>> at
>>> org.apache.log4j.xml.DOMConfigurator.configure(DOMConfigurator.java:687)
>>> at sdMain.main(sdMain.java:511)
>>> java.lang.NullPointerException
>>> at
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.updateIndex(ciscoRollingFileAppender.java:117)
>>> at
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.nextFileName(ciscoRollingFileAppender.java:92)
>>> at
>>> com.cisco.iptplatform.fappend.ciscoRollingFileAppender.append(ciscoRollingFileAppender.java:74)
>>> at
>>> org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
>>> at
>>> org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
>>> at org.apache.log4j.Category.callAppenders(Category.java:187)
>>> at org.apache.log4j.Category.forcedLog(Category.java:372)
>>> at org.apache.log4j.Category.info(Category.java:674)
>>> at sdMain.main(sdMain.java:525)
>>> log4j:ERROR No output stream or file set for the appender named [CLI_LOG].
>>>
>>>
>>> Thanks
>>> _______________________________________________
>>> cisco-voip mailing list
>>> cisco-voip at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-voip
>>
>