[cisco-voip] 7828H3 freezing up issue - manual reboot needed

Ryan Ratliff rratliff at cisco.com
Thu Feb 3 14:57:41 EST 2011


I'm confident that "rebuild" in this case refers to the RAID array, not the server OS.

These drives are not hot-swappable, thus the power down.  A hot insertion will only work in specific circumstances (likely when the drive has been declared dead by the array, IIRC).  Anything else is not supported by the manufacturer, and thus not by Cisco.

-Ryan

On Feb 3, 2011, at 2:48 PM, Erick B. wrote:

Regarding the hard drive swap on this model and version of CUCM:

TAC is telling us to power down the server, swap the HD, and power
it back on, and that it will rebuild on its own.

I'm just not finding any concrete Cisco docs that say that, and I'm
finding some other discussions that say a single drive failure may end
up in a rebuild.

show hardware shows the questionable drive is always verifying, and
S.M.A.R.T. says No on one of the drives, so something is up with
drive 1.

----

admin:show hardware

HW Platform       : 7828H3
Processors        : 1
Type              : Intel(R) Xeon(R) CPU            3050  @ 2.13GHz
CPU Speed         : 2133
Memory            : 6144 MBytes
Object ID         : 1.3.6.1.4.1.9.1.901
OS Version        : UCOS 4.0.0.0-9

RAID Version      :
RAID Firmware Version: N/A
RAID BIOS Version:  Not supported

BIOS Information  :
Vendor: HP
Version: W04
Release Date: 06/10/2008

RAID Details      :
Controllers found: 1

----------------------------------------------------------------------
Controller information
----------------------------------------------------------------------
  Controller Status                   : OK
  Channel description                 : SATA
  Defunct disk drive count            : 0
  Logical drives/Failed/Degraded      : 1/0/0

----------------------------------------------------------------------
Logical drive information
----------------------------------------------------------------------
Logical drive number 1
  Logical drive name                  : Device 1
  Status of logical drive             : Optimal
  RAID level                          : 1
  Size                                : 238290 MB
  Number of chunks                    : 2
  Drive(s) (Channel,Device)           : 0,1 0,2

----------------------------------------------------------------------
Physical device information
----------------------------------------------------------------------
  Channel #0:
     Transfer Speed                   : SATA 1.5 Gb/s
     Device #1
        Device is a Hard drive
        State                         : Online
        Transfer Speed                : SATA 1.5 Gb/s
        Vendor                        : GB0250C8
        Model                         : 045
        Firmware                      : HPG6
        Serial number                 : 9SF0RKMW
        Size                          : 238418 MB
        Write Cache                   : Enabled (write-back)
        FRU                           : none
        S.M.A.R.T.                    : No
     Device #2
        Device is a Hard drive
        State                         : Online
        Transfer Speed                : SATA 1.5 Gb/s
        Vendor                        : GB0250C8
        Model                         : 045
        Firmware                      : HPG6
        Serial number                 : 9SF0RLNW
        Size                          : 238418 MB
        Write Cache                   : Enabled (write-back)
        FRU                           : none
        S.M.A.R.T.                    : Yes

Command completed successfully.
Controllers found: 1

Logical drive Task:
  Logical drive                  : 1
  Current operation              : Verify
  Status                         : In Progress
  Percentage complete            : 42
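
For anyone who wants to compare the drives across several saved copies of this output, here is a minimal Python sketch that pulls the per-drive fields and the logical drive task out of `show hardware` text. The function names and the trimmed SAMPLE are made up for this example, and it assumes the layout shown above (a blank line ends the device list). It only surfaces the values; it does not decide what S.M.A.R.T. Yes or No means on this controller.

```python
import re

# Matches the "Device #N" header that starts each physical drive block.
DEVICE_HEADER = re.compile(r"Device #(\d+)$")

# Trimmed excerpt of the output posted above (hypothetical test input).
SAMPLE = """\
     Device #1
        Device is a Hard drive
        State                         : Online
        Serial number                 : 9SF0RKMW
        S.M.A.R.T.                    : No
     Device #2
        Device is a Hard drive
        State                         : Online
        Serial number                 : 9SF0RLNW
        S.M.A.R.T.                    : Yes

Logical drive Task:
  Logical drive                  : 1
  Current operation              : Verify
  Status                         : In Progress
  Percentage complete            : 42
"""


def parse_physical_devices(text):
    """Return one dict of 'key : value' fields per 'Device #N' block."""
    devices = []
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = DEVICE_HEADER.match(stripped)
        if header:
            current = {"Device": int(header.group(1))}
            devices.append(current)
        elif not stripped:
            current = None  # a blank line ends the device block
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return devices


def parse_task(text):
    """Return the fields listed under 'Logical drive Task:', if present."""
    task = {}
    in_task = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Logical drive Task:":
            in_task = True
        elif in_task:
            if not stripped:
                break
            key, _, value = stripped.partition(":")
            task[key.strip()] = value.strip()
    return task


if __name__ == "__main__":
    for dev in parse_physical_devices(SAMPLE):
        print(dev["Device"], dev.get("Serial number"), dev.get("S.M.A.R.T."))
    print(parse_task(SAMPLE))
```

Running it over output captured at different times would show whether the Verify task ever finishes or keeps restarting.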



On Wed, Feb 2, 2011 at 7:14 PM, Erick B. <erickbee at gmail.com> wrote:
> Right, I am familiar with the IBM fw issue with these types of errors.
> 
> I just found out that an RTMT alert for a hardware failure
> (S.M.A.R.T.) was raised on one of the drives at some of the times the
> server froze (this error didn't happen every time), so we are going
> to replace that HD.
> 
> Thanks for the feedback, as always Wes.
> 
> On Wed, Feb 2, 2011 at 6:11 PM, Wes Sisk <wsisk at cisco.com> wrote:
>> This is similar but distinctly separate from CSCti52867.  In that
>> investigation we learned that linux marks a file system read only if disk
>> i/o is unresponsive even for very short amounts of time.  Under windows the
>> disk queues back up and eventually clear.  Under linux the filesystem is
>> re-mounted as readonly.
>> 
>> So far we know this indicates a delay in disk access.  In CSCti52867 with
>> IBM servers that was due to a specific issue on a specific hard drive.
>> 
>> Regards,
>> Wes
>> 
>> 
>> On 2/2/2011 6:45 PM, Erick B. wrote:
>>> 
>>> Has anyone run into this?  TAC believes it is a bad HD, which we're
>>> working on, and I found a previous discussion mentioning bug ID
>>> CSCsm25875, maybe. But just checking here in case anyone else has run
>>> into this.
>>> 
>>> The server is a CUCMBE with drives at firmware version HPG6, so it is
>>> higher than the version with the FW issue.
>>> 
>>> CUCM version is 7.0.2.20000-5
>>> 
>>> What happens is the server is running, then phones stop working, the
>>> web pages don't respond, and only a few commands work on the SSH CLI,
>>> such as show hardware and show status.
>>> 
>>> If we try a utils system restart from the CLI, it fails and says the
>>> appliance failed to restart. So the only way to get it back running is
>>> to pull the power.
>>> 
>>> There are no core files found.
>>> 
>>> When you log in via SSH, you see this:
>>> 
>>> Last login:
>>> 
>>> java.io.FileNotFoundException: /var/log/active/platform/log/cli.bin (Read-only file system)
>>>         at java.io.RandomAccessFile.open(Native Method)
>>>         at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
>>>         at com.cisco.iptplatform.fappend.ciscoRollingFileAppender.restoreIndex(ciscoRollingFileAppender.java:100)
>>>         at com.cisco.iptplatform.fappend.ciscoRollingFileAppender.setFile(ciscoRollingFileAppender.java:43)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>         at java.lang.reflect.Method.invoke(Method.java:585)
>>>         at org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:196)
>>>         at org.apache.log4j.config.PropertySetter.setProperty(PropertySetter.java:155)
>>>         at org.apache.log4j.xml.DOMConfigurator.setParameter(DOMConfigurator.java:530)
>>>         at org.apache.log4j.xml.DOMConfigurator.parseAppender(DOMConfigurator.java:182)
>>>         at org.apache.log4j.xml.DOMConfigurator.findAppenderByName(DOMConfigurator.java:140)
>>>         at org.apache.log4j.xml.DOMConfigurator.findAppenderByReference(DOMConfigurator.java:153)
>>>         at org.apache.log4j.xml.DOMConfigurator.parseChildrenOfLoggerElement(DOMConfigurator.java:415)
>>>         at org.apache.log4j.xml.DOMConfigurator.parseRoot(DOMConfigurator.java:384)
>>>         at org.apache.log4j.xml.DOMConfigurator.parse(DOMConfigurator.java:783)
>>>         at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:666)
>>>         at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:616)
>>>         at org.apache.log4j.xml.DOMConfigurator.doConfigure(DOMConfigurator.java:584)
>>>         at org.apache.log4j.xml.DOMConfigurator.configure(DOMConfigurator.java:687)
>>>         at sdMain.main(sdMain.java:511)
>>> java.lang.NullPointerException
>>>         at com.cisco.iptplatform.fappend.ciscoRollingFileAppender.updateIndex(ciscoRollingFileAppender.java:117)
>>>         at com.cisco.iptplatform.fappend.ciscoRollingFileAppender.nextFileName(ciscoRollingFileAppender.java:92)
>>>         at com.cisco.iptplatform.fappend.ciscoRollingFileAppender.append(ciscoRollingFileAppender.java:74)
>>>         at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:221)
>>>         at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:57)
>>>         at org.apache.log4j.Category.callAppenders(Category.java:187)
>>>         at org.apache.log4j.Category.forcedLog(Category.java:372)
>>>         at org.apache.log4j.Category.info(Category.java:674)
>>>         at sdMain.main(sdMain.java:525)
>>> log4j:ERROR No output stream or file set for the appender named [CLI_LOG].
>>> 
>>> 
>>> Thanks
>>> _______________________________________________
>>> cisco-voip mailing list
>>> cisco-voip at puck.nether.net
>>> https://puck.nether.net/mailman/listinfo/cisco-voip
>> 
> 
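
To make the read-only remount that Wes describes a bit more concrete: on a generic Linux host where you do have a shell (the appliance CLI in this thread does not give you one), the mount table is readable at /proc/mounts, and a filesystem that has been remounted read-only shows `ro` among its options. A minimal Python sketch, assuming that standard format; the function name and the SAMPLE entries are made up for illustration and are not taken from a CUCM server.

```python
# Hypothetical /proc/mounts excerpt: device, mount point, type, options, dump, pass.
SAMPLE = """\
/dev/sda1 / ext3 ro,data=ordered 0 0
proc /proc proc rw 0 0
/dev/sda2 /common ext3 rw,errors=remount-ro 0 0
"""


def read_only_mounts(mounts_text):
    """Return (mount_point, fs_type) for every entry mounted read-only."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fs_type, options = fields[1], fields[2], fields[3]
        # Compare whole options, so "errors=remount-ro" is not mistaken for "ro".
        if "ro" in options.split(","):
            result.append((mount_point, fs_type))
    return result


if __name__ == "__main__":
    for mount_point, fs_type in read_only_mounts(SAMPLE):
        print(f"{mount_point} ({fs_type}) is mounted read-only")
```

On a real host you would pass in the contents of /proc/mounts instead of SAMPLE. A read-only filesystem that is normally writable matches the "Read-only file system" error in the stack trace above.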
