[cisco-voip] CUCM Pub HWM Disk usage
Wes Sisk
wsisk at cisco.com
Tue May 8 10:34:18 EDT 2012
Yep, we've seen many of these. Once a filesystem goes read-only it should really be considered suspect. Get the hardware and firmware fixed, then install fresh, restore from a known good backup, and upgrade to a software version that includes all kernel and software fixes as soon as possible, before you hit another read-only event.
Keep in mind there are 3 levels to these problems. In logical order:
1. hardware failures - bad hardware or hardware that is going bad producing inconsistent results
2. firmware failures - disk drives, controllers - these are updated with fwucd: Downloads Home > Products > Voice and Unified Communications > Communications Infrastructure > Voice Servers > Cisco 7800 Series Media Convergence Servers > Cisco MCS 7835-I3 > Voice Applications OS Firmware For IBM-3.6(6)
http://www.cisco.com/cisco/software/release.html?mdfid=282821930&flowid=6694&softwareid=283046743&release=3.6%286%29&relind=AVAILABLE&rellifecycle=&reltype=latest
3. driver failures - drivers are loaded as modules in the Linux kernel - these are updated with a CUCM software upgrade that includes a new driver or new kernel
Hardware, firmware, drivers, and kernel must all be fixed. Then the base filesystem must be intact and sound. A read-only filesystem is a bit like handling tar or syrup: it's really tough to get out of.
Regards,
Wes
On May 8, 2012, at 10:11 AM, Gr wrote:
In my experience I had to rebuild: the server would run stable for 2-4 weeks and then fall into the same state (filesystem read-only, database replication errors, etc.), and I had to use the recovery disk to repair the file system. Since I rebuilt it, it has been stable.
You can monitor whether it keeps running stable; if it goes unstable again, I would advise rebuilding it. My version was 8.5.
GR
On 05/05/2012, at 1:24 AM, george.hendrix at l-3com.com wrote:
> All,
>
> I ended up doing a cold reboot of the server (publisher), booted from the recovery CD, and selected the option to repair the file system. The file system is good now; I am not having issues and am able to do backups. However, I noticed that the page with the instructions for repairing the file system highly recommends reimaging the system. Do I really need to rebuild the system?
>
> http://www.cisco.com/en/US/products/sw/voicesw/ps556/products_tech_note09186a0080b1f305.shtml
>
> Note: It is highly recommended that you perform the full system backup, and then re-image the system using DRS, to make sure the file system is stable in the future.
>
>
> Bill
>
> From: Ryan Ratliff [mailto:rratliff at cisco.com]
> Sent: Wednesday, May 02, 2012 2:28 PM
> To: Hendrix, George (Bill) @ LSG - STRATIS
> Cc: cisco-voip at puck.nether.net
> Subject: Re: [cisco-voip] CUCM Pub HWM Disk usage
>
> Something is a bit odd here, you should have some logs in those folders, unless I just flat gave you the wrong path.
>
> Looking back at your logging partition it looks like the numbers are a bit off. It appears to say you have more free space than the size of the partition.
> Disk/logging 65574668K 68069024K K (101%)
>
> Note that the 'used' column is empty, and I'd guess the 101% is a byproduct of some negative value not being accounted for.
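
A minimal sketch of the sanity check described above, assuming the disk rows of 'show status' always follow the "name totalK freeK usedK (pct%)" layout seen in this thread. The names parse_disk_row and anomalies are hypothetical helpers, not part of any Cisco tool, and the script is meant to be run on a workstation against pasted output rather than on the appliance itself.

```python
import re
from typing import List, NamedTuple, Optional


class DiskRow(NamedTuple):
    name: str
    total_kb: int
    free_kb: int
    used_kb: Optional[int]  # None when the column is empty, as in this thread
    pct: int


# Assumed layout: "Disk/<name> <total>K <free>K <used>K (<pct>%)".
# The used value may be missing, leaving a bare "K".
ROW = re.compile(r"^(Disk/\S+)\s+(\d+)K\s+(\d+)K\s+(\d*)K\s+\((\d+)%\)$")


def parse_disk_row(line: str) -> DiskRow:
    """Parse one disk line, tolerating a leading '> ' email quote marker."""
    cleaned = line.lstrip("> ").strip()
    match = ROW.match(cleaned)
    if match is None:
        raise ValueError(f"unrecognised disk row: {line!r}")
    name, total, free, used, pct = match.groups()
    return DiskRow(name, int(total), int(free), int(used) if used else None, int(pct))


def anomalies(row: DiskRow) -> List[str]:
    """Return human-readable reasons why the row looks inconsistent."""
    problems = []
    if row.used_kb is None:
        problems.append("used column is empty")
    if row.free_kb > row.total_kb:
        excess = row.free_kb - row.total_kb
        problems.append(f"free exceeds total by {excess}K")
    if row.pct > 100:
        problems.append(f"usage reported as {row.pct}%")
    return problems


if __name__ == "__main__":
    for text in (
        "Disk/active 27632244K 15947620K 11403892K (42%)",
        "Disk/logging 65574668K 68069024K K (101%)",
    ):
        row = parse_disk_row(text)
        print(row.name, anomalies(row) or "looks consistent")
```

Any row that trips one of these checks is reporting numbers the filesystem cannot really have, which points toward a filesystem check rather than log cleanup.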
>
> I think it's time for a boot off the recovery cd to check the filesystem. Got a backup?
>
> -Ryan
>
> On May 2, 2012, at 1:36 PM, george.hendrix at l-3com.com wrote:
>
>
> Yeah, this was a fresh build.
>
> I will try to look around…below is the output from the last commands you gave me.
>
> admin:file list activelog tomcat/logs/* detail
> dir count = 0, file count = 0
> admin:file list activelog cm/log/informix/* detail
> dir count = 0, file count = 0
>
> I would imagine a cold reboot of the server probably wouldn’t help right? Or worse, it won’t come back up.
>
> Bill Hendrix
>
> From: Ryan Ratliff [mailto:rratliff at cisco.com]
> Sent: Wednesday, May 02, 2012 1:14 PM
> To: Hendrix, George (Bill) @ LSG - STRATIS
> Cc: cisco-voip at puck.nether.net
> Subject: Re: [cisco-voip] CUCM Pub HWM Disk usage
>
> The size of your inactive partition tells me you've never done an upgrade on this system so that's out.
>
> You can dig around and try to find a log somewhere under activelog to delete, I'd take a look at informix ccm.log and tomcat catalina.out files since I think there've been bugs for both of those in the past that made them get very large.
>
> file list activelog tomcat/logs/* detail
> file list activelog cm/log/informix/* detail
>
> -Ryan
>
> On May 2, 2012, at 11:19 AM, george.hendrix at l-3com.com wrote:
>
> Ryan,
>
> Below is the output from the show status command showing the partition that’s full is the logging partition.
>
> CPU Idle: 94.00% System: 02.00% User: 01.00%
> IOWAIT: 03.00% IRQ: 00.00% Soft: 00.00% Intr/sec: 1019.00
>
> Memory Total: 2053864K
> Free: 64692K
> Used: 1989172K
> Cached: 672404K
> Shared: 0K
> Buffers: 101304K
>
>                 Total       Free        Used
> Disk/active     27632244K   15947620K   11403892K (42%)
> Disk/inactive   27632272K   26195768K   32828K (1%)
> Disk/logging    65574668K   68069024K   K (101%)
>
> When I tried to enter the file list command, below is the output. So I was unable to find a file to delete with this command.
>
> admin:file list inactivelog cm/trace/ccm/sdi/*
> no such file or directory can be found
>
> I also entered this command and got the response shown.
> admin:file list activelog cm/trace/ccm/sdi/*
> dir count = 0, file count = 0
>
> Thanks,
>
> Bill Hendrix | Network/VOIP Engineer
> L3 STRATIS POWERED BY EXCELLENCE
>
> From: Ryan Ratliff [mailto:rratliff at cisco.com]
> Sent: Wednesday, May 02, 2012 9:55 AM
> To: Hendrix, George (Bill) @ LSG - STRATIS
> Cc: cisco-voip at puck.nether.net
> Subject: Re: [cisco-voip] CUCM Pub HWM Disk usage
>
> If the filesystem is truly read-only then you have no option but to reboot to recover it. However, there is a distinction between the disk being full and being marked read-only by the OS, so make sure you know which one you are hitting.
>
> Can you paste the output of 'show status' to confirm which partition is full?
>
> To test whether the filesystem is read-only, try to delete a file from the CLI.
> file list inactivelog cm/trace/ccm/sdi/*
> file delete inactivelog cm/trace/ccm/sdi/<insert filename here from above command output>
>
> This will try to delete one of the ccm sdi traces from your inactive partition. This will be completely harmless and if it works then your disk is just full, not readonly.
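
On a general-purpose Linux host with shell access (which the CUCM appliance CLI does not normally expose), the same full-versus-read-only distinction can be checked against the mount table instead of by trial deletion. The sketch below is illustrative only: readonly_mounts is a hypothetical helper, and the sample device names and mount points are made up. It parses text in the /proc/mounts format and returns the mount points whose option list contains "ro".

```python
from typing import List


def readonly_mounts(mounts_text: str) -> List[str]:
    """Return mount points whose options include 'ro'.

    Expects text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            result.append(mount_point)
    return result


def readonly_mounts_from_proc(path: str = "/proc/mounts") -> List[str]:
    """Convenience wrapper for a live Linux system."""
    with open(path, "r", encoding="utf-8") as handle:
        return readonly_mounts(handle.read())


# Made-up sample data for illustration; not taken from a real CUCM server.
SAMPLE = """\
/dev/sda1 / ext3 rw,relatime 0 0
/dev/sda6 /var/log ext3 ro,relatime 0 0
"""

if __name__ == "__main__":
    print(readonly_mounts(SAMPLE))
```

A partition that is merely full still shows "rw" here, while one the kernel has remounted read-only after an error shows "ro", which is the case that needs a reboot and a filesystem check.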
>
> -Ryan
>
> On May 2, 2012, at 7:58 AM, george.hendrix at l-3com.com wrote:
>
> Hey Guys,
>
> I have a CUCM 6.1 Cluster that started sending me the following alert:
>
> LogPartitionHighWaterMarkExceeded UsedDiskSpace : 101 MessageString : Disk utilization hits HWM!! Purging files...
>
> Both the CDR Repository and CallManager Service are affected by this. I can start them, but then they just stop within a short time. I read somewhere to change the LWM and HWM to low numbers and tried that, but the disk usage is still staying at 101% (not sure how that is even possible). I tried searching in RTMT for all logs on the Pub and it comes back that nothing is found. I tried this by a date range and also within the last 60 days, nothing. I also tried rebooting the server via command line and received an error that the appliance restart failed. From what I’ve read in various threads, the system seems to be in a read-only state and is not able to purge the files now, nor reboot.
>
> Appreciate info anyone can provide as to how to clear the logs in the log partition.
>
> Thanks,
> Bill Hendrix
>
> _______________________________________________
> cisco-voip mailing list
> cisco-voip at puck.nether.net
> https://puck.nether.net/mailman/listinfo/cisco-voip