[cisco-voip] Database Layer Change Notification
Robert Singleton
rsingleton at morsco.com
Mon Apr 27 13:15:34 EDT 2009
Hello, all!
I've just recovered from the second (known) occurrence of a problem
wherein a table in CallManager's database, DBLCNQueueHead, seems to fill
up and never empty, eventually bringing database changes to a grinding halt.
Both times, there has been an otherwise inexplicable call handling issue
that eventually led to a reboot of the cluster as a
last-ditch-finger-crossing-wood-knocking attempt to make it go again.
Both times, the original complaint was not resolved and the reboot
apparently caused a new error to appear whenever any database change was
attempted.
The first time, Call Forwarding was stuck in whatever state a given DN
was set to. If a DN was forwarded, the act of removing forwarding
appeared to work, but calls to the DN were still forwarded. Likewise, if
one forwarded a DN, it would appear to take the command, but the DN
would continue to ring locally. Eventually, we tried the reboot (what I
unaffectionately call "The Windows Fix") and when I started getting the
errors afterward, I opened a TAC SR. I was passed around until I got an
engineer who was very comfortable with the database and found that a few
tables that were apparently related to database change notification were
jam-packed with hundreds of thousands of records.
Last Friday, I had two locations for which incoming calls did not work
correctly. Some telephones at each site appeared to be stuck loading a
template, though they appeared to be registered in CallManager. Some
switch and routing troubleshooting appeared to point to a UDP problem,
but it was eventually discovered that certain telephones in the
locations did work, though they were phones *without* the incoming DN on
them.
We handle incoming calls at most locations by sending calls to shared
DNs on most, if not all, telephones at the locations. Since phones
without incoming lines were operating normally, we started by picking
one phone, wiping it out and reconfiguring it one line at a time. We
found that once we added the lead number of the hunt group, that phone began
choking on loading a template. So, we deleted all traces of the DNs
associated with incoming calls at that particular location but when we
began adding them back, adding that lead DN would again bring
down the affected phones.
At that point, we decided that rebooting the cluster would probably be a
good idea. When the system was back up, however, I began getting
errors whenever I tried to make any database changes.
I then reviewed TAC history to find when we'd had similar issues and
found where an engineer had determined that we had 200K+ entries in the
DBLCNQueueHead table in the CCM0301 database. I looked and I had over
456K rows.
I followed the same procedure, which was basically to truncate the three
tables associated with change notification. Truncating 456K rows
took almost 9 hours. Once that was done, not only could I now make
database changes, but the original symptoms went away.
Now when I check properties on those tables, they have either one row or
no rows, depending on which table.
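
For anyone who wants to watch these tables before they get out of hand,
here is a minimal sketch of the row-count check in Python. Only the three
table names come from this post. The function names, the threshold, and
the in-memory SQLite stand-in (with a made-up single "id" column) are all
assumptions for illustration; against a real cluster you would pass in a
connection to the CCM0301 database instead, and I have not tested that.

```python
import sqlite3

# Table names are taken from this post; everything else is illustrative.
DBLCN_TABLES = ("DBLCNQueueHead", "DBLCNQueueNew", "DBLCNQueueOld")


def dblcn_row_counts(conn):
    """Return {table_name: row_count} for the three change notification tables."""
    cur = conn.cursor()
    counts = {}
    for table in DBLCN_TABLES:
        # Names come from the fixed tuple above, never from user input.
        cur.execute("SELECT COUNT(*) FROM " + table)
        counts[table] = cur.fetchone()[0]
    return counts


def tables_over(counts, threshold):
    """List tables whose row count exceeds a threshold (the threshold is a guess)."""
    return sorted(name for name, n in counts.items() if n > threshold)


def make_demo_connection():
    """Build an in-memory SQLite stand-in; the real schema is not known here."""
    conn = sqlite3.connect(":memory:")
    for table in DBLCN_TABLES:
        conn.execute("CREATE TABLE " + table + " (id INTEGER)")
    conn.executemany(
        "INSERT INTO DBLCNQueueHead VALUES (?)", [(i,) for i in range(5)]
    )
    conn.execute("INSERT INTO DBLCNQueueNew VALUES (1)")
    return conn


if __name__ == "__main__":
    demo = make_demo_connection()
    counts = dblcn_row_counts(demo)
    print(counts)
    print(tables_over(counts, threshold=3))
```

What threshold is actually meaningful is exactly the open question below,
so the value passed to tables_over() is a placeholder, not a recommendation.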
I apologize for the exceptionally long introduction, but the real
question is: What do these tables do? What makes them "stick" and fill
up? How many rows is a critical number, that is, at what point will
things break because this table isn't clearing out?
The three tables are:
DBLCNQueueHead
DBLCNQueueNew
DBLCNQueueOld
I have viewed the contents of DBLCNQueueHead while making various
database changes and the one row never changes. Color me confused.
Thanks!!!
Robert