[c-nsp] packet loss between adjacent ciscos

Sam Tilders sam+cisco-nsp at australiaonline.net.au
Fri Feb 27 00:28:30 EST 2009


Hi,

We have been experiencing some packet loss between a switch and a
router directly connected to each other and are having some difficulty
finding the problem.

The problem showed up when a customer complained that there were
moments of silence on their voip calls. They did some pings and found
that there was packet loss at the same time as the silence on the calls.

With some further help from the customer, I was able to narrow the
problem down to loss between two ciscos in our rack.

The network layout is like this:

carrier peer port
       |
       |
       | 100% ping success.
       |
       | iofe
border router (7200vxr w/npe-400 12.2(18)S4)
       | pa-fe-tx
       |
       | 99.994 - 99.999% ping success
       |
       |
switch (2924 xl en)
       |
       |
       | 100% ping success
       |
       | iofe
l2tp termination router (7200vxr w/npe-300 12.4(4)T1)
       | gige
       |
       |
downstream to customers

The ping percentages are from repeated 100000 ping samples.
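
To put those percentages in terms of actual packets, here is a rough
sketch of the arithmetic. The function name lost_pings is only
illustrative, and it assumes the percentages are simple
received/sent ratios over a 100000-ping run:

```python
def lost_pings(sent: int, success_percent: float) -> int:
    """Return the number of lost pings implied by a success percentage."""
    # round() guards against floating-point error in the subtraction.
    return round(sent * (100.0 - success_percent) / 100.0)


if __name__ == "__main__":
    # The two ends of the range seen on the router-to-switch link.
    for pct in (99.994, 99.999):
        print(f"{pct}% of 100000 pings -> {lost_pings(100000, pct)} lost")
```

So the range works out to between one and six lost pings per run.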

The interfaces are all forced duplex full, the switch interfaces are
forced speed 100.

When the link between the border router and the switch drops a packet,
the ping output shows a slowdown followed by a single timeout.

The ping output goes something like:

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!! !  !   !    !     !      !         .!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

where I've used spacing to indicate the time between markers.

So it appears that the ping and reply begin to slow, then slow
enough that a single 2 second timeout occurs, and then pick up
again at full speed.

A 100000-ping run takes a few minutes, and during that time it may
lose anywhere from one to half a dozen pings. The lost pings are
spaced apart, apparently with no regular period.
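
For anyone wanting to check the spacing of the losses from captured
output, a minimal sketch follows. It assumes the usual IOS ping
markers ("!" for a reply, "." for a timeout) and that only the marker
lines are passed in, not the surrounding banner text. The name
summarize_ping is hypothetical:

```python
def summarize_ping(output: str) -> dict:
    """Tally IOS-style ping markers and locate the lost pings."""
    # Keep only the markers; whitespace and newlines are ignored.
    marks = [ch for ch in output if ch in "!."]
    # 1-based sequence numbers of the pings that timed out.
    lost_at = [i for i, ch in enumerate(marks, start=1) if ch == "."]
    sent = len(marks)
    received = sent - len(lost_at)
    # Distance (in pings) between consecutive losses; a constant
    # value here would suggest a regular period.
    gaps = [b - a for a, b in zip(lost_at, lost_at[1:])]
    success = 100.0 * received / sent if sent else 0.0
    return {
        "sent": sent,
        "received": received,
        "lost_at": lost_at,
        "gaps": gaps,
        "success_percent": success,
    }


if __name__ == "__main__":
    sample = "!!!!! !  !   .!!!!\n!!.!"
    print(summarize_ping(sample))
```

If the values in "gaps" vary widely from run to run, that would back
up the impression that the losses are not periodic.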

The router and switch are typically running at around 30% CPU.

When I run these ping tests the switch gets to 80% CPU. However, cases
like the customer's voip calls show that the problem occurs even when
CPU utilization is lower.

I have correlated the ping loss with the customer's voip silence:
with them on a call while I ran the ping, they experienced a couple
of seconds of silence at the same moment the router missed a ping.

I've been on site and replaced the pa-fe-tx in the 7200, with no
improvement. I moved the PA to a different port on the router, with
no improvement. I've also replaced the switch, with no improvement.

(We had previously tried different switch ports and replacing the cabling.)

All the while, none of the interface statistics report any errors
other than the occasional ignored packet. However, these occur much
less frequently than the problem and not at the same time.

I've had various debug options turned on, both on the switch and the
router, but there has been no clear correlation between any events
and the occurrence of packet loss.

So, I was wondering if this sounds familiar to anyone or if there is
anything someone might be able to suggest to further investigate or
resolve this issue.

I'd appreciate any advice that can be given.

Regards,

- Sam





