[c-nsp] Dynamic output buffer allocation on Cisco 4948

Thu Sep 26 11:18:27 EDT 2013

It was host to host, so it was really Host A to Host B and vice versa. The
expected RTT was less than a millisecond, which is what they often got, but
the latency would regularly spike to as high as 24 ms. I initially
thought it was a problem on one of the hosts, but they could ping to and from
devices on the same VLAN with no variable latency. The variable latency only
occurred in one direction, when going from one VLAN to the other. We
manipulated the HSRP configs to shift traffic to different routers and
switches, but the behavior didn't change. From Host A to Host B we saw
variable latency, but we never saw it when the ping originated from Host B,
even though, depending on the HSRP configuration, the packets were
traversing exactly the same path. It has me completely stumped.
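
As a rough sanity check on the buffer theory, queueing delay is just the
amount of data queued ahead of a packet divided by the egress line rate. A
minimal Python sketch of that arithmetic, assuming a 1 Gb/s egress port (the
thread does not state the uplink speed) and using hypothetical function
names:

```python
# Assumed egress speed in bits per second; substitute the real port speed.
GIGABIT = 1_000_000_000


def queue_delay_ms(queued_bytes, link_bps):
    """Milliseconds needed to drain queued_bytes out of a port at link_bps."""
    return queued_bytes * 8 * 1000 / link_bps


def queued_bytes_for_delay(delay_ms, link_bps):
    """Inverse: bytes that must sit ahead of a packet to delay it by delay_ms."""
    return delay_ms * link_bps / 8 / 1000.0


if __name__ == "__main__":
    # Data that would have to be queued to explain a 24 ms spike.
    print(queued_bytes_for_delay(24, GIGABIT))
    # Delay added by a single 1500-byte frame queued ahead of the ping.
    print(queue_delay_ms(1500, GIGABIT))
```

Under that 1 Gb/s assumption, a 24 ms spike corresponds to roughly 3 MB
queued ahead of the ping on the congested port. Whether a single 4948 port
can actually hold that much of the shared buffer is what the show commands
quoted below would need to confirm.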

On Thu, Sep 26, 2013 at 9:04 AM, Blake Dunlap <ikiris at gmail.com> wrote:

> This may seem like a stupid question, but when you were pinging, were you
> pinging from hosts, or from the routers?
>
> -Blake
>
>
> On Thu, Sep 26, 2013 at 9:38 AM, John Neiberger <jneiberger at gmail.com> wrote:
>
>> Thanks! I talked to our Cisco NCE about this and he gave me these
>> commands:
>>
>> show qos interface gigabitEthernet x/y - will show you 4 queues and also
>> whether QoS is disabled or not
>>
>> sh int gi x/y counters detail - you will see packet counters in queues 1-4
>> incrementing
>>
>> sh platform hardware interface g x/y stat | in TxB
>>
>>
>> I'm nearly certain that this big buffer issue is the answer to my high
>> variable latency problem, but there is still one mystery about this that
>> has me completely perplexed. The high variable latency was only occurring
>> in one direction (from VLAN A to VLAN B) but not in the other (from VLAN B
>> to VLAN A). What really makes that weird is that, because of some HSRP
>> differences, we really had a circular topology for a bit. The path was
>> *exactly* the same no matter which direction you were pinging. The ICMP
>> packets had to travel around the same circle through the same devices and
>> interfaces. So if we have big buffers on congested interfaces that are
>> introducing variable latency, why would we only see it in one direction?
>>
>>
>> When VLAN A pings VLAN B, it is the initial ICMP packet that would have
>> been delayed, while the response would come in on a different interface
>> that wasn't congested. When VLAN B pings VLAN A, the initial ping would
>> not hit congested interfaces but the ping reply would. The total round
>> trip time should have been similar, but it never was. I'm completely
>> stumped by that. I even had Cisco HTTS on this for a couple of days and
>> they couldn't figure it out.
>>
>>
>> Thanks,
>>
>> John
>>
>>
>> On Thu, Sep 26, 2013 at 1:50 AM, Terebizh, Evgeny <eterebizh at amt.ru>
>> wrote:
>>
>> > Try also
>> > "show platform hardware interface gigabitEthernet 1/1 tx-queue".
>> > I guess it's gonna show the actual values for queue utilisation.
>> > Please let me know if this helps.
>> >
>> > /ET
>> >
>> >
>> >
>> >
>> > On 9/24/13 11:17 PM, "John Neiberger" <jneiberger at gmail.com> wrote:
>> >
>> > >I've been helping to troubleshoot an interesting problem with variable
>> > >latency through a 4948. I haven't run into this before. I usually have
>> > >seen really low latency through 4948s, but this particular application
>> > >requires consistent low latency and they've been noticing that latency
>> > >goes up on average as load goes up. It didn't seem to be a problem on
>> > >their servers, but communication through busy interfaces seemed to
>> > >dramatically increase the latency. They were used to <1 ms latency and
>> > >it was bouncing up to 20+ ms at times. I'm starting to think this is
>> > >due to the shared output buffer in the 4948 causing the output buffer
>> > >on the uplink to dynamically get bigger.
>> > >
>> > >I've been trying to find more details on how the 4948 handles its
>> > >shared output queue space, but I haven't been able to find anything. Do
>> > >any of you know more about how this works and what commands I could use
>> > >to troubleshoot? I can't find anything that might show how much buffer
>> > >space a particular interface is using at any given time, or if it even
>> > >makes sense to think of it that way. If I knew the size of the buffer at
>> > >any given moment, I could calculate the expected latency and prove
>> > >whether or not that was the problem.
>> > >
>> > >Thanks!
>> > >John
>> > >_______________________________________________
>> > >cisco-nsp mailing list  cisco-nsp at puck.nether.net
>> > >https://puck.nether.net/mailman/listinfo/cisco-nsp
>> > >archive at http://puck.nether.net/pipermail/cisco-nsp/
>> >
>> >
>>
>
>