[c-nsp] Multilink PPP (MLPPP) Asymmetrical Throughput Problem NxT1

Sean Shepard sean.shepard at ewavepartners.com
Wed Jun 6 15:41:00 EDT 2007


The resolution was indeed increasing the output queue size.  Around 160 to
240 (for bonded 2xT1 and 3xT1) seemed to do the trick.  Testing today
has been tremendously positive.  CEF does appear to be okay on MLPPP that
far back (woo-hoo!).
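
For anyone digging this out of the archives later, the change was along
these lines (the interface numbers below are just placeholders for our
2xT1 and 3xT1 bundles):

  ! (interface numbers are placeholders)
  interface Multilink1
   hold-queue 160 out
  !
  interface Multilink2
   hold-queue 240 out

The default output hold-queue on the bundle interface is only 40 packets,
which wasn't enough to absorb the bursts headed out two or three member
links.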

Thanks for your assistance and feedback!
Sean



-----Original Message-----
From: Rodney Dunn [mailto:rodunn at cisco.com] 
Sent: Wednesday, June 06, 2007 10:51 AM
To: Sean Shepard
Cc: 'Rodney Dunn'; cisco-nsp at puck.nether.net
Subject: Re: [c-nsp] Multilink PPP (MLPPP) Asymmetrical Throughput Problem
NxT1

On Tue, Jun 05, 2007 at 11:02:07PM -0400, Sean Shepard wrote:
> Thank you for the reply on this.  We did exactly what you mention here
> (trying to isolate channels) and found the performance metrics didn't
> change
> very much except that there seemed to be little impairment with just a
> single T-1.

Good test.

> We do not believe that variance in latency exists to the point
> that we should be having a severe issue and it has since reoccurred on a
> couple of other bundled connections (on this same particular router - see
> below).

Fair enough. There were a lot of MLPPP bugs in older releases too.
MLPPP can be pretty complicated because there are a lot of dependencies
on the driver code to report backpressure correctly to the bundle.
There is no queueing at the interface level, so if the driver code doesn't
apply backpressure to the MLPPP virtual interface correctly you will
have problems.

> 
> None of the T-1s seem to take errors in any of the bundles.  We do see a
> lot
> of output queue drops on the Multilink interfaces but not sure how
> concerning that really is.

That's a problem. If they are valid drops you are overrunning the bundle
member links. 
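
A quick way to keep an eye on where the drops are landing (interface names
here are just placeholders for your bundle and one of its member links):

  show interfaces Multilink1 | include drops
  show interfaces Serial1/0:1 | include drops

If the output drop counter is climbing on the Multilink interface while the
member serials stay clean, it's the bundle's output hold queue filling up.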

> 
> The only difference between this device and similar ones on our network is
> that we have exceeded the number of fast interfaces (4 vs. recommended 3 -
> but the card in question is in the middle and should be getting its SRAM
> allotment okay) and we do terminate some ATM/PPPoE/L2TP sessions on this
> device.  The system is:

I'd be amazed if that had anything to do with it.

Did you disable MLPPP fragmentation with "no ppp multilink fragmentation",
or is yours the version with the "disable" CLI? We changed the command at
some point along the way.
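
In other words, depending on the release it's one of these two forms under
the bundle interface (the second is the newer syntax, if I remember right):

  interface Multilink1
   no ppp multilink fragmentation
  !
  ! or, on later code:
  interface Multilink1
   ppp multilink fragment disable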

> 
> 7206 (non-VXR)
> NPE-200 with IO-FE
> IOS 12.2(31) [bootldr 12.0(13)S]
>   (is there perhaps an issue in 12.2(31) with MLPPP?
>    I'd like to go to a 12.3 release but need to verify 
>    support for the CT3/4T1 for two of our boxes).
> 
> We're using the older CT3/4T1 cards on this edge device and haven't had
> problems with MLPPP in the past on a similar system (running 12.2(23)c).

See above. There are driver dependencies for each card for MLPPP to work.
Can you get 'sh controller' just to see if it shows anything interesting
that's different between the two?
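
e.g. (the controller numbering is just a placeholder for your CT3):

  show controllers T3 1/0

and diff that against the same output from the box that's running
12.2(23)c, paying attention to the driver/firmware revision lines and any
error counters that differ.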

> 
> Download speed continues to perform okay in most tests but uploads get
> woefully bad and we start losing packets above 1.6 to 2.0 Mbps (2%
> observed today as things crept over 2 Mbps) regardless of the number of
> bundled trunks [2 or 3].  It "seems" that performance improves in the
> evenings when there
> is less traffic going through the device, it's lightly loaded even during
> the day (maybe a total of 10 Mbps being handled on this one system).

To really isolate that, you first need to determine the direction of the
loss/latency and then narrow down the debugging. That's easier said than done.

> 
> I considered tweaking the buffers, but if it's an issue of emptying the
> queues fast enough (perhaps because it's servicing one too many high speed
> interfaces?) then putting more in the buffers that it can't get to might
> just make things worse.

My experience would say that's almost surely not the case. But I've been
wrong before. I don't know if we even have CEF support for MLPPP back that
far. In 'sh int stat', what does it look like for the bundle interface?
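
i.e. something like:

  show interfaces Multilink1 stats

and look at the Processor vs. Route cache rows. If most of the bundle's
packets are showing up under Processor, the traffic is being process
switched rather than CEF/fast switched, which would be worth chasing first.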

> 
> We have several customers utilizing VoIP and have some policy-maps on
> those
> interfaces, none of them using MLPPP [yet] but a few on the same box and
> even the same card in question here.  No complaints about lost packets or
> voice quality there so the overall system seems sound and CPU utilization
> is generally in the low double digits.  Various debug outputs don't seem
> to be barking either.

It gets complicated, but you would have to get the multilink debugs and
compare them to see if you are seeing loss/delay for the fragments.

Does 'sh ppp multilink' show anything when you are doing a transfer
that is slow?
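
Something along these lines, repeated a few times while the slow transfer
is running:

  show ppp multilink

The interesting bits are the lost fragments, reordered, and discarded
counters for the bundle; if those climb during the transfer it points back
at the fragment/driver handling rather than just the hold queue.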

> 
> Any suggestions are appreciated.  I think I'm close to just dropping
> another
> chassis in with this DS3 on it and seeing if the problem cleans up.

Getting some upgraded code (late 12.3 or 12.4) would be a good recommendation.

--history snip--


