[c-nsp] SDSL Multilink PPP high latency / lost fragments / input errors on cisco 7200

Johannes Jakob jjj at 3js.de
Wed Mar 24 10:57:41 EDT 2010


[edited copy of email to cisco-bba yesterday]

Dear colleagues,

I'm having serious trouble debugging a problem with some of our
multilink bundles.
I have already moved some of them to a separate LNS to get a better
view of the problem.

I'm talking about L2TP-encapsulated PPP bundles, all coming from a large
national carrier and originated by Linux-based CPEs.
All of them are terminated directly on the LNS the carrier sends the
tunnels to.
No forwarding, no multichassis MLP/SGBP on these bundles.

In this debugging setup there are only two bundles with two links each,
but the problem is exactly the same with more bundles (same carrier).

Continuously pinging the CPEs at the far end of the bundles shows
latency peaks of >1000 ms (up to >9000 ms), and sometimes single
packets, or a few in a row, get lost.
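To correlate the peaks with the debug output, I scan the ping output rather than watching it live. A throwaway helper like the following (hypothetical, not part of my actual setup; it parses standard Linux ping output) flags high-RTT replies and missing sequence numbers:

```python
import re

def find_peaks(ping_output, threshold_ms=1000.0):
    """Return (peaks, lost): replies above threshold_ms, and missing
    icmp_seq numbers, from standard Linux 'ping' output lines such as
    '64 bytes from 10.0.0.1: icmp_seq=3 ttl=64 time=1042 ms'."""
    seqs, peaks = [], []
    for line in ping_output.splitlines():
        m = re.search(r"icmp_seq=(\d+).*time=([\d.]+) ms", line)
        if not m:
            continue
        seq, rtt = int(m.group(1)), float(m.group(2))
        seqs.append(seq)
        if rtt > threshold_ms:
            peaks.append((seq, rtt))
    # Any gap in the sequence numbers means a lost reply.
    lost = sorted(set(range(seqs[0], seqs[-1] + 1)) - set(seqs)) if seqs else []
    return peaks, lost

sample = """64 bytes from 10.0.0.1: icmp_seq=1 ttl=64 time=38.2 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=64 time=1042 ms
64 bytes from 10.0.0.1: icmp_seq=4 ttl=64 time=41.0 ms"""
peaks, lost = find_peaks(sample)
```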

At these times, "debug ppp multilink events" says:


Mar 23 13:56:54: Vi128 MLP: Lost fragment timeout, seq 7BEAD
Mar 23 13:56:54: Vi128 MLP: Discard reassembled packet
Mar 23 13:56:54: Vi148 MLP: Lost fragment timeout, seq 6A3CF
Mar 23 13:56:54: Vi148 MLP: Discard reassembled packet
Mar 23 13:57:04: Vi148 MLP: Lost fragment timeout, seq 6A41F
Mar 23 13:57:04: Vi148 MLP: Discard reassembled packet
Mar 23 13:57:10: Vi128 MLP: Lost fragment timeout, seq 7BFDF
Mar 23 13:57:10: Vi128 MLP: Discard reassembled packet
Mar 23 13:57:11: Vi128 MLP: Lost fragment timeout, seq 7BFE1
Mar 23 13:57:11: Vi128 MLP: Discard reassembled packet
Mar 23 13:57:12: Vi128 MLP: Lost fragment timeout, seq 7BFEA
Mar 23 13:57:12: Vi128 MLP: Begin bit lost, discard fragment 7BFEB



The lost-fragment counters increase at these times; the reordered
counter increases steadily all the time.
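To see whether the events on both bundles really line up, I tally them per virtual-access interface instead of eyeballing the log. A quick hypothetical helper (not from my actual setup) over the debug output:

```python
import re
from collections import Counter

def count_timeouts(log):
    """Tally 'Lost fragment timeout' events per virtual-access interface
    from 'debug ppp multilink events' output."""
    counts = Counter()
    for line in log.splitlines():
        m = re.search(r"(Vi\d+) MLP: Lost fragment timeout", line)
        if m:
            counts[m.group(1)] += 1
    return counts

log = """Mar 23 13:56:54: Vi128 MLP: Lost fragment timeout, seq 7BEAD
Mar 23 13:56:54: Vi148 MLP: Lost fragment timeout, seq 6A3CF
Mar 23 13:57:10: Vi128 MLP: Lost fragment timeout, seq 7BFDF"""
counts = count_timeouts(log)
```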



What drives me crazy is that it's not one single bundle having the
problem at a time; *all* of the bundles on this LNS are affected
simultaneously.
It's not a constant problem, it comes and goes, with completely
unpredictable duration and interval!




When those links are forwarded to a Linux LNS running rp-l2tpd, they
get bundled just fine and the problem does not occur.
=> Neither the individual lines nor the carrier itself is the problem
here.


So I wonder whether there is a common buffer that all bundles share, or
whether you know of anything else I could check.
I already tried setting

ppp multilink queue depth qos 3
ppp multilink queue depth fifo 3 / 50 / 255
and/or
ppp multilink slippage mru 16


without any luck or significant change.


BTW: where can I check the status of these buffers?




I have already read most of the multilink-related messages in the
archive and considered switching to multipath with per-packet load
sharing, but because of the Linux kernel on the CPEs, only flow-based
forwarding can be deployed CPE-outbound, so the utilization of the
individual links won't be balanced enough...
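The imbalance concern can be shown with a toy example. The hash below (sum of the ports, modulo the link count) is deliberately simple and NOT what the Linux kernel actually uses; it just illustrates that with per-flow hashing a handful of flows can all land on the same member link, leaving the other one idle, whereas per-packet distribution would keep both busy:

```python
def pick_link(flow, n_links):
    """Toy per-flow hash: NOT the real Linux multipath hash."""
    src, dst, sport, dport = flow
    return (sport + dport) % n_links

flows = [
    ("192.0.2.10", "198.51.100.1", 5000, 80),
    ("192.0.2.10", "198.51.100.2", 5002, 80),
    ("192.0.2.11", "198.51.100.1", 5004, 80),
]
load = [0, 0]
for flow in flows:
    load[pick_link(flow, 2)] += 1
# All three flows hash onto link 0; link 1 carries nothing.
```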



Any hints, tips, tricks or criticism would be appreciated ;-)


Thanks in advance,


   John




P.S.: The individual links themselves are just fine: when using
multipath instead of multilink, or when simply forwarding them to the
Linux box and letting it bundle them, everything works. Only when the
7200 is supposed to do the bundling does everything go crazy...

Cisco IOS Software, 7200 Software (C7200-A3JK9S-M), Version 12.4(25b),
RELEASE SOFTWARE (fc1)

Cisco 7204VXR (NPE300) processor (revision D) with 229376K/65536K bytes of
memory.
Processor board ID 28711625
R7000 CPU at 262MHz, Implementation 39, Rev 2.1, 256KB L2 Cache
4 slot VXR midplane, Version 2.7








lns3#show int Vi128
Virtual-Access128 is up, line protocol is up
  Hardware is Virtual Access interface
  Interface is unnumbered. Using address of Loopback11 (3.3.3.3)
  MTU 1454 bytes, BW 2000000 Kbit/sec, DLY 100000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, LCP Open, multilink Open
  Listen: IPV6CP
  Open: IPCP
  MLP Bundle vaccess, cloned from AAA, Virtual-Template1
  Vaccess status 0x40, loopback not set
  Keepalive set (10 sec)
  DTR is pulsed for 5 seconds on reset
  Last input 00:00:42, output never, output hang never
  Last clearing of "show interface" counters 02:52:23
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 5000 bits/sec, 9 packets/sec
  5 minute output rate 6000 bits/sec, 9 packets/sec
     65649 packets input, 4366426 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     80 input errors, 0 CRC, 18 frame, 0 overrun, 0 ignored, 21 abort
     65710 packets output, 5113033 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions

lns3#show int Vi148
Virtual-Access148 is up, line protocol is up
  Hardware is Virtual Access interface
  Interface is unnumbered. Using address of Loopback11 (3.3.3.3)
  MTU 1454 bytes, BW 2000000 Kbit/sec, DLY 100000 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation PPP, LCP Open, multilink Open
  Listen: IPV6CP
  Open: IPCP
  MLP Bundle vaccess, cloned from AAA, Virtual-Template1
  Vaccess status 0x40, loopback not set
  Keepalive set (10 sec)
  DTR is pulsed for 5 seconds on reset
  Last input 00:00:14, output never, output hang never
  Last clearing of "show interface" counters 02:56:51
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 1000 bits/sec, 3 packets/sec
  5 minute output rate 2000 bits/sec, 3 packets/sec
     45163 packets input, 3066362 bytes, 0 no buffer
     Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
     47 input errors, 0 CRC, 13 frame, 0 overrun, 0 ignored, 10 abort
     45715 packets output, 3720617 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     0 unknown protocol drops
     0 output buffer failures, 0 output buffers swapped out
     0 carrier transitions


lns3#show ppp multilink interface vi128

Virtual-Access128, bundle name is user1 at realm/0002.b641.8893
  Username is user1 at realm
  Endpoint discriminator is 0002.b641.8893
  Bundle up for 16:30:44, total bandwidth 2000000, load 1/255
  Receive buffer limit 48768 bytes, frag timeout 1000 ms
  Using relaxed lost fragment detection algorithm.
    0/0 fragments/bytes in reassembly list
    41 lost fragments, 37978 reordered
    39/1245 discarded fragments/bytes, 0 lost received
    0x7DA00 received sequence, 0x3ED96 sent sequence
  Member links: 2 (max not set, min not set)
    upstream:Vi126  (2.2.2.2), since 16:30:44, unsequenced
    upstream:Vi144  (2.2.2.2), since 16:30:42, unsequenced


lns3#show ppp multilink interface vi148

Virtual-Access148, bundle name is user2 at realm/0002.b641.888d
  Username is user2 at realm
  Endpoint discriminator is 0002.b641.888d
  Bundle up for 16:30:51, total bandwidth 2000000, load 1/255
  Receive buffer limit 48768 bytes, frag timeout 1000 ms
  Using relaxed lost fragment detection algorithm.
    0/0 fragments/bytes in reassembly list
    24 lost fragments, 27218 reordered
    23/774 discarded fragments/bytes, 0 lost received
    0x6B390 received sequence, 0x363CB sent sequence
  Member links: 2 (max not set, min not set)
    upstream:Vi146  (1.1.1.1), since 16:30:51, unsequenced
    upstream:Vi147  (1.1.1.1), since 16:30:51, unsequenced
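Both bundles show the same 1000 ms frag timeout. For anyone following along, here is a heavily simplified, hypothetical sketch of MLP-style reassembly to illustrate what the router is doing when it logs "Lost fragment timeout" and "Begin bit lost" (real IOS behaviour, e.g. the relaxed detection algorithm, is more involved):

```python
from dataclasses import dataclass, field

@dataclass
class Frag:
    seq: int        # MLP sequence number
    begin: bool     # (B)egin bit: first fragment of a packet
    end: bool       # (E)nd bit: last fragment of a packet
    data: bytes
    arrived: float  # arrival timestamp in seconds

@dataclass
class Reassembler:
    timeout: float = 1.0  # frag timeout, 1000 ms as on the bundles above
    expected: int = 0     # next sequence number we hope to see
    pending: dict = field(default_factory=dict)
    lost: int = 0         # counter analogous to "lost fragments"

    def receive(self, frag, now):
        self.pending[frag.seq] = frag
        return self._drain(now)

    def _drain(self, now):
        packets, progressed = [], True
        while progressed:
            progressed = False
            if self.expected in self.pending:
                head = self.pending[self.expected]
                if not head.begin:
                    # Mirrors "Begin bit lost, discard fragment".
                    del self.pending[self.expected]
                    self.expected += 1
                    progressed = True
                else:
                    # Collect a contiguous run until the End bit.
                    run = [head]
                    while not run[-1].end and run[-1].seq + 1 in self.pending:
                        run.append(self.pending[run[-1].seq + 1])
                    if run[-1].end:
                        for f in run:
                            del self.pending[f.seq]
                        self.expected = run[-1].seq + 1
                        packets.append(b"".join(f.data for f in run))
                        progressed = True
            elif self.pending and now - min(
                    f.arrived for f in self.pending.values()) > self.timeout:
                # Mirrors "Lost fragment timeout": give up on the hole at
                # 'expected' and resume at the oldest fragment we do have.
                self.lost += 1
                self.expected = min(self.pending)
                progressed = True
        return packets

r = Reassembler()
first = r.receive(Frag(0, True, True, b"A", 0.0), now=0.0)   # delivered at once
r.receive(Frag(2, True, True, b"B", 0.1), now=0.1)           # seq 1 never arrives
late = r.receive(Frag(3, True, True, b"C", 1.6), now=1.6)    # timeout fires, queue drains
```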




interface Virtual-Template1
 description Default Template for L2TP Termination & Multilink Bundles (!)
 mtu 1454
 ip unnumbered Loopback10
 no ip redirects
 no ip proxy-arp
 ip tcp adjust-mss 1414
 no logging event link-status
 no snmp trap link-status
 ipv6 enable
 ipv6 verify unicast reverse-path
 no peer default ip address
 ppp authentication pap chap callin
 ppp ipcp dns 141.1.1.1 8.8.8.8
 ppp multilink
 ppp multilink fragment disable
end


