[c-nsp] SDSL Multilink PPP high latency / lost fragments / input errors on cisco 7200

Johannes Jakob jjj at 3js.de
Fri Mar 26 09:16:12 EDT 2010


To follow up my own thread...

All of you might have seen it at first glance - I didn't.

It's a problem on the linux side of course.
While the cisco isn't fragmenting, the linux pppd seems to be...

So I guess I'm looking for answers on the wrong list ;)

With multilink I'll probably get similar problems, because Linux doesn't
support per-packet multipath AFAIK.

BUT: one question that you guys might be able to answer:
when using packet based multipath, will that be a problem for voice and/or
UDP packets?
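
To make the question concrete, here is a minimal sketch of how per-packet
round robin over two links can reorder a constant-rate stream such as voice.
The function name, the 20 ms packet interval and the link delays are
assumptions for illustration, not measurements from these lines. In this
simple two-link model, packets only arrive out of order once the delay
difference between the links exceeds the packet spacing.

```python
def count_reordered(n_packets, interval_ms, link_delays_ms):
    """Send packets round-robin over the links and count late arrivals.

    A packet counts as reordered if it arrives after a packet with a
    higher sequence number has already been received.
    """
    arrivals = []
    for seq in range(n_packets):
        send_time = seq * interval_ms
        # Per-packet load sharing: packet N goes out on link N mod number_of_links.
        delay = link_delays_ms[seq % len(link_delays_ms)]
        arrivals.append((send_time + delay, seq))

    arrivals.sort()  # order in which the receiver sees the packets

    reordered = 0
    highest_seen = -1
    for _arrival_time, seq in arrivals:
        if seq < highest_seen:
            reordered += 1
        else:
            highest_seen = seq
    return reordered


if __name__ == "__main__":
    # Hypothetical delays in ms; 20 ms spacing is a common voice packet interval.
    for delays in ([10, 10], [10, 25], [10, 45]):
        print(delays, count_reordered(10, 20, delays))
```

With 20 ms spacing, delays of 10 ms and 25 ms keep the order, while 10 ms and
45 ms put four out of ten packets behind a later one. Whether that matters for
voice depends largely on the jitter buffer at the receiving end.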


Thanks for reading,

  John

 


On Wednesday, 24.03.2010, 15:57 +0100, Johannes Jakob
<jjj at 3js.de> wrote:
> [edited copy of email to cisco-bba yesterday]
> 
> Dear colleagues,
> 
> I've got some serious trouble debugging a problem with some of our
> multilink bundles.
> I already moved some of them to a separate LNS, to have a better
> debugging chance.
> 
> I'm talking about L2TP encapsulated PPP bundles all coming from a large
> national carrier, originated by linux based CPEs.
> All of them are directly terminated on the LNS the carrier sends the
> tunnels to.
> No forwarding, no mmlp/sgbp on these bundles.
> 
> At this debugging level, there are only two bundles with two links each,
> but the problem is exactly the same when there are more bundles (same
> carrier).
> 
> Continuously pinging the CPEs at the other end of the bundle shows peaks
> of >1000ms (up to >9000ms) and sometimes single to few packets get lost.
> 
> debug ppp multilink events at these times says:
> 
> 
> Mar 23 13:56:54: Vi128 MLP: Lost fragment timeout, seq 7BEAD
> Mar 23 13:56:54: Vi128 MLP: Discard reassembled packet
> Mar 23 13:56:54: Vi148 MLP: Lost fragment timeout, seq 6A3CF
> Mar 23 13:56:54: Vi148 MLP: Discard reassembled packet
> Mar 23 13:57:04: Vi148 MLP: Lost fragment timeout, seq 6A41F
> Mar 23 13:57:04: Vi148 MLP: Discard reassembled packet
> Mar 23 13:57:10: Vi128 MLP: Lost fragment timeout, seq 7BFDF
> Mar 23 13:57:10: Vi128 MLP: Discard reassembled packet
> Mar 23 13:57:11: Vi128 MLP: Lost fragment timeout, seq 7BFE1
> Mar 23 13:57:11: Vi128 MLP: Discard reassembled packet
> Mar 23 13:57:12: Vi128 MLP: Lost fragment timeout, seq 7BFEA
> Mar 23 13:57:12: Vi128 MLP: Begin bit lost, discard fragment 7BFEB
> 
> 
> 
> The lost fragment counters increase at these times; the reordered counter
> increases steadily all the time.
> 
> 
> 
> What drives me crazy is that it's not one single bundle that is having
> the problem at a time, but it's *all* of the bundles on this LNS
> simultaneously.
> It's not a constant problem and happens from time to time. Completely
> unpredictable (duration and interval)!
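
A minimal sketch of how the "all bundles at once" observation could be checked
against the debug output: group the "Lost fragment timeout" events by
timestamp and list the seconds in which more than one bundle interface is hit.
The log lines are the ones quoted above; the regular expression and the
function names are assumptions, and the line format is taken only from this
excerpt.

```python
import re
from collections import defaultdict

LOG = """\
Mar 23 13:56:54: Vi128 MLP: Lost fragment timeout, seq 7BEAD
Mar 23 13:56:54: Vi128 MLP: Discard reassembled packet
Mar 23 13:56:54: Vi148 MLP: Lost fragment timeout, seq 6A3CF
Mar 23 13:56:54: Vi148 MLP: Discard reassembled packet
Mar 23 13:57:04: Vi148 MLP: Lost fragment timeout, seq 6A41F
Mar 23 13:57:04: Vi148 MLP: Discard reassembled packet
Mar 23 13:57:10: Vi128 MLP: Lost fragment timeout, seq 7BFDF
Mar 23 13:57:10: Vi128 MLP: Discard reassembled packet
Mar 23 13:57:11: Vi128 MLP: Lost fragment timeout, seq 7BFE1
Mar 23 13:57:11: Vi128 MLP: Discard reassembled packet
Mar 23 13:57:12: Vi128 MLP: Lost fragment timeout, seq 7BFEA
Mar 23 13:57:12: Vi128 MLP: Begin bit lost, discard fragment 7BFEB
"""

# Matches only the "Lost fragment timeout" lines, as they appear in this excerpt.
EVENT = re.compile(
    r"^(\w{3} +\d+ [\d:]{8}): (Vi\d+) MLP: Lost fragment timeout, seq ([0-9A-F]+)$"
)


def timeouts_by_second(log_text):
    """Map each timestamp to the set of bundle interfaces that logged a timeout."""
    events = defaultdict(set)
    for line in log_text.splitlines():
        match = EVENT.match(line.strip())
        if match:
            timestamp, iface, _seq = match.groups()
            events[timestamp].add(iface)
    return dict(events)


def simultaneous(events):
    """Return the timestamps at which more than one bundle reported a timeout."""
    return sorted(ts for ts, ifaces in events.items() if len(ifaces) > 1)


if __name__ == "__main__":
    print(simultaneous(timeouts_by_second(LOG)))
```

On this short excerpt only 13:56:54 shows both Vi128 and Vi148; a longer
capture would be needed to see how strictly the bundles fail together.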
> 
> 
> 
> 
> When forwarding those links to a linux LNS running rp-l2tpd, those links
> get bundled just fine, no such problem, everything seems to be ok.
> => neither the individual lines nor the carrier itself is the problem
> here.
> 
> 
> So I wonder whether there is a common buffer that all bundles share or
> if you guys know of any other thing I could check?
> I already tried setting 
> 
> ppp multilink queue depth qos 3
> ppp multilink queue depth fifo 3 / 50 / 255
> and/or
> ppp multilink slippage mru 16
> 
> 
> without any luck or significant change.
> 
> 
> BTW: where can I check the status of these buffers?
> 
> 
> 
> 
> I already read most of the multilink related messages in the archive and
> considered switching to multipath with packet based forwarding, but
> because of the linux kernel on the CPEs only flow based forwarding can be
> deployed cpe outbound, so the usage of the single links won't be equal
> enough...
> 
> 
> 
> Any hints, tips, tricks or criticism would be appreciated ;-)
> 
> 
> Thanks in advance,
> 
> 
>    John
> 
> 
> 
> 
> P.S.: The individual links themselves are just fine, when using multipath
> instead of multilink or just forwarding them to the linux box to let it
> bundle them, everything is fine. Just when the 7200 should do it...
> everything goes crazy...
> 
> Cisco IOS Software, 7200 Software (C7200-A3JK9S-M), Version 12.4(25b),
> RELEASE SOFTWARE (fc1)
> 
> Cisco 7204VXR (NPE300) processor (revision D) with 229376K/65536K bytes of
> memory.
> Processor board ID 28711625
> R7000 CPU at 262MHz, Implementation 39, Rev 2.1, 256KB L2 Cache
> 4 slot VXR midplane, Version 2.7
> 
> 
> 
> 
> 
> 
> 
> 
> lns3#show int Vi128
> Virtual-Access128 is up, line protocol is up
>   Hardware is Virtual Access interface
>   Interface is unnumbered. Using address of Loopback11 (3.3.3.3)
>   MTU 1454 bytes, BW 2000000 Kbit/sec, DLY 100000 usec,
>      reliability 255/255, txload 1/255, rxload 1/255
>   Encapsulation PPP, LCP Open, multilink Open
>   Listen: IPV6CP
>   Open: IPCP
>   MLP Bundle vaccess, cloned from AAA, Virtual-Template1
>   Vaccess status 0x40, loopback not set
>   Keepalive set (10 sec)
>   DTR is pulsed for 5 seconds on reset
>   Last input 00:00:42, output never, output hang never
>   Last clearing of "show interface" counters 02:52:23
>   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
>   Queueing strategy: fifo
>   Output queue: 0/40 (size/max)
>   5 minute input rate 5000 bits/sec, 9 packets/sec
>   5 minute output rate 6000 bits/sec, 9 packets/sec
>      65649 packets input, 4366426 bytes, 0 no buffer
>      Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
>      80 input errors, 0 CRC, 18 frame, 0 overrun, 0 ignored, 21 abort
>      65710 packets output, 5113033 bytes, 0 underruns
>      0 output errors, 0 collisions, 0 interface resets
>      0 unknown protocol drops
>      0 output buffer failures, 0 output buffers swapped out
>      0 carrier transitions
> 
> lns3#show int Vi148
> Virtual-Access148 is up, line protocol is up
>   Hardware is Virtual Access interface
>   Interface is unnumbered. Using address of Loopback11 (3.3.3.3)
>   MTU 1454 bytes, BW 2000000 Kbit/sec, DLY 100000 usec,
>      reliability 255/255, txload 1/255, rxload 1/255
>   Encapsulation PPP, LCP Open, multilink Open
>   Listen: IPV6CP
>   Open: IPCP
>   MLP Bundle vaccess, cloned from AAA, Virtual-Template1
>   Vaccess status 0x40, loopback not set
>   Keepalive set (10 sec)
>   DTR is pulsed for 5 seconds on reset
>   Last input 00:00:14, output never, output hang never
>   Last clearing of "show interface" counters 02:56:51
>   Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
>   Queueing strategy: fifo
>   Output queue: 0/40 (size/max)
>   5 minute input rate 1000 bits/sec, 3 packets/sec
>   5 minute output rate 2000 bits/sec, 3 packets/sec
>      45163 packets input, 3066362 bytes, 0 no buffer
>      Received 0 broadcasts, 0 runts, 0 giants, 0 throttles
>      47 input errors, 0 CRC, 13 frame, 0 overrun, 0 ignored, 10 abort
>      45715 packets output, 3720617 bytes, 0 underruns
>      0 output errors, 0 collisions, 0 interface resets
>      0 unknown protocol drops
>      0 output buffer failures, 0 output buffers swapped out
>      0 carrier transitions
> 
> 
> lns3#show ppp multilink interface vi128
> 
> Virtual-Access128, bundle name is user1 at realm/0002.b641.8893
>   Username is user1 at realm
>   Endpoint discriminator is 0002.b641.8893
>   Bundle up for 16:30:44, total bandwidth 2000000, load 1/255
>   Receive buffer limit 48768 bytes, frag timeout 1000 ms
>   Using relaxed lost fragment detection algorithm.
>     0/0 fragments/bytes in reassembly list
>     41 lost fragments, 37978 reordered
>     39/1245 discarded fragments/bytes, 0 lost received
>     0x7DA00 received sequence, 0x3ED96 sent sequence
>   Member links: 2 (max not set, min not set)
>     upstream:Vi126  (2.2.2.2), since 16:30:44, unsequenced
>     upstream:Vi144  (2.2.2.2), since 16:30:42, unsequenced
> 
> 
> lns3#show ppp multilink interface vi148
> 
> Virtual-Access148, bundle name is user2 at realm/0002.b641.888d
>   Username is user2 at realm
>   Endpoint discriminator is 0002.b641.888d
>   Bundle up for 16:30:51, total bandwidth 2000000, load 1/255
>   Receive buffer limit 48768 bytes, frag timeout 1000 ms
>   Using relaxed lost fragment detection algorithm.
>     0/0 fragments/bytes in reassembly list
>     24 lost fragments, 27218 reordered
>     23/774 discarded fragments/bytes, 0 lost received
>     0x6B390 received sequence, 0x363CB sent sequence
>   Member links: 2 (max not set, min not set)
>     upstream:Vi146  (1.1.1.1), since 16:30:51, unsequenced
>     upstream:Vi147  (1.1.1.1), since 16:30:51, unsequenced
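
One rough way to cross-check the "Linux pppd seems to be fragmenting" theory
from these counters is to compare the received and sent MLP sequence numbers
of each bundle. This sketch assumes that both sequence numbers started at zero
when the bundle came up, have not wrapped since, and that about as many
packets flow in each direction (the show int output has 65649 in vs. 65710 out
on Vi128, though over a shorter period). The function name is made up for
illustration.

```python
def rx_to_tx_sequence_ratio(rx_seq_hex, tx_seq_hex):
    """Ratio of received to sent MLP sequence numbers.

    With "ppp multilink fragment disable" on the LNS, each sent packet should
    use one sequence number, so a ratio near 2 with symmetric traffic would
    suggest the peer uses about two fragments per packet.
    """
    return int(rx_seq_hex, 16) / int(tx_seq_hex, 16)


# Values copied from the "show ppp multilink" output above.
BUNDLES = {
    "Vi128": ("0x7DA00", "0x3ED96"),
    "Vi148": ("0x6B390", "0x363CB"),
}

if __name__ == "__main__":
    for name, (rx, tx) in BUNDLES.items():
        print(name, int(rx, 16), int(tx, 16),
              round(rx_to_tx_sequence_ratio(rx, tx), 3))
```

Both bundles come out close to 2 (514560 vs. 257430 and 439184 vs. 222155),
which would be consistent with the CPE splitting most packets into two
fragments while the LNS sends them whole.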
> 
> 
> 
> 
> interface Virtual-Template1
>  description Default Template for L2TP Termination & Multilink Bundles (!)
>  mtu 1454
>  ip unnumbered Loopback10
>  no ip redirects
>  no ip proxy-arp
>  ip tcp adjust-mss 1414
>  no logging event link-status
>  no snmp trap link-status
>  ipv6 enable
>  ipv6 verify unicast reverse-path
>  no peer default ip address
>  ppp authentication pap chap callin
>  ppp ipcp dns 141.1.1.1 8.8.8.8
>  ppp multilink
>  ppp multilink fragment disable
> end

