[j-nsp] Strange Packet Loss Problem

Christian Koch ckoch at globix.com
Mon Oct 23 08:46:36 EDT 2006


Josef,

I can do the rapid pings, but we can't take down this OC48 link to put a loop on it.
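
For the rapid pings, the loss per packet size can be tallied straight from the "!" / "." marks the router prints. Below is a minimal Python sketch of that tally. The function name is only illustrative, and the mark strings are the size-400 and size-500 runs quoted in this thread. On an interrupted run the router's own "packets transmitted" figure can be one higher than the number of printed marks (the size-400 run reports 20 transmitted for 19 marks), so treat the result as approximate.

```python
def rapid_ping_loss(marks):
    """Return (sent, received, loss_percent) for a rapid-ping mark string.

    '!' is counted as a reply received and '.' as a lost packet.
    Any other character (for example a trailing ^C) is ignored.
    """
    received = marks.count("!")
    lost = marks.count(".")
    sent = received + lost
    loss_percent = 100.0 * lost / sent if sent else 0.0
    return sent, received, loss_percent


if __name__ == "__main__":
    # Mark strings copied from the size-400 and size-500 runs in this thread.
    runs = {
        400: "!!!!!!!!.!!!!!...!!",
        500: "!!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!",
    }
    for size, marks in runs.items():
        sent, received, loss = rapid_ping_loss(marks)
        print(f"size {size}: {received}/{sent} replies, {loss:.1f}% loss")
```

Running the same tally over a sweep of sizes would show where the loss starts and stops without reading the marks by eye.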



-----Original Message-----
From: Josef Buchsteiner [mailto:josefb at juniper.net] 
Sent: Monday, October 23, 2006 8:45 AM
To: Christian Koch
Cc: juniper-nsp at puck.nether.net
Subject: Re: [j-nsp] Strange Packet Loss Problem

    I see... let's go back to the rapid pings then, see where
    the ICMP bad checksum counter is increasing, and then do the
    local loop actions...

    thanks
    Josef
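
To see which counter is increasing, one option is to take "show system statistics icmp" before and after a ping run and compare the two snapshots. The following is a rough Python sketch of that comparison, not a Juniper tool: the function names are made up for illustration, the parser assumes the output layout pasted in this thread (with any "CK>" quote prefix already stripped), and the second snapshot in the demo uses hypothetical values.

```python
import re


def parse_icmp_counters(text):
    """Map each counter label to its integer value.

    Handles 'N label' lines (e.g. '338 messages with bad checksum') and
    histogram lines 'label: N' (e.g. 'echo reply: 50251'). Histogram
    labels are prefixed with their section so that the input and output
    'echo reply' counters stay distinct.
    """
    counters = {}
    section = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith("histogram:"):
            section = line[:-1] + " / "
            continue
        m = re.match(r"^(\d+)\s+(.+)$", line)
        if m:
            section = ""  # a plain counter line ends the histogram
            counters[m.group(2)] = int(m.group(1))
            continue
        m = re.match(r"^(.+?):\s*(\d+)$", line)
        if m:
            counters[section + m.group(1)] = int(m.group(2))
    return counters


def changed_counters(before, after):
    """Return {label: delta} for every counter whose value changed."""
    return {
        label: value - before.get(label, 0)
        for label, value in after.items()
        if value != before.get(label, 0)
    }


if __name__ == "__main__":
    # First snapshot: two lines taken from the output in this thread.
    first = """icmp:
        338 messages with bad checksum
        Input histogram:
                echo reply: 50251
    """
    # Second snapshot: hypothetical values, for illustration only.
    second = """icmp:
        341 messages with bad checksum
        Input histogram:
                echo reply: 50300
    """
    delta = changed_counters(parse_icmp_counters(first),
                             parse_icmp_counters(second))
    for label, change in delta.items():
        print(f"{label}: +{change}")
```

Taking the snapshots on both routers around the same ping run would show on which side the bad checksum counter moves.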

Monday, October 23, 2006, 2:38:21 PM, you wrote:
CK>    
CK>    
CK> Hey Josef,
CK>  
CK>  Real quick, here is a regular ping, which looks fine, and a
CK>  snapshot of the system ICMP stats.
CK>  
CK>  ckoch at core2.lhr3> ping 212.71.229.25
CK>  PING 212.71.229.25 (212.71.229.25): 56 data bytes
CK>  64 bytes from 212.71.229.25: icmp_seq=0 ttl=251 time=1.113 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=1 ttl=251 time=1.063 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=2 ttl=251 time=0.957 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=3 ttl=251 time=0.995 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=4 ttl=251 time=0.947 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=5 ttl=251 time=1.429 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=6 ttl=251 time=0.934 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=7 ttl=251 time=1.017 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=8 ttl=251 time=0.935 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=9 ttl=251 time=0.973 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=10 ttl=251 time=1.138 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=11 ttl=251 time=1.003 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=12 ttl=251 time=0.946 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=13 ttl=251 time=1.172 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=14 ttl=251 time=1.002 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=15 ttl=251 time=1.009 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=16 ttl=251 time=0.921 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=17 ttl=251 time=1.009 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=18 ttl=251 time=0.959 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=19 ttl=251 time=0.973 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=20 ttl=251 time=2.362 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=21 ttl=251 time=1.034 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=22 ttl=251 time=1.007 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=23 ttl=251 time=0.913 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=24 ttl=251 time=0.959 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=25 ttl=251 time=1.021 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=26 ttl=251 time=0.996 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=27 ttl=251 time=0.966 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=28 ttl=251 time=3.044 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=29 ttl=251 time=0.940 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=30 ttl=251 time=0.946 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=31 ttl=251 time=1.054 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=32 ttl=251 time=0.989 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=33 ttl=251 time=0.973 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=34 ttl=251 time=0.963 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=35 ttl=251 time=1.028 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=36 ttl=251 time=0.954 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=37 ttl=251 time=0.965 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=38 ttl=251 time=0.999 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=39 ttl=251 time=1.003 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=40 ttl=251 time=0.897 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=41 ttl=251 time=1.126 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=42 ttl=251 time=0.913 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=43 ttl=251 time=1.058 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=44 ttl=251 time=0.942 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=45 ttl=251 time=1.030 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=46 ttl=251 time=0.941 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=47 ttl=251 time=1.047 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=48 ttl=251 time=0.938 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=49 ttl=251 time=0.954 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=50 ttl=251 time=0.964 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=51 ttl=251 time=0.976 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=52 ttl=251 time=0.944 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=53 ttl=251 time=1.006 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=54 ttl=251 time=0.972 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=55 ttl=251 time=1.585 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=56 ttl=251 time=0.990 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=57 ttl=251 time=11.160 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=58 ttl=251 time=0.948 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=59 ttl=251 time=21.018 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=60 ttl=251 time=0.996 ms
CK>  64 bytes from 212.71.229.25: icmp_seq=61 ttl=251 time=30.768 ms
CK>  ^C
CK>  --- 212.71.229.25 ping statistics ---
CK>  62 packets transmitted, 62 packets received, 0% packet loss
CK>  round-trip min/avg/max/stddev = 0.897/2.029/30.768/4.634 ms
CK>  
CK>  ckoch at core2.lhr3> show system statistics icmp
CK>  icmp:
CK>          0 drops due to rate limit
CK>          3747728 calls to icmp_error
CK>          0 errors not generated because old message was icmp
CK>          Output histogram:
CK>                  echo reply: 9220380
CK>                  destination unreachable: 3745300
CK>                  time exceeded: 1577
CK>                  time stamp reply: 472
CK>          2 messages with bad code fields
CK>          0 messages less than the minimum length
CK>          338 messages with bad checksum
CK>          2 messages with bad source address
CK>          28 messages with bad length
CK>          15 echo drops with broadcast or multicast destinaton address
CK>          0 timestamp drops with broadcast or multicast destination address
CK>          Input histogram:
CK>                  echo reply: 50251
CK>                  destination unreachable: 5631
CK>                  source quench: 6
CK>                  routing redirect: 829
CK>                  #7: 2
CK>                  echo: 9220395
CK>                  time exceeded: 1883
CK>                  parameter problem: 1
CK>                  time stamp: 472
CK>                  time stamp reply: 1
CK>                  information request reply: 20
CK>                  address mask reply: 2
CK>          9220852 message responses generated
CK>  
CK>  -----Original Message-----
CK>  From: Josef Buchsteiner [mailto:josefb at juniper.net]
CK>  Sent: Saturday, October 21, 2006 9:27 AM
CK>  To: Christian Koch
CK>  Cc: juniper-nsp at puck.nether.net
CK>  Subject: Re: [j-nsp] Strange Packet Loss Problem
CK>  
CK>  Christian,
CK>  
CK>            If it is data corruption on certain bit patterns, then
CK>            you should see this problem without rapid pings as
CK>            well. To find out where this happens, I would run a
CK>            regular ping (you may need more packets) and watch the
CK>            ICMP statistics with "show system statistics icmp".
CK>            If you get ICMP checksum errors, that confirms it is
CK>            data corruption.
CK>  
CK>            To find out whether this is local or remote (you could
CK>            already draw some conclusions from the ICMP statistics,
CK>            but I prefer solid results), you need to put this
CK>            SONET link into local loop with encapsulation cisco-hdlc
CK>            and no-keepalives. Then run the ping again with the
CK>            bypass-routing and interface knobs to make sure the
CK>            traffic goes out and comes back in, and watch the
CK>            ping result and the ICMP statistics. This way you will
CK>            find out whether it is the local M40 or the remote M40.
CK>  
CK>            Once you know this, it becomes a bit tricky. The FPC
CK>            in question could be the trigger, but other FPCs also
CK>            contribute to the shared data buffer, which means you
CK>            would need to turn them off one by one and see if the
CK>            error stops. If you are left with only the OC-48 and
CK>            still see the error, then I would suggest replacing
CK>            this particular FPC. Also sanity check the other links
CK>            of the router if possible.
CK>  
CK>            Also check the messages log for any ECC error that was
CK>            ever reported. Just one of them is already enough to
CK>            give you a hint, and you would not need to do the
CK>            isolation work above.
CK>  
CK>            hope this helps
CK>            Josef
CK>  
CK>            PS: Yes, I have seen such symptoms in the past, and
CK>            not only on one vendor.
CK>  
CK>  
CK>           
CK>  Friday, October 20, 2006, 8:36:09 PM, you wrote:
CK>  
 CK>>
 CK>>
 CK>> Hi All,
 CK>>
 CK>>  I am experiencing a strange issue between 2 M40 core routers
 CK>>  connected through an OC48 link.
 CK>>
 CK>>  I experience packet loss only when sending packets between 293 and
 CK>>  599 bytes.
 CK>>
 CK>>  I am also speaking to cold telecom about the issue to see if it's a
 CK>>  problem with the SONET link, but as of now the link is not taking
 CK>>  any errors or alarms.
 CK>>
 CK>>  Anyone seen anything strange like this before?
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 300
 CK>>  PING 212.71.229.25 (212.71.229.25): 300 data bytes
 CK>>  !.!!.!!.!!!!.!.!....!!.!!!!!..!..!!.!!!..!!.!!....!!.!...!!!...!.!.!..!.
 CK>>  .!!...!....!..!.!.!!.!!.!!....!.!!!!!!..!..!.!!.!..!!.!!^C
 CK>>  --- 212.71.229.25 ping statistics ---
 CK>>  129 packets transmitted, 66 packets received, 48% packet loss
 CK>>  round-trip min/avg/max/stddev = 1.023/1.640/34.095/4.035 ms
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 400
 CK>>  PING 212.71.229.25 (212.71.229.25): 400 data bytes
 CK>>  !!!!!!!!.!!!!!...!!^C
 CK>>  --- 212.71.229.25 ping statistics ---
 CK>>  20 packets transmitted, 15 packets received, 25% packet loss
 CK>>  round-trip min/avg/max/stddev = 1.074/1.111/1.348/0.071 ms
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 500
 CK>>  PING 212.71.229.25 (212.71.229.25): 500 data bytes
 CK>>  !!!!!!!!!!!!!!.!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!.!^C
 CK>>  --- 212.71.229.25 ping statistics ---
 CK>>  49 packets transmitted, 46 packets received, 6% packet loss
 CK>>  round-trip min/avg/max/stddev = 1.141/1.202/2.514/0.210 ms
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 600
 CK>>  PING 212.71.229.25 (212.71.229.25): 600 data bytes
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>  !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!^C
 CK>>  --- 212.71.229.25 ping statistics ---
 CK>>  955 packets transmitted, 954 packets received, 0% packet loss
 CK>>  round-trip min/avg/max/stddev = 1.183/1.526/36.436/1.817 ms
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 292
 CK>>  PING 212.71.229.25 (212.71.229.25): 292 data bytes
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>
 CK>>
CK> 
CK> !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
 CK>>  !!!!!!!!^C!
 CK>>  --- 212.71.229.25 ping statistics ---
 CK>>  916 packets transmitted, 916 packets received, 0% packet loss
 CK>>  round-trip min/avg/max/stddev = 0.988/1.146/9.918/1.061 ms
 CK>>
 CK>>  ckoch at core2.lhr3> ping rapid 212.71.229.25 count 10000 size 293
 CK>>  PING 212.71.229.25 (212.71.229.25): 293 data bytes
 CK>>  ..!!!.!^C
 CK>>
 CK>>  Christian
 CK>>  _______________________________________________
 CK>>  juniper-nsp mailing list juniper-nsp at puck.nether.net
 CK>>  https://puck.nether.net/mailman/listinfo/juniper-nsp