[j-nsp] M7i/M10i - 8.5R4.3 - cfeb RDP: Keepalive timeout for rdp.(scb:39937)

Gibson, Aaron F aaron.gibson at verizonbusiness.com
Fri Oct 9 17:46:06 EDT 2009


Nilesh-

Per the subject line, the FEB/CFEB failures have been predominantly on 8.5R4.3.

Thank you for the information below. I will forward this on to our NOC and
have them begin uploading that data to the currently open cases.

Thanks
Aaron

-----Original Message-----
From: Nilesh Khambal [mailto:nkhambal at juniper.net] 
Sent: Friday, October 09, 2009 2:11 PM
To: Gibson, Aaron F; juniper-nsp at puck.nether.net
Subject: Re: [j-nsp] M7i/M10i - 8.5R4.3 - cfeb RDP: Keepalive timeout for
rdp.(scb:39937)

Hi Aaron,

What is the JUNOS version on this router?

RDP is a Reliable Delivery Protocol. It's an internal TCP-like protocol used
between RE and PFE (CFEB board in this case) to communicate with each other and
exchange information such as routes, stats, interface status etc. This
communication happens over a socket which connects RE and PFE. Periodic
keepalives are exchanged by RE and PFE to check the health of this
connection.
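
To illustrate the general idea of keepalive-based health checking described
above, here is a minimal Python sketch. It is not the actual RDP
implementation, whose internals and timer values are not given here; the
class name, the timeout value, and the injectable clock are all assumptions
made for illustration.

```python
import time


class KeepaliveMonitor:
    """Hypothetical sketch: track the last keepalive seen from a peer.

    This only illustrates the general keepalive-timeout idea; it does not
    reflect how RDP is actually implemented between the RE and the PFE.
    """

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s      # assumed timeout, not a real RDP value
        self.clock = clock              # injectable so the logic can be tested
        self.last_seen = clock()        # treat creation time as the first keepalive

    def on_keepalive(self):
        # Record the arrival time of a keepalive from the peer.
        self.last_seen = self.clock()

    def timed_out(self):
        # The peer is considered unhealthy once more than timeout_s seconds
        # have passed without a keepalive.
        return self.clock() - self.last_seen > self.timeout_s
```

In this sketch, a busy CPU or exhausted buffers on either side would simply
show up as on_keepalive() not being called in time, which is why the checks
below focus on CPU, memory and traffic load.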

If you are seeing RDP keepalive timeouts and subsequent CFEB resets, you
should start by checking the following things.

- is RE CPU running high? (show chassis routing-engine and show system
  processes extensive)
- is CFEB CPU running high? (show chassis cfeb)
- is CFEB running low on heap/buffer memory? (show chassis cfeb)
- is RE running low on system buffers? ("show system virtual-memory" and
  "show system buffers")
- is there any type of DDoS attack (an ARP storm, for example) targeting the
  router interfaces and sending too many packets to the RE? Do you have a
  loopback filter configured?
- if sampling is enabled, is the sampling rate high? A sudden increase in
  traffic volume may cause more data than normal to be sampled.
- investigate possible configuration and routing policy changes in your
  network that would cause this router to attract more traffic than it
  did before the problem surfaced.

In addition, gather the output of these commands as well (take at least 2-3
snapshots at 5-minute intervals).

- show pfe statistics traffic
- show pfe statistics notification
- show pfe statistics discard
- show pfe statistics error
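
The snapshot collection above could be scripted along these lines. This is
only a sketch: the run() callable is hypothetical and stands for whatever
method you use to execute a CLI command on the router and return its output
as text. The function name and the returned structure are likewise
assumptions.

```python
import time

# The four commands listed above.
COMMANDS = [
    "show pfe statistics traffic",
    "show pfe statistics notification",
    "show pfe statistics discard",
    "show pfe statistics error",
]


def collect_snapshots(run, commands=COMMANDS, count=3, interval_s=300,
                      sleep=time.sleep, now=time.time):
    """Collect `count` snapshots of each command, `interval_s` seconds apart.

    `run` is a hypothetical callable supplied by the caller: it takes a
    command string and returns that command's output as a string.
    """
    snapshots = []
    for i in range(count):
        taken_at = now()
        outputs = {cmd: run(cmd) for cmd in commands}
        snapshots.append({"taken_at": taken_at, "outputs": outputs})
        if i < count - 1:
            sleep(interval_s)   # wait 5 minutes between snapshots by default
    return snapshots
```

Keeping a timestamp with each snapshot makes it easier to compare counters
between snapshots when the data is attached to a case.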

From the CFEB shell (to log into the CFEB, enter the shell as root and run "vty cfeb0"):

- show packet
- show packet statistics
- show ttp statistics
- show heap 0
- show heap 1
- show route summary
- show syslog messages
- show syslog info
- show nvram

Gather all the information and provide this data to the JTAC case owner.

Thanks,
Nilesh.

On 10/9/09 9:57 AM, "Gibson, Aaron F" <aaron.gibson at verizonbusiness.com>
wrote:

> We have been losing CFEBs like a plague, all with the above error (or
> similar) in the logs.  No one at Juniper seems to know what RDP is
> (nearly 30 JTAC tickets opened in the last few months). Does anyone on
> this list have any insight?
> 
> Aaron


