[j-nsp] Virtual Chassis RPD/BGP Rsync high CPU

Scott Harvanek scott.harvanek at login.com
Wed Sep 24 13:38:32 EDT 2014


Okay, so we traced this down to BGP replication for NSR.  It looks like a 
bad attribute kills the replication process.  Other than blocking the 
received prefix, is there a way to fix this:

Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]: Received malformed update from xxxxxxxxx
Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]:   Family inet-unicast, prefix 5.56.168.0/21
Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]:   Malformed Attribute AGGREGATOR4(18) flag 0xc0 length 8.
Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]: Total incoming malformed attributes from xxxxxxxxxx since last logging
Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]:   Received  1 malformed attribute AGGREGATOR4(18)

Mind you, the primary session with the peer stays up; this only kills 
the replication process.
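
For anyone reading the malformed-attribute line in the log above, here is a minimal Python sketch that pulls out the attribute code, flag byte, and length and names the flag bits. The regular expression is an assumption inferred from this single sample line (with the mail wrap joined), and `decode_malformed` is a hypothetical helper, not anything shipped with Junos. The flag bit meanings are the standard ones from the BGP path attribute header (RFC 4271); code 18 is AS4_AGGREGATOR in RFC 6793, whose value is a 4-octet AS number plus a 4-octet address, i.e. 8 octets.

```python
import re

# Flag bits of the BGP path attribute header (RFC 4271, section 4.3).
FLAG_BITS = {
    0x80: "optional",
    0x40: "transitive",
    0x20: "partial",
    0x10: "extended-length",
}

# Pattern inferred from the one rpd line in this thread; other Junos
# releases may format the message differently (assumption).
LINE_RE = re.compile(
    r"Malformed Attribute\s+(?P<name>\w+)\((?P<code>\d+)\)\s+"
    r"flag\s+(?P<flag>0x[0-9a-fA-F]+)\s+length\s+(?P<length>\d+)"
)


def decode_malformed(line):
    """Return the attribute details from one log line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    flag = int(m.group("flag"), 16)
    return {
        "name": m.group("name"),
        "code": int(m.group("code")),
        "flags": [label for bit, label in FLAG_BITS.items() if flag & bit],
        "length": int(m.group("length")),
    }


sample = (
    "Sep 24 17:31:06  TUS-2-VC-1 rpd[48424]:   Malformed Attribute "
    "AGGREGATOR4(18) flag 0xc0 length 8."
)
print(decode_malformed(sample))
```

On the sample line this reports the flags as optional and transitive with length 8, so the header fields by themselves match what RFC 6793 describes for that attribute. The log does not say what rpd actually objected to.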

Scott H.

On 9/18/14, 11:38 AM, Scott Harvanek wrote:
> Has anyone had an issue with MX units in a VC where BGP rsync was 
> consuming a boatload of CPU?
>
> Master chassis shows:
> Task                       Started    User Time  System Time  Longest Run
> BGP rsync                     9650          10.          0.8          0.0
> ( BGP rsync is the only task with any user time during high user CPU 
> for rpd )
>
> Now, that's only about 20% CPU on the master, but on the slave it's 
> 90%.  This seems to have started when our total paths exceeded 2 
> million, but it does not seem to be a memory issue:
>
>     Dynamically allocated memory:  411009024      Maximum: 808517632
>          Program data+BSS memory:    5537792      Maximum: 5537792
>               Page data overhead:    1196032      Maximum: 1196032
>              Page directory size:     212992      Maximum: 212992
>                   ----------
>               Total bytes in use:  417955840 (12% of available memory)
>
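
As a quick sanity check on the quoted memory figures, the short Python sketch below adds up the four "in use" lines and compares the sum with the reported total. The dictionary keys are just labels chosen for illustration. The implied amount of available memory is only a rough estimate, since the 12% figure in the output is rounded.

```python
# Byte counts copied from the quoted memory summary (current values,
# not the "Maximum" column).
in_use = {
    "dynamically_allocated": 411009024,
    "program_data_bss": 5537792,
    "page_data_overhead": 1196032,
    "page_directory": 212992,
}

reported_total = 417955840   # "Total bytes in use" from the output
reported_percent = 12        # rounded percentage from the output

total = sum(in_use.values())
print("sum of components:", total)
print("matches reported total:", total == reported_total)

# Approximate only: back out available memory from the rounded percentage.
implied_available = reported_total / (reported_percent / 100)
print("implied available memory (GiB, approx):",
      round(implied_available / 2**30, 1))
```

The components do add up to the reported total, and the implied available memory is on the order of 3 GiB, which fits the observation that rpd is nowhere near a memory limit.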


