[j-nsp] Virtual Chassis RPD/BGP Rsync high CPU
Scott Harvanek
scott.harvanek at login.com
Wed Sep 24 13:38:32 EDT 2014
Okay, so we traced this down to BGP replication for NSR. It looks like a
bad attribute kills the replication process. Other than blocking the
received prefix, is there a way to fix this? Relevant log lines:
Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Received malformed update from xxxxxxxxx
Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Family inet-unicast, prefix 5.56.168.0/21
Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Malformed Attribute AGGREGATOR4(18) flag 0xc0 length 8.
Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Total incoming malformed attributes from xxxxxxxxxx since last logging
Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Received 1 malformed attribute AGGREGATOR4(18)
Mind you, the primary session with the peer stays up, this only kills
the replication process...
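
For anyone who wants to pick the attribute line apart, here is a minimal
Python sketch (not anything rpd itself does). It pulls the type code,
flags, and length out of the "Malformed Attribute" message and decodes the
flag octet using the bit layout from RFC 4271. The function names and the
regular expression are my own assumptions, written against the one log
line shown above; other Junos releases may word the message differently.

```python
import re

# BGP path-attribute flag bits (high-order bits of the flags octet, RFC 4271).
FLAG_BITS = {
    0x80: "optional",
    0x40: "transitive",
    0x20: "partial",
    0x10: "extended-length",
}

# Hypothetical pattern, written to match the unwrapped log line quoted above.
ATTR_RE = re.compile(
    r"Malformed Attribute\s+(?P<name>\w+)\((?P<code>\d+)\)\s+"
    r"flag\s+(?P<flags>0x[0-9a-fA-F]+)\s+length\s+(?P<length>\d+)"
)


def decode_flags(flags):
    """Return the names of the flag bits set in the attribute flags octet."""
    return [name for bit, name in FLAG_BITS.items() if flags & bit]


def parse_malformed_attribute(line):
    """Extract attribute name, type code, flags, and length from one log line.

    Returns None when the line does not contain a matching message.
    """
    m = ATTR_RE.search(line)
    if m is None:
        return None
    flags = int(m.group("flags"), 16)
    return {
        "name": m.group("name"),
        "code": int(m.group("code")),
        "flags": flags,
        "flag_names": decode_flags(flags),
        "length": int(m.group("length")),
    }


line = ("Sep 24 17:31:06 TUS-2-VC-1 rpd[48424]: Malformed Attribute "
        "AGGREGATOR4(18) flag 0xc0 length 8.")
info = parse_malformed_attribute(line)
print(info)
```

Flag 0xc0 decodes to optional plus transitive, and type 18 with length 8
is what RFC 6793 defines for AS4_AGGREGATOR (4-byte AS number plus 4-byte
address). So the encoding itself looks plausible, and rpd may be objecting
for some other reason; the log does not say which.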
Scott H.
On 9/18/14, 11:38 AM, Scott Harvanek wrote:
> Has anyone had an issue with MX units in a VC where BGP rsync was
> consuming a boatload of CPU?
>
> Master chassis shows:
> Task         Started   User Time   System Time   Longest Run
> BGP rsync       9650         10.           0.8           0.0
> ( BGP rsync is the only task with any user time during high user CPU
> for rpd )
>
> Now, that's only about 20% CPU on the master, but on the slave it's 90%.
> This seems to have happened when our total paths exceeded 2MM, but it
> does not seem to be a memory issue:
>
> Dynamically allocated memory: 411009024 Maximum: 808517632
> Program data+BSS memory: 5537792 Maximum: 5537792
> Page data overhead: 1196032 Maximum: 1196032
> Page directory size: 212992 Maximum: 212992
> ----------
> Total bytes in use: 417955840 (12% of available memory)
>