[j-nsp] MX104 with full BGP table problems

Brad Fleming bdflemin at gmail.com
Fri May 16 14:00:05 EDT 2014


We’ve been working with a handful of MX104s on the bench in preparation for putting them into a live network. We started pushing a full BGP table into the device and ran into some CPU utilization problems.

We tried pushing a full table into the box three different ways:
1) via an eBGP session
2) via a single iBGP session carrying reflected routes
3) via a full mesh of iBGP sessions (11 other routers)

In situation #1: RE CPU was slightly elevated but remained ~60% idle, and 1-minute load averages were around 0.3.

In situation #2: RE CPU is highly elevated. We maintain actual point-to-point /30s for our next-hops (I know, not best practice for many networks), which results in a total of about 50-65 next-hops network-wide.

In situation #3: RE CPU is saturated at all times. In this case we configured the mesh sessions to advertise routes with “next-hop-self” so the number of next-hops is reduced to 11 total.

It appears that rpd is the process actually killing the CPU; it is nearly always running at 75+% and in a “RUN” state. If we enable task accounting, it shows “Resolve tree 2” as the task consuming tons of CPU time (see below). There’s plenty of RAM remaining, we’re not using any swap space, and we’ve not exceeded the number of routes licensed for the system; we paid for the full 1 million+ route scaling. The logs are full of messages about lost communication with the backup RE; however, if we disable all the BGP sessions, that issue goes away completely (for days on end).

Has anyone else tried shoving a full BGP table into one of these routers yet? Have you noticed anything similar?

I’ve opened a JTAC case for the issue, but I’m wondering if anyone with more experience in multi-RE setups has seen something similar. Thanks in advance for any thoughts, suggestions, or insights.


Incoming command output dump….

netadm@test-MX104> show chassis routing-engine
Routing Engine status:
  Slot 0:
    Current state                  Master
    Election priority              Master (default)
    Temperature                 39 degrees C / 102 degrees F
    CPU temperature             42 degrees C / 107 degrees F
    DRAM                      3968 MB (4096 MB installed)
    Memory utilization          32 percent
    CPU utilization:
      User                      87 percent
      Background                 0 percent
      Kernel                    11 percent
      Interrupt                  2 percent
      Idle                       0 percent
    Model                          RE-MX-104
    Serial ID                      CACH2444
    Start time                     2009-12-31 18:05:43 CST
    Uptime                         21 hours, 31 minutes, 32 seconds
    Last reboot reason             0x200:normal shutdown
    Load averages:                 1 minute   5 minute  15 minute
                                       1.06       1.12       1.23
Routing Engine status:
  Slot 1:
    Current state                  Backup
    Election priority              Backup (default)
    Temperature                 37 degrees C / 98 degrees F
    CPU temperature             38 degrees C / 100 degrees F
    DRAM                      3968 MB (4096 MB installed)
    Memory utilization          30 percent
    CPU utilization:
      User                      62 percent
      Background                 0 percent
      Kernel                    15 percent
      Interrupt                 24 percent
      Idle                       0 percent
    Model                          RE-MX-104
    Serial ID                      CACD1529
    Start time                     2010-03-18 05:16:34 CDT
    Uptime                         21 hours, 45 minutes, 26 seconds
    Last reboot reason             0x200:normal shutdown
    Load averages:                 1 minute   5 minute  15 minute
                                       1.22       1.19       1.20

netadm@test-MX104> show system processes extensive
last pid: 20303;  load averages:  1.18,  1.14,  1.22  up 0+21:33:35    03:03:41
127 processes: 8 running, 99 sleeping, 20 waiting
Mem: 796M Active, 96M Inact, 308M Wired, 270M Cache, 112M Buf, 2399M Free
Swap: 1025M Total, 1025M Free
  PID USERNAME         THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
 3217 root               1 132    0   485M   432M RUN    120:56 72.85% rpd

netadm@test-MX104> show task accounting
Task accounting is enabled.

Task                       Started    User Time  System Time  Longest Run
Scheduler                    32294        0.924        0.148        0.000
Memory                          26        0.001        0.000        0.000
RT                            5876        0.947        0.162        0.003
hakr                             6        0.000        0.000        0.000
OSPF I/O./var/run/ppmd_co      117        0.002        0.000        0.000
BGP rsync                      192        0.007        0.001        0.000
BGP_RT_Background               78        0.001        0.000        0.000
BGP_Listen.0.0.0.0+179        2696        1.101        0.218        0.009
PIM I/O./var/run/ppmd_con      117        0.003        0.000        0.000
OSPF                           629        0.005        0.000        0.000
BGP Standby Cache Task          26        0.000        0.000        0.000
BFD I/O./var/run/bfdd_con      117        0.003        0.000        0.000
BGP_2495_2495.164.113.199     1947        0.072        0.012        0.000
BGP_2495_2495.164.113.199     1566        0.056        0.010        0.000
BGP_2495_2495.164.113.199     1388        0.053        0.008        0.000
Resolve tree 3                1421       24.523       13.270        0.102
Resolve tree 2               14019    16:33.079       20.983        0.101
Mirror Task.128.0.0.6+584      464        0.018        0.004        0.000
KRT                           1074        0.157        0.159        0.004
Redirect                         9        0.000        0.000        0.000
MGMT_Listen./var/run/rpd_       54        0.009        0.005        0.000
SNMP Subagent./var/run/sn      258        0.052        0.052        0.001
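
For anyone who wants to compare their own box against this, here is a rough Python sketch that ranks tasks from saved "show task accounting" output by total CPU time. It is only an illustration: the function names and the SAMPLE text are made up for this example (the sample rows are copied from the output above), and it assumes the time columns are seconds, with values like 16:33.079 read as minutes:seconds.

```python
# Hypothetical helper, not a Juniper tool: rank rpd tasks by CPU time
# from pasted "show task accounting" output.

SAMPLE = """\
Task                       Started    User Time  System Time  Longest Run
Scheduler                    32294        0.924        0.148        0.000
BGP_Listen.0.0.0.0+179        2696        1.101        0.218        0.009
Resolve tree 3                1421       24.523       13.270        0.102
Resolve tree 2               14019    16:33.079       20.983        0.101
KRT                           1074        0.157        0.159        0.004
"""


def parse_cpu_time(field):
    """Convert 'SS.mmm' or 'MM:SS.mmm' (assumed format) to seconds."""
    seconds = 0.0
    for part in field.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def rank_tasks(output):
    """Return (task, started, user+system seconds), busiest first."""
    rows = []
    for line in output.splitlines():
        # Task names can contain spaces, so split the four numeric
        # columns off the right-hand side of the line.
        fields = line.rsplit(None, 4)
        if len(fields) != 5:
            continue
        name, started, user, system, _longest = fields
        try:
            total = parse_cpu_time(user) + parse_cpu_time(system)
            rows.append((name.strip(), int(started), total))
        except ValueError:
            continue  # header or other non-data line
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for name, started, total in rank_tasks(SAMPLE):
        print(f"{name:<28} {started:>7} {total:>10.3f}s")
```

Splitting from the right keeps task names such as "Resolve tree 2" intact, and the header line is skipped because its columns do not parse as numbers.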

