[j-nsp] routing updates between PFEs and Kernel

Good One good1 at live.com
Wed Dec 15 08:30:57 EST 2010


+2 to Mr. Richard. Could you please explain the same for the MX-960 Trio MPC?
Thanks
BR//Masood 

> Date: Wed, 3 Nov 2010 14:31:37 -0500
> From: ras at e-gerbil.net
> To: good1 at live.com
> CC: juniper-nsp at puck.nether.net
> Subject: Re: [j-nsp] routing updates between PFEs and Kernel
> 
> On Wed, Nov 03, 2010 at 11:34:59PM +0500, Good One wrote:
> > 
> > Thanks for the useful information, Richard. Well, a DPC has 1G of RAM 
> > inside, and if each PFE has a complete copy of the routing table (even 
> > if only the best routes) and you are receiving a full Internet feed plus 
> > thousands of your own routes, then all 4 PFEs should fill up that 1G 
> > RAM (I assume all 4 PFEs are using/sharing the DPC's 1G RAM to store 
> > the routing table) ... not sure how to connect to a PFE individually; all 
> > I can do is 'start shell pfe network fpc0', which connects you to a DPC 
> > and not the PFEs sitting somewhere on the DPC :)
> 
> Not quite. There are 3 different types of memory on the DPC:
> 
> Slot 0 information:
>   State                                 Online    
>   Temperature                        29 degrees C / 84 degrees F
>   Total CPU DRAM                   1024 MB <--- 1
>   Total RLDRAM                      256 MB <--- 2
>   Total DDR DRAM                   4096 MB <--- 3
> 
> The CPU DRAM is just general purpose RAM like you'd find on any PC. This 
> is where the microkernel runs (which is what you're talking to when you 
> do a "start shell pfe network fpc#"), on a 1.2GHz PowerPC embedded 
> processor. It also handles things like IP options, TTL expiration, ICMP 
> generation from the data plane, and the like.
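> 
> A minimal Python sketch of that split between the ASIC fast path and the 
> line-card CPU, purely as an illustration (the function name and packet 
> fields are invented for the example, not anything from Junos):
> 
>     # Illustrative only: which packets get punted from the forwarding
>     # ASICs up to the line-card CPU (microkernel) for slow-path handling.
>     def punt_to_cpu(packet):
>         if packet["ip_options"]:        # IP options need CPU processing
>             return True
>         if packet["ttl"] <= 1:          # TTL expired -> ICMP generation
>             return True
>         return False                    # everything else stays in hardware
> 
>     print(punt_to_cpu({"ip_options": False, "ttl": 64}))  # False: fast path
>     print(punt_to_cpu({"ip_options": False, "ttl": 1}))   # True: punt to CPU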
> 
> The RLDRAM (reduced latency DRAM) is the "lookup memory": this is where 
> the final copy of the routing table used for forwarding (called the FIB) 
> is stored, along with information about firewall rules, etc. This memory 
> is directly accessed by the forwarding ASICs, and needs to be low 
> latency in order to keep up with the number of lookups/sec required on a 
> high speed router.
> 
> On older platforms this would typically have been done with SRAM, which 
> is very fast but also very expensive. On an old M/T box you might have 
> seen 8MB or 16MB of SRAM per PFE, enough for a 20Gbps PFE handling 
> 2xOC192 (50Mpps+ lookups/sec), but with a capacity of well under 500k 
> routes in the FIB. The MX (and M120) introduced a new model for doing 
> routing lookups using RLDRAM, which is much cheaper, and thus you can 
> put a lot more of it on the PFE.
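> 
> Back-of-the-envelope for those lookup rates, as a small Python sketch 
> (my own arithmetic; the 84-byte minimum Ethernet frame on the wire and 
> the ~47-byte minimum POS packet are assumptions, not figures from above):
> 
>     # Worst-case lookups/sec from line rate and minimum packet size.
>     def mpps(line_rate_bps, min_packet_bytes):
>         return line_rate_bps / (min_packet_bytes * 8) / 1e6
> 
>     # 2xOC192 POS: ~9.6 Gbps of payload each, 40-byte IP + ~7B framing
>     print(round(mpps(2 * 9.6e9, 47)))    # ~51, i.e. the "50Mpps+" figure
> 
>     # 10GE: 64-byte frame + 20 bytes preamble/IFG = 84 bytes on the wire
>     print(round(mpps(10e9, 84), 1))      # ~14.9, the per-PFE rate on MX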
> 
> Each DPC PFE actually has 4x32MB RLDRAM chips, but they run as 2 banks 
> of 2x32MB mirrored blocks. The first bank holds your routing 
> information, the second bank holds your firewall information. The 
> mirroring of 2x32MB is necessary to meet the performance requirements 
> using the slower RLDRAM, since you can do twice as many lookups/sec if 
> you have 2 banks to query from. 
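> 
> A toy Python sketch of why two mirrored banks double the lookup rate 
> (the class and names are invented for illustration; real RLDRAM access 
> obviously doesn't look like a Python dict):
> 
>     # Two identical copies of the lookup data let the hardware alternate
>     # reads between banks, keeping twice as many lookups in flight.
>     class MirroredBanks:
>         def __init__(self, fib):
>             self.banks = [dict(fib), dict(fib)]  # 2 mirrored 32MB blocks
>             self.next_bank = 0
> 
>         def lookup(self, prefix):
>             bank = self.banks[self.next_bank]    # alternate between banks
>             self.next_bank ^= 1
>             return bank.get(prefix)
> 
>     fib = {"192.0.2.0/24": "xe-0/0/0", "198.51.100.0/24": "xe-0/0/1"}
>     banks = MirroredBanks(fib)
>     print(banks.lookup("192.0.2.0/24"))      # served from bank 0
>     print(banks.lookup("198.51.100.0/24"))   # served from bank 1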
> 
> The MX architecture also makes this easier, since it uses a larger 
> number of relatively low-speed PFEs (4 PFEs of 10G each), and is Ethernet 
> only. To support 10GE or 10x1GE you only need to do 14.8Mpps per PFE, 
> which is a lot easier than the older 20G PFEs on T-series which needed 
> to do 50Mpps+ to support 2xOC192. This is how the MX is implemented 
> economically, and still manages to deliver support for well over 1 
> million FIB entries. The 256MB being reported in the show chassis fpc 
> output is your 4 PFEs * 64MB worth of available memory, which is really 
> mirrored banks of 2x32MB each.
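> 
> The memory math from the two paragraphs above, restated as a tiny 
> Python sketch (nothing new here, just the numbers from the post):
> 
>     # Per PFE: 4 x 32MB RLDRAM chips, run as 2 banks of 2x32MB mirrored,
>     # so each bank presents 32MB of usable lookup memory.
>     chip_mb        = 32
>     banks_per_pfe  = 2
>     usable_per_pfe = banks_per_pfe * chip_mb   # 64MB visible per PFE
>     fitted_per_pfe = 4 * chip_mb               # 128MB physically fitted
> 
>     pfes_per_dpc = 4
>     print(pfes_per_dpc * usable_per_pfe)   # 256 -> "Total RLDRAM 256 MB"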
> 
> Finally, the DDR DRAM is the "packet buffering" memory, which holds the 
> copy of the packet as it moves through the system. When you receive a 
> packet, its contents are stored in the packet buffer memory while the 
> headers of the packet are sent to the I-chip for routing/firewall 
> lookups. After the result is returned, the egress interface actually 
> goes out and gathers up all the fragments of the packet necessary to 
> reassemble and transmit it.
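> 
> A very rough Python sketch of that buffer-then-lookup flow (the names 
> are invented, and real DPCs chop packets into cells; this glosses over 
> almost everything except the basic division of labour):
> 
>     # Illustrative only: the payload parks in packet-buffer memory while
>     # the header goes off for the route/firewall lookup; the egress side
>     # then pulls the buffered data back out and transmits it.
>     packet_buffer = {}                         # stands in for the DDR DRAM
> 
>     def lookup(header):
>         fib = {"203.0.113.0/24": "xe-1/0/0"}   # toy FIB
>         return fib.get(header["dst_prefix"])
> 
>     def ingress(pkt_id, header, payload):
>         packet_buffer[pkt_id] = payload        # store the packet body
>         return lookup(header)                  # header -> lookup result
> 
>     def egress(pkt_id, out_interface):
>         payload = packet_buffer.pop(pkt_id)    # gather the buffered data
>         print("tx", pkt_id, "on", out_interface, len(payload), "bytes")
> 
>     out = ingress(1, {"dst_prefix": "203.0.113.0/24"}, b"...payload...")
>     egress(1, out)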
> 
> So, your kernel pushes the selected FIB down to the DPC CPU, which in 
> turn programs all 4 PFEs on the DPC, and then each PFE has its own copy 
> of the routing table (in a highly optimized form that is directly 
> accessed by the ASICs) to make decisions from. Note that this is all 
> completely different from how the new MX (Trio/3D) cards work. :)
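> 
> And the distribution path in that last paragraph, sketched in Python 
> (again purely illustrative, with invented names):
> 
>     # The kernel resolves the best routes (the FIB), hands them to each
>     # DPC's CPU, and the CPU programs every PFE on that card with its
>     # own full copy.
>     kernel_fib = {"0.0.0.0/0": "xe-0/0/0", "10.0.0.0/8": "xe-0/1/0"}
> 
>     class Dpc:
>         def __init__(self, num_pfes=4):
>             self.pfe_fibs = [{} for _ in range(num_pfes)]
> 
>         def program(self, fib):
>             for pfe_fib in self.pfe_fibs:      # each PFE gets a full copy
>                 pfe_fib.clear()
>                 pfe_fib.update(fib)
> 
>     dpc0 = Dpc()
>     dpc0.program(kernel_fib)
>     print(all(f == kernel_fib for f in dpc0.pfe_fibs))   # True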
> 
> -- 
> Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
> GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)

