[j-nsp] M160/JunOS 7.6R1.10: ae0 fails to install L2 descriptor
Josef Buchsteiner
josefb at juniper.net
Sat Dec 30 11:26:36 EST 2006
Friday, December 29, 2006, 7:53:55 PM, you wrote:
pb>
pb>
pb> Hello Josef,
pb>
pb> thank you very much for your detailed reply.
pb>
>>pb> I have a couple of questions:
>>pb> a) Is it normal to have 32k L2 Descriptors for 8.2k Next-Hop Entries?
>>
>> yes.. since this is ethernet and the layer2 header size is
>> big for ethernet and you most likely have all three links on
>> one FPC. i.e 3 times more resources.
pb>
pb> I had one PIC on fpc0 and 1 PIC on fpc1 (both non-e fpc1). I then added a
pb> second PIC to fpc1.
pb> I'll insert a spare fpc1 and retry with all links spread out.
pb>
>>pb> b) Is there a way to increase the number of available L2 Descriptors?
>>pb> (how many L2 Descriptors does a SFM-16 support?)
>>
>> this has nothing to do with the SFM.. Enhanced FPC will have
>> about 160K space... its all about memory
pb>
pb> Oh okay, thank you. Is there a linear correlation between L2 Descriptors and
pb> Next-Hop Entries?
If the L2 data size is the same, yes.
pb> Given that I move the third (and maybe someday fourth) link to separate
pb> non-e fpcs, and the usage on other interfaces stays about the same, can I
pb> calculate that we'll be able to support about
pb> 50k / (32k / 8.2k) = 12.8k Next-Hops over that ae interface before we
pb> have to move the individual ae-links over to fpc-es?
pb>
You need a next-hop on ethernet whenever you want to
deliver to a destination. If the L2 portion is different, a
different next-hop is created. We also consume a next-hop
called a resolve next-hop for each ethernet segment; however,
this does not consume L2 Descriptor space. So if you have
3000 vlans you would also have 3000 resolve next-hops.
Since Ethernet is point to multipoint, a next-hop is created
  o for every arp entry (a unicast next-hop)
  o for each multicast group, since the ethernet destination
    address is mapped to the mcast group
  o for each mpls label, since Junos also treats an mpls
    label as a next-hop and generates a new one whenever
    the label differs
For ethernet you need 3 chunks of L2 Descriptor space and for
vlan you need 4 chunks, all counted in words. The reason we
need 4 for vlans is simply that the L2 data portion is bigger
due to the vlan header. If you also have mpls labels you
need to calculate with 5 chunks.
This is how you can easily do the math.
An FPC has 52252 L2 Descriptors; /4 is 13063 vlan arp entries,
aka next-hops, per FPC. For non-vlan it would be 16750 entries.
An E-FPC has 162891 L2 Descriptors; /4 is 40722 vlan arp entries.
An Enhanced Plus FPC or M10i/M7i has 362571 L2 Descriptors.
There is an upper boundary of 61183 next-hops per FPC.
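
As a rough sketch of the arithmetic above (not Juniper tooling): the
descriptor counts, the 4-word vlan cost and the 61183 cap are the
figures quoted in this mail, while the function and constant names are
made up for illustration. Only the vlan case is shown, since that is
the one the figures above reproduce by plain integer division.

```python
# L2 Descriptor space per FPC type, in words, as quoted in this mail.
L2_DESCRIPTORS = {
    "FPC": 52252,
    "E-FPC": 162891,
    "Enhanced Plus FPC / M10i / M7i": 362571,
}

CHUNKS_VLAN = 4                 # words per vlan next-hop (from the mail)
MAX_NEXT_HOPS_PER_FPC = 61183   # upper boundary quoted in the mail


def max_next_hops(l2_descriptors, chunks_per_next_hop):
    """Next-hops that fit into the descriptor space, capped per FPC."""
    by_space = l2_descriptors // chunks_per_next_hop
    return min(by_space, MAX_NEXT_HOPS_PER_FPC)


if __name__ == "__main__":
    for name, words in L2_DESCRIPTORS.items():
        print(name, max_next_hops(words, CHUNKS_VLAN))
```

For the largest descriptor pool the per-FPC cap, not the descriptor
space, becomes the limit in this sketch.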
>>pb> c) Is there a way to make the router fail with less impact to the
>>pb> network (for example simply shutting down the new interface
>>pb> automatically instead of refusing to update the next-hop table until
>>pb> the interface is taken down and all sfms are restarted manually)
>>
>> there would have been never a need to restart any SFM. all
>> you would have need to do is to deactivate the aggregate
>> interface and enable it again without the third member link.
pb>
pb> I tried removing the one new link but I still got
pb> Dec 29 03:58:12 ham-cr2-re1 /kernel: ae_link_op: link ge-1/3/0.2 (lidx=2) detached from bundle ae0.2
pb> Dec 29 03:59:36 ham-cr2-re1 /kernel: RT_PFE: NH IPC op 31 (CHANGE AGGREGATE NEXTHOP) failed, err 5 (Invalid)
pb>
pb> in the logfiles. This is when I decided to restart the SFMs.
The upper layer still thinks the next-hop is installed and
asks the PFE to remove this entry. The PFE, however,
complains that it does not have such an entry, so the
removal also results in an error.
pb> Taking the entire aggregate interface down amounts to more impact to the
pb> network imho (at least in our setup). With SFMs restarting I've got a
pb> couple seconds of little packetloss, while a deactivated ae0 would mean
pb> x bouncing BGP sessions and traffic stopping completely for a short
pb> amount of time.
That's ok... it is not always straightforward what the best
thing to do is, given the dynamics of the network.
pb>
>> In fact you know only that you run out of resource once you
>> try to program it in hardware and the only way is to refuse
>> it. There is no real good way to know upfront. I believe the
>> RSMON feature is able to monitor such resources and will
>> send an alarm if configured once you reached a certain
>> threshold so you know that you are moving to the limits.
pb>
pb> That makes sense.
pb> I had actually looked at the nhdb stats beforehand but didn't consider 20k
pb> free entries to be of concern.
Aggregated links can consume quite a lot of resources, as you
have noticed, but you can now do the math on how safe it is
to add another link to the bundle.
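
One hedged way to do that math, assuming (as a simplification of the
"3 times more resources" remark earlier in the thread) that every
member link on the same FPC costs one more copy of each next-hop's L2
data. The function names and the example figure of 8200 next-hops are
illustrative only, and the descriptor space already used by other
interfaces is passed in as an estimate rather than read from the router.

```python
def descriptors_needed(next_hops, chunks_per_next_hop, links_on_fpc):
    """Assumed model: cost grows linearly with member links on one FPC."""
    return next_hops * chunks_per_next_hop * links_on_fpc


def safe_to_add_link(next_hops, chunks_per_next_hop, links_on_fpc,
                     total_descriptors, used_elsewhere=0):
    """True if one more member link on this FPC would still fit."""
    needed = descriptors_needed(next_hops, chunks_per_next_hop,
                                links_on_fpc + 1)
    return needed + used_elsewhere <= total_descriptors


if __name__ == "__main__":
    # 8200 vlan next-hops (4 words each), one member link already on the FPC.
    print(safe_to_add_link(8200, 4, 1, 52252))    # FPC figure from this mail
    print(safe_to_add_link(8200, 4, 1, 162891))   # E-FPC figure from this mail
```

Under this assumed model a second member link on an FPC doubles the
descriptor demand, which is why spreading member links over separate
FPCs helps.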
thanks
Josef
pb>
pb> Best Regards, Peter