[f-nsp] CES FIB capacity revisited
Tamihiro Yuzawa
tamil at tox.mine.nu
Wed Aug 21 05:25:32 EDT 2013
A couple of things I should have mentioned from the beginning...
- the routers in question are running version V5.4.0cT183,
- and we've got a case opened via a local VAR, but I'm afraid that at best they might not be as responsive as they should be, and at worst they might not come up with any solution...
Expert comments will be appreciated immensely. -tami
tamil> Hello experts, especially Mr. Greg Hankins!
tamil>
tamil> I'm a datacenter guy in Japan who has just joined the list, so please correct me if anything I say is even slightly inappropriate here.
tamil>
tamil> I'm following this informative thread:
tamil> https://puck.nether.net/pipermail/foundry-nsp/2012-October/003719.html
tamil>
tamil> because there is a particular VRRP pair of CES2048CX boxes on our site that have issues with making FIB entries.
tamil>
tamil> > Hi Rob, the CER routers use SRAM for the IPv4/v6 FIB instead of CAM, so
tamil> > for the CES/CER/CER-RT platforms the scalability is dynamic for all IP
tamil> > routes in the FIB (vs the MLX CAM architecture where we have to choose
tamil> > fixed partition sizes which can support a maximum number).
tamil>
tamil> Thank you Greg for the information.
tamil> We had a Brocade Japan field engineer visiting our office yesterday, and he didn't realize CES would use SRAM for FIB instead of CAM.
tamil>
tamil> > Scalability depends on how the memory is used by IPv4 and IPv6 routes.
tamil> > There is no easy formula to calculate the utilization, so we have tested
tamil> > certain combinations which are officially supported maximums. While a
tamil> > different number of routes might work, it will not be a supported or
tamil> > tested combination.
tamil>
tamil> Very understandable.
tamil>
tamil> > "Internet route mix" is used to indicate that we are using a mix of prefix
tamil> > lengths. No compression is being used.
tamil>
tamil> Understood.
tamil>
tamil> And here is our situation.
tamil> Since the 3rd of August, we have started seeing a "Warning: IPv4 Network Route ADD: CAM entry creation FAILED" message every time we try to add a static IPv4 route, and a "show ip route detail" command for the prefix returns the following:
tamil>
tamil> > U_flags Entry_flags Age Cam:Index HW_Path_count
tamil> > 0000e000 0 INVALID CAM 1
tamil> >
tamil> > CAM Entry Flag: 00000000H
tamil> > No cam index
tamil>
tamil> As such, the routers fail to forward packets destined to this prefix towards the correct next hop (NH). They just send the packets to the default NH.
tamil> (While it seems misleading they keep saying CAM when it's actually RAM, I don't mean to argue with that now :P)
tamil>
tamil> However, the number of routes they have is only <1200 (IPv4) and <500 (IPv6), which is far below the "officially supported maximum" that Greg kindly shared in his previous email.
tamil>
tamil> Here is a routing protocol breakdown of IPv4 routes in case it matters:
tamil> D: 21
tamil> O: 860
tamil> O2: 74
tamil> S: 213
tamil>
tamil> I should also mention that they are mostly /27 or longer prefix lengths.
tamil>
tamil> prefix length: route count
tamil> ==========================
tamil> 0: 2
tamil> 16: 2
tamil> 17: 2
tamil> 18: 2
tamil> 19: 0
tamil> 20: 8
tamil> 21: 2
tamil> 22: 6
tamil> 23: 80
tamil> 24: 22
tamil> 25: 14
tamil> 26: 45
tamil> 27: 134
tamil> 28: 521
tamil> 29: 85
tamil> 30: 95
tamil> 31: 24
tamil> 32: 124
tamil>
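As a quick sanity check, the prefix-length table quoted above can be tallied against the per-protocol breakdown. This is only a small sketch; the variable names are mine, and the numbers are copied straight from the two listings.

```python
# Route counts per prefix length, copied from the quoted table.
routes_by_len = {
    0: 2, 16: 2, 17: 2, 18: 2, 19: 0, 20: 8, 21: 2, 22: 6, 23: 80,
    24: 22, 25: 14, 26: 45, 27: 134, 28: 521, 29: 85, 30: 95, 31: 24, 32: 124,
}

# Route counts per protocol code, copied from the quoted breakdown.
routes_by_proto = {"D": 21, "O": 860, "O2": 74, "S": 213}

total = sum(routes_by_len.values())
long_prefixes = sum(n for plen, n in routes_by_len.items() if plen >= 27)

print(f"total IPv4 routes (by length):   {total}")
print(f"total IPv4 routes (by protocol): {sum(routes_by_proto.values())}")
print(f"/27 or longer: {long_prefixes} ({long_prefixes / total:.1%})")
```

Both listings add up to the same total, which stays under the 1200 figure mentioned earlier, and the /27-or-longer routes make up the large majority.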
tamil> What's more, except for the two default (/0) routes and a few others, they are all subnets of one of two /21 prefixes.
tamil> So I would say our routes are far from an evenly distributed mix of prefixes, which is presumably unlike the "Internet route mix" that Greg used for testing.
tamil>
tamil> I assume that if the FIB in SRAM is stored in a tree structure, the number of routes it can hold depends largely on the mix of prefixes injected and on how the structure is designed to store them.
tamil>
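To illustrate the assumption above, here is a minimal sketch that counts the nodes a plain one-bit-per-level binary trie would need for two route sets of the same size. Everything in it is hypothetical: the unibit trie is a textbook structure chosen for illustration, not the actual CES FIB layout (which I don't know), and the 10.x prefixes and the `trie_node_count` helper are made up for the example.

```python
import ipaddress


def trie_node_count(prefixes):
    """Count nodes of a unibit binary trie holding the given IPv4 prefixes.

    Each node is identified by the bit string leading to it from the root,
    so shared leading bits are counted only once.
    """
    nodes = set()
    for prefix in prefixes:
        net = ipaddress.ip_network(prefix)
        bits = format(int(net.network_address), "032b")[: net.prefixlen]
        for depth in range(len(bits) + 1):  # depth 0 is the root
            nodes.add(bits[:depth])
    return len(nodes)


# 128 /28 routes that are all subnets of a single (hypothetical) /21.
clustered = [str(n) for n in
             ipaddress.ip_network("10.0.0.0/21").subnets(new_prefix=28)]

# 128 /28 routes spread over 128 different (hypothetical) /16s.
scattered = [f"10.{k}.0.0/28" for k in range(128)]

print("clustered:", len(clustered), "routes,", trie_node_count(clustered), "nodes")
print("scattered:", len(scattered), "routes,", trie_node_count(scattered), "nodes")
```

The same number of routes costs a very different number of nodes depending on how the prefixes are laid out, which is the only point of the sketch. In this naive trie the clustered set is actually the cheaper one, so if clustering is what hurts us, the limit would have to come from something else in the real design, such as fixed partitions or per-leaf constraints.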
tamil> And my guess is that in our case, the aforementioned unevenness might be hitting some scale limitations on FIB in CES, be it SRAM partitions, leaf split constraints, or something beyond my imagination.
tamil>
tamil> I'd appreciate insights from experts, hints, or suggestions for further investigation, because this is becoming a serious issue: it's happening on our production PE routers. Thank you, -tami