[j-nsp] FPC<->SFM capacity
Richard A Steenbergen
ras at e-gerbil.net
Wed Feb 22 18:11:06 EST 2006
Question: If you have an M160 with less than 4 SFMs running, is there a
reduced capacity to any one individual FPC or is the capacity reduction
only across the capacity of the entire system.
I've asked this about a dozen times, of a dozen different people at
Juniper, and they have always said "the capacity reduction is only across
the entire system, if you lose 1 SFM you go down from 160Mpps to 120Mpps
and that is it, the DX chip takes care of everything else". I never really
believed that answer, since it didn't seem to jibe with how the switch
fabric should work, but I heard it enough that I stopped arguing it.
But, today I saw a box where a 3rd SFM had failed (if you have enough
M160s you'll see the SRAM/SDRAM on the SFMs go bad on an alarmingly
regular basis) and traffic off a single FPC2 was being bottlenecked. After
replacing an SFM and bringing a second one back online, traffic
immediately shot up, confirming the problem. There were 5 ports (two 2xGE
and one 1xOC48 PICs) on the affected FPC2, which had the following traffic
utilization post SFM restoration:
Port 1: 600M in / 250M out
Port 2: 50M in / 300M out
Port 3: 750M in / 280M out
Port 4: 100M in / 800M out
Port 5: 550M in / 250M out
----- -----
2050M in / 1880M out = 3930M in+out
The way that I would have figured this would work is, an FPC1 has a single
channel of 3.2Gbps to the switch fabric, and an FPC2 has 4x channels to 4
individual switch fabric modules. Thus while each SFM would have 40Mpps of
lookup capacity and 25.6Gbps (3.2Gbps * 8 slots from the original M40) of
switching capacity, each individual SFM<->FPC link would have a limit of
3.2Gbps, and an M160 running 1 active SFM would have as much switching
capacity to an FPC2 as it does to an FPC1 (hence explaining why 3 of the
slots on an M40e FPC2 are forcibly blocked).
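Put a different way, here's the back-of-the-envelope model I'm working
from (rough Python sketch; the 3.2Gbps-per-channel figure and the channel
counts are my assumptions, not anything Juniper has confirmed):

# Rough model of M160 capacity as I understand it. The 3.2Gbps per
# FPC<->SFM channel figure is an assumption, not a confirmed number.
CHANNEL_GBPS = 3.2   # assumed bandwidth of one FPC<->SFM channel
SFM_MPPS = 40        # lookup capacity per SFM
SLOTS = 8            # 3.2Gbps * 8 slots = 25.6Gbps switching per SFM

def capacity(active_sfms):
    return {
        "lookup_mpps": active_sfms * SFM_MPPS,              # 4 -> 160, 3 -> 120
        "fabric_gbps": active_sfms * CHANNEL_GBPS * SLOTS,  # 4 -> 102.4
        # An FPC2 has one channel per SFM, an FPC1 has one channel total,
        # so per-FPC bandwidth is what actually degrades as SFMs die:
        "fpc2_gbps": active_sfms * CHANNEL_GBPS,            # 1 SFM -> 3.2
        "fpc1_gbps": CHANNEL_GBPS,                          # always 3.2
    }

for n in (4, 3, 1):
    print(n, capacity(n))

With only one SFM alive, an FPC2 ends up with no more fabric bandwidth
than an FPC1, which is exactly what the box looked like.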
After examining the profile of the traffic that was being bottlenecked
while only one SFM was online, the only traffic that appeared to be really
limited (hard flatline, as opposed to just reduced in utilization because
of the reduction in traffic coming in to the system) was the INGRESS on
the GE ports (ports 1 and 3 in this case). During the bottleneck, these
ports were not accepting one bit past 500Mbps, with a corresponding
dropoff in outbound traffic that brought the in+out total to roughly
3.2Gbps. Port 4 was in the middle of a normal traffic slope down from
1000M to 600M during the bottlenecked period, so you can really see that
the only bottleneck was on the ingress of the GEs.
So, this would appear to confirm the theory about the 3.2Gbps per
FPC<->SFM channel limitation, but it doesn't answer every question. It was
my understanding that the 3.2Gbps channel was bidirectional (3.2Gbps each
way), so that the bottleneck should not have been hit with the above
traffic configuration. Is the reality just that since the packets have to
be sprayed across all the FPCs for buffering, realistically the
limitation WILL be in+out? Since there isn't any active way to measure the
utilization on these channels, it would be nice if someone could explain
exactly what the limitations are in a little more detail. It would be
nicer still if those answers were accurate. :)
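To make the question concrete, here's the snapshot above run through both
readings of the limit (same caveat: the 3.2Gbps figure is my guess):

# Per-port traffic (Mbps) from the table above, post SFM restoration.
ports_mbps = {1: (600, 250), 2: (50, 300), 3: (750, 280),
              4: (100, 800), 5: (550, 250)}

total_in = sum(i for i, _ in ports_mbps.values())   # 2050
total_out = sum(o for _, o in ports_mbps.values())  # 1880
limit = 3200

# Reading 1: 3.2Gbps each direction. Neither direction alone exceeds the
# limit, so with one SFM this load should NOT have hit a wall:
print(total_in > limit, total_out > limit)                  # False False

# Reading 2: 3.2Gbps shared across in+out (packets sprayed to the fabric
# for buffering). The aggregate blows right through it, which matches
# what I actually saw:
print(total_in + total_out, total_in + total_out > limit)   # 3930 True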
And just to add mystery to the whole thing, the show bchip output from an
SFM, for an FPC2 (pic 2 == 2GE, pic 3 == OC48):
Pic 2: 8 bit stream @ 125 MHz.
Stream 8: 155520 Kbits/sec, enabled.
Stream 9: 155520 Kbits/sec, enabled.
Stream 10: Not present, disabled.
Stream 11: Not present, disabled.
Pic 3: 8 bit stream @ 125 MHz.
Stream 12: 622080 Kbits/sec, enabled.
Stream 13: Not present, disabled.
Stream 14: Not present, disabled.
Stream 15: Not present, disabled.
And from an FPC1 (pic 1 == 1GE):
Pic 1: 8 bit stream @ 125 MHz.
Stream 4: 622080 Kbits/sec, enabled.
Stream 5: Not present, disabled.
Stream 6: Not present, disabled.
Stream 7: Not present, disabled.
Is the bandwidth limitation actually per PIC, not per FPC? How the heck do
those numbers jibe with the actual capacity of the PIC? Someone, please
shed some light on this. :)
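For what it's worth, summing the reported stream rates against the nominal
PIC speeds shows how far apart they are (the line-rate numbers below are
the usual nominal figures; the stream rates are copied from the output
above):

# show bchip stream rates vs. nominal PIC line rates, all in Kbits/sec.
streams_kbps = {
    "FPC2 PIC2 (2xGE)":   [155520, 155520],  # streams 8-9
    "FPC2 PIC3 (1xOC48)": [622080],          # stream 12
    "FPC1 PIC1 (1xGE)":   [622080],          # stream 4
}
line_rate_kbps = {
    "FPC2 PIC2 (2xGE)":   2 * 1000000,       # 2x GE
    "FPC2 PIC3 (1xOC48)": 2488320,           # OC-48
    "FPC1 PIC1 (1xGE)":   1000000,
}
for pic, rates in streams_kbps.items():
    print(pic, "bchip total:", sum(rates), "line rate:", line_rate_kbps[pic])

(155520 Kbits/sec is an OC-3 rate and 622080 Kbits/sec is an OC-12 rate,
so whatever these streams are measuring, it doesn't look like raw PIC
capacity.)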
--
Richard A Steenbergen <ras at e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)