[j-nsp] LAG/ECMP hash performance
Thomas Bellman
bellman at nsc.liu.se
Thu Aug 29 15:16:28 EDT 2019
On 2019-08-29 17:31 +0200, Robert Raszuk wrote:
> You are very correct. I was very highly surprised to read Saku mentioning
> use of CRC for hashing but then quick google revealed this link:
>
> https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/hash-parameters-edit-forwarding-options.html
>
> Looks like ECMP and LAG hashing may seriously spread your flows as clearly
> CRC includes payload and payload is likely to be different with every
> packet.
On what basis do you figure CRC "clearly" includes payload? I see
no indication on that page, or a few other pages close by, that
anything but select layer 2 or layer 3/4 headers are used in the
hashes for LAG and ECMP.
Are you perhaps misled by the 'forwarding-options enhanced-hash-key
hash-mode layer2-payload' setting? My understanding is that it
means select L3 and/or L4 headers, as opposed to select L2 headers,
are used as input to the CRC function. A better name for that
setting would probably be 'layer2/3-headers'.
https://www.juniper.net/documentation/en_US/junos/topics/reference/configuration-statement/hash-mode-edit-forwarding-options-ex-series.html
says:
If the hash mode is set to layer2-payload, you can set the fields
used by the hashing algorithm to hash IPv4 traffic using the set
forwarding-options enhanced-hash-key inet statement. You can set
the fields used by the hashing algorithm to hash IPv6 traffic using
the set forwarding-options enhanced-hash-key inet6 statement.
The fields you can select/deselect are:
- Source IPv4/IPv6 address
- Destination IPv4/IPv6 address
- Source L4 port
- Destination L4 port
- IPv4 protocol / IPv6 NextHdr
- VLAN-id (on EX and QFX 5k)
- Incoming port (on QFX 10k)
- IPv6 flow label (on QFX 10k)
- GPRS Tunneling Protocol endpoint id
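To make the field selection concrete, here is a minimal sketch in
Python of hashing only selected header fields and mapping the result
to a LAG/ECMP member. This is not Juniper's implementation: the field
names, the byte encoding of the key, the use of zlib.crc32 and the
modulo member selection are all my assumptions for illustration.

```python
import struct
import zlib
from ipaddress import ip_address

# Hypothetical encoders: how each selectable field is serialised into the key.
ENCODERS = {
    "src_ip": lambda v: ip_address(v).packed,
    "dst_ip": lambda v: ip_address(v).packed,
    "protocol": lambda v: struct.pack("!B", v),
    "src_port": lambda v: struct.pack("!H", v),
    "dst_port": lambda v: struct.pack("!H", v),
}

ALL_FIELDS = ("src_ip", "dst_ip", "protocol", "src_port", "dst_port")


def flow_hash(pkt, fields=ALL_FIELDS):
    """CRC-32 over the selected header fields only; the payload is never read."""
    key = b"".join(ENCODERS[name](pkt[name]) for name in fields)
    return zlib.crc32(key)


def pick_member(pkt, n_links, fields=ALL_FIELDS):
    """Map the hash to one of n_links members (assumed: simple modulo)."""
    return flow_hash(pkt, fields) % n_links


# Two packets of the same flow with different payloads, and one from another flow.
a = {"src_ip": "192.0.2.1", "dst_ip": "198.51.100.7", "protocol": 6,
     "src_port": 40000, "dst_port": 443, "payload": b"first segment"}
b = dict(a, payload=b"second segment")
c = dict(a, src_port=40001)

print(pick_member(a, 4) == pick_member(b, 4))  # same flow, same member
```

Deselecting a field simply drops it from the key, so packets that
differ only in that field end up with the same hash.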
> Good that this is only for QFX though :-)
The 'hash-parameter' settings are not even valid on all QFXes. At
least Trident II (QFX 51x0) uses a Broadcom-proprietary hash called
RTAG7. I'm guessing that CRC16 or CRC32 is used for LAG/ECMP hashing
only on QFX 10k, not on any of the Trident- or Tomahawk-based
routers/switches.
> For MX I recall that the hash is not computed with entire packet. The
> specific packet's fields are taken as input (per configuration) and CRC
> functions are used to mangle them - which is very different from saying
> that packet's CRC is used as input.
I don't think anyone has said that any product uses the Ethernet
packet's CRC for LAG/ECMP hashing, just that they might reuse
the CRC circuitry in the NPU/ASIC for calculating this hash, but
based on different inputs.
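The distinction can be sketched in a few lines of Python. Here
zlib.crc32 stands in for the shared CRC block, and the 13-byte header
layout is made up for the example; neither reflects any real ASIC.

```python
import zlib

KEY_LEN = 13  # assumed: src IP (4) + dst IP (4) + protocol (1) + ports (2 + 2)


def crc_unit(data: bytes) -> int:
    """Stand-in for a CRC-32 block that is reused for several purposes."""
    return zlib.crc32(data)


# Hypothetical packed header fields followed by two different payloads.
headers = bytes.fromhex("c0000201" "c6336407" "06" "9c40" "01bb")
frame_a = headers + b"payload-A"
frame_b = headers + b"payload-B"

# Checksum over the whole frame: changes whenever the payload changes.
fcs_a = crc_unit(frame_a)
fcs_b = crc_unit(frame_b)

# Load-balancing hash: same function, but fed only the selected header bytes.
hash_a = crc_unit(frame_a[:KEY_LEN])
hash_b = crc_unit(frame_b[:KEY_LEN])

print(fcs_a == fcs_b, hash_a == hash_b)
```

Both packets get the same load-balancing hash even though their
whole-frame checksums differ, so the flow stays on one member link.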
--
Thomas Bellman, National Supercomputer Centre, Linköping Univ., Sweden
"We don't understand the software, and sometimes we don't understand
the hardware, but we can *see* the blinking lights!"