[c-nsp] IPerf alternative

Mon Aug 7 17:34:15 EDT 2017

On 7 August 2017 at 11:40, Raymond Burkholder <ray at oneunified.net> wrote:
>
>  on some platforms, like linux, you need to check ‘ethtool -S’ to see if the  operating system is dropping packets (on tx or rx).  which may require some performance tuning of the network interfaces.

Yeah, ethtool -C is important too, to set the RX IRQ (NET_RX) rate as low as you can.
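
For checking the drop counters mentioned above, here is a rough Python
sketch that parses `ethtool -S` output and keeps only the non-zero
drop-style counters. Counter names differ per driver, so the DROP_HINTS
list and the helper names are my own assumptions, not a standard.

```python
import re
import subprocess

# Substrings that often mark drop-style counters. Names vary per NIC
# driver, so this list is an assumption and certainly not exhaustive.
DROP_HINTS = ("drop", "miss", "discard", "fifo", "no_buffer")


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {counter_name: int} dict."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     rx_dropped: 7"; the header line
        # "NIC statistics:" has no number and is skipped.
        m = re.match(r"^\s*(\S.*?):\s*(-?\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats


def nonzero_drops(stats):
    """Return only the counters that look drop-related and are non-zero."""
    return {
        name: value
        for name, value in stats.items()
        if value and any(hint in name.lower() for hint in DROP_HINTS)
    }


def read_stats(ifname):
    """Run `ethtool -S <ifname>` and parse it (needs ethtool installed)."""
    out = subprocess.run(
        ["ethtool", "-S", ifname],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_ethtool_stats(out)
```

Calling nonzero_drops(read_stats("eth0")) before and after a test run
(with "eth0" swapped for your interface) and comparing the two results
shows whether the host itself dropped anything during the test.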

Without using one of the third party libraries like Netmap, DPDK or
VPP (or similar) to implement kernel bypass techniques, or a tool that
uses them, you have to make lots of “tweaks” to get even a fraction of
that bandwidth or pps rate. EtherateMT uses Tx and Rx ring buffers
(PACKET_MMAP, i.e. the PACKET_TX_RING/PACKET_RX_RING socket options)
with AF_PACKET to dump the ring with a single syscall and a single
context switch, it forcefully increases the OS socket send/receive
buffer size, it uses PACKET_QDISC_BYPASS to bypass the Linux queuing
discipline sub-system (basically skipping any QoS configuration), it
ignores dropped packets using PACKET_LOSS, and it can use FANOUT groups
to spray traffic over all the Tx/Rx queues in the NIC. One can also
use isolcpus and nohz_full. I have some notes on host tuning I can
share if anyone is interested, I’d just need to dig them out. However,
even with all of those, DPDK et al. are still much faster.
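
As an illustration of just the socket buffer part, here is a minimal
Python sketch. It is not EtherateMT's code: it uses a plain UDP socket
so that it runs without root, and request_buffers() is a name I made up.
Linux doubles the requested value for bookkeeping and caps it at
net.core.wmem_max / net.core.rmem_max, so the function reads back what
was actually granted. Getting past those caps needs the privileged
SO_SNDBUFFORCE/SO_RCVBUFFORCE options (CAP_NET_ADMIN) or larger sysctl
limits.

```python
import socket


def request_buffers(sock, nbytes):
    """Ask for nbytes of send and receive buffer; return what was granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    # Read the values back: the kernel may have capped or adjusted them.
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        snd, rcv = request_buffers(s, 4 * 1024 * 1024)
        print("send buffer:", snd, "receive buffer:", rcv)
```

If the granted sizes come back much smaller than requested, the sysctl
caps are the limiting factor rather than the application.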

> also, on a linux platform, the kernel guys use some trace tools, one of which will create one buffer, and copy it to the network interface, making a very effective high bandwidth tester, with some purporting to fill a 10g link.  I don’t have the name off the top of my head.

You might be thinking of pktgen (the kernel module, not the DPDK
based app!), which I believe can do 10Gbps using 64 byte packets. I
think (could be wrong here) that over the years it morphed into trafgen
in the netsniff-ng package: http://netsniff-ng.org/
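
To put "10Gbps using 64 byte packets" into a packet rate, here is the
usual back-of-envelope calculation as a small Python sketch. It assumes
standard Ethernet framing, where every frame also costs 20 bytes on the
wire (7 byte preamble, 1 byte start-of-frame delimiter, 12 byte
inter-frame gap). The function name is just for illustration.

```python
# Per-frame wire overhead in bytes: 7 preamble + 1 SFD + 12 inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def line_rate_pps(link_bps, frame_bytes):
    """Maximum frames per second for a given link speed and frame size."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_on_wire


if __name__ == "__main__":
    ten_gig = 10_000_000_000
    print(round(line_rate_pps(ten_gig, 64)))    # 14880952
    print(round(line_rate_pps(ten_gig, 1518)))  # 812744
```

So filling a 10G link with minimum size frames means roughly 14.88Mpps,
about 18 times the packet rate needed with 1518 byte frames.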

By loading it into the kernel there is arguably one less copy from
the user land process into kernel memory (as is the case with sendto()
for example; https://linux.die.net/man/2/sendto), but by using ring
buffers one syscall can be used to send or receive many packets
between the user land process and sk_buffs in kernel memory, and on
into DMA space. DPDK uses similar ideas but it has something called
the EAL (environment abstraction layer), which can provide RSS with
minimal effort from the user, and it can DMA directly from its ring
buffer, removing another copy-per-packet compared to Linux’s AF_PACKET
module (as well as loads of other cool shit).

VPP, which builds on DPDK, recently passed the 1Tbps mark (10x100Gbps
interfaces with like 1M routes in the FIB) using the new Intel Skylake
CPU. They have achieved a per-packet budget that was stupidly low,
like 200 instructions per packet.
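
For a feel of how tight a per-packet budget is, here is a rough Python
sketch of the time and CPU cycles available per packet when one core
has to keep one link at line rate. The 3GHz clock is purely an assumed
example value, cycles are not the same thing as instructions, and this
is generic arithmetic, not VPP's measured figures.

```python
# Per-frame wire overhead in bytes: 7 preamble + 1 SFD + 12 inter-frame gap.
WIRE_OVERHEAD_BYTES = 20


def per_packet_budget(link_bps, frame_bytes, cpu_hz):
    """Return (nanoseconds, CPU cycles) available per packet at line rate."""
    bits_on_wire = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    ns_per_packet = bits_on_wire / link_bps * 1e9
    cycles_per_packet = cpu_hz * bits_on_wire / link_bps
    return ns_per_packet, cycles_per_packet


if __name__ == "__main__":
    # One 10G port, 64 byte frames, one core at an assumed 3GHz.
    ns, cycles = per_packet_budget(10_000_000_000, 64, 3_000_000_000)
    print(f"{ns:.1f} ns, {cycles:.1f} cycles per packet")
```

With those assumed inputs the budget is about 67ns, or roughly 200
cycles per packet, which is why a single cache miss or syscall per
packet is enough to fall off line rate.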

> this being a cisco list, some cisco platforms have built in ttcp performance testers.

I always forget about that but I've never had a particularly great
experience with it. It's there on some ISR models, I also used it on
the ME3x00 switches once, but the throughput was like 20Mbps and I
found it quite flaky.

I think I'm hijacking this thread a bit with my own rants.
Sorry about that,
James.