98. DPDK PMD for AF_XDP Tests
98.1. Description
AF_XDP is a proposed faster version of the AF_PACKET interface in Linux. This test plan analyses the performance of the DPDK PMD for AF_XDP.
98.2. Prerequisites
Hardware:
I40e 25G*2
enp216s0f0 <---> IXIA_port_0
enp216s0f1 <---> IXIA_port_1
The NIC is located on socket 1, so we use cores from socket 1.
Take a kernel >= v5.2-rc2, build it, and replace your host kernel with it. Update the compiler to a suitable version. Make sure you turn on XDP sockets when configuring the kernel:
Networking support --> Networking options --> [ * ] XDP sockets
Then compile the kernel:
make -j16
make modules_install install
Build libbpf in tools/lib/bpf:
cd tools/lib/bpf
make install_lib prefix=/usr
make install_headers prefix=/usr
Explicitly enable the AF_XDP PMD by adding the line below to config/common_linux:
CONFIG_RTE_LIBRTE_PMD_AF_XDP=y
Then build DPDK.
Set the DUT ports to have only one queue each:
ethtool -L enp216s0f0 combined 1
ethtool -L enp216s0f1 combined 1
98.3. Test case 1: single port
Start the testpmd:
./testpmd -l 29,30 -n 6 --no-pci --vdev net_af_xdp0,iface=enp216s0f0 \
  -- -i --nb-cores=1 --rxq=1 --txq=1 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34 enp216s0f0
Send packets from the packet generator with packet sizes from 64 bytes to 1518 bytes, and check the throughput.
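If a software generator is used in place of IXIA, a minimal Scapy sketch of the stream described above might look like the following; the interface name, MAC and IP addresses are placeholders, not part of this test plan.

# Hypothetical Scapy stream: one fixed-size flow per standard frame size.
from scapy.all import Ether, IP, UDP, Raw, sendp

IFACE = "tester_port0"                          # placeholder tester-side interface facing enp216s0f0
SIZES = [64, 128, 256, 512, 1024, 1280, 1518]   # frame sizes, taken to include the 4-byte CRC

for size in SIZES:
    hdr = Ether(dst="3c:fd:fe:00:00:01") / IP(src="192.168.0.2", dst="192.168.0.1") / UDP(sport=1024, dport=1024)
    pad = max(0, size - 4 - len(hdr))           # pad the payload so the wire frame reaches the target size
    pkt = hdr / Raw(b"\x00" * pad)
    sendp(pkt, iface=IFACE, count=1000000, verbose=False)
    # Record the forwarded rate for each frame size.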
98.4. Test case 2: two ports
Start the testpmd:
./x86_64-native-linuxapp-gcc/app/testpmd -l 29,30-31 --no-pci -n 6 \
  --vdev net_af_xdp0,iface=enp216s0f0 --vdev net_af_xdp1,iface=enp216s0f1 \
  -- -i --nb-cores=2 --rxq=1 --txq=1
Assign the kernel core:
./set_irq_affinity 33 enp216s0f0
./set_irq_affinity 34 enp216s0f1
Send packets from packet generator port0 with packet sizes from 64 bytes to 1518 bytes, and check the throughput at port1.
Send packets from packet generator port0 and port1 with packet sizes from 64 bytes to 1518 bytes, and check the throughput at port0 and port1.
98.5. Test case 3: zero copy
Start the testpmd:
./testpmd -l 29,30 -n 6 --no-pci \
  --vdev net_af_xdp0,iface=enp216s0f0,pmd_zero_copy=1 \
  -- -i --nb-cores=1 --rxq=1 --txq=1 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34 enp216s0f0
Send packets from the packet generator with packet sizes from 64 bytes to 1518 bytes, and check the throughput.
98.6. Test case 4: multiqueue
- One queue.
Start the testpmd with one queue:
./testpmd -l 29,30 -n 6 --no-pci \
  --vdev net_af_xdp0,iface=enp216s0f0,start_queue=0,queue_count=1 \
  -- -i --nb-cores=1 --rxq=1 --txq=1 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput.
- Four queues.
Set hardware queue:
ethtool -L enp216s0f0 combined 4
Start the testpmd with four queues:
./testpmd -l 29,30-33 -n 6 --no-pci \
  --vdev net_af_xdp0,iface=enp216s0f0,start_queue=0,queue_count=4 \
  -- -i --nb-cores=4 --rxq=4 --txq=4 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34-37 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput. The packets should be distributed across the four queues.
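A minimal Scapy sketch of the multi-flow traffic, assuming a software generator instead of IXIA; the interface, MAC and dst IP values below are illustrative placeholders. Varying the destination IP lets the NIC's RSS hash spread the packets across the four queues.

# Hypothetical four-flow stream: one dst IP per expected queue.
from scapy.all import Ether, IP, UDP, Raw, sendp

IFACE = "tester_port0"                      # placeholder tester-side interface facing enp216s0f0
flows = []
for i in range(4):
    hdr = (Ether(dst="3c:fd:fe:00:00:01")
           / IP(src="192.168.0.2", dst="192.168.1.%d" % (i + 1))   # different dst IP per flow
           / UDP(sport=1024, dport=1024))
    flows.append(hdr / Raw(b"\x00" * (64 - 4 - len(hdr))))         # 64-byte frames; repeat for the other sizes

# Send all four flows together, then verify in testpmd that every RX queue receives traffic.
sendp(flows, iface=IFACE, count=250000, verbose=False)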
98.7. Test case 5: multiqueue and zero copy
- One queue and zero copy.
Set hardware queue:
ethtool -L enp216s0f0 combined 1
Start the testpmd with one queue:
./testpmd -l 29,30 -n 6 --no-pci \
  --vdev net_af_xdp0,iface=enp216s0f0,start_queue=0,queue_count=1,pmd_zero_copy=1 \
  -- -i --nb-cores=1 --rxq=1 --txq=1 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput. Expect the performance to be better than non-zero-copy.
- Four queues and zero copy.
Set hardware queue:
ethtool -L enp216s0f0 combined 4
Start the testpmd with four queues:
./testpmd -l 29,30-33 -n 6 --no-pci \
  --vdev net_af_xdp0,iface=enp216s0f0,start_queue=0,queue_count=4,pmd_zero_copy=1 \
  -- -i --nb-cores=4 --rxq=4 --txq=4 --port-topology=loop
Assign the kernel core:
./set_irq_affinity 34-37 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput. The packets should be distributed across the four queues. Expect the performance of four queues to be better than one queue. Expect the performance to be better than non-zero-copy.
98.8. Test case 6: need_wakeup
Set hardware queue:
ethtool -L enp216s0f0 combined 1
Start the testpmd with one queue:
./testpmd -l 29,30 -n 6 --no-pci --vdev net_af_xdp0,iface=enp216s0f0 \
  -- -i --nb-cores=1 --rxq=1 --txq=1 --port-topology=loop
Assign the same core:
./set_irq_affinity 30 enp216s0f0
Send packets from the packet generator with packet sizes from 64 bytes to 1518 bytes, and check the throughput. Expect the performance to be better than without need_wakeup.
98.9. Test case 7: xdpsock sample performance
- One queue.
Set hardware queue:
ethtool -L enp216s0f0 combined 1
Start the xdp socket with one queue:
#taskset -c 30 ./xdpsock -l -i enp216s0f0
Assign the kernel core:
./set_irq_affinity 34 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput.
- Four queues.
Set hardware queue:
ethtool -L enp216s0f0 combined 4
Start the xdp socket with four queues:
#taskset -c 30 ./xdpsock -l -i enp216s0f0 -q 0
#taskset -c 31 ./xdpsock -l -i enp216s0f0 -q 1
#taskset -c 32 ./xdpsock -l -i enp216s0f0 -q 2
#taskset -c 33 ./xdpsock -l -i enp216s0f0 -q 3
Assign the kernel core:
./set_irq_affinity 34-37 enp216s0f0
Send packets with different dst IP addresses from the packet generator, with packet sizes from 64 bytes to 1518 bytes, and check the throughput. The packets should be distributed across the four queues. Expect the performance of four queues to be better than one queue.
- Need_wakeup.
Set hardware queue:
ethtool -L enp216s0f0 combined 1
Start the xdp socket with one queue:
#taskset -c 30 ./xdpsock -l -i enp216s0f0
Assign the kernel core:
./set_irq_affinity 30 enp216s0f0
Send packets from the packet generator with packet sizes from 64 bytes to 1518 bytes, and check the throughput. Expect the performance to be better than without need_wakeup.