44. NFP poll mode driver library
Netronome and Corigine’s sixth generation of flow processors pack 216 programmable cores and over 100 hardware accelerators that uniquely combine packet, flow, security and content processing in a single device that scales up to 400-Gb/s.
This document explains how to use DPDK with the Network Flow Processor (NFP) Poll Mode Driver (PMD) supporting Netronome and Corigine’s NFP-6xxx, NFP-4xxx and NFP-38xx product lines.
NFP is a SR-IOV capable device and the PMD supports the physical function (PF) and the virtual functions (VFs).
44.1. Dependencies
Before using the NFP DPDK PMD some NFP configuration, which is not related to DPDK, is required. The system requires installation of the nfp-bsp (Board Support Package) along with a specific NFP firmware application. The NSP ABI version should be 0.20 or higher.
If you have a NFP device you should already have the documentation to perform this configuration. Contact support@netronome.com (for Netronome products) or smartnic-support@corigine.com (for Corigine products) to obtain the latest available firmware.
The NFP Linux netdev kernel driver for VFs has been a part of the vanilla kernel since kernel version 4.5, and support for the PF since kernel version 4.11. Support for older kernels can be obtained on Github at https://github.com/Netronome/nfp-drv-kmods along with the build instructions.
NFP PMD needs to be used along with UIO igb_uio
or VFIO (vfio-pci
)
Linux kernel driver.
44.2. Building the software
The NFP PMD code is provided in the drivers/net/nfp directory. Although NFP PMD has BSP dependencies, it is possible to compile it along with other DPDK PMDs even if no BSP was installed previously. Of course, a DPDK app will require such a BSP installed for using the NFP PMD, along with a specific NFP firmware application.
Once the DPDK is built all the DPDK apps and examples include support for the NFP PMD.
44.3. Driver compilation and testing
Refer to the document compiling and testing a PMD for a NIC for details.
44.4. Using the PF
The PMD PF has extra work to do which will delay the DPDK app initialization like uploading the firmware and configure the Link state properly when starting or stopping a PF port. Since DPDK 18.05 the firmware upload happens when a PF is initialized, which was not always true with older DPDK versions.
Depending on the product installed in the system, firmware files should be
available under /lib/firmware/netronome
. DPDK PMD supporting the PF looks
for a firmware file in this order:
First try to find a firmware image specific for this device using the NFP serial number:
serial-00-15-4d-12-20-65-10-ff.nffw
Then try the PCI name:
pci-0000:04:00.0.nffw
Finally try the card type and media:
nic_AMDA0099-0001_2x25.nffw
Netronome and Corigine’s software packages install firmware files under
/lib/firmware/netronome
to support all the Netronome and Corigine SmartNICs
and different firmware applications. This is usually done using file names
based on SmartNIC type and media and with a directory per firmware application.
Options 1 and 2 for firmware filenames allow more than one SmartNIC, same type
of SmartNIC or different ones, and to upload a different firmware to each
SmartNIC.
Note
Currently the NFP PMD supports using the PF with Agilio Firmware with NFD3 and Agilio Firmware with NFDk. See Netronome Support. for more information on the various firmwares supported by the Netronome Agilio SmartNIC range, or Corigine Support. for more information about Corigine’s range.
44.5. PF multiport support
The NFP PMD can work with up to 8 ports on the same PF device. The number of available ports is firmware and hardware dependent, and the driver looks for a firmware symbol during initialization to know how many can be used.
DPDK apps work with ports, and a port is usually a PF or a VF PCI device. However, with the NFP PF multiport there is just one PF PCI device. Supporting this particular configuration requires the PMD to create ports in a special way, although once they are created, DPDK apps should be able to use them as normal PCI ports.
NFP ports belonging to the same PF can be seen inside PMD initialization with a suffix added to the PCI ID: wwww:xx:yy.z_portn. For example, a PF with PCI ID 0000:03:00.0 and four ports is seen by the PMD code as:
0000:03:00.0_port0 0000:03:00.0_port1 0000:03:00.0_port2 0000:03:00.0_port3Note
There are some limitations with multiport support: RX interrupts and device hot-plugging are not supported.
44.6. PF multiprocess support
The NFP PMD supports the PF multiprocess. Having basic multiprocess support is important for allowing development and debugging through the PF using a secondary process, which will create a CPP bridge for user space tools accessing the NFP.
44.7. System configuration
Enable SR-IOV on the NFP device: The current NFP PMD supports the PF and the VFs on a NFP device. However, it is not possible to work with both at the same time when using the
nfp
Linux netdev kernel driver. If the PF is bound to thenfp
kernel module, and VFs are created, the VFs may be bound to thevfio-pci
kernel module. It is also possible to bind the PF to thevfio-pci
kernel module, and create VFs afterwards. This requires loading thevfio-pci
module with the following parameters:modprobe vfio-pci enable_sriov=1 disable_idle_d3=1
VFs need to be enabled before they can be used with the PMD. Before enabling the VFs it is useful to obtain information about the current NFP PCI device detected by the system. This can be done on Netronome SmartNICs using:
lspci -d 19ee:
and on Corigine SmartNICs using:
lspci -d 1da8:
Now, for example, to configure two virtual functions on a NFP device whose PCI system identity is “0000:03:00.0”:
echo 2 > /sys/bus/pci/devices/0000:03:00.0/sriov_numvfs
The result of this command may be shown using lspci again on Netronome SmartNICs:
lspci -kd 19ee:
and on Corigine SmartNICs:
lspci -kd 1da8:
Two new PCI devices should appear in the output of the above command. The -k option shows the device driver, if any, that the devices are bound to. Depending on the modules loaded, at this point the new PCI devices may be bound to the
nfp
kernel driver orvfio-pci
.
44.8. Flow offload
Using the flower firmware application, some types of Netronome or Corigine SmartNICs can offload the flows onto the cards.
The flower firmware application requires the PMD running two services:
- PF vNIC service: handling the feedback traffic.
- ctrl vNIC service: communicate between PMD and firmware through control messages.
To achieve the offload of flow, the representor ports are exposed to OVS. The flower firmware application supports representor port for VF and physical port. There will always exist a representor port for each physical port, and the number of the representor port for VF is specified by the user through a parameter.
In the Rx direction, the flower firmware application will prepend the input port information into metadata for each packet which can’t offloaded. The PF vNIC service will keep polling packets from the firmware, and multiplex them to the corresponding representor port.
In the Tx direction, the representor port will prepend the output port information into metadata for each packet, and then send it to the firmware through the PF vNIC.
The ctrl vNIC service handles various control messages, for example, the creation and configuration of representor port, the pattern and action of flow rules, the statistics of flow rules, etc.
44.9. NFP devargs
force_reload_fw
(default 0)The NFP PF PMD supports force reload the firmware and ignore the firmware version. For example, user can force a PF with PCI ID 0000:af:00.0 reload firmware by:
-a af:00.0,force_reload_fw=1 -- -i
Note
This parameter can be ignored in some case.
For example: when using a 2-port NFP card and both with this reload firmware option, only the first one will cause the firmware reload and the second one will be ignored.
44.10. Metadata Format
The NFP packet metadata format
44.10.1. NFD3
The packet metadata starts with a field type header that can contain up-to 8 4-bit datatype specifiers (32-bits in total). This is followed by up to 8 32-bit words of data for each field described in the header. And directly following the metadata (header and data) comes the packet.
The order of type is correspond with the data, but the nums of data field are decided by the corresponding type, if the type need N data field, it need to be wrote N times in the heads.
3 2 1 0
2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type7 | Type6 | Type5 | Type4 | Type3 | Type2 | Type1 | Type0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 3 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 5 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 7 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet Data |
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
44.10.2. NFDk
The packet metadata starts with a field type header that can contain 8 bit metadata length and 6 4-bit datatype specifiers (32-bits in total). This is followed by up to 6 32-bit words of data for each field described in the header. And directly following the metadata (header and data) comes the packet.
The order of type is correspond with the data, but the nums of data field are decided by the corresponding type, if the type need N data field, it need to be wrote N times in the heads. It is the same with NFD3.
3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Type5 | Type4 | Type3 | Type2 | Type1 | Type0 |metadata length|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 0 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 1 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 2 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 3 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 4 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 5 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data for field 7 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Packet Data |
| ... |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
There are two classes of metadata one for ingress and one for egress. In each class the supported NFP types are:
44.10.3. RX
NFP_NET_META_HASH The hash type is 4 bit which is next field type after NFP_NET_META_HASH in the header. The hash value is 32 bit which need 1 data field.
-----------------------------------------------------------------
3 2 1 0
2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Hash value |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
NFP_NET_META_VLAN Metadata with L2 (1W/4B)
----------------------------------------------------------------
3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|f| reserved | tpid| PCP |p| vlan outermost VID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
^ ^
NOTE: | TCI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
f 0 -> not stripping
1 -> stripping
tpid 0 -> RTE_ETHER_TYPE_VLAN 0x8100 IEEE 802.1Q VLAN tagging
1 -> RTE_ETHER_TYPE_QINQ 0x88a8 IEEE 802.1ad QINQ tagging
Tpid just be stored, now we don't handle it
The vlan[0] is the innermost VLAN
The vlan[1] is the QinQ info
NFP_NET_META_IPSEC The IPsec type requires 4 bit. The SA index value is 32 bit which need 1 data field.
----------------------------------------------------------------
3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sa_idx |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
44.10.4. TX
NFP_NET_META_VLAN
-----------------------------------------------------------------
3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| TPID | PCP |p| vlan outermost VID |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
^ ^
NOTE: | TCI |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
NFP_NET_META_IPSEC The IPsec type requires 12 bit, because it requires three data fields.
----------------------------------------------------------------
3 2 1 0
1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0 9 8 7 6 5 4 3 2 1 0
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sa_idx |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| nfp_ipsec_force_seq_low |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| nfp_ipsec_force_seq_hi |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
The sa_idx is 32 bit which need 1 data field.
The nfp_ipsec_force_seq_low & nfp_ipsec_force_seq_hi is Anti-re-anti-count,
which is 64 bit need two data fields.