21. L3 Forwarding Graph Sample Application

The L3 Forwarding Graph application is a simple example of packet processing using the DPDK Graph framework. The application performs L3 forwarding using the graph framework and nodes written for it.

21.1. Overview

The application demonstrates the use of the graph framework and graph nodes ethdev_rx, pkt_cls, ip4_lookup/ip6_lookup, ip4_rewrite/ip6_rewrite, ethdev_tx and pkt_drop in DPDK to implement packet forwarding.

The initialization is very similar to that of the L3 Forwarding Sample Application, with additional initialization for creating and configuring one graph object per lcore. The run-time path is the main difference from the L3 Forwarding Sample Application: the forwarding logic, starting with Rx, followed by LPM lookup, TTL update and finally Tx, is implemented inside graph nodes that are interconnected by the graph framework. The application main loop only needs to walk the graph using rte_graph_walk(), with one graph object created per worker lcore.

The lookup method is determined by the implementation of the ip4_lookup/ip6_lookup graph nodes. The ID of the output interface for the input packet is the next hop returned by the LPM lookup. The set of LPM rules used by the application is statically configured and is provided to the ip4_lookup/ip6_lookup and ip4_rewrite/ip6_rewrite graph nodes using the node control APIs rte_node_ip4_route_add()/rte_node_ip6_route_add() and rte_node_ip4_rewrite_add()/rte_node_ip6_rewrite_add().
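To make the lookup step concrete, here is a minimal, self-contained sketch of longest-prefix matching over a small static IPv4 route table. It is not the LPM implementation used by the ip4_lookup node: the names toy_route, toy_mask and toy_lpm_lookup are hypothetical, and a linear scan stands in for the real data structure. The returned index plays the role of the next hop id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical route entry: prefix, prefix length and output interface. */
struct toy_route {
	uint32_t ip;    /* prefix in host byte order */
	uint8_t depth;  /* prefix length, 0..32 */
	uint8_t if_out; /* output port id */
};

/* Build a netmask for a prefix length; depth 0 must yield an empty mask. */
uint32_t toy_mask(uint8_t depth)
{
	assert(depth <= 32);
	return depth == 0 ? 0 : UINT32_MAX << (32 - depth);
}

/*
 * Return the index of the longest matching route for dst, or -1 if no
 * route matches. A linear scan is used purely for clarity.
 */
int toy_lpm_lookup(const struct toy_route *tbl, size_t n, uint32_t dst)
{
	int best = -1;
	size_t i;

	assert(tbl != NULL || n == 0);
	for (i = 0; i < n; i++) {
		uint32_t mask = toy_mask(tbl[i].depth);

		if ((dst & mask) != (tbl[i].ip & mask))
			continue;
		/* Prefer the more specific (longer) prefix. */
		if (best < 0 || tbl[i].depth > tbl[best].depth)
			best = (int)i;
	}
	return best;
}
```

With routes for 198.18.0.0/24, 198.18.1.0/24 and 198.18.0.0/16, an address such as 198.18.1.5 matches both the second and third entries, and the /24 entry wins because it is more specific.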

In the sample application, both IPv4 and IPv6 forwarding are supported.

21.2. Compiling the Application

To compile the sample application, see Compiling the Sample Applications.

The application is located in the l3fwd-graph sub-directory.

21.3. Running the Application

The application has a number of command line options similar to l3fwd:

./dpdk-l3fwd-graph [EAL options] -- -p PORTMASK
                               [-P]
                               --config (port,queue,lcore)[,(port,queue,lcore)]
                               [--eth-dest=X,MM:MM:MM:MM:MM:MM]
                               [--max-pkt-len PKTLEN]
                               [--no-numa]
                               [--per-port-pool]
                               [--pcap-enable]
                               [--pcap-num-cap]
                               [--pcap-file-name]
                               [--model]

Where,

  • -p PORTMASK: Hexadecimal bitmask of ports to configure

  • -P: Optional, sets all ports to promiscuous mode so that packets are accepted regardless of the packet’s Ethernet MAC destination address. Without this option, only packets with the Ethernet MAC destination address set to the Ethernet address of the port are accepted.

  • --config (port,queue,lcore)[,(port,queue,lcore)]: Determines which queues from which ports are mapped to which cores.

  • --eth-dest=X,MM:MM:MM:MM:MM:MM: Optional, Ethernet destination address for port X.

  • --max-pkt-len: Optional, maximum packet length in decimal (64-9600).

  • --no-numa: Optional, disables NUMA awareness.

  • --per-port-pool: Optional, uses an independent buffer pool per port. Without this option, a single buffer pool is used for all ports.

  • --pcap-enable: Optional, enables packet capture in pcap format on each node, with mbuf and node metadata.

  • --pcap-num-cap: Optional, number of packets to be captured per core.

  • --pcap-file-name: Optional, name of the pcap file in which packets are captured.

  • --model: Optional, selects the graph walking model.
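As an illustration of how a hexadecimal port mask maps to port ids, the following sketch parses a mask string and tests individual bits. The helper names toy_parse_portmask and toy_port_enabled are hypothetical and are not the application's own parsing code; the bit test mirrors the enabled_port_mask & (1 << portid) check used by the application.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a hexadecimal port mask such as "0x3" or "3".
 * Returns 0 and stores the mask in *mask on success, -1 on bad input.
 */
int toy_parse_portmask(const char *arg, uint32_t *mask)
{
	char *end = NULL;
	unsigned long val;

	assert(arg != NULL && mask != NULL);
	val = strtoul(arg, &end, 16);
	/* Reject empty input, trailing garbage, an empty mask and overflow. */
	if (end == arg || *end != '\0' || val == 0 || val > UINT32_MAX)
		return -1;
	*mask = (uint32_t)val;
	return 0;
}

/* A port is enabled when the bit at position portid is set in the mask. */
int toy_port_enabled(uint32_t mask, unsigned int portid)
{
	return portid < 32 && (mask & (UINT32_C(1) << portid)) != 0;
}
```

For a mask of 0x3, bits 0 and 1 are set, so ports 0 and 1 are enabled and port 2 is not.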

For example, consider a dual processor socket platform with 8 physical cores, where cores 0-7 and 16-23 appear on socket 0, while cores 8-15 and 24-31 appear on socket 1.

To enable L3 forwarding between two ports using two cores, cores 1 and 2, assuming that both ports and both cores are on the same socket, use the following command:

./<build_dir>/examples/dpdk-l3fwd-graph -l 1,2 -n 4 -- -p 0x3 --config="(0,0,1),(1,0,2)"

In this command:

  • The -l option enables cores 1 and 2.

  • The -p option enables ports 0 and 1.

  • The --config option enables one queue on each port and maps each (port,queue) pair to a specific core. The following table shows the mapping in this example:

Port   Queue   lcore   Description
----   -----   -----   -----------------------------------
0      0       1       Map queue 0 from port 0 to lcore 1.
1      0       2       Map queue 0 from port 1 to lcore 2.
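The structure of the --config argument can be sketched with a small parser that turns a string such as "(0,0,1),(1,0,2)" into (port, queue, lcore) tuples. This is only an illustration and not the application's parser; the names toy_lcore_param and toy_parse_config are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical holder for one (port,queue,lcore) mapping. */
struct toy_lcore_param {
	unsigned int port_id;
	unsigned int queue_id;
	unsigned int lcore_id;
};

/*
 * Parse "(p,q,l),(p,q,l),..." into out[]. Returns the number of tuples
 * parsed, or -1 on malformed input or when more than max tuples are given.
 */
int toy_parse_config(const char *arg, struct toy_lcore_param *out, int max)
{
	int count = 0;

	assert(arg != NULL && out != NULL);
	while (*arg != '\0') {
		struct toy_lcore_param p;
		int used = 0;

		if (count == max)
			return -1;
		/* %n records how many characters the whole tuple consumed;
		 * it stays 0 if the closing parenthesis is missing. */
		if (sscanf(arg, "(%u,%u,%u)%n", &p.port_id, &p.queue_id,
			   &p.lcore_id, &used) != 3 || used == 0)
			return -1;
		out[count++] = p;
		arg += used;
		if (*arg == ',') {
			arg++;
			if (*arg == '\0')
				return -1; /* trailing comma */
		} else if (*arg != '\0') {
			return -1; /* tuples must be comma separated */
		}
	}
	return count;
}
```

Parsing the example argument yields two tuples, matching the two rows of the table above.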

To enable pcap trace on each graph, use the following command:

./<build_dir>/examples/dpdk-l3fwd-graph -l 1,2 -n 4 -- -p 0x3 --config="(0,0,1),(1,0,2)" --pcap-enable --pcap-num-cap=<number of packets> --pcap-file-name "</path/to/file>"

In this command:

  • The --pcap-enable option enables pcap trace on graph nodes.

  • The --pcap-num-cap option sets the number of packets to be captured per graph. By default, 1024 packets are captured per graph.

  • The --pcap-file-name option sets the name of the file in which packets are captured.

To enable the mcore dispatch model, the application needs to set RTE_GRAPH_MODEL_SELECT with #define RTE_GRAPH_MODEL_SELECT RTE_GRAPH_MODEL_MCORE_DISPATCH before including rte_graph_worker.h. Recompile and use the following command:

./<build_dir>/examples/dpdk-l3fwd-graph -l 1,2,3,4 -n 4 -- -p 0x1 --config="(0,0,1)" -P --model="dispatch"

To enable graph walking model selection at run time, remove the definition of RTE_GRAPH_MODEL_SELECT, recompile, and use the same command.

In this command:

  • The --model option lets the user select the rtc or dispatch model.
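The difference between compile-time and run-time model selection can be sketched as follows. This is a simplified illustration under the assumption that a defined selection macro fixes the model and that the run-time setting is consulted only when the macro is absent. All TOY_* names are hypothetical stand-ins and are not part of the DPDK API.

```c
#include <assert.h>

/* Hypothetical stand-ins for the rtc and mcore dispatch model ids. */
#define TOY_MODEL_RTC 0
#define TOY_MODEL_MCORE_DISPATCH 1

/*
 * To pin the model at compile time, define the selection macro before
 * this point, for example:
 *
 *   #define TOY_MODEL_SELECT TOY_MODEL_MCORE_DISPATCH
 */

/* Return the model that a graph walk would actually use. */
int toy_effective_model(int runtime_model)
{
	assert(runtime_model == TOY_MODEL_RTC ||
	       runtime_model == TOY_MODEL_MCORE_DISPATCH);
#ifdef TOY_MODEL_SELECT
	/* Compile-time choice: the run-time setting is ignored. */
	(void)runtime_model;
	return TOY_MODEL_SELECT;
#else
	/* No compile-time choice: follow the run-time setting. */
	return runtime_model;
#endif
}
```

Pinning the model at compile time removes a branch from the walk path, at the cost of requiring a rebuild to switch models.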

Refer to the DPDK Getting Started Guide for general information on running applications and the Environment Abstraction Layer (EAL) options.

21.4. Explanation

The following sections provide some explanation of the sample application code. As mentioned in the overview section, the initialization is similar to that of the L3 Forwarding Sample Application. Although the run-time path is similar in functionality to that of the L3 Forwarding Sample Application, the major part of the implementation is in graph nodes provided by the librte_node library. The following sections describe aspects that are specific to the L3 Forwarding Graph sample application.

21.4.1. Graph Node Pre-Init Configuration

After device configuration and Rx/Tx queue setup are complete, a minimal configuration (port id, num_rx_queues, num_tx_queues, mempools, etc.) is passed to the ethdev_* node control API rte_node_eth_config(). This leads to the ethdev_rx and ethdev_tx nodes being cloned as ethdev_rx-X-Y and ethdev_tx-X, where X and Y are the port id and queue id associated with them. For ethdev_tx-X nodes, the Tx queue id assigned to each instance of the node is the same as the graph id.

These cloned nodes, along with existing static nodes such as ip4_lookup/ip6_lookup and ip4_rewrite/ip6_rewrite, are used during graph creation to associate nodes with the lcore-specific graph object.

RTE_ETH_FOREACH_DEV(portid)
{
	struct rte_eth_conf local_port_conf = port_conf;

	/* Skip ports that are not enabled */
	if ((enabled_port_mask & (1 << portid)) == 0) {
		printf("\nSkipping disabled port %d\n", portid);
		continue;
	}

	/* Init port */
	printf("Initializing port %d ... ", portid);
	fflush(stdout);

	nb_rx_queue = get_port_n_rx_queues(portid);
	n_tx_queue = nb_lcores;
	if (n_tx_queue > MAX_TX_QUEUE_PER_PORT)
		n_tx_queue = MAX_TX_QUEUE_PER_PORT;
	printf("Creating queues: nb_rxq=%d nb_txq=%u... ",
	       nb_rx_queue, n_tx_queue);

	rte_eth_dev_info_get(portid, &dev_info);

	ret = config_port_max_pkt_len(&local_port_conf, &dev_info);
	if (ret != 0)
		rte_exit(EXIT_FAILURE,
			"Invalid max packet length: %u (port %u)\n",
			max_pkt_len, portid);

	if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE)
		local_port_conf.txmode.offloads |=
			RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE;

	local_port_conf.rx_adv_conf.rss_conf.rss_hf &=
		dev_info.flow_type_rss_offloads;
	if (local_port_conf.rx_adv_conf.rss_conf.rss_hf !=
	    port_conf.rx_adv_conf.rss_conf.rss_hf) {
		printf("Port %u modified RSS hash function based on "
		       "hardware support,"
		       "requested:%#" PRIx64 " configured:%#" PRIx64
		       "\n",
		       portid, port_conf.rx_adv_conf.rss_conf.rss_hf,
		       local_port_conf.rx_adv_conf.rss_conf.rss_hf);
	}

	ret = rte_eth_dev_configure(portid, nb_rx_queue,
				    n_tx_queue, &local_port_conf);
	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Cannot configure device: err=%d, port=%d\n",
			 ret, portid);

	ret = rte_eth_dev_adjust_nb_rx_tx_desc(portid, &nb_rxd,
					       &nb_txd);
	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Cannot adjust number of descriptors: err=%d, "
			 "port=%d\n",
			 ret, portid);

	rte_eth_macaddr_get(portid, &ports_eth_addr[portid]);
	print_ethaddr(" Address:", &ports_eth_addr[portid]);
	printf(", ");
	print_ethaddr(
		"Destination:",
		(const struct rte_ether_addr *)&dest_eth_addr[portid]);
	printf(", ");

	/*
	 * prepare src MACs for each port.
	 */
	rte_ether_addr_copy(
		&ports_eth_addr[portid],
		(struct rte_ether_addr *)(val_eth + portid) + 1);

	/* Init memory */
	if (!per_port_pool) {
		/* portid = 0; this is *not* signifying the first port,
		 * rather, it signifies that portid is ignored.
		 */
		ret = init_mem(0, NB_MBUF(nb_ports));
	} else {
		ret = init_mem(portid, NB_MBUF(1));
	}
	if (ret < 0)
		rte_exit(EXIT_FAILURE, "init_mem() failed\n");

	/* Init one TX queue per couple (lcore,port) */
	queueid = 0;
	for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
		if (rte_lcore_is_enabled(lcore_id) == 0)
			continue;

		qconf = &lcore_conf[lcore_id];

		if (numa_on)
			socketid = (uint8_t)rte_lcore_to_socket_id(
				lcore_id);
		else
			socketid = 0;

		printf("txq=%u,%d,%d ", lcore_id, queueid, socketid);
		fflush(stdout);

		txconf = &dev_info.default_txconf;
		txconf->offloads = local_port_conf.txmode.offloads;
		ret = rte_eth_tx_queue_setup(portid, queueid, nb_txd,
					     socketid, txconf);
		if (ret < 0)
			rte_exit(EXIT_FAILURE,
				 "rte_eth_tx_queue_setup: err=%d, "
				 "port=%d\n",
				 ret, portid);
		queueid++;
	}

	/* Setup ethdev node config */
	ethdev_conf[nb_conf].port_id = portid;
	ethdev_conf[nb_conf].num_rx_queues = nb_rx_queue;
	ethdev_conf[nb_conf].num_tx_queues = n_tx_queue;
	if (!per_port_pool)
		ethdev_conf[nb_conf].mp = pktmbuf_pool[0];

	else
		ethdev_conf[nb_conf].mp = pktmbuf_pool[portid];
	ethdev_conf[nb_conf].mp_count = NB_SOCKETS;

	nb_conf++;
	printf("\n");
}

for (lcore_id = 0; lcore_id < RTE_MAX_LCORE; lcore_id++) {
	if (rte_lcore_is_enabled(lcore_id) == 0)
		continue;
	qconf = &lcore_conf[lcore_id];
	printf("\nInitializing rx queues on lcore %u ... ", lcore_id);
	fflush(stdout);
	/* Init RX queues */
	for (queue = 0; queue < qconf->n_rx_queue; ++queue) {
		struct rte_eth_rxconf rxq_conf;

		portid = qconf->rx_queue_list[queue].port_id;
		queueid = qconf->rx_queue_list[queue].queue_id;

		if (numa_on)
			socketid = (uint8_t)rte_lcore_to_socket_id(
				lcore_id);
		else
			socketid = 0;

		printf("rxq=%d,%d,%d ", portid, queueid, socketid);
		fflush(stdout);

		rte_eth_dev_info_get(portid, &dev_info);
		rxq_conf = dev_info.default_rxconf;
		rxq_conf.offloads = port_conf.rxmode.offloads;
		if (!per_port_pool)
			ret = rte_eth_rx_queue_setup(
				portid, queueid, nb_rxd, socketid,
				&rxq_conf, pktmbuf_pool[0][socketid]);
		else
			ret = rte_eth_rx_queue_setup(
				portid, queueid, nb_rxd, socketid,
				&rxq_conf,
				pktmbuf_pool[portid][socketid]);
		if (ret < 0)
			rte_exit(EXIT_FAILURE,
				 "rte_eth_rx_queue_setup: err=%d, "
				 "port=%d\n",
				 ret, portid);

		/* Add this queue node to its graph */
		snprintf(qconf->rx_queue_list[queue].node_name,
			 RTE_NODE_NAMESIZE, "ethdev_rx-%u-%u", portid,
			 queueid);
	}

	/* Alloc a graph to this lcore only if source exists  */
	if (qconf->n_rx_queue)
		nb_graphs++;
}

printf("\n");

/* Ethdev node config, skip rx queue mapping */
ret = rte_node_eth_config(ethdev_conf, nb_conf, nb_graphs);
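The clone naming convention described in this section (ethdev_rx-X-Y and ethdev_tx-X) can be sketched with a few standalone helpers. The function names and TOY_NODE_NAMESIZE are hypothetical; the application itself uses RTE_NODE_NAMESIZE, and the buffer size chosen here is an assumption made only for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Assumed buffer size for this sketch only. */
#define TOY_NODE_NAMESIZE 64

/* Format the Rx clone name for a (port, queue) pair: ethdev_rx-X-Y. */
int toy_rx_node_name(char *buf, size_t len, unsigned int port,
		     unsigned int queue)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "ethdev_rx-%u-%u", port, queue);
	/* Treat truncation as an error rather than using a partial name. */
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Format the Tx clone name for a port: ethdev_tx-X. */
int toy_tx_node_name(char *buf, size_t len, unsigned int port)
{
	int n;

	assert(buf != NULL);
	n = snprintf(buf, len, "ethdev_tx-%u", port);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Rx clones are the source nodes of a graph; detect them by prefix. */
int toy_is_rx_clone(const char *name)
{
	static const char prefix[] = "ethdev_rx-";

	assert(name != NULL);
	return strncmp(name, prefix, sizeof(prefix) - 1) == 0;
}
```

For port 0, queue 1 the Rx clone is named ethdev_rx-0-1, while the Tx clone for port 1 is named ethdev_tx-1.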

21.4.2. Graph Initialization

Now a graph needs to be created with a specific set of nodes for every lcore. The graph object returned by graph creation is a per-lcore object and cannot be shared between lcores. Since an ethdev_tx-X node is a per-port node, it can be associated with all the graphs created, as every lcore should have Tx capability for every port. An ethdev_rx-X-Y node, however, is created per (port, rx_queue_id), so it should be associated with a graph based on the rx queue to lcore mapping given by the --config application argument.

Note

Graph creation will fail if the passed set of shell node patterns is not sufficient to satisfy the nodes' inter-dependencies, or if even one node pattern does not match any node.

static const char * const default_patterns[] = {
	"ip4*",
	"ethdev_tx-*",
	"pkt_drop",
};
uint8_t socketid;
uint16_t nb_rx_queue, queue;
struct rte_graph_param graph_conf;
struct rte_eth_dev_info dev_info;
uint32_t nb_ports, nb_conf = 0;
uint32_t n_tx_queue, nb_lcores;
struct rte_eth_txconf *txconf;
uint16_t queueid, portid, i;
const char **node_patterns;
struct lcore_conf *qconf;
uint16_t nb_graphs = 0;
uint16_t nb_patterns;
uint8_t rewrite_len;
uint32_t lcore_id;
int ret;

/* Init EAL */
ret = rte_eal_init(argc, argv);
if (ret < 0)
	rte_exit(EXIT_FAILURE, "Invalid EAL parameters\n");
argc -= ret;
argv += ret;

force_quit = false;
signal(SIGINT, signal_handler);
signal(SIGTERM, signal_handler);

/* Pre-init dst MACs for all ports to 02:00:00:00:00:xx */
for (portid = 0; portid < RTE_MAX_ETHPORTS; portid++) {
	dest_eth_addr[portid] =
		RTE_ETHER_LOCAL_ADMIN_ADDR + ((uint64_t)portid << 40);
	*(uint64_t *)(val_eth + portid) = dest_eth_addr[portid];
}

/* Parse application arguments (after the EAL ones) */
ret = parse_args(argc, argv);
if (ret < 0)
	rte_exit(EXIT_FAILURE, "Invalid L3FWD_GRAPH parameters\n");

if (check_lcore_params() < 0)
	rte_exit(EXIT_FAILURE, "check_lcore_params() failed\n");

if (check_worker_model_params() < 0)
	rte_exit(EXIT_FAILURE, "check_worker_model_params() failed\n");

ret = init_lcore_rx_queues();
if (ret < 0)
	rte_exit(EXIT_FAILURE, "init_lcore_rx_queues() failed\n");

if (check_port_config() < 0)
	rte_exit(EXIT_FAILURE, "check_port_config() failed\n");

nb_ports = rte_eth_dev_count_avail();
nb_lcores = rte_lcore_count();

/*
 * Port, queue and ethdev node configuration goes here. It is identical to
 * the listing in section 21.4.1 and ends with the rte_node_eth_config() call.
 */
if (ret)
	rte_exit(EXIT_FAILURE, "rte_node_eth_config: err=%d\n", ret);

/* Start ports */
RTE_ETH_FOREACH_DEV(portid)
{
	if ((enabled_port_mask & (1 << portid)) == 0)
		continue;

	/* Start device */
	ret = rte_eth_dev_start(portid);
	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "rte_eth_dev_start: err=%d, port=%d\n", ret,
			 portid);

	/*
	 * If enabled, put device in promiscuous mode.
	 * This allows IO forwarding mode to forward packets
	 * to itself through 2 cross-connected  ports of the
	 * target machine.
	 */
	if (promiscuous_on)
		rte_eth_promiscuous_enable(portid);
}

printf("\n");

check_all_ports_link_status(enabled_port_mask);

/* Graph Initialization */
nb_patterns = RTE_DIM(default_patterns);
node_patterns = malloc((MAX_RX_QUEUE_PER_LCORE + nb_patterns) *
		       sizeof(*node_patterns));
if (!node_patterns)
	return -ENOMEM;
memcpy(node_patterns, default_patterns,
       nb_patterns * sizeof(*node_patterns));

memset(&graph_conf, 0, sizeof(graph_conf));
graph_conf.node_patterns = node_patterns;
graph_conf.nb_node_patterns = nb_patterns;

/* Pcap config */
graph_conf.pcap_enable = pcap_trace_enable;
graph_conf.num_pkt_to_capture = packet_to_capture;
graph_conf.pcap_filename = pcap_filename;

if (model_conf == RTE_GRAPH_MODEL_MCORE_DISPATCH)
	graph_config_mcore_dispatch(graph_conf);
else
	graph_config_rtc(graph_conf);

rte_graph_worker_model_set(model_conf);
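The way shell patterns such as those in default_patterns select nodes can be sketched with POSIX fnmatch(). This assumes the patterns follow fnmatch() semantics, as the term "shell pattern" suggests; the helper names and the node list in the usage note are hypothetical, and the sketch does not model node inter-dependencies.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Return 1 if name matches at least one of the patterns, else 0. */
int toy_node_selected(const char *const *patterns, size_t nb_patterns,
		      const char *name)
{
	size_t i;

	assert(patterns != NULL && name != NULL);
	for (i = 0; i < nb_patterns; i++)
		if (fnmatch(patterns[i], name, 0) == 0)
			return 1;
	return 0;
}

/* Count the node names selected by the pattern set. */
size_t toy_count_matches(const char *const *patterns, size_t nb_patterns,
			 const char *const *nodes, size_t nb_nodes)
{
	size_t i, count = 0;

	for (i = 0; i < nb_nodes; i++)
		count += (size_t)toy_node_selected(patterns, nb_patterns,
						   nodes[i]);
	return count;
}

/* Return 1 if every pattern matches at least one node name, else 0. */
int toy_all_patterns_resolve(const char *const *patterns, size_t nb_patterns,
			     const char *const *nodes, size_t nb_nodes)
{
	size_t i, j;

	for (i = 0; i < nb_patterns; i++) {
		int found = 0;

		for (j = 0; j < nb_nodes && !found; j++)
			found = fnmatch(patterns[i], nodes[j], 0) == 0;
		if (!found)
			return 0; /* one unmatched pattern is a failure */
	}
	return 1;
}
```

Given nodes named ip4_lookup, ip4_rewrite, ethdev_tx-0, ethdev_tx-1, pkt_drop and ethdev_rx-0-0, the three default patterns select every node except ethdev_rx-0-0, which is why the Rx node names are added to the pattern list per lcore.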

21.4.3. Forwarding Data (Route, Next-Hop) Addition

Once graph objects are created, node-specific information such as routes and rewrite headers is provided at run time using the rte_node_ip4_route_add()/rte_node_ip6_route_add() and rte_node_ip4_rewrite_add()/rte_node_ip6_rewrite_add() APIs.

Note

Since the ip4_lookup/ip6_lookup and ip4_rewrite/ip6_rewrite nodes currently do not support lock-less mechanisms (RCU, etc.) for adding forwarding data such as routes and rewrite data at run time, forwarding data is added before the packet processing loop is launched on the worker lcores.

for (i = 0; i < IPV4_L3FWD_LPM_NUM_ROUTES; i++) {
	char route_str[INET6_ADDRSTRLEN * 4];
	char abuf[INET6_ADDRSTRLEN];
	struct in_addr in;
	uint32_t dst_port;

	/* Skip unused ports */
	if ((1 << ipv4_l3fwd_lpm_route_array[i].if_out &
	     enabled_port_mask) == 0)
		continue;

	dst_port = ipv4_l3fwd_lpm_route_array[i].if_out;

	in.s_addr = htonl(ipv4_l3fwd_lpm_route_array[i].ip);
	snprintf(route_str, sizeof(route_str), "%s / %d (%d)",
		 inet_ntop(AF_INET, &in, abuf, sizeof(abuf)),
		 ipv4_l3fwd_lpm_route_array[i].depth,
		 ipv4_l3fwd_lpm_route_array[i].if_out);

	/* Use route index 'i' as next hop id */
	ret = rte_node_ip4_route_add(
		ipv4_l3fwd_lpm_route_array[i].ip,
		ipv4_l3fwd_lpm_route_array[i].depth, i,
		RTE_NODE_IP4_LOOKUP_NEXT_REWRITE);

	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Unable to add ip4 route %s to graph\n",
			 route_str);

	memcpy(rewrite_data, val_eth + dst_port, rewrite_len);

	/* Add next hop rewrite data for id 'i' */
	ret = rte_node_ip4_rewrite_add(i, rewrite_data,
				       rewrite_len, dst_port);
	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Unable to add next hop %u for "
			 "route %s\n", i, route_str);

	RTE_LOG(INFO, L3FWD_GRAPH, "Added route %s, next_hop %u\n",
		route_str, i);
}

for (i = 0; i < IPV6_L3FWD_LPM_NUM_ROUTES; i++) {
	char route_str[INET6_ADDRSTRLEN * 4];
	char abuf[INET6_ADDRSTRLEN];
	struct in6_addr in6;
	uint32_t dst_port;

	/* Skip unused ports */
	if ((1 << ipv6_l3fwd_lpm_route_array[i].if_out &
	     enabled_port_mask) == 0)
		continue;

	dst_port = ipv6_l3fwd_lpm_route_array[i].if_out;

	memcpy(in6.s6_addr, ipv6_l3fwd_lpm_route_array[i].ip, RTE_LPM6_IPV6_ADDR_SIZE);
	snprintf(route_str, sizeof(route_str), "%s / %d (%d)",
		 inet_ntop(AF_INET6, &in6, abuf, sizeof(abuf)),
		 ipv6_l3fwd_lpm_route_array[i].depth,
		 ipv6_l3fwd_lpm_route_array[i].if_out);

	/* Use route index 'i' as next hop id */
	ret = rte_node_ip6_route_add(ipv6_l3fwd_lpm_route_array[i].ip,
		ipv6_l3fwd_lpm_route_array[i].depth, i,
		RTE_NODE_IP6_LOOKUP_NEXT_REWRITE);

	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Unable to add ip6 route %s to graph\n",
			 route_str);

	memcpy(rewrite_data, val_eth + dst_port, rewrite_len);

	/* Add next hop rewrite data for id 'i' */
	ret = rte_node_ip6_rewrite_add(i, rewrite_data,
				       rewrite_len, dst_port);
	if (ret < 0)
		rte_exit(EXIT_FAILURE,
			 "Unable to add next hop %u for "
			 "route %s\n", i, route_str);

	RTE_LOG(INFO, L3FWD_GRAPH, "Added route %s, next_hop %u\n",
		route_str, i);
}
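The rewrite data passed for each next hop can be pictured as a destination MAC followed by a source MAC. The sketch below builds such a buffer byte by byte. It assumes a 12-byte layout (two Ethernet addresses) and uses the default destination address 02:00:00:00:00:xx mentioned in the initialization listing. The helper names are hypothetical, and the application builds the same data differently, through its val_eth array.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETHER_ADDR_LEN 6
#define TOY_REWRITE_LEN (2 * TOY_ETHER_ADDR_LEN) /* assumed layout */

/* Default destination MAC for a port: 02:00:00:00:00:xx, xx = port id. */
void toy_default_dst_mac(uint8_t portid, uint8_t mac[TOY_ETHER_ADDR_LEN])
{
	assert(mac != NULL);
	memset(mac, 0, TOY_ETHER_ADDR_LEN);
	mac[0] = 0x02; /* locally administered address */
	mac[5] = portid;
}

/*
 * Build rewrite data as destination MAC followed by source MAC.
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t toy_build_rewrite(const uint8_t dst[TOY_ETHER_ADDR_LEN],
			 const uint8_t src[TOY_ETHER_ADDR_LEN],
			 uint8_t *out, size_t out_len)
{
	assert(dst != NULL && src != NULL && out != NULL);
	if (out_len < TOY_REWRITE_LEN)
		return 0;
	memcpy(out, dst, TOY_ETHER_ADDR_LEN);
	memcpy(out + TOY_ETHER_ADDR_LEN, src, TOY_ETHER_ADDR_LEN);
	return TOY_REWRITE_LEN;
}
```

Writing the bytes individually keeps the sketch independent of host byte order, unlike an integer shift into a 64-bit value.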

21.4.4. Packet Forwarding using Graph Walk

Now that device configuration is done, the graphs are created and the forwarding data has been added to the nodes, the worker lcores are launched with the graph main loop. The graph main loop is very simple: it continuously calls the non-blocking API rte_graph_walk() with its lcore-specific graph object, which was created earlier.

Note

rte_graph_walk() walks over all the source nodes, i.e. the ethdev_rx-X-Y nodes associated with a given graph, receives the available packets and enqueues them to the following node, pkt_cls. Based on the packet type, pkt_cls enqueues them to ip4_lookup/ip6_lookup, which in turn enqueues them to the ip4_rewrite/ip6_rewrite node if the LPM lookup succeeds. The ip4_rewrite/ip6_rewrite node then updates the Ethernet header as per the next-hop data and transmits the packet via port 'Z' by enqueuing it to the ethdev_tx-Z node instance in its graph object.

static int
graph_main_loop(void *conf)
{
	struct lcore_conf *qconf;
	struct rte_graph *graph;
	uint32_t lcore_id;

	RTE_SET_USED(conf);

	lcore_id = rte_lcore_id();
	qconf = &lcore_conf[lcore_id];
	graph = qconf->graph;

	if (!graph) {
		RTE_LOG(INFO, L3FWD_GRAPH, "Lcore %u has nothing to do\n",
			lcore_id);
		return 0;
	}

	RTE_LOG(INFO, L3FWD_GRAPH,
		"Entering main loop on lcore %u, graph %s(%p)\n", lcore_id,
		qconf->name, graph);

	while (likely(!force_quit))
		rte_graph_walk(graph);

	return 0;
}
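The node sequence described in the note above can be sketched as a tiny state machine that follows one packet from Rx to either Tx or drop. This is a conceptual model only: the real graph framework processes bursts of packets per node rather than one packet at a time, and every name below (toy_pkt, toy_node_process, toy_walk) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node ids mirroring the node sequence in the note. */
enum toy_node_id {
	TOY_RX, TOY_CLS, TOY_LOOKUP, TOY_REWRITE, TOY_TX, TOY_DROP
};

/* Hypothetical packet metadata, pre-filled by the caller of the sketch. */
struct toy_pkt {
	int is_ip;       /* packet type seen by the classifier */
	int route_found; /* pretend result of the LPM lookup */
	uint8_t ttl;
};

/* Process a packet at one node and return the next node to visit. */
enum toy_node_id toy_node_process(enum toy_node_id node, struct toy_pkt *pkt)
{
	assert(pkt != NULL);
	switch (node) {
	case TOY_RX:
		return TOY_CLS;
	case TOY_CLS:
		return pkt->is_ip ? TOY_LOOKUP : TOY_DROP;
	case TOY_LOOKUP:
		return pkt->route_found ? TOY_REWRITE : TOY_DROP;
	case TOY_REWRITE:
		pkt->ttl--; /* TTL update before transmit */
		return TOY_TX;
	default:
		return node; /* TOY_TX and TOY_DROP are terminal */
	}
}

/* Follow one packet from the source node to a terminal node. */
enum toy_node_id toy_walk(struct toy_pkt *pkt)
{
	enum toy_node_id node = TOY_RX;

	while (node != TOY_TX && node != TOY_DROP)
		node = toy_node_process(node, pkt);
	return node;
}
```

A packet with a matching route ends at the Tx node with its TTL decremented, while a packet of another type or without a route ends at the drop node unchanged.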