Going against the flow

January 10, 2019
Alex Omø Agerholm

Most network applications and appliances are limited in performance because packet processing is a heavy, CPU-intensive task. Packet processing normally includes decoding each packet (at least up to layer 4, sometimes up to layer 7), which enables a stateful evaluation of the packet based on the flow context it belongs to.

Flow shunting

Some applications only require this cumbersome inspection for the first few packets of a flow; the remaining packets in the same flow can then be handled by a lookup in a flow table, or flow cache. This can significantly increase the performance of the application, as a flow table lookup is much faster than the full inspection and evaluation of each packet. This flow caching is often referred to as flow shunting. But even with flow shunting, the performance of modern inline security applications remains limited, because every packet still has to be received by the application, looked up in the flow cache and, based on the flow context information, retransmitted or dropped.
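To make the idea concrete, here is a minimal software sketch of flow shunting inside an application. Everything in it is an assumption made for illustration: the `FlowKey` 5-tuple, the `ShuntingInspector` class, the toy inspector that needs three packets to reach a verdict, and the choice to forward packets while a flow is still undecided. It is not taken from any particular product.

```python
from typing import Callable, Dict, NamedTuple, Optional


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple identifying a flow."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int

    def normalized(self) -> "FlowKey":
        """Return one canonical key shared by both directions of the flow."""
        if (self.src_ip, self.src_port) <= (self.dst_ip, self.dst_port):
            return self
        return FlowKey(self.dst_ip, self.src_ip,
                       self.dst_port, self.src_port, self.protocol)


# An inspector returns "forward", "drop", or None while still undecided.
Inspector = Callable[[FlowKey, bytes], Optional[str]]


class ShuntingInspector:
    def __init__(self, inspect: Inspector) -> None:
        self.inspect = inspect
        self.flow_cache: Dict[FlowKey, str] = {}
        self.full_inspections = 0
        self.cache_hits = 0

    def handle(self, key: FlowKey, payload: bytes) -> str:
        flow = key.normalized()
        verdict = self.flow_cache.get(flow)
        if verdict is not None:
            # Fast path: the flow is already classified, so skip inspection.
            self.cache_hits += 1
            return verdict
        # Slow path: full inspection of this packet.
        self.full_inspections += 1
        verdict = self.inspect(flow, payload)
        if verdict is None:
            # Assumption: undecided flows are forwarded while inspection continues.
            return "forward"
        self.flow_cache[flow] = verdict
        return verdict


def make_toy_inspector(packets_needed: int = 3) -> Inspector:
    """Toy stand-in for deep inspection: decides after a few packets."""
    seen: Dict[FlowKey, int] = {}

    def inspect(flow: FlowKey, payload: bytes) -> Optional[str]:
        seen[flow] = seen.get(flow, 0) + 1
        if seen[flow] < packets_needed:
            return None
        # Made-up policy: drop anything involving port 23.
        return "drop" if 23 in (flow.src_port, flow.dst_port) else "forward"

    return inspect
```

Note that even on the fast path, `handle` still runs in the application for every single packet, which is exactly the limitation described above.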

Application acceleration

To achieve the next level of acceleration, we therefore need to move the flow shunting and the entire flow table into a SmartNIC, thereby enabling packet management directly in the SmartNIC for flows that have already been classified by the application. That way, we remove the need to bring the packet data into the application space for the lookup and, potentially, the retransmission. This approach will significantly increase the performance of the application, which now only needs to handle packets for flows that have not yet been classified, i.e. unlearned flows. In other words, we enable the application to focus only on packet/flow classification, moving the responsibility of the packet-by-packet forwarding/dropping to the SmartNIC.

Take a network where only 10% of the packets belong to new or unclassified flows. When offloaded by a flow shunting SmartNIC, the application would only have to process a tenth of the packets, which in the best case increases performance by a factor of 10. This example demonstrates why flow shunting in a SmartNIC is a game changer. As more actions are added to the flow shunting mechanism (e.g. NAT, VLAN and VxLAN tagging/de-tagging, and IPSec en-/decryption), the performance gain will increase and make the SmartNIC steadily more interesting for flow-based applications.
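The arithmetic behind that factor can be written down in a couple of lines. This is only a back-of-the-envelope sketch: it assumes the application is the sole bottleneck and that packets handled in the SmartNIC cost the application nothing, which is the best case. The function name `best_case_speedup` is made up for this illustration.

```python
def best_case_speedup(app_fraction: float) -> float:
    """Best-case speedup when only `app_fraction` of packets reach the application.

    Assumes packets handled by the SmartNIC are free for the application.
    """
    if not 0.0 < app_fraction <= 1.0:
        raise ValueError("app_fraction must be in the range (0, 1]")
    return 1.0 / app_fraction


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.10, 0.01):
        print(f"{fraction:>5.0%} of packets to the application -> "
              f"up to {best_case_speedup(fraction):.0f}x")
```

With 10% of the packets reaching the application, the function returns the factor of 10 from the example; in practice the gain will be lower, since new flows still need full inspection and the application has other overhead.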

At Napatech, we have added a flow management feature to the data pipeline inside our SmartNICs. This flow table, or flow cache, has been implemented in the SmartNIC to offload and accelerate any flow-based application. Based on the flow identified through the lookup, this new feature can perform a number of actions on the packet without involving the application. The first version, demoed in mid-October 2018, included the two basic actions: drop and forward. With these two actions, we can provide flow shunting directly in the SmartNIC.
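The division of labor can be illustrated with a small simulation. To be clear, this is not Napatech's API: `SimulatedNicFlowTable`, `program_flow`, `receive` and the first-packet classifier are hypothetical names, and the "NIC" is just a Python dictionary standing in for a hardware flow table that supports the two basic actions, drop and forward.

```python
from enum import Enum
from typing import Callable, Dict, Hashable


class Action(Enum):
    DROP = "drop"
    FORWARD = "forward"


class SimulatedNicFlowTable:
    """Software stand-in for a flow table that lives in the SmartNIC."""

    def __init__(self, to_application: Callable[["SimulatedNicFlowTable", Hashable, bytes], None]) -> None:
        self.table: Dict[Hashable, Action] = {}
        self.to_application = to_application
        self.handled_in_nic = 0
        self.sent_to_application = 0

    def program_flow(self, flow_id: Hashable, action: Action) -> None:
        """Called by the application once it has classified a flow."""
        self.table[flow_id] = action

    def receive(self, flow_id: Hashable, payload: bytes) -> str:
        action = self.table.get(flow_id)
        if action is None:
            # Unlearned flow: only these packets reach the application.
            self.sent_to_application += 1
            self.to_application(self, flow_id, payload)
            return "to_application"
        # Learned flow: handled without involving the application.
        self.handled_in_nic += 1
        return action.value


def classify_on_first_packet(nic: SimulatedNicFlowTable, flow_id: Hashable, payload: bytes) -> None:
    """Toy application: classifies a flow from its first packet (made-up policy)."""
    action = Action.DROP if payload.startswith(b"BAD") else Action.FORWARD
    nic.program_flow(flow_id, action)


if __name__ == "__main__":
    nic = SimulatedNicFlowTable(classify_on_first_packet)
    for _ in range(10):
        nic.receive("flow-a", b"GOOD payload")
    print(nic.sent_to_application, nic.handled_in_nic)
```

In this toy run, only the first of ten packets in the flow reaches the application; the other nine are resolved by the table lookup alone.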

On the horizon

In the future, we will add more actions and thereby increase the number of use cases where we can do full SmartNIC offload, providing even greater acceleration. We are currently working towards adding per-packet writeback into the flow table, which will enable both metrics and stateful updates for every packet. Other potential actions to be added include:

VLAN and VxLAN tagging and untagging
NAT and NAPT
IPSec en-/decryption

All of these actions will be implemented inline on a per-flow basis and without application involvement, except for configuration.

This type of SmartNIC-based flow shunting has existed for some time, but Napatech is taking it to the next level with support for up to 2x100G connectivity, 100 million bidirectional flows, a learning rate of 1-2 million new flows per second, and per-packet metrics updates, with more innovative actions to be added over time.
