For efficiency, many monitoring and analysis tools focus on flow records to detect anomalies in the network, and only drill down to the underlying packet level when needed. In this blog, Chief Product Architect Alex Agerholm discusses typical use cases, looks at NetFlow/IPFIX acceleration and speculates on the imminent future of flow processing.
What is a flow?
In simple terms, a network flow is a series of communications between two endpoints. Beyond that, however, the definition of a flow is not equally clear to everyone. In the context of NetFlow or IPFIX records, most people agree that a flow is typically defined by its 5-tuple: source and destination IP address, source and destination port, and the protocol field. It is also common to use a 4-tuple, dropping the protocol field, or even a 2-tuple, using only the IP addresses. The 2-tuple has the advantage that it also works for IP fragments, since only the first fragment of a datagram carries the transport-layer ports.
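To make the three definitions concrete, here is a minimal Python sketch. The PacketHeader structure and the flow_key function are hypothetical names chosen for illustration; they are not part of NetFlow, IPFIX or any Napatech API, and the sketch assumes the header fields have already been parsed.

```python
from typing import NamedTuple, Tuple


class PacketHeader(NamedTuple):
    """Already-parsed header fields of one packet (hypothetical structure)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number, e.g. 6 = TCP, 17 = UDP


def flow_key(pkt: PacketHeader, tuple_size: int = 5) -> Tuple:
    """Return the flow key for the chosen flow definition (5-, 4- or 2-tuple)."""
    if tuple_size == 5:
        return (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port, pkt.protocol)
    if tuple_size == 4:
        # Same as the 5-tuple, but without the protocol field.
        return (pkt.src_ip, pkt.dst_ip, pkt.src_port, pkt.dst_port)
    if tuple_size == 2:
        # IP addresses only; needs no transport-layer header at all.
        return (pkt.src_ip, pkt.dst_ip)
    raise ValueError("tuple_size must be 5, 4 or 2")


tcp = PacketHeader("10.0.0.1", "10.0.0.2", 40000, 443, 6)
udp = PacketHeader("10.0.0.1", "10.0.0.2", 40000, 443, 17)

# Two different flows under the 5-tuple, one flow under the 4- and 2-tuple.
print(flow_key(tcp, 5) == flow_key(udp, 5))  # False
print(flow_key(tcp, 4) == flow_key(udp, 4))  # True
print(flow_key(tcp, 2) == flow_key(udp, 2))  # True
```

The same two packets are counted as two flows or as one, depending purely on which key is used.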
Another de facto standard with flexible flow definitions is OpenFlow, which defines a flow as any combination of up to 45 fields (see the complete list at the end of this blog). Many of these fields are taken from the packet itself, while a few are more abstract (ports, metadata, etc.); together they cover Ethernet, IPv4, IPv6, and tunneled traffic. These OpenFlow constructs are used in many flow-based forwarding paradigms such as Open vSwitch (OVS), which is commonly used as the switching layer in virtualized environments for cloud data centers and telco NFV deployments.
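The idea of "any combination of fields" can be sketched as follows. This is a simplified illustration only: the field names are made up for readability and are not OpenFlow identifiers, and the sketch ignores the masks and field prerequisites that a real OpenFlow match supports.

```python
from typing import Dict, Union

FieldValue = Union[int, str]


def matches(match: Dict[str, FieldValue], packet: Dict[str, FieldValue]) -> bool:
    """True if every field named in the match has the same value in the packet.

    Fields that are left out of the match act as wildcards.
    """
    return all(packet.get(name) == value for name, value in match.items())


# Parsed packet fields (illustrative names and values).
packet = {
    "in_port": 1,
    "eth_type": 0x0800,
    "vlan_id": 100,
    "ipv4_src": "10.0.0.1",
    "ipv4_dst": "10.0.0.2",
    "ip_proto": 6,
    "tcp_dst": 443,
}

# A flow defined by only three of the available fields.
web_on_vlan_100 = {"vlan_id": 100, "ip_proto": 6, "tcp_dst": 443}
any_on_vlan_200 = {"vlan_id": 200}

print(matches(web_on_vlan_100, packet))  # True
print(matches(any_on_vlan_200, packet))  # False
```

An empty match would accept every packet, which is the widest possible flow definition in this model.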
On the other hand, the term flow can also be used more abstractly, covering only a subset of a 5-tuple flow or, alternatively, a group of 5-tuple flows. Is that then still a flow, or something else? That is a matter of definition. In other words, you can make your flow definition more specific by tightening your criteria, or more general by widening them. In either case, you get a collection of network packets with some common characteristics.
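As a rough illustration of tightening and widening the criteria, the sketch below counts the same packets twice: once per 5-tuple, and once per pair of /24 subnets. The packet list, the subnet helper and the choice of a /24 prefix are all assumptions made for this example.

```python
import ipaddress
from collections import Counter

# (src_ip, dst_ip, src_port, dst_port, protocol) for each observed packet.
packets = [
    ("10.0.0.1", "192.168.1.10", 40000, 443, 6),
    ("10.0.0.1", "192.168.1.10", 40000, 443, 6),
    ("10.0.0.1", "192.168.1.10", 40001, 443, 6),
    ("10.0.0.7", "192.168.1.20", 53000, 53, 17),
]


def subnet(ip: str, prefix_len: int = 24) -> str:
    """Return the network an address belongs to, e.g. 10.0.0.7 -> 10.0.0.0/24."""
    return str(ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False))


# Narrow definition: one flow per 5-tuple.
narrow = Counter(packets)

# Wide definition: one flow per (source subnet, destination subnet) pair,
# which groups several 5-tuple flows together.
wide = Counter((subnet(p[0]), subnet(p[1])) for p in packets)

print(len(narrow))  # 3
print(len(wide))    # 1
print(wide[("10.0.0.0/24", "192.168.1.0/24")])  # 4
```

Both results are collections of packets with common characteristics; only the criteria differ.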
Above and beyond basic 5-tuples
At Napatech, we have taken a much more flexible approach to the flow lookup function in our SmartNICs. Based on the output of our packet decoder, we can extract up to four individual elements from anywhere in the packet and combine them into the key we use for our internal flow lookup. This architecture broadens the scope of the feature far beyond 5-tuple flow matching. It enables many different use cases and gives the user a high degree of flexibility and freedom to build advanced solutions.
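The principle of building a lookup key from a few extracted elements can be sketched in software as follows. This is not the Napatech API or its hardware implementation: build_key, the (offset, length) extractor format and the fixed offsets from the start of the packet are assumptions for illustration. The example packet is an untagged Ethernet frame carrying an IPv4 header without options.

```python
from typing import Dict, List, Tuple

MAX_ELEMENTS = 4  # limit taken from the text above

# Hypothetical extractor: (byte offset from start of packet, length in bytes).
Extractor = Tuple[int, int]


def build_key(packet: bytes, extractors: List[Extractor]) -> bytes:
    """Concatenate the extracted elements into one lookup key."""
    if len(extractors) > MAX_ELEMENTS:
        raise ValueError(f"at most {MAX_ELEMENTS} elements are supported")
    parts = []
    for offset, length in extractors:
        field = packet[offset:offset + length]
        if len(field) != length:
            raise ValueError("packet too short for extractor")
        parts.append(field)
    return b"".join(parts)


# Synthetic packet: 14-byte Ethernet header followed by a 20-byte IPv4 header.
eth = bytes.fromhex("aabbccddeeff") + bytes.fromhex("112233445566") + b"\x08\x00"
ip = bytearray(20)
ip[0] = 0x45                        # version 4, header length 5 words
ip[9] = 17                          # protocol: UDP
ip[12:16] = bytes([10, 0, 0, 1])    # source address
ip[16:20] = bytes([10, 0, 0, 2])    # destination address
packet = eth + bytes(ip)

# Key = source MAC + source IP + destination IP (three of four elements used).
extractors = [(6, 6), (26, 4), (30, 4)]
key = build_key(packet, extractors)
print(key.hex())  # 1122334455660a0000010a000002

# The key indexes a flow table; here a plain dict holding a packet counter.
flow_table: Dict[bytes, int] = {}
flow_table[key] = flow_table.get(key, 0) + 1
```

In practice the positions would more likely be expressed relative to what the packet decoder has found (for example the start of the inner IP header) rather than as fixed offsets, so that the same key definition works for packets with and without tags or tunnels.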
In addition to the “standard” 5-tuple example, typical use cases include MAC addresses or VLAN or VXLAN tags combined with IP addresses, or even multiple VLAN or VXLAN tags combined with both outer and inner IP addresses. The latter requires the ability to extract elements from more than four different locations in the packet. Here the flexibility of the FPGA helps, as we can extend the number of supported locations in the packet with a simple firmware upgrade to the FPGA.
At Napatech, we refer to this feature as flow management, but that name is really a bit too narrow, as the feature can perform lookups in the flow table based on a key built from almost any packet data.
So, what seems to be merely a flow can end up being a 5-tuple flow, a 2-tuple flow, a single 5- or 2-tuple flow from within a GTP or GPRS tunnel, a layer-2 connection or even a specific VXLAN tunnel. In other words, the scope of Napatech’s flow management feature goes well beyond the standard, basic flow definition to cover a multitude of other compositions and patterns, enabling a powerful network tool with massive potential.
Complete list of OpenFlow match fields:
0: Switch input port.
1: Switch physical input port.
2: Metadata passed between tables.
3: Ethernet destination address.
4: Ethernet source address.
5: Ethernet frame type.
6: VLAN id.
7: VLAN priority.
8: IP DSCP (6 bits in ToS field).
9: IP ECN (2 bits in ToS field).
10: IP protocol.
11: IPv4 source address.
12: IPv4 destination address.
13: TCP source port.
14: TCP destination port.
15: UDP source port.
16: UDP destination port.
17: SCTP source port.
18: SCTP destination port.
19: ICMP type.
20: ICMP code.
21: ARP opcode.
22: ARP source IPv4 address.
23: ARP target IPv4 address.
24: ARP source hardware address.
25: ARP target hardware address.
26: IPv6 source address.
27: IPv6 destination address.
28: IPv6 Flow Label.
29: ICMPv6 type.
30: ICMPv6 code.
31: Target address for ND.
32: Source link-layer for ND.
33: Target link-layer for ND.
34: MPLS label.
35: MPLS TC.
36: MPLS BoS bit.
37: PBB I-SID.
38: Logical Port Metadata.
39: IPv6 Extension Header pseudo-field.
41: PBB UCA header field.
42: TCP flags.
43: Output port from action set metadata.
44: Packet type value.