This is the first post in a series on performance resilience that we will publish over the coming months. Performance resilience is the ability to ensure the performance of your commercial or home-made appliance in any data center environment. In other words, it means ensuring that your performance monitoring, cybersecurity or forensics appliance is resilient to common data center issues, such as badly configured networks, the inability to specify the desired connection type, and problems with time sync, power or space.
In this first post, we look at deduplication and how deduplication support in your SmartNIC ensures performance resilience when data center environments, specifically router and switch SPAN ports, are not configured properly.
Assume the worst
When designing an appliance to analyze network data for monitoring performance, cybersecurity or forensics, it is natural to assume that the environments where your appliance will be deployed are configured correctly and adhere to best practices. It is also fair to assume that you can get the access and connectivity you need. Why would someone go to the trouble of paying for a commercial appliance or even fund the development of an appliance in-house, if they wouldn’t also ensure that the environment meets minimum requirements?
Unfortunately, it is not always like that, as many veterans of appliance installations will tell you. The team responsible for deploying the appliance is not always the team responsible for running the data center, and appliances are not the data center team's first priority. In practice, the team deploying the appliance is told to install it in a specific location with specific connectivity, and that is that. You might prefer to use a tap, but if one is not available, you need to use a Switched Port Analyzer (SPAN) port on a switch or router for access to network data.
While this might seem acceptable, it can lead to some unexpected and unwanted behavior that is responsible for those grey hairs on the heads of veterans! An example of this unwanted behavior is duplicate network packets.
How do duplicate packets occur?
Ideally, when performing network monitoring and analysis, you would like to use a tap to get direct access to the real data in real time. However, as we stated above, you can’t always dictate that and sometimes have to settle for connectivity to a SPAN port.
The difference between a tap and a SPAN port is that a tap is a physical device that is installed in the middle of the communication link so that all traffic passes through the tap and is copied to the appliance. Conversely, a SPAN port on a switch or router receives copies of all data passing through the switch, which can then be made available to the appliance through the SPAN port.
When configured properly, a SPAN port works just fine. Modern routers and switches have become better at ensuring that the data provided by SPAN ports is reliable. However, SPAN ports can be configured in a manner that leads to duplicate packets. In some cases, where SPAN ports are misconfigured, up to 50% of the packets provided by the SPAN port can be duplicates.
So, how does this occur? What you need to understand about SPAN ports is that a copy of a packet can be created when it enters the switch on an ingress port, and another copy can be created when it leaves the switch on an egress port. If the SPAN is configured to mirror both ingress and egress, the appliance receives both copies and duplicates are unavoidable. It is possible, however, to configure the SPAN to create copies only on ingress or only on egress, thus avoiding duplicates.
Nevertheless, it is not uncommon to arrive in a data center environment where SPAN ports are misconfigured and nobody has permission to change the configuration on the switch or router. In other words, there will be duplicates and you just have to live with it!
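The ingress and egress mirroring described above can be illustrated with a small Python sketch. It is a simplified model, not switch code: the `Packet` class, `span_copies` and `duplicate_ratio` are hypothetical names introduced here, and the model assumes the two copies of a packet are byte-identical, which real switches and routers do not always guarantee.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical stand-in for a packet: a flow label plus a sequence number."""
    flow: str
    seq: int


def span_copies(packets, mirror_ingress=True, mirror_egress=True):
    """Return what a SPAN port would emit for traffic crossing one switch."""
    mirrored = []
    for pkt in packets:
        if mirror_ingress:
            mirrored.append(pkt)  # copy taken as the packet enters the switch
        if mirror_egress:
            mirrored.append(pkt)  # copy taken as the packet leaves the switch
    return mirrored


def duplicate_ratio(mirrored):
    """Fraction of mirrored packets that repeat a packet already seen."""
    if not mirrored:
        return 0.0
    return 1 - len(set(mirrored)) / len(mirrored)


traffic = [Packet("10.0.0.1->10.0.0.2", seq) for seq in range(1000)]
both = span_copies(traffic)                             # ingress and egress mirrored
ingress_only = span_copies(traffic, mirror_egress=False)

print(duplicate_ratio(both))          # 0.5
print(duplicate_ratio(ingress_only))  # 0.0
```

With both directions mirrored, every packet appears twice, so half of what the appliance receives is duplicate data, which matches the 50% figure mentioned earlier.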
What is the impact of duplicates?
Duplicates can cause a lot of issues. The obvious issue is that double the amount of data requires double the amount of processing capacity, memory and power. However, the main issue is false positives: errors that are not really errors or threats that are not really threats. One common symptom is an increase in TCP out-of-order or retransmission warnings. Debugging these issues takes a lot of time, usually time that an overworked, understaffed network operations or security team does not have. In addition, any analysis performed on the basis of this information is probably not reliable, which only exacerbates the issue.
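To see why duplicates show up as retransmission warnings, consider the following Python sketch. It uses a deliberately naive check, assumed here purely for illustration: any TCP segment whose flow and sequence number have already been seen is flagged. Real analyzers use more elaborate logic, and `count_retransmission_warnings` is a hypothetical helper.

```python
def count_retransmission_warnings(segments):
    """Count segments whose (flow, seq) pair was already seen (naive check)."""
    seen = set()
    warnings = 0
    for flow, seq in segments:
        if (flow, seq) in seen:
            warnings += 1  # looks like a retransmission to the analyzer
        else:
            seen.add((flow, seq))
    return warnings


# Five segments from a healthy flow with no real retransmissions.
clean = [("A->B", seq) for seq in range(0, 5000, 1000)]

# The same flow as seen through a SPAN port that copies every packet twice.
duplicated = [segment for segment in clean for _ in range(2)]

print(count_retransmission_warnings(clean))       # 0
print(count_retransmission_warnings(duplicated))  # 5
```

The flow itself is healthy, yet every segment raises a warning once the SPAN duplicates arrive, and someone has to spend time working out that nothing is actually wrong.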
How to achieve resilience
With deduplication built into the appliance via a SmartNIC, it is possible to detect up to 99.99% of duplicate packets produced by SPAN ports. Similar functionality is available on packet brokers, but for a sizeable extra license fee. On Napatech SmartNICs, this is just one of several powerful features delivered at no extra charge.
The solution is ideal for situations where the appliance is connected directly to a SPAN port, dramatically reducing the amount of damage that duplicates can cause. It also means that the appliance is resilient to any SPAN misconfigurations or other network architectural issues that can give rise to duplicates, without relying on other costly solutions, such as packet brokers, to provide the necessary functionality.
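For readers who want a feel for how deduplication can work in principle, here is a minimal software sketch in Python. It is not how Napatech SmartNICs implement the feature, which this post does not describe. It assumes one common approach: hash each packet and drop any packet whose hash was already seen within a short time window. The `Deduplicator` class, its 1 ms default window and the choice of hash are all assumptions made for illustration, and timestamps are assumed to be non-decreasing.

```python
import hashlib
from collections import OrderedDict


class Deduplicator:
    """Drop packets whose content hash was seen within the last `window_seconds`."""

    def __init__(self, window_seconds=0.001):
        self.window = window_seconds
        self._seen = OrderedDict()  # digest -> timestamp of first sighting

    def is_duplicate(self, payload: bytes, timestamp: float) -> bool:
        # Evict entries older than the window; oldest entries sit at the front.
        while self._seen:
            _, oldest_ts = next(iter(self._seen.items()))
            if timestamp - oldest_ts <= self.window:
                break
            self._seen.popitem(last=False)

        digest = hashlib.blake2b(payload, digest_size=8).digest()
        if digest in self._seen:
            return True  # same bytes seen very recently: treat as a SPAN duplicate
        self._seen[digest] = timestamp
        return False


dedup = Deduplicator(window_seconds=0.001)
stream = [
    (b"pkt-1", 0.0000),
    (b"pkt-1", 0.0001),  # SPAN duplicate, arrives almost immediately
    (b"pkt-2", 0.0002),
    (b"pkt-1", 0.5000),  # same bytes much later, e.g. a genuine retransmission
]
kept = [payload for payload, ts in stream if not dedup.is_duplicate(payload, ts)]
print(kept)  # [b'pkt-1', b'pkt-2', b'pkt-1']
```

The time window is the key design choice in this sketch: SPAN duplicates arrive almost back to back, whereas genuine retransmissions arrive later and should still reach the analysis software.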
In other words, it is possible to ensure that the performance of your appliance is resilient to misconfigurations in these all too common situations. Stay tuned as we look at other data center issues and provide guidance on how to achieve the needed resilience.