Link™ NT40E3 SmartNIC
The Link™ NT40E3 SmartNIC provides full packet capture and analysis of Ethernet LAN at 40 Gbps with zero packet loss for all frame sizes. Intelligent features accelerate application performance with extremely low CPU load. Flexible time synchronization support is included with a dedicated PPS/PTP port.
1G/10G port speed automatically selected by the transceiver modules used.
For any link speed at any time
PLUG & PLAY
Out of the box solution
Multiple FPGA SmartNICs in one server
Synchronize multiple servers
Accelerate your application
Full throughput with zero packet loss
Multiple speeds in one server
More powerful server usage
Key Napatech SmartNIC features
Link™ NT40E3 SmartNIC Features
Full line-rate packet capture
Multi-port packet sequence and merge
Napatech FPGA SmartNICs typically provide multiple ports. Ports are usually paired, with one port receiving upstream packets and another port receiving downstream packets. Since these two flows going in different directions need to be analyzed as one, packets from both ports must be merged into a single analysis stream. Napatech FPGA SmartNICs can sequence and merge packets received on multiple ports in hardware using the precise time stamps of each Ethernet frame. This is highly efficient and offloads a significant and costly task from the analysis application.
There is a growing need for analysis appliances that are able to monitor and analyze multiple points in the network, and even provide a network-wide view of what is happening. Not only does this require multiple FPGA SmartNICs to be installed in a single appliance, but it also requires that the analysis data from all ports on every accelerator be correlated.
With the Napatech Software Suite, it is possible to sequence and merge the analysis data from multiple FPGA SmartNICs into a single analysis stream. The merging is based on the nanosecond precision time stamps of each Ethernet frame, allowing a time-ordered merge of individual data streams.
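As an illustration of the time-ordered merge described above, here is a minimal software sketch. It is not the Napatech implementation, which performs the merge in hardware and in the Software Suite. The `Frame` type and its field names are hypothetical, and each input stream is assumed to be already sorted by its own time stamps.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Frame(NamedTuple):
    """Hypothetical captured frame; field names are illustrative only."""
    timestamp_ns: int  # hardware time stamp in nanoseconds
    port: int          # port the frame was received on
    payload: bytes


def merge_streams(*streams: Iterable[Frame]) -> Iterator[Frame]:
    """Merge streams that are each sorted by time stamp into one ordered stream."""
    return heapq.merge(*streams, key=lambda frame: frame.timestamp_ns)


# Upstream and downstream traffic captured on two ports.
upstream = [Frame(100, 0, b"SYN"), Frame(350, 0, b"ACK")]
downstream = [Frame(220, 1, b"SYN-ACK"), Frame(400, 1, b"DATA")]

merged = list(merge_streams(upstream, downstream))
for frame in merged:
    print(frame.timestamp_ns, frame.port, frame.payload)
```

Because `heapq.merge` is lazy, the same approach works on unbounded streams, provided every source delivers its frames in time stamp order.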
Intelligent Multi-CPU distribution
Modern servers provide unprecedented processing power with multi-core CPU implementations. This makes standard servers an ideal platform for appliance development. But, to fully harness the processing power of modern servers, it is important that the analysis application is multi-threaded and that the right Ethernet frames are provided to the right CPU core for processing. Not only that, but the frames must be provided at the right time to ensure that analysis can be performed in real time.
Napatech Multi-CPU distribution is built and optimized from our extensive knowledge of server architecture, as well as real life experience from our customers.
Napatech FPGA SmartNICs ensure that identified flows of related Ethernet frames are distributed in an optimal way to the available CPU cores. This ensures that the processing load is balanced across the available processing resources, and that the right frames are being processed by the right CPU cores.
With flow distribution to multiple CPU cores, the throughput performance of the analysis application can be increased linearly with the number of cores, up to 128. Not only that, but the performance can also be scaled by faster processing cores. This highly flexible mechanism enables many different ways of designing a solution and provides the ability to optimize for cost and/or performance.
Napatech FPGA SmartNICs support different distribution schemes that are fully configurable:
- Distribution per port: all frames captured on a physical port are transferred to the same CPU or a range of CPU cores for processing
- Distribution per traffic type: frames of the same protocol type are transferred to the same CPU or a range of CPU cores for processing
- Distribution by flows: frames with the same hash value are sent to the same CPU or a range of CPU cores for processing
- Combinations of the above
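The "distribution by flows" scheme can be sketched in software as follows. This is only an illustration under stated assumptions: the `FiveTuple` type, the CRC32 hash, and the `core_for` helper are hypothetical and do not represent the hash functions implemented on the SmartNIC. The endpoints are sorted before hashing so that both directions of a flow land on the same CPU core.

```python
import zlib
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Hypothetical flow key; not a Napatech data structure."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


def flow_hash(flow: FiveTuple) -> int:
    """Direction-independent hash: order the two endpoints before hashing."""
    first, second = sorted(
        [(flow.src_ip, flow.src_port), (flow.dst_ip, flow.dst_port)]
    )
    key = f"{first[0]}:{first[1]}|{second[0]}:{second[1]}|{flow.protocol}"
    return zlib.crc32(key.encode())  # CRC32 is a stand-in for the real hash


def core_for(flow: FiveTuple, num_cores: int) -> int:
    """Map a flow to one of num_cores CPU cores."""
    return flow_hash(flow) % num_cores


request = FiveTuple("10.0.0.1", "192.168.1.5", 51000, 443, 6)
reply = FiveTuple("192.168.1.5", "10.0.0.1", 443, 51000, 6)
print(core_for(request, 8) == core_for(reply, 8))
```

Keeping both directions of a flow on one core lets the analysis thread hold per-flow state without locking or cross-core communication.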
Hardware Time Stamp
The ability to establish the precise time when frames have been captured is critical to many applications.
To achieve this, all Napatech FPGA SmartNICs are capable of providing a high-precision time stamp, sampled with 1 nanosecond resolution, for every frame captured and transmitted.
At 10 Gbps, a minimum-size (64-byte) Ethernet frame can arrive every 67 nanoseconds. At 100 Gbps, this time is reduced to 6.7 nanoseconds. This makes nanosecond-precision time-stamping essential for uniquely identifying when a frame is received. This precision also enables you to sequence and merge frames from multiple ports on multiple FPGA SmartNICs into a single, time-ordered analysis stream.
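The 67 ns figure follows from the size of a minimum Ethernet frame on the wire: 64 bytes of frame plus 8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap. The short calculation below reproduces it; the function and constant names are illustrative only.

```python
PREAMBLE_AND_SFD_BYTES = 8   # preamble plus start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12    # minimum idle time between frames
MIN_FRAME_BYTES = 64         # minimum Ethernet frame including FCS


def frame_time_ns(frame_bytes: int, link_gbps: float) -> float:
    """Time one frame occupies on the wire, in nanoseconds."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return wire_bits / link_gbps  # bits divided by Gbit/s gives nanoseconds


def max_frames_per_second(frame_bytes: int, link_gbps: float) -> float:
    """Maximum frame rate for back-to-back frames of the given size."""
    return 1e9 / frame_time_ns(frame_bytes, link_gbps)


print(frame_time_ns(MIN_FRAME_BYTES, 10))    # about 67.2 ns
print(frame_time_ns(MIN_FRAME_BYTES, 100))   # about 6.72 ns
print(max_frames_per_second(MIN_FRAME_BYTES, 10))  # roughly 14.88 million
```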
In order to work smoothly in the different operating systems supported, Napatech FPGA SmartNICs support a range of industry standard time stamp formats, and also offer a choice of resolution to suit different types of applications.
64-bit time stamp formats:
- 2 Windows formats with 10-ns or 100-ns resolution
- Native UNIX format with 10-ns resolution
- 2 PCAP formats with 1-ns or 1000-ns resolution
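To show how these resolutions relate, the sketch below converts a nanosecond count into PCAP-style second/fraction pairs and into fixed-resolution tick counts. It assumes the input is nanoseconds since the UNIX epoch and does not handle the epoch offset that a Windows-style format would need. The function names are hypothetical and are not part of the Napatech API.

```python
from typing import Tuple

NS_PER_SECOND = 1_000_000_000


def to_pcap_ns(ts_ns: int) -> Tuple[int, int]:
    """Split into (seconds, nanoseconds), as in nanosecond-resolution PCAP."""
    return divmod(ts_ns, NS_PER_SECOND)


def to_pcap_us(ts_ns: int) -> Tuple[int, int]:
    """Split into (seconds, microseconds), as in classic PCAP."""
    seconds, nanoseconds = divmod(ts_ns, NS_PER_SECOND)
    return seconds, nanoseconds // 1000


def to_ticks(ts_ns: int, resolution_ns: int) -> int:
    """Express the time stamp as a count of ticks of the given resolution."""
    return ts_ns // resolution_ns


ts = 1_700_000_000_123_456_789
print(to_pcap_ns(ts))     # (1700000000, 123456789)
print(to_pcap_us(ts))     # (1700000000, 123456)
print(to_ticks(ts, 10))   # 170000000012345678
```

Coarser formats discard the low-order digits, so two frames that arrive less than one tick apart can receive the same time stamp.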
Optimum Cache Utilization
Napatech FPGA SmartNICs use a buffering strategy that allocates a number of large memory buffers where as many packets as possible are placed back-to-back in each buffer. Using this implementation, only the first access to a packet in the buffer is affected by the access time to external memory. Thanks to cache pre-fetch, the subsequent packets are already in the level 1 cache before the CPU needs them. As hundreds or even thousands of packets can be placed in a buffer, a very high CPU cache performance can be achieved leading to application acceleration.
Buffer configuration can have a dramatic effect on the performance of analysis applications. Different applications have different requirements when it comes to latency or processing. It is therefore extremely important that the number and size of buffers can be optimized for the given application. Napatech FPGA SmartNICs make this possible.
The flexible server buffer structure supported by Napatech FPGA SmartNICs can be optimized for different application requirements. For example, applications needing short latency can have frames delivered in small chunks, optionally with a fixed maximum latency. Applications without latency requirements can benefit from data delivered in large chunks, which allows more efficient server CPU processing. Applications that need to correlate information distributed across packets can configure larger server buffers (up to 128 GB).
Up to 128 buffers can be configured and combined with Napatech multi-CPU distribution (see “Intelligent Multi-CPU distribution”).
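The back-to-back packet layout can be pictured with the sketch below, which walks a large buffer sequentially. The record format used here (a 4-byte little-endian length followed by the packet bytes) is an assumption made for illustration; the actual Napatech packet descriptors are different and are defined by the Napatech Software Suite.

```python
import struct
from typing import Iterable, Iterator

LENGTH_PREFIX = struct.Struct("<I")  # hypothetical 4-byte length field


def pack_buffer(packets: Iterable[bytes]) -> bytes:
    """Place packets back-to-back, each preceded by its length."""
    return b"".join(LENGTH_PREFIX.pack(len(p)) + p for p in packets)


def iter_packets(buffer: bytes) -> Iterator[memoryview]:
    """Yield zero-copy views of each packet, in the order they were stored."""
    view = memoryview(buffer)
    offset = 0
    while offset < len(view):
        (length,) = LENGTH_PREFIX.unpack_from(view, offset)
        offset += LENGTH_PREFIX.size
        yield view[offset:offset + length]
        offset += length


packets = [b"first packet", b"second", b"third packet data"]
buffer = pack_buffer(packets)
for packet in iter_packets(buffer):
    print(len(packet), bytes(packet))
```

Reading the buffer front to back produces a sequential memory access pattern, which is the kind of access that hardware prefetchers handle well.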
On-Board Packet Buffering
Napatech FPGA SmartNICs provide on-board memory for buffering of Ethernet frames. Buffering ensures delivery of data even when there is congestion on the path to the application. There are three potential sources of congestion: the PCI interface, the server platform, and the analysis application.
PCI interfaces provide a fixed bandwidth for transfer of data from the SmartNIC to the application. This limits the amount of data that can be continuously transferred from the network to the application. For example, a 16-lane PCIe Gen3 interface can transfer up to 115 Gbps of data to the application. If the network speed is 2×100 Gbps, a burst of data cannot be transferred over the PCIe Gen3 interface in real time, since the data rate is nearly twice the maximum PCIe bandwidth. In this case, the onboard packet buffering on the Napatech SmartNIC can absorb the burst and ensure that none of the data is lost, allowing the frames to be transferred once the burst has passed.
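As a rough worked example of the burst absorption described above, the duration of a burst that onboard memory can absorb is the buffer size divided by the excess of ingress over egress rate. The 4 GB buffer size in the example is an assumed value chosen for illustration, and the function name is hypothetical.

```python
def burst_absorb_seconds(buffer_bytes: float, ingress_gbps: float,
                         egress_gbps: float) -> float:
    """How long a sustained burst can last before the buffer fills."""
    excess_gbps = ingress_gbps - egress_gbps
    if excess_gbps <= 0:
        return float("inf")  # the host link keeps up; the buffer never fills
    return buffer_bytes * 8 / (excess_gbps * 1e9)


# Assumed 4 GB of onboard memory, 2 x 100 Gbps in, 115 Gbps out over PCIe.
seconds = burst_absorb_seconds(4e9, ingress_gbps=200, egress_gbps=115)
print(f"{seconds * 1000:.0f} ms")  # about 376 ms
```

The calculation ignores descriptor overhead and assumes the whole buffer is available to a single burst, so it is an upper bound rather than a guaranteed figure.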
Servers and applications can be configured in such a way that congestion can occur in the server infrastructure or in the application itself. The CPU cores can be busy processing or retrieving data from remote caches and memory locations, which means that new Ethernet frames cannot be transferred from the SmartNIC.
In addition, the application can be configured with only one or a few processing threads, which can result in the application being overloaded, meaning that new Ethernet frames cannot be transferred. With onboard packet buffering, the Ethernet frames can be delayed until the server or the application is ready to accept them. This ensures that no Ethernet frames are lost and that all the data is made available for analysis when needed.
In mobile networks, all subscriber Internet traffic is carried in GTP (GPRS Tunneling Protocol) or IP-in-IP tunnels between nodes in the mobile core. IP-in-IP tunnels are also used in enterprise networks. Monitoring traffic over interfaces between these nodes is crucial for assuring Quality of Service (QoS).
Napatech FPGA SmartNICs decode these tunnels, providing the ability to correlate and load balance based on flows inside the tunnels. Analysis applications can use this capability to test, secure, and optimize mobile networks and services. To effectively analyze the multiple services associated with each subscriber, it is important to separate them and analyze each one individually. Napatech FPGA SmartNICs have the capability to identify the contents of tunnels, allowing for analysis of each service used by a subscriber. This quickly provides the needed information to the application, and allows for efficient analysis of network and application traffic. The Napatech features for frame classification, flow identification, filtering, coloring, slicing, and intelligent multi-CPU distribution can thus be applied to the contents of the tunnel rather than the tunnel itself, leading to a more balanced processing and a more efficient analysis.
GTP and IP-in-IP tunneling are powerful features for telecom equipment vendors who need to build mobile network monitoring products. With this feature, Napatech can off-load and accelerate data analysis, allowing customers to focus on optimizing the application, and thereby maximizing the processing resources in standard servers.
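To make the idea of looking inside the tunnel concrete, here is a minimal software sketch that strips a GTPv1-U header and reads the 5-tuple of the inner IPv4 packet. It is not the SmartNIC implementation, which does this in hardware. The function names are hypothetical, and the sketch assumes an inner IPv4 packet carrying TCP or UDP.

```python
import ipaddress
import struct
from typing import Tuple

GTPU_G_PDU = 0xFF  # GTP-U message type that carries a user packet


def gtpu_payload(gtp: bytes) -> Tuple[int, bytes]:
    """Return (TEID, inner packet) from a GTPv1-U G-PDU."""
    flags, msg_type, length, teid = struct.unpack_from("!BBHI", gtp, 0)
    if flags >> 5 != 1 or msg_type != GTPU_G_PDU:
        raise ValueError("not a GTPv1-U G-PDU")
    offset = 8  # mandatory header size
    if flags & 0x07:  # E, S or PN flag set: 4 optional bytes follow
        next_ext = gtp[offset + 3]
        offset += 4
        while flags & 0x04 and next_ext:  # walk the extension headers
            ext_len = gtp[offset] * 4  # length is given in 4-octet units
            if ext_len == 0:
                raise ValueError("malformed extension header")
            next_ext = gtp[offset + ext_len - 1]
            offset += ext_len
    return teid, gtp[offset:8 + length]


def inner_five_tuple(ip: bytes) -> Tuple[str, str, int, int, int]:
    """Extract (src, dst, sport, dport, protocol) from an IPv4 TCP/UDP packet."""
    if ip[0] >> 4 != 4:
        raise ValueError("this sketch handles IPv4 only")
    header_len = (ip[0] & 0x0F) * 4
    src = str(ipaddress.IPv4Address(ip[12:16]))
    dst = str(ipaddress.IPv4Address(ip[16:20]))
    sport, dport = struct.unpack_from("!HH", ip, header_len)
    return src, dst, sport, dport, ip[9]


# Build a test packet: IPv4/UDP 10.0.0.1:1234 -> 10.0.0.2:80 inside GTP-U.
inner = (bytes([0x45, 0, 0, 28, 0, 0, 0, 0, 64, 17, 0, 0])
         + ipaddress.IPv4Address("10.0.0.1").packed
         + ipaddress.IPv4Address("10.0.0.2").packed
         + struct.pack("!HHHH", 1234, 80, 8, 0))
gtp = struct.pack("!BBHI", 0x30, GTPU_G_PDU, len(inner), 0xABCD) + inner

teid, payload = gtpu_payload(gtp)
print(hex(teid), inner_five_tuple(payload))
```

Hashing the inner 5-tuple instead of the outer tunnel endpoints spreads subscriber flows across cores, whereas hashing the outer header would send all traffic between two core nodes to the same core.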
IP fragment handling
In-line application support
The Napatech SmartNIC family supports 40 Gbps in-line applications enabling customers to create powerful, yet flexible in-line solutions on standard servers. The more CPU-demanding the application is, and the higher the speeds of links, the higher the value of this solution. Features include:
- Full throughput bidirectional Rx/Tx up to 40G link speed for any packet size
- Multi-core processing support with up to 128 Rx/Tx streams per accelerator
- Customizable hash-based load distribution
- Efficient zero copy roundtrip from Rx to Tx
- Single bit flip selection to discard or forward each individual packet
- Typical 50 μs roundtrip latency from Rx to Tx fiber
For network security purposes, different traffic scenarios need to be recreated and simulated to harden the infrastructure. Packets also need to be replayed to understand the delays and disruptions caused by traffic bursts and peaks, in order to improve Quality of Service (QoS). With Napatech FPGA SmartNICs, it is easy to set up and specify a test scenario that replays PCAP files from real network events at 10G, 40G and 100G link speeds.
Access control and authentication solutions can now run at full line rate, even with small packets, using a SmartNIC that delivers packets robustly at high network loads. Session control forwards traffic in and out of the SmartNIC at low latency (< 5 μs) while simultaneously copying a subset to the host CPU for analysis. With the session control feature, in-line use cases can benefit from low latency at link speeds from 1G to 100G.
Get the highest-precision time stamping for traffic that needs to be redistributed to multiple network devices. Napatech FPGA SmartNIC systems can forward and/or split traffic captured at a single tapping point to a cluster of servers for processing, without additional equipment. The SmartNICs act as both smart taps and packet capture devices, which suits multi-box solutions with a single tapping point. This feature eliminates the need for expensive smart taps, time-stamping switches, packet brokers and other time sync components.
CPU socket load balancer
Further enhance your CPU utilization with the CPU Socket Load Balancer capability offered by Napatech NT40E3 FPGA SmartNICs. For 4x10G analysis, CPU performance can improve by up to 30% per server: the SmartNIC distributes traffic efficiently to 2 CPU sockets, making the packets available to multiple analysis threads on both sockets simultaneously. This frees up CPU resources that would otherwise be needed for copying data between the two sockets and eliminates the need for expensive QPI bus transfers.
Link™ NT40E3-4-PTP SmartNIC
Link™ NT40E3-4-PTP-NEBS SmartNIC
Napatech Software Suite
Napatech Software Suite provides a well-defined application programming interface as well as support for the well-known, open-source interface libpcap and the Windows variant called WinPcap. This allows programmers to quickly integrate Napatech FPGA SmartNICs for network monitoring and security applications into their system.
A common API is provided for all Napatech FPGA SmartNICs allowing plug-and-play operation. An intuitive, easy-to-learn, yet powerful programming language is also provided to allow dynamic, on-the-fly configuration of filtering and intelligent multi-CPU distribution on Napatech FPGA SmartNICs.
Used across industries
Financial latency measurement
Our solutions deliver data to applications that make delays visible by capturing all transactions and measuring the exact time of each trading event up to the nanosecond. This enables financial institutions to guarantee optimal performance and transparency of their trading infrastructure.
Network performance management
Our solutions deliver data to applications that monitor and troubleshoot all network activity in real time, enabling analysis of network performance metrics from multiple locations in the network. This helps network managers to optimize infrastructure efficiency.
Troubleshooting and compliance
Our solutions deliver data to applications that provide access to all information that has passed through the network in the order it was received. This allows network managers to comply with regulations, as well as analyze problems from historical data. It also allows them to take actions that will prevent problems from recurring in the future.
Revenue and services optimization
Our solutions deliver data to applications that can analyze subscriber behavior as well as specific app usage, enabling operators to adjust their services and business models to maximize value.
Ultimate tech specs
|TECH SPECS|Link™ NT40E3-4-PTP & Link™ NT40E3-4-PTP-NEBS|
|---|---|
|Network Interfaces|• Standard: IEEE 802.3 10 Gbps Ethernet LAN • Physical interface: 4 x SFP+ ports|
|Supported Modules|• Supported SFP modules: Multi-mode SX, single-mode LX and ZX, 1000BASE-T or 10/100/1000BASE-T • Supported SFP+ modules: Multi-mode SR, single-mode LR and ER, 10GBASE-CR • Supported dual-rate modules: Multi-mode SR and single-mode LR|
|Performance|• Capture rate: From 4 x 1 Gbps to 4 x 10 Gbps, depending on the transceiver module used • Transmit rate: From 4 x 1 Gbps to 4 x 10 Gbps, depending on the transceiver module used • CPU load: < 5%|
|On-board IEEE 1588-2008 (PTP v2)|• Full IEEE 1588-2008 stack • Packet Delay Variation (PDV) filter • PTP master and slave in IEEE 1588-2008 default profile • PTP slave in IEEE 1588-2008 telecom and power profiles|
|Hardware Time Stamp|• Resolution: 1 ns • Stratum 3 compliant TCXO|
|Time Formats|• PCAP-ns/-μs • NDIS 10 ns/100 ns • UNIX 10 ns|
|Time Synchronization|• External connectors: Dedicated pluggable • Internal connectors: 2 for daisy-chain support|
|Pluggable Options for Time Synchronization|• PPS for GPS and CDMA • IEEE 1588-2008 (PTP v2) • NT-TS for accelerator-to-accelerator time sync|
|Host Interface and Memory|• Bus type: 8-lane 8 GT/s PCIe Gen3 • PCIe performance: 48 Gbps full duplex • Onboard RAM: 4 GB DDR3 • Flash: Supports 2 boot images|
|Statistics|• RMON1 counters plus jumbo frame counters per port • Frame and byte counters per color (filter) and per host buffer • Counter sets always delivered as a consistent time-stamped snapshot|
|Environment for NT40E3-4-PTP|• Power consumption: 27 Watts including SFP+ SR modules • Operating temperature: 0 °C to 45 °C (32 °F to 113 °F) • Operating humidity: 20% to 80% • MTBF: 297,993 hours according to UTE C 80-810|
|Environment for NT40E3-4-PTP-NEBS|• Operating temperature (up to 1,800 m and airflow of at least 2.5 m/s): -5 °C to 55 °C (23 °F to 131 °F) measured around the SmartNIC • Operating humidity: 5% to 85%|
|OS Support|• Linux|
|Software|• Easy-to-integrate NT-API • libpcap support • WinPcap support • Software PTP stack|
|Physical Dimensions|• ½-length PCIe • Full-height PCIe|
|Regulatory Approvals and Compliances|• PCI-SIG® • NEBS level 3 • cURus (UL)|