1. Audience and Purpose
The primary audience for this test report is network architects and engineers implementing Suricata, an open source intrusion detection (IDS), inline intrusion prevention (IPS), and network security monitoring (NSM) solution. This report provides packet processing performance test results for the specified Suricata release on the Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA with Napatech Link™ Capture software.
The purpose of reporting these tests is not to imply a single "correct" approach, but rather to provide a baseline configuration and setup procedure with reproducible results. This will help guide architects and engineers who are evaluating and implementing IDS/IPS solutions and can assist in achieving optimal system performance.
2. System Specifications
The device under test (DUT) consists of a standard Intel® architecture COTS server populated with the following:
- single or dual processor
- DRAM memory
- Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA network interface card
- Napatech Link™ Capture Software v11.1.4
Connected to the DUT is a traffic generator used to send traffic at a constant controlled rate to the DUT. In this test, Suricata is used in intrusion detection mode; traffic is sent one-way and throughput is measured using statistics reported by the application.
The traffic source can be another COTS server running a traffic generator application, or a dedicated hardware test set (e.g. Ixia).
2.1. DUT System Specifications
Server | Dell PowerEdge R740
---|---
CPU | 2x Intel® Xeon® Gold 6138 CPU @ 2.0 GHz, 20 Cores, 40 Threads
Memory | 128GB RAM (16 x 8GB RDIMM, 2666MT/s)
PCIe | PCIe Gen3 x8 slot
NIC | Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA (1x40 GE)
NIC | Intel® Ethernet Network Adapter XL710-QDA2 (2x40 GE)
Optical Transceiver | 40GBASE-SR4 QSFP+ 850nm 150m MTP/MPO Optical Transceiver Module
Napatech Link™ Capture software | v11.1.4 Linux
Operating System | CentOS Linux release 7
Linux kernel version | 3.10.0-957.1.3.el7.x86_64
Suricata version | 4.0.5
2.2. Traffic Generator System Specifications
Server | Dell PowerEdge R740
---|---
CPU | Intel® Xeon® Gold 5120 CPU @ 2.2 GHz, 14 Cores, 28 Threads
Memory | 64GB RAM (8 x 8GB RDIMM, 2666MT/s)
PCIe | PCIe Gen3 x8 slot
NIC | Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA
Optical Transceiver | 40GBASE-SR4 QSFP+ 850nm 150m MTP/MPO Optical Transceiver Module
Napatech Link™ Capture software | v11.1.4 Linux
Operating System | CentOS Linux release 7
Linux kernel version | 3.10.0-957.1.3.el7.x86_64
3. Configuration and Setup
3.1. Suricata Installation and Configuration
Suricata is a free and open source, mature, fast and robust network threat detection engine.
Documentation for users and developers can be found at https://suricata-ids.org/docs/.
Suricata source code is available for download from https://suricata-ids.org/download/.
Detailed instructions for installing and configuring Suricata for use with Napatech Link™ Capture Software are provided in the Quick Guide (DN-1113 Napatech Software for Intel PAC Arria 10 GX.pdf), included with the Napatech software distribution.
To build and run Suricata with Napatech support, the software must be compiled from source. Make sure that all dependencies listed in the installation guide are installed before attempting to build Suricata.
After downloading and extracting the Suricata tarball, run `configure` to enable Napatech support, then compile and install:

```shell
./configure --enable-napatech --with-napatech-includes=/opt/napatech3/include \
    --with-napatech-libraries=/opt/napatech3/lib
make
sudo make install-full
sudo ldconfig
```

This installs Suricata into `/usr/local/bin/`, uses the default configuration in `/usr/local/etc/suricata/`, and writes log files to `/usr/local/var/log/suricata`. Rules are installed in `/usr/local/etc/suricata/rules/`.
3.1.1. Suricata Configuration
Suricata configuration is managed via a plain text file with YAML syntax. The default `suricata.yaml` file must be modified to enable Suricata to use the PAC interface and to tune for performance. Edit the file `/usr/local/etc/suricata/suricata.yaml` as follows.
Thread affinity settings
CPU thread affinity settings are configured in the YAML file under the `threading:` section. Modify the CPU affinity of Suricata threads by binding different thread groups to specific CPUs.
```yaml
# Suricata is multi-threaded. Here the threading can be influenced.
threading:
  set-cpu-affinity: yes                                           (1)
  # Tune cpu affinity of threads. Each family of threads can be bound
  # on specific CPUs.
  #
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0, 2 ]  # include only these cpus in affinity settings (2)
    - receive-cpu-set:
        cpu: [ 4 ]     # include only these cpus in affinity settings (2)
    - worker-cpu-set:
        cpu: [ 1, 3, "5-79" ]                                     (3)
        mode: "exclusive"
        # Use explicitly 3 threads and don't compute number by using
        # detect-thread-ratio variable:
        # threads: 3
        prio:
          #low: [ 0 ]
          medium: [ 0, 2, 4 ]
          high: [ 1, 3, "5-79" ]                                  (4)
          default: "medium"
    #- verdict-cpu-set:
    #    cpu: [ 0 ]
    #    prio:
```
1. `set-cpu-affinity: yes`
2. Management and receive threads are for general housekeeping and should be assigned to specific cores.
3. Assign worker threads to CPU cores, mode exclusive.
4. Set thread priority per CPU core.
All 80 compute threads are made available to Suricata with this configuration.
Napatech-specific settings
```yaml
napatech:
  # The Host Buffer Allowance for all streams
  # (-1 = OFF, 1 - 100 = percentage of the host buffer that can be held back)
  # This may be enabled when sharing streams with another application.
  # Otherwise, it should be turned off.
  hba: -1
  # use_all_streams set to "yes" will query the Napatech service for all configured
  # streams and listen on all of them. When set to "no" the streams config array
  # will be used.
  use-all-streams: no    (1)
  zero-copy: yes         (2)
  # The streams to listen on. This can be either:
  #   a list of individual streams (e.g. streams: [0,1,2,3])
  # or
  #   a range of streams (e.g. streams: ["0-3"])
  streams: ["0-63"]      (3)
```
1. Use specific streams.
2. Enable zero-copy.
3. Use 64 receive streams.
Other settings
Configure the `af-packet` interface for a standard NIC.
```yaml
# Linux high speed capture support
af-packet:
  - interface: p1p1    (1)
    threads: auto
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: no
    rollover: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
```
1. Interface name may vary.
Increase the `max-pending-packets` setting found in the Advanced settings section.
```yaml
# Number of packets preallocated per thread. The default is 1024. A higher number
# will make sure each CPU will be more easily kept busy, but may negatively
# impact caching.
max-pending-packets: 65000
```
3.1.2. Suricata Rule set Configuration
Signatures play a very important role in Suricata. The public rule sets most commonly used are Emerging Threats, Emerging Threats Pro and Snort/Sourcefire VRT.
Full Rule Set
The "full rule set" used in this testing, the Emerging Threats rule set, is installed and enabled by the installation procedure above. It can also be installed manually with the following command:

```shell
/usr/bin/wget -qO - https://rules.emergingthreats.net/open/suricata-4.0/emerging.rules.tar.gz \
    | tar -x -z -C "/usr/local/etc/suricata/" -f -
```
Minimal Rule Set
For "minimal rule set" testing, comment out all the rules in the `rule-files` section of the Suricata configuration file and add a rule file containing a single rule. The single rule is specified as:

```
pass ip any any -> any any
```
3.2. Napatech Link™ Capture Software Configuration
Edit the Napatech ntservice configuration file (`/opt/napatech3/config/ntservice.ini`) to create a receive host buffer for each of the 64 Napatech streams specified in the Suricata configuration file above.

```ini
HostBuffersRx = [32,16,0],[32,16,1]   # [number, size (MB), NUMA node]
```
Stop and restart ntservice after making changes to `ntservice.ini`:

```shell
sudo /opt/napatech3/bin/ntstop.sh
sudo /opt/napatech3/bin/ntstart.sh
```
3.2.1. Create streams and configure load distribution
Complete the DUT setup using NTPL (Napatech Programming Language) commands to create 64 streams and to distribute ingress traffic based on a 5-tuple hash.
Create a text file containing the following NTPL commands and save it to a file named `suricata.ntpl`:

```
Delete=All   # Delete any existing filters
Setup[numaNode=0] = streamid==(0..31)
Setup[numaNode=1] = streamid==(32..63)
HashMode[priority=4]=Hash5TupleSorted
Assign[priority=0; streamid=(0..63)]= all
```
Run the following command to execute the NTPL commands using the `ntpl` tool:

```shell
sudo /opt/napatech3/bin/ntpl -f suricata.ntpl
```
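The `Hash5TupleSorted` mode hashes a sorted 5-tuple, so both directions of a TCP/UDP flow map to the same stream and each Suricata worker sees complete flows. A minimal Python sketch of the idea follows; the adapter computes its hash in hardware, and `stream_for_flow` with its SHA-256 stand-in is illustrative only:

```python
import hashlib

def stream_for_flow(src_ip, src_port, dst_ip, dst_port, proto, num_streams=64):
    # Sort the two (ip, port) endpoints so that A->B and B->A produce
    # the same key, mirroring what a sorted 5-tuple hash achieves.
    a, b = sorted([(src_ip, src_port), (dst_ip, dst_port)])
    key = f"{a}|{b}|{proto}".encode()
    digest = hashlib.sha256(key).digest()
    # Map the hash onto one of the 64 configured streams.
    return int.from_bytes(digest[:4], "big") % num_streams

# Both directions of the same flow land on the same stream id.
fwd = stream_for_flow("10.0.0.1", 49152, "10.0.0.2", 80, 6)
rev = stream_for_flow("10.0.0.2", 80, "10.0.0.1", 49152, 6)
assert fwd == rev and 0 <= fwd < 64
```

The NUMA-aware `Setup` commands then pin streams 0-31 to node 0 and 32-63 to node 1, keeping each host buffer local to the workers that read it.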
3.3. Traffic Generator Configuration
The traffic generator for this test is a standard COTS server with an Intel PAC NIC and Napatech Link™ Capture software.
For Suricata and most other IDS applications, traffic patterns have a significant effect on performance. For this reason, a capture of live network traffic is replayed to obtain a realistic traffic load.
The Napatech `pktgen` tool is used to replay a PCAP file at controlled constant rates.
The PCAP is composed of more than 125K IPv4 flows with a combination of malicious and background traffic from dozens of common applications. Average packet size is 486 bytes.
Edit the Napatech ntservice configuration file (`/opt/napatech3/config/ntservice.ini`) to increase the transmit buffer size.

```ini
HostBuffersRx = [4,16,-1]     # [x1, x2, x3], ...
HostBuffersTx = [4,2048,-1]   # [number, size (MB), NUMA node]
```
Stop and restart ntservice after making changes to `ntservice.ini`:

```shell
sudo /opt/napatech3/bin/ntstop.sh
sudo /opt/napatech3/bin/ntstart.sh
```
4. Suricata Throughput Test Methodology
```
+-----------+         +------------+
|           |         |            |
|  traffic  |         |            |
|           |-------->|    DUT     |
| generator |         |            |
|           |         |            |
+-----------+         +------------+
```
4.1. Maximum Throughput Test
Procedure: Send a specific number of frames at a specific constant rate to the DUT and count the frames that are received and processed by the Suricata application.
Suricata throughput is then calculated as the source rate scaled by the fraction of sent frames that Suricata processed:

    Suricata throughput = source rate × (frames processed / frames sent)

The first trial is run with a source traffic rate equal to 100% of the link rate, and the test is repeated over a range of decreasing source rates. Throughput is calculated as above for each source rate. Maximum Suricata throughput is the highest calculated throughput value, disregarding any packet loss.
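The rate sweep can be sketched in a few lines of Python; the trial figures below are hypothetical, not measured values from this report:

```python
def suricata_throughput(offered_gbps, frames_sent, frames_processed):
    # Offered rate scaled by the fraction of frames the Suricata
    # decoder actually processed during the trial.
    return offered_gbps * frames_processed / frames_sent

# Hypothetical sweep: (offered load in Gbps, frames sent, frames processed)
trials = [
    (40.0, 1_000_000_000, 675_000_000),
    (30.0, 1_000_000_000, 883_000_000),
    (25.0, 1_000_000_000, 996_000_000),
    (20.0, 1_000_000_000, 1_000_000_000),
]
max_throughput = max(suricata_throughput(r, s, p) for r, s, p in trials)
# Here the 40 Gbps trial wins (27.0 Gbps) despite its packet loss.
```

Note that a lossy high-rate trial can still yield the maximum throughput, which is why packet loss is disregarded in this test.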
4.2. RFC 2544 Zero Packet Loss Test
Procedure: Send a specific number of frames at a specific constant rate to the DUT and count the frames that are received and processed by the Suricata application.
If the received frame count is less than the sent frame count, the source rate is reduced and the test is rerun. The lossless throughput is the highest rate at which the count of frames received by the DUT is equal to the number of frames sent by the traffic source.
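The rate-reduction search can be sketched as a binary search over offered rates; `run_trial` is a hypothetical callback standing in for one pktgen run plus a counter read:

```python
def rfc2544_zero_loss_rate(run_trial, link_rate_gbps, resolution_gbps=0.1):
    # Binary-search the highest offered rate at which no frames are lost.
    # run_trial(rate) returns (frames_sent, frames_received) for one trial.
    lo, hi, best = 0.0, link_rate_gbps, 0.0
    while hi - lo > resolution_gbps:
        rate = (lo + hi) / 2
        sent, received = run_trial(rate)
        if received == sent:      # lossless at this rate: search higher
            best, lo = rate, rate
        else:                     # loss observed: back off
            hi = rate
    return best

# Toy stand-in for the DUT that is lossless up to 22 Gbps.
def fake_trial(rate):
    sent = 1_000_000
    return sent, sent if rate <= 22.0 else sent - 1
```

With the toy DUT, `rfc2544_zero_loss_rate(fake_trial, 40.0)` converges just below 22 Gbps; RFC 2544 itself leaves the search strategy open, and a simple stepped rate reduction works equally well.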
5. Detailed Test Procedure
5.1. Start Suricata On DUT
To start Suricata with Napatech Link™ Capture software:
Make sure ntservice has been started (`sudo /opt/napatech3/bin/ntstart.sh`), then start Suricata:
```shell
sudo suricata -c /usr/local/etc/suricata/suricata.yaml --napatech \
    --runmode workers --init-errors-fatal -vv
```
To start Suricata on a standard NIC:

```shell
sudo suricata -c /usr/local/etc/suricata/suricata.yaml --runmode workers \
    --init-errors-fatal --af-packet=<if>
```
5.2. Run pktgen on traffic generator host
Start the pktgen traffic generator:

```shell
/opt/napatech3/bin/pktgen -p 0 -n $REPLAYCOUNT -r ${PORTRATE}G -f suricata.pcap
```
5.3. Terminate Suricata, read statistics
The Suricata `stats.log` records performance statistics at a fixed interval, every 8 seconds by default.
When the pktgen run completes, terminate Suricata with a SIGINT signal (`^C`).
5.3.1. Read Suricata Counters
Read Suricata counters from `/usr/local/var/log/suricata/stats.log`.
Typical output from Suricata on a standard NIC:

```
------------------------------------------------------------------------------------
Date: 8/31/2018 -- 13:35:53 (uptime: 0d, 00h 06m 16s)
------------------------------------------------------------------------------------
Counter                   | TM Name | Value
------------------------------------------------------------------------------------
capture.kernel_packets    | Total   | 902442616
capture.kernel_drops      | Total   | 116043762
decoder.pkts              | Total   | 786398843
decoder.bytes             | Total   | 380058530984
decoder.ipv4              | Total   | 786384760
decoder.ipv6              | Total   | 10
decoder.ethernet          | Total   | 786398843
decoder.tcp               | Total   | 715915981
decoder.udp               | Total   | 70468779
decoder.icmpv6            | Total   | 10
decoder.avg_pkt_size      | Total   | 483
decoder.max_pkt_size      | Total   | 1514
```
Typical output from Suricata on PAC A10 NIC with Napatech Link™ Capture software:
```
------------------------------------------------------------------------------------
Date: 8/30/2018 -- 22:24:10 (uptime: 0d, 00h 04m 41s)
------------------------------------------------------------------------------------
Counter                   | TM Name | Value
------------------------------------------------------------------------------------
nt0.pkts                  | Total   | 16667025
nt0.bytes                 | Total   | 7548238800
nt1.pkts                  | Total   | 19253339
nt1.bytes                 | Total   | 9381782396
[...]
nt63.pkts                 | Total   | 20265827
nt63.bytes                | Total   | 9967031528
decoder.pkts              | Total   | 1253395200    (1)
decoder.bytes             | Total   | 614155305000
decoder.ipv4              | Total   | 1253380800
decoder.ethernet          | Total   | 1253395200
decoder.tcp               | Total   | 1142593500
decoder.udp               | Total   | 110787300
decoder.avg_pkt_size      | Total   | 489
decoder.max_pkt_size      | Total   | 1518
[...]
flow.memuse               | Total   | 54546496
```
1. Record the Suricata `decoder.pkts` count; this represents the number of packets received and processed by Suricata's "decoder" module.

decoder.pkts count = 1148945600
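Because `stats.log` appends a full counter dump every interval, the value to record is the last `decoder.pkts` occurrence in the file. A small parsing sketch (the sample text is an abridged stand-in for real log content):

```python
import re

def last_decoder_pkts(stats_log_text):
    # stats.log repeats the counter table every interval; the final
    # occurrence of decoder.pkts holds the totals at shutdown.
    matches = re.findall(r"decoder\.pkts\s*\|\s*\S+\s*\|\s*(\d+)", stats_log_text)
    return int(matches[-1]) if matches else None

sample = """\
decoder.pkts              | Total   | 626697600
decoder.bytes             | Total   | 307077652500
decoder.pkts              | Total   | 1253395200
"""
print(last_decoder_pkts(sample))  # 1253395200
```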
5.3.2. Read Transmit Packet Count
Read the pktgen "Sent packets" count from the pktgen terminal output:
```
napatech@pit5:~$ /opt/napatech3/bin/pktgen -p 0 -n 275 -r 40G -f ~/suricata.pcap
pktgen (v. 3.8.1.46-c6583)
==============================================================================
Inspecting file: /home/napatech/suricata.pcap: 2026.38 MB so far
Allocated host buffer size: 2045 MB
Requires host buffer with 2045 MB bytes (needed raw size is 2124809480 B)
Using host buffer with 2045 MB bytes (claimed raw size is 2143289344 B)
Sent 1148945600 packets so far.
Sent 1148945600 packets in total onto port 0    (1)
```
1. Record the pktgen "Sent packets" count; this represents the number of packets sourced by pktgen.

Sent packets count = 1148945600
5.3.3. Calculate Suricata Throughput
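Using the counts recorded above, the calculation reduces to the formula from Section 4.1; here the `decoder.pkts` count equals the sent count, so Suricata kept up with the full offered load:

```python
sent_packets = 1_148_945_600   # pktgen "Sent packets" count
decoder_pkts = 1_148_945_600   # Suricata decoder.pkts count
offered_gbps = 40.0            # pktgen -r 40G

# Suricata throughput = source rate x (frames processed / frames sent)
throughput_gbps = offered_gbps * decoder_pkts / sent_packets
print(throughput_gbps)  # 40.0
```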
6. Typical Results
Typical results for Intel® PAC A10 GX NIC with Napatech Link™ Capture software
Load (Gbps) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40
---|---|---|---|---|---|---|---|---
Std NIC | 5.0 | 9.9 | 14.4 | 17.3 | 18.3 | 18.4 | 18.8 | 18.8
PAC | 5.0 | 10.0 | 15.0 | 20.0 | 24.9 | 26.5 | 26.5 | 27.0
Load (Gbps) | 5 | 10 | 15 | 20 | 25 | 30 | 35 | 40
---|---|---|---|---|---|---|---|---
Std NIC | 5.0 | 10.0 | 15.0 | 19.0 | 23.8 | 26.4 | 28.8 | 28.8
PAC | 5.0 | 10.0 | 15.0 | 20.0 | 25.0 | 30.0 | 35.0 | 39.9
  | 40 thread | 80 thread
---|---|---
Std NIC | 6 Gbps | 15 Gbps
PAC | 22 Gbps | 39 Gbps