1. Audience and Purpose

The primary audience for this test report is network architects and engineers implementing the Suricata open source intrusion detection (IDS), inline intrusion prevention (IPS), and network security monitoring (NSM) solution. This report presents packet processing performance test results for the specified Suricata release on the Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA running Napatech Link™ Capture software.

The purpose of reporting these tests is not to imply a single "correct" approach, but rather to provide a baseline configuration and setup procedure with reproducible results. This will help guide architects and engineers who are evaluating and implementing IDS/IPS solutions and can assist in achieving optimal system performance.

2. System Specifications

The device under test (DUT) consists of a standard Intel® architecture COTS server populated with the following:

  • single or dual processor

  • DRAM memory

  • Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA network interface card

  • Napatech Link™ Capture Software v11.1.4

Connected to the DUT is a traffic generator used to send traffic at a constant controlled rate to the DUT. In this test, Suricata is used in intrusion detection mode; traffic is sent one-way and throughput is measured using statistics reported by the application.

The traffic source can be another COTS server running a traffic generator application, or a dedicated hardware test set (e.g. Ixia).

2.1. DUT System Specifications

Table 1. DUT Hardware and Software

Server                            Dell PowerEdge R740
CPU                               2x Intel® Xeon® Gold 6138 CPU @ 2.0 GHz, 20 Cores, 40 Threads
Memory                            128GB RAM (16 x 8GB RDIMM, 2666MT/s)
PCIe                              PCIe Gen3 x8 slot
NIC                               Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA (1x40 GE)
NIC                               Intel® Ethernet Network Adapter XL710-QDA2 (2x40 GE)
Optical Transceiver               40GBASE-SR4 QSFP+ 850nm 150m MTP/MPO Optical Transceiver Module
Napatech Link™ Capture software   v11.1.4 Linux
Operating System                  CentOS Linux release 7
Linux kernel version              3.10.0-957.1.3.el7.x86_64
Suricata version                  4.0.5

2.2. Traffic Generator System Specifications

Table 2. Traffic Generator Hardware and Software

Server                            Dell PowerEdge R740
CPU                               Intel® Xeon® Gold 5120 CPU @ 2.2 GHz, 14 Cores, 28 Threads
Memory                            64GB RAM (8 x 8GB RDIMM, 2666MT/s)
PCIe                              PCIe Gen3 x8 slot
NIC                               Intel® Programmable Acceleration Card with Intel Arria® 10 GX FPGA
Optical Transceiver               40GBASE-SR4 QSFP+ 850nm 150m MTP/MPO Optical Transceiver Module
Napatech Link™ Capture software   v11.1.4 Linux
Operating System                  CentOS Linux release 7
Linux kernel version              3.10.0-957.1.3.el7.x86_64

3. Configuration and Setup

3.1. Suricata Installation and Configuration

Suricata is a free and open source, mature, fast and robust network threat detection engine.

Documentation for users and developers can be found at https://suricata-ids.org/docs/.

Suricata source code is available for download from https://suricata-ids.org/download/.

Detailed instructions for installing and configuring Suricata for use with Napatech Link™ Capture Software are provided in the Quick Guide (DN-1113 Napatech Software for Intel PAC Arria 10 GX.pdf), included with the Napatech software distribution.

To build and run Suricata with Napatech support, the software must be compiled from source. Make sure that all dependencies listed in the installation guide are installed before attempting to build Suricata.
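
The exact dependency list is given in the installation guide; on a CentOS 7 host, a representative set of build packages can be installed as follows (the package names shown here are illustrative):

sudo yum install -y gcc make autoconf automake libtool pkgconfig \
     libpcap-devel pcre-devel libyaml-devel jansson-devel \
     zlib-devel file-devel nss-devel libcap-ng-devel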

After downloading and extracting the Suricata tarball, run configure to enable Napatech support and prepare for compilation:

./configure --enable-napatech --with-napatech-includes=/opt/napatech3/include \
            --with-napatech-libraries=/opt/napatech3/lib
make
sudo make install-full
sudo ldconfig

This installs the Suricata binary in /usr/local/bin/, the default configuration in /usr/local/etc/suricata/, and writes output log files to /usr/local/var/log/suricata/. Rules are installed in /usr/local/etc/suricata/rules/.
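
After installation, confirm that the binary was built with Napatech support by inspecting the build information (the exact wording of the output varies between Suricata versions):

suricata --build-info | grep -i napatech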

3.1.1. Suricata Configuration

Suricata configuration is managed via a plain text file with YAML syntax. The default suricata.yaml file must be modified to enable Suricata to use the PAC interface and to tune for performance.

Edit the file /usr/local/etc/suricata/suricata.yaml as follows.

Thread affinity settings

CPU thread affinity settings are configured in the YAML file under the threading: section. Modify CPU affinity of Suricata threads by binding different thread groups to specific CPUs.

# Suricata is multi-threaded. Here the threading can be influenced.
threading:
  set-cpu-affinity: yes (1)
  # Tune cpu affinity of threads. Each family of threads can be bound
  # on specific CPUs.
  #
  cpu-affinity:
    - management-cpu-set:
        cpu: [ 0, 2 ]  # include only these cpus in affinity settings (2)
    - receive-cpu-set:
        cpu: [ 4 ]  # include only these cpus in affinity settings (2)
    - worker-cpu-set:
        cpu: [ 1, 3, "5-79" ] (3)
        mode: "exclusive"
        # Use explicitely 3 threads and don't compute number by using
        # detect-thread-ratio variable:
        # threads: 3
        prio:
          #low: [ 0 ]
          medium: [ 0, 2, 4 ]
          high: [ 1, 3, "5-79" ] (4)
          default: "medium"
    #- verdict-cpu-set:
    #    cpu: [ 0 ]
    #    prio:
1 Enable CPU thread affinity (set-cpu-affinity: yes)
2 Management and receive threads are for general housekeeping and should be assigned to specific cores
3 Assign worker threads to CPU cores, mode exclusive
4 Set thread priority per CPU core

All 80 compute threads are made available to Suricata with this configuration.
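
Before applying these settings, confirm that the DUT actually exposes 80 logical CPUs (2 sockets x 20 cores x 2 hyper-threads), for example:

nproc                                           # expect 80 on this DUT
lscpu | grep -E 'Socket|Core|Thread|^CPU\(s\)'  # sockets, cores, and threads per core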

Napatech-specific settings

napatech:
    # The Host Buffer Allowance for all streams
    # (-1 = OFF, 1 - 100 = percentage of the host buffer that can be held back)
    # This may be enabled when sharing streams with another application.
    # Otherwise, it should be turned off.
    hba: -1

    # use_all_streams set to "yes" will query the Napatech service for all configured
    # streams and listen on all of them. When set to "no" the streams config array
    # will be used.
    use-all-streams: no  (1)
    zero-copy: yes       (2)

    # The streams to listen on.  This can be either:
    #   a list of individual streams (e.g. streams: [0,1,2,3])
    # or
    #   a range of streams (e.g. streams: ["0-3"])
    streams: ["0-63"]    (3)
1 Use specific streams
2 Enable zero-copy
3 Use 64 receive streams

Other settings

Configure af-packet interface for a standard NIC.

# Linux high speed capture support
af-packet:
  - interface: p1p1  (1)
    threads: auto
    cluster-id: 99
    cluster-type: cluster_flow
    defrag: no
    rollover: yes
    use-mmap: yes
    mmap-locked: yes
    tpacket-v3: yes
1 Interface name may vary
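
The interface name used above (p1p1) is system dependent; list the available interfaces and confirm the bound driver before editing the configuration, for example:

ip link show             # list available network interfaces
ethtool -i p1p1          # show the driver bound to a given interface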

Increase the max-pending-packets setting found in the Advanced settings section.

# Number of packets preallocated per thread. The default is 1024. A higher number
# will make sure each CPU will be more easily kept busy, but may negatively
# impact caching.
max-pending-packets: 65000

3.1.2. Suricata Rule set Configuration

Signatures (rules) play a very important role in Suricata. The most commonly used public rule sets are Emerging Threats, Emerging Threats Pro, and Snort/Sourcefire VRT.

Full Rule Set

The "full rule set" used in this testing, the Emerging Threats rule set, is installed and enabled by the installation procedure above. It can be installed manually by the following command:

/usr/bin/wget -qO - https://rules.emergingthreats.net/open/suricata-4.0/emerging.rules.tar.gz | tar -x -z -C "/usr/local/etc/suricata/" -f -

Minimal Rule Set

For "minimal rule set" testing, comment out all the rules in the rule-files section of the suricata configuration file and add a rule file containing a single rule. The single rule is specified as:

pass ip any any -> any any
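
For example, the single rule can be written to its own rule file (the file name pass.rules is arbitrary) and that file listed as the only entry under rule-files in suricata.yaml:

echo 'pass ip any any -> any any' | sudo tee /usr/local/etc/suricata/rules/pass.rules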

3.2. Napatech Link™ Capture Software Configuration

Edit the Napatech ntservice configuration file (/opt/napatech3/config/ntservice.ini) to create a receive host buffer for each of the 64 Napatech streams specified in the Suricata configuration file above.

HostBuffersRx = [32,16,0],[32,16,1]      # 32 x 16 MB receive buffers on NUMA node 0 and 32 on NUMA node 1 (64 total)

Stop and restart ntservice after making changes to ntservice.ini:

sudo /opt/napatech3/bin/ntstop.sh
sudo /opt/napatech3/bin/ntstart.sh

3.2.1. Create streams and configure load distribution

Complete the DUT setup using NTPL (Napatech Programming Language) commands to create 64 streams and to distribute ingress traffic based on a 5-tuple hash.

Create a file named suricata.ntpl containing the following NTPL commands:

Delete=All                           # Delete any existing filters
Setup[numaNode=0] = streamid==(0..31)
Setup[numaNode=1] = streamid==(32..63)
HashMode[priority=4]=Hash5TupleSorted
Assign[priority=0; streamid=(0..63)]= all

Run the following command to execute the NTPL commands using the ntpl tool:

sudo /opt/napatech3/bin/ntpl -f suricata.ntpl

3.3. Traffic Generator Configuration

The traffic generator for this test is a standard COTS server with an Intel PAC NIC and Napatech Link™ Capture software.

For Suricata and most other IDS applications, traffic patterns have a significant effect on performance. For this reason, traffic previously captured from a live network is replayed to produce a realistic traffic load.

The Napatech pktgen tool is used to replay a PCAP file at controlled constant rates. The PCAP is composed of more than 125K IPv4 flows with a combination of malicious and background traffic from dozens of common applications. Average packet size is 486 bytes.
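
If the Wireshark command line tools are installed, the replayed capture can be characterized before a test run; capinfos reports the packet count, average packet size, and capture length, which can be checked against the figures quoted above:

capinfos suricata.pcap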

Edit the Napatech ntservice configuration file (/opt/napatech3/config/ntservice.ini) to increase the transmit buffer size.

HostBuffersRx = [4,16,-1]                # [x1, x2, x3], ...
HostBuffersTx = [4,2048,-1]              # [number, size (MB), NUMA node]

Stop and restart ntservice after making changes to ntservice.ini:

sudo /opt/napatech3/bin/ntstop.sh
sudo /opt/napatech3/bin/ntstart.sh

4. Suricata Throughput Test Methodology

                 +-----------+         +------------+
                 |           |         |            |
                 |  traffic  |         |            |
                 |           |-------->|    DUT     |
                 | generator |         |            |
                 |           |         |            |
                 +-----------+         +------------+

4.1. Maximum Throughput Test

Procedure: Send a specific number of frames at a specific constant rate to the DUT and count the frames that are received and processed by the Suricata application.

Suricata throughput is then calculated as:

\$Suricata\ Throughput\ (Gbps) = ( (Received\ Frame\ Count) / (Sent\ Frame\ Count) ) * Source\ BitRate\ (Gbps)\$

The first trial is run with a source traffic rate equal to 100% of the link rate and the test is repeated over a range of decreasing source rates. Throughput is calculated as above for each source rate. Maximum Suricata throughput is the highest calculated throughput value, disregarding any packet loss.
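
A minimal sketch of this rate sweep, reusing the pktgen invocation from section 5.2 (REPLAYCOUNT is the replay count chosen for the test; Suricata is restarted and its counters read between trials as described in section 5):

for RATE in 40 35 30 25 20 15 10 5; do
    # start Suricata on the DUT (section 5.1) before each trial
    /opt/napatech3/bin/pktgen -p 0 -n $REPLAYCOUNT -r ${RATE}G -f suricata.pcap
    # record the Sent packets count and decoder.pkts count (section 5.3),
    # then compute the Suricata throughput for this source rate
done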

4.2. RFC 2544 Zero Packet Loss Test

Procedure: Send a specific number of frames at a specific constant rate to the DUT and count the frames that are received and processed by the Suricata application.

If the received frame count is less than the sent frame count, the source rate is reduced and the test is rerun. The lossless throughput is the highest rate at which the count of frames received by the DUT is equal to the number of frames sent by the traffic source.
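
A minimal sketch of this search, assuming the counts are entered manually after each trial and that run_trial is a hypothetical wrapper around the steps in section 5:

rate=40                                   # start at 100% of link rate (Gbps)
while [ "$rate" -gt 0 ]; do
    run_trial "${rate}G"                  # hypothetical wrapper: sections 5.1 - 5.3
    read -r -p "Sent packets count: " sent
    read -r -p "decoder.pkts count: " received
    if [ "$received" -ge "$sent" ]; then
        echo "Lossless throughput: ${rate} Gbps"
        break
    fi
    rate=$((rate - 1))                    # reduce the source rate and rerun
done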

5. Detailed Test Procedure

5.1. Start Suricata On DUT

To start Suricata with Napatech Link™ Capture software:

Make sure ntservice has been started (sudo /opt/napatech3/bin/ntstart.sh), then start Suricata:

sudo suricata -c /usr/local/etc/suricata/suricata.yaml --napatech \
     --runmode workers --init-errors-fatal -vv

To start Suricata on standard NIC:

sudo suricata -c /usr/local/etc/suricata/suricata.yaml --runmode workers \
     --init-errors-fatal --af-packet=<if>

5.2. Run pktgen on traffic generator host

Start the pktgen traffic generator, where REPLAYCOUNT is the number of times to replay the PCAP file and PORTRATE is the source rate in Gbps:

/opt/napatech3/bin/pktgen -p 0 -n $REPLAYCOUNT -r ${PORTRATE}G -f suricata.pcap

5.3. Terminate Suricata, read statistics

The Suricata stats.log file records performance statistics at a fixed interval, by default every 8 seconds.

When the pktgen run completes, terminate Suricata with SIGINT signal (^C).

5.3.1. Read Suricata Counters

Read Suricata counters from /usr/local/var/log/suricata/stats.log.
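
Because stats.log is appended at every reporting interval, only the final block of counters is relevant; the last reported values can be pulled out with, for example:

tail -n 50 /usr/local/var/log/suricata/stats.log               # show the most recent counters
grep 'decoder\.pkts' /usr/local/var/log/suricata/stats.log | tail -n 1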

Typical output from Suricata on standard NIC:

------------------------------------------------------------------------------------
Date: 8/31/2018 -- 13:35:53 (uptime: 0d, 00h 06m 16s)
------------------------------------------------------------------------------------
Counter                                    | TM Name                   | Value
------------------------------------------------------------------------------------
capture.kernel_packets                     | Total                     | 902442616
capture.kernel_drops                       | Total                     | 116043762
decoder.pkts                               | Total                     | 786398843
decoder.bytes                              | Total                     | 380058530984
decoder.ipv4                               | Total                     | 786384760
decoder.ipv6                               | Total                     | 10
decoder.ethernet                           | Total                     | 786398843
decoder.tcp                                | Total                     | 715915981
decoder.udp                                | Total                     | 70468779
decoder.icmpv6                             | Total                     | 10
decoder.avg_pkt_size                       | Total                     | 483
decoder.max_pkt_size                       | Total                     | 1514

Typical output from Suricata on PAC A10 NIC with Napatech Link™ Capture software:

------------------------------------------------------------------------------------
Date: 8/30/2018 -- 22:24:10 (uptime: 0d, 00h 04m 41s)
------------------------------------------------------------------------------------
Counter                                    | TM Name                   | Value
------------------------------------------------------------------------------------
nt0.pkts                                   | Total                     | 16667025
nt0.bytes                                  | Total                     | 7548238800
nt1.pkts                                   | Total                     | 19253339
nt1.bytes                                  | Total                     | 9381782396
[...]
nt63.pkts                                  | Total                     | 20265827
nt63.bytes                                 | Total                     | 9967031528
decoder.pkts                               | Total                     | 1253395200  (1)
decoder.bytes                              | Total                     | 614155305000
decoder.ipv4                               | Total                     | 1253380800
decoder.ethernet                           | Total                     | 1253395200
decoder.tcp                                | Total                     | 1142593500
decoder.udp                                | Total                     | 110787300
decoder.avg_pkt_size                       | Total                     | 489
decoder.max_pkt_size                       | Total                     | 1518
[...]
flow.memuse                                | Total                     | 54546496
1 Record the Suricata decoder.pkts count; this represents the number of packets received and processed by Suricata’s "decoder" module.

decoder.pkts count = 1148945600

5.3.2. Read Transmit Packet Count

Read pktgen "Sent packets" count from pktgen terminal outout:

napatech@pit5:~$ /opt/napatech3/bin/pktgen -p 0 -n 275 -r 40G -f ~/suricata.pcap
pktgen (v. 3.8.1.46-c6583)
==============================================================================
Inspecting file: /home/napatech/suricata.pcap: 2026.38 MB so far
Allocated host buffer size: 2045 MB
Requires host buffer with 2045 MB bytes (needed raw size is 2124809480 B)
Using host buffer with 2045 MB bytes (claimed raw size is 2143289344 B)
Sent 1148945600 packets so far.

Sent 1148945600 packets in total onto port 0 (1)
1 Record the pktgen Sent packets count; this represents the number of packets sourced by pktgen.

Sent packets count = 1148945600

5.3.3. Calculate Suricata Throughput

\$Suricata\ throughput\ (Gbps) = ( (decoder\.pkts\ count) / (Sent\ packets\ count) ) * Source\ BitRate\ (Gbps)\$
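
For example, using the counts recorded in sections 5.3.1 and 5.3.2 for a run sourced at 40 Gbps:

\$Suricata\ throughput = (1148945600 / 1148945600) * 40\ Gbps = 40\ Gbps\$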

6. Typical Results

Typical results for the standard NIC and for the Intel® PAC with Intel Arria® 10 GX FPGA with Napatech Link™ Capture software are shown in the tables below.

Table 3. Decode Rate (Gbps) vs Offered Load (Gbps), ET Ruleset, 40 threads

Load (Gbps)    5      10     15     20     25     30     35     40
Std NIC        5.0    9.9    14.4   17.3   18.3   18.4   18.8   18.8
PAC            5.0    10.0   15.0   20.0   24.9   26.5   26.5   27.0

Table 4. Decode Rate (Gbps) vs Offered Load (Gbps), ET Ruleset, 80 threads

Load (Gbps)    5      10     15     20     25     30     35     40
Std NIC        5.0    10.0   15.0   19.0   23.8   26.4   28.8   28.8
PAC            5.0    10.0   15.0   20.0   25.0   30.0   35.0   39.9

Table 5. Suricata Lossless Throughput (Gbps), ET Ruleset

               40 threads    80 threads
Std NIC        6 Gbps        15 Gbps
PAC            22 Gbps       39 Gbps