Low latency AI for finance

PARTNER SOLUTION:

Myrtle.ai VOLLO™ Acceleration Solution
Napatech SmartNICs and DPUs

Combining Myrtle.ai’s VOLLO accelerator with Napatech’s SmartNICs and DPUs enables quant traders to make intelligent trading decisions faster than their competitors.

Myrtle.ai specializes in optimizing machine learning (ML) inference workloads for various applications in cloud or enterprise data centers and edge environments. Their expertise lies in developing low-latency, high-throughput, and energy-efficient solutions by integrating software and hardware innovations.

Myrtle.ai’s VOLLO

One of their key products is VOLLO, an ML inference accelerator tailored for the finance industry. VOLLO is designed to achieve minimal latency on financial neural network models while maximizing throughput, quality, and energy efficiency. Notably, it has demonstrated latencies as low as 5.1 microseconds in independently audited benchmarks, outperforming competitors by significant margins. VOLLO supports user-defined models in ONNX or PyTorch formats and is optimized for time-series inference of financial AI models.

Integrating Myrtle.ai’s VOLLO machine learning inference accelerator with Napatech’s programmable SmartNICs and DPUs offers significant benefits for the finance industry.

Chart: inference latency (microseconds), VOLLO vs. GPU.

Independently verified against GPUs and ASICs. LSTM with 160K params, including PCIe data transfers from host CPU.
STAC-ML Tacana Benchmark LSTM_A (99th percentile).

Ultra-Low Latency Inference – Unrivalled Performance

VOLLO is engineered to deliver minimal latency on financial neural network models, achieving audited latencies as low as 5.1 microseconds. This rapid inference capability is crucial for high-frequency trading and real-time financial decision-making, where even microsecond delays can impact profitability.

VOLLO’s success has been demonstrated by its performance in the STAC-ML™ Markets (Inference) benchmarks¹, which represent such models. Competitors demonstrated latencies up to 20x longer than VOLLO’s.
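Benchmark latency figures such as these are typically quoted at a high percentile rather than as an average, because tail latency is what a trading system has to budget for. The following is a minimal sketch of how a 99th-percentile figure can be derived from raw measurements, assuming the nearest-rank method. The function name and the sample values are hypothetical and are not benchmark data; the audited benchmark may define its percentile differently.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers pct percent of the samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical per-inference latencies in microseconds (illustrative only).
latencies_us = [12.0, 12.5, 12.5, 13.0, 13.0, 13.5, 14.0, 14.5, 16.0, 31.0]

p50 = percentile_nearest_rank(latencies_us, 50)
p99 = percentile_nearest_rank(latencies_us, 99)
print(f"median = {p50} us, 99th percentile = {p99} us")
```

In this made-up sample a single slow inference dominates the 99th percentile while leaving the median almost untouched, which is why tail percentiles are the more demanding metric.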

USE CASE
Ultra-Low Latency AI-Driven Financial Market Data Processing

Financial firms engaged in low-latency trading, risk analytics, and fraud detection require real-time data processing. Traditional CPU-based architectures struggle to keep up with the ever-increasing volume and velocity of market data, leading to potential missed opportunities and increased operational risk.

Latency Reduction – Achieve sub-microsecond processing of financial transactions.
Increased Throughput – Process millions of messages per second with minimal CPU overhead.
AI at the Network Edge – Run AI inference directly on the SmartNIC/DPU, reducing data movement overhead.
Lower Operational Costs – Offload AI tasks from costly cloud compute to optimized FPGA-based hardware.

Simple to program

Models can be developed in PyTorch or TensorFlow before being exported in ONNX format into the VOLLO tool suite, making it simple to program from your existing ML development environment.

Flexible for future-proofing

The flexibility of FPGA technology ensures that not only can VOLLO be software-configured with multiple user models today, but significant architectural innovations can also be adopted quickly with optimal compute resources³.

Enhanced Data Capture and Analysis

Napatech’s SmartNICs are designed to capture all network traffic at speeds up to 100 Gbps without packet loss, including during microbursts. This ensures comprehensive data collection for analysis, which is vital for optimizing trading algorithms and maintaining regulatory compliance.
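To illustrate the kind of analysis that lossless, timestamped capture makes possible, here is a minimal sketch of microburst detection over captured packets. It assumes each record is a hypothetical `(timestamp_ns, size_bytes)` pair sorted by time; the function name, window length, and threshold are illustrative choices and are not part of any Napatech API.

```python
from collections import deque


def find_microbursts(packets, window_ns, threshold_bps):
    """Flag moments where the trailing-window bit rate exceeds threshold_bps.

    packets: iterable of (timestamp_ns, size_bytes), sorted by timestamp.
    Returns a list of (timestamp_ns, rate_bps) for each offending packet.
    """
    window = deque()
    bytes_in_window = 0
    bursts = []
    for ts, size in packets:
        window.append((ts, size))
        bytes_in_window += size
        # Drop packets that fell out of the half-open window (ts - window_ns, ts].
        while window[0][0] <= ts - window_ns:
            _, old_size = window.popleft()
            bytes_in_window -= old_size
        rate_bps = bytes_in_window * 8 * 1_000_000_000 / window_ns
        if rate_bps > threshold_bps:
            bursts.append((ts, rate_bps))
    return bursts


# Hypothetical capture: sparse traffic, then ten 1500-byte packets 1 us apart.
quiet = [(0, 1500), (50_000, 1500), (100_000, 1500)]
burst = [(200_000 + i * 1_000, 1500) for i in range(10)]
packets = quiet + burst + [(300_000, 1500)]

# 10 us window, 10 Gbps threshold (both values are illustrative).
bursts = find_microbursts(packets, window_ns=10_000, threshold_bps=10_000_000_000)
for ts, rate in bursts:
    print(f"microburst at {ts} ns: {rate / 1e9:.2f} Gbps")
```

A burst this short would vanish in a per-second average, so a detector like this is only meaningful when no packets were dropped during capture.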

Precise Latency Measurement

The nanosecond-level timestamping capabilities of Napatech’s SmartNICs enable financial institutions to monitor and visualize delays accurately. This precision helps in guaranteeing optimal performance and transparency within trading infrastructures.
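As a rough illustration of how such timestamps translate into a latency measurement, the sketch below pairs an ingress and an egress capture of the same packet and subtracts their nanosecond timestamps. The `Capture` record, its field names, and the tap labels are assumptions made for this example and do not reflect Napatech’s capture format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capture:
    packet_id: int      # hypothetical identifier shared by both capture points
    tap: str            # hypothetical capture point label: "ingress" or "egress"
    timestamp_ns: int   # hardware timestamp in nanoseconds


def per_packet_delay_ns(captures):
    """Return {packet_id: egress_ts - ingress_ts} for packets seen at both taps."""
    ingress = {}
    delays = {}
    for c in sorted(captures, key=lambda c: c.timestamp_ns):
        if c.tap == "ingress":
            # Keep the first ingress sighting of each packet.
            ingress.setdefault(c.packet_id, c.timestamp_ns)
        elif c.tap == "egress" and c.packet_id in ingress:
            delays[c.packet_id] = c.timestamp_ns - ingress[c.packet_id]
    return delays


# Hypothetical captures; packet 3 never reaches the egress tap.
captures = [
    Capture(1, "ingress", 1_000_000),
    Capture(2, "ingress", 1_000_400),
    Capture(1, "egress", 1_000_850),
    Capture(3, "ingress", 1_001_000),
    Capture(2, "egress", 1_001_600),
]

delays = per_packet_delay_ns(captures)
print(delays)
```

Keeping timestamps as integer nanoseconds avoids floating-point rounding when the differences are small, and packets missing from `delays` point to drops or filtering between the two capture points.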

Seamless Integration and Flexibility

Napatech’s Link-Programmable™ software allows the deployment of custom FPGA IP on their SmartNIC platform, providing flexibility to tailor solutions to specific financial applications.

Napatech NT400D11 SmartNIC
The Napatech NT400D11 PCIe4 SmartNIC is based on Intel® Agilex™ AGF 014 FPGA architecture and enables 2x100G applications. The QSFP28 form factor offers flexibility to create high-performance solutions in 1U server platforms for existing 100G network infrastructures. Also available in NEBS variants.

Napatech F2070X DPU
The Napatech F2070X Data Processing Unit (DPU) is a 2x100G PCI Express (PCIe) card with an Intel® Agilex® F-Series FPGA and an Intel® Xeon® D processor, in a Full Height, Half Length (FHHL), dual-slot form factor.

By combining Myrtle.ai’s VOLLO accelerator with Napatech’s advanced SmartNICs and DPUs, financial institutions can achieve unparalleled performance, efficiency, and adaptability in their trading operations and data analysis processes.

Resources and downloads

VOLLO Product Video

Evaluation

Product Brief

Data Sheet

Data Sheet

For an evaluation, questions, or further assistance, please contact us.

¹www.STACresearch.com/MRTL230426
²STAC-ML.Markets.Inf.T.LSTM_A.4.LAT.v1 and STAC-ML.Markets.Inf.T.LSTM_A.4.TPUT.v1
³Myrtle.ai can provide optimized FPGA bitstreams for new and emerging models based on an extensive IP library for AI inference

TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.
PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation.