Low latency AI for finance
Myrtle.ai VOLLO™ Acceleration Solution
Napatech SmartNICs and IPUs
Combining Myrtle.ai’s VOLLO accelerator with Napatech’s SmartNICs and IPUs enables quant traders to make intelligent trading decisions faster than their competitors.
Myrtle.ai specializes in optimizing machine learning (ML) inference workloads for various applications in cloud or enterprise data centers and edge environments. Their expertise lies in developing low-latency, high-throughput, and energy-efficient solutions by integrating software and hardware innovations.
Myrtle.ai’s VOLLO
One of their key products is VOLLO, an ML inference accelerator tailored for the finance industry. VOLLO is designed to achieve minimal latency on financial neural network models while maximizing throughput, quality, and energy efficiency. Notably, it has demonstrated latencies as low as 5.1 microseconds in independently audited benchmarks, outperforming competitors by significant margins. VOLLO supports user-defined models in ONNX or PyTorch formats and is optimized for time-series inference of financial AI models.
Integrating Myrtle.ai’s VOLLO machine learning inference accelerator with Napatech’s programmable SmartNICs and IPUs offers significant benefits for the finance industry.
Figure: Latency (microseconds). Independently verified against GPUs and ASICs. LSTM with 160K parameters, including PCIe data transfers from host CPU.
STAC-ML Tacana Benchmark LSTM_A (99th percentile).
Ultra-Low Latency Inference – Unrivalled Performance
VOLLO is engineered to deliver minimal latency on financial neural network models, achieving audited latencies as low as 5.1 microseconds. This rapid inference capability is crucial for high-frequency trading and real-time financial decision-making, where even microsecond delays can impact profitability.
VOLLO’s success has been convincingly demonstrated by its performance in the STAC-ML™ Markets (Inference) benchmarks¹, which are representative of these financial models. Competitors demonstrated latencies up to 20x longer than VOLLO’s.
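The benchmark figure quoted on this page is a 99th-percentile latency. As a rough illustration of what that statistic captures, the sketch below computes a nearest-rank percentile over a list of latency samples. The function name and the sample values are hypothetical, and the nearest-rank method is an assumption chosen for simplicity; it is not presented as the STAC-ML methodology.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: smallest 1-based rank covering at least pct percent of samples.
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[rank - 1]


# Hypothetical latencies in microseconds, made up for illustration (not measured data).
latencies_us = [5.0, 5.1, 5.1, 5.2, 5.3, 5.1, 5.0, 9.8, 5.2, 5.1]

p50 = percentile_nearest_rank(latencies_us, 50)
p99 = percentile_nearest_rank(latencies_us, 99)
print(f"median = {p50} us, 99th percentile = {p99} us")
```

In this made-up sample a single slow inference leaves the median unchanged but dominates the 99th percentile, which is why tail latency is the figure of interest for trading workloads.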
USE CASE
Ultra-Low Latency AI-Driven Financial Market Data Processing
Financial firms engaged in low-latency trading, risk analytics, and fraud detection need to process data in real time. Traditional CPU-based architectures struggle to keep up with the ever-increasing volume and velocity of market data, leading to missed opportunities and increased operational risk.
- Latency Reduction – Achieve sub-microsecond processing of financial transactions.
- Increased Throughput – Process millions of messages per second with minimal CPU overhead.
- AI at the Network Edge – Run AI inference directly on the SmartNIC/IPU, reducing data movement overhead.
- Lower Operational Costs – Offload AI tasks from costly cloud compute to optimized FPGA-based hardware.
Simple to program
Models can be developed in PyTorch or TensorFlow before being exported in ONNX format into the VOLLO tool suite, making it simple to program from your existing ML development environment.
Flexible for future-proofing
The flexibility of FPGA technology ensures that not only can VOLLO be software-configured with multiple user models today, but significant architectural innovations can also be adopted quickly with optimal compute resources³.
Enhanced Data Capture and Analysis
Napatech’s SmartNICs are designed to capture all network traffic at speeds up to 100 Gbps without packet loss, including during microbursts. This ensures comprehensive data collection for analysis, which is vital for optimizing trading algorithms and maintaining regulatory compliance.
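To give a sense of what lossless capture at 100 Gbps implies, the sketch below works out the maximum packet rate and the time available per packet on a single 100 Gbps Ethernet link. It relies only on standard Ethernet framing overhead (8 bytes of preamble and start-of-frame delimiter plus a 12-byte inter-frame gap); the function name and the chosen frame sizes are illustrative assumptions, not Napatech specifications.

```python
PREAMBLE_AND_SFD_BYTES = 8   # standard Ethernet preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12    # standard minimum inter-frame gap
LINK_BPS = 100_000_000_000   # one 100 Gbps port


def max_packets_per_second(link_bps, frame_bytes):
    """Upper bound on packets per second for back-to-back frames of one size."""
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


# Illustrative frame sizes: minimum, mid-size, and maximum untagged Ethernet frames.
for frame_bytes in (64, 512, 1518):
    pps = max_packets_per_second(LINK_BPS, frame_bytes)
    ns_per_packet = 1e9 / pps
    print(f"{frame_bytes:>5} B frames: {pps / 1e6:7.2f} Mpps, {ns_per_packet:7.2f} ns per packet")
```

With minimum-size frames this comes to roughly 148.8 million packets per second, or about 6.7 ns per packet, which is the worst case a capture device has to sustain during a microburst.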
Precise Latency Measurement
The nanosecond-level timestamping capabilities of Napatech’s SmartNICs enable financial institutions to monitor and visualize delays accurately. This precision helps in guaranteeing optimal performance and transparency within trading infrastructures.
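As a minimal sketch of how such timestamps can be turned into a latency measurement, the example below pairs ingress and egress timestamps by message identifier and reports the per-message delay in nanoseconds. The dictionary layout, the message identifiers, and the timestamp values are all hypothetical; this is not Napatech’s API or capture format.

```python
def latencies_ns(ingress, egress):
    """Map message id -> (egress - ingress) in nanoseconds for ids seen on both sides."""
    return {
        msg_id: egress[msg_id] - ts_in
        for msg_id, ts_in in ingress.items()
        if msg_id in egress
    }


# Hypothetical hardware timestamps in nanoseconds, made up for illustration.
ingress_ns = {"msg-1": 1_000_000_000, "msg-2": 1_000_004_000, "msg-3": 1_000_009_500}
egress_ns = {"msg-1": 1_000_005_100, "msg-2": 1_000_009_300}  # msg-3 has no egress yet

deltas = latencies_ns(ingress_ns, egress_ns)
worst_id = max(deltas, key=deltas.get)

for msg_id, delta in deltas.items():
    print(f"{msg_id}: {delta} ns ({delta / 1000:.1f} us)")
print(f"slowest message: {worst_id}")
```

Keeping the timestamps as integers in nanoseconds avoids floating-point rounding when subtracting large epoch-style values.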
Seamless Integration and Flexibility
Napatech’s Link-Programmable™ software allows the deployment of custom FPGA IP on their SmartNIC platform, providing flexibility to tailor solutions to specific financial applications.
Napatech NT400D11 SmartNIC
The Napatech NT400D11 PCIe4 SmartNIC is based on Intel® Agilex™ AGF 014 FPGA architecture and enables 2x100G applications. The QSFP28 form factor offers flexibility to create high-performance solutions in 1U server platforms for existing 100G network infrastructures. Also available in NEBS variants.
Napatech F2070X IPU
The Napatech F2070X Infrastructure Processing Unit (IPU) is a 2x100G PCI Express (PCIe) card with an Intel® Agilex® F-Series FPGA and an Intel® Xeon® D processor, in a Full Height, Half Length (FHHL), dual-slot form factor.
Resources and downloads
For an evaluation, questions, and further assistance, please contact us.
¹ www.STACresearch.com/MRTL230426
² STAC-ML.Markets.Inf.T.LSTM_A.4.LAT.v1 and STAC-ML.Markets.Inf.T.LSTM_A.4.TPUT.v1
³ Myrtle.ai can provide optimized FPGA bitstreams for new and emerging models based on an extensive IP library for AI inference.
TensorFlow, the TensorFlow logo and any related marks are trademarks of Google Inc.
PyTorch, the PyTorch logo and any related marks are trademarks of The Linux Foundation