Jan 2026

The Case for Rust in Your Trading Stack

I rewrote our order execution engine from Node.js to Rust. The latency dropped 40x. Here's when Rust makes sense for fintech — and when it's overkill.

Why We Switched

Our trading system processed orders through a Node.js service. It worked. But "worked" isn't good enough when you're competing with systems that execute in microseconds. Our p99 latency was 12ms. That's an eternity in algorithmic trading.

The bottleneck wasn't I/O — it was the execution engine itself. Order validation, risk checks, position calculations, and the matching logic were all CPU-bound. Node's single-threaded event loop was the ceiling.

The Numbers

| Metric | Node.js | Rust | Improvement |
|---|---|---|---|
| Order validation (p50) | 3.2 ms | 0.08 ms | 40x |
| Risk check pipeline (p50) | 5.1 ms | 0.12 ms | 42x |
| End-to-end latency (p99) | 12 ms | 0.3 ms | 40x |
| Memory usage (steady state) | 180 MB | 14 MB | 13x |
| Max throughput (orders/sec) | 8,400 | 340,000 | 40x |

Architecture: 7 Crates, Zero Garbage Collection

The Rust implementation is split into 7 workspace crates, each with a single responsibility:

trading-engine/
├── crates/
│   ├── core/       # Domain types, order book, matching
│   ├── db/         # PostgreSQL via SQLx (async)
│   ├── exchanges/  # Exchange connectors (WebSocket)
│   ├── engine/     # Main orchestration loop
│   ├── strategy/   # Strategy evaluation runtime
│   ├── api/        # Axum REST + WebSocket API
│   └── risk/       # Position sizing, drawdown limits
├── Cargo.toml      # Workspace root
└── docker-compose.yml
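
A workspace root Cargo.toml matching this layout might look like the following sketch. The member paths come from the tree above; the dependency versions and feature flags are assumptions, not the project's actual manifest:

```toml
[workspace]
resolver = "2"
members = [
    "crates/core",
    "crates/db",
    "crates/exchanges",
    "crates/engine",
    "crates/strategy",
    "crates/api",
    "crates/risk",
]

# Shared dependency versions, inherited by members via `workspace = true`.
# Versions here are illustrative.
[workspace.dependencies]
tokio = { version = "1", features = ["full"] }
sqlx = { version = "0.7", features = ["postgres", "runtime-tokio"] }
axum = "0.7"
```

Pinning versions once at the workspace root keeps all 7 crates on the same Tokio and SQLx releases, which matters when async runtimes must match across crate boundaries.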

The core crate is pure computation — no I/O, no allocations in the hot path. Orders flow through a zero-copy pipeline where validation, risk checks, and matching happen on stack-allocated data.

When Rust is Overkill

Not everything should be Rust. Here's my honest assessment after living with both:

Rust doesn't make your architecture better. It makes your hot path faster. If you don't know where your hot path is, profile first.

The Tokio Async Runtime

Trading systems are inherently async — WebSocket feeds from exchanges, concurrent order submissions, real-time position updates. Tokio handles this beautifully:

// Concurrent exchange connections with Tokio.
// Each task takes ownership of its connector; the shared engine
// is an Arc, cloned once per task.
let handles: Vec<_> = exchanges.into_iter().map(|ex| {
    let engine = Arc::clone(&engine);
    tokio::spawn(async move {
        let mut ws = ex.connect().await?;
        while let Some(msg) = ws.next().await {
            engine.process_market_data(msg?).await;
        }
        Ok::<_, Error>(())
    })
}).collect();

futures::future::join_all(handles).await;

Each exchange connection runs on its own task. The engine processes market data as it arrives, with backpressure handled by channel buffers. No thread pools to tune, no callback hell, no GC pauses during critical moments.

Should You Rewrite?

Probably not — unless latency is a competitive advantage in your domain. The rewrite took 6 weeks and required learning a new language deeply. But for trading, those 6 weeks paid for themselves in the first month through better fill rates and reduced slippage.

Start with a single, well-bounded module. Prove the performance gain. Then expand. Don't rewrite your entire stack — rewrite the part that's too slow.