The Case for Rust in Your Trading Stack
I rewrote our order execution engine from Node.js to Rust. The latency dropped 40x. Here's when Rust makes sense for fintech — and when it's overkill.
Why We Switched
Our trading system processed orders through a Node.js service. It worked. But "worked" isn't good enough when you're competing with systems that execute in microseconds. Our p99 latency was 12ms. That's an eternity in algorithmic trading.
The bottleneck wasn't I/O — it was the execution engine itself. Order validation, risk checks, position calculations, and the matching logic were all CPU-bound. Node's single-threaded event loop was the ceiling.
The Numbers
| Metric | Node.js | Rust | Improvement |
|---|---|---|---|
| Order validation (p50) | 3.2ms | 0.08ms | 40x |
| Risk check pipeline (p50) | 5.1ms | 0.12ms | 42x |
| End-to-end latency (p99) | 12ms | 0.3ms | 40x |
| Memory usage (steady state) | 180MB | 14MB | 13x |
| Max throughput (orders/sec) | 8,400 | 340,000 | 40x |
Architecture: 7 Crates, Zero Garbage Collection
The Rust implementation is split into 7 workspace crates, each with a single responsibility:
```
trading-engine/
├── crates/
│   ├── core/       # Domain types, order book, matching
│   ├── db/         # PostgreSQL via SQLx (async)
│   ├── exchanges/  # Exchange connectors (WebSocket)
│   ├── engine/     # Main orchestration loop
│   ├── strategy/   # Strategy evaluation runtime
│   ├── api/        # Axum REST + WebSocket API
│   └── risk/       # Position sizing, drawdown limits
├── Cargo.toml      # Workspace root
└── docker-compose.yml
```
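A workspace manifest tying these crates together might look like the following sketch — only the member list is implied by the tree above; the rest is standard Cargo boilerplate, not the project's actual file:

```toml
# Cargo.toml at the workspace root (sketch)
[workspace]
resolver = "2"
members = [
    "crates/core",
    "crates/db",
    "crates/exchanges",
    "crates/engine",
    "crates/strategy",
    "crates/api",
    "crates/risk",
]
```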
The core crate is pure computation — no I/O, no allocations in the hot path.
Orders flow through a zero-copy pipeline where validation, risk checks, and matching
happen on stack-allocated data.
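To make the "stack-allocated, no-allocation" idea concrete, here is a minimal sketch of what a validation stage in that style can look like. The types, fields, and limits are hypothetical illustrations, not the engine's actual code:

```rust
// Hypothetical order type: every field is a plain value, so `Order` is
// `Copy` and lives entirely on the stack -- no heap allocation anywhere.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Order {
    price_ticks: u64, // price as integer ticks, avoiding float rounding
    qty: u32,
    side: Side,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Side {
    Buy,
    Sell,
}

#[derive(Debug, PartialEq)]
enum RejectReason {
    ZeroQty,
    PriceOutOfBand,
}

/// Validate by value: the order is copied into this stack frame, checked,
/// and returned on success so the next pipeline stage can consume it
/// directly -- no references held, nothing allocated.
fn validate(order: Order, max_price_ticks: u64) -> Result<Order, RejectReason> {
    if order.qty == 0 {
        return Err(RejectReason::ZeroQty);
    }
    if order.price_ticks == 0 || order.price_ticks > max_price_ticks {
        return Err(RejectReason::PriceOutOfBand);
    }
    Ok(order)
}

fn main() {
    let order = Order { price_ticks: 101, qty: 10, side: Side::Buy };
    println!("{:?}", validate(order, 10_000));
}
```

Because each stage takes and returns `Copy` data, the compiler is free to keep the whole pipeline in registers and on the stack; there is no allocator traffic to show up in a latency histogram.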
When Rust Is Overkill
Not everything should be Rust. Here's my honest assessment after living with both:
- Use Rust for: execution engines, risk calculations, data pipelines, anything CPU-bound in the critical path
- Keep Node/Python for: strategy research, backtesting UI, admin dashboards, notification services, anything where development speed matters more than runtime speed
- The hybrid approach works: our Rust engine exposes an Axum API that the Node.js orchestration layer calls. Best of both worlds.
Rust doesn't make your architecture better. It makes your hot path faster. If you don't know where your hot path is, profile first.
The Tokio Async Runtime
Trading systems are inherently async — WebSocket feeds from exchanges, concurrent order submissions, real-time position updates. Tokio handles this beautifully:
```rust
// Concurrent exchange connections with Tokio.
// `into_iter()` moves each exchange into its task (tokio::spawn requires
// 'static data), and each task gets its own clone of the Arc'd engine.
let handles: Vec<_> = exchanges
    .into_iter()
    .map(|ex| {
        let engine = Arc::clone(&engine);
        tokio::spawn(async move {
            let mut ws = ex.connect().await?;
            while let Some(msg) = ws.next().await {
                engine.process_market_data(msg?).await;
            }
            Ok::<_, Error>(())
        })
    })
    .collect();
futures::future::join_all(handles).await;
```
Each exchange connection runs on its own task. The engine processes market data as it arrives, with backpressure handled by channel buffers. No thread pools to tune, no callback hell, no GC pauses during critical moments.
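The backpressure mechanism is simple to sketch with a bounded channel. This dependency-free example uses the standard library's `sync_channel` between threads; tokio's bounded `mpsc::channel` behaves analogously for async tasks, with `send(..).await` suspending the producing task instead of blocking a thread. The function and sizes here are illustrative, not the engine's code:

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Producer/consumer over a bounded channel. When the buffer fills,
/// `send` blocks the producer -- the backpressure behavior described
/// above: a fast market-data feed can never race more than `buffer`
/// messages ahead of the processing loop.
fn run_pipeline(messages: u64, buffer: usize) -> u64 {
    let (tx, rx) = sync_channel::<u64>(buffer);

    let producer = thread::spawn(move || {
        for seq in 0..messages {
            // Blocks here whenever the consumer falls `buffer` messages behind.
            tx.send(seq).expect("consumer hung up");
        }
        // Dropping `tx` closes the channel, ending the consumer loop below.
    });

    let mut sum = 0;
    for msg in rx {
        sum += msg; // stand-in for per-message processing work
    }
    producer.join().unwrap();
    sum
}

fn main() {
    println!("processed sum = {}", run_pipeline(100, 8));
}
```

Tuning the buffer size is the one knob that remains: too small and the feed stalls on every burst, too large and stale market data queues up during slow stretches.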
Should You Rewrite?
Probably not — unless latency is a competitive advantage in your domain. The rewrite took 6 weeks and required learning a new language deeply. But for trading, those 6 weeks paid for themselves in the first month through better fill rates and reduced slippage.
Start with a single, well-bounded module. Prove the performance gain. Then expand. Don't rewrite your entire stack — rewrite the part that's too slow.