Melaya — Build AI agents for any job. Agentic platform for research, ops, outreach, reporting — and the only one where agents can actually trade.

// 04 · Engine latency

Ticker write in 310 ns.

p50 over 89,033 samples on 2026-04-24, engine 0.4.48. Measured in-process inside our Rust state cache with a monotonic clock. Open-source bench harness reproduces it on your machine.

Read this first: what these numbers measure

The headline 310 ns is pure engine speed: state_cache_updated_ns − recv_wall_ns, captured inside our Rust code with a monotonic clock. It does NOT include network distance, TLS, or the time a venue spent aggregating a frame before sending it. End-to-end user-visible latency includes those. Anything you see in the 1 to 200 ms range elsewhere is network plus venue, not the framework. Both kinds of numbers are real; they measure different segments of the same timeline.
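To make the "two timestamps, one monotonic clock" idea concrete, here is a minimal sketch of in-process timing using only the Rust standard library. It is not the engine's instrumentation; `time_ns` and `sample_ns` are hypothetical helper names, and `std::time::Instant` stands in for whatever clock source the engine actually uses.

```rust
use std::time::Instant;

/// Runs `op` once and returns its result plus the elapsed time in
/// nanoseconds. `Instant` is monotonic, so the difference cannot go
/// negative if the wall clock is adjusted mid-measurement.
/// (Hypothetical helper, not part of the engine.)
pub fn time_ns<T>(op: impl FnOnce() -> T) -> (T, u64) {
    let start = Instant::now();
    let out = op();
    let elapsed = start.elapsed().as_nanos() as u64;
    (out, elapsed)
}

/// Collects `n` per-call samples of `op`, one timestamp pair per call.
/// (Hypothetical helper, not part of the engine.)
pub fn sample_ns(n: usize, mut op: impl FnMut()) -> Vec<u64> {
    let mut samples = Vec::with_capacity(n);
    for _ in 0..n {
        // `&mut op` is itself callable once, so it can be handed to `time_ns`.
        let ((), ns) = time_ns(&mut op);
        samples.push(ns);
    }
    samples
}
```

Because both timestamps are taken inside the process, nothing here can observe network distance, TLS, or venue-side aggregation, which is why those segments are absent from the headline number.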

p50: 310 ns
p95: 980 ns
p99: 1,950 ns

One MarketState::update_ticker_owned call. HashMap entry update + OHLCV live-mirror. 89,033 samples on engine 0.4.48, monotonic clock.
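As a rough illustration of what "HashMap update + OHLCV live-mirror" can look like, the sketch below keeps the latest ticker per symbol and folds each update into an in-progress candle. Only the names `MarketState` and `update_ticker_owned` come from this page. The field layout, the `Ticker` and `Candle` types, and the `size` field are assumptions made for illustration, not the engine's actual data model.

```rust
use std::collections::HashMap;

/// Hypothetical ticker payload; the engine's real type is not shown here.
#[derive(Debug, Clone, PartialEq)]
pub struct Ticker {
    pub symbol: String,
    pub last_price: f64,
    pub size: f64, // assumed: traded size carried by this update
}

/// Hypothetical in-progress candle mirrored from ticker updates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Candle {
    pub open: f64,
    pub high: f64,
    pub low: f64,
    pub close: f64,
    pub volume: f64,
}

#[derive(Debug, Default)]
pub struct MarketState {
    tickers: HashMap<String, Ticker>,
    live_candles: HashMap<String, Candle>,
}

impl MarketState {
    /// Takes the ticker by value so the map slot can be overwritten
    /// without copying the payload.
    pub fn update_ticker_owned(&mut self, t: Ticker) {
        // Mirror the price into the live candle for this symbol.
        match self.live_candles.get_mut(&t.symbol) {
            Some(c) => {
                c.high = c.high.max(t.last_price);
                c.low = c.low.min(t.last_price);
                c.close = t.last_price;
                c.volume += t.size;
            }
            None => {
                let p = t.last_price;
                let c = Candle { open: p, high: p, low: p, close: p, volume: t.size };
                self.live_candles.insert(t.symbol.clone(), c);
            }
        }
        // Overwrite the cached ticker; the key is cloned only on first sight.
        match self.tickers.get_mut(&t.symbol) {
            Some(slot) => *slot = t,
            None => {
                let key = t.symbol.clone();
                self.tickers.insert(key, t);
            }
        }
    }

    pub fn ticker(&self, symbol: &str) -> Option<&Ticker> {
        self.tickers.get(symbol)
    }

    pub fn live_candle(&self, symbol: &str) -> Option<&Candle> {
        self.live_candles.get(symbol)
    }
}
```

In this sketch the steady-state path (symbol already present) performs two hash lookups and no allocation.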

The full Rust engine pipeline

Every measurement here is captured inside handle_messages on engine 0.4.48 with a monotonic clock. The ticker cache write finishes in hundreds of nanoseconds. Pipeline operations that include JSON parse + dispatch land in single-digit microseconds at p50. The full pipeline end-to-end (socket read to state visible) is sub-15 µs at p50 and under 250 µs at p99. Nothing crosses a millisecond on the hot path.

| metric | n | p50 | p99 | note |
|---|---|---|---|---|
| state_ticker_ns | 89,033 | 310 ns | 1.95 µs | Ticker cache write + OHLCV live-mirror. The headline. |
| state_mark_price_ns | 120 | 2.15 µs | 3.73 µs | Funding rate + open-position uPnL recompute. |
| state_order_update_ns | 3,458 | 3.69 µs | 13.86 µs | Private order-update event apply. |
| state_ob_snap_ns | 16,406 | 4.44 µs | 17.42 µs | Orderbook snapshot apply + write_book. |
| state_ob_delta_ns | 102,549 | 5.51 µs | 16.34 µs | Orderbook delta + write_book. |
| parse_ns | 176,555 | 1.76 µs | 77.95 µs | WS frame parse (text or binary). |
| end_to_end_ns | 176,555 | 14.40 µs | 248.96 µs | ws.read return to engine state visible. Parse + dispatch + state write end-to-end. |
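For readers who want to turn raw nanosecond samples into p50 / p95 / p99 figures like those reported here, the sketch below uses the nearest-rank definition with integer arithmetic. The page does not state which percentile definition the engine or the bench harness uses, so treat nearest-rank as an assumption. `percentile_ns` and `summarize` are hypothetical names.

```rust
/// Nearest-rank percentile over an ascending-sorted slice: the smallest
/// sample such that at least `pct` percent of all samples are <= it.
/// Returns None for empty input or pct > 100.
/// (Assumed definition; the harness may use a different one.)
pub fn percentile_ns(sorted: &[u64], pct: u32) -> Option<u64> {
    if sorted.is_empty() || pct > 100 {
        return None;
    }
    // rank = ceil(pct * n / 100), computed without floating point.
    let rank = (pct as usize * sorted.len() + 99) / 100;
    // Ranks are 1-based; pct = 0 maps to the first sample.
    let idx = rank.max(1) - 1;
    Some(sorted[idx])
}

/// Sorts the samples and returns (p50, p95, p99) in nanoseconds.
pub fn summarize(mut samples: Vec<u64>) -> Option<(u64, u64, u64)> {
    samples.sort_unstable();
    Some((
        percentile_ns(&samples, 50)?,
        percentile_ns(&samples, 95)?,
        percentile_ns(&samples, 99)?,
    ))
}
```

With small sample counts the tail percentiles are coarse: at n = 120, nearest-rank p99 is simply the second-largest sample.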

What you should expect on your hardware

Engine latency is pure in-process compute. No network. CPU model, frequency governor, and thermal headroom all move the p50, but on any "engine tier" hardware the number stays sub-microsecond. A throttled laptop on battery can drift past 1 µs — that's the laptop, not the engine, and the bench harness README documents how to spot it (debug build, low-power state, slow clock source). Tier A is the maintainer-measured production probe; B-D are estimates pending community PRs.

| tier | hardware | config | p50 | p95 |
|---|---|---|---|---|
| A | Xeon Plat 8369B (Ice Lake) | Linux, pinned core, SCHED_FIFO, no turbo | 310 ns | 980 ns |
| B | Xeon Gold 6438 / EPYC 9354 | Linux, performance governor | 350–500 ns | 0.7–1.2 µs |
| C | Apple Silicon (M2 / M3 / M4) | macOS 14+, native arm64, plugged in | 250–450 ns | 0.6–1.1 µs |
| D | i7-12700K / Ryzen 7700X | Win11 high-perf or Linux perf gov | 400–650 ns | 0.8–1.4 µs |
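Before comparing your own numbers against these tiers, it can help to rule out two of the failure modes mentioned above: a debug build and a slow clock source. The sketch below is one possible sanity check using only the standard library. It is not the procedure documented in the bench harness README, and the function names and the caller-supplied threshold are hypothetical.

```rust
use std::time::Instant;

/// True when the binary was compiled with debug assertions enabled,
/// which is the default for non-release Cargo profiles.
pub fn is_debug_build() -> bool {
    cfg!(debug_assertions)
}

/// Median gap in nanoseconds between two back-to-back `Instant::now()`
/// reads. A large value suggests clock reads are expensive on this
/// machine and will inflate sub-microsecond measurements.
pub fn clock_overhead_ns(reads: usize) -> u64 {
    let mut deltas = Vec::with_capacity(reads);
    for _ in 0..reads {
        let a = Instant::now();
        let b = Instant::now();
        deltas.push(b.duration_since(a).as_nanos() as u64);
    }
    deltas.sort_unstable();
    deltas.get(deltas.len() / 2).copied().unwrap_or(0)
}

/// Returns human-readable warnings. `max_clock_ns` is a caller-chosen
/// threshold (hypothetical; pick it relative to the latency you expect).
pub fn environment_warnings(max_clock_ns: u64) -> Vec<String> {
    let mut warnings = Vec::new();
    if is_debug_build() {
        warnings.push("debug build: rerun with an optimized profile".to_string());
    }
    let overhead = clock_overhead_ns(10_000);
    if overhead > max_clock_ns {
        warnings.push(format!(
            "clock read costs ~{overhead} ns (threshold {max_clock_ns} ns)"
        ));
    }
    warnings
}
```

This check cannot detect a low-power CPU state directly; an unexpectedly high clock overhead is only an indirect hint.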

Reproduce the bench yourself

The bench harness ships in the public OSS repo as a self-contained Rust crate. No path-dependency on the engine source. Three commands from a fresh clone.

# 1. Clone the public OSS repo
git clone https://github.com/melaya-labs/melaya.git
cd melaya/benchmarks/engine

# 2. Run the criterion bench (~100k iterations, ~30 seconds)
cargo bench --bench state_ticker

# 3. Read the per-iteration CSV + summary
head results/state_ticker_ns.csv
cat results/summary.json

For comparable numbers across machines, use the pinned Docker variant or the helper scripts shipped under scripts/. Both disable turbo, pin the bench to a specific core, and run it under the performance governor.
