// 04 · Engine latency

Ticker write in 310 ns.

p50 over 89,033 samples on 2026-04-24, engine 0.4.48. Measured in-process inside our Rust state cache with a monotonic clock. The open-source bench harness lets you reproduce it on your machine.

Read this first: what these numbers measure

The headline 310 ns is pure engine speed: state_cache_updated_ns − recv_wall_ns, captured inside our Rust code with a monotonic clock. It does not include network distance, TLS, or the time a venue spent aggregating a frame before sending it. End-to-end user-visible latency includes all of those. Anything you see in the 1–200 ms range elsewhere is network plus venue, not the framework. Both kinds of numbers are real; they measure different segments of the same timeline.
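To make the measured segment concrete, here is a minimal sketch of timing a single cache write with Rust's monotonic `std::time::Instant`. `TickerCache` and `timed_update` are hypothetical stand-ins, not the engine's actual types; the point is only that the clock starts once the frame is already in process memory and stops when the cache entry is written.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Hypothetical stand-in for the engine's ticker state cache.
struct TickerCache {
    last_price: HashMap<String, f64>,
}

impl TickerCache {
    fn new() -> Self {
        Self { last_price: HashMap::new() }
    }

    /// Writes one ticker and returns the elapsed time in nanoseconds,
    /// measured with the monotonic `Instant` clock.
    fn timed_update(&mut self, symbol: &str, price: f64) -> u128 {
        // Start: the frame is already in process memory (no network, no TLS).
        let recv = Instant::now();
        match self.last_price.get_mut(symbol) {
            Some(slot) => *slot = price,
            None => {
                self.last_price.insert(symbol.to_owned(), price);
            }
        }
        // Stop: cache updated. This is "updated minus received".
        recv.elapsed().as_nanos()
    }
}

fn main() {
    let mut cache = TickerCache::new();
    // Warm-up write: the first insert allocates the key.
    cache.timed_update("BTC-USD", 64_000.0);
    let ns = cache.timed_update("BTC-USD", 64_001.5);
    println!("ticker write took {ns} ns");
}
```

A single reading like this is noisy; the published figures are percentiles over many samples.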

percentile	latency
p50	310 ns
p95	980 ns
p99	1,950 ns

Each sample is one MarketState::update_ticker_owned call: a HashMap entry update plus the OHLCV live-mirror. 89,033 samples on engine 0.4.48, monotonic clock.
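For readers who want to turn their own per-sample nanosecond readings into p50 / p95 / p99, a nearest-rank percentile is one simple option. This is a sketch under that assumption; the source does not say which percentile definition the harness uses, and the sample values below are illustrative, not measured data.

```rust
/// Nearest-rank percentile over an ascending-sorted slice.
/// Returns None for an empty slice or a percentile outside 0..=100.
fn percentile(sorted: &[u64], p: f64) -> Option<u64> {
    if sorted.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }
    // Rank is 1-based: the smallest rank covering at least p% of samples.
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}

fn main() {
    // Illustrative latencies in nanoseconds (made up for this example).
    let mut samples: Vec<u64> = vec![310, 295, 1_950, 330, 980, 305, 320, 300, 315, 312];
    samples.sort_unstable();
    for p in [50.0, 95.0, 99.0] {
        println!("p{p}: {:?} ns", percentile(&samples, p));
    }
}
```

With only a handful of samples the tail percentiles collapse onto the maximum, which is one reason a tail figure is only meaningful over a large sample count.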

The full Rust engine pipeline

Every measurement here is captured inside handle_messages on engine 0.4.48 with a monotonic clock. The ticker cache write finishes in hundreds of nanoseconds at p50; the other state writes and the frame parse land in single-digit microseconds at p50. The full pipeline (socket read to state visible) is sub-15 µs at p50 and under 250 µs at p99, so nothing on the hot path crosses a millisecond at p99.

metric	n	p50	p99	note
`state_ticker_ns`	89,033	310 ns	1.95 µs	Ticker cache write + OHLCV live-mirror. The headline.
`state_mark_price_ns`	120	2.15 µs	3.73 µs	Funding rate + open-position uPnL recompute.
`state_order_update_ns`	3,458	3.69 µs	13.86 µs	Private order-update event apply.
`state_ob_snap_ns`	16,406	4.44 µs	17.42 µs	Orderbook snapshot apply + write_book.
`state_ob_delta_ns`	102,549	5.51 µs	16.34 µs	Orderbook delta + write_book.
`parse_ns`	176,555	1.76 µs	77.95 µs	WS frame parse (text or binary).
`end_to_end_ns`	176,555	14.40 µs	248.96 µs	ws.read return to engine state visible. Parse + dispatch + state write end-to-end.
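The relationship between the per-stage metrics and `end_to_end_ns` can be sketched as three timestamps around two stages. Everything below is an assumption for illustration: `parse_frame` is a toy "SYMBOL,PRICE" parser standing in for the real WS frame parse, and `handle_frame` is not the engine's handle_messages.

```rust
use std::collections::HashMap;
use std::time::Instant;

/// Per-frame timings in nanoseconds, named after the metrics in the table.
#[derive(Debug)]
struct FrameTimings {
    parse_ns: u128,
    state_ticker_ns: u128,
    end_to_end_ns: u128,
}

/// Toy parser for "SYMBOL,PRICE" frames; a stand-in for the real frame parse.
fn parse_frame(frame: &str) -> Option<(&str, f64)> {
    let (symbol, price) = frame.split_once(',')?;
    let price: f64 = price.trim().parse().ok()?;
    Some((symbol, price))
}

/// Parses one frame, applies it to the cache, and times each stage.
fn handle_frame(cache: &mut HashMap<String, f64>, frame: &str) -> Option<FrameTimings> {
    let start = Instant::now(); // frame has just been read from the socket
    let (symbol, price) = parse_frame(frame)?;
    let parsed = Instant::now();
    cache.insert(symbol.to_owned(), price);
    let done = Instant::now(); // state is now visible to readers
    Some(FrameTimings {
        parse_ns: (parsed - start).as_nanos(),
        state_ticker_ns: (done - parsed).as_nanos(),
        end_to_end_ns: (done - start).as_nanos(),
    })
}

fn main() {
    let mut cache = HashMap::new();
    match handle_frame(&mut cache, "BTC-USD,64001.5") {
        Some(t) => println!(
            "parse {} ns, state {} ns, end-to-end {} ns",
            t.parse_ns, t.state_ticker_ns, t.end_to_end_ns
        ),
        None => eprintln!("malformed frame"),
    }
}
```

Because the end-to-end span wraps both stages plus everything between them, its percentiles need not equal the sum of the per-stage percentiles.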

What you should expect on your hardware

Engine latency is pure in-process compute, with no network involved. CPU model, frequency governor, and thermal headroom all move the p50, but on any "engine tier" hardware the number stays sub-microsecond. A throttled laptop on battery can drift past 1 µs; that's the laptop, not the engine, and the bench harness README documents how to spot the usual causes (debug build, low-power state, slow clock source). Tier A is the maintainer-measured production probe; tiers B–D are estimates pending community PRs.
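Two of the pitfalls named above, a debug build and a slow clock source, can be checked from inside a Rust program. The following is a rough sketch and not the README's procedure: `clock_overhead_ns` is a hypothetical helper, and the 100 ns warning threshold is an arbitrary assumption chosen relative to a roughly 310 ns measurement.

```rust
use std::time::Instant;

/// Median cost in nanoseconds of one back-to-back `Instant::now()` pair.
fn clock_overhead_ns(rounds: usize) -> u128 {
    let mut deltas: Vec<u128> = (0..rounds.max(1))
        .map(|_| {
            let t0 = Instant::now();
            let t1 = Instant::now();
            (t1 - t0).as_nanos()
        })
        .collect();
    deltas.sort_unstable();
    deltas[deltas.len() / 2]
}

fn main() {
    // `debug_assertions` is on by default in dev builds and off in release.
    if cfg!(debug_assertions) {
        eprintln!("warning: debug build; rerun with --release before trusting numbers");
    }
    let overhead = clock_overhead_ns(10_001);
    println!("median clock read overhead: {overhead} ns");
    // Assumed threshold: a clock read this slow would distort sub-microsecond timings.
    if overhead > 100 {
        eprintln!("warning: clock reads are slow relative to a ~310 ns measurement");
    }
}
```

A low-power CPU state is not visible from this check; that one has to be read from the operating system's power or governor settings.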

tier	hardware	config	p50	p95
A	Xeon Platinum 8369B (Ice Lake)	Linux, pinned core, SCHED_FIFO, no turbo	310 ns	980 ns
B	Xeon Gold 6438 / EPYC 9354	Linux, performance governor	350–500 ns	0.7–1.2 µs
C	Apple Silicon (M2 / M3 / M4)	macOS 14+, native arm64, plugged in	250–450 ns	0.6–1.1 µs
D	i7-12700K / Ryzen 7700X	Windows 11 high-performance plan or Linux performance governor	400–650 ns	0.8–1.4 µs

Reproduce the bench yourself

The bench harness ships in the public OSS repo as a self-contained Rust crate with no path dependency on the engine source. It takes three commands from a fresh clone.

# 1. Clone the public OSS repo
git clone https://github.com/melaya-labs/melaya.git
cd melaya/benchmarks/engine

# 2. Run the criterion bench (~100k iterations, ~30 seconds)
cargo bench --bench state_ticker

# 3. Read the first rows of the per-iteration CSV, then the summary
head results/state_ticker_ns.csv
cat results/summary.json

For comparable numbers across machines, use the pinned Docker variant or the helper scripts shipped under scripts/. Both disable turbo, pin the bench to a specific core, and run under the performance governor.

