Melaya — Build AI agents for any job. Agentic platform for research, ops, outreach, reporting — and the only one where agents can actually trade.

// USE CASE · QUANT

Prove the alpha first, ship only what survives.

Half your promising backtests die three weeks into paper trading, and the post-mortem is always the same: parameter sensitivity nobody tested, execution cost nobody modeled, a regime nobody walked the strategy through. Melaya stands up a six-persona quant crew that runs signal quality, execution cost, position sizing, risk, and a skeptic pass on every candidate, then drafts the deploy-or-kill memo your PM can sign in five minutes.

01
// What breaks today

The status quo costs more than the agent does.

Three pains every quant desk hits weekly. Each one is what your researchers and PMs actually complain about, not what a feature page would call them.

  1. Half the strategies that ship with a 2.1 Sharpe in the backtest drop below 0.8 within six weeks of paper trading, and the post-mortem catches the same overfit every quarter.

  2. Execution cost is estimated as a flat 5 bps because nobody on the desk has time to decompose spread, impact, and adverse selection per strategy, so live P&L misses target by 30 to 60 percent.

  3. Risk reporting is a Friday-night Excel scramble: parametric VaR in one workbook, historical VaR in another, stress tests in a third, and the CRO signs off blind.

02
// Pipelines you can build

Compose. Approve. Replay.

Every pipeline below is a shape you wire on the canvas using the crew and tools further down. Not a feature we ship for you, a pattern you configure.

P01

Score factor signal quality

Run IC, Sharpe with standard error, alpha-decay half-life, and parameter sensitivity on a candidate factor, applying Bonferroni or Benjamini-Hochberg correction across the tested grid. Static context holds the factor library and the desk's significance thresholds.
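As a rough illustration of two of the checks named above, here is a minimal Python sketch of a Sharpe ratio with its standard error and a Benjamini-Hochberg pass over a grid of p-values. The function names are hypothetical and not part of the Melaya bundle, and the standard error uses the textbook approximation for IID, roughly normal returns, which real strategy returns may violate.

```python
import math
from statistics import mean, stdev


def sharpe_with_se(returns):
    """Per-period Sharpe ratio and its standard error.

    Assumes IID, approximately normal returns: SE = sqrt((1 + SR^2 / 2) / n).
    """
    n = len(returns)
    sr = mean(returns) / stdev(returns)
    se = math.sqrt((1 + 0.5 * sr ** 2) / n)
    return sr, se


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one bool per p-value: True if rejected at FDR level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Find the largest rank k whose sorted p-value is <= alpha * k / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha * rank / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Stricter alternative: each test must clear alpha / m on its own."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values from an 8-point parameter grid.
grid_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
bh_flags = benjamini_hochberg(grid_p)
bonf_flags = bonferroni(grid_p)
```

On this made-up grid, Benjamini-Hochberg keeps the two smallest p-values and Bonferroni keeps only the smallest, which is the kind of gap the desk's significance thresholds would need to settle.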

P02

Decompose execution cost

Take the backtest fills and decompose cost into bid-ask spread, market impact, and Glosten-Milgrom adverse selection. The scoped database tool reads tick captures only; no broker writes are possible from this workflow.
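One common starting point for a per-fill cost breakdown is the effective-spread decomposition, sketched below in Python. This is a simpler two-way split than the pipeline describes and is not the Glosten-Milgrom model itself: the mid-price move over a fixed horizon is used as a rough proxy for impact and adverse selection. The `Fill` fields and the choice of horizon are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Fill:
    side: int           # +1 for a buy, -1 for a sell
    price: float        # execution price
    mid_at_fill: float  # quote midpoint when the fill occurred
    mid_after: float    # midpoint a fixed horizon later (horizon is an assumption)


def decompose_bps(fill):
    """Split one fill's cost, in basis points of the mid at fill time.

    effective half-spread = price impact + realized half-spread
    """
    scale = 1e4 / fill.mid_at_fill
    effective = fill.side * (fill.price - fill.mid_at_fill) * scale
    impact = fill.side * (fill.mid_after - fill.mid_at_fill) * scale
    realized = fill.side * (fill.price - fill.mid_after) * scale
    return {
        "effective_half_spread": effective,
        "price_impact": impact,
        "realized_half_spread": realized,
    }


# A buy at 100.05 against a 100.00 mid that drifts to 100.03 afterwards.
costs = decompose_bps(Fill(side=1, price=100.05, mid_at_fill=100.00, mid_after=100.03))
```

In this example the fill pays about 5 bps in total, of which about 3 bps is the price moving against the order and about 2 bps is retained by the liquidity provider.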

P03

Size with Kelly and correlation

Compute fractional Kelly at 0.25x to 0.5x against the live book using a Ledoit-Wolf shrunk correlation matrix. A HITL gate blocks the allocation change until the PM approves the size, the correlation set, and the volatility target.
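The sizing idea can be sketched for two strategies in plain Python. Full Kelly weights solve Σw = μ for expected excess returns μ and covariance Σ, and the fractional version scales them down. The shrinkage here blends Σ with a scaled-identity target at a fixed, hand-picked intensity `delta`; a real Ledoit-Wolf estimator would estimate that intensity from the data. All names are hypothetical.

```python
def shrink_covariance(cov, delta):
    """Blend cov with (average variance) * identity at a fixed intensity delta.

    Ledoit-Wolf would estimate delta from the sample; here it is an assumption.
    """
    n = len(cov)
    avg_var = sum(cov[i][i] for i in range(n)) / n
    return [
        [(1 - delta) * cov[i][j] + (delta * avg_var if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]


def kelly_weights_2x2(mu, cov):
    """Full Kelly weights for two assets: solve cov @ w = mu by Cramer's rule."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    return [(d * mu[0] - b * mu[1]) / det, (a * mu[1] - c * mu[0]) / det]


def fractional_kelly(mu, cov, fraction=0.25, delta=0.2):
    """Scale full Kelly on the shrunk covariance by a fraction in (0, 1]."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    full = kelly_weights_2x2(mu, shrink_covariance(cov, delta))
    return [fraction * w for w in full]


# Hypothetical annualized inputs: 8% and 5% excess return, 20% vol, 0.5 correlation.
weights = fractional_kelly(
    mu=[0.08, 0.05],
    cov=[[0.04, 0.02], [0.02, 0.04]],
    fraction=0.5,
    delta=0.5,
)
```

With these inputs the half-Kelly weights come out near 0.9 and 0.4. The PM approval gate described above would sit between this number and any allocation change.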

P04

Stress test and set hard limits

Run parametric VaR, historical VaR, CVaR, and Cornish-Fisher tail corrections, then replay the 2008, 2020, and 2022 stress books. Cross-run memory carries circuit-breaker thresholds from the prior review so the limits do not reset each Monday.
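For readers who want the four tail measures side by side, here is a minimal stdlib-only Python sketch. The function names and the historical-quantile convention (the k worst observations, with k = floor(tail × n)) are assumptions, and the Cornish-Fisher expansion is only reliable for moderate skew and kurtosis. Stress-book replay is not shown.

```python
from statistics import NormalDist, mean, pstdev


def cornish_fisher_z(z, skew, excess_kurtosis):
    """Adjust a Gaussian quantile z for skewness and excess kurtosis."""
    return (
        z
        + (z ** 2 - 1) * skew / 6
        + (z ** 3 - 3 * z) * excess_kurtosis / 24
        - (2 * z ** 3 - 5 * z) * skew ** 2 / 36
    )


def risk_summary(returns, tail=0.05):
    """One-period VaR and CVaR as positive loss numbers at the given tail probability."""
    n = len(returns)
    mu, sigma = mean(returns), pstdev(returns)
    skew = sum((r - mu) ** 3 for r in returns) / (n * sigma ** 3)
    exkurt = sum((r - mu) ** 4 for r in returns) / (n * sigma ** 4) - 3.0
    z = NormalDist().inv_cdf(tail)            # negative for a left tail
    worst = sorted(returns)[: max(1, int(tail * n))]
    return {
        "parametric_var": -(mu + z * sigma),
        "cornish_fisher_var": -(mu + cornish_fisher_z(z, skew, exkurt) * sigma),
        "historical_var": -worst[-1],          # loss at the edge of the tail
        "cvar": -mean(worst),                  # average loss inside the tail
    }


# Hypothetical sample: 95 quiet days and 5 bad ones.
sample = [-0.10, -0.08, -0.06, -0.05, -0.04] + [0.01] * 95
summary = risk_summary(sample, tail=0.05)
```

On this sample the historical VaR is the fifth-worst loss and the CVaR is the average of the five worst, so CVaR is always at least as large as historical VaR under this convention.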

P05

Run the skeptic and overfitting pass

Score Deflated Sharpe Ratio against the trials count, demand walk-forward validation, and refuse any single-regime backtest. The rag_retrieve tool pulls prior failed strategies so the same overfit pattern does not pass twice.
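The Deflated Sharpe Ratio compares the observed Sharpe against the best Sharpe you would expect from pure noise after a given number of trials. A minimal Python sketch, following the published Bailey and López de Prado formulation, is below. Inputs are per-period (not annualized) Sharpe ratios, `kurtosis` is the raw value (3 for a normal distribution), and the variance of Sharpe estimates across trials is something the desk would have to supply. The function names are hypothetical.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def expected_max_sharpe(n_trials, sharpe_variance):
    """Expected maximum Sharpe across n_trials independent zero-skill trials."""
    if n_trials < 2:
        raise ValueError("n_trials must be at least 2")
    norm = NormalDist()
    return math.sqrt(sharpe_variance) * (
        (1 - EULER_GAMMA) * norm.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * norm.inv_cdf(1 - 1 / (n_trials * math.e))
    )


def deflated_sharpe(sr, n_obs, n_trials, sharpe_variance, skew=0.0, kurtosis=3.0):
    """Probability that the true Sharpe exceeds the noise benchmark."""
    sr0 = expected_max_sharpe(n_trials, sharpe_variance)
    denom = math.sqrt(1 - skew * sr + (kurtosis - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * math.sqrt(n_obs - 1) / denom)


# Same backtest (daily Sharpe 0.1 over 1250 days), different trial counts.
dsr_10_trials = deflated_sharpe(0.1, 1250, 10, sharpe_variance=0.0004)
dsr_1000_trials = deflated_sharpe(0.1, 1250, 1000, sharpe_variance=0.0004)
```

The same backtest scores lower when it is the survivor of 1,000 trials than when it is one of 10, which is why the trials count has to travel with the Sharpe number.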

P06

Draft the deploy-or-kill memo

Synthesize the five specialist sections into a memo with an ACTION line of DEPLOY, PAPER_TRADE, BACKTEST_MORE, PAUSE, or KILL and exact recommended params. Replay captures every tool call so compliance can audit the verdict end to end.

03
// The crew

Quant firm crew

Real personas from the quant_firm crew. Each ships with a tuned system prompt and a default tool allowlist. Swap models per persona on the canvas.

Quant Analyst

QuantAnalyst

Runs signal-quality analysis on a strategy: IC, Sharpe with standard errors, parameter sensitivity, and Bonferroni or Benjamini-Hochberg correction across the tested combinations.

Execution Specialist

ExecutionSpecialist

Decomposes execution cost into bid-ask spread, market impact, and adverse selection, then quantifies the conditions under which slippage and latency wipe out the edge.

Portfolio Manager

PortfolioManager

Sizes the strategy with fractional Kelly and a Ledoit-Wolf shrunk correlation matrix, then decides whether it earns a slot next to the live book.

Chief Risk Officer

ChiefRiskOfficer

Computes parametric and historical VaR, CVaR, and Cornish-Fisher tail corrections, runs the 2008, 2020, and 2022 stress books, and sets hard circuit breakers.

Quant Skeptic

QuantSkeptic

Scores overfitting with the Deflated Sharpe Ratio, refuses any strategy without walk-forward validation, and issues an APPROVED, CONDITIONAL, or REJECTED verdict with the exact failure conditions.

Report Writer

ReportWriter

Distills the five specialist sections into one PM-ready memo with an ACTION line of DEPLOY, PAPER_TRADE, BACKTEST_MORE, PAUSE, or KILL, and exact recommended parameters.

04
// Scoped tools

Only the actions you grant.

Every tool below is a real shared tool from the Melaya bundle. Allowlist per agent; HITL-gate the writes; revoke any of them in one click.

shared/tools/finance/

Pull price history, technical indicators, FX rates, and macro series for factor research and out-of-sample regime tests. Read-only by design, no broker writes possible from this bundle.

alphavantage_stock_price, alphavantage_stock_history, alphavantage_stock_indicators, fx_rate, alphavantage_macro_data, alphavantage_sector_performance
shared/tools/knowledge/

Build the per-workflow knowledge store from strategy specs, prior tearsheets, post-mortems, and risk policy docs. Powers the three knowledge layers QuantAnalyst and QuantSkeptic call against. Writes to the workflow store only, never to the live trading system.

build_knowledge_from_text, build_knowledge_from_file
shared/tools/database/

Read backtest output, tick captures, and PnL attribution from your warehouse. The destructive sql_execute and sqlite_execute calls are HITL-gated and off by default for this crew.

sql_query, sql_schema, sql_export_csv, sqlite_query
shared/tools/aiml/

Extract tables from research PDFs, vendor tearsheets, and prior risk filings into structured inputs the QuantAnalyst can cite. No writes; reads only the files you point it at.

hf_summarize, hf_text_classify, pdf_to_text, pdf_extract_tables
shared/tools/data_utils/

Validate backtest CSV exports, summarize distributions before they hit the analyst, and hash result files so the audit log can prove the tearsheet under review is the same one the PM signed.

csv_lint, df_describe, json_validate, hash_file
shared/tools/msoffice/

Read PM-facing Excel risk books and emit the final deploy-or-kill memo as a Word doc the compliance archive can ingest. All writes are HITL-gated and stage to a draft folder, never overwriting the prior version.

excel_read_sheet, excel_write_data, word_create, word_add_paragraphs
shared/tools/messaging/

Drop the ReportWriter's deploy-or-kill memo into the desk channel once a human has approved it. HITL by default, with the same audit log as every other write.

discord_send_message, telegram_send_message
05
// Three knowledge layers

The crew reads what you give it.

Every pipeline ships with three layers of knowledge access. Mix and match per agent on the canvas. No shared vector space with another tenant, no surprise reads, no opaque retrieval.

L1

Static context

includeContext

Per-pipeline documents appended to specific agents' input on every run: the strategy spec, factor library, significance thresholds, or risk policy. Whatever needs to be there before the agent thinks. You pick which personas get which docs.

L2

RAG retrieval tool

rag_retrieve

A scoped tool granted per-agent. When the agent decides it needs more depth, it queries the workflow's vector store on demand. Same knowledge base as Static context, accessed only when the model asks for it.

L3

Cross-run memory

pipeline_memory

Pipeline-level state that carries from one run to the next. Yesterday's walk-forward results are in scope for today's regime test. The crew remembers which strategies it already reviewed, which verdicts were approved, and which circuit-breaker thresholds were set. The audit log is the second-order knowledge base.

07
// FAQ

Questions we get every week.

Will agents place live orders on their own?

No. The quant crew ships with HITL on every order, allocation change, and risk-limit edit. The agents research, backtest, score, and draft the deploy memo. A human PM presses the button on capital.

Can the agents reason over our backtests and tick data?

Three ways. Static context attaches the strategy spec, factor definitions, and risk policy to specific personas on every run. The rag_retrieve tool lets QuantAnalyst and QuantSkeptic pull from backtest logs, past tearsheets, and prior post-mortems on demand. Cross-run memory keeps yesterday's walk-forward results in scope for today's regime test.

Is this a QuantConnect or WorldQuant alternative?

No, it sits next to them. QuantConnect runs your backtest engine. WorldQuant hosts the factor competition. Melaya is the research-and-review layer that scores the signal, sizes the position, and stages the deploy memo. Your backtester stays your backtester.

How do we keep the analysis from sounding like a generic AI tearsheet?

Every metric cites its sample size and time window. QuantAnalyst reports Sharpe with its standard error, QuantSkeptic reports the Deflated Sharpe with the trials count, and ReportWriter refuses to soften a REJECTED verdict. Reviewers can require a citation on every claim as a HITL pre-check.

Which models can we run this crew on?

Any. Claude on QuantSkeptic where adversarial reasoning earns the cost, GPT on the ReportWriter, and a local Ollama on the QuantAnalyst when the strategy spec and tick data must stay inside your VPC. Each agent picks its own.

How fast can a quant team get the first review pipeline running?

With your backtest output sitting in S3 or a SQL warehouse, the signal-quality-to-deploy-memo workflow is a 4-node canvas: ingest results, run QuantAnalyst plus QuantSkeptic, gate on the verdict, draft the memo. Most desks ship it in a working session and review the first strategy the same day.

How does this handle regime change and walk-forward validation?

QuantSkeptic refuses any strategy without walk-forward validation and at least one out-of-sample regime. The CRO persona runs the 2008, 2020, and 2022 stress books on the proposed allocation. If either fails, ReportWriter cannot output a DEPLOY action by design.
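The gating rule in this answer can be pictured as a small guard in Python. This is a sketch of the logic only, not Melaya's implementation: the enum and function names are hypothetical, and treating a CONDITIONAL verdict as a failure is an assumption the source does not settle.

```python
from enum import Enum


class Action(Enum):
    DEPLOY = "DEPLOY"
    PAPER_TRADE = "PAPER_TRADE"
    BACKTEST_MORE = "BACKTEST_MORE"
    PAUSE = "PAUSE"
    KILL = "KILL"


class Verdict(Enum):
    APPROVED = "APPROVED"
    CONDITIONAL = "CONDITIONAL"
    REJECTED = "REJECTED"


def allowed_actions(skeptic_verdict, stress_passed):
    """Actions the memo may carry; DEPLOY needs both checks to pass.

    Assumption: only APPROVED counts as a skeptic pass.
    """
    if skeptic_verdict is Verdict.APPROVED and stress_passed:
        return set(Action)
    return set(Action) - {Action.DEPLOY}


# A rejected strategy can still be paused or killed, but never deployed.
options = allowed_actions(Verdict.REJECTED, stress_passed=True)
```

Putting the rule in the set of permitted outputs, rather than in the memo prompt, is one way to make "cannot output DEPLOY" a structural guarantee instead of a style preference.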

Can I audit exactly what the agent did and why?

Every run logs every step, every tool call, every model invocation, and every approval decision. Replay any run at any time. The audit log is the risk log, and it ships with the deploy memo into your compliance archive.

Can we restrict which agents can write to our broker or risk system?

Yes. Tool scoping is per-agent. The QuantSkeptic and QuantAnalyst run read-only against backtest stores. Only the PortfolioManager can stage an allocation change, and that change is HITL-gated by default. The CRO is the only persona allowed to edit circuit-breaker thresholds.
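Per-agent scoping with a human gate on writes can be modeled as a lookup plus one check. The Python sketch below reuses tool names listed on this page, but the data structure, the specific persona-to-tool assignments, and the `authorize` function are all hypothetical and do not describe Melaya's actual configuration format.

```python
# Tools that change state outside the workflow and therefore need approval
# (assumed classification for illustration).
HITL_GATED = {"excel_write_data", "word_create", "word_add_paragraphs",
              "discord_send_message", "telegram_send_message"}

# Hypothetical allowlist: which persona may call which tool.
ALLOWLIST = {
    "QuantAnalyst": {"sql_query", "sql_schema", "rag_retrieve"},
    "QuantSkeptic": {"sql_query", "rag_retrieve"},
    "ReportWriter": {"word_create", "word_add_paragraphs", "discord_send_message"},
}


def authorize(agent, tool, human_approved=False):
    """Return 'denied', 'pending_approval', or 'allowed' for one tool call."""
    if tool not in ALLOWLIST.get(agent, set()):
        return "denied"
    if tool in HITL_GATED and not human_approved:
        return "pending_approval"
    return "allowed"


read_call = authorize("QuantSkeptic", "sql_query")
blocked_write = authorize("QuantSkeptic", "word_create")
staged_memo = authorize("ReportWriter", "word_create")
approved_memo = authorize("ReportWriter", "word_create", human_approved=True)
```

Revoking a tool is then a matter of removing one entry from one persona's set, and an unknown persona gets nothing by default.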

Build quant research and trading pipelines on Melaya.

Sandbox tier is free with no card. Join the waitlist and we will email you the moment a slot opens.
