Melaya — Build AI agents for any job. Agentic platform for research, ops, outreach, reporting — and the only one where agents can actually trade.

// USE CASE · ENGINEERING

Engineering agents that review every diff, and pause before they merge.

Your senior engineers spend half their week reviewing other people's code, prepping audits, and writing post-mortems instead of shipping. Melaya lets you build a ten-persona engineering crew that reviews PRs, runs the OWASP checklist, triages hotspots, drafts RFCs, and stages every comment, ticket, and config change for one-click approval. The crew does the prep, your staff engineers keep the judgment call, and the replay log keeps the receipts.

01
// What breaks today

The status quo costs more than the agent does.

Three pains every engineering team hits weekly. Each one is what your engineers actually complain about, not what a feature page would call them.

  1. Senior engineers spend ten to fifteen hours a week on review queues and on-call triage instead of shipping the roadmap.

  2. Security audits and SOC 2 windows stall for weeks because nobody has time to map controls, threat-model the diff, or pull evidence from CI logs.

  3. A 2 a.m. incident burns a sprint of context because the post-mortem is written from memory three days later and the next on-call repeats the same mistake.

02
// Pipelines you can build

Compose. Approve. Replay.

Every pipeline below is a shape you wire on the canvas using the crew and tools further down. Not a feature we ship for you, a pattern you configure.

P01

Review pull requests by file glob

On PR open, route Rust and Python files to RustPythonEngineer, React files to FrontendEngineer, and contracts to SmartContractExpert. Each persona drafts inline comments grounded in Static context, and the human-in-the-loop (HITL) gate blocks posting until a maintainer approves.
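
To make the routing idea concrete, here is a minimal Python sketch of glob-based routing with an approval gate. It is not Melaya's configuration format: the glob patterns, the `ROUTES` table, and the `route_files` and `post_comments` helpers are all assumptions made for illustration. Only the persona names come from this page.

```python
from fnmatch import fnmatchcase

# Hypothetical routing table. The glob patterns are illustrative assumptions;
# the persona names are the ones listed in the crew section.
ROUTES = [
    ("*.rs", "RustPythonEngineer"),
    ("*.py", "RustPythonEngineer"),
    ("*.tsx", "FrontendEngineer"),
    ("*.jsx", "FrontendEngineer"),
    ("*.sol", "SmartContractExpert"),
]


def route_files(changed_files):
    """Group changed file paths under the first persona whose glob matches."""
    assignments = {}
    for path in changed_files:
        for pattern, persona in ROUTES:
            if fnmatchcase(path, pattern):
                assignments.setdefault(persona, []).append(path)
                break  # first match wins; unmatched files are left unassigned
    return assignments


def post_comments(drafts, approved_by=None):
    """Release drafted comments only once a maintainer has approved them."""
    if approved_by is None:
        return []  # gate closed: nothing is posted
    return list(drafts)
```

Files that match no pattern, such as a README, are simply not assigned to anyone in this sketch; a real pipeline would probably want an explicit fallback reviewer.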

P02

Prep the security audit window

Run the ten-item OWASP checklist across the diff and the last 90 days of merges. SecurityAuditor cites CWE numbers and writes the attack scenario, then the rag_retrieve tool pulls matching evidence from CI logs into one audit packet.

P03

Triage performance hotspots weekly

PerformanceExpert applies the USE method to each service, HFTQuantDev ranks fixes by expected gain and effort, and Cross-run memory carries last week's hotspot list so resolved items drop off and regressions surface at the top.
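
One way to picture the ranking and carry-over is the sketch below. It assumes a gain-per-effort score and a set of previously resolved hotspot names held in memory; the `Hotspot` fields, the scoring rule, and the `triage` helper are hypothetical and not taken from the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotspot:
    name: str
    expected_gain: float  # assumed unit: estimated latency saved, in ms
    effort_days: float    # assumed to be greater than zero


def rank(hotspots):
    """Order hotspots by expected gain per day of effort, highest first."""
    return sorted(hotspots, key=lambda h: h.expected_gain / h.effort_days, reverse=True)


def triage(current, previously_resolved):
    """Put regressions first, then the remaining hotspots by rank.

    A regression is a hotspot that shows up this week even though memory
    says it was resolved earlier. Items that stay resolved never appear in
    `current`, so they drop off the list on their own.
    """
    regressions = [h for h in current if h.name in previously_resolved]
    others = [h for h in current if h.name not in previously_resolved]
    return rank(regressions) + rank(others)
```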

P04

Draft RFCs from a one-line brief

Given a one-line problem statement, TechLead drafts the RFC with context, options, tradeoffs, and a recommendation. The rag_retrieve tool pulls precedent from past ADRs in the knowledge store so the proposal matches house style.

P05

Plan a framework or schema migration

BackendEngineer maps every caller of the old API, RustPythonEngineer drafts the codemod, and TechLead slices the work into ticketed phases. Scoped repo tools stay read-only, so the migration plan is written before a single file is touched.

P06

Write post-incident reviews on close

When the incident channel resolves, DevOpsEngineer pulls logs and timeline, TechLead drafts the five-whys and action items, and replay lets the on-call walk every agent step back to a tool call. The Notion write is HITL gated.

03
// The crew

Engineering & Tech crew

Real personas from the tech_team crew. Each ships with a tuned system prompt and a default tool allowlist. Swap models per persona on the canvas.

Tech Lead

TechLead

Synthesizes findings from every specialist into a sprint plan with named owners, day-level deadlines, and measurable success criteria.

Rust Python Engineer

RustPythonEngineer

Reviews Rust and Python code for unwrap risk, hot-loop allocations, FFI overhead, and tokio blocking calls that would crash a 24/7 service.

Backend Engineer

BackendEngineer

Audits REST and WebSocket integrations, retry logic, decimal precision, and reconciliation coverage so state never drifts silently.

Security Auditor

SecurityAuditor

Runs the ten-item OWASP checklist against the diff, flags CWE numbers, and writes the step-by-step attack path before recommending a fix.

Performance Expert

PerformanceExpert

Applies the USE method to every component, estimates p50, p95, p99, and ranks the three highest-ROI hotspots with expected gain and measurement method.

HFT Quant Dev

HFTQuantDev

Profiles the signal-to-order path against a documented latency budget and proposes ranked optimizations with effort and verification steps.

DevOps Engineer

DevOpsEngineer

Scores the SRE posture across CI/CD, secrets, observability, DR, and alerts, naming the runbook gaps that would extend the next outage.

Frontend Engineer

FrontendEngineer

Reviews React components for stale closures, missing memoization, WebSocket leaks, and bundle bloat against fixed performance budgets.

UI UX Designer

UIUXDesigner

Audits trader-facing screens for data-ink ratio, fast-scan effectiveness, keyboard coverage, and WCAG AA contrast.

Smart Contract Expert

SmartContractExpert

Audits onchain interactions across EVM, Cosmos, Solana, NEAR, and Sui, scoring reentrancy, oracle, MEV, and bridge risk in audit-firm format.

04
// Scoped tools

Only the actions you grant.

Every tool below is a real shared tool from the Melaya bundle. Allowlist per agent; HITL-gate the writes; revoke any of them in one click.

shared/tools/core/

Read-only access to the repo so RustPythonEngineer and FrontendEngineer can pull the diff, blame a line, and grep for patterns. No writes, so nothing ships without a separate HITL-gated step.

git_status, git_diff, git_log, git_show, git_blame, grep_search, glob_search, file_read

shared/tools/gitlab_public_tools/

Pull merge requests, project metadata, and file contents from GitLab so the review crew works against the real diff. Read-only by design; comments and approvals route through a separate HITL gate.

gitlab_list_merge_requests, gitlab_project_info, gitlab_repo_file, gitlab_list_issues, gitlab_repo_tree

shared/tools/codeberg_tools/

Same review surface for teams on Codeberg or self-hosted Gitea. Reads only. The agent drafts the review, an engineer posts it.

codeberg_list_pulls, codeberg_repo_info, codeberg_repo_file, codeberg_list_issues

shared/tools/package_intel_tools/

Resolve dependency metadata, last-publish date, and download counts so SecurityAuditor can flag stale or abandoned packages in the diff. Read-only fetches against public registries.

npm_package_info, pypi_package_info, crates_package_info, npm_downloads, pypi_downloads

shared/tools/devops/

Read cluster state, pull pod logs, and stage manifest changes for DevOpsEngineer's runbooks. k8s_apply and aws_cli writes are HITL gated by default so no rollout happens without an SRE approving.

aws_cli, k8s_get, k8s_logs, k8s_apply, docker_ps, docker_logs

shared/tools/project_mgmt/

File the action items TechLead produced as Jira or Linear tickets with owners and deadlines, and drop the post-mortem into Notion. Every create call is HITL gated so titles and assignees are reviewed before the ticket exists.

jira_create_issue, linear_create_issue, notion_create_page, linear_create_comment, notion_search

shared/tools/knowledge/

Build the per-workflow knowledge store from ADRs, past post-mortems, coding standards, and security playbooks. Feeds all three knowledge layers for every persona on the crew.

build_knowledge_from_text, build_knowledge_from_file

shared/tools/messaging/

Push the synthesized review summary or incident timeline to the on-call channel. Sends are HITL gated so the wording is approved before the room sees it.

discord_send_message, telegram_send_message

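
The allowlist-plus-gate model described in this section can be sketched in a few lines of Python. The tool names are the ones listed above; the `AgentPolicy` class, the `authorize` function, and the exact set of tools treated as writes are assumptions for illustration, not Melaya's actual API.

```python
from dataclasses import dataclass, field

# Tools this sketch treats as writes, based on the descriptions above
# (applies, creates, and sends). The exact set is an assumption.
WRITE_TOOLS = {
    "aws_cli", "k8s_apply",
    "jira_create_issue", "linear_create_issue",
    "notion_create_page", "linear_create_comment",
    "discord_send_message", "telegram_send_message",
}


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: a persona name plus the tools it may call."""
    name: str
    allowlist: set = field(default_factory=set)


def authorize(policy, tool, approved=False):
    """Classify a tool call as 'denied', 'pending_approval', or 'allowed'."""
    if tool not in policy.allowlist:
        return "denied"  # never granted, or revoked
    if tool in WRITE_TOOLS and not approved:
        return "pending_approval"  # staged until a human approves
    return "allowed"
```

Revoking a tool in this sketch is just removing its name from `allowlist`, after which every call to it is denied.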
05
// Three knowledge layers

The crew reads what you give it.

Every pipeline ships with three layers of knowledge access. Mix and match per agent on the canvas. No shared vector space with another tenant, no surprise reads, no opaque retrieval.

L1

Static context

includeContext

Per-pipeline documents appended to specific agents' input on every run. The architecture diagram, coding standards, on-call playbook, or security playbook. Whatever needs to be there before the agent thinks. You pick which personas get which docs.

L2

RAG retrieval tool

rag_retrieve

A scoped tool granted per-agent. When the agent decides it needs more depth, it queries the workflow's vector store on demand. Same knowledge base as Static context, accessed only when the model asks for it.

L3

Cross-run memory

pipeline_memory

Pipeline-level state that carries from one run to the next. Last week's hotspot triage is in scope for this week's follow-up. The crew remembers what it already reviewed, what got approved, what was posted. The audit log is the second-order knowledge base.
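
As a rough mental model of how the three layers differ, consider the Python sketch below. Static context and cross-run memory are assembled into the agent's input before it runs, while retrieval happens only on request. The function names are hypothetical, and `naive_retrieve` uses simple word overlap as a stand-in; it is not how the real rag_retrieve tool or its vector store works.

```python
def build_agent_input(task, static_docs, memory):
    """Assemble what the agent sees up front: L1 documents, L3 state, then the task."""
    parts = [f"[context] {doc}" for doc in static_docs]
    parts += [f"[memory] {key}: {value}" for key, value in sorted(memory.items())]
    parts.append(f"[task] {task}")
    return "\n".join(parts)


def naive_retrieve(store, query, k=2):
    """Toy stand-in for L2: return the k documents sharing the most words with the query."""
    words = set(query.lower().split())
    ranked = sorted(
        store,
        key=lambda doc: len(words & set(doc.lower().split())),
        reverse=True,
    )
    return ranked[:k]
```

The design point is the timing: L1 and L3 cost tokens on every run, while L2 costs nothing until the agent decides to call it.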

07
// FAQ

Questions we get every week.

Will the agents merge code or push to production on their own?

No. Every git write, CI trigger, infra apply, and ticket transition is HITL gated by default. The crew writes the review, the diff suggestion, or the runbook draft; an engineer clicks approve before anything lands. You can lift the gate per-template once a workflow has proven itself.

Can the agents reason over our codebase and incident history?

Three layers. Static context attaches your architecture diagram, coding standards, and on-call playbook to specific personas on every run. The rag_retrieve tool lets BackendEngineer or PerformanceExpert pull from indexed code, ADRs, and post-mortems on demand. Cross-run memory means last week's hotspot triage is in scope for this week's follow-up.

Is this a Devin alternative or a Cody alternative?

Closer to a review and triage layer than a code-writing autopilot. Unlike Devin you keep an engineer in the loop on every commit, and unlike Cody or Codium each persona ships with a specialist prompt and a scoped toolkit. The Sweep-style auto-PR is one template among many, not the only workflow.

How do we keep the review comments from sounding generic?

Findings cite file and line, name a CWE or SWC number where it applies, and pull phrasing from your own ADRs and past PR comments loaded into the knowledge store. SecurityAuditor refuses to ship a finding without an attack scenario and a concrete remediation.

Which models can we run this crew on?

Any. Claude on TechLead and SmartContractExpert where reasoning depth earns the cost, GPT on the drafting personas, a local Ollama on RustPythonEngineer when source must stay on a private network. Each agent picks its own provider per template.

How fast can an engineering team get the first pipeline running?

With a Git connector and Slack authorized, the PR review pipeline is a 4-node canvas: fetch diff, route by file glob, run RustPythonEngineer or FrontendEngineer, post comment. Most teams ship it in a single working session and see their first reviewed PR the same day.

How do we keep the agents from leaking source to a third party model?

Tool scopes restrict reads to allow-listed repos and branches, and each persona's provider is pinned per template. Route sensitive workflows to a local Ollama and the code never leaves your network. The audit log shows which model saw which file.
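
The combination of scoped reads, pinned providers, and an audit trail might look like the following Python sketch. Every name here (`Scope`, `pick_provider`, `read_file`, the `"ollama-local"` label, the repo names in the usage) is a hypothetical illustration of the idea, not Melaya's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scope:
    """Hypothetical read scope: the repos and branches an agent may read."""
    repos: frozenset
    branches: frozenset


def can_read(scope, repo, branch):
    return repo in scope.repos and branch in scope.branches


def pick_provider(persona, pins, sensitive, local_provider="ollama-local"):
    """Sensitive workflows always use the local provider; others use the persona's pin."""
    return local_provider if sensitive else pins[persona]


def read_file(persona, scope, pins, repo, branch, path, sensitive, audit_log):
    """Check the scope, choose the provider, and record which model saw which file."""
    if not can_read(scope, repo, branch):
        raise PermissionError(f"{repo}@{branch} is outside the allowlist")
    provider = pick_provider(persona, pins, sensitive)
    audit_log.append({"persona": persona, "provider": provider, "file": f"{repo}:{path}"})
    return provider
```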

Can I audit exactly what the agent did and why?

Every run logs every step, every tool call, every model invocation, and every approval decision. Replay any run at any time. The audit log is the change log a compliance reviewer can read end to end.

Build engineering and tech team pipelines on Melaya.

Sandbox tier is free with no card. Join the waitlist and we will email you the moment a slot opens.
