Analysis & Commentary
Dhyaneesh DS
March 2026
Field Report No. 2
Agentic Development

The Autonomous Development Stack: Two Competing Visions

Agent Orchestrator from Composio and Symphony from OpenAI have independently diagnosed the same problem with how AI-assisted development works. Their prescriptions reveal a genuine philosophical split about who — or what — should be in control.

01 /

The Bottleneck Moved, It Didn't Disappear

There is an uncomfortable transition happening in how software gets built. AI didn't remove the developer as the central bottleneck. It moved the bottleneck up one abstraction layer. You're no longer writing every line of code. You're spawning agents, checking whether they're stuck, reading CI logs, forwarding review comments, tracking which branch is doing what, and cleaning up afterward. You've become a human orchestrator.

This doesn't scale. And two projects — Agent Orchestrator (AO) from Composio and Symphony from OpenAI — have independently arrived at the same diagnosis. But their prescriptions are radically different, and understanding why reveals a genuine philosophical split about what software development should look like in a world where agents can actually write code.

The "one agent, one terminal" workflow is a dead end. It feels powerful when you're in it: you prompt, the agent writes, you iterate. But the moment you need to handle more than one issue at a time, the model breaks down. You become the router. You track context across branches. You manually forward CI output back to the agent. None of this is in your job description.

Both AO and Symphony attack this with the same architectural idea: parallel agents running in isolated workspaces, each handling one issue, with automated feedback loops that route CI failures and review comments back to the right agent without human intervention. The developer only enters the picture when genuine judgment is required.

"The question both tools answer differently is more fundamental than architecture: who is actually in control, you or the system?"

Fig. 01 The shared architectural foundation — where both tools agree
01
Issue appears in tracker
02
Agent spawned in isolated workspace
03
Code written, PR opened
04
CI / review feedback routed back
05
Human enters only here

Both AO and Symphony agree on this loop

The architectural foundation is identical. The split comes at step 5 — and in everything that determines what "human enters" actually means in practice.
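The shared loop reduces to a small routing decision: which events close automatically, and which one reaches a person. A minimal sketch, with the tracker, agents, and feedback router collapsed into plain functions (every name here is illustrative, not either tool's actual API):

```typescript
// Illustrative sketch of the loop both tools share. None of these names
// come from AO or Symphony; they stand in for the five steps in Fig. 01.

type Event =
  | { kind: "issue-opened"; issueId: string }
  | { kind: "ci-failed"; issueId: string; log: string }
  | { kind: "changes-requested"; issueId: string; comment: string }
  | { kind: "approved-and-green"; issueId: string };

type Action =
  | { to: "agent"; issueId: string; payload: string }  // automated feedback
  | { to: "human"; issueId: string; payload: string }; // judgment required

// Only step 5 reaches a person; everything else is routed straight
// back to the agent that owns the issue.
function route(event: Event): Action {
  switch (event.kind) {
    case "issue-opened":
      return { to: "agent", issueId: event.issueId, payload: "start work" };
    case "ci-failed":
      return { to: "agent", issueId: event.issueId, payload: event.log };
    case "changes-requested":
      return { to: "agent", issueId: event.issueId, payload: event.comment };
    case "approved-and-green":
      return { to: "human", issueId: event.issueId, payload: "ready to merge" };
  }
}
```

Where the two tools diverge is in how that final `"human"` branch is configured, and whether it exists at all.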

02 /

AO: The Developer Stays in the Loop

AO ships as a real, installable tool today. One command — ao start https://github.com/your-org/your-repo — clones the repo, generates a config, launches a dashboard at localhost:3000, and starts an orchestrator agent that manages everything else.

The orchestrator spawns worker agents per issue, each in its own Git worktree with its own branch. Each agent reads code, writes tests, opens PRs. When CI fails, the agent gets the logs and fixes it. When a reviewer leaves comments, the agent addresses them. The loop is tight and the feedback is automatic.
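Per-issue isolation maps naturally onto `git worktree`. A hypothetical helper deriving the branch and workspace path for an issue might look like the following; the naming scheme is invented for illustration, not AO's actual convention:

```typescript
// Hypothetical naming scheme for per-issue isolation; AO's real
// conventions may differ. The point: one branch + one worktree per issue.
function workspaceFor(repoRoot: string, issueId: string) {
  const slug = issueId.toLowerCase().replace(/[^a-z0-9]+/g, "-");
  return {
    branch: `agent/${slug}`,
    path: `${repoRoot}/.worktrees/${slug}`,
    // The git command an orchestrator would shell out to:
    command: `git worktree add ${repoRoot}/.worktrees/${slug} -b agent/${slug}`,
  };
}
```

Because each worktree shares the repository's object store but has its own checkout and branch, agents can work in parallel without stepping on each other's files.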

The configuration is explicit and overridable. The agent-orchestrator.yaml file gives you fine-grained control over every reaction:

reactions:
  ci-failed:
    auto: true
    action: send-to-agent
    retries: 2
  changes-requested:
    auto: true
    action: send-to-agent
    escalateAfter: 30m
  approved-and-green:
    auto: false  # flip to true for auto-merge
    action: notify
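The shape of that config makes the control model easy to see: each reaction is a small policy object, and the orchestrator's decision reduces to a lookup. A sketch, with TypeScript types inferred from the YAML above rather than taken from AO's source:

```typescript
// Types inferred from the agent-orchestrator.yaml shape above;
// AO's real internals may differ.
interface Reaction {
  auto: boolean;
  action: "send-to-agent" | "notify";
  retries?: number;
  escalateAfter?: string; // e.g. "30m"
}

const reactions: Record<string, Reaction> = {
  "ci-failed":          { auto: true,  action: "send-to-agent", retries: 2 },
  "changes-requested":  { auto: true,  action: "send-to-agent", escalateAfter: "30m" },
  "approved-and-green": { auto: false, action: "notify" }, // flip auto for auto-merge
};

// auto: true closes the loop without a human; auto: false stops the
// event at a notification and waits for judgment.
function decide(event: string): "agent" | "human" | "unknown" {
  const r = reactions[event];
  if (!r) return "unknown";
  return r.auto && r.action === "send-to-agent" ? "agent" : "human";
}
```

Every dial a human might want to turn is visible in one file, which is the point of the design.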

The plugin architecture is worth studying. Seven abstraction slots — runtime, agent, workspace, tracker, SCM, notifier, and terminal — each independently swappable. You can run agents in tmux or as detached local processes. You can use Claude Code, Codex, Aider, or OpenCode. You can connect to GitHub or Linear. Every interface is defined in TypeScript; a plugin implements one interface and exports a PluginModule.
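To make the slot idea concrete, here is what a notifier plugin could look like. The `Notifier` interface and `PluginModule` shape below are guesses at the pattern the article describes, not AO's published types:

```typescript
// Hypothetical interfaces sketching the plugin pattern; AO defines
// its own TypeScript versions of these.
interface Notifier {
  notify(message: string): void;
}

interface PluginModule {
  slot: "runtime" | "agent" | "workspace" | "tracker" | "scm" | "notifier" | "terminal";
  create(): unknown;
}

// A trivial in-memory notifier filling the "notifier" slot.
class MemoryNotifier implements Notifier {
  readonly messages: string[] = [];
  notify(message: string): void {
    this.messages.push(message);
  }
}

// What a plugin would export: which slot it fills, and a factory.
const plugin: PluginModule = {
  slot: "notifier",
  create: () => new MemoryNotifier(),
};
```

Swapping Slack for the in-memory notifier, or Linear for GitHub in the tracker slot, would mean shipping a different module against the same interface, with no changes to the orchestrator core.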

The underlying philosophy is explicit. AO inherits what you might call the Claude Code worldview: agents are powerful but still unreliable, humans need visibility and override capability, workflows are messy and evolving. You want to augment your process, not replace it. When things break, they break loudly. CI failures surface on the dashboard. Stuck agents are visible. You intervene, fix, and continue.

03 /

Symphony: The System Replaces You

Symphony is not a product. It is a specification — a language-agnostic document for building a machine that does software development autonomously. There is no code to install. The README's primary instruction is: tell your favorite coding agent to implement it.

There is an experimental Elixir reference implementation, but the point is the spec itself. You read it, you build your own Symphony in whatever language fits your stack, and you get exactly the system your team needs, nothing more, nothing less.

The execution model is a tight loop: poll an issue tracker for work, dispatch agents into isolated workspaces, validate via CI and checks, report results, repeat. No dashboard required. No hand-holding. The service is a long-running daemon that reads work from Linear, creates per-issue workspaces, and runs Codex app-server sessions inside them indefinitely, without human intervention.

The WORKFLOW.md contract is where the intelligence lives — version-controlled alongside your code, defining which tracker states trigger dispatch, what the agent prompt looks like, concurrency limits, retry behavior, sandbox policy.

tracker:
  kind: linear
  project_slug: my-project
  active_states: [Todo, In Progress]

agent:
  max_concurrent_agents: 10
  max_turns: 20
  max_retry_backoff_ms: 300000

codex:
  command: codex app-server
  turn_timeout_ms: 3600000
  stall_timeout_ms: 300000
---
You are working on {{ issue.identifier }}: {{ issue.title }}

{{ issue.description }}

{% if attempt %}
This is retry attempt {{ attempt }}. Review what was already tried.
{% endif %}
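The prompt section is a template over issue fields and retry state. A toy renderer covering only the two constructs shown above (Symphony's spec presumably mandates a real templating engine; this just illustrates how retries gain context) might be:

```typescript
interface Issue {
  identifier: string;
  title: string;
  description: string;
}

// Toy renderer for just the placeholders used in the WORKFLOW.md excerpt;
// not the templating engine Symphony actually specifies.
function renderPrompt(issue: Issue, attempt?: number): string {
  let out = `You are working on ${issue.identifier}: ${issue.title}\n\n${issue.description}\n`;
  if (attempt) {
    // On a retry, the prompt tells the agent to inspect prior attempts.
    out += `\nThis is retry attempt ${attempt}. Review what was already tried.\n`;
  }
  return out;
}
```

The retry branch matters: each re-dispatch gets a prompt that acknowledges the failed attempt rather than starting cold.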

The orchestration state machine is sophisticated. Issues move through internal states — Unclaimed, Claimed, Running, RetryQueued, and Released — independent of their tracker states. Retry backoff is exponential. Stall detection kills agents that haven't produced output and queues them for retry. The philosophy is pure autonomous delegation: stop interacting with agents, start managing work.
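The backoff formula, as given in the article's state-machine figure, is delay = min(10000 × 2^(n−1), max_ms), where n is the retry attempt and max_ms comes from `max_retry_backoff_ms`. As a function:

```typescript
// Exponential retry backoff from the state-machine figure:
// delay = min(10000 * 2^(n-1), maxMs), n = retry attempt number.
// The 300_000 default mirrors max_retry_backoff_ms in the config above.
function retryDelayMs(attempt: number, maxMs: number = 300_000): number {
  return Math.min(10_000 * 2 ** (attempt - 1), maxMs);
}
```

So attempt 1 waits 10 seconds, attempt 3 waits 40 seconds, and by attempt 6 the delay is pinned at the five-minute cap.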

04 /

The Philosophical Divide

These two approaches represent a genuine disagreement, rooted in how Claude Code and Codex have diverged as interaction paradigms.

Claude Code is a tool to be wielded. It lives in your terminal and IDE, acts like a senior developer who understands your codebase, asks clarifying questions, and produces code meant to be maintained long-term. The philosophy is methodical. CLAUDE.md files persist project context. Custom hooks fire on specific events.

Codex is an employee to be managed. You write a well-specified task, Codex executes it in a cloud sandbox, you review results. The philosophy is rapid iteration. Ambiguous prompts produce ambiguous results, which is why harness engineering matters so much. The quality of your specification directly determines the quality of the output.

Fig. 02 Two philosophies — what each tool believes about the developer's role
Agent Orchestrator
The developer stays in the loop
Agents are powerful but unreliable. Humans need visibility and override capability. You want to augment your process, not replace it.
Developer role: Supervisor
Control surface: Dashboard + YAML
Agent model: Claude Code
Setup: ao start — today
Failure mode: Loud — dashboard alerts
The off-the-rack suit. Fits most people well. Take it home today.
Symphony
The system replaces you
Stop interacting with agents. Start managing work. You define the system once, precisely, and then get out of the way.
Developer role: Systems designer
Control surface: WORKFLOW.md spec
Agent model: Codex app-server
Setup: Build the harness first
Failure mode: Silent — billing dashboard
The exacting bespoke tailor. Perfect when it works. May not work at all.

The two tools have forked at the level of assumptions about the developer's role — not just at the implementation level. This is why choosing between them is a philosophical decision, not a technical one.

Fig. 03 Head-to-head comparison
DIMENSION            AGENT ORCHESTRATOR           SYMPHONY
Developer role       Supervisor                   Systems designer
Control surface      Dashboard + YAML             WORKFLOW.md spec
Agent philosophy     Claude Code                  Codex
Code shipped         Yes — npm package            No — spec only
Failure visibility   High — dashboard + alerts    Depends on your spec
Setup cost           Low — ao start               High — build the harness

AO and Symphony differ on nearly every practical dimension while sharing the same architectural vision. The right choice is determined by your team's maturity with harness engineering, not by which is technically superior.

05 /

Symphony's Silent Failure

During real-world testing on the ragas repository, both tools were run against the same set of issues. AO completed the workflow as expected. Symphony hit a specific failure: the agent couldn't send a completion signal back to Linear after finishing its work.

What happened next is instructive. There was no alert. No stop condition fired. The orchestrator, seeing the issue still in an active state because the agent failed to transition it, concluded that work remained and scheduled another turn. Which also failed to send the completion signal. Which triggered another retry. The system sat in a loop, burning tokens, with no external indication that anything was wrong.

"AO fails in ways humans can see. Symphony can fail in ways that are only visible on your cloud billing dashboard."

This is not a bug in Symphony's design. It is an inherent property of fully autonomous systems. The spec is explicit about trust boundary assumptions: implementations must define their own posture on approval policies, sandboxing, and operator-confirmation requirements. It warns extensively about harness hardening. But it cannot write those stop conditions for you.

Symphony's spec does include stall detection. If stall_timeout_ms elapses without agent output, the worker is killed and retried. But a stall on the completion signal does not look like a stall to the orchestrator. The agent finished its turn. The issue is still active. The system behaves exactly as designed, indefinitely.

Fig. 04 Symphony orchestration state machine — and where the silent loop occurs

Happy path: signal sent successfully to Linear

Unclaimed: waiting for work
Claimed: assigned to agent
Running: agent working
  → problem path: RetryQueued (completion signal failed)
  ↻ retry loop: delay = min(10000 × 2^(n−1), max_ms)
Released: work complete

The silent failure: no external signal that this loop is running. The orchestrator behaves exactly as designed. The only evidence is your cloud billing dashboard.

The silent failure loop: the agent completes its turn but can't send the completion signal to Linear. The orchestrator sees the issue still active, schedules another turn. Repeat indefinitely, burning tokens.

06 /

When to Use Each

The choice between AO and Symphony is not a performance comparison. It's a question of where your team sits on the control-versus-autonomy spectrum, and how much engineering time you're willing to invest before getting value.

Fig. 05 Decision framework — which tool fits your situation

Use Agent Orchestrator when…

  • You want results this week, not after harness engineering
  • Your team uses Claude Code or wants agent-agnostic flexibility
  • Your workflows are still evolving and absorbing changes
  • You need visibility into what agents are doing and why
  • You are doing production development, not experiments
Correct default for almost every team doing AI-augmented development today.

Use Symphony when…

  • You've bought into the Codex harness engineering model
  • You can write precise, stable task specifications
  • You're willing to invest real engineering time upfront
  • You have robust monitoring independent of Symphony
  • You're starting a new project, not retrofitting an old one
A bet on how fast agent trust gets established — valuable if you need a perfectly tailored system.

AO is the safe default. Symphony is the right choice when you have the engineering discipline to implement it correctly and the monitoring infrastructure to know when it's failing.

07 /

The Road and the Destination

Both tools point at the same trajectory. Developer work is moving from writing code to designing systems that produce code. That shift is not gradual, and the architectural decisions you make now about how agents interact with your codebase, your tracker, and your CI system will compound over time.

AO lets you ease into that shift. You start with ao start, get agents working on real issues, and incrementally automate more of the feedback loop. The human stays in the loop, supervising at higher and higher levels of abstraction. Symphony demands the shift upfront — you cannot run it without committing to harness engineering, to the idea that your entire development workflow can be specified precisely enough for a machine to execute it without supervision.

There is one thing the tailor analogy misses. The off-the-rack suit can be tailored. AO's plugin architecture and explicit YAML configuration are the same ingredients Symphony asks you to build from scratch. Every reaction that currently reads auto: false is a dial you can turn. The scaffold for a Symphony-style workflow already exists inside AO; it's just not switched on by default.

"Symphony describes a destination. AO could build a road that leads there gracefully."

If AO can build a system in which the orchestrator gradually learns the project's larger intent, inferring the purpose behind the changes it makes and refining its own spec accordingly, then it becomes a product that can guide developers gently through today's turbulent AI agent landscape.

The question Symphony forces you to sit with is worth asking regardless of which tool you use. If you had to specify your entire development workflow precisely enough for a machine to execute it without any human intervention, could you? If the answer is no, that gap is worth closing. Not because Symphony is ready for production everywhere, but because closing the gap is the work.