Andrew Crookston. @acr
Updated May 19, 2026 · 17 essays · Stockholm
The thesis

From prompting to orchestrating.

AI coding is reorganising software work along a four-stage progression. The models are good enough. The infrastructure isn't.

By Andrew Crookston · Updated May 2026 · 10 min read

AI models are excellent at coding now, and getting better every month. The limit has moved. What surrounds them — how we prompt, structure, verify, and scale the work — is now the harness that decides whether AI delivers anything useful.

This page is my thesis on where that work has landed and where it's going next. Four clear stages of adoption have emerged: prompting, pair programming, delegating, orchestrating. Each is a real shift in how engineering teams work with AI.

Where are you, and where is your team?

What's happening at scale
42% of committed code is now AI-generated or assisted. Projected 65% by 2027. (Sonar, 2026)
96% of developers don't fully trust AI-generated code. Only 48% always verify before committing. (Sonar, 2026)
73% of engineering teams have no standardised templates or golden paths. (Harness, 2026)
51% of frequent AI users report more quality problems than before. Not fewer. (Sonar, 2026)

Adoption is mainstream. Trust isn't. The gap is the story.

01 The four stages.

01 Prompting
Chat window, copy-paste, manual loop. The AI is fast; you're the bottleneck.
Tools: ChatGPT, Claude.ai
Limit: Human as copy-paste pipeline

02 Pair programming
AI lives in the editor, suggests line by line. You drive, it navigates.
Tools: Copilot, Cursor
Limit: Still your pace, your keyboard

03 Delegating
Describe a task and walk away, sort of. Powerful, but tethered to your machine.
Tools: Claude Code, Codex
Limit: Babysitting, lost context, sync

04 Orchestrating
Async, fleet-scale, pipeline-enforced. Queue work, review outcomes.
Tools: Stripe Minions, Spotify Honk, Pilot
Limit: Infrastructure doesn't exist yet

Most teams think they're at stage three. Most are stuck between one and two. → Read the foundational piece

Stage is decided by infrastructure, not ambition. You can't delegate if your environments are flaky. You can't orchestrate at scale without golden paths and a way to verify what came back.

Different stages suit different problems. One-shot a regex at stage one. Pair-program a feature at stage two. Delegate a migration at stage three. Earlier stages don't go away as you build toward stage four — they sit alongside it, each used where it fits.

02 AI is a multiplier.

Where AI shines
Greenfield
Empty repository. Fresh context. No constraints, no history. The model has the full picture because nothing exists outside what it can see. Demos and viral wins come from this mode.
Most demos · Few real codebases
Where AI struggles
Iceberg
The code in front of you is the small part above the waterline. The bulk of what matters — why this service owns that table, which module holds up three downstream consumers — lives in senior engineers' heads.
Most real work · Few demos

AI multiplies whatever you give it, including the broken parts. Spotify and Stripe didn't sprinkle AI on tired codebases. They spent a decade building platforms AI could plug into, then wrote their own tools.

Real work is the iceberg. The code in front of you is the small visible part. The bulk — why this service owns that table, which module holds up three downstream consumers, what the post-mortem from 2023 actually concluded — lives in senior engineers' heads and old Slack threads, where AI has no access.

What changed with AI wasn't the volume of work but its location. Code generation compressed; specification upstream and review downstream expanded to match.

The bottleneck moved.
To the two ends most teams have underinvested in for years.
Before AI: Spec → Writing code → Review
After AI: Spec → Code → Review
Writing was never the bottleneck. Reviewing was — and specification before it. AI compresses the middle and exposes the bookends. Vague specs in, bad code out, now at 10× the volume. → Read the multiplier essay

03 What stage four requires.

Existing tools converge on a specific shape — single task, single agent, single screen, synchronous, freeform. They work well for solo, non-critical, or non-production work, where speed and flexibility are the priorities and the cost of a bad output is low.

Production is a different problem. To let AI coding into shipped systems, the work needs checks and balances, pipelines, and verification — not as polish but as the structure that lets you trust the output. Without that, AI just produces more code than anyone can safely review.

Five constraints describe what stage four actually looks like. Each one is a frustration with stage three turned into a principle.

1. Chats → Tasks
2. Terminals → Servers
3. Synchronous → Async
4. Single agent → Fleet
5. Freeform → Pipeline

Five frustrations, five principles, one shape. → Read the constraints piece
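
As a rough illustration of three of these shifts (chats to tasks, synchronous to async, single agent to fleet), here is a minimal Python sketch. Every name in it (`TaskSpec`, `agent`, `run_fleet`) is hypothetical, and the agent's work is a stand-in `asyncio.sleep(0)`. It is not how Pilot or any of the tools named above works, and it leaves out the server and pipeline constraints entirely.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class TaskSpec:
    """A unit of work described up front, instead of an open-ended chat."""
    task_id: str
    spec: str


async def agent(name: str, queue: asyncio.Queue, results: dict) -> None:
    """One hypothetical agent: pull tasks off the queue until cancelled."""
    while True:
        task = await queue.get()
        try:
            await asyncio.sleep(0)  # stand-in for real agent work
            results[task.task_id] = f"{name} finished: {task.spec}"
        finally:
            queue.task_done()


async def run_fleet(tasks: list[TaskSpec], fleet_size: int = 3) -> dict:
    """Queue all tasks, let a fleet of agents drain the queue, return outcomes."""
    queue: asyncio.Queue = asyncio.Queue()
    results: dict = {}
    for task in tasks:
        queue.put_nowait(task)
    workers = [
        asyncio.create_task(agent(f"agent-{i}", queue, results))
        for i in range(fleet_size)
    ]
    await queue.join()           # wait until every queued task is done
    for worker in workers:
        worker.cancel()          # agents loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return results


if __name__ == "__main__":
    outcomes = asyncio.run(run_fleet([
        TaskSpec("t1", "rename a config key"),
        TaskSpec("t2", "bump a dependency"),
    ]))
    print(sorted(outcomes))
```

The caller queues work and reads `outcomes` afterwards; nothing in the sketch requires a person to watch any individual agent while it runs.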

04 The loop has to close.

Under the five constraints sits a single principle: agents become useful when the loop closes.

The loop has two halves. Verification is what gets checked — tests, lints, spec match, structural gates. Heartbeat is what advances state — what triggers the next step, what signal moves the work forward. Both halves have to be load-bearing rather than advisory — pipelines, not suggestions.
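
The two halves can be sketched in a few lines of Python. This is a minimal sketch under my own assumptions: `Task`, `verify` and `heartbeat` are hypothetical names, and the checks are plain callables standing in for real tests, lints or spec matching.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Task:
    """Hypothetical unit of work that moves through an ordered list of steps."""
    name: str
    steps: list[str]
    position: int = 0                       # index of the current step
    log: list[str] = field(default_factory=list)


Check = Callable[[Task], bool]


def verify(task: Task, checks: dict[str, Check]) -> list[str]:
    """Verification half: return the names of the checks that failed."""
    return [name for name, check in checks.items() if not check(task)]


def heartbeat(task: Task, checks: dict[str, Check]) -> bool:
    """Heartbeat half: advance one step, but only when every check passes."""
    if task.position >= len(task.steps):
        return False                        # nothing left to advance
    current = task.steps[task.position]
    failed = verify(task, checks)
    if failed:
        task.log.append(f"blocked at {current}: {', '.join(failed)}")
        return False                        # load-bearing: state does not move
    task.log.append(f"advanced past {current}")
    task.position += 1
    return True
```

The point of the sketch is the coupling: `heartbeat` cannot move `position` forward unless `verify` comes back empty, which is what makes the checks load-bearing instead of advisory.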

That distinction is what separates the existing tools from what stage four needs.

Where most tools sit today
Runner
Executes things. Fast. Flexible.
  • Generic heartbeat
  • Time- or run-based
  • Advisory structure
  • Fails open
  • Flexibility by default
  • Solo, exploratory work
Where stage four lives
Workflow
Executes things in order. Refuses to advance when preconditions aren't met.
  • Semantic heartbeat
  • State-aware
  • Load-bearing structure
  • Fails closed
  • Structure by default
  • Team-scale, regulated work
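
The fails-open versus fails-closed difference is easiest to see when the same steps are run both ways. The Python below is a hedged sketch: `Step`, `run_as_runner` and `run_as_workflow` are hypothetical names, and the three example steps are invented for illustration.

```python
from typing import Callable, NamedTuple


class Step(NamedTuple):
    """Hypothetical step: a precondition over shared state plus an action."""
    name: str
    precondition: Callable[[dict], bool]
    action: Callable[[dict], None]


def run_as_runner(steps: list[Step], state: dict) -> list[str]:
    """Fails open: an unmet precondition is recorded as a warning, then ignored."""
    executed = []
    for step in steps:
        if not step.precondition(state):
            state.setdefault("warnings", []).append(step.name)
        step.action(state)
        executed.append(step.name)
    return executed


def run_as_workflow(steps: list[Step], state: dict) -> list[str]:
    """Fails closed: refuse to advance past the first unmet precondition."""
    executed = []
    for step in steps:
        if not step.precondition(state):
            break
        step.action(state)
        executed.append(step.name)
    return executed


# Invented example: the test step reports a failure, so merge should not run.
steps = [
    Step("write_code", lambda s: True, lambda s: s.update(code=True)),
    Step("run_tests", lambda s: s.get("code", False),
         lambda s: s.update(tests_passed=False)),
    Step("merge", lambda s: s.get("tests_passed", False),
         lambda s: s.update(merged=True)),
]
```

Run through `run_as_runner`, all three steps execute and the failed precondition survives only as a warning in the state. Run through `run_as_workflow`, execution stops before `merge`.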

Flexibility lives at the spec layer, not the execution layer. → Read the close-the-loop piece

05 Where this leads.

None of this is theoretical. The constraints, the bookends, the close-the-loop logic — those came out of the last year of building production AI work, and adopting AI across an engineering org in a regulated healthcare context.

I'm working the thesis from two sides. I'm building Pilot — an orchestration layer that takes the runner-vs-workflow split seriously, starting with an open-source community edition. And I do advisory work with engineering leaders navigating the same transition inside their organisations — figuring out which stage they're actually at, what foundation work matters next, how to adopt AI without amplifying what's already broken.

Different angles on the same question. The question is what I'm interested in.

If any of this resonates and you want to follow the thinking, the Field Notes go out when there's something worth saying. Otherwise, the writing keeps going.