ADRs in a post-flip world

Cheaper experiments make for firmer decisions

June 20, 2026 · 8 min read

A couple of months ago I wrote about the Pareto flip: the way agentic coding has inverted the old 80/20 of software, so that thinking now accounts for most of the effort and implementation for very little.

This has altered my personal workflows a reasonable amount, I’m leaning in to worktrees with aligned tmux sessions as work now has an associated Claude session, not just a branch.

There’s not been much change in how we operate as a team and department. A lot of the practices for organising the humans have not been altered in any significant way. I think this is indicative that this work aligns heavily with the thinking side of the work.

The Architecture Decision Record is the one exception. An ADR that weighs three options now routinely means three branches we actually built, not three ideas we reasoned about. Going as far in that for one recent case, we went beyond our boundary to create three versions of the code our integrators would write, for a feature we hadn’t yet shipped.

The shape and contents of these documents haven’t changed, but the rigour behind them and when we create them has.

What an ADR used to be for

At their core ADRs serve as memory. A note to our future selves, explaining why you chose this over the alternatives, so we have an snapshot of the context for the decision, not just its outcome.

But, at least at Cronofy, it served another purpose. To avoid wasting time in the present. Writing an ADR forced a level of rigour before committing to a path. An ADR can cheaply capture the plan for several days, sometimes weeks, of work. And in that form, those decisions are cheap to change.

Our use of ADRs sought to push decision making down to individuals, but with a step of oversight to avoid wasted effort. The “show your working” that complemented a plan. We did the thinking first to avoid investing expensive effort in bad paths, a form of cost control.

In that world, the ADR was a kind of purchase order. You set out the options, advocated for one, and the document was your authorisation to go and spend the implementation effort.

The thing the purchase order authorised got cheap

That logic holds only while implementation is the expensive part. When building the wrong thing cost you days it made sense. The flip removed that. When the path is cheap to build, there is much less waste to hedge against, and so the hedge goes with it. Retiring the up-front ADR isn’t abandoning discipline, it’s retiring a cost-control mechanism for something that’s now at the level of discretionary spend.

We no longer need to authorise a path before building it, because building it is now cheap enough to be part of how you decide. Rather than debate whether an approach will get unwieldy, you can have an agent build a spike of it and watch whether it does. Why not get another instance to try the alternative while you’re at it. The agent has no attachment to either, and will throw both away without complaint. And before the investment in production-ready care and attention, the engineers have no attachment either.

So the decision is no longer made up front based on gut, vibes, and a dash of experience. It is made after you have seen the options stand up, on evidence. And the ADR, naturally, moves with it. It gets captured after the exploration rather than before, because that is now the point at which you actually know.

The rigour went up, not down

The obvious worry is that removing the up-front gate removes the discipline with it. In our experience, the opposite happens.

The discarded-alternatives section of an ADR used to be a record of paths we reasoned our way out of. Now it is increasingly a record of paths we tried. An ADR weighing three options used to mean three ideas, compared on paper. Now it more frequently means three spikes. The rigour applied to really crucial decisions is now applied liberally because it has become cheap. That is a firmer foundation for a decision, based in facts rather than opinions, and it is one we couldn’t previously afford to lay.

The sceptic’s move here is to call it rationalisation. You built something, you shipped it, and the ADR is the justification you wrote afterwards. The two discarded options are theatre, plausible-sounding alternatives invented to flatter the path you’d already chosen. It’s a fair charge to level at any after-the-fact record, and the honest answer is that the old way was more prone to it, not less. When alternatives lived only on paper, writing up two you never-seriously-intended “alternatives” cost nothing. You cannot fake having built something. When each option is a spike that was created, the debate is founded in reality rather than rhetoric.

The sharper objection is that a spike spun up in twenty minutes is a demo, not an evaluation. We are rejecting real architectural options on the strength of toy implementations that never meet the problems which only show up in production. It is true that they are often still only spikes. But they are not judged casually. The agent is given the success criteria before it iterates on any approach, so every branch is anchored to the same root context and measured against the same bar. A spike that passes the criteria you would have held the real thing to is evidence, not a demo.

I want to be careful not to overstate this. The thinking itself hasn’t collapsed the way implementation has. If anything we explore more now, because cheaper experiments invite more of them and the agent can assist the exploration. The time spent deciding is not dramatically different. What changed is that the deciding is now anchored to things we built rather than things we imagined.

How far this can stretch

Most of the time this is unremarkable. The spikes are just variations of our own code, built and compared. But the same cheapness facilitates the evaluation reaching places it never could before, and one recent decision showed me how far.

We were weighing options for a feature coming to our API. Normally we’d spike the implementation choices on our side and pick between them. This time we also spiked what each option would mean for the integrator: the code a developer using the API would have to write under each design. So the developer experience wasn’t something we reasoned about and hoped we’d got right based on experience. It sat in front of us, in actual code, for each option.

That let us weigh both costs at once: the code our integrators would write, and the code we’d build and maintain behind it. We could make a globally-best choice rather than a locally-best one. Optimising the whole interface rather than just our side of it, leaning towards integrator ease, because that’s where the leverage is. Evaluating the far side of an interface you don’t control, empirically, before you’ve shipped a line of it, is simply not something we’d have done when each spike cost real time. We’d sweat the design, and that code would be a mental exercise, not a literal one.

The rigour didn’t creep up a little. In the right case it can increase dramatically.

Who writes ADRs

The reasoning now happens in the conversation with the agent, which was present for the whole exploration. So we get the agent to draft the ADR, with human refinement.

This is the same division of labour the flip set out elsewhere. We build the shared context from one perspective, describing the problem, discussing approaches, interrogating the current system. An agent is good at translating that efficiently into another form. An ADR is just a form. Drafting it is a point-to-point translation of what is in context for the agent and the engineers. This is the kind of work agents are strong at. The judgment: which decision, why this one, what we are knowingly betting on, and the final version of the ADR stays human. That is the part that is load-bearing and the part an agent will quietly skip over if you let it.

There is a neat second use here. The ADR was always context for our future selves. It is now context for future agents too. The same document that stops a human relitigating a decision grounds an instance that would otherwise have no sense of why the system is shaped this way. It turns out the thing we wrote for our future selves works just as well for the agents we hand the work to.

The good practices we had in place before agentic engineering hit are helping us leverage agents effectively.

What hasn’t changed

For all of that, the artefact looks much as it always did. We write roughly the same number of ADRs, at roughly the same level of detail. They are higher quality, and they sit at a different point in the process, but a reader picking one up wouldn’t necessarily notice either.

Which is rather the point. Our future selves don’t care at what stage the ADR was written, or who drafted it, or how cheap the experiments behind it were. They only care that it exists, and that the reasoning in it holds.

The clue was in the name the whole time. It’s a Decision Record. A record is something you write down about what happened — it was never a promise about when. We had attached the up-front timing to it because the economics demanded it, not because the artefact did. Take those economics away and the timing falls off without anything of value going with it. The record remains, as useful as it ever was, and now grounding agents as readily as humans. When it got written turns out to be the part that never mattered.

More broadly, there’s a lot of talk about producing more with agents. That’s certainly possible, but we’re also leaning in to producing better. Better decisions compound in to long term velocity that endures.

Hey, I’m Garry Shutler

CTO and co-founder of Cronofy.

Husband, father, and cyclist. Proponent of the Oxford comma.