Real story · internal testing
Estivo: rebuilding an internal estimation tool, guided by intent-cli itself
Estivo is our in-house estimation and invoicing tool, originally written in PHP. As part of internal testing we started rebuilding it (V2) on a modern stack — C# with the Sekiban event-sourcing framework, a React + shadcn/ui front end behind a .NET 10 BFF, Entra ID for auth, Aspire for local dev, and Azure App Service for hosting. The point of the exercise was not the rebuild itself, but to see whether an engineer who had never touched intent-cli could drive the whole intent phase — and, in the session that followed, get the implementation and review loops actually turning. It is early, but task execution is now running.
intent-cli guides the AI, not the human
The engineer started a fresh Claude session, installed intent-cli from a CI artifact, and simply told Claude: "ask intent-cli and follow it." intent-cli ships AI-facing guidance — not a human manual. The human never reads it. Claude reads it, learns the workflow and the rules for structuring intent, and runs the session. Notably, intent-cli contains no AI of its own: it is a curated set of prompts plus commands that mutate the intent metadata. The thinking is done by the operator's own LLM, so the AI cost is borne by whoever runs the host.
This is Intent Storming — you become the product owner being interviewed
From there the engineer described the goal and the architecture in plain language, as a product owner would. Guided by intent-cli, Claude turned that into a structured interview: for each open decision it laid out the background, the options, the pros and cons of each, and a recommendation, then asked for a decision. This is Intent Storming: the work of structuring technology and intent as the product owner. Here it ran solo, with one engineer and the AI rather than a group workshop, but it is the same process. It felt less like prompting a tool and more like talking to the manager of a development company while acting as the prime contractor. The answers were captured directly as a structured intent tree, which was then sliced into packets; the packets are cut, one at a time, into GitHub issues that the three-thread loop picks up.
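To make the shape of that interview concrete, here is a minimal sketch in Python. It is not intent-cli's actual format, which this story does not show. Every class, field, and function name (`Option`, `Decision`, `IntentNode`, `slice_into_packets`) is hypothetical, and the rule "a node becomes a packet once all its decisions are answered" is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Option:
    """One candidate answer to an open decision (hypothetical shape)."""
    name: str
    pros: list = field(default_factory=list)
    cons: list = field(default_factory=list)


@dataclass
class Decision:
    """One interview question: background, options, recommendation, answer."""
    question: str
    background: str
    options: list
    recommended: str               # name of the option the AI recommends
    answer: Optional[str] = None   # filled in by the product owner

    def resolve(self, choice: str) -> None:
        # Only accept an answer that matches one of the offered options.
        if choice not in {o.name for o in self.options}:
            raise ValueError(f"unknown option: {choice!r}")
        self.answer = choice


@dataclass
class IntentNode:
    """A node in the intent tree; nodes carry the decisions made at that level."""
    title: str
    decisions: list = field(default_factory=list)
    children: list = field(default_factory=list)


def slice_into_packets(node, path=()):
    """Depth-first walk: a node whose decisions are all answered yields one packet."""
    here = path + (node.title,)
    packets = []
    if node.decisions and all(d.answer is not None for d in node.decisions):
        packets.append({
            "title": " / ".join(here),
            "decisions": {d.question: d.answer for d in node.decisions},
        })
    for child in node.children:
        packets.extend(slice_into_packets(child, here))
    return packets


# Illustrative data only: the pros and cons below are placeholders, not the real interview.
auth = Decision(
    question="Which identity provider?",
    background="Placeholder background for the auth decision.",
    options=[
        Option("Entra ID", pros=["placeholder pro"], cons=["placeholder con"]),
        Option("Custom login", pros=["placeholder pro"], cons=["placeholder con"]),
    ],
    recommended="Entra ID",
)
auth.resolve("Entra ID")

tree = IntentNode("Estivo V2", children=[
    IntentNode("Auth", decisions=[auth]),
    IntentNode("Hosting"),  # no decisions yet, so it produces no packet
])
packets = slice_into_packets(tree)
```

In this sketch `packets` holds a single entry titled `Estivo V2 / Auth`, which is the kind of unit that would be turned into one GitHub issue. Keeping the recommendation separate from the answer preserves the fact that the product owner, not the AI, made the call.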
The build loop is now turning — three threads, one repo
In the following session the three-thread loop went live, and task execution started running for real. The human stays on the design thread; two AI threads run unattended on five-minute cycles — implementation (on Sonnet) and review (on Opus, deliberately the stronger model, because the reviewer is the one that catches build failures and missing pieces). Each thread works from its own checkout of the same repository, so they never collide. A packet becomes a GitHub issue, the implementer opens a PR, and the reviewer compares it to the intent and either approves-and-merges or posts a repair comment that sends it back. We watched the very first packet — a hello-world vertical slice — go all the way through implement → review → repair → re-implement → approve → merge, and the instant it merged, the loop cut the next issue and started again. GitHub's own issue list — with status labels (in progress, re-review ready, request update, approved) and a saved view sorted by last-updated — became the single dashboard for the only question that matters: "is anything stuck?"
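The label-driven loop can be pictured as a small state machine plus a staleness check. The sketch below is an assumption-heavy illustration, not the project's tooling: the four labels come from the text, but the event names, the transition table, the issue dictionaries, and the 15-minute idle threshold (three five-minute cycles) are all hypothetical.

```python
from datetime import datetime, timedelta

# Labels are the ones named in the text; which event moves which label is assumed.
TRANSITIONS = {
    ("in progress", "pr_opened"): "re-review ready",
    ("re-review ready", "repair_requested"): "request update",
    ("re-review ready", "approved"): "approved",
    ("request update", "pr_updated"): "re-review ready",
}


def advance(label, event):
    """Return the next label, or raise if the event is not valid for this label."""
    try:
        return TRANSITIONS[(label, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not allowed while {label!r}") from None


def find_stuck(issues, now, max_idle=timedelta(minutes=15)):
    """List unapproved issues idle longer than max_idle, oldest update first."""
    idle = [
        issue for issue in issues
        if issue["label"] != "approved" and now - issue["updated_at"] > max_idle
    ]
    return sorted(idle, key=lambda issue: issue["updated_at"])


# Replay the first packet's path: implement, review, repair, re-implement, approve.
label = "in progress"
for event in ["pr_opened", "repair_requested", "pr_updated", "approved"]:
    label = advance(label, event)

# Illustrative issues with an arbitrary clock; numbers and times are made up.
now = datetime(2025, 1, 1, 12, 0)
issues = [
    {"number": 1, "label": "approved", "updated_at": now - timedelta(hours=2)},
    {"number": 2, "label": "in progress", "updated_at": now - timedelta(minutes=4)},
    {"number": 3, "label": "request update", "updated_at": now - timedelta(minutes=40)},
]
stuck = find_stuck(issues, now)
```

Here `label` ends as `approved`, and `stuck` contains only issue 3, the one that has sat in `request update` well past the threshold. That mirrors what the saved view sorted by last-updated gives a human at a glance.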
It is honestly still early. Perhaps one run in ten stalls, often because of a Windows/PowerShell encoding hiccup or an agent that loses the thread. The fix is not to touch code by hand but to tell the design thread "this should keep going, repair it," which almost always recovers it. The day's takeaway is that the human's job is shifting: less writing code, more keeping the AI from stalling and answering its product-owner questions as they arrive.
Technical skill still matters — and it pays off
This is not "no technical skill required." Choosing the stack, judging the proposed options, and recognizing when the intent has drifted all need real engineering judgment. But the flip side is the encouraging part: a technically capable person, with no prior intent-cli experience, was able to assemble solid, structured intent on the first run — because intent-cli's guidance carried the method, and their own Claude carried the thinking. The intent-phase runs were almost mistake-free and behaved as expected; the build loop is younger and still needs a human nearby, but it is already turning out merged PRs.