What Flows Between the Cells
A team’s deliverables are its collaboration made visible. The grid in the previous chapters is an abstraction, a set of cells with owners, but you never see a cell; what you see is the artifact that passes from one cell to the next, the user story that carries intent into build, the mockup that carries structure into experience, the architecture decision record that carries one person’s reasoning to everyone who will live with it. The artifact is the hand-off you can hold. So a chapter about how the agentic team works together has to be, in part, a chapter about what flows between the people, because when the deliverable changes shape, the collaboration has changed shape, and most of the deliverables a software team relies on are changing shape right now in ways that move ownership from the person who used to produce them to a person the old model never named.
The pattern is the same for every artifact, so it is worth stating once before walking through them. Each deliverable was built for deterministic software that a human operated, and each one carried a single owner’s work across a single seam. The agent breaks the assumption the artifact rested on, and the artifact does one of three things: it shrinks, surviving only for the part of the system that is still deterministic; it mutates, becoming a different artifact that can carry probabilistic work; or it splits, becoming two artifacts for two audiences. And in every case the change pulls in an owner the original artifact did not have, because the new version requires a judgment the old producer cannot supply alone.
The user story shrinks, and what replaces it is shared
Start with the user story, because its change is the most disruptive and the prior books made the sharpest version of the case. For two decades the unit of work has been the user story and its acceptance criterion: when the user does X, the system does Y, verified by a test that passes or fails. That primitive works because the system is deterministic, there is a single correct Y, and pass-or-fail is a real answer. None of those three holds for an agent’s behavior. The agent’s output is a distribution, not a single Y; correct is a judgment, not a binary; and a test that runs once tells you nothing because the next run may differ. So the acceptance criterion the user story was built to carry has nowhere to live.
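The point about a single run can be made small enough to hold in one hand. What follows is a minimal sketch, not anyone's real test harness: `pass_rate`, `make_stub_agent`, and `is_y` are hypothetical names, and the stub agent is a fixed cycle (right nine times, wrong on the tenth) standing in for real randomness so the numbers are reproducible.

```python
from itertools import cycle
from typing import Callable


def pass_rate(agent: Callable[[str], str], prompt: str,
              judge: Callable[[str], bool], runs: int) -> float:
    """Fraction of runs whose output the judge accepts."""
    passed = sum(1 for _ in range(runs) if judge(agent(prompt)))
    return passed / runs


def make_stub_agent() -> Callable[[str], str]:
    """Hypothetical stand-in for an agent: right nine times, wrong on the tenth.

    A real agent varies randomly; a fixed cycle keeps the sketch reproducible.
    """
    outputs = cycle(["Y"] * 9 + ["not Y"])
    return lambda prompt: next(outputs)


def is_y(output: str) -> bool:
    """The old acceptance criterion: the system does Y."""
    return output == "Y"


# One run looks like a pass; ten runs show the distribution behind it.
single_run = pass_rate(make_stub_agent(), "user does X", is_y, runs=1)
ten_runs = pass_rate(make_stub_agent(), "user does X", is_y, runs=10)
print(single_run, ten_runs)  # 1.0 0.9
```

The single run reports a clean pass on a system that is wrong one time in ten, which is the sense in which pass-or-fail has stopped being a real answer.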
The refund agent makes the shift concrete, and it is worth seeing both versions side by side because the difference is the whole point. The old way writes a story: as a customer, when I request a refund for a defective item within the return window, the agent issues the refund and sends confirmation. It is a clean story; it would pass any backlog grooming in the world. And it is the wrong primitive, because it describes the case that was never hard. The defective item inside the window was always going to be refunded. What the story cannot say is what the agent should do with the request that is technically outside the window but right on the merits, or the one where the item is fine but the customer is about to churn and is worth more than the refund, or the one where the amount is ten times the others and the pattern smells like fraud. Those are the cases the agent will actually be judged on, and the story has no field for any of them.
The new way writes an outcome-centric spec, and it has parts the story does not: the outcome (resolve refund requests the way a reasonable manager would endorse on review), the bounds (refund autonomously up to a limit; above it, or outside the window, or on a fraud signal, route to a human), the eval set (a curated collection of real refund requests each paired with the outcome a senior support lead actually endorsed, including the hard ones), and the definition of acceptable failure (a needless escalation is tolerable; an auto-refund on a fraud-pattern case is a never-ship failure). Read the two side by side and the shift is concrete: the story has one field, the happy path, and the spec has four, and engineering can build to the second and cannot build to the first, because the story never said what the agent was allowed to get wrong.
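The four parts can be written down as a structure, which is one way to see why engineering can build to them. This is a sketch under stated assumptions: the class names, the two outcome labels, the 200 limit, and the three sample cases are all hypothetical, and a real eval set would be far larger and filled by a domain expert rather than typed in by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class RefundRequest:
    amount: float
    within_window: bool
    fraud_signal: bool


@dataclass(frozen=True)
class Bounds:
    auto_refund_limit: float  # hypothetical limit; the team sets the real one

    def decide(self, req: RefundRequest) -> str:
        """Refund alone only inside every bound; otherwise route to a human."""
        if (req.amount > self.auto_refund_limit
                or not req.within_window or req.fraud_signal):
            return "escalate"
        return "auto_refund"


@dataclass(frozen=True)
class EvalCase:
    request: RefundRequest
    endorsed: str  # the outcome a senior support lead endorsed


@dataclass(frozen=True)
class OutcomeSpec:
    outcome: str
    bounds: Bounds
    eval_set: List[EvalCase]


def grade(spec: OutcomeSpec, decide: Callable[[RefundRequest], str]) -> Dict[str, int]:
    """Tally decisions against the eval set using the spec's failure definitions."""
    tally = {"match": 0, "needless_escalation": 0, "never_ship": 0, "other_miss": 0}
    for case in spec.eval_set:
        got = decide(case.request)
        if got == "auto_refund" and case.request.fraud_signal:
            tally["never_ship"] += 1           # the failure that blocks shipping
        elif got == case.endorsed:
            tally["match"] += 1
        elif got == "escalate":
            tally["needless_escalation"] += 1  # tolerable failure
        else:
            tally["other_miss"] += 1
    return tally


spec = OutcomeSpec(
    outcome="Resolve refund requests the way a reasonable manager would endorse",
    bounds=Bounds(auto_refund_limit=200.0),
    eval_set=[
        EvalCase(RefundRequest(40.0, True, False), "auto_refund"),
        EvalCase(RefundRequest(60.0, False, False), "auto_refund"),  # late but right
        EvalCase(RefundRequest(900.0, True, True), "escalate"),
    ],
)
result = grade(spec, spec.bounds.decide)
print(result)
# {'match': 2, 'needless_escalation': 1, 'never_ship': 0, 'other_miss': 0}
```

The old story has no place to record the second case at all. Here it shows up as a tolerable miss, and the third case is the one whose failure would stop the release.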
Now notice what the prior books, writing from the product manager’s seat, did not quite say: this new artifact is not the product manager’s to produce alone, and that is the re-vantaging. The outcome the spec states is the product manager’s. But the eval set, the curated golden dataset of judged cases that the spec is graded against, is the eval owner’s to construct and maintain and the domain expert’s to fill, because deciding what outcome a senior support lead would actually endorse on the hard refund, or what triage a cardiologist would actually accept, is domain judgment no product manager can supply. The user story was a single-owner artifact: the product manager wrote it, the engineer built to it. The outcome-centric spec that replaces it takes three: the product manager states the outcome, the domain expert supplies the judged cases, the eval owner makes them into a graded set. A team that has the product manager writing the outcome spec alone is running the old single-owner habit under the new artifact’s name, and the cases the domain expert would have flagged are the cases that reach the affected person.
And the user story does not die; its territory shrinks. The deterministic shell around the agent, the orchestration, the integrations, the screens the user clicks, is still ordinary software and still gets ordinary user stories. What resists the story is the probabilistic core, the agent’s behavior, and as agents absorb more of the work that used to be deterministic code, the territory the story can describe keeps contracting. The story is not wrong today. The share of the product it can describe is getting smaller, and the part it cannot describe is the part that now matters most.
The journey map becomes a boundary map, and gains an owner
The journey map is the next artifact to change, and it changes by gaining a layer. The journey map traces a user moving through an experience, and it has been the workhorse of product design for as long as there has been product design. But when the agent acts on the higher rungs of autonomy, the actor moving through the flow is not a user. It is the agent, acting on the customer’s behalf. The journey map does not break; it moves down a layer, becoming the agent’s path through the system. And a second map appears above it that no methodology a product team was trained on describes: the map of what the agent is allowed to do, where its authority ends, and where control passes back to a human. Call it the boundary map, and drawing it is now core product work, because the agent’s autonomy boundary, what it may do alone and what it may not, is a product decision with the weight the journey map used to carry.
The re-vantaging is that the boundary map is not a single craft’s artifact the way the journey map mostly was. The journey map lived with the designer and the product manager. The boundary map belongs to three people at once: to the product manager for where the line should sit, to the architect for where the line is physically enforced, and to the designer for where the human re-enters the experience at the moment control passes back. It is the journey map, the enforcement decision, and the approval moment fused into one drawing, owned by the three people who own those three things. A boundary map drawn by a designer alone is a picture of intentions the architecture may not honor, which is the nine-second deletion drawn in advance and not read.
The architecture decision record becomes the agent’s contract
Some artifacts do not mutate so much as change job entirely, and the architecture decision record is the clearest case. In deterministic software an ADR documented a decision a human had already made and would implement: why this database, why this pattern, written down so the people who lived with it later could understand it rather than reverse-engineer it. It pointed backward, at a choice that was settled, and skipping it cost you some confusion months on. In an agentic product the ADR points forward. It is no longer a record of what a human decided and will build; it is the specification of what the agent must build, the human’s structural decision written down precisely because the agent is the one that will execute it. That is a different artifact wearing the same name.
What the ADR now governs is the agent’s entire runtime contract, and the list is the reason it is load-bearing rather than optional. Which runtime framework the agent runs on. How the agent connects to the data, through what interface, with what access. How it reports its telemetry, the events the observation instruments will be composed from. How it is provisioned, how its security and its credentials are scoped, how it stays available, where it is tuned for performance. And above all the boundary plane: what the agent may reach and what it may not, expressed not as a sentence the agent reads but as the structure the agent runs inside. These are the decisions the entire supervision pipeline depends on, because the instruments cannot watch what the telemetry was never wired to emit, and the boundary cannot hold what the credential scoping never enforced. The ADR is where those decisions get made, by a human, in advance, as the contract the agent is bound to.
And that is why a missing ADR is not lost context anymore; it is an abdication. The agent generates structure as readily as it generates code, so if no human has specified the runtime framework and the data access and the telemetry and the credential scope, the agent chooses them itself, at build time, by default, and the choices are reasonable-looking and unreviewed. The architecture then exists, fully, with no one’s name on it, decided by the system it was supposed to constrain. The nine-second deletion from the previous part was exactly this: a credential the agent reached because no ADR had scoped it down, and a boundary plane the architect never specified, so the agent inherited whatever was lying around. The ADR is the artifact that holds the architect’s cell of the grid, and in an agentic product that cell is the foundation the boundary and the supervision are built on, so the ADR is where the human makes the structural decisions the agent must follow, and a team without one has not skipped a document. It has handed the architecture to the agent and called the result a design.
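One way to make the abdication visible is to treat the ADR as a checklist a build can refuse to proceed without. The sketch below assumes the ADR is kept as structured data; the key names, the `decided_by` field, and the sample values are hypothetical, chosen only to mirror the decision areas listed above.

```python
from typing import Dict, List

# Decision areas named in the text; the key names themselves are hypothetical.
REQUIRED_DECISIONS = (
    "runtime_framework", "data_access", "telemetry", "provisioning",
    "credential_scope", "availability", "performance_tuning", "boundary_plane",
)


def undecided(adr: Dict[str, Dict[str, str]]) -> List[str]:
    """Return decision areas with no recorded decision or no named human owner."""
    missing = []
    for area in REQUIRED_DECISIONS:
        entry = adr.get(area, {})
        if not entry.get("decision") or not entry.get("decided_by"):
            missing.append(area)
    return missing


# A draft with two gaps: an empty credential scope and no boundary plane at all.
draft_adr = {
    "runtime_framework": {"decision": "framework A", "decided_by": "architect"},
    "data_access": {"decision": "read-only orders API", "decided_by": "architect"},
    "telemetry": {"decision": "one event per tool call", "decided_by": "architect"},
    "provisioning": {"decision": "one instance per tenant", "decided_by": "architect"},
    "credential_scope": {"decision": "", "decided_by": ""},
    "availability": {"decision": "restart on failure", "decided_by": "architect"},
    "performance_tuning": {"decision": "tune after launch", "decided_by": "architect"},
}

gaps = undecided(draft_adr)
print(gaps)  # ['credential_scope', 'boundary_plane']
```

Every name in `gaps` is a decision the agent will make by default if no one else does. The check does not judge whether a recorded decision is a good one; it only shows where no human has signed.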
Seen this way, the ADR is the place Channel 2 stops being a wish and becomes structure, at least partially. The earlier part of this book drew the hard line that a boundary which is only a sentence in the prompt is advisory, something the agent can reason its way past, and that a boundary is real only when it is a wall the agent cannot cross. The ADR is where the architect writes the walls. The credential scoped so the destructive call is unavailable, the data interface that exposes only what the agent is allowed to touch, the telemetry the agent is required to emit so the instruments have something to read, the control plane the kill switch lives in, outside the agent’s reach, are not requests the agent honors. They are the runtime it is built inside and cannot step out of. So the ADR is the artifact that converts the supervisory layer from intent into enforcement, the deliverable through which Channel 2 is imposed on the agent at build time rather than hoped for at run time. The qualifier matters, because the ADR enforces only the structural half. It can guarantee the agent emits telemetry; it cannot guarantee anyone watches it, which is the supervisor’s cell, not the architect’s. It can scope the credential so the dangerous action is unreachable; it cannot decide which actions are dangerous enough to deserve that, which is the product manager’s blast-radius judgment. It can wire the kill switch into a control plane the agent cannot suppress; it cannot decide who is authorized to fire it. The ADR builds the walls and leaves to other cells the questions of which walls to build and who stands watch on them. But the walls are the part that has to be structural, because they are the part a sentence cannot do, and the ADR is where they get built. It is the architect’s instrument for winning the boundary fight before the boundary is breached, which is the only time it can be won.
This makes the ADR the other end of a hand-off, and naming the hand-off is the point. The executable brief carries the product manager’s Channel 2 intent, the governance requirements stated as numbered, testable lines: the agent must never refund above a limit without approval, must never reach data outside the tenant, must never run a destructive action unprompted. Those are requirements. They say what must be true, not how it is made true, and a requirement that stays a requirement is a sentence, which is the thing the boundary chapter warned cannot hold. The ADR is where the architect receives each of those requirements and turns it into structure: the never-refund-above-a-limit becomes a pre-call gate, the never-reach-outside-the-tenant becomes a scoped credential and a data interface that physically cannot reach outside it, the never-run-a-destructive-action-unprompted becomes a permission the agent does not hold. The executable brief specifies the wall; the ADR builds it. So the brief and the ADR are two halves of one decision crossing one of the new seams, the product manager’s Channel 2 intent passing to the architect’s Channel 2 structure, and the boundary in the brief is not done when it is written. It is done when it has become a wall in the ADR and the architect has confirmed the wall holds. The nine-second deletion was this hand-off left open: a governance requirement that either never reached the ADR as a credential-scoping line or never had an ADR to reach, so it stayed a sentence and the agent walked through it. Two artifacts, two owners, one decision, and the seam between them is where the supervisory layer is either built into the runtime or left as a wish in a document.
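The difference between a requirement and a wall is easiest to see in code. This is a minimal sketch, assuming the agent can act only through the tools it is handed; `gated_refund`, `tenant_view`, `fake_issue`, the 200 limit, and the sample records are all hypothetical. Approvals live in a set the agent has no tool to write to, which stands in for whatever human-facing system would record them.

```python
from typing import Callable, Dict, Set


class ApprovalRequired(Exception):
    """Raised when a call needs a human approval that has not been recorded."""


def gated_refund(issue_refund: Callable[[str, float], str], limit: float,
                 approvals: Set[str]) -> Callable[[str, float], str]:
    """Wrap the refund call in a pre-call gate the agent cannot argue with."""
    def refund(request_id: str, amount: float) -> str:
        if amount > limit and request_id not in approvals:
            raise ApprovalRequired(f"{request_id}: {amount} exceeds {limit}")
        return issue_refund(request_id, amount)
    return refund


def tenant_view(records: Dict[str, Dict[str, str]], tenant: str) -> Dict[str, str]:
    """Copy out one tenant's rows; other tenants' data is never in reach."""
    return dict(records.get(tenant, {}))


def fake_issue(request_id: str, amount: float) -> str:
    """Stand-in for the real payment call."""
    return f"refunded {request_id} {amount}"


approvals: Set[str] = set()  # written by a human-facing system, not by the agent
all_records = {"acme": {"order-1": "shipped"}, "globex": {"order-9": "open"}}

# The agent's whole world: a gated refund and a scoped lookup. No delete tool.
agent_tools: Dict[str, Callable] = {
    "refund": gated_refund(fake_issue, limit=200.0, approvals=approvals),
    "lookup": tenant_view(all_records, "acme").get,
}

small = agent_tools["refund"]("r1", 50.0)
try:
    agent_tools["refund"]("r2", 500.0)
    blocked = False
except ApprovalRequired:
    blocked = True
approvals.add("r2")  # a human approves through a channel outside the agent
large = agent_tools["refund"]("r2", 500.0)
```

Nothing here depends on what the prompt says. The refund above the limit fails until an approval exists, the other tenant's order cannot be looked up, and the destructive action is absent because no such tool was ever issued.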
A new artifact appears: the source-of-truth register
The triad’s artifacts all transform; one artifact the triad never had simply appears, because the agent reasons over a body of knowledge the screen never did. An agent answers from sources, the policy corpus, the price list, the schedule, the knowledge base, and which sources it may read, how authoritative each one is, how fresh it must be, and what supersedes what are not things any of the old documents recorded, because a screen does not read, it displays what a human already decided to show it. The source-of-truth register is where those decisions live: each source the agent can reach, with an owner, an authority rank, a freshness requirement, a supersession rule, and a retirement workflow. It is the context owner’s artifact the way the ADR is the architect’s, and it crosses the same kind of seam, from the domain expert who knows which source outranks which and when a guideline expires, to the context owner who maintains the register, to the architect who wires the freshness telemetry that proves a stale source gets caught. The travel agent’s failure was a missing row in a register no one kept: the cancellation lived in a calendar the agent could not see, and nobody owned the list of what the agent was supposed to see. An artifact that does not exist cannot have an owner, and a body of knowledge with no register is a set of sources the agent treats as equally true and equally current because no one told it otherwise.
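A register of this kind is small enough to sketch. The version below is an illustration under assumptions, not a standard: the field names, the convention that rank 1 is most authoritative, the sample sources, and the fixed `today` are all hypothetical, and the retirement workflow is reduced to a single flag.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class SourceEntry:
    name: str
    owner: str
    authority_rank: int               # 1 = most authoritative (assumed convention)
    max_age_days: int                 # freshness requirement
    last_verified: date
    supersedes: Optional[str] = None  # name of the source this one replaces
    retired: bool = False


def is_stale(entry: SourceEntry, today: date) -> bool:
    return today - entry.last_verified > timedelta(days=entry.max_age_days)


def usable_sources(register: List[SourceEntry], today: date) -> List[SourceEntry]:
    """Sources the agent may read: not retired, not superseded, not stale."""
    superseded = {e.supersedes for e in register if e.supersedes and not e.retired}
    live = [e for e in register
            if not e.retired and e.name not in superseded
            and not is_stale(e, today)]
    return sorted(live, key=lambda e: e.authority_rank)


today = date(2025, 6, 1)  # fixed so the sketch is reproducible
register = [
    SourceEntry("refund_policy_v2", "support lead", 1, 90, date(2025, 5, 1),
                supersedes="refund_policy_v1"),
    SourceEntry("refund_policy_v1", "support lead", 1, 90, date(2025, 5, 1)),
    SourceEntry("price_list", "pricing team", 2, 7, date(2025, 5, 1)),
    SourceEntry("knowledge_base", "docs team", 3, 180, date(2025, 3, 1)),
]

readable = [e.name for e in usable_sources(register, today)]
stale = [e.name for e in register if not e.retired and is_stale(e, today)]
print(readable)  # ['refund_policy_v2', 'knowledge_base']
print(stale)     # ['price_list']
```

The stale list is the part the architect's freshness telemetry would report and the context owner would act on. A source that never made it into the register appears in neither list, which is the travel agent's failure in miniature.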
The mockup and the deck point at the supervisor now
Two artifacts change by changing who they are about. The mockup, the Figma screen the designer hands to engineering, was a picture of what the user would see and operate. For an agentic product the user often operates nothing; the agent acts and the human supervises. So the mockup that matters most is no longer the user’s screen but the supervisor’s: the decision package at the approval moment, the dashboard of the running agent, the recovery interface when something has gone wrong. The designer’s deliverable did not disappear, but its subject moved from the user doing the task to the supervisor watching the agent do it, which is the shift the craft section of this book treated in full. The mockup is where that shift becomes a concrete hand-off: a designer who hands engineering a beautiful user screen and no supervisor screen has designed Channel 1 and left Channel 2 as a wall of raw logs.
The status presentation changes the same way, and it matters because it is the artifact leadership actually sees. The status deck reported what the team built and how the metrics moved, and for a deterministic product those metrics, usage, retention, task completion, told the story. For an agentic product the deck that reports only those numbers reports the wrong product, because it says the agent completed three hundred tasks and is silent on whether they were the tasks the user intended, says adoption is up and is silent on whether the supervisor is still watching or has slid into the rubber stamp. The status deck that tells the truth about an agentic product carries Channel 2’s readings, the override frequency, the unintended-action rate, the drift, alongside the build metrics, and the person who composes it has to know those readings exist, which means the deck is now a hand-off from the supervisor and the eval owner to leadership, not just from the product manager to leadership. A status deck with only the build metrics is the green dashboard from the procurement story, presented upward as success.
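What the truthful deck adds can be computed from the same event stream that produced the build metrics, provided the events carry the right fields. The sketch below assumes they do. The `ActionEvent` fields and the two formulas are hypothetical working definitions, since the text names the readings without fixing how they are calculated, and drift is left out because it needs a baseline this sketch does not have.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ActionEvent:
    completed: bool   # the agent finished the task
    intended: bool    # a reviewer judged it to be what the user intended
    reviewed: bool    # a human supervisor looked at it
    overridden: bool  # the supervisor changed or rejected it


def channel2_readings(events: List[ActionEvent]) -> Dict[str, float]:
    """Build metric plus two supervision readings (assumed definitions)."""
    completed = [e for e in events if e.completed]
    reviewed = [e for e in events if e.reviewed]
    unintended = sum(not e.intended for e in completed)
    overrides = sum(e.overridden for e in reviewed)
    return {
        "tasks_completed": len(completed),
        "unintended_action_rate": unintended / len(completed) if completed else 0.0,
        "override_frequency": overrides / len(reviewed) if reviewed else 0.0,
    }


events = (
    [ActionEvent(True, True, True, False)] * 4      # reviewed and accepted
    + [ActionEvent(True, False, True, True)]        # reviewed and overridden
    + [ActionEvent(True, False, False, False)]      # unintended, nobody looked
    + [ActionEvent(True, True, False, False)] * 4   # fine, nobody looked
)
readings = channel2_readings(events)
print(readings)
# {'tasks_completed': 10, 'unintended_action_rate': 0.2, 'override_frequency': 0.2}
```

The first number is the one the old deck would have shown alone: ten tasks completed. The other two say that two of the ten were not what the user intended and that half of them were never looked at.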
The PRD splits, and the MRD compresses
Two of the documents a product team relied on most change in opposite directions, and it is worth keeping them apart, because they were never the same kind of thing. The product requirements document carried both the decision to build and the definition of what to build, and in an agentic product it splits in two, which this book’s foundations treated in full and this chapter only needs to place. The decision to build, the business case and the go/no-go and the boundary the room owns, becomes the human brief, the document a room argues with. The definition of what to build becomes the executable brief, the structured specification a system acts on, which is where the outcome-centric spec and the numbered governance requirements live. The PRD did not survive as one document because it was trying to be two, a thing to decide with and a thing to build from, and the agent, which builds exactly what the executable document says, forced the two apart. These do not flow at the same time, which is the collaboration consequence worth placing here: the human brief itself runs in two passes around the go/no-go, a slim decision pass that gates the bet and a fuller commitment pass written only on a go, and the executable brief starts in parallel with that second pass, not the first, because there is nothing to specify executably until the room has committed to build. So the seam between the briefs is also a seam in time, and the people who join at each point differ, the architect and domain expert at the gate on the decision, the eval owner and designer after it on the build. The split has an ownership consequence too: the human brief is the product manager’s and the room’s, and the executable brief is shared, because its governance requirements are the architect’s, the source of the ADR hand-off above, and its eval set is the eval owner’s and the domain expert’s, so even the document that replaced the PRD is no longer one person’s to write.
The market requirements document does not split; it compresses. The MRD was always a different artifact with a different job, not the decision to build and not the definition of what to build, but the evidence that a market worth building for exists at all: the segments, the competitors, the sizing, the proof that a real problem is felt by enough people to matter. That job was weeks of a product manager’s research, reading reports, scanning competitors, interviewing, synthesizing it into a formal document that argued the market case. And that job, the gathering and the synthesis, is precisely the work an agent is now fast and capable at. So the MRD is becoming a research project an agent runs for you: a deep, sourced investigation the agent conducts and assembles into a findings document, in hours rather than weeks, current rather than six months stale. The document gets thinner and faster and the production of it stops being the product manager’s labor. What does not compress, and what the product manager keeps, is the judgment of whether the market case the research lays out is real and worth a bet, which is the work that feeds the go/no-go in the human brief. So the MRD’s fate is the same move every artifact in this chapter makes, the production automates and the judgment remains, except that here it does not become a brief or gain a co-owner; it becomes an agent-produced input that the product manager interrogates and turns into the decision the human brief records. I am reasoning slightly ahead of the evidence here, because the prior books did not settle the MRD’s fate and the field has not named this yet, but the direction is hard to miss: the document that was always a research synthesis is becoming the research an agent synthesizes, and the part that was always the hard part, deciding what the evidence means and whether to bet on it, stays exactly where it was.
Every artifact has to watch the watcher
There is a thread running under all of these changes that is easy to miss because each artifact seems to be about the agent, and naming it directly is what keeps this chapter tied to the book’s deepest claim: the thing that decays is not only the agent, it is the human supervising it. The supervision paradox from the foundations said the supervisor erodes precisely because the agent is reliable, and if that is true then an artifact that tracks only the agent’s state and not the supervisor’s is measuring the half that fails second. So each of these deliverables owes a second reading, the one aimed at the human.
The boundary map owes more than where the line sits; it owes a note on where the human’s attention will decay, which boundaries the supervisor will start waving through after the agent has been right two hundred times, so the design can put friction exactly there. The status deck cannot report only that the agent completed its tasks; it has to report whether the supervisor is still supervising, the override frequency trending down toward zero that means rubber-stamping, not safety. The ADR owes the telemetry that makes supervisory drift visible at all, because the instruments cannot watch the watcher if the events were never wired to be emitted. And the eval package owes a revalidation cadence and a dataset-freshness date, because a golden dataset that is never re-judged is a checkmark decaying quietly toward the cardiology ward, certifying an agent against a world that has moved. An artifact that names the agent’s behavior and stays silent on the supervisor’s is Channel 1 wearing a Channel 2 label, and the team that ships it has built instruments that watch everything except the part of the system this book says fails first.
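The second reading of the status deck can be reduced to a small check on the override frequency over time. This is a sketch with invented thresholds: the four-period window and the 0.01 floor are hypothetical, and the function cannot tell rubber-stamping from an agent that has genuinely stopped needing correction. It only marks the pattern that deserves a human look.

```python
from typing import List


def rubber_stamp_warning(override_rates: List[float],
                         window: int = 4, floor: float = 0.01) -> bool:
    """Flag a steady fall in override frequency that ends at or near zero.

    override_rates holds one reading per reporting period, oldest first.
    window and floor are hypothetical defaults, not recommended values.
    """
    recent = override_rates[-window:]
    if len(recent) < window:
        return False  # too little history to call a trend
    falling = all(later <= earlier for earlier, later in zip(recent, recent[1:]))
    return falling and recent[-1] <= floor


sliding = rubber_stamp_warning([0.12, 0.08, 0.03, 0.01, 0.0])
engaged = rubber_stamp_warning([0.05, 0.06, 0.04, 0.05])
print(sliding, engaged)  # True False
```

A True here is a prompt to sample the recent approvals and see whether anyone read them, not a verdict. The check is only possible at all if the ADR required the review and override events to be emitted in the first place.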
The deliverable is where you can see the seam
Step back from the individual artifacts and the pattern is the chapter’s whole argument. Every deliverable a software team relies on was a single-owner artifact crossing a single seam, and the agent makes each one a shared artifact crossing the seams of the grid. This is not a coincidence of several artifacts changing at once. It is the same fact the people half of this part established, seen from the side of the things that flow rather than the people who hold them: the agentic product has more owners than the old team had, the new owners cluster in the supervision channel, and the deliverables are simply where those owners have to meet. A team can see its own collaboration gaps faster by looking at its artifacts than by looking at its org chart, because the org chart shows the chairs and the artifact shows the hand-off, and the failures live in the hand-offs.
The whole shift reads in one view, old artifact to new, single owner to shared:
| Old artifact | Becomes | New owners |
|---|---|---|
| User story (pass/fail criterion) | Outcome-centric spec with an eval set | Product manager (outcome), domain expert (judged cases), eval owner (graded set) |
| Journey map | Boundary map | Product manager (where the line sits), architect (where it is enforced), designer (where the human re-enters) |
| Architecture decision record (backward-looking) | The agent’s runtime contract (forward-looking) | The architect, building the walls Channel 2 needs |
| (nothing) | Source-of-truth register | The context owner, with the domain expert and architect |
| Mockup (the user’s screen) | The supervisor’s screen | The designer, subject moved to oversight |
| Status deck (build metrics) | Build metrics plus Channel 2 readings | The supervisor and eval owner, not just the product manager |
| PRD | Human brief plus executable brief | The room (human brief); shared (executable brief) |
| MRD | An agent-run research project | The product manager keeps the judgment, the agent does the gathering |
Every row moves the same way: the production automates or splits, and a new owner is pulled in where the agent forced a judgment the old producer could not supply alone.
So take the artifacts your team produced for its last agentic product and ask, of each one, whether it is still the thing it used to be and whether it still has the owner it used to have. Is the agent’s behavior written as a user story with a pass-or-fail criterion it cannot meet, or as an outcome spec with an eval set? And if it is the spec, who filled the eval set, the product manager guessing or the domain expert judging? Is there a boundary map, and did the architect confirm the boundaries on it are enforced? Is there a supervisor mockup, or only a user screen? Does the status deck carry Channel 2’s readings, or only the build metrics? Each artifact that is still the old single-owner thing, produced by the person who always produced it, with no new owner pulled in, is a place the team is running the new work through the old collaboration, and the deliverable will look finished while the judgment it was supposed to carry was never added.