Who Sits in the New Seats
The grid says what must be owned. The box says where the owning starts. This chapter is about the act in between, growing the box to cover the grid, and the honest version of that act is not a single org chart you copy. It is a different answer at five people than at five hundred, and getting the difference right is most of what it means to staff an agentic team. A startup that separates every cell into its own role drowns. An enterprise that lets one person hold the whole supervision channel ships the empty column as an incident. So the question is never how many roles does an agentic team have. It is which cells merge and which must split, at the size you are, and the principle that governs the answer is one this series has stated before in a different room: merge to decide, separate to ship.
That principle resolves the apparent fight between moving fast and covering the grid. The roles that build the agent can merge while a team is small and deciding what to build, because at that stage a tight room of high-judgment people is the whole advantage, and separating them prematurely is bureaucracy a startup cannot afford. The roles that supervise the agent must separate as the thing ships and runs, because the supervision failures are distributed across the grid and no merged generalist can see all of them at once. Merge the build, separate the supervision. A team that merges everything, build and supervision alike, ships fast and finds the empty column at scale. A team that separates everything from the first day cannot move. The skill is knowing which cells are which, and the answer changes as you grow.
The startup: one senior, and one pair of eyes that are not theirs
At the smallest scale, one person holds most of the grid, and that is correct rather than a compromise. A founding team building its first agent might have a single senior engineer who owns the entire build channel, intent and structure and build and run for the agent itself, because at five people the cost of separating those cells exceeds the cost of one capable person holding them, and a senior person can hold them well. The box at this size is barely a box. It is two or three people wearing all the hats between them, and that is not a failure of structure, it is the right structure for the stage.
What the smallest team cannot do, and what kills the ones that fail, is let the person building the agent also hold the entire supervision channel by default. The builder is the last person who can see their own creation clearly, which is the oldest reason code review exists, and an agent supervised only by the person who built it is an agent supervised by its most biased observer. So the minimum viable separation, even at five people, is that someone other than the agent’s builder owns the question of whether the agent can be trusted. Not a department. A second pair of eyes, with the standing to say this is not ready and the independence to mean it. The smallest honest agentic team is not one pipeline with one owner. It is one person who can hold most of the build and a second who holds at least the intent and the watching on the supervision side, so that the supervision channel has at least one set of eyes in it that did not also write the thing being watched. Everything else can wait for scale. That one separation cannot: it is the difference between a team that catches its own agent and a team that ships its blind spot.
The scale-up: the cells become roles
In the middle, the cells separate along their natural seams, and the grid starts to look like distinct people. This is the stage most of this book has been describing, because it is the stage at which each cell has become a full-time judgment rather than a part of someone’s week. The architect emerges as the owner of structure across both channels, how the agent is shaped and which guarantees are walls rather than requests, because the system is now complex enough that structure cannot be held on the side. An eval owner emerges because the golden dataset and the judge calibration and the regression discipline have become more than anyone can do between other duties. A supervisor or agent-operations role emerges because the running agent now needs someone whose actual week is the watching. The product manager, who at five people made every intent decision in both channels, now shares the supervision-intent with the people who can see drift and correctness and recovery, and keeps the part that is theirs, the trust conditions stated as requirements and the decision to ship.
This is where the box visibly grows, and where the growth has to be managed rather than just allowed. Adding the architect as a distinct seat changes the product manager’s job, because a decision that used to be theirs alone, where the boundary sits and how it is built, is now split across two people who have to agree, and that split is exactly the old power struggle waiting to flare. The teams that grow the box well at this stage do it by naming the work, not the rank. The architect owns whether the boundary holds; the product manager owns where it should sit; neither owns the other; and the seam between them is a hand-off both are responsible for closing. Framed as work, the new seat is a relief, someone finally owns the thing that was falling through. Framed as status, it is a threat, someone took a decision that used to be mine. The same addition lands as either, and which one it lands as is decided by whether the team staffs from the grid or from the org chart.
The enterprise: the cells become teams, and the seams become borders
At the largest scale the cells do not just separate, they multiply, and a single cell becomes a team. The supervision-operate cell, one person at scale-up, becomes an agent-operations function with an on-call rotation and a manager. The eval cell becomes an evaluation-engineering team. The structure cell becomes an architecture group. The box is now a large organization, every seat filled, and a new failure appears that the smaller stages did not have. When every cell is owned by a different team, the hand-offs between cells stop being conversations and become organizational boundaries, and intent leaks across an organizational boundary far more easily than across a hallway. The boundary the product manager specified used to reach the architect in a meeting; now it reaches a different department through a ticket, and a ticket carries intent worse than a conversation does. The enterprise grid is fully staffed and most at risk of the seam failures this book has catalogued, because the seam that was a five-minute exchange at fifteen people is a cross-team dependency at five hundred. The large organization has every chair filled and has to work hardest to keep the spaces between the chairs from becoming the new empty column.
So the shape of the growth is not linear, and it is worth saying plainly. Going from startup to scale-up, the work is adding seats. Going from scale-up to enterprise, the work shifts to defending the hand-offs between seats that now sit in different orgs, because at that size the failure is rarely an unowned cell and usually an unowned seam. The capabilities this book has named are not a fixed roster to hire all at once: the architect, the eval owner, the supervisor, the domain expert, the context owner, the forward-deployed engineer, the orchestration engineer (who owns how the fleet’s agents compose and delegate, the build seat of the next part), the red team, and the rest. They are the cells that emerge as the box grows, bundled into one senior at a startup and spread across teams at an enterprise, and the list of them is useful as a checklist of work to be owned, not as an org chart to be copied. A team that reads the list as eight required hires at any size has misread it. The list is what the grid contains. The staffing is how much of it one person can hold at the size you are.
One dimension from the foundations is deliberately not given a seat here: the regulatory one. In a regulated domain it is a seat, counsel at the table with a veto on the gate design; below that threshold it rides with the domain expert and the boundary requirements, and the team should be able to say which of the two it is doing on purpose.
The roles the market has not learned to pay for
There is a way to see how new the supervision channel still is, and it is in the money. The product manager role at the frontier labs commands compensation that has become its own headline, medians well into the high six figures and beyond, because the market has decided that the judgment of what to build with AI is scarce and valuable, and it is right that the judgment is scarce. But look for the compensation band for the agent supervisor, the person whose job is to watch the running agent, and it is not there. There is an established, well-priced market for the person who decides what the agent should do and a thin-to-absent one for the person who watches whether it is still doing it safely. The roles that own the supervision channel, the agent supervisor, the eval owner as a standing function, the agent-operations manager, are being invented faster than the labor market has learned to name and price them, which is the empty column showing up in the one place that is hardest to argue with, the salary survey.
This is the closed loop made literal. The discourse celebrates the product manager as the center of the agentic team, the market pays the product manager as if that were true, and the roles that the product manager’s own product depends on for safety, the ones this book has spent its length arguing are necessary, do not yet have a price because the field has not yet fully admitted they are jobs. The teams that staff the supervision channel before the market forces a price on it are buying the scarce thing while it is still cheap, and the teams that wait will pay for it the way you always pay for the seat you left empty, after the incident, at a premium, under duress.
Who the watchers report to
There is an org question underneath the staffing question that the book has not asked and that decides whether any of the staffing matters, and it is a question about reporting lines. Suppose you do everything this part argues for. You hire the eval owner and the agent supervisor, you fund the supervision column, you fill the seats. And you have them all report to the product leader whose roadmap depends on the agent shipping and the metrics staying green. You have just built the supervision channel and handed it to the person it exists to check. The eval owner who reports to the leader whose quarter depends on a green checkmark is under quiet, structural pressure to find the checkmark green, not because anyone is corrupt but because that is what reporting lines do, they align incentives, and the whole point of the supervision channel is that its incentives must not align with the channel it supervises. Banking learned this the expensive way and enforced the answer: the people who validate the models and audit the controls do not report to the business whose models and controls they check, and the audit function reports past the executives to the board, because independence that reports to the thing it watches is not independence. The agentic team does not have a regulator making it draw the line, but the line is the same. If the supervision of the agent reports into the production of the agent, the supervision is advice the producer can overrule, and the green checkmark means what the producer needs it to mean. Where Channel 2 reports is not an HR detail. It is whether Channel 2 is real.
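The reporting-line test can be run as a mechanical check. What follows is a minimal sketch, not a prescribed tool: the seat names, the `REPORTS_TO` table, and both helper functions are hypothetical, invented here for illustration. It flags any supervision seat whose management chain passes through the leader who produces the agent.

```python
from typing import Dict, List, Optional

# Hypothetical org data: each seat maps to the seat it reports to (None = top).
REPORTS_TO: Dict[str, Optional[str]] = {
    "board": None,
    "ceo": "board",
    "product_leader": "ceo",
    "risk_officer": "board",
    "engineer": "product_leader",
    "eval_owner": "product_leader",      # reports into production
    "agent_supervisor": "risk_officer",  # reports outside production
}


def management_chain(seat: str, reports_to: Dict[str, Optional[str]]) -> List[str]:
    """Return every seat above `seat`, nearest manager first."""
    chain: List[str] = []
    seen = {seat}  # guards against a cycle in bad org data
    current = reports_to.get(seat)
    while current is not None and current not in seen:
        chain.append(current)
        seen.add(current)
        current = reports_to.get(current)
    return chain


def compromised_watchers(
    watchers: List[str], producer: str, reports_to: Dict[str, Optional[str]]
) -> List[str]:
    """Supervision seats whose chain of command runs through the producer."""
    return [w for w in watchers if producer in management_chain(w, reports_to)]


flagged = compromised_watchers(
    ["eval_owner", "agent_supervisor"], "product_leader", REPORTS_TO
)
print(flagged)  # ['eval_owner']
```

In this made-up org the eval owner is flagged and the agent supervisor is not. The check says nothing about whether the people are honest; it only shows where the structure asks them to grade their own boss's work.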
The cheapest version of this discipline, and the first governance act a team can take, costs nothing but a list: an inventory of the agents you are running, each one risk-tiered by the gradient from the foundations, so that the high-stakes agents get the full supervision apparatus and the low-stakes ones get the thin column on purpose rather than by neglect. Most organizations today cannot produce that list, which means they cannot say which of their running agents deserve which supervision, which means the question is being answered by default, everywhere, as none. The inventory is cheaper than any hire and it is the precondition for all of them.
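The inventory can start as a very small data structure. The sketch below is one possible shape, under stated assumptions: the three-step `RiskTier` scale, the `SUPERVISION_BY_TIER` mapping, and the example agents are all hypothetical and stand in for whatever gradient and controls a team actually uses. Its one design choice is that an agent nobody has tiered is reported separately rather than quietly treated as low risk.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional, Tuple


class RiskTier(IntEnum):
    """Assumed three-step scale; a real gradient may have more rungs."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical mapping from tier to the supervision each agent is owed.
SUPERVISION_BY_TIER: Dict[RiskTier, List[str]] = {
    RiskTier.LOW: ["sampled log review"],
    RiskTier.MEDIUM: ["sampled log review", "regression evals"],
    RiskTier.HIGH: [
        "sampled log review",
        "regression evals",
        "named supervisor",
        "human gate on irreversible actions",
    ],
}


@dataclass
class AgentEntry:
    name: str
    owner: str
    tier: Optional[RiskTier] = None  # None means nobody has tiered it yet


def supervision_plan(
    inventory: List[AgentEntry],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (agent name -> owed supervision, names of untiered agents)."""
    plan: Dict[str, List[str]] = {}
    untiered: List[str] = []
    for agent in inventory:
        if agent.tier is None:
            untiered.append(agent.name)  # supervision decided by default
        else:
            plan[agent.name] = SUPERVISION_BY_TIER[agent.tier]
    return plan, untiered


inventory = [
    AgentEntry("travel-booking", owner="ops", tier=RiskTier.HIGH),
    AgentEntry("faq-summarizer", owner="support", tier=RiskTier.LOW),
    AgentEntry("invoice-triage", owner="finance"),
]
plan, untiered = supervision_plan(inventory)
print(untiered)  # ['invoice-triage']
```

The untiered list is the part worth reading first: it names the agents whose supervision is currently being answered as none.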
There is a second cheap precondition that the inventory points at, and it is about the people, not the agents. A supervision seat is only as good as the literacy of the person in it, because watching an agent for the failure that matters requires understanding what the agent is doing well enough to know when it has stopped. The research on AI literacy now treats it as a gradient, not a binary, a spread of competence that runs from “can operate the tool” up through “can judge its output” to “can supervise it when it is subtly wrong,” and the finding that matters for staffing is that the gradient mediates whether a human-in-the-loop actually catches anything: put a low-literacy reviewer in a high-stakes loop and you have staffed the seat without filling it. So the literacy of the supervisor is not a training afterthought; it is a staffing gate, a thing the team should assess before it assigns the watch, the same way the inventory assesses which agents deserve a watch at all. The two questions are the matched pair the empty column has been hiding: which running agents deserve supervision, and which people are literate enough to provide it.
There is a hard objection to everything in this part, and the book should voice it before a reader does, because the book’s own best evidence makes it. This book’s answer, over and over, is name an owner. But the insurer in the failures part had an owner: a physician was assigned to review every denial, the gate was designed, the box on the org chart was filled, and it was theater anyway, a human at about a second a case. An ownership grid is exactly the artifact every large enterprise already produces in volume, and the RACI chart that satisfies the auditor while nothing changes is the slide-box failure in the book’s own vocabulary, applied to the book’s own apparatus. So naming an owner is necessary and it is the cheap half. What separates an owned cell from an assigned one is three things the slide does not capture: budgeted hours behind the name, a verification that the work is actually being done, and an honest answer to the seconds question, how much time the owner has to do the thing they own. A name with no hours is the insurer’s gate. A name with hours but no verification is the green checkmark no one checked. The grid is the start of the work and never the proof of it, and a team that mistakes the filled-in table for the done work has built the most sophisticated version of the empty column there is, the one that looks full.
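The three tests that separate an owned cell from an assigned one can be written down as a check against the grid. This is a sketch under assumptions: the `CellOwnership` fields, the `min_seconds_per_case` threshold, and the example numbers are hypothetical, chosen only so that the first example works out to one second per case.

```python
from dataclasses import dataclass


@dataclass
class CellOwnership:
    cell: str
    owner: str
    budgeted_hours_per_week: float
    cases_per_week: int
    verified: bool  # has anyone checked that the work is actually done?

    def seconds_per_case(self) -> float:
        """The seconds question: budgeted time divided by the caseload."""
        if self.cases_per_week == 0:
            return float("inf")
        return self.budgeted_hours_per_week * 3600 / self.cases_per_week


def classify(c: CellOwnership, min_seconds_per_case: float) -> str:
    """Label a cell 'owned' only if it passes all three tests."""
    if c.budgeted_hours_per_week <= 0:
        return "assigned: name with no hours"
    if c.seconds_per_case() < min_seconds_per_case:
        return "assigned: too little time per case"
    if not c.verified:
        return "assigned: hours but no verification"
    return "owned"


# Hypothetical numbers: 2 hours a week against 7,200 cases is 1 second each.
gate = CellOwnership("denial review", "physician", 2.0, 7200, verified=True)
watch = CellOwnership("agent watch", "supervisor", 20.0, 200, verified=True)

print(gate.seconds_per_case())                 # 1.0
print(classify(gate, min_seconds_per_case=300))   # assigned: too little time per case
print(classify(watch, min_seconds_per_case=300))  # owned
```

The threshold is a judgment the team has to make per cell. The point of the sketch is only that the filled-in name is not one of the inputs that decides the answer.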
The arithmetic the CFO is actually asking
A leader reading this part has a question the book has so far talked around, and it deserves a direct answer because the honest answer is the persuasive one. The preface said the agentic shift lets a few people do the work of many, and this part has spent its length adding seats, the architect for enforcement, the eval owner, the supervisor, the domain expert, the context owner. So which is it: is the agentic team smaller or larger than the triad it replaced? The CFO cutting headcount this quarter because of AI is owed a real number, not a dodge.
The answer is that the team ends up smaller in total and differently shaped, or the same size pointed differently, and the mechanism is a reallocation, not an addition. The agent collapses the build roles, the thing this book’s craft section showed: one fluent engineer now spans what used to be several specialists, the production half of the designer’s and the product manager’s work evaporates, the sheer headcount that used to go into making the thing falls. That freed capacity is the budget. It does not get banked as a smaller team; it gets moved, from building the agent to supervising it, from Channel 1 to Channel 2. The team that takes the agent’s productivity dividend and pockets it as a layoff has bought the runaway loop and the silent drift, because it kept the cheaper half and cut the half that keeps the cheaper half safe. The team that takes the same dividend and spends it on the supervision seats ends up roughly the same size as before, or smaller, but with its people pointed at the half of the work that the agent made dangerous rather than the half it made cheap. The number the CFO wants is not “add four people.” It is “the same money, fewer builders, and supervisors you did not have, because the builders you no longer need are how you afford the supervisors you now do.”
The dev chapter showed where the dividend goes by default: into the review queue, the senior buried at four hundred percent. The reallocation is therefore a sequence, not a swap: build the automated review first, recover the capacity, then spend it on the seats. The dividend Part II watched leak into the queue is the exact budget Part V spends here, and a team that has felt the leak has already paid for the seats without buying them.
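The reallocation is simple enough to work through with numbers. The figures below are illustrative only and come from no survey or case in this book: a hypothetical team of ten builders, an assumed agent dividend that lets six cover the build, and three supervision seats to fund. `TeamShape` and `reallocate` are names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class TeamShape:
    builders: int
    supervisors: int

    @property
    def total(self) -> int:
        return self.builders + self.supervisors


def reallocate(
    before: TeamShape, builders_needed_with_agent: int, supervision_seats_needed: int
) -> TeamShape:
    """Spend freed build capacity on supervision seats, never more than was freed."""
    freed = before.builders - builders_needed_with_agent
    if freed < 0:
        raise ValueError("the agent did not free any build capacity")
    funded = min(freed, supervision_seats_needed)
    return TeamShape(
        builders=builders_needed_with_agent,
        supervisors=before.supervisors + funded,
    )


before = TeamShape(builders=10, supervisors=0)

# Pocket the dividend: fewer builders, still nobody watching.
pocketed = TeamShape(builders=6, supervisors=0)

# Reallocate the dividend: same six builders, three funded supervision seats.
after = reallocate(before, builders_needed_with_agent=6, supervision_seats_needed=3)

print(before.total, pocketed.total, after.total)  # 10 6 9
```

With these made-up inputs the reallocated team is one seat smaller than the original and has three supervisors it did not have before. The sketch assumes the capacity has already been recovered; if the dividend is still leaking into the review queue, `freed` is zero in practice and there is nothing yet to spend.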
The seat that sits with the customer
There is one more owner the preface promised and the staffing has not yet seated, and it is the person who sits with the customer in the environment the product team never saw. An agentic product does not finish at the boundary of the company that built it; it lands in a customer’s workflow, with the customer’s data and the customer’s edge cases and the customer’s particular way of being wrong, and someone has to be there when it lands, not to sell it but to watch it work in a context the builders could not have tested. The forward-deployed role, an engineer or specialist embedded with the customer, is already a named, hired thing in the agentic market, and it maps onto a vector this book named in the failures part and left without a clear owner: compensatory drift, the shadow workflows users build around the agent when it does not quite fit, the manual step they add, the field they stop trusting, the workaround that becomes load-bearing. That drift is invisible from headquarters and visible only to someone in the room with the user, and it is the forward-deployed seat’s to see and carry back. A team that ships an agent into environments it never saw and staffs no one to watch it land has built a product whose most important failures happen where no one on the grid is looking.
The change the box does not absorb on its own
One more thing has to be staffed, and the prior books in this series handed it to the product manager, which is where it does not belong. Growing the box is not only an org-chart exercise; it is a change to how people work, and a change that large does not absorb itself. When the agent arrives and the engineer’s job inverts toward review, when the designer starts designing the supervisor instead of the screen, when the product manager’s day empties of the translation that used to fill it, the team goes through a real and disorienting transition, and a capable team can get worse for a quarter, not because the agent was bad, but because nobody designed for what the agent did to the people. Managing that transition, keeping the team intact and learning through it rather than thrashing, is change management, and it is not the product manager’s to own, because the product manager has no authority over how the engineering manager develops engineers or how leadership decides to invest in the workforce through a disruption.
This is the place to name a seat the book has leaned on without introducing it, because it has now quietly acquired two of the heaviest responsibilities the agentic shift creates. The engineering manager has appeared in these pages as the owner of the skill leg, the one who can put practice hours on a calendar and defend them against the throughput pressure that argues them away, and the engineering manager appears again here as the owner of the transition the agent forces on the team. Those are not small additions to a familiar role. They are the agentic product loading new weight onto a seat that already existed, the same move the box has always made, except that the engineering manager’s new weight is not a new title but a new and harder version of the job they already held: developing people whose development the agent is actively eroding, and leading a team through a change to the meaning of their work. The architect and the eval owner and the supervisor are new seats the box grows. The engineering manager is an old seat the agent makes load-bearing in a way it was not before, and a team that fills the new seats while assuming the engineering-manager role is unchanged has missed half of what the shift demanded. Change management belongs to the engineering manager and the organization’s leadership, the same owners as the skill-erosion problem earlier in this book. The box grows, and someone has to lead the people through the growing, and that someone manages people, which the product manager does not. Naming change management as a product-manager job is the closed loop again, assigning the work to the seat that cannot perform it because that seat is the one writing the books.
What the product manager can do, and should, is the one move that is theirs even when the fix is not: make the cost visible to the person who can act on it before it comes due. A product manager cannot mandate practice hours or run the team through a reorganization, but they can see the transition coming, because they sit at the seam and watch every craft change at once, and they can put the cost on the table, in the planning conversation, named and quantified, so the engineering manager and the leadership who own the fix are deciding about it on purpose rather than discovering it as a bad quarter. The failure the prior framing produced was not that the product manager refused the work; it was that the work fell to the one person who could only watch it, and so no one with a lever ever heard that it needed pulling. The product manager’s real job at the transition is to be the early-warning system for the people who can act: name the human cost of the shift in the room where the budget is set, to the seat that can spend against it.
So the staffing of the agentic team comes down to a sequence of honest questions about your own size. Which cells of the grid can your current box still hold, and which have become too much for the seats you have. Which new owner does the next stage of growth require, and is it being added as work that needed an owner or contested as status someone lost. Is anyone watching the running agent whose actual job that is, or is it the builder watching their own creation. And who is leading the people through the change the agent is making to all of their jobs, because that is a job too, and it is not the product manager’s. Answer those for the team you actually are, not the one the org chart wishes you were, and the box grows to fit the work the way it always has, one seat at a time, in the order the work demands.
It is worth ending where the book began, with the travel agent that booked a non-refundable fare into a cancelled trip, because the whole argument is the difference between the team that shipped it and the team that grew the box to hold it. Run the same agent past a team that has filled the seats this book named, and watch the failure fail to happen. The product manager who owns the boundary decides that an irreversible international booking is not a fourth-rung action and gates it, so the fare pauses for a human. The architect who owns enforcement makes that gate a wall in the booking path rather than a line in the prompt, so it holds even when the agent reasons its way toward confidence. The context owner, writing the sufficiency statement, sets the two lists side by side, what the job requires the agent to see and what it can actually see, and the calendar sits in the delta, closed before launch or accepted in writing instead of discovered as a charge, and the audit surface is scoped to that blind spot on purpose. The domain expert, someone who knows how corporate travel actually breaks, ranks the sources the agent does read, so a stale itinerary is never served as a current one. The eval owner builds the cancelled-trip case into the golden dataset, so the gap is caught in testing rather than in production. The agent supervisor, whose week is the running agent and not the launch, watches the override and exception patterns and catches the booking in the hour it happened, when it is still recoverable, because the recovery workflow was built rather than improvised. And the affected person, the employee whose trip was cancelled, was in the design review this time, represented by someone whose job was to ask what happens to the human at the end of a wrong decision. None of those people is heroic. 
Each of them owns one cell the triad left empty, and the agent that ended the first chapter as an incident ends the book as a caught exception, because the box grew the seats the work required. That is the entire difference, and it is the book: same agent, same blind spot, a team shaped to contain it.