Appendix B · Reference

Glossary

Single-line definitions of the technical and conceptual terms introduced in this book. Sorted alphabetically. Category letters in parentheses refer to the platform taxonomy in Appendix A.

Accuracy
Proportion of classifier outputs that match ground truth; a single-step validator metric that breaks down for multi-step agent trajectories.
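In standard confusion-matrix notation (TP, TN, FP, FN for true and false positives and negatives), the formula behind this definition:
\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]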
Actor-to-supervisor transition
The shift from performing a task to supervising an agent that performs it; the central change-management problem for agentic adoption.
Adaptive governance
Governance that adjusts rule intensity to context (acuity, consequence class, time), with structured override rights and audit.
Adversarial testing
Deliberate attempts to make an agent misbehave via prompt injection, tool abuse, credential leakage, or permission escalation.
Affected person
The individual whose life is changed by an agent’s decision, distinct from the user who operates the agent.
Agent
An AI system that uses tools and takes actions on behalf of a user or another system, typically across multiple reasoning steps.
Agent candidacy checklist
A planning artifact listing the criteria a problem must meet before an agent is built for it.
Agent identity
An IAM-bound principal under which the agent acts; a base platform dependency relabeled as agent-specific by most vendors.
Agentic AI
AI that plans and executes multi-step actions on behalf of a user, as distinct from AI that responds to a single prompt.
Algorithm aversion
The tendency of users to abandon an AI tool after a single visible failure even when its track record is otherwise strong.
Approval moment
The typed handoff point where an agent pauses, presents a decision package, and waits for human judgment before continuing.
AUC (area under the curve)
A classifier metric measuring discrimination between two classes across thresholds; a single-step validator metric.
Audit surface
The reconstructable decision trajectory of an agent, composed from observable action traces rather than the model’s self-narration.
Automation bias
The tendency to accept an automated recommendation even when contradicted by the evidence in front of you.
Automation complacency
The erosion of operator vigilance as a system’s observed reliability increases over time.
Automation expectation
The hand-off orientation users bring to a new AI tool, carried over from every other AI tool they have used.
Autonomy boundary
The typed runtime declaration of what an agent may do unilaterally versus what requires a human decision.
Autonomy ladder
A graduated scale of agent autonomy from suggestion through copilot to autonomous actor.
Background failure
An agent output that passes semantic evaluation but is not matched by any state change in the target system; the agent said “done” but nothing happened.
Bainbridge irony
The 1983 observation that the more reliable the automated system, the more thoroughly atrophied the operator’s monitoring skill becomes; the foundational citation for the supervision paradox.
Base platform dependency (Category E)
A non-agent-specific platform capability (IAM, audit logging, tenant isolation) that is required for agentic systems to function safely.
Behavioral monitoring
Observation focused on whether the agent’s judgment is correct, distinct from infrastructure monitoring of whether the system is running.
Blast radius
The scope of consequences an agent action can produce in the same execution; the PocketOS case (production data plus volume backups deleted in nine seconds) is the canonical example.
Bounded objective
An agent task with a clearly delimited goal, not an open-ended one.
Break-even volume
The task volume at which the marginal cost of running an agent falls below the marginal cost of the human process it replaces.
Brownfield deployment
An agentic deployment on top of an existing AI-capable platform license (SAP, Salesforce, ServiceNow, Microsoft); incremental floor cost.
Channel 1
The agent itself: autonomy boundary, logging, error handling, recovery workflow.
Channel 2
The human experience of supervising the agent: queue, intervention surface, workflow fit, supervisory interface.
Classifier metric
A metric defined against a binary or multi-class ground truth (accuracy, sensitivity, specificity, AUC, precision, recall, F1); valid at single-step validators.
Compound probability
The product of per-step success probabilities across a chained agent workflow; 0.95 across ten steps yields roughly 0.60 end to end.
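The arithmetic behind the example, worked out:
\[ 0.95^{10} \approx 0.599 \]
Each added step multiplies in another per-step factor, so reliability compounds against the workflow.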
Confidence calibration
The degree to which an agent’s stated confidence tracks its observed correctness; well-calibrated agents are uncertain on things they get wrong.
Constitutional AI
Anthropic’s 2022 framework for training models to follow declared principles; the architectural reference for the Constitutional Runtime Layer.
Constitutional runtime layer
Platform-level rules that execute in the agent’s request path and return a refusal or modified action, independent of design-time governance.
Context sufficiency
Whether the agent receives the relationships, calculated semantics, and governance rules it needs to act, not only the raw fields.
Copilot
A declared agent system type: acts only on explicit step-by-step instructions; the human initiates each step.
Coverage statement
A PM artifact listing which user intents, failure modes, and adversarial inputs the eval suite tested, and which were known but deliberately deferred.
Currency question
The set of contract-time vendor questions that ensure the platform’s AI capabilities remain current through the contract period.
Data lineage
The recorded chain of provenance from agent output back through retrieved sources, training data, and source-system origin; required for upstream-data-wrong detection.
Data observability
Detection of when the data the agent reasons from is not what the agent thinks it is: freshness, completeness, referential integrity, context availability, knowledge graph mapping accuracy.
Decision package
The information required for a human to approve an agent’s proposed action: what the agent knows, what is uncertain, what the consequences are, what the alternatives are.
Decision-trace capture
The capture of an agent’s reasoning steps and tool calls as a reconstructable sequence; distinct from the agent’s self-narration.
Deference allocation
The case-by-case design problem of when the supervisor should rely on the agent and when they should pause and verify; harder than blanket trust or blanket review.
Deployment event
Any change to the agent’s model, prompt template, or tool configuration; treated as a deployment requiring re-baselining. Foundation model updates from the provider are deployment events.
Derived KPI (Category B)
A metric computed on top of events the platform’s observability layer captures, not shipped as a named platform feature.
Deskilling
The loss of an existing capability through sustained AI substitution; the Budzyń ACCEPT colonoscopy result (twenty-eight to twenty-two percent adenoma detection in three months) is the empirical anchor.
Drift
Divergence of production behavior from the reference baseline; includes data drift, concept drift, prediction drift, and behavioral drift.
Earned autonomy
Movement up the autonomy ladder triggered by demonstrated competence in the specific failure modes that matter, not by a count or a calendar date.
End-to-end task success
Whether the agent completed the task the user intended, measured across the full trajectory rather than per step.
Eval / evaluation
A structured test of agent behavior under controlled conditions; a source of partial observability into system behavior, not proof of correctness.
Eval sign-off checklist
A planning artifact enumerating the eval outcomes required before a release decision.
External audit trigger
A pre-declared condition that invokes external audit of an agent’s behavior, regardless of internal governance status.
F1 score
The harmonic mean of precision and recall; a single-step classifier metric.
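As a formula, with precision \(P\) and recall \(R\):
\[ F_1 = \frac{2PR}{P + R} \]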
Fermentation culture
The 2025 to 2026 enterprise pattern of using agent count as a competitive benchmark while the security and supervisory containers remain underbuilt.
First-contact
The point where an agent handles the beginning of a problem story; a frequent mismatch for models trained on end-of-story documentation.
Floor cost
The minimum cost per agent task including model inference, tool invocation, human review, and coordination overhead.
Greenfield deployment
An agentic deployment without an existing AI-capable platform license; full infrastructure and observability build cost.
Hallucination
Plausible-sounding agent output not grounded in reality; categories include factual errors, outdated references, spurious correlations, fabricated sources, incomplete reasoning, and upstream-data-wrong.
Horvitz principle
The observation that interruption has a cost; routing approval requests requires a budget, not unlimited attention.
Iceberg
The phenomenon where data transferred from a source system loses its relationships, calculated semantics, and governance rules in transit; the fields arrive, the meaning stays behind.
Incident recovery time
The organizational time from detecting an agent incident to freezing, attributing, notifying, reauthorizing, and resuming.
Instrument half-life
The roughly eighteen-month useful life of an agentic observation instrument before a frontier-model generation change forces re-calibration.
Interruption budget
A cap on the number and priority of approval requests a supervisor receives per unit time, with routing, deferral, and batching against the cap.
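A minimal sketch of the mechanism in Python, with hypothetical names; a production router would also weight priority and consequence class:

    from collections import deque

    class InterruptionBudget:
        # Cap how many approval requests interrupt the supervisor per window;
        # everything over the cap is deferred and batched for later review.
        def __init__(self, cap: int):
            self.remaining = cap
            self.deferred = deque()

        def route(self, request, urgent: bool) -> str:
            if urgent and self.remaining > 0:
                self.remaining -= 1
                return "interrupt"         # surface to the supervisor now
            self.deferred.append(request)  # hold for batched review
            return "deferred"

        def next_window(self, cap: int) -> list:
            # Reset the cap and hand back the deferred batch for review.
            self.remaining = cap
            batch, self.deferred = list(self.deferred), deque()
            return batch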
Kill-switch
An architectural intervention that stops an agent action upstream of execution; distinct from emergency shutdown of the service.
Large language model (LLM)
A neural network trained on text to predict tokens; the model at the base of most contemporary agentic systems.
LLM-as-a-judge
The use of a language model to score another model’s output against a rubric; introduces its own error rate that must be calibrated against human labels.
LLM-as-judge bias
Documented systematic errors in LLM judges: longer-answer preference, position effect, same-model-family preference; present by default unless calibrated.
MAESTRO
Multi-Agent Environment, Security, Threat, Risk, and Outcome framework; the first agentic-specific threat modeling framework, Cloud Security Alliance, February 2025.
MedLog
The Harvard Medical School proposal for a clinical AI logging standard; nine fields covering model, user, target, inputs, artifacts, outputs, outcomes, feedback. The pattern generalizes.
Memory poisoning
An adversarial pattern that injects malicious content into an agent’s persistent memory store; present in ninety-four percent of audited production deployments (2025).
Mental model declaration
The explicit declaration of whether an agent is a suggestion tool, a copilot, or an autonomous actor; prevents user miscalibration.
Michelin Condition
The structural alignment between guide accuracy and business model: a guide is trustworthy when accuracy is the mechanism that drives the behavior that drives the revenue. Fails when revenue depends on the audience trusting the guide rather than acting on it.
Mis-skilling
The development of a capability calibrated to a flawed reference; the middle category in the NEJM 2025 deskilling-mis-skilling-never-skilling taxonomy.
MVP House of Cards
The pattern where MVP delivery culture defers governance layer after layer until the backlog exceeds the roadmap’s capacity to close it.
Never-skilling
The failure to acquire a foundational capability because AI was present during the entire formative window; the most consequential category in the NEJM 2025 taxonomy.
Non-delegable list
The set of decisions a PM declares must not be performed by the agent, regardless of autonomy level.
OpenTelemetry
The open-source observability standard for distributed tracing; the substrate most AI platforms emit agent traces against.
Orchestration layer
The software above the model that manages prompts, tool calls, memory, and multi-step workflow.
Override frequency
The rate at which humans reject or modify agent-proposed actions; falling over time can indicate supervisory erosion, not improved agent quality.
OWASP Top 10 for Agentic Applications
The OWASP Gen AI Security Project’s December 2025 vulnerability list specifically for agentic systems; policy framework, not platform primitive.
Pass@K
The practice of running an eval K times and reporting the success distribution; the correct quality gate for non-deterministic systems.
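A minimal sketch of the practice, assuming a hypothetical run_eval callable that executes one test case against the agent and returns pass or fail:

    def pass_at_k(case, run_eval, k: int = 10) -> float:
        # The system is non-deterministic, so one run is a sample, not a
        # verdict: run the same case k times and report the success rate.
        results = [run_eval(case) for _ in range(k)]
        return sum(results) / k

Across a reference dataset this yields a distribution of success rates, which is the quantity a release gate should inspect.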
PII detection
Identification of personally identifiable information in agent input or output; a single-step validator where classifier metrics apply.
Planning artifact (Category C)
A PM-team output tracked in a work-management system (suitability declaration, go or no-go memo, coverage statement); not a platform feature.
Platform emits, PM composes
The working contract for the six observation instruments: the platform provides events; the PM’s team composes the metrics.
Platform primitive (Category A)
A first-class platform feature the PM or builder consumes directly.
Policy requirement (Category D)
An organizational standard the platform must support but does not itself define (GDPR compliance, retention policies, kill-switch obligations).
Precision
The proportion of agent outputs classified as positive that are truly positive; a single-step classifier metric.
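In confusion-matrix notation, with TP true positives and FP false positives:
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \]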
Project Glasswing
Anthropic’s 2026 controlled-deployment consortium for the Claude Mythos Preview model, restricted to defensive security work; example of frontier model capability outpacing public release.
Prompt injection
An adversarial input that attempts to override the agent’s instructions through crafted text in the user prompt, tool output, or retrieved content.
RAG poisoning
An adversarial pattern that corrupts the retrieval corpus an agent reads from; ninety-percent attack success at five malicious texts in a base of millions (PoisonedRAG, USENIX Security 2025).
Real-time observability
Observation that fires upstream of irreversible action; required for actions whose consequence timescale is shorter than alert-and-respond cycles.
Reasoning trace
The model’s self-narration of its chain of thought; distinct from action trace and treated as commentary rather than evidence.
Recall
The proportion of truly positive cases the agent correctly classifies as positive; a single-step classifier metric. Equivalent to sensitivity.
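In confusion-matrix notation, with TP true positives and FN false negatives:
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \]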
Recovery workflow
The typed declaration of how an agent recovers from an error: compensating action, rollback, or explicit non-recovery.
Reference dataset
The immutable, versioned collection of test cases against which an agent’s evals are run; carries source-document lineage.
Refusal detection
A classifier that identifies when an agent has correctly declined to answer; a single-step validator where classifier metrics apply.
Retirement workflow
The lifecycle primitive that decommissions an agent while preserving its audit trail and blocking new invocations.
Retrieval-augmented generation (RAG)
A pattern where the system retrieves relevant text at query time and provides it to the model as context; base platform dependency, not a sufficiency proof.
Rollback time
The time from detecting an incorrect agent action to restoring the last known good state; measured as a distribution.
Router
A single agent step that selects which tool to invoke; a common single-step validator where classifier metrics apply.
Scheduled autonomy
Movement up the autonomy ladder triggered by a count or a calendar date rather than demonstrated competence; the pattern in the Utah Doctronic case.
Semantic validation
Verification that the agent’s output matches what the rubric expects; complementary to state validation.
Sensitivity
Same as recall; the proportion of truly positive cases correctly identified. A single-step classifier metric.
Sequential Tool Attack Chaining (STAC)
An adversarial pattern that chains legitimate-looking tool calls toward an unauthorized outcome; over ninety-percent attack success on GPT-4.1 across four hundred and eighty-three test scenarios (AWS / UC Berkeley, 2025).
Shadow workflow
A parallel manual process maintained by users alongside the agent, indicating distrust; the canonical sign of a failing Channel 2.
Silent degradation
Gradual erosion of agent performance without a triggering incident; the failure mode that continuous monitoring and periodic checkpoints catch differently.
Skill decay
The erosion of cognitive and strategic skills in people when agents take over tasks; invisible to the performer, visible in downstream handoffs.
Specificity
The proportion of truly negative cases correctly identified; a single-step classifier metric.
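In confusion-matrix notation, with TN true negatives and FP false positives:
\[ \mathrm{Specificity} = \frac{TN}{TN + FP} \]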
State validation
Verification that an agent’s claimed action produced a corresponding state change in the target system; complementary to semantic validation.
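A minimal sketch in Python pairing the two checks; rubric_matches and read_state are hypothetical stand-ins for a semantic scorer and a query against the target system:

    def validate_action(claimed_output, expected_state,
                        rubric_matches, read_state) -> str:
        semantic_ok = rubric_matches(claimed_output)  # semantic validation
        state_ok = read_state() == expected_state     # state validation
        if semantic_ok and not state_ok:
            # The agent said "done" but nothing happened: a background failure.
            return "background failure"
        return "ok" if semantic_ok and state_ok else "failed"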
Suggestion engine
A declared agent system type: surfaces options and waits; the human always decides and acts.
Suitability assessment
A pre-build gate determining whether a problem is appropriate for an agent: bounded objective, tolerable error, clear success signal, delegatable authority, recoverable consequences, measurable outcome.
Supervision paradox
The structural condition in which the same deployment that requires human supervision erodes the supervisor’s ability to perform it; framework anchored on Bainbridge 1983 and the recent empirical literature.
Supervisory system
The second product shipped with every agent: the tools, metrics, and interventions the human uses to supervise agent behavior.
System type
The declared category of an agent (suggestion engine, copilot, autonomous actor) that determines UI affordances and user expectations.
Task success rate
The proportion of agent tasks that achieved the user’s intended outcome; distinct from task completion rate.
Tolerable error
The explicit declaration of which categories and magnitudes of agent error are acceptable before a problem is suitable for agent automation.
Tool boundary
The enumerated, logged set of systems an agent can access; the most concrete expression of the agent’s authority.
Tool call accuracy
Whether the agent invoked the correct tool with correct parameters; a single-step validator where classifier metrics apply.
Tool privilege escalation
An adversarial pattern that uses an agent’s authorized tool calls to gain unauthorized access; present in ninety-five percent of audited production deployments (2025).
Trajectory
The full sequence of reasoning steps, tool calls, and observations an agent produces for one task.
Trajectory evaluation
Eval of the full multi-step agent path end to end, scored by match against a reference trajectory (exact, in-order, unordered, or superset) or by a judge against criteria.
Trust boundary
The demarcation between inputs and tools an agent may act on autonomously versus those requiring explicit authorization.
Two-channel agentic design
The design discipline of treating Channel 1 (the agent) and Channel 2 (the supervisory system) as two products shipped together, not one product with a training plan.
Unintended action rate
The frequency at which an agent takes actions outside its declared autonomy boundary.
Upstream-data-wrong
A hallucination category where the model faithfully produces output from input that is complete but operationally wrong; undetectable at the model layer.
When-wrong spec
A PM artifact specifying how the system must behave when it is wrong, not only when it is right; operationalized into the four-question pre-launch review with named owners.