
Balancing Autonomy and Control: KPMG’s Framework to Safely Scale AI Agents by 2026

Autonomous AI agents are moving from novelty to operational backbone—without the luxury of improvisation

Enterprises are rapidly shifting from chat-based copilots to autonomous AI agents that can plan, reason, and execute multi-step work across systems. The appeal is straightforward: agents compress decision cycles, reduce coordination overhead, and unlock new forms of automation that traditional RPA and workflow tools struggled to reach. Yet the same autonomy that creates value also introduces a board-level anxiety—unpredictable “runaway” behavior, where an agent drifts beyond intent, mis-executes at scale, or triggers cascading downstream actions.

KPMG Trusted AI leader Sam Gloede frames the moment as less a question of whether agentic systems will be adopted, and more a question of whether they will be adopted safely enough to scale. The prescription is a balanced framework: preserve autonomy, but constrain it with explicit boundaries, continuous testing, and human governance that is designed into operations rather than bolted on after a failure.

Recent public missteps at major firms such as Amazon and McKinsey, alongside experimental incidents like the Moltbook social network for agents, have sharpened the urgency. These episodes function as early warning signals: when agents interact with real systems—customer-facing channels, procurement, finance, compliance workflows—the cost of error is no longer a contained model failure. It becomes an enterprise incident with reputational, regulatory, and financial consequences.

Boundary-first engineering: why “agent identity” and containment are becoming table stakes

A central technical theme emerging from Gloede’s approach is boundary-driven autonomy—the idea that agents should be powerful inside well-defined perimeters, and deliberately constrained outside them. This is not merely policy language; it is an architectural stance that echoes how cloud computing matured through containerization, microservices governance, and least-privilege access.

Key controls gaining prominence include:

  • Explicit operational boundaries: clearly defined scopes for what an agent can access, change, approve, and initiate—down to APIs, datasets, and transaction limits.
  • Unique agent identifiers (agent IDs): persistent identity for each agent instance or role, enabling precise attribution, monitoring, and audit trails.
  • Telemetry by design: instrumenting agents so every significant action, tool call, and decision rationale can be observed and reconstructed.
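
To make the three controls above concrete, here is a minimal Python sketch of how they might fit together. It is an illustration under stated assumptions, not KPMG's implementation: the `Agent` and `AgentBoundary` classes, the tool names such as `refund.issue`, and the 200.0 transaction limit are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import uuid


@dataclass(frozen=True)
class AgentBoundary:
    """Hypothetical scope: tools the agent may call and a per-transaction limit."""
    allowed_tools: frozenset
    max_transaction_amount: float


@dataclass
class Agent:
    role: str
    boundary: AgentBoundary
    # Unique agent ID so every action can be attributed to one instance.
    agent_id: str = field(default_factory=lambda: str(uuid.uuid4()))
    # In-memory audit trail; a real system would ship this to durable storage.
    audit_log: list = field(default_factory=list)

    def request_action(self, tool: str, amount: float, rationale: str) -> bool:
        """Check a request against the boundary and log it whether or not it passes."""
        allowed = (
            tool in self.boundary.allowed_tools
            and amount <= self.boundary.max_transaction_amount
        )
        self.audit_log.append({
            "agent_id": self.agent_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "tool": tool,
            "amount": amount,
            "rationale": rationale,
            "allowed": allowed,
        })
        return allowed


# Illustrative agent: may read the CRM and issue refunds up to 200.0.
refund_agent = Agent(
    role="refund-handler",
    boundary=AgentBoundary(frozenset({"crm.read", "refund.issue"}), 200.0),
)
in_scope = refund_agent.request_action("refund.issue", 50.0, "duplicate charge")
over_limit = refund_agent.request_action("refund.issue", 5000.0, "customer asked")
out_of_scope = refund_agent.request_action("pricing.update", 0.0, "match competitor")
```

Denied requests are logged alongside approved ones, so the audit trail captures what the agent attempted as well as what it did.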

This identity-and-boundary model matters because agent risk is often less about a single wrong answer and more about compounding execution. An agent that can write to a CRM, trigger refunds, update pricing, or modify cloud resources can turn a small reasoning error into a large operational event. In that sense, agent governance resembles production reliability engineering as much as it resembles model evaluation.

The more subtle implication is that enterprises are being pushed toward a new discipline: agent provenance and accountability. As organizations consume agentic capabilities through platforms and third parties, due diligence must extend beyond vendor reputation to the specifics of each agent’s control model—mirroring how supply-chain audits evolved in manufacturing and how third-party risk management matured in cybersecurity.

From SOC to “AI Operations Center”: the rise of closed-loop governance and continuous red-teaming

Gloede’s emphasis on an AI Operations Center signals a broader organizational shift: agentic AI is forcing companies to operationalize governance in real time. This mirrors the evolution of Security Operations Centers (SOCs)—where monitoring, incident response, and escalation paths became permanent fixtures rather than episodic projects.

A mature AI operations model is likely to include:

  • Real-time monitoring and anomaly detection for agent actions, tool usage, and outcome patterns
  • Graduated oversight calibrated to task risk (low-risk automation vs. high-stakes approvals)
  • Human intervention thresholds with clear escalation protocols
  • Kill switches as a last-resort circuit breaker: rarely used, but essential
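
One way to picture graduated oversight with a kill switch is the Python sketch below. The `OversightPolicy` class, the risk-score scale of 0 to 1, and the threshold values are assumptions made for illustration; the article does not specify how risk is scored or where thresholds should sit.

```python
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"
    HUMAN_REVIEW = "human_review"
    BLOCK = "block"


class OversightPolicy:
    """Hypothetical policy; thresholds are illustrative, not recommended values."""

    def __init__(self, review_threshold: float, block_threshold: float,
                 max_anomalies: int):
        self.review_threshold = review_threshold
        self.block_threshold = block_threshold
        self.max_anomalies = max_anomalies
        self.anomaly_count = 0
        self.killed = False  # kill switch state

    def decide(self, risk_score: float) -> Decision:
        """Map a risk score in [0, 1] to an oversight tier."""
        if self.killed:
            # Once tripped, the kill switch blocks everything until a human resets it.
            return Decision.BLOCK
        if risk_score >= self.block_threshold:
            self.anomaly_count += 1
            if self.anomaly_count >= self.max_anomalies:
                self.killed = True
            return Decision.BLOCK
        if risk_score >= self.review_threshold:
            return Decision.HUMAN_REVIEW
        return Decision.AUTO_APPROVE


policy = OversightPolicy(review_threshold=0.4, block_threshold=0.8, max_anomalies=2)
decisions = [policy.decide(score) for score in (0.1, 0.5, 0.9, 0.95, 0.1)]
```

In this run the two high-risk requests trip the kill switch, so the final low-risk request is blocked as well. That is the circuit-breaker behavior: after repeated anomalies, the agent stops acting until someone intervenes.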

Crucially, Gloede positions red-teaming not as a periodic audit, but as a lifecycle practice—closer to CI/CD than to annual compliance reviews. The logic is pragmatic: agent behavior changes as models update, tools change, prompts evolve, and business processes shift. Without continuous adversarial testing, safety becomes a snapshot rather than a capability.

This is also where “digital twin” thinking becomes relevant. A growing best practice is to mirror production agents with sandboxed twins that simulate actions, score risk, and test updates before deployment—borrowing from industrial IoT and advanced manufacturing governance.
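
A sandboxed twin can be as simple as a gate that replays recorded scenarios through a candidate agent and refuses promotion if too many proposed actions break policy. The Python sketch below assumes a hypothetical refund workflow; `sandbox_gate`, `violates_policy`, and the two toy agents are invented for illustration and execute nothing against real systems.

```python
from typing import Callable, Dict, List

# Hypothetical types: a scenario is a dict of inputs; an agent maps it to a proposed action.
Scenario = Dict[str, float]
AgentFn = Callable[[Scenario], Dict[str, float]]


def violates_policy(action: Dict[str, float], refund_cap: float) -> bool:
    """Illustrative check: a proposed refund above the cap counts as a violation."""
    return action.get("refund", 0.0) > refund_cap


def sandbox_gate(candidate: AgentFn, scenarios: List[Scenario],
                 refund_cap: float, max_violation_rate: float) -> bool:
    """Score the candidate's proposed actions on recorded scenarios;
    allow promotion only if the violation rate stays within tolerance."""
    violations = sum(violates_policy(candidate(s), refund_cap) for s in scenarios)
    return violations / len(scenarios) <= max_violation_rate


def cautious_agent(scenario: Scenario) -> Dict[str, float]:
    # Caps its own refunds at 100.0.
    return {"refund": min(scenario["claimed"], 100.0)}


def naive_agent(scenario: Scenario) -> Dict[str, float]:
    # Refunds whatever is claimed.
    return {"refund": scenario["claimed"]}


scenarios = [{"claimed": 20.0}, {"claimed": 80.0}, {"claimed": 900.0}, {"claimed": 150.0}]
cautious_ok = sandbox_gate(cautious_agent, scenarios, refund_cap=100.0, max_violation_rate=0.0)
naive_ok = sandbox_gate(naive_agent, scenarios, refund_cap=100.0, max_violation_rate=0.0)
```

Run on every model, prompt, or tool change, a gate like this is one plausible way to make red-teaming part of the release pipeline instead of an annual exercise.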

The business calculus: risk-adjusted ROI, insurance markets, and trust as competitive differentiation

The economic promise of autonomous agents is often described in terms of productivity and labor leverage. The more realistic enterprise framing is risk-adjusted ROI: the value of faster execution must be weighed against the probability and magnitude of loss events—mis-payments, compliance breaches, data exposure, or customer harm.
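
As a rough worked example, risk-adjusted ROI can be approximated by subtracting expected annual losses (probability times magnitude, summed over loss events) from the gross benefit before dividing by cost. The Python sketch below uses entirely made-up figures, and the formula is a simplification chosen for illustration rather than one stated by KPMG.

```python
from dataclasses import dataclass


@dataclass
class LossEvent:
    name: str
    annual_probability: float  # assumed chance of the event occurring in a year
    magnitude: float           # assumed cost if it occurs


def expected_annual_loss(events):
    """Sum of probability-weighted losses across all events."""
    return sum(e.annual_probability * e.magnitude for e in events)


def risk_adjusted_roi(gross_benefit, operating_cost, events):
    """(benefit - cost - expected loss) / cost; a simplified, illustrative formula."""
    net = gross_benefit - operating_cost - expected_annual_loss(events)
    return net / operating_cost


# Illustrative numbers only.
events = [
    LossEvent("mis-payment", 0.10, 200_000),
    LossEvent("compliance breach", 0.02, 1_000_000),
]
naive_roi = (1_000_000 - 400_000) / 400_000
adjusted_roi = risk_adjusted_roi(1_000_000, 400_000, events)
```

With these inputs the expected annual loss is about 40,000, which lowers ROI from 1.5 to roughly 1.4. The gap widens quickly if controls are weak enough to raise either the probability or the magnitude of a loss event.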

This is already shaping market behavior in three ways:

  • Trust as a product feature: organizations that can demonstrate robust agent controls—identity, logging, oversight matrices, incident playbooks—can market safety as a differentiator, much as cybersecurity hygiene became table stakes for cloud providers.
  • New vendor monetization layers: platforms bundling orchestration, governance dashboards, and red-team toolkits are positioned to capture subscription and services revenue as “agent operations” becomes a category.
  • Risk transfer and assurance: demand is likely to rise for AI-specific insurance products, third-party audits, and certifiable frameworks that translate technical controls into underwriter-friendly evidence.

There is also a macro-level trust dynamic. In a fragmented geopolitical environment, autonomous agents can be perceived as digital proxies acting on behalf of firms and, by extension, jurisdictions. That may accelerate localized compliance requirements, segmented data governance, and stricter accountability expectations—especially where agents touch critical infrastructure, financial systems, or regulated decisioning.

Gloede’s forward-looking claim—that enterprises can safely scale AI agents by 2026—hinges on whether companies treat governance as a growth engine rather than a brake. The organizations that win the agent era are unlikely to be those with the most autonomy in production, but those with the most intentional autonomy: bounded, observable, continuously tested, and operationally governable at enterprise scale.