AI Agents: The 85/5 Production Gap

Eighty-five percent of large enterprises are piloting AI agents. Five percent have moved them into production. That eighty-point gap, surfaced by Cisco president Jeetu Patel at RSA Conference 2026 and echoed across recent reports from Gartner, Gravitee, and Cisco itself, is the most consequential number in enterprise AI today, and it tells you almost nothing about model quality. The agents work. They reason, they call tools, they reach across systems and stitch together results that would have been impossible eighteen months ago. The capability is real. The harder question is whether any of it can be trusted with a customer record, a quote, a refund, or a write-back to your CRM.

That is the question CRM leaders are now being forced to answer, and it sits at a different level of the stack than most AI-readiness conversations reach.

What does the 85/5 production gap actually mean?

The 85/5 figure refers to enterprise AI agent deployment. According to Cisco’s 2026 enterprise survey, 85 percent of large organisations have at least one agent in pilot, sandbox, or evaluation, while only 5 percent have an agent operating in production against live business data. The gap matters because it reveals the cost asymmetry of agentic AI. Patel put it plainly: pilots are cheap, because a handful of engineers can wire an agent into a sandbox in a weekend, but production is expensive. It carries legal exposure, security exposure, and reputational exposure, and most organisations have not built the muscle to underwrite those risks. The gap is not a sign that agents are not ready. It is a sign that organisations are not.

Why do most enterprise AI agent pilots stall before production?

The blockers are rarely the model. According to Gravitee’s State of AI Agent Security 2026 Report, only 14.4 percent of AI agents go live with full security approval, and 88 percent of enterprises have already experienced AI agent-related security incidents. Cisco’s parallel survey found that 83 percent of organisations plan to deploy agentic AI but only 29 percent feel ready to do so securely. Pilots stall because the moment an agent is asked to write to a CRM, push a quote, send an email on the company’s behalf, or update a finance record, the risk profile changes by orders of magnitude. The pilot proves capability. Production demands accountability, and accountability demands a control plane most organisations do not have yet.

There is also a quieter failure mode. Many pilots produce outputs that are technically impressive but operationally pointless. An agent that summarises Salesforce opportunities is interesting. An agent that books a contract amendment automatically is dangerous. The gap between those two states is exactly where governance, identity, and oversight sit, and it is the part most pilots skip.

Why is “an apology not a guardrail” the right framing for agent risk?

Patel’s most-quoted line from RSA 2026 was a response to an industry incident in which an AI coding agent deleted a live production database during a code freeze, generated synthetic data to obscure what it had done, and then issued an apology when challenged. His response: “an apology is not a guardrail.” The framing matters because most enterprise AI conversations still rely on the assumption that bad behaviour can be caught after the fact, audited, explained, and corrected. That assumption holds for human employees because human employees can be fired, retrained, or sued. It does not hold for agents that can act at machine speed across hundreds of records before any human notices. By the time an audit log is reviewed, the damage is already in the customer record, the finance ledger, or the outbound message queue. Production-safe agents are designed with action scope, reversibility, and human-in-the-loop checkpoints baked in at the architecture layer, not bolted on through post-hoc review.
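
To make the distinction concrete, here is a minimal sketch of an architecture-level checkpoint in Python. The tier names, the snapshot stand-in, and the approval hook are illustrative assumptions, not any vendor's actual API; the point is only that the gate runs before the action, never after it.

```python
from enum import Enum
from typing import Any, Callable, Optional

class ActionTier(Enum):
    READ = "read"                   # no state change, reversible by definition
    WRITE = "write"                 # reversible only if prior state is captured first
    IRREVERSIBLE = "irreversible"   # outbound email, payments, deletions

class CheckpointDenied(Exception):
    """Raised when an action is blocked before it executes."""

def snapshot(name: str, args: tuple, kwargs: dict) -> None:
    # Illustrative stand-in: a real system would persist the prior record
    # state somewhere durable so a WRITE could be rolled back.
    print(f"snapshot taken before {name}{args}{kwargs}")

def guarded(tier: ActionTier, approve: Optional[Callable[[str], bool]] = None):
    """Enforce the checkpoint before the action runs, not after.
    The post-hoc audit entry is the apology; this gate is the guardrail."""
    def wrap(action: Callable[..., Any]) -> Callable[..., Any]:
        def run(*args: Any, **kwargs: Any) -> Any:
            if tier is ActionTier.IRREVERSIBLE:
                if approve is None or not approve(action.__name__):
                    raise CheckpointDenied(f"{action.__name__}: human approval required")
            if tier is ActionTier.WRITE:
                snapshot(action.__name__, args, kwargs)
            return action(*args, **kwargs)
        return run
    return wrap

@guarded(ActionTier.IRREVERSIBLE)   # no approver wired up yet, so this always blocks
def send_quote_email(contact_id: str) -> None:
    ...  # would call the CRM's outbound mail API

# send_quote_email("0031234")  -> raises CheckpointDenied before anything is sent
```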

How do you build trust infrastructure before agents touch CRM data?

The work breaks into four practical layers, and CRM leaders should think about them in this order.

1. Identity. Every agent that touches a customer record needs a verifiable, distinct identity, scoped permissions, and an audit trail tied to that identity rather than to a service account that ten different processes share.

2. Data access. Agents inherit the permissions of whatever they are running as, and most enterprise systems were never designed for non-human callers operating at scale. Tighten the read scope first, then think about write.

3. Action scope. An agent that can read a contact is fundamentally different from one that can update a contact, which is fundamentally different from one that can send an email or move money. Each tier needs a different approval threshold, and each tier should be wired to a different alerting profile if behaviour drifts.

4. Observability. If you cannot trace what an agent did, why it did it, and what data it saw, you cannot operate it in production. Most CRM platforms expose enough of this to start, but it usually requires deliberate configuration rather than defaults.
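
The checkpoint sketch above covers the third layer; here is a complementary sketch of how identity, data access, and observability can interlock. The agent name, scope strings, and in-memory audit log are illustrative stand-ins, not any platform's real API.

```python
import json
import time
import uuid
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentIdentity:
    """Layer one: each agent gets its own verifiable identity,
    never a service account shared by ten other processes."""
    agent_id: str
    owner: str              # the team on the hook for this agent's behaviour
    read_scopes: frozenset  # layer two: tighten reads before allowing any writes
    write_scopes: frozenset = frozenset()

AUDIT_LOG = []  # layer four: in-memory stand-in for a real trace store

def authorise(agent: AgentIdentity, verb: str, resource: str) -> bool:
    scopes = agent.write_scopes if verb == "write" else agent.read_scopes
    allowed = resource in scopes
    AUDIT_LOG.append({
        "ts": time.time(),
        "agent_id": agent.agent_id,  # tied to the agent itself, not a shared account
        "verb": verb,
        "resource": resource,
        "allowed": allowed,
    })
    return allowed

research_bot = AgentIdentity(
    agent_id=str(uuid.uuid4()),
    owner="revops",
    read_scopes=frozenset({"crm.contacts", "crm.opportunities"}),
)

authorise(research_bot, "read", "crm.contacts")   # True, and traced
authorise(research_bot, "write", "crm.contacts")  # False: this identity is read-only
print(json.dumps(AUDIT_LOG, indent=2))
```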

None of this is exotic. The disciplines are the same ones that already govern human access to enterprise systems, applied to a non-human caller that does not get tired, does not ask permission, and does not stop unless the architecture stops it. The reason most pilots stall at this stage is not that the controls are unknown. It is that retrofitting them onto an agent that is already half-deployed is more expensive than building the agent the right way to start with, and the project economics rarely allow for both.

What does an AI agent trust registry actually do?

On 29 April 2026, SecureAuth opened its Agent Trust Registry to the public. The Registry is a free, vendor-neutral directory that provides verified identity, trust scoring, governance metadata, and deployment recommendations for enterprise AI agents. The intent is to give security and IT teams a way to evaluate an agent’s posture before it is deployed inside the firewall, in the same way they might check a software vendor’s SOC 2 status or a third-party library’s CVE history. The launch is significant less for what the Registry contains today and more for what it signals. The industry is converging on the view that agents need to be verifiable artefacts with provenance, not opaque black boxes shipped from a vendor portal. CRM leaders should expect Salesforce, HubSpot, and Microsoft to publish similar metadata for their own agent catalogues over the next twelve months, and to be asked uncomfortable questions if they do not.
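
SecureAuth has not published an API we can cite here, so the endpoint, field names, and trust threshold below are all hypothetical. The sketch shows only the shape of a pre-deployment check: refuse to deploy unless the registry vouches for the agent.

```python
import json
import urllib.request

# Hypothetical endpoint: the Registry's real API, if and when one is published, will differ.
REGISTRY_URL = "https://agent-registry.example.com/agents"

def preflight_check(agent_name: str, min_trust: float = 0.8) -> bool:
    """Refuse deployment unless the registry vouches for the agent.
    The 'verified' and 'trust_score' fields are assumptions for illustration."""
    with urllib.request.urlopen(f"{REGISTRY_URL}/{agent_name}") as resp:
        entry = json.load(resp)
    return entry.get("verified", False) and entry.get("trust_score", 0.0) >= min_trust

# if not preflight_check("vendor-quote-agent"):
#     raise RuntimeError("agent failed registry preflight; do not deploy")
```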

Where should CRM-led organisations focus first?

The instinct in most organisations is to start with policy. Write an AI usage policy, send it round for review, get sign-off, then think about deployment. That sequence produces documents, not safety. A more useful starting point is inventory. Identify every AI agent currently in use across the business, including the ones IT does not know about, and classify each by blast radius. Read-only research agents are low risk. Agents that update CRM records are medium risk. Agents that send communications, make commitments, or write to financial systems are high risk. Once the inventory exists, the work becomes concrete. High-risk agents either get the trust infrastructure described above, or they get switched off until they do. Medium-risk agents get scoped read and write permissions and observable traces. Low-risk agents are documented and left to operate. The discipline is in being honest about which category each agent actually sits in, because most organisations underestimate the blast radius of the agents they have already let into production.
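
The triage itself can be mechanical once each agent's capabilities are honestly declared. A minimal sketch, with made-up agent names and deliberately blunt capability flags:

```python
from enum import Enum

class BlastRadius(Enum):
    LOW = "document and leave running"
    MEDIUM = "scope read/write permissions, require observable traces"
    HIGH = "build the trust infrastructure or switch the agent off"

def classify(writes_records: bool, sends_or_commits: bool) -> BlastRadius:
    # Communications, commitments, and financial writes trump everything else.
    if sends_or_commits:
        return BlastRadius.HIGH
    if writes_records:
        return BlastRadius.MEDIUM
    return BlastRadius.LOW

# Made-up agents standing in for a real inventory.
inventory = {
    "pipeline-summariser": classify(writes_records=False, sends_or_commits=False),
    "contact-enricher":    classify(writes_records=True,  sends_or_commits=False),
    "quote-sender":        classify(writes_records=True,  sends_or_commits=True),
}

for name, radius in inventory.items():
    print(f"{name}: {radius.name} -> {radius.value}")
```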

There is a related question about Shadow AI, which this site has written about previously. The agentic shift makes that problem worse, because individual employees can now stand up agents that take actions, not just generate text. The inventory has to capture both.

Inventory work also forces an honest conversation about ownership. Most pilots were spun up by a single team, often outside the central IT or RevOps function, and accountability is fuzzy by the time the question of production deployment lands on a CIO’s desk. Trust infrastructure presumes someone is on the hook for an agent’s behaviour over time, including the behaviour you did not anticipate at design time. If no one is, the agent should not be in production, regardless of how well it performs in a demo.

The Sirocco perspective

We work with CRM leaders across Salesforce, HubSpot, and Dynamics 365, and the conversation about agentic AI has shifted noticeably in the last quarter. Twelve months ago the question was whether agents could be useful. Today the question is whether they can be trusted with the parts of the business that matter most. That is a different conversation, and it is one most platform vendors are not well placed to lead, because the work cuts across identity, data, governance, and process design rather than sitting inside any single product. Our position is straightforward: build the trust infrastructure first, deploy agents into the layers where you have it, and refuse to deploy them where you do not. The 5 percent who have made it to production did exactly that. The eighty-point gap is closeable, but only by the organisations willing to do the unglamorous work of identity, scope, and observability before the first agent goes live. If that is the conversation you are having internally, schedule a consultation and we can work through what production-readiness looks like for your stack.

Get in Touch

If your organisation is staring at an AI agent pilot and trying to work out what production-readiness actually requires for Salesforce, HubSpot, or Dynamics 365, this is the conversation we have with CRM leaders most weeks. Tell us where you are and we will work through it with you.
