Risk-Based Approach to Leveraging Agents: Beyond the AI Horror Stories

 

We have seen a lot of excellent technical resources lately on how to "secure" AI agents. On the flip side, we’ve also seen the clickbait: blog posts detailing nightmare scenarios where agents autonomously hijack a CEO’s email to initiate a blackmail scheme.

Between the technical weeds and the tabloid headlines, one thing is missing: a data-backed, risk-based approach. As a risk community, we need to stop reacting to headlines and start quantifying the "Agent." If we want to move from fear to informed decision-making, we have to look at these systems through a FAIR (Factor Analysis of Information Risk) lens.


What Exactly Is an Agent?

According to our colleagues at IBM, an AI agent is a system that autonomously performs tasks by designing workflows with available tools. Note there's a difference between a defined workflow and an agentic workflow that maybe doesn't come through here. A workflow follows a well-defined set of steps and actions. This could include the use of skills or LLM calls with defined prompts. An Agent or Agentic workflow on the other hand is often not defined at all and could follow any number of paths. To put it another way, Workflows tend to have well defined Processes, Agents tend to have well defined Goals.

Unlike a standard chatbot that just answers a prompt, an agent executes. They operate via predefined human rules and user prompts, utilizing their training data and any integrated tools (APIs, databases, browsers) to bridge gaps in analysis. To maintain quality and prevent "hallucination drift," we often keep a Human-in-the-Loop (HITL) or use other agents as "supervisors."

The Spectrum of Complexity: Not All Agents Are Created Equal

From a risk perspective, we have to distinguish between the levels of autonomy. We generally see five types:

Agent Type

Capabilities

Risk Profile

Simple Reflex

Acts only on current perception (If X, then Y). No memory.

Low: Highly predictable.

Model-Based

Uses memory to maintain an internal model of the world.

Moderate: State-dependent.

Goal-Based

Plans sequences of actions to reach a specific goal.

Elevated: Increased autonomy.

Learning

Autonomously adds new experiences to its knowledge base.

High: Harder to predict drift.

Utility-Based

Selects actions to maximize a "utility function" (reward).

Critical: Complex to quantify.



Securing the Frontier: The Frameworks We Use

Thankfully, as agents have become the "new norm" over the last year, several frameworks have emerged to help us define control strength. While the NIST AI RMF and OWASP AI Exchange establish the broader organizational governance and process baselines for AI adoption, they stop short of the system-level, runtime architectures required for autonomous agents. The following frameworks fill this gap by offering concrete, system-level assessment models.

  1. Databricks AI Security Framework (DASF) 3.0: DASF v3.0 formalizes Component 13 (Agentic AI) and shifts focus from static model inputs to dynamic agent execution loops (reasoning, planning, and execution). It introduces 35 distinct agentic risks and 6 core controls, providing critical coverage for the Model Context Protocol (MCP), which governs how autonomous agents securely connect to enterprise tool servers, data, and APIs.
    1. Primary Use Case: Security architecture validation, secure engineering design reviews, and system-level assessments.
    2. FAIR Applicability:
      1. DASF’s 6 mitigation controls determine Resistance Strength (RS). For example, implementing sandboxing for agent execution directly reduces vulnerability against Remote Code Execution threats.
      2. The framework’s emphasis on Observability of Thought provides tracing of an agent's reasoning path. From a FAIR perspective, this shortens the detection and containment windows, lowering the Loss Magnitude during a security incident.
  2. OWASP Top 10 for Agentic Applications (2026): A globally peer-reviewed standard for identifying the most critical security vulnerabilities unique to autonomous AI. It explicitly identifies threats that traditional application security models miss, like Agent Goal Hijack, Tool Misuse & Exploitation, Identity & Privilege Abuse, and Cascading Failures.
    1. Primary Use Case: Application threat modeling, secure code development guidelines, and scoping the focus areas for AI red teaming and penetration testing.
    2. FAIR Applicability:
      1. OWASP provides the taxonomy for defining the Risk Scenario in a FAIR analysis. Instead of modeling a vague "AI Hack," analysts can isolate a specific scenario, such as an ASI06: Memory & Context Poisoning exploit.
      2. By tracking your organization's open OWASP vulnerabilities alongside industry threat intelligence, analysts can derive a realistic Threat Event Frequency for how often an adversary will attempt to exploit these software vulnerabilities.
  3. MITRE ATLAS (Adversarial Threat Landscape for Artificial-Intelligence Systems): A curated knowledge base of real-world adversary tactics, techniques, and procedures (TTPs) targeting AI systems. For agentic workflows, it highlights specific techniques like LLM Agent Exploitation and Tool Malicious Manipulation.
    1. Primary Use Case: Security Operations Center (SOC) detection engineering, incident response planning, and defining realistic threat capabilities.
    2. FAIR Applicability: MITRE ATLAS helps security teams assess the difficulty of an execution path. This enables analysts to quantify the Threat Capability required by an adversary to execute a multi-stage agentic attack.
  4. Meta’s "Rule of Two": This is my favorite heuristic for initial scoping. Meta suggests an agent should satisfy no more than two of the following three properties in a single session:
  1. Processing untrustworthy inputs (e.g., the public web).
  2. Accessing sensitive systems or private data.
  3. Changing state or communicating externally (e.g., sending an email).
  1. Primary Use Case: Product management scoping, architectural governance, and setting compliance gates that mandate Human-in-the-Loop (HITL) controls.
  2. FAIR Applicability: Enforcing a 2/3 configuration mathematically removes entire loss paths from your FAIR risk analysis, while a 3/3 configuration without a validation boundary or HITL carries a vulnerability approaching 1.0 for targeted prompt injection attacks.

If an agent requires all three properties to function, it must have human-in-the-loop approval or another reliable means of validation. It cannot operate autonomously!

 


FAIR-AIR… But for Agents!

 

1. Contextualize: Mapping the Territory

Before we touch a calculator, we have to understand the environment. This is where we identify our Crown Jewels and the Privileged Access the agent might hold. We ask: In whose house is this agent living, and what do they have the keys to?

  • Example A (The Utility Sector): Imagine an agent deployed within a Customer Care system for a major utility provider. The context isn't just "AI"—the context is Critical Infrastructure. The crown jewels here are customer billing records and the ability to modify service status. A "loss" here isn't just data; it’s operational downtime and regulatory fines.
  • Example B (The Fintech/Payments Sector): An agent working within a Wallets or Payments business unit. The context is high-frequency, high-value transactions. The crown jewel is the ledger. If an agent has the "context" of the payment rail, the risk is fundamentally different than an agent sitting in the Marketing department's brainstorming tool.

2. Scope: Defining the Type and "The Rule of Two"

Once we know the environment, we define the agent’s boundaries. We use the Rule of Two to determine if we are looking at a "Helpful Librarian" or a "Digital Rogue."

  • Determine the Type: Is this a Simple Reflex Agent (If this, then that) or a Learning Agent that evolves its behavior?
  • The Scoping Test:
    • Example: A "Learning Agent" in the DevOps pipeline that reads external bug reports [A], accesses the source code [B], and autonomously pushes fixes [C].
    • Result: It violates the Rule of Two (it hits all three). Our scope now identifies this as a Critical Risk Scenario that requires a mandatory human-in-the-loop.

3. Quantify: Moving from "Scary" to "Calculated"

This is where we move the needle from "I think this is risky" to "This is the probable loss magnitude." We look at the data to build our loss event frequency and magnitude.

  • Loss Event Frequency: How often is the agent acting? If an agent facilitates 5,000 automated customer interactions a day, the frequency of a potential threat finding a vulnerability is much higher than a monthly reporting agent.
  • Loss Magnitude: If a prompt injection occurs, what is the "blast radius"?
    • The Math: If an agent can access 500,000 PII records, and our historical data (or industry benchmarks) shows a $150 cost-per-record for a breach, we are looking at a $75M primary loss event. Suddenly, the "scary AI" has a price tag that the Board can understand.

4. Prioritize & Treat: Selecting Your Armor

Now that we have a dollar amount, we can prioritize our spending. We don't just "buy security"; we treat the specific risks we identified in Step 3.

  • Example: We look at the Databricks DASF 3.0 controls. If our quantification showed that "Tool Call Hijacking" is our highest probable loss, we prioritize Control-6 (Execution Guardrails).
  • Treatment: We might decide to change the agent’s data access from "Dynamic" (all access) to "Manual/Static" (limited, pre-approved lists). This treatment directly reduces the Vulnerability factor in our FAIR equation.

5. Decision Making: The ROI of "Yes"

This is the finale. We take our quantified risk and our cost-of-treatment to the AI Governance Committee or the Board to get the funding we need.

  • The Pitch: "By granting this agent privileged access but implementing these three specific DASF controls, we are enabling a $2M productivity gain while keeping our Probable Annual Loss under $100k. Without these controls, the risk is $5M. Do we fund the controls, or do we limit the agent's access?"

Case Studies: From Theory to Reality

Let’s look at how this data-backed approach shifts our perspective on two different scenarios.

To move from a "horror story" to a defensible business strategy, we have to treat high-risk scenarios with a structured, economic rigor. Let’s take a deep dive into the "Rule of Three" Violator—our automated Third-Party Risk Management (TPRM) agent—and walk through the FAIR-AIR methodology to see how we actually secure it.

 


First Case Study: A third party risk management agent with a goal to manage third party risk.

Step 1: Contextualize (The Crown Jewels)

Before we look at the agent’s code, we look at its environment. This agent lives in the Third-Party Risk Management (TPRM) ecosystem.

  • The Business Unit: Procurement and Information Security.
  • The Crown Jewels: Our vendor relationship database, internal risk tiers, and—most importantly—our legal reputation.
  • The Impact: If an agent sends an email that sounds like a legal threat or blackmail to a Tier-1 partner (like a cloud provider or a critical manufacturer), the loss isn't just "bad data"—it’s a breach of contract, potential litigation, and massive reputational damage.

Step 2: Scope (The Rule of Three Violation)

We identify this as a Utility-Based Agent. It’s not just scanning; it’s trying to maximize a "utility" (getting vendors to remediate). Using Meta’s "Rule of Two" framework, we see why this is a critical-tier risk:

Property

Present?

Description

[A] Untrustworthy Inputs

YES

It scrapes "outside-in" findings from the public web/security headers.

[B] Sensitive Systems/Data

YES

it reads our internal vendor lists and historical findings.

[C] State Change/External Comm

YES

It autonomously emails the vendor on our behalf.

The Diagnosis: This is a "Rule of Three" violator. By definition, it cannot be permitted to operate without strict, documented governance and specific technical mitigations.

Step 3: Quantify (The Probable Loss Magnitude)

We move from "this is risky" to "this costs $X."

  • Threat Event Frequency: The agent is designed to scan 500 vendors daily. If we estimate a 0.5% "hallucination" rate or a successful prompt injection once every 200 scans, we are looking at 2.5 "malformed" legal communications per week.
  • Loss Magnitude: A single defamatory or "blackmail-adjacent" email to a key partner could result in legal settlements, lost contract discounts, or emergency PR response.
    • Primary Loss: $50k - $250k per incident (Legal fees + settlements).
    • Secondary Loss: Millions in brand erosion or disrupted supply chains.
  • Total Risk: We are looking at an Annualized Loss Expectancy (ALE) of $6M - $12M if left untreated.

 


Step 4: Prioritize & Treat (The Mitigating Controls)

Now we apply the "Discount" to that $12M risk using our specific frameworks.

1. Leveraging Meta’s Rule of Two Mitigation

  • Control: Session Isolation / Context Reset. * Application: The agent can gather the external data [A] and read internal data [B] in one session. However, to execute the email [C], the agent must "hand off" the findings to a separate, fresh session that does not have access to the untrustworthy external inputs. This "breaks" the chain of a prompt injection.

2. Leveraging Databricks DASF 3.0

  • Control 6: Execution Guardrails (Human-in-the-Loop). * Application: We enforce a "Manual" classification for the final action. The agent drafts the email, but a human must click "send" in a dashboard. This single control reduces our Vulnerability factor to near zero for "blackmail" scenarios.
  • Control 2: Input/Output Filtering. * Application: Deploying a "jailbreak" filter on all external web data [A] to ensure malicious payloads aren't being fed into the agent's reasoning engine.

3. Leveraging OWASP Top 10 for Agentic Applications (2026)

  • Agentic-01: Prompt Injection Mitigation. * Application: We use System Prompt Hardening. We strictly define the agent's "persona" and use a separate LLM "evaluator" to check the drafted email for aggressive, coercive, or non-compliant language before it ever reaches the human reviewer.

 


Step 5: Decision Making (The Recommendation)

As the Advisor, my recommendation to the Board is clear:

"We will allow the TPRM Agent to proceed into production only if we move property [C] (Communication) from 'Autonomous' to 'Manual Review.' By implementing DASF Control 6 and a Context Window Reset, we reduce our Probable Annual Loss from $12M to less than $50k, while still capturing 90% of the productivity gains from automated scanning. The cost to implement these controls is $80k—providing a nearly 150x return on our security investment."

 

Case Study 2:

Step 1: Contextualize (The Crown Jewels)

We start by looking at the environment: the Marketing and Growth business unit.

  • The Business Goal: Improving conversion rates by using AI to identify which customers are most likely to respond to a specific campaign.
  • The Crown Jewels: Our Customer Database (PII). This includes names, purchase history, and contact information.
  • The Context: This is a "Read-Only" environment. The agent isn't being asked to change the world; it’s being asked to organize it. The primary risk here isn't a "blackmail headline"; it's Unauthorized Data Disclosure (Privacy Risk).

Step 2: Scope (The Rule of Two Test)

We classify this as a Goal-Based Agent. It has a specific objective: "Find the top 500 customers for Project X." Let’s look at how it measures up against Meta’s Rule of Two:

Property

Present?

Description

[A] Untrustworthy Inputs

NO

It only pulls from our internal, vetted CRM.

[B] Sensitive Systems/Data

YES

It requires access to the customer database to function.

[C] State Change/External Comm

NO

It produces an internal list; it does not send the emails.

The Diagnosis: This agent satisfies only one of the three properties. It is inherently stable and far easier to govern because we have removed the external threat vectors ([A] and [C]) by design.

Step 3: Quantify (The Probable Loss Magnitude)

Since the agent can't "act" on the world, our Threat Event Frequency drops significantly.

  • Threat Event Frequency: We are no longer worried about a "Prompt Injection" from the public web. The threat is now Internal Misuse (e.g., an employee asking the agent for data they shouldn't see).
  • Loss Magnitude: If an unauthorized user prompts the agent to export the entire database:
    • Primary Loss: Cost of a data breach notification for the affected records.
    • The Math: If the agent is limited to a subset of data (e.g., only "Active Customers" in North America), the Magnitude is capped. We aren't looking at a $12M disaster; we are looking at a manageable $200k privacy event.

 


Step 4: Prioritize & Treat (The Mitigating Controls)

Because we’ve quantified the risk as "Balanced," we don't need to "break the bank" on controls. We use the frameworks to "right-size" our defense.

1. Leveraging Databricks DASF 3.0

  • Control 1: Identity & Access Management (IAM).
    • Application: We apply strict RBAC (Role-Based Access Control). Only the Marketing Analytics team can "talk" to this agent. This directly slashes our Threat Event Frequency.
  • Control 5: Data Privacy Controls (PII Masking).
    • Application: We don't give the agent the raw PII. We give it "Masked" data. It sees "Customer ID 12345" instead of "Sally Smith." The agent can still prioritize the IDs, but if the output is leaked, the Loss Magnitude is effectively zero because the data is de-identified.

2. Leveraging OWASP Top 10 for Agentic Applications (2026)

  • Agentic-03: Broken Access Control.
    • Application: We ensure the agent’s service account has Least Privilege. It can "Read" the specific marketing tables, but it has no "Write" or "Delete" permissions and no access to the HR or Payroll databases.

 


Step 5: Decision Making (The "Green Light" Pitch)

As Advisor, this is the easiest report I’ll write all quarter:

"The Marketing Prioritization Agent is a High-Utility, Balanced-Risk asset. By removing external communication and using PII masking (DASF Control 5), we have reduced the Probable Annual Loss to a negligible level. We recommend immediate deployment. The productivity gains for the Marketing team outweigh the residual risk by a factor of 20:1. This is exactly how we use AI to empower the business without compromising our security posture."


Key Takeaways

  • Contextualize Before You Categorize: Risk does not exist in a vacuum. Before assessing an agent, you must identify the Crown Jewels it touches. An agent summarizing public marketing PDFs carries a fundamentally different risk profile than one with "Read/Write" access to your production payment rails. Always map the agent to the business unit and the sensitivity of the data environment first.
  • Autonomy is a Variable, Not a Binary: Stop asking "Is this agent safe?" and start asking "How much autonomy is cost-justified?" By using the spectrum of agent types—from Simple Reflex to Utility-Based—you can match the level of oversight to the complexity of the task. Higher autonomy requires higher control strength.
  • Adopt the "Rule of Two" as Your Primary Heuristic: Meta’s framework is the most effective "triage" tool we have. If an agent hits all three properties—[A] Untrustworthy Inputs, [B] Sensitive Data, and [C] External Communication—it is a "Rule of Three" violator. These cases should automatically trigger a mandatory Human-in-the-Loop (HITL) requirement until specific mitigating controls (like context resets or output filtering) are verified.
  • Quantification is the Antidote to Friction: The "Department of No" usually lacks data. When we apply the FAIR-AIR lens, we transform vague fears into a math problem:
    Risk=Loss Event Frequency×Loss Magnitude
    By showing the Board that a specific set of DASF 3.0 or OWASP controls can reduce a potential $10M"blackmail" scenario to a $50k residual risk, you aren't just securing the business—you’re enabling it to move at the speed of AI.
  • Governance is a Productivity Multiplier: The time spent building an AI Governance committee and a risk-based intake process isn't "overhead." It is an investment in velocity. A clear framework decreases the friction between the business’s desire to innovate and the security team’s need to protect.

Final Thought

In the end, a risk-based approach to agents ensures that your team stays out of the headlines and stays focused on what matters: delivering value. We are moving into an era where "Agentic AI" will be the backbone of enterprise productivity. By applying FAIR principles today, we ensure that backbone is resilient, quantified, and—most importantly—governed.

image 37