Risk Model & Protection Classes

How MCP-first protects sensitive areas, classifies capabilities into six protection classes, and ensures the human has the necessary information before every critical action.

Full agent capability does not mean an agent may do everything. It means the system can describe every one of its states and capabilities in a structured way, including the capabilities that remain fundamentally off-limits to an agent. The risk model is the tool that draws this boundary: precise, machine-readable, and without exceptions.

Sensitive areas

Some parts of a system require a higher level of protection, not only against external attackers, but also against internal agents. A common misconception states:

The user is allowed to see it, so the AI is allowed to see it too.

That is not correct. Effective AI access is the intersection of user permission, agent permission, client trust level, resource sensitivity, purpose, and confirmation status. Each of these is a barrier in its own right; the user role is only one of them.
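The intersection described above can be sketched as a conjunction of independent checks. This is a minimal illustration, not a prescribed implementation: the class `AccessRequest`, its field names, and both helper functions are hypothetical, and a real system would derive each flag from its own policy source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    """One attempted access by an agent; all field names are illustrative."""
    user_permitted: bool          # the user's role allows the resource
    agent_permitted: bool         # the agent itself holds the needed permission
    client_trusted: bool          # the client meets the required trust level
    sensitivity_allowed: bool     # the resource's sensitivity permits AI access
    purpose_documented: bool      # a purpose is stated and accepted
    confirmation_satisfied: bool  # confirmed, or no confirmation is needed


def failed_barriers(request: AccessRequest) -> list[str]:
    """Return the names of all barriers that block this request."""
    return [name for name, passed in vars(request).items() if not passed]


def is_access_allowed(request: AccessRequest) -> bool:
    """Access is the intersection: every single barrier must pass."""
    return not failed_barriers(request)


# The user may see the resource, but the agent lacks its own permission.
request = AccessRequest(
    user_permitted=True,
    agent_permitted=False,
    client_trusted=True,
    sensitivity_allowed=True,
    purpose_documented=True,
    confirmation_satisfied=True,
)
print(is_access_allowed(request))  # False
print(failed_barriers(request))    # ['agent_permitted']
```

Returning the failed barriers, rather than only a boolean, makes a denial explainable in logs and in the user interface.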

Areas that need particular protection include:

  • Financial data: bank details, payment data, billing details
  • Health data: medical documents, medical records, certificates
  • Salary data: wages, bonuses, salary history, payroll runs
  • Personnel files: contracts, application documents, internal evaluations
  • Credentials & API keys: passwords, private keys, access tokens
  • Deletion and bulk functions: delete user, delete tenant, bulk export
  • Permission management: granting and revoking permissions, changing roles
  • Security logs: audit trails, security events, active sessions
  • Private communications: confidential notes, internal assessments
  • Tax data and contracts: tax-relevant documents, legally binding contracts
  • Personal data under GDPR: all data that identifies a natural person

Six protection classes

Every resource and every tool is assigned to one of six protection classes. The class determines who gets access, whether an agent even sees the tool in discovery, and what approval is required before execution.

Public

Risk level: Low

Publicly accessible content without restriction. No authentication, no special role, no confirmation required. Suitable for help articles, product information, and documented system capabilities.

help.article.read
public.product_info.read

Internal

Risk level: Low

Only for logged-in users. Authentication is sufficient; no special privileges required. Typical for list queries, searches, and general work views of one’s own context.

project.list
contact.search

Confidential

Risk level: Medium

Only accessible with an explicit role and assigned scope. A simple login is not enough. The data is internal but not intended for all employees.

contract.read
customer_private_note.read

Sensitive

Risk level: High

Only with additional approval or restricted context sharing. An agent must not load this data into its context unchecked. When passed to AI, the scope must be narrowly limited and the purpose documented.

salary.read
bank_account.read
health_document.read

Critical

Risk level: Critical

The AI may not act autonomously in this class. Every execution requires explicit user approval and often step-up authentication as well. The actions produce external effects, are difficult to undo, or directly affect third parties.

payment.execute
user.delete
contract.send
emails.send_external
security.change_permissions

Forbidden for AI

Risk level: Forbidden for AI

This class is a hard boundary, not a policy decision. Tools and resources of this class are completely hidden from tool discovery for agents. An agent can neither read, call, nor reference them.

password.read
private_key.read
full_database_export
raw_access_token.read

What an agent must not see, it must not find. Forbidden tools do not appear in discovery responses.
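The discovery rule above can be sketched as a filter over a tool registry. The six class names and the tool names come from the text; the enum, the registry dictionary, and the function `discover_tools_for_agent` are assumptions made for illustration. A real discovery endpoint would also filter by role and scope, which this sketch leaves out.

```python
from enum import Enum


class ProtectionClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    SENSITIVE = "sensitive"
    CRITICAL = "critical"
    FORBIDDEN_FOR_AI = "forbidden_for_ai"


# Hypothetical registry: the mapping structure is assumed, the tool names are
# the examples given for each class in the text.
TOOL_CLASSES = {
    "help.article.read": ProtectionClass.PUBLIC,
    "project.list": ProtectionClass.INTERNAL,
    "contract.read": ProtectionClass.CONFIDENTIAL,
    "salary.read": ProtectionClass.SENSITIVE,
    "payment.execute": ProtectionClass.CRITICAL,
    "password.read": ProtectionClass.FORBIDDEN_FOR_AI,
    "full_database_export": ProtectionClass.FORBIDDEN_FOR_AI,
}


def discover_tools_for_agent(registry: dict) -> list:
    """Return the sorted tool names an agent may see; forbidden tools are omitted."""
    return sorted(
        name
        for name, protection_class in registry.items()
        if protection_class is not ProtectionClass.FORBIDDEN_FOR_AI
    )


visible = discover_tools_for_agent(TOOL_CLASSES)
print("password.read" in visible)    # False
print("payment.execute" in visible)  # True
```

Critical tools stay visible in this sketch because the agent may still propose them for approval; only the forbidden class is removed from the response entirely.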

The risk model

The risk model translates the protection classes into operative rules for every execution case. It determines whether autonomy is permitted, whether confirmation is required, and what type of approval is demanded.

Low

Purely read operations or actions without lasting external effect. Agents may call these tools autonomously.

Low-risk examples:
  • list_projects
  • get_current_user
  • search_contacts
  • get_help_article
  • create_reminder

Medium

Data or states are created or modified, but within a narrowly defined scope without external effect. Autonomous execution is permitted when scope and context clearly emerge from the workflow.

Medium-risk examples:
  • create_note
  • update_task_status
  • generate_summary
  • create_draft

High

Actions that change relevant system states or affect other users and external resources. Explicit confirmation is generally required.

High-risk examples:
  • change_project_status
  • invite_calendar_attendee
  • share_file_link
  • update_customer_data

Critical

External communication, payments, deletions, and permission changes. Confirmation is always required, often together with step-up authentication. Autonomous execution is not permitted.

Critical-risk examples:
  • emails.send_external
  • payment.execute
  • user.delete
  • security.change_permissions
  • payroll.export
  • contract.send

Forbidden

Completely blocked for AI agents, neither reading nor writing. Not visible in discovery, not callable, not usable as a reference.

Forbidden examples:
  • password.read
  • private_key.read
  • raw_access_token.read
  • disable_audit_log
  • full_database_export
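The five risk levels can be read as a small decision function. The sketch below is one possible reading under stated simplifications: the function `decide` and its parameters are hypothetical, High is treated as always requiring confirmation although the text says "generally", Medium without a clear scope falls back to confirmation (an assumption), and step-up authentication is omitted.

```python
from enum import Enum


class RiskLevel(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"
    FORBIDDEN = "forbidden"


def decide(risk: RiskLevel, *, scope_is_clear: bool = False, confirmed: bool = False) -> str:
    """Return 'execute', 'ask_confirmation', or 'blocked' for one tool call."""
    if risk is RiskLevel.FORBIDDEN:
        # A confirmation never unlocks a forbidden tool.
        return "blocked"
    if risk is RiskLevel.LOW:
        return "execute"
    if risk is RiskLevel.MEDIUM and scope_is_clear:
        return "execute"
    # Medium without a clear scope (assumed), High, and Critical
    # all wait for an explicit human decision.
    return "execute" if confirmed else "ask_confirmation"


print(decide(RiskLevel.LOW))                            # execute
print(decide(RiskLevel.MEDIUM, scope_is_clear=True))    # execute
print(decide(RiskLevel.CRITICAL))                       # ask_confirmation
print(decide(RiskLevel.FORBIDDEN, confirmed=True))      # blocked
```

Checking the forbidden level first keeps the hard boundary independent of every other input.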

Approval UX

The technical risk model only takes full effect when the user interface plays its part as well. A confirmation is only as good as the information on which it is based.

Before a human confirms a critical agent action, they must be able to clearly see seven things:

  1. What does the agent want to do? The concrete action, not the agent’s intent in its own words.
  2. Why does the agent want to do it? Which intent or assignment triggered this action.
  3. Which data is being used? Recipients, attachments, referenced entities.
  4. What external effects will occur? What changes outside the system: email is sent, payment is triggered, document is transmitted.
  5. Who is affected? Persons, companies, tenants, external parties.
  6. Can the action be undone? Clear statement: reversible or irreversible.
  7. What happens upon confirmation? Complete description of the next system states.
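The seven questions above can be enforced as a data structure that refuses to reach the user incomplete. This is a sketch under assumptions: the class `ApprovalCard`, its field names, and the helper `missing_fields` are hypothetical and only mirror the list item by item.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ApprovalCard:
    """One field per question a human must be able to answer before confirming."""
    action: str            # 1. what the agent wants to do
    reason: str            # 2. why the agent wants to do it
    data_used: str         # 3. which data is being used
    external_effects: str  # 4. what changes outside the system
    affected_parties: str  # 5. who is affected
    reversibility: str     # 6. reversible or irreversible
    on_confirmation: str   # 7. what happens upon confirmation


def missing_fields(card: ApprovalCard) -> list[str]:
    """Return the names of all fields that are empty or whitespace only."""
    return [f.name for f in fields(card) if not getattr(card, f.name).strip()]


def is_presentable(card: ApprovalCard) -> bool:
    """A card may be shown for confirmation only when nothing is missing."""
    return not missing_fields(card)
```

A card for which `missing_fields` returns anything would be rejected before it is rendered, so the user is never asked to confirm on partial information.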

The following example shows what a complete approval card for the tool emails.send_external looks like:

Sales Assistant

emails.send_external
Critical

Following up on project Havelblick based on the last interaction.

Recipient: Max Müller, Müller GmbH <[email protected]>
Subject: Follow-up on project Havelblick
Attachment: No direct attachment; the download link expires after 14 days.
External effect: The email will be sent and become part of the communication history.
Reversible: No, cannot be undone after sending.

Reason: External communication with project-related information and a download link. Irreversible after sending.

The agent waits for the user’s decision. It must not anticipate the confirmation, set a default action, or base execution on the timestamp of a previous approval.
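One way to make these three prohibitions hard to violate is to bind each approval to exactly one request and let it be used only once. The sketch below assumes a hypothetical in-memory `ApprovalStore`; the class, its method names, and the use of random request IDs are illustrative choices, not part of the source.

```python
import secrets


class ApprovalStore:
    """Tracks approvals; each one is bound to exactly one action request."""

    def __init__(self) -> None:
        self._pending = {}    # request_id -> tool name, awaiting a decision
        self._decisions = {}  # request_id -> True (approved) or False (rejected)

    def request(self, tool: str) -> str:
        """Register a new action request and return its unique ID."""
        request_id = secrets.token_hex(8)
        self._pending[request_id] = tool
        return request_id

    def decide(self, request_id: str, approved: bool) -> None:
        """Record the user's decision for one pending request."""
        if request_id not in self._pending:
            raise KeyError("unknown or already decided request")
        del self._pending[request_id]
        self._decisions[request_id] = approved

    def consume(self, request_id: str) -> bool:
        """Return True once for an approved request; in every other case False."""
        # No recorded decision means no approval: there is no default action.
        return self._decisions.pop(request_id, False)


store = ApprovalStore()
first = store.request("emails.send_external")
print(store.consume(first))   # False, the user has not decided yet
store.decide(first, approved=True)
print(store.consume(first))   # True, the approval is used up here
second = store.request("emails.send_external")
print(store.consume(second))  # False, the earlier approval does not carry over
```

Because approvals are looked up by request ID rather than by tool name or time, an earlier confirmation cannot be replayed for a later call of the same tool.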

100% controllable does not mean 100% autonomous.