Full agent capability does not mean an agent may do everything. It means the system can describe every one of its states and capabilities in a structured way, including the capabilities that remain fundamentally off-limits to an agent. The risk model is the tool that draws this boundary: precise, machine-readable, and without exceptions.
Sensitive areas
Some parts of a system require a higher level of protection, not only against external attackers, but also against internal agents. A common misconception states:
"The user is allowed to see it, so the AI is allowed to see it too."
That is not correct. Effective AI access results from the intersection of user permission, agent permission, client trust level, resource sensitivity, purpose, and confirmation status. Each of these factors is a barrier in its own right; the user role is only one of them.
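The intersection described above can be sketched as a conjunction of independent checks. This is a minimal illustration, not a reference implementation: the class `AccessCheck`, its field names, and both helper functions are hypothetical names chosen for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AccessCheck:
    """One flag per barrier named in the text (all field names are illustrative)."""
    user_permitted: bool       # the user's role allows the resource
    agent_permitted: bool      # the agent itself holds a matching grant
    client_trusted: bool       # the client's trust level is sufficient
    sensitivity_allows: bool   # the resource's sensitivity admits AI access
    purpose_allowed: bool      # the declared purpose is an accepted one
    confirmation_ok: bool      # any required confirmation has been given


def failed_barriers(check: AccessCheck) -> list[str]:
    """Return the names of all barriers that block access."""
    return [f.name for f in fields(check) if not getattr(check, f.name)]


def ai_may_access(check: AccessCheck) -> bool:
    """Access is the intersection: every single barrier must pass."""
    return not failed_barriers(check)


# The user may see the record, but the agent holds no grant for it.
check = AccessCheck(True, False, True, True, True, True)
print(ai_may_access(check))    # False
print(failed_barriers(check))  # ['agent_permitted']
```

Returning the list of failed barriers rather than a bare boolean makes a denial explainable in logs without weakening the rule itself.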
The sensitive areas that must be particularly protected include:
- Financial data: bank details, payment data, billing details
- Health data: medical documents, medical records, certificates
- Salary data: wages, bonuses, salary history, payroll runs
- Personnel files: contracts, application documents, internal evaluations
- Credentials & API keys: passwords, private keys, access tokens
- Deletion and bulk functions: delete user, delete tenant, bulk export
- Permission management: granting and revoking permissions, changing roles
- Security logs: audit trails, security events, active sessions
- Private communications: confidential notes, internal assessments
- Tax data and contracts: tax-relevant documents, legally binding contracts
- Personal data under GDPR: all data that identifies a natural person
Six protection classes
Every resource and every tool is assigned to one of six protection classes. The class determines who gets access, whether an agent even sees the tool in discovery, and what approval is required before execution.
Public
Risk level: Low. Publicly accessible content without restriction. No authentication, no special role, no confirmation required. Suitable for help articles, product information, and documented system capabilities.
help.article.read
public.product_info.read
Internal
Risk level: Low. Only for logged-in users. Authentication is sufficient; no special privileges required. Typical for list queries, searches, and general work views of one’s own context.
project.list
contact.search
Confidential
Risk level: Medium. Only accessible with an explicit role and assigned scope. A simple login is not enough. The data is internal but not intended for all employees.
contract.read
customer_private_note.read
Sensitive
Risk level: High. Only with additional approval or restricted context sharing. An agent must not load this data into its context unchecked. When passed to AI, the scope must be narrowly limited and the purpose documented.
salary.read
bank_account.read
health_document.read
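Restricted context sharing for the Sensitive class could look roughly like the following sketch. The function `build_agent_context`, its parameters, and the sample record are assumptions made for illustration; the values are invented.

```python
from __future__ import annotations


def build_agent_context(record: dict, allowed_fields: set[str], purpose: str) -> dict:
    """Return only the allow-listed fields, together with the documented purpose."""
    if not purpose.strip():
        # No documented purpose, no sharing.
        raise ValueError("a purpose must be documented before sharing sensitive data")
    shared = {key: value for key, value in record.items() if key in allowed_fields}
    return {"purpose": purpose, "fields": shared}


# Invented sample record; only the salary field is released to the agent.
employee = {"name": "A. Example", "salary": 5200, "iban": "DE00 0000 0000 0000 0000 00"}
context = build_agent_context(employee, {"salary"}, "prepare salary review summary")
print(context["fields"])  # {'salary': 5200}
```

The allow-list is deliberate: a field that nobody named is never passed on, which keeps the shared scope narrow by default.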
Critical
Risk level: Critical. The AI may not act autonomously in this class. Every execution requires explicit user approval, often with step-up authentication. The actions produce external effects, are difficult to undo, or directly affect third parties.
payment.execute
user.delete
contract.send
email.send_external
security.change_permissions
Forbidden for AI
Risk level: Forbidden for AI. This class is a hard boundary, not a policy decision. Tools and resources of this class are completely hidden from tool discovery for agents. An agent can neither read, call, nor reference them.
password.read
private_key.read
full_database_export
raw_access_token.read
What an agent must not see, it must not find. Forbidden tools do not appear in discovery responses.
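One way to picture the discovery rule is a filter in front of the tool registry. The sketch below is hypothetical: `ProtectionClass`, `TOOL_REGISTRY`, `discover_tools`, and `resolve_tool` are illustrative names, and only the tool identifiers come from the examples above.

```python
from __future__ import annotations

from enum import Enum


class ProtectionClass(Enum):
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"
    SENSITIVE = "sensitive"
    CRITICAL = "critical"
    FORBIDDEN_FOR_AI = "forbidden_for_ai"


# Hypothetical registry; the tool names are the examples from the text.
TOOL_REGISTRY = {
    "help.article.read": ProtectionClass.PUBLIC,
    "project.list": ProtectionClass.INTERNAL,
    "contract.read": ProtectionClass.CONFIDENTIAL,
    "salary.read": ProtectionClass.SENSITIVE,
    "payment.execute": ProtectionClass.CRITICAL,
    "password.read": ProtectionClass.FORBIDDEN_FOR_AI,
    "private_key.read": ProtectionClass.FORBIDDEN_FOR_AI,
}


def discover_tools(registry: dict[str, ProtectionClass]) -> list[str]:
    """Build the discovery response for an agent: forbidden tools are omitted."""
    return sorted(
        name for name, cls in registry.items()
        if cls is not ProtectionClass.FORBIDDEN_FOR_AI
    )


def resolve_tool(name: str, registry: dict[str, ProtectionClass]) -> ProtectionClass:
    """Forbidden and unknown tools look identical to the agent: not found."""
    cls = registry.get(name)
    if cls is None or cls is ProtectionClass.FORBIDDEN_FOR_AI:
        raise LookupError(f"unknown tool: {name}")
    return cls


print(discover_tools(TOOL_REGISTRY))
```

Answering a call to a forbidden tool with the same error as a call to a nonexistent one avoids confirming that the tool exists. The sketch covers only the hidden class; role, scope, and approval checks for the other five classes would sit on top of it.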
The risk model
The risk model translates the protection classes into operative rules for every execution case. It determines whether autonomy is permitted, whether confirmation is required, and what type of approval is demanded.
Low
Purely read operations or actions without lasting external effect. Agents may call these tools autonomously.
- list_projects
- get_current_user
- search_contacts
- get_help_article
- create_reminder
Medium
Data or states are created or modified, but within a narrowly defined scope and without external effect. Autonomous execution is permitted when scope and context are clear from the workflow.
- create_note
- update_task_status
- generate_summary
- create_draft
High
Actions that change relevant system states or affect other users and external resources. Explicit confirmation is generally required.
- change_project_status
- invite_calendar_attendee
- share_file_link
- update_customer_data
Critical
External communication, payments, deletions, permission changes. Confirmation is always required, often with step-up authentication. Autonomous execution is not permitted.
- emails.send_external
- payment.execute
- user.delete
- security.change_permissions
- payroll.export
- contract.send
Forbidden
Completely blocked for AI agents, neither reading nor writing. Not visible in discovery, not callable, not usable as a reference.
- password.read
- private_key.read
- raw_access_token.read
- disable_audit_log
- full_database_export
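The five risk levels can be read as a small decision function. The following is a sketch under stated assumptions: `Risk`, `Decision`, and `decide` are hypothetical names, the flags `scope_is_clear` and `step_up` stand in for checks the text does not specify, and the High level is simplified to always require confirmation although the text says "generally".

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"
    CRITICAL = "critical"
    FORBIDDEN = "forbidden"


class Decision(Enum):
    EXECUTE = "execute autonomously"
    CONFIRM = "ask the user to confirm"
    CONFIRM_STEP_UP = "ask the user to confirm with step-up authentication"
    DENY = "deny"


def decide(risk: Risk, scope_is_clear: bool = False, step_up: bool = False) -> Decision:
    """Map a risk level to the handling described in the text."""
    if risk is Risk.LOW:
        return Decision.EXECUTE
    if risk is Risk.MEDIUM:
        # Autonomy only when scope and context are clear from the workflow.
        return Decision.EXECUTE if scope_is_clear else Decision.CONFIRM
    if risk is Risk.HIGH:
        # Simplification: treated as always requiring confirmation.
        return Decision.CONFIRM
    if risk is Risk.CRITICAL:
        # Never autonomous; step-up is an additional requirement, not a bypass.
        return Decision.CONFIRM_STEP_UP if step_up else Decision.CONFIRM
    # Forbidden, and anything unexpected, is denied.
    return Decision.DENY


print(decide(Risk.MEDIUM, scope_is_clear=True).value)  # execute autonomously
print(decide(Risk.CRITICAL, step_up=True).value)
```

The function ends in a denial rather than an execution, so a level added later without a matching rule fails closed.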
Approval UX
The technical risk model only takes full effect when the user interface also plays its part. A confirmation is only as good as the information on which it is based.
Before a human confirms a critical agent action, they must be able to clearly see seven things:
- What does the agent want to do? The concrete action, not the agent’s intent in its own words.
- Why does the agent want to do it? Which intent or assignment triggered this action.
- Which data is being used? Recipients, attachments, referenced entities.
- What external effects will occur? What changes outside the system: email is sent, payment is triggered, document is transmitted.
- Who is affected? Persons, companies, tenants, external parties.
- Can the action be undone? Clear statement: reversible or irreversible.
- What happens upon confirmation? Complete description of the next system states.
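The seven questions above can be enforced as a completeness check before a card is ever shown. This is a hedged sketch: `ApprovalCard`, its field names, and `missing_answers` are hypothetical, and the sample values are illustrative.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ApprovalCard:
    """One field per question the human must be able to answer."""
    action: str                  # what the agent wants to do
    reason: str                  # why it wants to do it
    data_used: list[str]         # recipients, attachments, referenced entities
    external_effects: list[str]  # what changes outside the system
    affected_parties: list[str]  # persons, companies, tenants, external parties
    reversible: Optional[bool]   # None means "not stated", which is not acceptable
    outcome_on_confirm: str      # the next system states


def missing_answers(card: ApprovalCard) -> list[str]:
    """Return the names of all questions the card leaves unanswered."""
    missing = []
    for f in fields(card):
        value = getattr(card, f.name)
        # reversible=False is a valid answer; only None, "" and [] count as missing.
        if value is None or value == "" or value == []:
            missing.append(f.name)
    return missing


card = ApprovalCard(
    action="emails.send_external",
    reason="Follow-up on a project after the last interaction",
    data_used=["recipient address", "download link"],
    external_effects=["email is sent"],
    affected_parties=[],
    reversible=None,
    outcome_on_confirm="",
)
print(missing_answers(card))  # ['affected_parties', 'reversible', 'outcome_on_confirm']
```

A card with a non-empty result would be rejected before it reaches the user, so a confirmation can never rest on partial information.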
The following example shows what a complete approval card for the tool emails.send_external looks like:
Sales Assistant
emails.send_external: Following up on project Havelblick based on the last interaction.
- Recipient: Max Müller, Müller GmbH <[email protected]>
- Subject: Follow-up on project Havelblick
- Attachment: No direct attachment; the download link expires after 14 days.
- External effect: The email will be sent and become part of the communication history.
- Reversible: No, cannot be undone after sending.
Reason: External communication with project-related information and download link. Irreversible after sending.
The agent waits for the user’s decision. It must not anticipate the confirmation, set a default action, or base execution on the timestamp of a previous approval.
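A minimal sketch of such a gate binds each approval to the exact request and consumes it on use. All names here (`request_fingerprint`, `ApprovalGate`, the `run` callback) are assumptions for illustration, and the arguments are assumed to be JSON-serializable.

```python
import hashlib
import json


def request_fingerprint(tool: str, arguments: dict) -> str:
    """Hash the tool name and its exact arguments into a stable identifier."""
    payload = json.dumps({"tool": tool, "arguments": arguments}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ApprovalGate:
    """Holds approvals that are valid for one exact request, exactly once."""

    def __init__(self):
        self._approved = set()

    def approve(self, tool: str, arguments: dict) -> None:
        # To be called only as a result of the user's explicit decision.
        self._approved.add(request_fingerprint(tool, arguments))

    def execute(self, tool: str, arguments: dict, run):
        fingerprint = request_fingerprint(tool, arguments)
        if fingerprint not in self._approved:
            # No default action: without a matching approval nothing runs.
            raise PermissionError("no user approval for this exact request")
        self._approved.discard(fingerprint)  # single use, never replayed
        return run(tool, arguments)


gate = ApprovalGate()
request = {"to": "recipient", "subject": "Follow-up on project Havelblick"}
gate.approve("emails.send_external", request)
print(gate.execute("emails.send_external", request, lambda tool, args: "sent"))  # sent
```

Because the approval is keyed by the request content instead of a timestamp, an earlier confirmation cannot authorize a later or modified request.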
100% controllable does not mean 100% autonomous.