Tool Design Guide

How MCP tools are described, classified, and secured under a unified contract standard, with risk levels ranging from Low to Forbidden.

An MCP tool is not a function call. It is a contract. It describes what the system can do, who is allowed to call it, what happens in the process, and what can go wrong. This contract must be machine-readable, typed, and complete before an agent or a user even calls the tool.

A tool without a contract is a black box. A black box is not an MCP-first system.

The principle

The tool contract standard

Every MCP tool is described according to a unified schema. The schema covers not only input and output types, but also risk level, permissions, side effects, audit events, and possible failure cases. A tool can be used safely only by a caller that knows every field, whether that caller is an agent, a web app, or an automated workflow.

The fields at a glance:

  • Name: unique, machine-readable identifier in domain.verb format
  • Description: natural-language description of the purpose for agents and developers
  • Category: domain classification (e.g. communication, files, billing)
  • Risk Level: classification according to the risk model (Low to Forbidden)
  • Autonomous Execution: whether an agent may call the tool without human approval
  • Confirmation: whether explicit user confirmation is required
  • Step-up Auth: whether additional authentication is needed and when
  • Input Schema: typed parameters including optional fields
  • Output Schema: typed return values including status codes
  • Permissions: which permissions the tool requires
  • Side Effects: all state changes outside the return value
  • Audit Event: which event is written to the audit log
  • Failure Modes: complete list of all possible error codes

The following example shows the complete contract for emails.send_external, a tool with critical risk that triggers external communication:

emails.send_external
Risk Level: Critical

Sends an external email to one or more recipients.

Category: communication
Autonomous Execution: not permitted
Confirmation: yes
Step-up Auth: optional, depending on recipient count and attachments
Audit Event: email.external.sent

Input Schema

  • to: string[]
  • subject: string
  • body: string
  • attachments?: fileId[]
  • projectId?: string

Output Schema

  • emailId: string
  • status: sent | scheduled | failed

Permissions

  • emails:send
  • contacts:read
  • files:read

Failure Modes

  • permission_denied
  • confirmation_required
  • invalid_recipient
  • attachment_too_large
  • rate_limit_exceeded
  • policy_violation

Side Effects

  • External communication is sent.
  • Email becomes part of the communication history.
  • Audit event is created.

Every field has a job at runtime: permissions feeds the policy engine, sideEffects informs the confirmation UI, failureModes drives error handling, and auditEvent names the event written to the audit log. The contract is therefore not just documentation; it is part of the runtime.
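To make the field list concrete, here is a minimal sketch of how such a contract could be held in memory and consulted at runtime. The ToolContract class, its attribute names, and the missing_permissions helper are illustrative assumptions, not part of any MCP SDK; only the field values are taken from the emails.send_external example above.

```python
from dataclasses import dataclass
from enum import Enum


class RiskLevel(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    CRITICAL = "Critical"
    FORBIDDEN = "Forbidden for AI"


@dataclass(frozen=True)
class ToolContract:
    """Hypothetical in-memory form of the contract fields listed above."""
    name: str
    description: str
    category: str
    risk_level: RiskLevel
    autonomous_execution: bool
    confirmation: bool
    step_up_auth: str
    input_schema: dict[str, str]
    output_schema: dict[str, str]
    permissions: tuple[str, ...]
    side_effects: tuple[str, ...]
    audit_event: str
    failure_modes: tuple[str, ...]


# Values copied from the emails.send_external example in this guide.
SEND_EXTERNAL = ToolContract(
    name="emails.send_external",
    description="Sends an external email to one or more recipients.",
    category="communication",
    risk_level=RiskLevel.CRITICAL,
    autonomous_execution=False,
    confirmation=True,
    step_up_auth="optional, depending on recipient count and attachments",
    input_schema={
        "to": "string[]",
        "subject": "string",
        "body": "string",
        "attachments?": "fileId[]",
        "projectId?": "string",
    },
    output_schema={"emailId": "string", "status": "sent | scheduled | failed"},
    permissions=("emails:send", "contacts:read", "files:read"),
    side_effects=(
        "External communication is sent.",
        "Email becomes part of the communication history.",
        "Audit event is created.",
    ),
    audit_event="email.external.sent",
    failure_modes=(
        "permission_denied",
        "confirmation_required",
        "invalid_recipient",
        "attachment_too_large",
        "rate_limit_exceeded",
        "policy_violation",
    ),
)


def missing_permissions(contract: ToolContract, granted: set[str]) -> list[str]:
    """Return the required permissions the caller does not hold, in contract order."""
    return [p for p in contract.permissions if p not in granted]
```

A policy engine built along these lines could answer permission_denied whenever missing_permissions returns a non-empty list, without any tool-specific code.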

The risk model

MCP-first classifies every tool into one of five risk levels. The level determines whether an agent may call the tool autonomously, whether confirmation is required, and whether step-up auth is demanded. Classification is not optional; it is part of the contract.

Low

Tools at this level are purely read-only or produce no lasting effects. They may be executed autonomously by agents without confirmation or step-up auth. Typical examples are list queries, searches, and help resources.

Low risk examples:
  • list_projects
  • get_current_user
  • search_contacts
  • get_help_article
  • create_reminder

Medium

Tools at this level create data or state, but within a narrowly bounded scope and without external effects. They may be executed autonomously when the workflow makes scope and context clear. If the context is ambiguous, confirmation is advisable.

Medium risk examples:
  • create_note
  • update_task_status
  • generate_summary
  • create_draft

High

Tools at this level change relevant system states or affect other users and external resources. They generally require explicit confirmation. An agent must not execute them silently.

High risk examples:
  • change_project_status
  • invite_calendar_attendee
  • share_file_link
  • update_customer_data

Critical

Tools at this level trigger external communication, payments, data deletions, or permission changes. They always require explicit confirmation and often step-up auth. Autonomous execution is not permitted.

Critical risk examples:
  • emails.send_external
  • bulk_export
  • delete_user
  • change_permissions
  • execute_payment
  • send_contract
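The four callable levels described so far can be read as a small decision table for agent-initiated calls. The sketch below is one possible reading, not a prescribed implementation: the gate function, the Decision enum, and the context_is_clear flag are hypothetical names, High is simplified to always require confirmation, and step-up auth is not modeled.

```python
from enum import Enum


class RiskLevel(Enum):
    LOW = "Low"
    MEDIUM = "Medium"
    HIGH = "High"
    CRITICAL = "Critical"


class Decision(Enum):
    EXECUTE = "execute"
    CONFIRMATION_REQUIRED = "confirmation_required"


def gate(risk: RiskLevel, *, confirmed: bool, context_is_clear: bool = False) -> Decision:
    """Decide whether an agent-initiated call may run right now."""
    if risk is RiskLevel.LOW:
        # Read-only or no lasting effects: always autonomous.
        return Decision.EXECUTE
    if risk is RiskLevel.MEDIUM:
        # Autonomous only when the workflow makes scope and context clear.
        if context_is_clear or confirmed:
            return Decision.EXECUTE
        return Decision.CONFIRMATION_REQUIRED
    # High and Critical: never executed silently (simplification for High).
    return Decision.EXECUTE if confirmed else Decision.CONFIRMATION_REQUIRED


if __name__ == "__main__":
    for level in RiskLevel:
        print(level.value, gate(level, confirmed=False).value)
```

Returning a decision value instead of raising keeps the result easy to map onto the confirmation_required failure mode from the contract example.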

Forbidden for AI

Tools at this level may neither be read nor executed by an AI agent. They are completely hidden from tool discovery for agents. The classification is a hard boundary, not a policy decision.

Forbidden for AI examples:
  • read_private_key
  • read_password
  • read_raw_access_token
  • disable_audit_log
  • export_full_database_without_approval

What an agent must not see, it must not be able to find. Forbidden tools do not appear in discovery responses.
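Hiding Forbidden tools from discovery can be as simple as filtering the registry before the response is built. The following is a minimal sketch under stated assumptions: Tool, REGISTRY, and discover_tools are hypothetical names, and the choice to show the full registry to non-agent callers is an assumption the guide does not make.

```python
from dataclasses import dataclass

FORBIDDEN = "Forbidden for AI"


@dataclass(frozen=True)
class Tool:
    name: str
    risk_level: str


# Small sample registry using tool names from this guide.
REGISTRY = [
    Tool("list_projects", "Low"),
    Tool("emails.send_external", "Critical"),
    Tool("read_private_key", FORBIDDEN),
    Tool("disable_audit_log", FORBIDDEN),
]


def discover_tools(registry: list[Tool], *, caller_is_agent: bool) -> list[Tool]:
    """Build the discovery response; agents never see Forbidden tools."""
    if not caller_is_agent:
        # Assumption: non-agent callers (e.g. admin tooling) see the full registry.
        return list(registry)
    return [tool for tool in registry if tool.risk_level != FORBIDDEN]


if __name__ == "__main__":
    for tool in discover_tools(REGISTRY, caller_is_agent=True):
        print(tool.name)
```

Filtering at discovery time is not a substitute for refusing the call itself; a server would presumably enforce the same boundary again at execution.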

Idempotency

Many tools are called multiple times in agent workflows, through retries, parallel execution, or faulty planning. MCP-first recommends making tools idempotent where possible: the same input leads to the same result on repeated calls, without generating undesired duplicates or side effects.

The tool create_download_link is a good example. If a valid link for a file with the same fileId and the same expiresAt already exists, the tool returns it rather than creating a second one:

create_download_link(fileId: "f_abc123", expiresAt: "2025-12-31T23:59:00Z")

→ First call: creates new link, returns linkId + url
→ Second call (same parameters, link still valid): returns same link
→ Third call (link expired): creates new link

Idempotency does not arise automatically; it must be implemented explicitly. The sideEffects field of the tool contract should describe the behavior on repetition so that agents and clients can plan accordingly.
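The create_download_link behavior described above could be implemented roughly as follows. This is a sketch, not the actual tool: the DownloadLinks class, the in-memory dictionary, the explicit now parameter, and the files.example.com URL are all assumptions made for illustration. The idempotency key is the pair (fileId, expiresAt), and a stored link is reused only while it is still valid.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from itertools import count


@dataclass(frozen=True)
class Link:
    link_id: str
    url: str
    expires_at: datetime


class DownloadLinks:
    """In-memory stand-in for the link store (hypothetical)."""

    def __init__(self) -> None:
        self._by_key: dict[tuple[str, datetime], Link] = {}
        self._ids = count(1)

    def create_download_link(self, file_id: str, expires_at: datetime, *, now: datetime) -> Link:
        key = (file_id, expires_at)  # same parameters map to the same key
        existing = self._by_key.get(key)
        if existing is not None and now < existing.expires_at:
            return existing  # still valid: reuse instead of duplicating
        link_id = f"l_{next(self._ids)}"
        link = Link(link_id, f"https://files.example.com/d/{link_id}", expires_at)
        self._by_key[key] = link  # replaces an expired entry, if any
        return link


if __name__ == "__main__":
    store = DownloadLinks()
    expires = datetime(2025, 12, 31, 23, 59, tzinfo=timezone.utc)
    before = datetime(2025, 6, 1, tzinfo=timezone.utc)
    first = store.create_download_link("f_abc123", expires, now=before)
    second = store.create_download_link("f_abc123", expires, now=before)
    print(first.link_id == second.link_id)
```

Passing now explicitly keeps the function deterministic and easy to test. How a real tool should treat an expiresAt that already lies in the past is left open here.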

Dry Run

Risky tools, especially those with bulk effects, should offer a dry run mode. The dry run performs all validations and checks but produces no real side effects. The result shows what would happen, without anything actually happening.

Calling emails.bulk_send with dryRun: true returns a complete preview:

// Request
{
  "tool": "emails.bulk_send",
  "params": {
    "templateId": "tmpl_newsletter_q3",
    "segmentId": "seg_active_customers",
    "dryRun": true
  }
}

// Response
{
  "dryRun": true,
  "recipientCount": 4821,
  "sampleMessages": [
    {
      "to": "[email protected]",
      "subject": "Your Q3 update is here",
      "previewText": "Dear Anna, in the third quarter we…"
    },
    {
      "to": "[email protected]",
      "subject": "Your Q3 update is here",
      "previewText": "Dear Max, in the third quarter we…"
    }
  ],
  "warnings": [
    "43 recipients have no first name, fallback 'Hello' will be used.",
    "12 email addresses are marked as undeliverable and will be skipped."
  ],
  "missingPermissions": [],
  "expectedSideEffects": [
    "4821 external emails will be sent.",
    "4821 audit events of type email.external.sent will be created.",
    "Campaign status will be set to 'sent'."
  ],
  "estimatedDurationMs": 18400
}

The real call, without dryRun, happens only after the user has reviewed the dry run result and explicitly confirmed it. The agent must not skip the dry run when the tool declares it as required.
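On the server side, a dry run mode usually means that validation and preview generation share one code path, and only the final step is guarded. The sketch below illustrates that shape. Recipient, Mailer, and this bulk_send signature are hypothetical and much simpler than the emails.bulk_send tool shown above; the response keys reuse names from the example response.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Recipient:
    email: str
    first_name: Optional[str] = None
    deliverable: bool = True


@dataclass
class Mailer:
    """Stand-in for the delivery backend; records what was sent."""
    outbox: list = field(default_factory=list)

    def send(self, to: str, subject: str) -> None:
        self.outbox.append((to, subject))


def bulk_send(recipients: list[Recipient], subject: str, mailer: Mailer, *, dry_run: bool) -> dict:
    # Validation runs identically in both modes.
    deliverable = [r for r in recipients if r.deliverable]
    skipped = len(recipients) - len(deliverable)
    unnamed = sum(1 for r in deliverable if not r.first_name)

    warnings = []
    if unnamed:
        warnings.append(f"{unnamed} recipients have no first name, fallback 'Hello' will be used.")
    if skipped:
        warnings.append(f"{skipped} email addresses are marked as undeliverable and will be skipped.")

    result = {
        "dryRun": dry_run,
        "recipientCount": len(deliverable),
        "warnings": warnings,
        "expectedSideEffects": [f"{len(deliverable)} external emails will be sent."],
    }
    if dry_run:
        return result  # preview only: nothing below this line runs

    for r in deliverable:
        mailer.send(r.email, subject)
    return result
```

Because the preview and the real call compute recipients and warnings with the same code, the numbers the user confirms are the numbers that apply to the real send, provided the underlying data has not changed in between.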