Tool Design Guide · MCP-first

An MCP tool is not a function call. It is a contract. It describes what the system can do, who is allowed to call it, what happens in the process, and what can go wrong. This contract must be machine-readable, typed, and complete before an agent or a user even calls the tool.

A tool without a contract is a black box. A black box is not an MCP-first system.

The principle

The tool contract standard

Every MCP tool is described according to a unified schema. The schema covers not only input and output types, but also risk level, permissions, side effects, audit events, and possible failure cases. Only whoever knows all fields can use a tool safely, whether through an agent, a web app, or an automated workflow.

The fields at a glance:

Name: unique, machine-readable identifier in domain.verb format
Description: natural-language description of the purpose, for agents and developers
Category: domain classification (e.g. communication, files, billing)
Risk Level: classification according to the risk model (Low to Forbidden)
Autonomous Execution: whether an agent may call the tool without human approval
Confirmation: whether explicit user confirmation is required
Step-up Auth: whether additional authentication is needed, and when
Input Schema: typed parameters, including optional fields
Output Schema: typed return values, including status codes
Permissions: which permissions the tool requires
Side Effects: all state changes outside the return value
Audit Event: which event is written to the audit log
Failure Modes: complete list of all possible error codes
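
As a rough sketch, the fields above could be captured in a TypeScript type. The property names, the RiskLevel union, the encoding of stepUpAuth, and the isValidToolName helper are illustrative assumptions, not a schema defined by this guide.

```typescript
// Hypothetical shape of a tool contract; property names mirror the field list.
type RiskLevel = "low" | "medium" | "high" | "critical" | "forbidden";

interface ToolContract {
  name: string; // domain.verb, e.g. "emails.send_external"
  description: string;
  category: string;
  riskLevel: RiskLevel;
  autonomousExecution: boolean;
  confirmation: boolean;
  stepUpAuth: "never" | "conditional" | "always"; // assumed encoding
  inputSchema: Record<string, string>; // parameter name -> type, simplified
  outputSchema: Record<string, string>;
  permissions: string[];
  sideEffects: string[];
  auditEvent: string;
  failureModes: string[];
}

// Assumed reading of "domain.verb": two lowercase identifiers joined by one dot.
function isValidToolName(name: string): boolean {
  return /^[a-z][a-z0-9_]*\.[a-z][a-z0-9_]*$/.test(name);
}
```

A registry could run such a check at registration time, so that an incomplete or misnamed contract is rejected before any agent can discover the tool.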

The following example shows the complete contract for emails.send_external, a tool with critical risk that triggers external communication:

emails.send_external

Critical

Sends an external email to one or more recipients.

Category: communication
Autonomous Execution: not permitted
Confirmation: yes
Step-up Auth: optional, depending on recipient count and attachment
Audit Event: email.external.sent

Input Schema

to: string[]
subject: string
body: string
attachments?: fileId[]
projectId?: string

Output Schema

emailId: string
status: sent | scheduled | failed

Permissions

emails:send
contacts:read
files:read

Failure Modes

permission_denied
confirmation_required
invalid_recipient
attachment_too_large
rate_limit_exceeded
policy_violation

Every field has a function in the system: permissions drives the policy engine, sideEffects informs the confirmation UI, failureModes are evaluated in error handling, and auditEvent is written to the audit log. The contract is therefore not just documentation; it is part of the runtime.
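
To illustrate how contract fields can act at runtime, the following sketch gates a call using values from the emails.send_external contract. The CallContext type and the preflight function are hypothetical; only the permission names, the audit event, and the failure codes come from the contract above.

```typescript
// Subset of the emails.send_external contract shown above.
const sendExternal = {
  name: "emails.send_external",
  confirmation: true,
  permissions: ["emails:send", "contacts:read", "files:read"],
  auditEvent: "email.external.sent",
};

// Hypothetical description of the caller's state at call time.
interface CallContext {
  grantedPermissions: string[];
  userConfirmed: boolean;
}

// Returns a failure code from the contract, or null if the call may proceed.
function preflight(
  contract: typeof sendExternal,
  ctx: CallContext
): string | null {
  const missing = contract.permissions.filter(
    (p) => ctx.grantedPermissions.indexOf(p) === -1
  );
  if (missing.length > 0) return "permission_denied";
  if (contract.confirmation && !ctx.userConfirmed) return "confirmation_required";
  return null;
}
```

Because the check reads the permissions and the confirmation flag from the contract itself, changing the contract changes the runtime behavior without touching the gate.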

The risk model

MCP-first classifies every tool into one of five risk levels. The level determines whether an agent may call the tool autonomously, whether confirmation is required, and whether step-up auth is demanded. The classification is not optional; it is part of the contract.

Low

Tools at this level are purely read-only or produce no lasting effects. They may be executed autonomously by agents without confirmation or step-up auth. Typical examples are list queries, searches, and help resources.

Low Risk, Examples

list_projects Low
get_current_user Low
search_contacts Low
get_help_article Low
create_reminder Low

Medium

Tools at this level create data or state, but within a narrowly bounded scope and without external effects. They may be executed autonomously when the workflow makes scope and context clear. If the context is ambiguous, confirmation is advisable.

Medium Risk, Examples

create_note Medium
update_task_status Medium
generate_summary Medium
create_draft Medium

High

Tools at this level change relevant system states or affect other users and external resources. They generally require explicit confirmation. An agent must not execute them silently.

High Risk, Examples

change_project_status High
invite_calendar_attendee High
share_file_link High
update_customer_data High

Critical

Tools at this level trigger external communication, payments, data deletions, or permission changes. They always require explicit confirmation and often step-up auth. Autonomous execution is not permitted.

Critical Risk, Examples

emails.send_external Critical
bulk_export Critical
delete_user Critical
change_permissions Critical
execute_payment Critical
send_contract Critical
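
The rules for the four levels above could be condensed into a lookup table like the one below. The type names and the mayRunAutonomously helper are assumptions for illustration. Reducing "scope and context are clear" to a boolean is a simplification, and the step-up values for Medium and High are placeholders, since the text does not specify them.

```typescript
// The four levels an agent can encounter; Forbidden is handled at discovery.
type CallableRisk = "low" | "medium" | "high" | "critical";

interface ExecutionPolicy {
  autonomous: "always" | "if_context_clear" | "never";
  confirmation: "none" | "if_ambiguous" | "required";
  stepUpAuth: "none" | "often"; // "none" for medium/high is an assumption
}

const policyByRisk: Record<CallableRisk, ExecutionPolicy> = {
  low: { autonomous: "always", confirmation: "none", stepUpAuth: "none" },
  medium: { autonomous: "if_context_clear", confirmation: "if_ambiguous", stepUpAuth: "none" },
  // The guide says High tools "generally" require confirmation; modeled as required.
  high: { autonomous: "never", confirmation: "required", stepUpAuth: "none" },
  critical: { autonomous: "never", confirmation: "required", stepUpAuth: "often" },
};

// Decides whether an agent may call a tool without involving a human.
function mayRunAutonomously(risk: CallableRisk, contextIsClear: boolean): boolean {
  const rule = policyByRisk[risk].autonomous;
  return rule === "always" || (rule === "if_context_clear" && contextIsClear);
}
```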

Forbidden for AI

Tools at this level may neither be read nor executed by an AI agent. They are completely hidden from tool discovery for agents. The classification is a hard boundary, not a policy decision.

Forbidden, Examples

read_private_key Forbidden for AI
read_password Forbidden for AI
read_raw_access_token Forbidden for AI
disable_audit_log Forbidden for AI
export_full_database_without_approval Forbidden for AI

An agent must not be able to find what it must not see. Forbidden tools do not appear in discovery responses.
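
A minimal sketch of this boundary, assuming an in-memory registry: discovery for agents filters out Forbidden tools, and execution is checked against the same filtered list, so a guessed name does not bypass the boundary. ToolEntry, discoverForAgent, and agentMayExecute are hypothetical names.

```typescript
type RiskLevel = "low" | "medium" | "high" | "critical" | "forbidden";

interface ToolEntry {
  name: string;
  riskLevel: RiskLevel;
}

// Example registry using tool names from the lists above.
const registry: ToolEntry[] = [
  { name: "list_projects", riskLevel: "low" },
  { name: "delete_user", riskLevel: "critical" },
  { name: "read_private_key", riskLevel: "forbidden" },
];

// Discovery response for agents: Forbidden tools are never included.
function discoverForAgent(tools: ToolEntry[]): ToolEntry[] {
  return tools.filter((t) => t.riskLevel !== "forbidden");
}

// Execution is allowed only for tools the agent could have discovered.
function agentMayExecute(tools: ToolEntry[], name: string): boolean {
  return discoverForAgent(tools).some((t) => t.name === name);
}
```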

Idempotency

Many tools are called multiple times in agent workflows, through retries, parallel execution, or faulty planning. MCP-first recommends making tools idempotent where possible: the same input leads to the same result on repeated calls, without generating undesired duplicates or side effects.

The tool create_download_link is a good example. If a valid link for a file with the same fileId and the same expiresAt already exists, the tool returns it rather than creating a second one:

create_download_link(fileId: "f_abc123", expiresAt: "2025-12-31T23:59:00Z")

→ First call: creates new link, returns linkId + url
→ Second call (same parameters, link still valid): returns same link
→ Third call (link expired): creates new link

Idempotency must be implemented explicitly; it does not arise automatically. The tool contract field sideEffects should describe the behavior on repetition so that agents and clients can plan accordingly.
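
The behavior of create_download_link described above can be sketched as follows. This is a simplified illustration under stated assumptions: the in-memory store, the key format, the generated IDs, and the placeholder URL are all hypothetical, and the current time is passed in as a parameter so that the expiry case can be exercised.

```typescript
interface DownloadLink {
  linkId: string;
  url: string;
  fileId: string;
  expiresAt: string; // ISO 8601 timestamp
}

// In-memory store standing in for a database, keyed by fileId + expiresAt.
const linksByKey: Record<string, DownloadLink> = {};
let nextId = 1;

function createDownloadLink(
  fileId: string,
  expiresAt: string,
  now: Date = new Date()
): DownloadLink {
  const key = fileId + "|" + expiresAt;
  const existing: DownloadLink | undefined = linksByKey[key];

  // Same input and link still valid: return it instead of creating a duplicate.
  if (existing !== undefined && new Date(existing.expiresAt).getTime() > now.getTime()) {
    return existing;
  }

  // No link yet, or the stored one has expired: create and remember a new one.
  const linkId = "l_" + nextId++;
  const link: DownloadLink = {
    linkId,
    url: "https://files.example.invalid/d/" + linkId, // placeholder URL
    fileId,
    expiresAt,
  };
  linksByKey[key] = link;
  return link;
}
```

The lookup key is built from exactly the parameters that define "the same input", which is the design decision that makes retries safe here.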

Dry Run

Risky tools, especially those with bulk effects, should offer a dry run mode. The dry run performs all validations and checks but produces no real side effects. The result shows what would happen, without anything actually happening.

Called with dryRun: true, emails.bulk_send returns a complete preview:

// Request
{
  "tool": "emails.bulk_send",
  "params": {
    "templateId": "tmpl_newsletter_q3",
    "segmentId": "seg_active_customers",
    "dryRun": true
  }
}

// Response
{
  "dryRun": true,
  "recipientCount": 4821,
  "sampleMessages": [
    {
      "to": "[email protected]",
      "subject": "Your Q3 update is here",
      "previewText": "Dear Anna, in the third quarter we…"
    },
    {
      "to": "[email protected]",
      "subject": "Your Q3 update is here",
      "previewText": "Dear Max, in the third quarter we…"
    }
  ],
  "warnings": [
    "43 recipients have no first name, fallback 'Hello' will be used.",
    "12 email addresses are marked as undeliverable and will be skipped."
  ],
  "missingPermissions": [],
  "expectedSideEffects": [
    "4821 external emails will be sent.",
    "4821 audit events of type email.external.sent will be created.",
    "Campaign status will be set to 'sent'."
  ],
  "estimatedDurationMs": 18400
}

The real call, without dryRun, happens only after the user has reviewed the dry run result and explicitly confirmed it. The agent must not skip the dry run when the tool declares it as required.
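
One way to structure such a handler is to run the same validation in both modes and guard only the side effects with the flag. The sketch below assumes a hypothetical Recipient type, an in-memory outbox in place of a mail service, and a reduced result shape; it is not the actual emails.bulk_send implementation.

```typescript
interface Recipient {
  email: string;
  firstName?: string;
  deliverable: boolean;
}

interface BulkSendResult {
  dryRun: boolean;
  recipientCount: number;
  warnings: string[];
  sentCount: number;
}

// Hypothetical transport; a real implementation would call a mail service.
const outbox: string[] = [];
function deliver(email: string): void {
  outbox.push(email);
}

function bulkSend(recipients: Recipient[], dryRun: boolean): BulkSendResult {
  // Validation runs identically in both modes.
  const deliverable = recipients.filter((r) => r.deliverable);
  const skipped = recipients.length - deliverable.length;
  const unnamed = deliverable.filter((r) => !r.firstName).length;

  const warnings: string[] = [];
  if (unnamed > 0) {
    warnings.push(unnamed + " recipients have no first name, fallback 'Hello' will be used.");
  }
  if (skipped > 0) {
    warnings.push(skipped + " email addresses are marked as undeliverable and will be skipped.");
  }

  // Side effects are the only part guarded by the flag.
  if (!dryRun) {
    deliverable.forEach((r) => deliver(r.email));
  }

  return {
    dryRun,
    recipientCount: deliverable.length,
    warnings,
    sentCount: dryRun ? 0 : deliverable.length,
  };
}
```

Sharing the validation path keeps the preview honest: the dry run cannot report a recipient count or warning that the real call would compute differently.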