
Tool Use & The Model Context Protocol
From JSON Schema definitions to MCP servers — master the tool-use APIs from Anthropic and OpenAI, design tools that models actually use correctly, and learn the protocol turning every agent into an interoperable system.
What you will learn
The moment an LLM can call a function, it stops being a chatbot and starts being a system. Tool use is the foundation of every agent in this course — without it, the model can only talk; with it, the model can act. This chapter covers the production-grade tool-use APIs from Anthropic and OpenAI, the Model Context Protocol that's standardizing the integration layer, and the practical patterns (parallel tools, strict schemas, error handling) that separate a demo from a deployment.
tool_use blocks; your code executes them and returns tool_result. 3) Parallel tool calls are now default: design for concurrency, not sequence. 4) MCP is the universal tool-server protocol: write tools once, run them in any agent stack.
The Tool-Use Mental Model
A tool is a function the model can request, never invoke directly. The model doesn't run code; it produces structured JSON that says, in effect, "please run get_weather with location='Tokyo'." Your code runs the function, gets the answer, and feeds the result back into the next model turn. That round-trip is the entire mechanism.
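The round-trip can be sketched without any SDK. The snippet below is a minimal illustration, not a production dispatcher: TOOL_REGISTRY and execute_tool_request are hypothetical names, the get_weather body is a stand-in, and the request dict only imitates the kind of structured JSON a model emits.

```python
import json

def get_weather(location: str, unit: str = "c") -> dict:
    """Stand-in implementation; a real tool would call a weather service."""
    return {"location": location, "temp": 22, "unit": unit}

# Hypothetical registry mapping tool names to the functions that implement them.
TOOL_REGISTRY = {"get_weather": get_weather}

def execute_tool_request(request: dict) -> dict:
    """Run the function the model asked for and package the outcome for the next turn."""
    func = TOOL_REGISTRY.get(request["name"])
    if func is None:
        # Unknown tools are reported as data so the caller can pass that back to the model.
        return {"tool": request["name"], "ok": False, "result": {"error": "unknown_tool"}}
    return {"tool": request["name"], "ok": True, "result": func(**request["input"])}

# What the model emits is only data: a tool name plus JSON arguments.
model_request = json.loads('{"name": "get_weather", "input": {"location": "Tokyo"}}')
outcome = execute_tool_request(model_request)
```

The model never touches get_weather itself; it only produces model_request, and your code decides whether and how to run it.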
Anthropic Tool Use API
Anthropic's tool use API takes a list of tool definitions, each with a name, a description, and a JSON Schema describing its input. The model returns content blocks of type text, tool_use, or both.
    import anthropic

    client = anthropic.Anthropic()

    tools = [{
        "name": "get_weather",
        "description": "Get current weather for a location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City, country"},
                "unit": {"type": "string", "enum": ["c", "f"]},
            },
            "required": ["location"],
        },
    }]

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        tools=tools,
        messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    )

    if response.stop_reason == "tool_use":
        for block in response.content:
            if block.type == "tool_use":
                # execute_tool is your own dispatcher, not part of the SDK
                result = execute_tool(block.name, block.input)
                # feed result back as the next user turn
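Feeding the result back means appending two messages: the assistant turn that requested the tool, then a user turn holding one tool_result block per tool_use. The helper below is a sketch of that bookkeeping with plain dicts; build_followup_messages is a hypothetical name and "toolu_example" is a made-up id. In real code the assistant content would be response.content and the id would come from each tool_use block.

```python
def build_followup_messages(messages, assistant_content, tool_results):
    """Return a new message list that carries tool results into the next turn.

    messages:          the conversation so far
    assistant_content: content blocks from the response that requested tools
    tool_results:      list of (tool_use_id, result_text) pairs
    """
    result_blocks = [
        {"type": "tool_result", "tool_use_id": tool_use_id, "content": text}
        for tool_use_id, text in tool_results
    ]
    # Build a new list so the caller's history is not mutated.
    return messages + [
        {"role": "assistant", "content": assistant_content},
        {"role": "user", "content": result_blocks},
    ]

history = [{"role": "user", "content": "What's the weather in Tokyo?"}]
assistant_blocks = [
    {"type": "tool_use", "id": "toolu_example", "name": "get_weather",
     "input": {"location": "Tokyo"}},
]
next_messages = build_followup_messages(
    history, assistant_blocks, [("toolu_example", '{"temp": 22, "unit": "c"}')]
)
```

Passing next_messages to another messages.create call (with the same tools list) lets the model read the result and either answer or request another tool.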
Tool choice modes
The tool_choice parameter controls how aggressively the model uses tools:
| Mode | Behavior | System-prompt overhead | Use when |
|---|---|---|---|
| auto (default) | Model decides whether to call a tool | ~346 tokens | Most cases; let the model judge |
| any | Model must call some tool | ~313 tokens | You know a tool is needed but not which one |
| tool | Model must call a specific named tool | ~313 tokens | Forced extraction; structured output via tool |
| none | No tools available this turn | ~346 tokens | Final summarization step after tool results |
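Each mode in the table corresponds to a small dict passed as tool_choice. The helper below is a sketch for building that value; make_tool_choice and VALID_MODES are hypothetical names, and only the "tool" mode carries a tool name.

```python
from typing import Optional

VALID_MODES = {"auto", "any", "tool", "none"}

def make_tool_choice(mode: str, tool_name: Optional[str] = None) -> dict:
    """Build the tool_choice value for one of the four modes in the table."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown tool_choice mode: {mode!r}")
    if mode == "tool":
        if not tool_name:
            raise ValueError("mode 'tool' requires a tool name")
        return {"type": "tool", "name": tool_name}
    return {"type": mode}

# Force a call to get_weather, e.g. for structured extraction.
forced = make_tool_choice("tool", "get_weather")
# Passed along as: client.messages.create(..., tools=tools, tool_choice=forced)
```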
auto/none, and 313 with any/tool (per Anthropic's tool-use docs). On top of that, every tool definition adds its own JSON schema cost. Cache the system prompt + tool definitions with cache_control: they don't change between turns, so you'll pay the cached-token price (10% of input) instead of full price.
Parallel tool calls
Claude 4.x emits multiple tool_use blocks in a single turn when the model judges them independent. Your runtime should execute them concurrently where possible, and it must return all tool_results together, in a single user message, before the next model turn:
    import asyncio

    async def run_tools(tool_uses):
        # Fire all tool calls concurrently (execute_tool must be a coroutine function here)
        coros = [execute_tool(t.name, t.input) for t in tool_uses]
        results = await asyncio.gather(*coros, return_exceptions=True)
        return [
            {"type": "tool_result", "tool_use_id": t.id,
             "content": str(r), "is_error": isinstance(r, Exception)}
            for t, r in zip(tool_uses, results)
        ]
OpenAI Function Calling
OpenAI's function calling API mirrors Anthropic's shape. Three key features to know:
- strict: true guarantees the model's output conforms to your schema. Eliminates the "LLM returned invalid JSON" failure mode entirely. Introduced August 2024.
- parallel_tool_calls: bool gives the same parallel-tool behavior as Claude. Default true; set false if your tools have side effects that must serialize.
- Responses API (2025) replaces Chat Completions for new agent code. Built-in tools (web_search, file_search, computer_use) require no setup.
    from openai import OpenAI

    client = OpenAI()

    response = client.responses.create(
        model="gpt-5",
        input="Find recent news about MCP adoption.",
        tools=[
            {"type": "web_search"},  # built-in
            {
                # Responses API function tools are flat (no nested "function" key)
                "type": "function",
                "name": "save_summary",
                "strict": True,  # schema-conformant
                "parameters": {
                    "type": "object",
                    "properties": {
                        "title": {"type": "string"},
                        "summary": {"type": "string"},
                    },
                    "required": ["title", "summary"],
                    "additionalProperties": False,  # required by strict mode
                },
            },
        ],
    )
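Strict mode places requirements on the schema itself: every object must set additionalProperties to false and list all of its properties as required. The helper below is a sketch that applies those two rules to an existing schema; make_strict and _tighten are hypothetical names, not part of the OpenAI SDK. Note that it turns optional fields into required ones, so a field that is truly optional would need a nullable type instead.

```python
import copy

def make_strict(schema: dict) -> dict:
    """Return a copy of a JSON Schema adjusted to the two strict-mode rules above."""
    strict = copy.deepcopy(schema)  # leave the caller's schema untouched
    _tighten(strict)
    return strict

def _tighten(node) -> None:
    """Recursively close every object schema and mark all its properties required."""
    if isinstance(node, dict):
        if node.get("type") == "object" and "properties" in node:
            node["additionalProperties"] = False
            node["required"] = list(node["properties"])
        for value in node.values():
            _tighten(value)
    elif isinstance(node, list):
        for item in node:
            _tighten(item)

loose_schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "meta": {"type": "object", "properties": {"lang": {"type": "string"}}},
    },
}
strict_schema = make_strict(loose_schema)
```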
Model Context Protocol (MCP) — The Tool-Use Standard
The biggest 2024–2026 development in tool use isn't a new model; it's MCP, an open protocol Anthropic released in November 2024 that has become the de facto standard way to expose tools to LLMs. By mid-2025, major IDEs (VS Code Copilot, Cursor, Zed, JetBrains), the major agent frameworks, and ChatGPT itself supported MCP servers. The practical effect: you write a tool once and any MCP-compatible agent can use it.
Without MCP:
- Reimplement each tool for each framework (LangChain, OpenAI, Anthropic SDK, Claude Code…)
- Vendor-locked tool definitions
- Auth, rate-limiting, schemas duplicated everywhere
- Tool changes require updating N clients
With MCP:
- Write a tool server once (stdio or HTTP+SSE)
- Any MCP-compatible client connects it
- Auth + caching + observability handled at protocol layer
- Tool changes ship to one server, all clients benefit
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("weather-server")

    @mcp.tool()
    def get_weather(location: str, unit: str = "c") -> dict:
        """Get current weather for a location."""
        # your real implementation here
        return {"location": location, "temp": 22, "unit": unit}

    if __name__ == "__main__":
        mcp.run()  # stdio transport; Claude Desktop / Code reads this
Once registered (in ~/.claude.json or any client's MCP config), the tool is automatically available to every agent in that client. The schema and types are inferred from the Python signature, and the description comes from the docstring.
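To make the inference step concrete, here is a simplified sketch of how a tool definition could be derived from a function using only the standard library. This is not FastMCP's actual implementation; schema_from_signature and _JSON_TYPES are hypothetical names, and only four primitive types are mapped.

```python
import inspect

# Simplified mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def schema_from_signature(func) -> dict:
    """Derive a tool definition from a function's parameters and docstring."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is mandatory
        else:
            properties[name]["default"] = param.default
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "input_schema": {"type": "object", "properties": properties, "required": required},
    }

def get_weather(location: str, unit: str = "c") -> dict:
    """Get current weather for a location."""
    return {"location": location, "temp": 22, "unit": unit}

tool_definition = schema_from_signature(get_weather)
```

The practical consequence is that type hints and docstrings become part of the tool's interface: a missing annotation or an empty docstring gives the model less to work with.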
Designing Good Tools — Heuristics That Work
The model's tool-selection accuracy depends almost entirely on how you describe and shape your tools. From Anthropic's official guidance and accumulated production experience:
- Descriptions are documentation for the model, not for you. Write what the tool does, when to use it, when not to use it, and what the output looks like. Models pay attention to all four.
- Names should be verbs or verb-noun pairs. get_weather beats weather. send_email beats email.
- Few broad tools beat many narrow ones. A single search(query, filters) outperforms search_users, search_orders, search_products as separate tools, because the model has fewer options to choose between.
- Constrain inputs aggressively. enum for known values, pattern for regex-validated strings, minimum/maximum for numbers. Strict schemas keep the model from inventing values.
- Return structured output. JSON, never prose. The next model turn parses it; humans read the trace.
- Surface errors as content, not exceptions. Return {"error": "rate_limited", "retry_after": 30} with is_error: true. The model can react to it; an exception kills the loop.
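As an illustration of the heuristics above, here is one possible definition for a single broad search tool. The entity values, field names, and limits are invented for this example; what matters is the shape: a description that covers what, when, when not, and output, plus enum, pattern, and minimum/maximum constraints.

```python
# Hypothetical tool definition; all field names and limits are illustrative.
search_tool = {
    "name": "search",
    "description": (
        "Search users, orders, or products by free-text query. "
        "Use this when the user asks to find or look up existing records. "
        "Do not use it to create or modify records. "
        "Returns a JSON object with a 'results' list and a 'total' count."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            # enum: the model can only pick a known entity type
            "entity": {"type": "string", "enum": ["users", "orders", "products"]},
            "query": {"type": "string", "description": "Free-text search terms."},
            # pattern: dates must look like YYYY-MM-DD
            "created_after": {
                "type": "string",
                "pattern": r"^\d{4}-\d{2}-\d{2}$",
                "description": "Earliest creation date, formatted YYYY-MM-DD.",
            },
            # minimum/maximum: bounds the size of the result set
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
        },
        "required": ["entity", "query"],
    },
}
```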
read_database(query) and send_slack_message(channel, body). Both succeed on the happy path. The agent occasionally sends Slack messages with database row IDs in them. What's the most likely root cause and what's the fix?
Answer: read_database returns raw rows the model treats as the answer. Fix: reshape read_database to return a summary object ({rows: [...], summary: "3 results", display_format: "table"}), and add to send_slack_message's description: "Body must be human-readable text, not raw data structures." The model picks up the cue.
Error Handling — Where Most Agents Break
The single most common production agent failure isn't bad reasoning — it's a tool that errored, returned something unexpected, and the model couldn't recover. Three rules:
- Always return, never raise. A raised exception bubbles up and the loop dies. Catch the exception, format it as a tool result with is_error: true, and let the model decide whether to retry, fall back, or give up.
- Make errors actionable. {"error": "city_not_found", "suggestion": "Try a more specific name like 'Tokyo, Japan'"} teaches the model how to recover. {"error": "500"} teaches it nothing.
- Set per-tool timeouts. A 30-second tool can deadlock a 2-second agent. Wrap every tool call in asyncio.wait_for or equivalent.
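The three rules can be combined in one wrapper. The sketch below assumes async tool functions; safe_call, slow_tool, and failing_tool are hypothetical names used only for the demonstration, and the error payload fields are one possible convention.

```python
import asyncio
import json

async def safe_call(tool_use_id, func, args, timeout_s=5.0):
    """Run one async tool and always return a tool_result block instead of raising."""
    try:
        result = await asyncio.wait_for(func(**args), timeout=timeout_s)
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": json.dumps(result)}
    except asyncio.TimeoutError:
        payload = {"error": "timeout",
                   "suggestion": f"No response within {timeout_s}s; retry or use a fallback."}
    except Exception as exc:
        # Any other failure becomes structured content the model can react to.
        payload = {"error": type(exc).__name__, "detail": str(exc)}
    return {"type": "tool_result", "tool_use_id": tool_use_id,
            "content": json.dumps(payload), "is_error": True}

async def slow_tool(delay: float) -> dict:
    await asyncio.sleep(delay)
    return {"status": "done"}

async def failing_tool() -> dict:
    raise ValueError("city_not_found")

async def demo():
    # One success, one timeout, one exception: all three come back as results.
    return await asyncio.gather(
        safe_call("id_ok", slow_tool, {"delay": 0.0}),
        safe_call("id_slow", slow_tool, {"delay": 1.0}, timeout_s=0.05),
        safe_call("id_bad", failing_tool, {}),
    )

ok, timed_out, failed = asyncio.run(demo())
```

Because safe_call never raises, the agent loop keeps running and the model sees each failure as ordinary tool output.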
cache_control; token overhead is meaningful at scale. 3) Parallel tool use is default; design for concurrent execution. 4) MCP is the standard tool-server protocol: write tools as MCP servers and they work in every agent stack. 5) Errors as structured content beat exceptions every time.
- Anthropic: Tool use overview (platform.claude.com)
- OpenAI: Function calling guide (platform.openai.com)
- OpenAI: Introducing Structured Outputs in the API (openai.com)
- Model Context Protocol: Specification (modelcontextprotocol.io)
- MCP: Python SDK (FastMCP) (github.com)