Most introductions to MCP stop at tools, resources, and prompts and call it a day. Fair enough — that's 80% of what people use. But the protocol has a second tier of features that are the difference between a tool that fires and forgets and one that feels alive while it works. Three of them are worth your time: sampling, progress, and logging. They share a theme — the server stops being purely reactive and starts talking back mid-call.
Sampling: the server borrows the model
Here's the inversion that surprises everyone. Normally the model calls the server. Sampling lets the server call the model — through the client. The server, in the middle of handling a tool call, says "I need a completion for this," and the client runs it on whatever model the host is using.
Why would a server want that? Because some tools are themselves little reasoning tasks. A "summarize this database row" tool needs an LLM to do its job. Without sampling, the server author has to bring their own API key, their own model, their own billing. With sampling, the server stays model-agnostic: it asks the client to do the inference, and the client uses the user's existing model and credentials. The server never sees a key.
The flow has a human in it by design. A sampling request goes to the client, the client (per the spec) is expected to let the user review and approve what gets sent and what comes back, then the completion returns to the server. That approval step exists precisely because "a third-party server can trigger model calls" is exactly the kind of capability you don't hand over blind.
The 2025-11-25 revision made sampling considerably more powerful by adding tool support to it — tools and toolChoice parameters on a sampling request. That sounds small. It isn't. It means a server can ask the client to run an LLM that can itself call tools — a server-initiated agentic loop. The server becomes an orchestrator that rents reasoning from the client on demand. Whether that's elegant or a recipe for confusing control flow depends entirely on how carefully you use it.
Progress: don't make the user stare at nothing
A tool that takes thirty seconds with no feedback feels broken even when it's working. Progress notifications fix that. When a client makes a request, it can attach a progressToken. The server then emits notifications/progress messages tagged with that token as it works — "processed 40 of 200 files" — and the host can render a live bar instead of a spinner that might as well be a freeze.
It's a one-way street: the server pushes, nobody waits for an ack. That's what makes it cheap. The mechanism is just notifications keyed to the token the client handed out, so the host knows which in-flight request each update belongs to.
# inside a FastMCP tool, using the injected context
@mcp.tool()
async def index_docs(folder: str, ctx: Context) -> str:
files = list_files(folder)
for i, f in enumerate(files):
embed(f)
await ctx.report_progress(i + 1, len(files)) # -> notifications/progress
return f"Indexed {len(files)} files"
ctx is FastMCP's handle for talking back to the client mid-call. report_progress is the wrapper over the raw notification. The model doesn't see these updates; the user does. That distinction matters — progress is a UX channel, not a reasoning channel.
Logging: structured server-side messages
The same ctx carries logging. A server can emit leveled log messages — debug, info, warning, error — over notifications/message, and the host can surface them in a console or a debug pane. This is how you get visibility into what a remote server is doing without SSH-ing into the box it runs on.
await ctx.info("cache miss, fetching from upstream")
await ctx.warning("upstream slow, falling back to stale copy")
Two rules keep this from biting you. Set log levels honestly so a host can filter, and never log secrets — those messages can travel to a client and a UI you don't control. A server logging an API key at debug is a leak waiting for someone to turn debug on.
Tasks: the long-running upgrade
Progress notifications keep a user informed during a call that's still holding the connection open. But what about work that takes minutes, or that should survive a dropped connection? The 2025-11-25 spec added experimental tasks for exactly this: a way to track durable requests, poll their status, and retrieve a result later instead of blocking on one long-held request.
You can picture any of these slow operations as a small state machine:
That NeedsModel detour is sampling; the self-loop is progress; the terminal states are where tasks let the client come back later. Tasks shipped as experimental, which means treat it as a moving target — but the direction is clear: MCP is growing the vocabulary for work that doesn't finish in one round-trip.
When to bother
Skip all of this for a simple, fast, local tool. A function that returns in 200ms doesn't need progress, doesn't need to borrow the model, and has nothing interesting to log. Reaching for these features on a trivial tool is the same over-engineering reflex that makes people wrap one function in a whole server.
Reach for them when the work is slow (progress), when the tool genuinely needs reasoning it shouldn't pay for itself (sampling), or when you'll be debugging a server you can't easily attach to (logging). Used well, they're the difference between a tool that works and a tool that's pleasant to use. Used reflexively, they're just more surface to get wrong.
Leave a Reply
Your email address will not be published.