MCP·Jan 15, 2026·5 minmcp llm engineering

MCP Clients and Host Integration

We built a server last time. It worked, in the sense that the Inspector could call it. But a server with nothing connected to it is a vending machine in an empty room. The interesting half of MCP is the side that consumes — and that side has more moving parts than people expect, mostly because the words are sloppy.

Let's fix the words first.

Host, client, server — three roles, not two

People say "MCP client" to mean roughly everything that isn't the server. The spec is more precise, and the precision pays off the moment something breaks.

A host is the application the user actually sits in front of: Claude Desktop, an IDE, a chat product, your own agent app. The host owns the model, the conversation, and — crucially — the trust decisions. Should this tool be allowed to run? The host answers that.

A client is a connector inside the host. Here's the part that trips people up: it's one client per server, not one client total. If your host talks to three MCP servers, it spins up three clients, each holding a single stateful session to one server. The host orchestrates them; the clients each mind their own connection.

A server is what we built — the thing exposing tools, resources, and prompts.

A host containing three clients, each connected to one MCP server
One client per server: the host orchestrates, each client owns a single connection.

Why bother with this distinction? Because the host is the security boundary. The model can want to call a tool all day; the client passes that request to one specific server, and the host decides whether it's allowed. Mixing up "the model asked" with "the tool ran" is where a lot of careless thinking about MCP safety starts.

Wiring a server into a host

The most common host in early 2026 is still a desktop or IDE app, and most of them configure servers with a JSON file that lists what to launch. Claude Desktop's looks like this:

{
  "mcpServers": {
    "weather-demo": {
      "command": "python",
      "args": ["/abs/path/to/weather_server.py"]
    },
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/Users/me/projects"]
    }
  }
}

Two servers, two clients. The host reads this, launches each command as a subprocess, and speaks stdio to it. Restart the host and the weather tool plus a read-only filesystem server show up in the tool list. The filesystem entry is one of the official reference servers — there's a small catalog of them (Git, Fetch, Memory, and friends) precisely so you don't write the boring ones yourself.

VS Code and other IDEs use the same shape with their own filename and a bit more UI around it. The pattern is identical: declare a launch command for stdio servers, or a URL for remote Streamable HTTP servers. The Claude API itself accepts an mcp_servers parameter, which means "the host" doesn't have to be a GUI at all — it can be your own backend calling the model and handing it servers programmatically.

Discovery is dynamic, and that's the point

Here's a property that separates MCP from baking tools into a prompt: the host doesn't hardcode what the server can do. It asks. After connecting, the client calls tools/list, resources/list, and prompts/list, and whatever comes back is what the model sees this session.

That has a real consequence. A server can change its tools and the host picks it up — the spec even defines a notifications/tools/list_changed message so a server can tell connected clients "my capabilities changed, re-fetch." A deploy that adds a tool doesn't require touching every host. The menu updates itself.

The flip side: you're trusting the server to describe itself honestly, every session. A server that quietly swaps a benign tool's description for a malicious one between connects is a known attack pattern ("rug pull"), and dynamic discovery is exactly what enables it. Convenience and exposure, same mechanism.

Building your own host

You don't have to use someone else's app. The Python and TypeScript SDKs both ship a client side, and standing up a minimal host is a short loop:

  1. Start a session to a server (stdio or HTTP).
  2. Call list_tools() and feed those schemas to the model as available functions.
  3. When the model emits a tool call, route it through the client's call_tool().
  4. Return the result to the model and continue the conversation.

That loop is the host. Frameworks like LangChain and LangGraph wrap it with adapters so MCP tools drop into an existing agent graph, but underneath it's those four steps. If you've written a function-calling loop before, you've written 90% of an MCP host already — the only new part is that the tools arrive over a connection instead of being defined inline.

The part nobody warns you about

Connect five servers, each with eight tools, and your model now sees forty tools at once. Accuracy drops. Tool selection gets fuzzy. The model starts reaching for plausible-but-wrong tools because the menu is too long to reason over cleanly. This isn't an MCP flaw — it's the same tool-sprawl problem any agent hits — but MCP makes it easy to over-connect, which means it's easy to walk into.

So the integration question isn't "which servers can I add." It's "which servers does this host actually need for the jobs it does." A host wired to two well-chosen servers will outperform one wired to ten, and it'll be far easier to reason about when a tool call goes sideways. Restraint is the feature here. The protocol gives you a thousand servers to plug in; your job is to plug in the three that matter.

Leave a Reply

Your email address will not be published.