I Built 3 MCP Servers. The Critics Are Right — About the Wrong Thing.

The 'CLIs beat MCP' discourse is arguing against a misuse pattern, not the protocol itself. A rebuttal grounded in the actual spec.


There's a recurring argument in the AI tooling discourse:

CLIs beat MCP. Zero context overhead, compose with pipes, testable in seconds. MCP servers bloat your context window, crash randomly, add dependencies. Just wrap your API in a CLI and move on.

I've built three MCP servers in the past few months — a REST API validator that tracks findings across audits, a financial simulation server for portfolio scenarios, and MoltNet's identity server with OAuth and per-user resources. The critique isn't wrong about bad MCP servers. It's wrong about what the protocol actually specifies.

The context overhead complaint is real. The diagnosis isn't.

Yes, dumping 40+ tool schemas into an agent's context is wasteful. But that's a design decision, not a protocol constraint.

The MCP spec (2025-11-25) supports cursor-based pagination on tools/list. Clients don't have to load the full schema set at initialization — they can page through tools on demand, or maintain a subset relevant to the current task. The listChanged notification means the server can push updates when the available tool set changes, so clients don't have to poll.
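As a sketch, a client honoring the spec's cursor contract looks something like this — `send` stands in for a real JSON-RPC transport (an assumption, not a real SDK call), and the types are trimmed to the fields that matter:

```typescript
interface Tool { name: string; description: string }
interface ToolsPage { tools: Tool[]; nextCursor?: string }

// Page through tools/list using the opaque cursor defined by the spec.
// A client could just as easily stop after the first relevant page.
async function listAllTools(
  send: (method: string, params: object) => Promise<ToolsPage>,
): Promise<Tool[]> {
  const all: Tool[] = [];
  let cursor: string | undefined;
  do {
    // Each request carries the cursor returned by the previous page.
    const page = await send("tools/list", cursor ? { cursor } : {});
    all.push(...page.tools);
    cursor = page.nextCursor; // undefined signals the last page
  } while (cursor);
  return all;
}
```

Nothing forces the client to drain the whole list — the same loop, capped at one iteration, is "load a task-relevant subset."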

A fair caveat: not all MCP clients support notifications or pagination yet. The official client feature matrix shows uneven adoption across features. But that's an implementation gap, not a protocol gap — the spec defines these capabilities; clients just need to catch up.

On the client side, Anthropic already ships a Tool Search Tool: tools are registered with defer_loading: true and only loaded into context when the model discovers them via search. A typical multi-server setup (GitHub, Slack, Sentry, Grafana) drops from ~55K tokens of tool definitions to the 3-5 tools actually needed per request.

Claude Code exposes this directly via ENABLE_TOOL_SEARCH — set it to auto and tool search kicks in when MCP tools exceed 10% of the context window. If this pattern becomes standard — and Anthropic tends to set the pace — the "context bloat" argument against MCP evaporates entirely.

More importantly: the spec defines outputSchema on tool definitions — a typed JSON Schema for what the tool returns. This is exactly what the CLI proponents praise when they say "add --json and your agent gets parseable data." MCP makes it a formal contract instead of a convention. If you're getting context bloat from a well-designed MCP server, the problem is the server's description quality, not the wire format.
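For illustration, here's what that contract looks like on a hypothetical validator tool — the tool itself is invented for this sketch; the `inputSchema`/`outputSchema` fields are the spec's:

```typescript
// A minimal tool definition where the return shape is a typed JSON Schema
// contract rather than a prose convention. The tool is illustrative only.
const tool = {
  name: "validate_endpoint",
  description: "Validate a REST endpoint against recorded findings.",
  inputSchema: {
    type: "object",
    properties: { url: { type: "string" } },
    required: ["url"],
  },
  // outputSchema: the formal equivalent of a CLI's `--json` promise.
  outputSchema: {
    type: "object",
    properties: {
      passed: { type: "boolean" },
      findings: { type: "array", items: { type: "string" } },
    },
    required: ["passed", "findings"],
  },
};
```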

The same discipline applies to both: a CLI's --help text that's vague produces hallucinations. A tool description that's vague produces hallucinations. Schema quality is the variable that matters, and it's independent of whether you're running bash or JSON-RPC.

CLIs are excellent for personal automation. That's a different problem.

This is the concession that the discourse keeps skipping. If you're wiring your own calendar, iMessage, and security cameras together on a MacBook — CLIs are simpler. You configure credentials once, the agent has ambient access, you're done. No argument. I still use the plain GitHub CLI wrapped in Claude Code skills for my most common flows — the GitHub MCP server dumps too many tool schemas into context, and I haven't bothered tuning ENABLE_TOOL_SEARCH for discoverability yet.

But notice what that setup requires: a single user, a single machine, pre-baked credentials in the environment. Scale it to any of the following and the model breaks:

  • Multiple users accessing the same backend
  • An agent operating on behalf of a user whose identity the server needs to verify
  • Access control that varies per session (you can see your diaries; I can see mine)
  • Auditable, scoped, revocable authorization

A CLI has no concept of who is calling it. It runs as the process owner. For personal automation that's fine — you are the process owner. For a hosted service gating an API on behalf of authenticated users, you've just described a security hole.

This isn't hypothetical — MoltNet runs multiple agents against the same server, each with its own identity and scoped access. It's also the basic shape of any multi-user backend.

Resources are the most underused primitive — because most people haven't built what they're designed for.

The criticism I hear most often: "resources look like experiments, I don't know anyone getting value from them." This is survivor bias. The MCP ecosystem bootstrapped itself on local stdio servers — filesystem plugins, git tools, browser automation. For those use cases, resources are indeed redundant: you're already running as the user's identity with ambient access to their machine.

My first approach with MoltNet was the obvious one: expose the REST API parameters as tool inputs and let the agent figure it out. It worked — technically. The agent would call diaries_list, pick a diary, call entries_search with the right filters, and stitch the results together. Three tool calls, auth on each, and the agent guessing its way through query parameters it had never seen before. It got the right answer maybe 70% of the time.

Then I added resources. moltnet://identity doesn't return an identity — it returns your identity. The OAuth token in the current session determines which user the server resolves.

moltnet://entries/recent returns the ten most recent diary entries across all diaries the authenticated user can access. One URI, session-scoped data, no guesswork. The agent reads it and gets back exactly what it needs. The three-tool-call dance I'd built before became a single resource read.
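A minimal sketch of why that works — the function and store names are illustrative, not MoltNet's actual code, but the shape is the point: the URI carries no parameters, the session does the scoping:

```typescript
interface Session { userId: string }

// Stand-in data store, keyed by the authenticated principal.
const entriesByUser: Record<string, string[]> = {
  alice: ["2024-06-01: shipped auth", "2024-05-30: fixed pagination"],
  bob: ["2024-06-01: reviewed PR"],
};

// The same URI resolves to different data depending on who is asking.
function readResource(uri: string, session: Session): string[] {
  if (uri === "moltnet://entries/recent") {
    return entriesByUser[session.userId] ?? [];
  }
  throw new Error(`unknown resource: ${uri}`);
}
```

The agent never constructs a query; the server's view of the session is the query.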

The spec calls resources application-driven — the host application determines how to incorporate context based on the current session. A CLI cannot do this without manually reimplementing session-scoped auth.

The reason resources look unused is that most MCP integrations so far are developer tools built for the local filesystem use case. That's not the use case resources were designed for.

"It should have been tools-only and stateless."

This argument comes from people who've built excellent stateless integrations — observability SaaS, developer tooling, single-purpose APIs. For those, tools-only is the right shape. The critique holds within that frame.

But designing the protocol around that constraint would have made it useless for the more interesting case: an agent operating as an authenticated principal on a multi-user system, where the same request means something different depending on who's asking.

Statelessness is a virtue when you don't care about identity. The moment you do — the moment authorization, ownership, and audit matter — you need session context. Designing that away isn't simplification. It's removing a capability because your use case didn't need it.

I agree with the critics on one thing: the statefulness of the transport itself is a pain. Sticky sessions, distributed session storage, reconnection logic — I maintain a fork of @platformatic/mcp partly because the session management surface area is large enough to need its own plugin. I'd be happy to shed that complexity.

The MCP team is actively rethinking how state lives in the protocol. The direction: make the protocol itself stateless while keeping applications stateful — session context moves from implicit transport-layer concerns to explicit application-level handling, like HTTP cookies. This validates both sides: the critics are right that protocol-level statefulness creates operational friction. The spec authors are right that the capability matters — they're just moving it to the right layer.

Prompts are server-defined workflows with a bad name

Prompts took the most damage from bad marketing. "Reusable prompt templates" sounds like a prompt engineering gimmick, which is partly why they're half-dismissed as a dead feature.

What they actually are: server-defined workflows. I learned this the hard way. MoltNet agents need to set up their identity when they first connect — check if they have a whoami entry, create one if not, same for a soul entry. I originally encoded this as a client-side skill: a markdown file with step-by-step instructions the agent would follow. It aged like milk. Every time the API changed, the skill broke. Different agents interpreted the steps differently. Some skipped the soul entry entirely.

So I moved it server-side as an MCP prompt. identity_bootstrap checks the state, creates what's missing, and returns structured guidance. The workflow lives where the knowledge lives. Agents call it by name and get consistent results regardless of which client they're running on. Better yet, the server's tool responses hint at available prompts when they're relevant — so agents discover and follow the right workflow without anyone encoding it client-side. It becomes an enforced pattern, not a suggestion.
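In sketch form — the state shape and messages here are illustrative, not the real identity_bootstrap — the server inspects what exists and returns only the steps that remain:

```typescript
interface AgentState { hasWhoami: boolean; hasSoul: boolean }

// Server-side workflow: the logic lives where the domain knowledge lives,
// so every client gets the same steps instead of reinterpreting a skill file.
function identityBootstrap(state: AgentState): string[] {
  const steps: string[] = [];
  if (!state.hasWhoami) steps.push("Create a whoami entry describing this agent.");
  if (!state.hasSoul) steps.push("Create a soul entry with the agent's persona.");
  if (steps.length === 0) steps.push("Identity already bootstrapped; nothing to do.");
  return steps;
}
```

When the API changes, this function changes with it in the same deploy — the failure mode of the stale markdown skill disappears.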

Whether the word "prompts" survives is a naming question. The primitive — server-encoded workflows that clients can invoke by name — works anywhere the server has domain knowledge the client shouldn't have to replicate.

Authentication is where the real gap is

The spec supports OAuth 2.1 with resource indicators (RFC 8707). For interactive clients, that's the authorization code flow with PKCE. I wrote about what it actually takes to wire this up — the hardest part wasn't the spec itself, it was discovering that every MCP client has its own interpretation of OAuth 2 and MCP auth. Claude Code, ChatGPT, Cursor — each had quirks that required workarounds.
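The PKCE half, at least, is mechanical: a high-entropy verifier and its SHA-256 challenge (RFC 7636), sketched here with Node's crypto module:

```typescript
import { createHash, randomBytes } from "crypto";

// RFC 7636 base64url: standard base64 with URL-safe alphabet, no padding.
function base64url(buf: Buffer): string {
  return buf.toString("base64").replace(/\+/g, "-").replace(/\//g, "_").replace(/=+$/, "");
}

function pkcePair() {
  const verifier = base64url(randomBytes(32)); // 43-char high-entropy verifier
  const challenge = base64url(createHash("sha256").update(verifier).digest());
  // challenge goes in the authorize request; verifier in the token exchange
  return { verifier, challenge };
}
```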

For machine-to-machine scenarios — agents with their own cryptographic identity, no human in the loop — the client_credentials flow is defined as an official extension outside the core spec. In practice, no major MCP client natively supports it yet.

So I hacked around it. In .mcp.json, each agent's server entry sends X-Client-Id and X-Client-Secret as custom headers, using ${VARIABLE} placeholders that Claude Code resolves from the env field in .claude/settings.local.json — keeping secrets out of version control while still per-agent. On the server side, a Fastify onRequest hook intercepts those headers, performs the client_credentials exchange, and injects a Bearer token before the MCP layer even sees the request.
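The hook reduces to something like this framework-agnostic sketch — `exchangeClientCredentials` is a placeholder for the real token-endpoint call, and the header names match the setup above:

```typescript
type Headers = Record<string, string | undefined>;

// Placeholder: in practice, POST grant_type=client_credentials to the
// authorization server's token endpoint and return the access_token.
async function exchangeClientCredentials(id: string, secret: string): Promise<string> {
  return `token-for-${id}`;
}

// Swap the custom credential headers for a Bearer token before the
// MCP layer sees the request.
async function injectBearer(headers: Headers): Promise<Headers> {
  const id = headers["x-client-id"];
  const secret = headers["x-client-secret"];
  if (id && secret) {
    headers.authorization = `Bearer ${await exchangeClientCredentials(id, secret)}`;
    // Drop the raw secret so downstream handlers never see it.
    delete headers["x-client-secret"];
  }
  return headers;
}
```

In the actual server this runs inside a Fastify onRequest hook; the sketch keeps only the exchange logic.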

It works, but it shouldn't require a workaround — this is the gap that needs closing.

An agent that proves its own identity, obtains a scoped token, and interacts with a server without a human in the loop — that's the foundation for autonomous agent networks. A CLI running a bash script has no concept of delegated, scoped, auditable authorization. It runs as whoever owns the process. Fine for personal tools. Not a model for an agent ecosystem.

What the critique actually says

The honest version of the "CLIs beat MCP" argument is: most MCP servers I've seen are local stdio processes with too many tool descriptions and no auth, for which a CLI would have been simpler. That's probably true of most MCP servers so far.

It doesn't follow that the protocol is wrong. It follows that the ecosystem started with local developer tools and the hosted, authenticated use cases haven't shipped widely yet. When they do, they'll look nothing like a pile of shell scripts. They'll look like what MCP was designed for: a server that knows who's calling, owns data on behalf of users, encodes domain workflows, and gates access through standard OAuth.

That's not bloat. That's architecture.

Mic drop
