MCP Server Development Services: What to Scope Before You Build

A buyer-grade MCP server development playbook: when to build, what tools to expose, how to handle OAuth, approvals, logs, and evals.

Saturday, June 20, 2026Omid Saffari
MCP Server Development Services: What to Scope Before You Build

Build an MCP server only when an AI system needs controlled access to one useful business capability. If the request is just "connect our whole API to ChatGPT," scope a smaller tool surface first: the server, permissions, approvals, logs, and evaluation tests are the product.

The Verdict: Build One Narrow Capability, Not A Mirror Of Your API

MCP server development is worth buying when an AI feature needs a controlled way to use your business systems. MCP, short for Model Context Protocol, is an open-source standard for connecting AI applications to external systems such as data sources, tools, and workflows. That makes it useful, but it does not make the scope obvious.

The mistake is treating MCP as an API-export project. A billing product with dozens of endpoints does not need its entire API turned into AI-callable tools. It needs a small task-level surface that a user can understand, approve, and audit. For example: "find overdue invoices and draft a reminder" is a useful MCP tool surface. "Expose the entire billing API" is a risk surface.

Use MCP server development when one of these is true:

  • Your product has data or actions that buyers now expect to reach from AI assistants, copilots, or internal agents.
  • Your team needs one shared integration layer instead of rebuilding tool calls separately for ChatGPT, Claude, Cursor, Codex, or internal agent code.
  • The workflow has clear permissions, predictable inputs, and an obvious human approval point before any sensitive action.
  • The value is in the business rule, not in giving the model more raw access.

Skip it, or delay it, when the workflow is still vague. If nobody can name the user, the allowed action, the denied action, the approval rule, and the success test, an MCP server will only make the ambiguity easier to call from more places. A normal API, admin screen, or internal automation may be the better first build.

For broader agent scope, start with the same control logic we use in AI Agent Development Services: What to Scope Before You Build: one workflow, one owner, explicit tool permissions, logs, and a handoff path. MCP is the connection layer. The product is the bounded feature around it.

What An MCP Server Actually Exposes

An MCP server exposes selected capabilities to an MCP client, which sits inside an AI host such as a desktop assistant, IDE, chatbot, internal agent, or product feature. Cloudflare's MCP documentation breaks the roles down plainly: hosts are the AI applications, clients are embedded in those hosts, and servers expose tools, prompts, and resources that clients can use.

The buyer-level distinction matters because each surface has a different risk profile.

MCP surfaceWhat it gives the AI systemBuyer rule
ToolsCallable actions such as querying a database, calling an API, creating a record, or running a calculationTreat every tool as a permissioned operation with inputs, outputs, limits, and approval rules
ResourcesContext such as files, database schemas, customer records, policy docs, or application-specific data identified by a URITreat resources as read access and decide what can be selected automatically vs explicitly
PromptsStructured message templates and instructions that clients can discover and useTreat prompts as workflow shortcuts, not hidden policy

The official MCP tools spec says tools can be invoked by language models and are identified by a unique name plus schema metadata. That means the shape and description of the tool are not implementation details. They are part of the product interface. A vague tool named update_customer invites the model to guess. A narrow tool named draft_customer_followup with fields for customer ID, reason code, tone, and approval status is easier to evaluate and safer to ship.

Resources are different. The MCP resources spec describes them as a standard way for servers to share context such as files, database schemas, or application-specific information. A resource should answer, "what does the AI need to know?" Tools answer, "what is the AI allowed to do?"

Prompts are different again. The MCP prompts spec describes prompt templates as structured messages and instructions that clients can discover, retrieve, and customize with arguments. A prompt can help a support lead run "summarize refund risk" or help a sales operator run "prepare renewal brief," but it should not smuggle permissions that the tool layer refuses to state.

The First Scope Should Fit On One Page

The first MCP server should fit on a one-page scope because the first job is to prove a controlled capability, not a platform. A useful first build names the user, the system, the allowed tools, the denied operations, the approval point, the logs, and the evaluation tests.

Here is a practical reference scope for an operations team.

Workflow: An ops lead wants an AI assistant to review overdue invoices and prepare follow-up tasks in the CRM.

Good first MCP server:

  • search_overdue_invoices, read invoices by customer ID, invoice status, and days overdue.
  • draft_payment_reminder, generate a draft message with invoice references and account context.
  • create_followup_task, create a CRM task only after a human approves the draft.

Do not include in v1:

  • refund_invoice
  • change_payment_terms
  • delete_customer
  • send_email_without_review
  • Any tool that accepts arbitrary SQL, arbitrary URLs, or free-form API paths.

That first scope is commercially useful because the buyer can evaluate it. Did the assistant find the right invoices? Did it avoid paid invoices? Did it draft a reminder in the right tone? Did it ask for approval before creating a task? Did the log show who approved what?

  1. Write the workflow sentence

    Use one sentence: "The assistant helps [role] do [task] in [system] with [approval point]." If the sentence needs five commas, the scope is too broad.

  2. Define the tool allowlist

    Name the exact tools, inputs, outputs, and denied operations. Prefer task-level tools over raw API wrappers.

  3. Add the approval boundary

    Mark which tools are read-only, which tools create drafts, and which tools require human approval before changing a system of record.

  4. Create the eval set

    Write a small fixed set of realistic tests before the demo: happy path, missing customer, paid invoice, duplicate invoice, wrong tenant, permission denied, tool timeout, and approval rejected.

Cloudflare's MCP best practices line up with this scope: do not wrap a full API schema, use fewer well-designed tools, narrow the permissions, write detailed parameter descriptions, and run evaluation tests after updates. That is the buyer's checklist. If a vendor cannot show those artifacts, the demo is not enough.

Local vs Remote Is A Security Decision

Local vs remote MCP is not a preference setting. It decides where the tool runs, who can reach it, how identity is proven, and how much audit evidence the system needs.

The current MCP transport specification says MCP uses JSON-RPC messages encoded as UTF-8 and defines two standard transports: stdio and Streamable HTTP. In practice, local MCP uses stdio, where the host launches a local server process and exchanges messages through standard input and output. Remote MCP uses Streamable HTTP, where clients connect to a hosted server over the Internet.

DecisionLocal MCP over stdioRemote MCP over Streamable HTTP
Best fitDeveloper tools, local files, private machine context, single-user workflowsShared product capabilities, internal tools, customer-facing assistants, multi-user access
IdentityUsually inherited from the local machine or host setupMust be explicit through authentication and authorization
Security riskLocal data exposure, local process trust, host configuration mistakesOAuth scope design, tenant isolation, rate limits, audit logs, exposed endpoint risk
Operational workPackaging, host installation, local debuggingHosting, auth, monitoring, versioning, logging, incident response

Cloudflare describes the same split: remote MCP connections use Streamable HTTP and OAuth authorization, while local MCP connections use stdio on the same machine. The buyer rule is simple. If more than one user, assistant, or customer needs the capability, treat it as remote software with real auth, observability, and release management. If the capability is only for a developer's local environment, local MCP may be enough.

Streamable HTTP also matters because the MCP specification says it replaces the older HTTP+SSE transport from the 2024-11-05 protocol version. That does not mean every legacy server disappears overnight, but it does mean a new build should not start from the old transport unless a host requirement forces it.

Permissions, Approvals, And Tool Filtering Are The System

The control layer is not a nice-to-have around an MCP server. It is the system a buyer is paying for.

The MCP tools spec is explicit that tools are model-controlled, meaning a language model can discover and invoke them based on context and the user's prompt. The same spec says applications should provide UI that shows which tools are exposed, clear indicators when tools are invoked, and confirmation prompts for operations so a human stays in the loop.

That is the right default for business systems. A model can suggest. A person or policy should decide when a sensitive tool call actually changes money, customer status, permissions, production data, or outbound communication.

For OpenAI-based builds, the Agents Python SDK now exposes the same design choices directly. It supports hosted MCP server tools, Streamable HTTP MCP servers, HTTP with SSE MCP servers, and stdio MCP servers. It also supports approval policies through require_approval and an on_approval_request callback, plus tool filtering, tool-list caching, and tracing.

Use those controls deliberately:

  • Tool filtering: expose only the tools the current role and workflow need.
  • Approvals: require approval for write actions, outbound messages, refunds, permission changes, and irreversible operations.
  • Metadata: pass tenant IDs, user IDs, trace IDs, and policy context with tool calls where the runtime supports it.
  • Caching: cache tool lists only when tool definitions are stable and safe to reuse.
  • Tracing: log tool listing, tool calls, inputs, outputs, approvals, denials, and errors in a way a human can review.

Cloudflare's authorization guide adds the remote-server side of the same control layer. It says MCP authorization uses a subset of OAuth 2.1 so users can grant limited access without sharing API keys or credentials. It also describes options such as Cloudflare Access, third-party OAuth providers like GitHub or Google, existing auth providers such as Auth0 or WorkOS, and a Worker-handled authorization flow.

For a buyer, this translates into one hard requirement: no shared admin token as the product. A prototype can start with a simple token in a private environment. A production MCP server needs scoped identity, consent or policy-based access, and logs that show which user allowed which action.

What Breaks After The Demo

The demo usually breaks later because the tool surface was designed for the happy path. Month two exposes the missing decisions: ambiguous tools, stale schemas, weak tenant context, vague errors, no approval handoff, and no regression tests.

Common failures look like this:

FailureWhat it looks likeFix before launch
Oversized toolsOne tool accepts free-form instructions or arbitrary API pathsSplit into task-level tools with tight schemas
Weak descriptionsThe model picks the wrong tool or passes invalid fieldsWrite parameter descriptions with constraints and examples
Missing tenant contextA tool can read or act across the wrong customer/accountRequire tenant ID, role, and policy context on every call
No approval pathDraft actions silently become live actionsSeparate draft, approve, and execute tools
Vague errorsThe model retries a bad call or gives a confident wrong answerReturn structured, model-visible errors with safe next steps
No evalsA tool update changes behavior without anyone noticingRun fixed tests after every schema, prompt, or permission change

The fastest way to de-risk this is to write the evals before the final polish. A good v1 eval set does not need to be large. It needs to cover the cases that would embarrass the business: wrong account, missing permission, outdated data, duplicate action, denied approval, and external-system outage.

Here is the standard we would use for the overdue-invoice example:

  • The assistant must not show invoices outside the user's tenant.
  • The assistant must not create a CRM task until approval is granted.
  • The assistant must label missing invoice data as missing, not infer it.
  • The assistant must not send or schedule the reminder in v1.
  • The assistant must log the invoice IDs used in the draft.
  • The assistant must return a safe error when the billing API times out.

This is also where a buyer should be clear about the assistant vs agent boundary. If the workflow mainly needs retrieval, drafting, and human review, an assistant may be enough. If it needs tool use across systems, approvals, and state changes, use the decision rule in AI Assistant vs AI Agent: Which Should You Build First? before expanding the MCP surface.

Build vs Use An Existing MCP Server

Use an existing MCP server when the job is a common SaaS surface and the vendor server already gives you the permission model you need. Build a custom MCP server when the value is proprietary workflow logic, custom data rules, or a product capability your customers need to use through AI hosts.

GitHub's MCP server is a useful example. GitHub offers a hosted remote server and a local server, and its documentation describes toolsets that control groups of capabilities such as repos, issues, pull_requests, actions, and code_security. That is the shape to copy: grouped capability surfaces, not an indiscriminate endpoint dump.

The buy-vs-build rule:

  • Use vendor MCP for commodity systems where the vendor's permission model matches your workflow.
  • Build custom MCP for internal systems, proprietary products, regulated workflows, multi-tenant data, or customer-facing capabilities.
  • Build a wrapper only when it adds real safety or business logic, not just a new protocol around the same broad API.
  • Delay MCP when a normal UI or one-off workflow automation solves the problem with less operational load.

For a SaaS product, the strongest custom use case is not "we have MCP." It is "our customers can ask their AI assistant to perform a narrow, safe action in our product without sharing API keys or leaving an audit trail gap." That can become a product feature. It still starts with one controlled capability.

The Delivery Checklist

A fixed-scope MCP server build should leave the buyer with artifacts they can operate after launch. Code is only one deliverable.

Ask for these deliverables:

  • One-page workflow scope with user role, system of record, allowed actions, denied actions, and approval rules.
  • Tool inventory with names, descriptions, input schemas, output schemas, error shapes, and examples.
  • Resource inventory with data classes, tenant boundaries, retention assumptions, and selection rules.
  • Prompt inventory if the server exposes workflow prompts.
  • Auth design with OAuth scopes or equivalent policy, token handling, and user mapping.
  • Approval design for every write action or sensitive read.
  • Audit-log schema for tool calls, approvals, denials, errors, and acting user.
  • Evaluation set with the pass/fail cases that must run after changes.
  • Deployment mode decision: local stdio, remote Streamable HTTP, or both.
  • Handoff guide for adding, removing, or deprecating tools.

The practical v1 should be boring in the right places. One endpoint for MCP, a few tools, narrow scopes, readable logs, explicit approvals, and a test set that catches the obvious failure modes. If that proves useful, v2 can add more tools. If v1 cannot prove business value with a small surface, a larger MCP server will not fix it.

What is MCP server development?

MCP server development is the design and build of a server that exposes selected tools, resources, or prompts through Model Context Protocol so an AI host can use a business system in a controlled way.

Is MCP the same as API integration?

No. MCP can sit on top of APIs, but the server should expose task-level capabilities that are safe for an AI system to use. A raw API integration moves data. A good MCP server defines what the AI can do, what it cannot do, and when a human must approve.

Should an MCP server be local or remote?

Use local MCP when the capability runs on one machine, such as a developer tool or private local context. Use remote MCP when multiple users, assistants, or customers need the capability, and budget for OAuth, tenant isolation, logs, monitoring, and version control.

Does MCP make an AI agent safe?

No. MCP standardizes access. Safety comes from narrow tools, scoped permissions, approval prompts, audit logs, structured errors, and evals that keep working after the demo.

How much should the first MCP server include?

The first build should include one workflow, a small allowlist of tools, explicit denied operations, an approval path for sensitive actions, logs, and a fixed eval set. Add breadth only after the first workflow is reliable.

Last Updated

Jun 20, 2026

CategoryAI Features

More from AI Features

View all AI Features articles
Newsletter

One letter, every Sunday. Working systems — not hot takes.

Build logs, working systems, and field notes from running a portfolio of AI ventures. Sent weekly, never more.

Weekly. No spam. Unsubscribe anytime.