Building MCP servers with FastMCP: The good, the bad and the ugly
If you have spent any time building tools for LLM agents recently, you have almost certainly run into the Model Context Protocol (MCP) and, close behind it, FastMCP. MCP is the open standard that connects language models to your tools and data; FastMCP is the framework that makes building those connections pleasant. In fact, FastMCP 1.0 was folded into the official MCP Python SDK back in 2024, and the maintained standalone project now powers a large share of the MCP servers running in the wild. This post walks through what FastMCP is good at, where it falls short, whether it is ready for production, and a few questions that come up constantly: can you just wrap an existing API, what happens if you would rather write in JavaScript or Go, and what should you worry about from a security standpoint.
What FastMCP actually gives you
The appeal of FastMCP is that it collapses a lot of protocol boilerplate into ordinary Python. You decorate a function as a tool, and the framework generates the JSON schema, validates inputs, and produces the documentation the model sees. On the client side, you point at a URL and transport negotiation, authentication, and the protocol lifecycle are handled for you. The framework organizes everything around three pillars: servers that expose tools, resources, and prompts; clients that connect to any MCP server whether local or remote; and apps that render interactive UIs directly in the conversation. For most teams the server side is the entry point, and the experience of going from a plain function to a working tool in a few lines is genuinely the reason FastMCP became the default.
Here is a complete, working server. Save it as server.py, install FastMCP with pip install fastmcp, and run it with python server.py:
from fastmcp import FastMCP

mcp = FastMCP("Weather Demo")


@mcp.tool
def get_forecast(city: str, days: int = 1) -> dict:
    """Return a short weather forecast for a city."""
    # In a real server you would call a weather API here.
    return {"city": city, "days": days,
            "summary": f"{days}-day outlook for {city}: sunny."}


@mcp.tool
def convert_temp(value: float, to: str = "C") -> float:
    """Convert a temperature between Celsius and Fahrenheit."""
    if to.upper() == "F":
        return round(value * 9 / 5 + 32, 1)
    return round((value - 32) * 5 / 9, 1)


if __name__ == "__main__":
    # STDIO by default; for remote use: mcp.run(transport="http", port=8000)
    mcp.run()

The pros
The biggest win is developer velocity. Schema generation, validation, and documentation come from your type hints, so there is very little ceremony between an idea and a callable tool. The framework is also feature-complete against the protocol in a way most hand-rolled servers are not: it supports the full client and server lifecycle, multiple transports (STDIO for local development and Streamable HTTP for remote access), and the newer interactive features like progress reporting, elicitation, and sampling. Because it is so widely adopted, the ecosystem around it is deep, with first-class integrations for OAuth providers such as Auth0, WorkOS AuthKit, AWS Cognito, Azure Entra ID, and GitHub, plus documented paths for ChatGPT, the Anthropic API, and Claude. That maturity matters: you are building on the same plumbing that a very large fraction of production MCP servers already rely on.
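To make the type-hint point concrete, here is a rough sketch of how a schema-like tool description can be derived from a function signature using only the standard library. It is not FastMCP's actual implementation, which this post does not show; the helper name sketch_schema and the small type mapping are assumptions made up for illustration.

```python
import inspect
from typing import get_type_hints

# Hypothetical mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number",
               bool: "boolean", dict: "object", list: "array"}


def sketch_schema(func):
    """Build a simplified, JSON-Schema-like description of func's parameters."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch.
        prop = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object",
                        "properties": properties,
                        "required": required},
    }


def get_forecast(city: str, days: int = 1) -> dict:
    """Return a short weather forecast for a city."""
    return {"city": city, "days": days}


schema = sketch_schema(get_forecast)
```

The point of the sketch is only that the name, docstring, annotations, and defaults already contain everything a model needs to see, which is why a decorator can do the rest.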
The cons
The honest tradeoffs are worth naming. The first is that the abstraction can hide complexity you eventually need to understand. Sessions, transports, and the well-known OAuth discovery routes all work invisibly until you mount the server under a path prefix or put it behind a load balancer, and then you suddenly need to know exactly how they behave. The second is that the documentation tracks the main development branch, so features can appear in the docs before they ship in a release, which occasionally bites you when you copy an example that depends on something unreleased. The third is more philosophical: it is so easy to expose a tool that you can end up shipping a sprawling, poorly curated surface that confuses the model rather than helping it. Easy is not the same as well-designed.
Is it production-ready?
Yes, with the usual caveats that apply to any web service. FastMCP can run as a standard ASGI application served by Uvicorn or Gunicorn, scaled horizontally, fronted by nginx for TLS termination, and supervised as a systemd service. There are a few production-specific details the docs are explicit about, and ignoring them is where most teams get hurt. If you run multiple instances behind a load balancer, you generally need stateless HTTP mode, because the default in-memory sessions do not survive a request being routed to a different instance, and sticky sessions are unreliable since many MCP clients use fetch internally and drop cookies. Streamable HTTP relies on server-sent events, so your reverse proxy must disable buffering and raise its timeouts or streaming responses silently break. And if you use OAuth, production requires an explicit JWT signing key plus persistent, encrypted storage for tokens so they survive restarts and can be shared across hosts. None of this is exotic, but it is the difference between a demo and a service. For teams that would rather not operate this themselves, the FastMCP team also offers a managed gateway, Prefect Horizon, aimed at exactly this set of concerns.
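The load-balancer problem is easier to see in a toy simulation. The sketch below is not FastMCP code; the Instance class and the round-robin balancer are hypothetical stand-ins that only model "session state lives in one process's memory".

```python
import itertools
import uuid


class Instance:
    """A toy server instance that keeps session state in its own memory."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session_id -> state, visible only to this instance

    def initialize(self):
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {"calls": 0}
        return session_id

    def handle(self, session_id):
        if session_id not in self.sessions:
            return f"{self.name}: unknown session"
        self.sessions[session_id]["calls"] += 1
        return f"{self.name}: ok"


instances = [Instance("a"), Instance("b")]
balancer = itertools.cycle(instances)  # naive round-robin, no stickiness

session_id = next(balancer).initialize()  # the session is created on instance a
# The next three requests are routed to b, a, b in turn.
responses = [next(balancer).handle(session_id) for _ in range(3)]
```

Every request that lands on instance b fails, because b never saw the session being created. Stateless mode sidesteps this by not depending on per-instance session memory in the first place.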
Can I just wrap my own API?
Technically, yes, and FastMCP makes it almost trivial. If you already have a FastAPI application, a single call to FastMCP.from_fastapi turns its endpoints into MCP tools by reading the OpenAPI spec, and you can even serve your REST API and the MCP interface from the same app. But the more interesting answer is that you probably should not do this as your end state. The FastMCP maintainers themselves caution that LLMs perform noticeably better against a small set of well-designed, purpose-built tools than against a one-to-one mirror of a REST API, and they recommend the auto-conversion mainly for bootstrapping and prototyping. The reason is that REST endpoints are designed for programmers who already know what they want, whereas tools are consumed by a model that has to infer intent from names, descriptions, and parameter shapes. A sprawling auto-generated surface with cryptic operation IDs and dozens of overlapping parameters tends to confuse the model. So wrap your API to get moving, then curate: collapse related endpoints into intent-shaped tools, give them clear names and descriptions, and keep parameters simple.
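As a sketch of what "curate" can mean in practice, the example below contrasts endpoint-shaped functions with a single intent-shaped one. Everything in it is hypothetical: the in-memory data stands in for a REST backend, and the function names are invented for illustration. Any of these plain functions could be registered as a tool; the question is which ones you would want a model to see.

```python
from typing import Optional

# Fake data standing in for a REST backend; everything here is hypothetical.
_CUSTOMERS = {"c1": {"id": "c1", "name": "Ada"}}
_ORDERS = [
    {"id": "o1", "customer_id": "c1", "status": "shipped", "total": 40.0},
    {"id": "o2", "customer_id": "c1", "status": "open", "total": 15.5},
]


# Endpoint-shaped helpers: roughly what a one-to-one mirror would expose.
def get_customer(customer_id: str) -> dict:
    return _CUSTOMERS[customer_id]


def list_orders(customer_id: str, status: Optional[str] = None) -> list:
    return [o for o in _ORDERS
            if o["customer_id"] == customer_id
            and (status is None or o["status"] == status)]


# Intent-shaped tool: one call answers the question a model is likely to have.
def summarize_open_orders(customer_id: str) -> dict:
    """Return a customer's name plus the count and total value of their open orders."""
    customer = get_customer(customer_id)
    open_orders = list_orders(customer_id, status="open")
    return {
        "customer": customer["name"],
        "open_order_count": len(open_orders),
        "open_order_total": sum(o["total"] for o in open_orders),
    }


summary = summarize_open_orders("c1")
```

The curated version has one obvious parameter and a description that states the intent, so the model does not have to work out that two calls and a filter are needed.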
What if I want to use JavaScript, Go, or another language?
This is the most important thing to understand about FastMCP: it is a Python framework, and despite the name it is not a multi-language project. The good news is that MCP is an open protocol, not a Python one, so you are not locked in. There are official, first-party SDKs for TypeScript and JavaScript, Go, C#, Java, Kotlin, Rust, Swift, Ruby, and PHP, all maintained under the Model Context Protocol organization. If your stack is Node, the TypeScript SDK is the most mature non-Python option and is what most JavaScript MCP servers are built on; if you are in Go, the official Go SDK is the right starting point. What you give up by leaving Python is the specific ergonomics FastMCP layers on top of the raw SDK: the decorator-driven schema generation and the batteries-included integrations. Some of those conveniences have informal equivalents in other ecosystems, but none is as dominant in its language as FastMCP is in Python. A reasonable rule of thumb: if the team is Python-first, use FastMCP; if not, reach for the official SDK in your language rather than forcing Python into your stack just for MCP.
The wider MCP ecosystem and official SDKs
It helps to remember what you are actually building on. MCP is an open standard, often described as a USB-C port for AI applications: a single, standardized way to connect a model to external data, tools, and workflows instead of writing a bespoke integration for every pairing. That openness is the whole point, and it is why the protocol is supported across a broad range of clients and servers, including assistants like Claude and ChatGPT and developer tools like VS Code, Cursor, and MCPJam. You build a server once and it works everywhere that speaks MCP.
Because the standard is language-neutral, the Model Context Protocol organization maintains official SDKs in a wide spread of languages, so you are rarely forced to leave your existing stack. At the time of writing the first-party SDKs cover Python (the same SDK that absorbed FastMCP 1.0), TypeScript and JavaScript, Go, C#, Java, Kotlin, Rust, Swift, Ruby, and PHP. The practical implication is that FastMCP is best understood as the premier Python experience on top of MCP rather than the only way in: if you are on Node, reach for the TypeScript SDK; on the JVM, the Java or Kotlin SDK; in systems code, Go or Rust; and so on. Each of these can build both servers and clients, and because they all implement the same wire protocol, a server written in one language talks to a client written in another without anyone needing to care which is which.
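The shared wire protocol is JSON-RPC 2.0, which is why the implementation language stops mattering once a message is serialized. As a simplified sketch, a tool invocation looks roughly like this; the tool name and arguments reuse the weather example from earlier, the id value is arbitrary, and a real session would start with an initialization handshake that is omitted here.

```python
import json

# A tools/call request as a client in any language might serialize it.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_forecast",
        "arguments": {"city": "Oslo", "days": 2},
    },
}

wire = json.dumps(request)   # the text that travels over STDIO or HTTP
decoded = json.loads(wire)   # what a server in another language would parse
```

Nothing in the payload reveals whether the sender was written in Python, Go, or TypeScript, which is the whole basis for mixing SDKs freely.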
Security concerns
Security with FastMCP splits into two layers: the transport and the tools themselves. On the transport side, the framework gives you the right primitives but does not force you to use them. Authentication is strongly recommended for any remote server, and several clients will refuse to connect without it; FastMCP supports bearer tokens, JWT, and full OAuth 2.1 with providers like Auth0, WorkOS, Cognito, Azure, and GitHub. A few details deserve attention. Custom routes, including health checks, are deliberately never protected by the auth middleware, so do not hang anything sensitive off them. CORS should never use a wildcard origin in production; list exact origins. And the OAuth proxy issues its own JWTs rather than forwarding upstream tokens, which means production demands an explicit signing key and encrypted at-rest storage for tokens rather than the convenient development defaults, which on Linux are ephemeral and on macOS or Windows live in the system keyring.
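Two of these transport checks are small enough to sketch with the standard library alone. This is not FastMCP's auth middleware, and in a real deployment you would use the framework's own providers; the names ALLOWED_ORIGINS, EXPECTED_TOKEN, is_allowed_origin, and is_authorized are placeholders chosen for illustration.

```python
import hmac

ALLOWED_ORIGINS = {"https://app.example.com"}  # exact origins, never "*"
EXPECTED_TOKEN = "replace-me"  # placeholder; load from a secret store in practice


def is_allowed_origin(origin: str) -> bool:
    """Accept only origins that appear verbatim in the allowlist."""
    return origin in ALLOWED_ORIGINS


def is_authorized(authorization_header: str) -> bool:
    """Accept only 'Bearer <token>', compared in constant time."""
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    return hmac.compare_digest(token.encode(), EXPECTED_TOKEN.encode())
```

The constant-time comparison avoids leaking how much of a guessed token was correct, and the set lookup makes it impossible for a near-miss origin to slip through a loose pattern match.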
The deeper risk, though, is not specific to FastMCP at all: it is that you are handing a language model the ability to invoke real tools. Every tool you expose is an action an agent might take based on text it reads, and that text can come from untrusted sources, which opens the door to prompt-injection-driven tool calls. FastMCP will faithfully execute whatever your function does, so the burden is on you to scope each tool narrowly, validate and sanitize inputs even though the schema is generated for you, enforce least privilege on whatever credentials the server holds, and think hard before exposing anything destructive or irreversible. Authentication tells you who is calling; it does nothing to constrain what a confused or manipulated model asks for once it is in. Treat tool design as a security surface, not just an ergonomics one. Sandboxing the execution environment and adding human-in-the-loop confirmation for high-impact actions are worth the friction.
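Here is one way those precautions might look for a destructive tool, written as a plain function that could later be registered with a server. It is a sketch under assumptions: WORKSPACE is a hypothetical directory, and ask_human is a hypothetical callback supplied by the host application, so that approval comes from a person rather than from an argument the model can set itself.

```python
from pathlib import Path
from typing import Callable

WORKSPACE = Path("/srv/agent-workspace")  # hypothetical directory the tool may touch


def make_delete_tool(ask_human: Callable[[str], bool]):
    """Build a delete tool whose approval step is controlled by the host."""

    def delete_file(relative_path: str) -> dict:
        """Delete a file inside WORKSPACE after a human approves it."""
        target = (WORKSPACE / relative_path).resolve()
        # The schema only says "string"; it does not stop "../../etc/passwd".
        if not target.is_relative_to(WORKSPACE.resolve()):
            return {"status": "rejected", "reason": "path escapes the workspace"}
        if not ask_human(f"Delete {target}?"):
            return {"status": "declined", "target": str(target)}
        target.unlink(missing_ok=True)
        return {"status": "deleted", "target": str(target)}

    return delete_file


# In this demo the "human" always says no, so nothing is ever deleted.
delete_file = make_delete_tool(ask_human=lambda prompt: False)
```

The validation runs before the confirmation prompt, so a manipulated request for a path outside the workspace is rejected without ever being shown to a human as a plausible action.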
The bottom line
FastMCP is the most pleasant way to build MCP servers in Python, it is genuinely production-capable, and its popularity means you are standing on well-worn ground. The caveats are the ones you would expect: respect the deployment details when you scale, treat auto-converting an API as a starting point rather than a destination, reach for an official SDK if you are not in Python, and remember that the easy part is exposing a tool while the hard part is deciding what a model should be allowed to do with it.
