What the heck is spec coding?
While some of us are still trying to grasp what vibe coding means, a new concept just got dropped: spec coding. Well, what the heck is spec coding, and how does it compare to vibe coding? Is it just another loaded term for something simple? If it is useful, how should I write a spec? Do you have any examples? How can I use it with Claude Code or Codex?
Let’s break it down.
First, a quick recap on vibe coding
Vibe coding, a term popularized by Andrej Karpathy in early 2025, is the idea of building software by describing what you want in natural language and letting an AI model generate the code. You don’t read the code, you don’t really care how it works under the hood, you just go with the vibes. If something breaks, you paste the error back and let the AI fix it. It’s fast, it’s fun, and for small projects or quick prototypes, it can feel like magic.
But the cracks show up fast. The moment your project grows beyond a single-page app, vibe coding starts to fall apart. Without a clear plan, the AI generates inconsistent code, makes conflicting architectural decisions across files, and loses track of what it already built. You end up spending more time debugging and rewriting than you saved by skipping the planning phase. The vibes stop vibing pretty quickly.
Enter spec coding
Spec coding, or more formally Spec-Driven Development (SDD), flips the approach. Instead of jumping straight into prompting an AI to write code, you start by writing detailed specifications first. You define the what and the why before the how. Think of it as giving the AI a blueprint instead of a vague wish.
The concept gained serious traction when GitHub released spec-kit, an open-source toolkit designed specifically for this workflow. The idea behind spec-kit is that specifications become executable artifacts that directly generate working implementations, rather than just serving as documentation that gets tossed aside once coding begins.
How the spec coding workflow actually works
The spec-driven workflow follows a structured multi-step process. Using spec-kit as an example, it typically goes like this:
First, you establish a constitution for your project. These are governing principles about code quality, testing standards, and design patterns that the AI must follow throughout the entire build.
Next, you write a specification describing what you want to build in detail, focusing on user stories and functional requirements. Crucially, you don’t specify the tech stack here. You focus on the product, not the implementation.
Then comes the planning phase where you bring in your technical choices. You tell the AI your preferred tech stack and architecture, and it generates a detailed implementation plan including data models, API contracts, and research documents.
After that, the plan gets broken into ordered, actionable tasks with dependency management and file-level specificity. Tasks that can run in parallel are flagged, and the whole thing reads like a proper development roadmap.
Finally, the AI executes the tasks in order, building the project incrementally while respecting all the specifications and constraints you defined earlier. Each phase produces a document that the AI uses as context for the next phase. Nothing is lost between prompts.
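Each phase leaves behind a document that the next phase consumes, so the whole workflow produces an artifact trail on disk. As a rough sketch (the directory names here are illustrative; spec-kit's exact layout varies by version, and "001-lead-tracking" is an invented feature name):

```shell
# Illustrative layout only -- exact paths vary by spec-kit version.
mkdir -p .specify/memory specs/001-lead-tracking

touch .specify/memory/constitution.md   # phase 1: governing principles
touch specs/001-lead-tracking/spec.md   # phase 2: the what and the why
touch specs/001-lead-tracking/plan.md   # phase 3: tech stack and architecture
touch specs/001-lead-tracking/tasks.md  # phase 4: ordered, dependency-aware tasks

find .specify specs -type f | sort
```

Because each file persists, the implementation phase can always look back at what the earlier phases decided.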
Vibe coding vs spec coding: the real difference
The easiest way to think about it is this: vibe coding is like telling a contractor to just start building your house based on a conversation you had over coffee. Spec coding is like handing them a full set of architectural blueprints first.
With vibe coding, you iterate through trial and error. You prompt, see what comes out, fix it, prompt again. The AI has no memory of the big picture because there is no big picture. Each prompt is basically a fresh start with whatever context fits in the window.
With spec coding, the AI always has a reference point. The specification documents serve as persistent context that guides every implementation decision. When the AI writes a database model, it checks the spec. When it builds an API endpoint, it checks the plan. When it generates a UI component, it knows exactly how it should behave because the behavior was defined upfront.
That said, it’s not really an either/or situation. Vibe coding is great for exploration, quick prototypes, and throwaway scripts. Spec coding shines when you’re building something that needs to work reliably, scale, or be maintained by a team. Many developers are finding that the sweet spot is starting with a vibe coding session to explore ideas, then switching to a spec-driven approach once the direction is clear.
Why this matters beyond developer productivity
The rise of spec coding comes at an interesting time. Anthropic recently published research on AI’s labor market impacts, finding that AI-exposed occupations like computer programmers already have 75% task coverage by AI tools. But here’s the nuance: actual usage still lags far behind theoretical capability. AI could theoretically speed up most programming tasks, but in practice, adoption is uneven and many workflows haven’t been restructured to take full advantage of it.
Spec coding is an attempt to close that gap. By giving AI agents better structure and context, you unlock more of their capability. It’s not about replacing developers — it’s about changing the job from writing code line by line to defining intent and reviewing output. The developer becomes more of an architect and less of a bricklayer.
How to write a spec that actually works
Knowing about spec coding is one thing. Writing a spec that actually produces good results from an AI agent is another. A bad spec leads to bad output no matter how structured your workflow is. Here are the principles that separate a useful spec from a useless one.
Start with the problem, not the solution
The most common mistake is jumping straight to describing features. Instead, start by explaining the problem you’re solving and who you’re solving it for. Give the AI context about the domain, the users, and the pain points. An AI that understands why something needs to exist will make far better decisions about how to build it.
Be specific about behavior, not implementation
Describe what the system should do from the user’s perspective. Instead of writing “build a REST API with CRUD endpoints,” write “a staff member should be able to create a new client profile, update their contact information, and deactivate their account without deleting their records.” The first tells the AI what technology to use. The second tells it what the software needs to accomplish. Let the AI figure out the implementation during the planning phase.
Define boundaries and constraints explicitly
Specs should clearly state what the system does and does not do. If your application should never allow users to delete records permanently, say so. If data must be encrypted at rest, say so. If the application needs to work offline, say so. AI agents are eager to build features. Without explicit boundaries, they’ll add things you never asked for or miss security requirements you assumed were obvious.
Write user stories, not feature lists
User stories force you to think in terms of workflows rather than isolated features. A feature list says “document upload, document search, document sharing.” A user story says “as a paralegal, I need to upload signed retainer agreements to a client’s secure vault so that attorneys can access them during case preparation.” The user story carries context that shapes how the feature gets built.
Include acceptance criteria
Every user story should have clear conditions for when it’s done. What does success look like? What edge cases matter? Acceptance criteria give the AI concrete targets to build toward and give you a checklist to validate the output against. Without them, you’re back to vibing.
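Putting the last two principles together, here is a minimal sketch of what a single user story with acceptance criteria might look like inside a spec file. The story and criteria are invented for illustration, not taken from any real template:

```shell
# Hypothetical spec excerpt: one user story plus its acceptance criteria.
cat > spec-excerpt.md <<'EOF'
## User story: log a phone lead

As a receptionist, I want to log an inbound phone call as a new lead
so that no inquiry from our Google listings is lost.

### Acceptance criteria
- A lead can be created with only a name and phone number; all other
  fields are optional.
- A newly created lead has status "new" and records a creation timestamp.
- Submitting a duplicate phone number warns the user but does not block
  creation.
EOF

cat spec-excerpt.md
```

Notice that nothing in the excerpt mentions a database, a framework, or an endpoint; it describes observable behavior the AI can build toward and you can test against.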
Separate concerns into logical groups
A good spec organizes requirements into logical modules or domains. Group related functionality together so the AI can reason about dependencies and build things in the right order. A flat list of fifty requirements is harder for both humans and AI to work with than five well-organized sections of ten requirements each.
Example: spec coding a CRM for a law firm
Theory is great, but let’s see what this actually looks like in practice. Imagine you’re building a CRM specifically designed for a small law firm. The firm gets leads through a website intake form and through phone calls that come in via their Google Maps and Google Search listings. Those leads need to be tracked, followed up on, and eventually converted into paying clients. Once converted, client profiles need to be created, billable hours need to be logged, and legal documents need to be stored securely.
Here’s how you might write the spec for this project using a spec-driven workflow.
The constitution
Before writing a single requirement, you’d establish governing principles for the project. For a law firm CRM, this might include rules like: all client data must be encrypted at rest and in transit, the system must maintain a complete audit trail for every record change, no client record may ever be permanently deleted (only soft-deleted for compliance), and all document storage must use server-side encryption on S3. These principles act as guardrails that the AI must respect throughout the entire build.
The specification
This is where you describe the system from the firm’s perspective. You’d organize it into logical modules:
Lead Intake and Tracking — When a potential client fills out the website intake form, the system should capture their name, contact information, case type, and a brief description of their legal issue. When someone calls the firm’s phone number listed on Google Maps or Google Search, a staff member should be able to manually log that call as a new lead with the same information. Every lead should have a status that moves through stages: new, contacted, consultation scheduled, consultation completed, retained, and lost. The system should show a dashboard of all active leads with their current status and the date of last contact.
Follow-up and Conversion — Staff members should be able to schedule follow-up tasks for any lead, with due dates and optional notes. The system should surface overdue follow-ups prominently so nothing falls through the cracks. When a lead decides to retain the firm, converting them to a client should carry over all their lead information into a full client profile without requiring re-entry of data.
Client Profiles and Case Management — Each client profile should include personal information, case details, assigned attorneys, and a timeline of all interactions. Attorneys and paralegals should be able to add notes to a client’s file. Each client can have multiple matters or cases, and each matter should track its own status, assigned team members, and key dates.
Time Tracking and Billing — Attorneys should be able to log billable hours against a specific client and matter. Each time entry should capture the date, duration, a description of the work performed, and the billing rate. The system should be able to generate a summary of unbilled hours for any client or matter. Paralegals and staff should also be able to log time, potentially at different billing rates.
Secure Document Vault — Each client should have a dedicated document vault. Authorized users should be able to upload, download, view, and organize documents into folders. All documents must be stored on Amazon S3 with server-side encryption (AES-256). Access to a client’s vault should be restricted to team members assigned to that client. The system should log every document access, upload, and download for audit purposes. Documents should never be permanently deleted — only moved to an archived state that administrators can review.
The planning phase
With the spec written, you’d move to the planning phase and bring in your technical choices. You might tell the AI something like: the backend should use Node.js with Express, the database should be PostgreSQL, the frontend should use React, documents should be stored on S3 using the AWS SDK with SSE-S3 encryption, and the application should be containerized with Docker for deployment. The AI would then generate a detailed implementation plan including the database schema, API contract, S3 bucket configuration, and authentication flow.
The task breakdown
The AI would then break the plan into ordered tasks. It might start with database models and migrations, then move to the authentication layer, then the lead management API, followed by the client conversion workflow, time tracking, and finally the document vault with S3 integration. Each task would specify exact file paths, dependencies on other tasks, and which tasks can run in parallel.
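To make that concrete, here is a hypothetical excerpt of such a task list. The numbering scheme, file paths, and the "[P]" marker for parallelizable tasks are illustrative, not spec-kit's exact output format:

```shell
# Hypothetical task-list excerpt; format and paths are invented for illustration.
cat > tasks-excerpt.md <<'EOF'
- T001 Create leads table migration in db/migrations/001_leads.sql
- T002 Create clients table migration in db/migrations/002_clients.sql (after T001)
- T003 [P] Lead model in src/models/lead.js
- T004 [P] Client model in src/models/client.js
- T005 Lead intake endpoint in src/routes/leads.js (after T003)
EOF

cat tasks-excerpt.md
```

The dependency notes are what let the AI build incrementally without contradicting itself: T005 cannot start until the lead model it relies on exists.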
Why this works better than vibe coding
If you tried to vibe code this same CRM, you’d run into problems almost immediately. The AI might build the lead tracking without thinking about how leads convert to clients, so you’d end up with duplicate data structures. It might store documents locally instead of on S3 because you forgot to mention it in that particular prompt. It might skip encryption entirely because the security requirements weren’t in context when it built the upload feature. With a spec, all of these requirements are captured upfront and the AI can reference them at every step.
The spec also becomes living documentation. As the firm’s needs evolve — maybe they want to add email integration or a client portal — you update the spec first, then let the AI plan and implement the changes within the existing architecture. No context is lost, no architectural decisions are contradicted, and new features are built on a solid foundation rather than bolted onto a pile of vibe-coded spaghetti.
Do you need special tools for spec coding?
Not really. You can practice spec-driven development with nothing more than a text editor and any AI coding assistant. The core idea is about writing structured specifications before you code, and that doesn’t require any particular tooling.
That said, tooling makes the workflow significantly smoother. GitHub’s spec-kit is currently the most popular option. It’s an open-source CLI tool that scaffolds your project with templates, sets up the right directory structure, and gives your AI agent a set of commands that map to each phase of the spec-driven workflow. It supports over 30 AI agents out of the box, so chances are whatever tool you’re already using is compatible.
To get started, you just need Python 3.11+, Git, and uv for package management. Install the CLI, run specify init with your preferred AI agent, and the toolkit takes care of the rest — creating your project structure, installing agent-specific commands, and setting up the scripts that manage your spec-driven workflow.
But even without spec-kit, you can apply the principles manually. Create a project folder with a spec document, a plan document, and a task list. Write your requirements first, then your technical plan, then break it into tasks, and only then start prompting your AI to implement. The structure is what matters, not the specific tool.
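A bare-bones manual setup, with no tooling at all, might look like this (the file names and section headings are just one possible convention):

```shell
# Minimal manual spec-driven setup: three documents, written in this order.
mkdir -p my-project/docs

printf '# Specification\n\n## Problem\n\n## User stories\n\n## Constraints\n' \
  > my-project/docs/spec.md
printf '# Technical plan\n\n## Stack\n\n## Data model\n\n## API contract\n' \
  > my-project/docs/plan.md
printf '# Tasks\n\n- [ ] First task goes here\n' \
  > my-project/docs/tasks.md

ls my-project/docs
```

Fill in spec.md completely before touching plan.md, and plan.md before tasks.md; then paste the relevant document into your AI assistant's context as you work through each task.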
Using spec coding with Claude Code
Claude Code has first-class support in spec-kit. When you initialize a project with specify init my-project --ai claude, it automatically installs spec-kit as skills in the .claude/skills directory. This means Claude Code picks up the spec-driven commands natively — no extra configuration needed.
The commands show up as slash commands with a slightly different naming convention than other agents. Instead of dotted syntax, Claude Code uses dashes: /speckit-constitution, /speckit-specify, /speckit-plan, /speckit-tasks, and /speckit-implement. You run them in order, and each one builds on the output of the previous phase.
A typical workflow looks like this: you open your project directory in Claude Code, run the constitution command to set up your project principles, then use the specify command with a detailed description of what you want to build. Claude Code will generate the spec, create a feature branch, and set up the directory structure. From there you refine the spec through clarification, generate the technical plan, break it into tasks, and finally let Claude Code implement everything step by step.
One thing that makes Claude Code particularly well-suited for spec coding is its ability to research and validate decisions. During the planning phase, you can ask it to research specific technical choices, check compatibility between libraries, or verify that the architecture makes sense for your use case. This turns the planning phase into a genuine design review rather than a rubber stamp.
Using spec coding with Codex CLI
OpenAI’s Codex CLI also works with spec-kit, though the setup is slightly different. You initialize with specify init my-project --ai codex --ai-skills. The --ai-skills flag is important here — Codex CLI recommends using skills mode, and spec-kit installs its commands as Codex skills in the .agents/skills directory.
The naming convention for Codex is different from other agents. Instead of slash commands, you invoke spec-kit phases with a dollar sign prefix: $speckit-constitution, $speckit-specify, $speckit-plan, $speckit-tasks, and $speckit-implement. The workflow itself is the same — constitution first, then specification, planning, task breakdown, and implementation.
The experience is similar to Claude Code in that you’re working from the terminal with an AI agent that has access to your file system. The spec-kit templates and directory structure are identical regardless of which agent you choose, which means you could start a project with Codex and later switch to Claude Code (or vice versa) without losing any of your spec artifacts.
What about other tools?
Spec-kit supports a long list of AI agents beyond Claude Code and Codex. GitHub Copilot, Cursor, Gemini CLI, Windsurf, Roo Code, Qwen, and many others all have official integration. For most of these, the commands use the standard dotted slash syntax like /speckit.plan and /speckit.implement. If your preferred agent isn’t on the list, there’s a generic mode where you point spec-kit at your agent’s command directory and it sets everything up accordingly.
The community around spec-kit is also growing fast. There are extensions for integrating with project management tools like Jira and Azure DevOps, extensions for code review and security audits, and even a pirate-speak preset that turns your entire spec workflow into nautical terminology (specs become “Voyage Manifests” and tasks become “Crew Assignments”). The point is that the ecosystem is flexible enough to fit into whatever workflow you already have.
So is spec coding just a loaded term for something simple?
Kind of, yes. At its core, spec coding is just the old software engineering wisdom of planning before you build, applied to the new world of AI-assisted development. Experienced engineers have always written specs before coding. What’s new is that those specs are no longer just documentation for humans — they’re executable instructions for AI agents.
But giving it a name matters. Naming the practice creates a shared vocabulary for teams to rally around. It sets expectations. When someone says they’re doing spec-driven development, you know they’re not just throwing prompts at an AI and hoping for the best. They’re following a structured process with reproducible artifacts and quality gates.
The ecosystem around spec coding is growing fast. GitHub’s spec-kit already supports a huge range of AI agents including Claude Code, GitHub Copilot, Cursor, Gemini CLI, Windsurf, and many others. Community extensions are popping up for everything from Jira integration to security audits. It’s still early days, but the direction is clear: as AI coding tools get more powerful, having structured ways to direct them will become essential.
Whether you call it spec coding, spec-driven development, or just “doing it properly,” the underlying message is simple: give AI better instructions and you’ll get better software. The vibes were fun while they lasted, but specs are what ship.