The spec-driven development (SDD) movement is gaining momentum. Tools like Amazon Kiro, GitHub Spec Kit, and BMad Method promise to bring structure to AI-assisted coding. And they deliver on that promise, as long as you start from scratch.

But most enterprise teams don’t start from scratch. They maintain existing systems with hundreds of thousands of lines of code, established architectures, and years of accumulated business logic. When these teams try to adopt the popular SDD tools, they run into a wall.

This post compares the three most popular spec-driven tools and explains why the AI Unified Process (AIUP) at unifiedprocess.ai takes a fundamentally different approach, one that actually works for enterprise software development.

The Three Contenders

Amazon Kiro

Kiro is Amazon’s agentic IDE built on Amazon Bedrock. It follows a three-phase workflow: Requirements, Design, and Implementation. You describe a feature in natural language. Kiro generates user stories with acceptance criteria in EARS notation, creates a technical design document, and breaks it down into implementation tasks.
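For readers unfamiliar with EARS (Easy Approach to Requirements Syntax): it constrains acceptance criteria to a handful of fixed sentence patterns. The requirements below are hypothetical examples of those patterns, not Kiro output:

```text
Ubiquitous:    The invoice service shall store all amounts in cents.
Event-driven:  WHEN a payment is received, the invoice service SHALL mark the invoice as paid.
State-driven:  WHILE an invoice is overdue, the invoice service SHALL apply a daily late fee.
Unwanted:      IF the payment gateway is unreachable, THEN the invoice service SHALL queue the payment for retry.
```

The fixed patterns are what make the generated criteria reviewable and testable.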

The workflow is clean and structured. But it was designed for greenfield projects. Even Amazon’s own product team has acknowledged this gap. As one Broadcom engineer put it: most developers don’t start from a greenfield idea. They start from an existing codebase, a messy bug, or a design they already agreed on. Kiro has since added Design-first and Bugfix workflows to address this, but the core assumption remains: you are building something new.

For enterprise teams, Kiro’s mandatory three-phase pipeline creates overhead that doesn’t match their reality. A single-line bug fix in a legacy system should not trigger a full spec generation pipeline. And when existing applications are large, it becomes impractical for LLMs to create specifications without exceeding context limits.

GitHub Spec Kit

Spec Kit is GitHub’s open-source toolkit for SDD. It provides a CLI and templates that guide you through four phases: Specify, Plan, Tasks, and Implement. It is agent-agnostic and works with GitHub Copilot, Claude Code, Gemini CLI, and others.

Spec Kit introduces a powerful concept: the constitution.md file that establishes non-negotiable principles for your project. This is useful for organizations with opinionated stacks. The workflow is well-structured and produces a clear trail from intent to implementation.
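A constitution is just a markdown file of project-wide rules that the agent must honor. A hypothetical sketch for an opinionated Java shop (the specific rules here are illustrative, not from Spec Kit's templates):

```markdown
# Project Constitution (illustrative sketch)

## Non-negotiable principles
- All backend services use Java 17 with Spring Boot; no alternative frameworks.
- Database access goes through jOOQ; no raw JDBC in application code.
- Every public API change ships with an updated OpenAPI spec in the same PR.
- No feature merges without unit tests and at least one E2E test.
```

Because the constitution is read on every run, it is the one place where an organization's standards survive across generated specs and plans.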

But Spec Kit has the same fundamental problem. Its templates are generic and require manual customization for existing codebases. The documentation overhead can be significant, sometimes generating more markdown files to review than it would take to build the feature itself. And as Martin Fowler’s analysis pointed out, introducing SDD tools into an existing, entangled codebase is still mostly hand-wavy when it comes to practical migration strategy.

BMad Method

BMad Method takes a different approach: a multi-agent framework where specialized AI agents (Analyst, Product Manager, Architect, Developer, QA) collaborate through file-based communication. It claims to be scale-adaptive, automatically adjusting from bug fixes to enterprise systems.

BMad is the most ambitious of the three. With 19+ specialized agents and 50+ workflows, it provides comprehensive coverage. It even has an enterprise test architecture extension.

But this complexity is also its weakness. The framework requires Node.js 22+, has a steep learning curve, and the agent orchestration adds significant overhead. The v6 Alpha is still recommended for new projects, which tells you something about maturity. And despite claiming enterprise readiness, the real-world evidence for large-scale brownfield adoption is thin.

The Common Problem

All three tools share the same fundamental assumptions:

1. You start from a clean slate. The workflows are optimized for “describe what you want and generate it.” They don’t account for 500-table Oracle schemas, established architectural patterns, or fifteen years of accumulated business rules.

2. One developer drives the process. Enterprise development involves multiple stakeholders, cross-team dependencies, review processes, and compliance requirements. These tools assume a single person goes from spec to code.

3. The tech stack is flexible. These tools happily suggest whatever framework fits best. But enterprise teams have standardized stacks. If your organization runs Spring Boot with Vaadin and jOOQ on OpenShift, the tooling needs to understand and respect that.

4. Specs are created from nothing. In reality, enterprise systems already have requirements, often in the form of existing code, database schemas, and running business processes. The challenge is not to create specs from scratch but to work with what exists and extend it incrementally.

5. The methodology exists only at the coding level. All three tools focus on turning specs into code. But enterprise development is a broader process that includes requirements engineering, stakeholder alignment, testing strategy, and continuous delivery. A tool that only covers the “code generation” step misses most of the picture.

A Different Approach: AI Unified Process (AIUP)

The AI Unified Process at unifiedprocess.ai was designed from the ground up for the reality of enterprise software development. It takes a fundamentally different approach.

Requirements Stay at the Center

AIUP doesn’t start with code generation. It starts with requirements. Business requirements catalogs, entity models, use case diagrams, and use case specifications drive the entire process. These are artifacts that business stakeholders can review and validate, not markdown files that only developers see.

This is critical for enterprise teams where business alignment is not optional. Every line of code traces back to a business requirement. When a bug appears, you don’t dig through code to understand what the system was supposed to do; you go back to the specification.

Iterative, Not Waterfall

Unlike tools that follow a linear spec-to-code pipeline, AIUP runs in four agile phases (Inception, Elaboration, Construction, and Transition), each with multiple short iterations. Specifications, code, and tests improve together through continuous cycles. This is not a return to Big Design Up Front. It is iterative improvement where each cycle builds on the previous one.

Critics of SDD argue that it only works with exhaustive specifications that force deterministic output. AIUP addresses this directly: perfect specifications are impossible and unnecessary. The real value comes from iterative improvement where tests ensure consistent behavior regardless of how the AI generates code.

Technology-Specific Tooling

This is where AIUP differs most visibly from generic SDD tools. Instead of one-size-fits-all templates, AIUP provides technology-specific plugins for Claude Code:

The aiup-core plugin handles stack-agnostic requirements engineering: requirements catalogs, entity models, use case diagrams, and specifications. It works with any technology.

The aiup-vaadin-jooq plugin adds concrete implementation skills for Java web applications: Flyway database migrations, Vaadin UI implementation, Karibu unit tests, and Playwright E2E tests. It even integrates MCP servers for Vaadin, jOOQ, Karibu Testing, and JavaDocs, giving the AI direct access to framework-specific documentation and patterns.
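To illustrate the Flyway side of that workflow: each feature's schema change ships as a versioned migration file, applied exactly once and in order. The `V<version>__<description>.sql` naming convention is Flyway's; the table and columns below are hypothetical, and the interval syntax is dialect-dependent:

```sql
-- V7__add_invoice_due_date.sql (hypothetical migration)
-- Flyway applies versioned migrations exactly once, in version order.
ALTER TABLE invoice ADD COLUMN due_date DATE;

-- Backfill existing rows with a conservative default before tightening the schema.
UPDATE invoice SET due_date = issued_at + INTERVAL '30' DAY WHERE due_date IS NULL;

ALTER TABLE invoice ALTER COLUMN due_date SET NOT NULL;
```

Versioned migrations are what let AI-generated features evolve an existing schema incrementally instead of regenerating it.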

This plugin architecture is extensible. More technology-specific plugins are in development. The key insight is that enterprise teams need tooling that speaks their language, not generic templates that require hours of customization.

Built for Brownfield

AIUP works at the feature level within bounded contexts, not at the “generate an entire app” level. You can introduce it into an existing project, create specifications for the next feature you are building, and let AI handle the implementation within your established architecture.

There is no mandatory full-system specification. There is no requirement to retrofit your entire codebase with specs before you can start. You start where you are, specify what you are building next, and let the iterative process improve your documentation coverage over time.

Testing as the Safety Net

AIUP treats tests as the primary safety net for AI code generation. Comprehensive tests (unit tests with Karibu, integration tests, E2E tests with Playwright) ensure that the system behaves consistently regardless of how the AI generates or regenerates code. This enables safe refactoring, safe modernization, and safe evolution of your codebase.

This is different from SDD tools that treat testing as an optional step or generate tests as an afterthought. In AIUP, the test strategy is planned from the Inception phase and executed throughout Construction and Transition.

Summary: The Right Tool for the Right Job

| Aspect | Kiro | Spec Kit | BMad | AIUP |
|---|---|---|---|---|
| Primary audience | Solopreneurs, startups | Individual developers | Small teams | Enterprise teams |
| Greenfield/Brownfield | Greenfield-first | Greenfield-first | Greenfield-first | Brownfield-ready |
| Process scope | Code generation | Code generation | Full project lifecycle | Full development lifecycle |
| Tech stack | Stack-agnostic | Stack-agnostic | Stack-agnostic | Technology-specific plugins |
| Requirements engineering | User stories from prompts | PRD from prompts | Agent-generated | Business-validated use cases |
| Stakeholder involvement | Minimal | Minimal | Minimal | Continuous validation |
| Testing strategy | Generated tests | TDD recommended | QA agent | Planned from Inception |
| Iterative improvement | Linear pipeline | Linear pipeline | Sprint-based | Agile phases with iterations |
| Enterprise readiness | Limited | Experimental | Claims enterprise support | Designed for enterprise |
| Tooling | Proprietary IDE | Open-source CLI | Open-source agents | Skills + MCP |

Conclusion

Amazon Kiro, GitHub Spec Kit, and BMad Method are valuable contributions to the spec-driven development movement. They solve a real problem: bringing structure to AI-assisted coding. For greenfield projects, prototypes, and small teams, they work well.

But enterprise software development is a different game. It requires working with existing systems, respecting established architectures, involving business stakeholders, and planning for long-term maintainability. The popular SDD tools were not designed for this reality.

The AI Unified Process bridges this gap. It brings the discipline of spec-driven development to the enterprise context, with requirements at the center, iterative improvement as the method, technology-specific tooling as the enabler, and comprehensive testing as the safety net.

If you are an enterprise team exploring spec-driven development, don’t force greenfield tools into a brownfield world. Start with a methodology that was built for your reality.

Learn more at unifiedprocess.ai.