One of the recurring questions I hear about Spec-Driven Development is this: where does the spec end and the implementation begin? The line is often blurry. Specs leak into class names. Implementation choices sneak into use case descriptions. After a few weeks, nobody can tell anymore what is intent and what is accident.
The AI Unified Process (AIUP) draws this line differently. Instead of two layers (spec and code), AIUP works with three: What, Harness, and How. The middle layer is where most of the leverage lives, and it is what makes AIUP different from other SDD approaches.
A short note on the word harness. Since February 2026, the industry has adopted the term Harness Engineering, first coined by Mitchell Hashimoto and then taken up by OpenAI, Martin Fowler and Birgitta Böckeler, and Red Hat. The idea is the same as in AIUP: build a structured environment around the AI agent so the result becomes predictable. AIUP goes one step further, because the harness alone is not enough. You also need a clear What on top and a verifiable How below it. That is what this post is about.
Let me walk you through each layer.
The What: Use Cases and Entity Model
The What is the problem space. It describes the system without saying anything about how the system is built.
Two artefacts live here:
Use cases describe behaviour. They follow Ivar Jacobson’s Use-Case 2.0 thinking: a use case is a slice of value that a user wants from the system. A use case has a goal, a main success scenario, alternative flows, and acceptance criteria. It is written in the language of the domain, not the language of the code.
The entity model describes structure. What are the business objects? What are their attributes? What are the invariants and relationships between them? Again, this is domain language. No tables, no columns, no JPA annotations.
Together, the What forms a complete picture of the problem. A domain expert or a business owner should be able to read it and say “yes, that is the system we want.” If they cannot, the What is not finished.
This is the same idea as classic requirements engineering. Nothing new so far. But it is exactly the part that Harness Engineering as described by OpenAI does not cover. Birgitta Böckeler made this point in her first reaction to the OpenAI write-up: the harness focuses on internal quality and maintainability, but the verification of functionality and behaviour is missing. In AIUP, that verification starts here, in the What.
The interesting part starts with the layer in the middle, the Harness.
The How: The Implementation
Let me jump to the How first, because it is the layer most developers think about. The How is the actual code that runs in production: Spring Boot services, Vaadin views, jOOQ queries, REST endpoints, configuration files.
In a traditional project, developers write all of this by hand. In an AI-assisted project, much of it is generated. Either way, the How is the solution. It changes often. It depends on technology choices. It can be replaced without changing the What.
The question is: how do you make sure the How actually reflects the What? In small projects, you can rely on a good developer who has read the spec. In larger projects, or in AI-assisted projects, this does not scale. You need something in between.
That something is the Harness.
The Harness: Where the Architect Lives
The Harness is the bridge between the What and the How. It is the set of guardrails that shape every piece of implementation, whether it is written by a human or generated by AI. The Harness is the architect’s main deliverable.
This is also where AIUP overlaps most with Harness Engineering. The OpenAI team described three building blocks: context engineering, architectural constraints, and a kind of “garbage collection” that fights entropy. The Red Hat article puts it in a short slogan: structure in, structure out. AIUP says the same thing, but it gives the architect a concrete set of artefacts to build:
Skills encode repeatable know-how. A skill might describe how to build a Vaadin view that follows the project’s layout conventions. Another skill might describe how to write a jOOQ query that respects the data access patterns. Another might describe how to write a Karibu test for a view. Skills are reusable. They are written once and applied many times. They are what make AI-generated code feel consistent across modules.
MCP servers give the AI access to the context and tools it needs. A jOOQ MCP server lets the AI inspect the actual database schema instead of guessing. A Karibu Testing MCP server lets it run tests. An Atlassian MCP server gives it access to the use cases stored in Confluence or Jira. Without MCP, the AI works blind. With MCP, it works with the same information a senior developer would have. This is the context engineering part of Harness Engineering, made concrete.
Guidelines set the rules. Naming conventions. Package structure. Test strategy. Which libraries to use and which to avoid. What goes into a service layer and what stays in a view. These are the rules a good codebase always has, but usually only in the heads of two or three senior developers. With AIUP, they are written down and applied automatically. This is the architectural constraints part of Harness Engineering.
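To make "written down and applied automatically" a bit more tangible, here is a minimal sketch of one guideline expressed as an executable rule. It is an illustration, not part of AIUP itself: the package names, the `LayeringGuideline` class, and the hand-written dependency map are all assumptions made up for this example. In a real project, a library such as ArchUnit could derive the dependencies from the compiled classes instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: one guideline ("views do not talk to persistence directly")
// expressed as a rule that can run in every build. All names are illustrative.
class LayeringGuideline {

    // Assumed package layout; a real project would use its own.
    static final String VIEW_PACKAGE = "com.example.app.view.";
    static final String PERSISTENCE_PACKAGE = "com.example.app.persistence.";

    /** Returns one message per violation; an empty list means the rule holds. */
    static List<String> check(Map<String, List<String>> dependenciesByClass) {
        List<String> violations = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : dependenciesByClass.entrySet()) {
            String className = entry.getKey();
            if (!className.startsWith(VIEW_PACKAGE)) {
                continue; // the rule only applies to view classes
            }
            for (String dependency : entry.getValue()) {
                if (dependency.startsWith(PERSISTENCE_PACKAGE)) {
                    violations.add(className + " must not depend on " + dependency);
                }
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        // Hand-written dependency map standing in for real static analysis.
        Map<String, List<String>> dependencies = Map.of(
            "com.example.app.view.OrderView",
            List.of("com.example.app.service.OrderService"),
            "com.example.app.view.CustomerView",
            List.of("com.example.app.persistence.CustomerRepository"));

        check(dependencies).forEach(System.out::println);
    }
}
```

With the sample data, only `CustomerView` is reported. The point of the sketch is the shape: the same rule that steers the AI through the guideline text can also fail the build when generated code ignores it.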
Together, skills, MCP servers, and guidelines form executable architecture. This is the part I find most interesting. In traditional projects, architecture often ends as a slide deck or a Confluence page nobody reads. In AIUP, the architecture is the Harness, and the Harness shapes every line of code the AI produces. If the architect changes a guideline, every future artefact reflects it. If a skill is improved, every future view built with it gets better. Architecture stops being documentation and becomes a live system.
The Diagram
Here is how the three layers fit together:
[Diagram: the three AIUP layers, What, Harness, and How]
The What flows into the Harness. The Harness shapes the How. Traceability runs in both directions: every piece of code should map back to a use case, and every use case should be covered by tests that exercise the code.
This bidirectional traceability is what closes the gap that Birgitta Böckeler flagged. A pure harness can guarantee that the code is well structured. It cannot tell you whether the code does what the business asked for. AIUP adds that link.
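One direction of that link, from use cases to tests, can be checked mechanically. The following is a minimal sketch under stated assumptions: the `@UseCase` annotation, the `UC-…` identifiers, and the `OrderTests` stand-in class are hypothetical and not something AIUP prescribes. The idea is simply that tests carry the ID of the use case they verify, and a small check lists the use cases nobody has tested yet.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a traceability check; all names and IDs are illustrative.
class TraceabilityCheck {

    // Marker that links a test method back to a use case ID from the What.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface UseCase {
        String value();
    }

    // Stand-in for a real test class; the method bodies are intentionally empty.
    static class OrderTests {
        @UseCase("UC-001") void placesOrder() {}
        @UseCase("UC-002") void cancelsOrder() {}
    }

    /** Returns the use case IDs that no annotated test method refers to. */
    static Set<String> uncovered(List<String> useCaseIds, Class<?>... testClasses) {
        Set<String> missing = new TreeSet<>(useCaseIds);
        for (Class<?> testClass : testClasses) {
            for (Method method : testClass.getDeclaredMethods()) {
                UseCase link = method.getAnnotation(UseCase.class);
                if (link != null) {
                    missing.remove(link.value());
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // In practice this list would come from wherever the use cases are stored.
        List<String> specified = List.of("UC-001", "UC-002", "UC-003");
        System.out.println("Use cases without tests: " + uncovered(specified, OrderTests.class));
        // prints: Use cases without tests: [UC-003]
    }
}
```

The opposite direction, from code back to a use case, could reuse the same kind of marker on production classes. Either way, the check only proves that a link exists, not that the test is a good one.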
Two Jobs of the Harness
The Harness does two things at once, and both matter.
It enables the AI. Skills tell it how to build things. MCP servers tell it what is there. Without enablement, the AI works from generic patterns and produces generic code. The result is correct in isolation but inconsistent with the rest of the project.
It also constrains the AI. Guidelines tell it what not to do. They prevent the AI from inventing a new package structure, picking a different test framework, or using a library the project does not want. Without constraint, every module looks different and the codebase drifts.
A Harness that only enables is permissive. It produces working code that does not fit. A Harness that only constrains is restrictive. It blocks the AI without helping it. A good Harness does both.
Why This Separation Matters
Three reasons.
Specs stay clean. When the architect owns the Harness, the spec can stay free of implementation details. There is no temptation to write “use a Vaadin Grid here” in a use case, because there is a proper place to capture that decision: the Harness.
Architecture becomes real. Instead of a deliverable that ends up in a drawer, architecture becomes a working part of the development environment. The architect’s value is measurable: better skills, better guidelines, better MCP integration lead directly to better code.
AI scales without chaos. In ad-hoc AI usage, every developer prompts differently and the codebase drifts. In AIUP, the Harness is shared. The AI behaves consistently because the guardrails are consistent.
What This Means for the Roles
The roles change too.
The requirements analyst owns the What. Use cases and entity model. No technology. Validated by domain experts.
The software architect owns the Harness. Skills, MCP servers, guidelines. This is now a hands-on role. Drawing diagrams alone is no longer enough. An architect in AIUP writes skills, configures MCP servers, and maintains guidelines. In Harness Engineering terms, the architect is the harness engineer.
The software engineer works mostly in the How, but in a different way than before. Instead of writing every line by hand, the developer drives the AI through the Harness, reviews the result, and refines either the code or the Harness when something does not fit.
The test engineer sits across all three layers. Tests are derived from use cases (the What), run against the implementation (the How), and use the testing skills and MCP tools the architect provides (the Harness). This is where behavioural verification happens. It is also the part that the OpenAI version of Harness Engineering leaves open.
Closing Thought
Most SDD discussions stop at “separate the spec from the implementation.” The Harness Engineering movement adds a useful insight: there is a third layer in between, and it is where the real engineering work for AI-assisted development is done. AIUP shares that view, but it does not stop there. The Harness alone gives you well-structured code. It does not give you the right code. For that you need the What above it and verifiable tests below it. The three layers together turn intent into running software in a way that is both repeatable and verifiable.
If you are building business applications with AI assistance, the Harness is where I would invest first. The best use cases in the world will not save you if every developer prompts the AI differently. And the most powerful AI will not help you if it has no guardrails. Build the Harness, connect it to clear use cases on one side and to real tests on the other, and AI-assisted development starts to feel like engineering again.
If you want to learn how to apply this in practice with Spring Boot, Vaadin, and jOOQ, take a look at my Spec-Driven Development Workshop or visit unifiedprocess.ai.


