Spec-Driven Development with Multi-Agent Systems: Documentation as Foundation
When implementing spec-driven, test-driven multi-agent workflows for complex existing applications, one critical question emerges: Should you first document all existing functionality and current state before proceeding? The answer is a resounding yes—but not in the traditional sense of "writing documentation."
Introduction
Modern software development increasingly relies on multi-agent systems to handle complex workflows, from code generation to testing and deployment. However, when applying these approaches to legacy systems, particularly consumer-facing (ToC, i.e. to-consumer) applications with accumulated complexity, teams face a fundamental challenge: How do you enable agents to work effectively with systems they've never seen before?
This article explores why specification-driven development requires a structured approach to documenting current system behavior, and how to implement this documentation strategy to support both human developers and AI agents.
The Context: Why Legacy Systems Break Multi-Agent Workflows
The Multi-Agent Prerequisite
Multi-agent systems operate on a fundamental principle: agents can only work reliably with cognitively separable objects, meaning units whose expected behavior is stated explicitly rather than left to inference. Unlike human developers, who can read code and infer business logic, agents require explicit:
- Specifications
- Contracts
- Invariants
- Test oracles
The Legacy System Reality
Most existing ToC applications suffer from common issues:
- Accumulated functionality: Features built incrementally without comprehensive design
- Documentation debt: Real behavior ≠ README ≠ Product Manager's memory
- Hidden rules: Business logic embedded in:
  - Conditional branches
  - Magic numbers
  - Historical bug workarounds
  - Implicit assumptions
The Inevitable Failure Pattern
Without proper documentation groundwork, spec-driven multi-agent workflows typically fail in predictable ways:
- Incomplete specifications: Missing edge cases and implicit behaviors
- Unstable tests: Tests that don't capture real system behavior
- Agent conflicts: One agent fixes functionality while breaking another
- Human intervention loops: Developers forced back into firefighting mode
Core Concepts: The Four Pillars of Current-State Documentation
The solution isn't traditional comprehensive documentation, but rather a structured "archeological excavation" focused on enabling multi-agent workflows.
1. Feature Inventory (Not Feature Documentation)
Instead of describing "what we built," document what exists:
## Feature Modules
- User registration / authentication
- Content browsing
- Order processing
- Payment handling
- Refund management
- Push notifications
- Risk control / limitation logic
For each module, answer only three questions:
### Order Processing
- Under what conditions can users place orders
- What internal state changes occur after successful order placement
- What are the primary known failure modes
Key principles:
- ❌ No flowcharts or implementation details
- ❌ No architectural explanations
- ✅ Only factual behavioral descriptions
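Because agents consume this inventory, it can help to keep each entry in a fixed shape. The following is a minimal sketch, assuming a hypothetical `FeatureEntry` record with one field per question; the class, its field names, and the `(uncertain)` placeholder are illustrative choices, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class FeatureEntry:
    """One feature-inventory entry: three factual answers, nothing else.

    The class and field names are hypothetical, chosen for this sketch.
    """
    module: str
    preconditions: list   # conditions under which the behavior is available
    state_changes: list   # internal state changes after success
    failure_modes: list   # primary known failure modes

    def to_markdown(self):
        """Render the entry; empty answers are marked as uncertain, not omitted."""
        lines = [f"### {self.module}"]
        sections = (
            ("Preconditions", self.preconditions),
            ("State changes", self.state_changes),
            ("Known failure modes", self.failure_modes),
        )
        for title, items in sections:
            lines.append(f"{title}:")
            lines += [f"- {item}" for item in (items or ["(uncertain)"])]
        return "\n".join(lines)


entry = FeatureEntry(
    module="Order Processing",
    preconditions=["Authenticated user with a valid payment method"],
    state_changes=[],   # not yet verified against production
    failure_modes=[],   # not yet verified against production
)
print(entry.to_markdown())
```

Marking unanswered questions explicitly keeps gaps visible to both humans and agents instead of inviting guesses.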
2. Critical Business Invariants
These form the lifeline of spec-driven + TDD + multi-agent systems:
## System Invariants
- Paid orders cannot revert to "unpaid" status
- One user can only have one active order at any time
- Refunded orders cannot initiate additional refunds
- Banned users cannot trigger any write operations
These statements are:
- Code-independent: Don't rely on implementation details
- UI-agnostic: Don't depend on interface specifics
- Highly stable: Rarely change over time
- Test-friendly: Easily converted to property tests and regression tests
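To illustrate the "test-friendly" point, the sketch below turns the invariant "Paid orders cannot revert to 'unpaid' status" into a randomized, property-style check using only the Python standard library. The status names, the helper names, and both toy transition functions are assumptions made for illustration; a real system would plug in its own transition logic.

```python
import random

# Hypothetical status names for this sketch.
STATUSES = ["unpaid", "paid", "refunded"]


def violates_paid_invariant(history):
    """Return True if an order that reached 'paid' later shows 'unpaid'."""
    seen_paid = False
    for status in history:
        if status == "paid":
            seen_paid = True
        elif status == "unpaid" and seen_paid:
            return True
    return False


def check_invariant(apply_transition, runs=200, steps=10, seed=0):
    """Feed random transition requests to `apply_transition`.

    Returns the first status history that breaks the invariant, or None.
    """
    rng = random.Random(seed)
    for _ in range(runs):
        history = ["unpaid"]
        for _ in range(steps):
            requested = rng.choice(STATUSES)
            history.append(apply_transition(history[-1], requested))
        if violates_paid_invariant(history):
            return history
    return None


def guarded_transition(current, requested):
    """Toy system under test: never moves an order back to 'unpaid'."""
    if requested == "unpaid" and current != "unpaid":
        return current
    return requested


def naive_transition(current, requested):
    """Toy buggy system: accepts whatever status is requested."""
    return requested
```

With these toy functions, `check_invariant(guarded_transition)` returns `None`, while `check_invariant(naive_transition)` should surface a counterexample history that can be saved as a regression test.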
3. Accepted Anomaly Catalog
This captures the messy but accepted reality of production systems:
## Known but Accepted Exceptions
- Payment success with callback failure may show frontend failure (requires manual reconciliation)
- Network instability may cause duplicate requests (backend uses idempotency keys)
- Legacy user data missing fields triggers compatibility logic
Why this matters:
- Agents naturally try to "fix" inelegant behavior
- Some system "messiness" is deliberately accepted
- Without documentation, agents will "helpfully" break production logic
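One way to make the catalog actionable is to have agents consult it before proposing a fix. The sketch below assumes a hypothetical in-memory catalog keyed by anomaly id; the ids, the `triage` function, and its return labels are invented for illustration.

```python
# Hypothetical catalog: anomaly id -> why the behavior is accepted.
ACCEPTED_ANOMALIES = {
    "payment-callback-failure": "frontend may show failure; handled by manual reconciliation",
    "duplicate-requests": "backend deduplicates with idempotency keys",
    "legacy-missing-fields": "compatibility logic handles old user records",
}


def triage(finding_id):
    """Classify a finding that an agent would otherwise try to 'fix'."""
    reason = ACCEPTED_ANOMALIES.get(finding_id)
    if reason is not None:
        return ("leave-as-is", reason)
    return ("needs-review", "not in the accepted anomaly catalog")


print(triage("duplicate-requests"))
print(triage("slow-search"))
```

Anything outside the catalog is routed to review rather than auto-fixed, which keeps the decision with a human.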
4. Authority Hierarchy
Define conflict resolution rules when specifications, tests, and human knowledge disagree:
## Truth Source Priority
1. Production behavior (logs + data)
2. Existing core tests (if any)
3. Explicitly written specifications
4. Human memory / verbal agreements
This hierarchy becomes critical for multi-agent conflict resolution.
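The priority list can be applied mechanically when two sources disagree. The following is a minimal sketch, assuming hypothetical source labels and a `Claim` record; neither comes from an existing framework.

```python
from dataclasses import dataclass

# Lower number = higher authority, mirroring the truth source priority list.
# The keys are hypothetical labels chosen for this sketch.
PRIORITY = {
    "production": 1,     # logs + data
    "core_test": 2,      # existing core tests
    "written_spec": 3,   # explicitly written specifications
    "human_memory": 4,   # human memory / verbal agreements
}


@dataclass(frozen=True)
class Claim:
    source: str      # one of the PRIORITY keys
    statement: str   # what this source says the behavior is


def resolve(claims):
    """Return the claim backed by the highest-authority source."""
    if not claims:
        raise ValueError("no claims to resolve")
    return min(claims, key=lambda claim: PRIORITY[claim.source])


winner = resolve([
    Claim("written_spec", "Only one active order allowed"),
    Claim("production", "Users can have multiple pending orders"),
])
print(winner.source, "->", winner.statement)
```

Here the production observation wins over the written specification, so the specification is what gets corrected.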
Analysis: The Three-Layer Specification Architecture
The most crucial architectural decision involves how to handle specification evolution. The answer is a layered approach that separates historical facts from future intentions.
Layer 1: Current State Spec (Frozen, Read-Only)
Purpose: Historical system behavior snapshot
spec/
├─ current/
│ ├─ 2025-01/
│ │ ├─ order.md
│ │ ├─ payment.md
│ │ └─ invariants.md
Rules:
- ✅ Only update when facts actually change, and then by adding an addendum or a new dated snapshot rather than rewriting
- ❌ No "polish" modifications
- ❌ No completeness requirements
- ✅ Explicit "uncertain/uncovered" sections allowed
Layer 2: Target Spec (Living, Modifiable)
Purpose: Desired future system behavior
spec/
├─ target/
│ ├─ order-v2.md
│ ├─ payment-refactor.md
All new requirements, refactoring goals, and behavior modifications go here. This serves as:
- Primary agent execution input
- Test design source
- Code review reference
Layer 3: Difference Documentation (Bridge)
Purpose: Explicit change management
## Diff: current → target
- Current: Users can have multiple pending orders
- Target: Only one active order allowed
- Risks:
  - Legacy data compatibility
  - Concurrent request handling
This layer provides:
- Human decision points
- Agent risk awareness
- Natural TDD test case generation
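To show how a diff entry can seed TDD work, the sketch below derives test case names from one entry: one for the target behavior and one per listed risk. The `DiffEntry` record and the naming scheme are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DiffEntry:
    """One current -> target difference (hypothetical structure)."""
    topic: str
    current: str
    target: str
    risks: list = field(default_factory=list)


def _slug(text):
    """Lowercase the text and join its words with underscores."""
    return "_".join(text.lower().replace("-", " ").split())


def derive_test_names(entry):
    """Return one test name for the target behavior plus one per risk."""
    names = [f"test_{_slug(entry.topic)}_meets_target"]
    names += [f"test_{_slug(entry.topic)}_{_slug(risk)}" for risk in entry.risks]
    return names


entry = DiffEntry(
    topic="Active orders",
    current="Users can have multiple pending orders",
    target="Only one active order allowed",
    risks=["Legacy data compatibility", "Concurrent request handling"],
)
for name in derive_test_names(entry):
    print(name)
```

Each generated name is only a stub to be filled in; the value is that every documented risk gets at least one test before any code changes.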
Why This Architecture Works
The three-layer approach solves critical problems:
- Prevents reference loss: Historical behavior remains accessible for regression testing
- Enables change tracking: Clear distinction between "what is" vs "what should be"
- Supports agent reasoning: Agents understand they're "building the future," not "explaining the past"
Implementation Strategy: Practical Rollout
Phase 1: Minimum Viable Documentation
Rather than attempting comprehensive documentation, start with immediate needs:
- Select one module requiring near-term changes
- Document only that module using the four pillars:
  - Feature boundaries
  - Invariants
  - Accepted anomalies
  - Authority hierarchy
- Create: spec/current-state.md
- Then begin: test writing and agent integration
- Repeat for subsequent modules
Phase 2: Specification Evolution Process
New requirements workflow:
- New requirement → Write target spec
- Agent/human development → Code changes + tests
- Production deployment → Behavior confirmation
- Generate new current snapshot → Add time-stamped version (don't overwrite)
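The "don't overwrite" rule in the last step can be enforced in tooling. The sketch below assumes the `spec/current/YYYY-MM/` layout and a hypothetical helper, `new_snapshot_dir`, that refuses to reuse an existing snapshot directory.

```python
from datetime import date
from pathlib import Path


def new_snapshot_dir(spec_root, when):
    """Create spec_root/current/<YYYY-MM>/ for a new current-state snapshot.

    Raises FileExistsError if that snapshot already exists, so frozen
    snapshots are never silently overwritten. The helper name and the
    monthly granularity are assumptions of this sketch.
    """
    label = when.strftime("%Y-%m")
    target = Path(spec_root) / "current" / label
    target.mkdir(parents=True, exist_ok=False)
    return target
```

For example, `new_snapshot_dir("spec", date(2025, 3, 1))` would create `spec/current/2025-03/`, and a second call for the same month would fail instead of replacing the frozen files.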
Phase 3: Error Handling
When current specs are discovered to be wrong:
Scenario A: Initial misunderstanding
## Addendum (Discovered 2025-02)
Previous description omitted:
- Under condition XXX, system actually behaves YYY
Scenario B: System actually changed
Create new snapshot: spec/current/2025-03/
Implications and Best Practices
For Multi-Agent Workflows
- Agent Specialization: Different agents can focus on different specification layers
- Conflict Resolution: Clear authority hierarchy prevents agent conflicts
- Regression Prevention: Historical specs enable "behavior preservation" agents
For Development Teams
- Reduced Context Switching: Specifications serve as shared understanding
- Safer Refactoring: Clear change boundaries and risk documentation
- Onboarding Acceleration: New team members understand both current and target states
For Legacy System Migration
- Incremental Modernization: Module-by-module specification and refactoring
- Risk Management: Explicit anomaly documentation prevents "fixing" accepted behavior
- Audit Trail: Complete change history for compliance and debugging
Code Example: Specification Template
# Order Processing Module - Current State (2025-01)
## Feature Boundaries
- Order creation: Authenticated users with valid payment methods
- Order modification: Within 30-minute window, status-dependent
- Order cancellation: User-initiated or system timeout
## Business Invariants
- Order.total_amount == sum(OrderItem.price * OrderItem.quantity)
- Order.status transitions: draft → pending → paid → fulfilled
- Cancelled orders cannot transition to any other status
## Known Anomalies
- Mobile app may show "processing" for up to 5 minutes after payment
- Concurrent order attempts may create duplicate drafts (cleaned by daily job)
## Authority Hierarchy
1. Production order_events table
2. Existing payment integration tests
3. This specification document
4. Product team requirements
Conclusion
Successfully implementing spec-driven development with multi-agent systems requires treating documentation not as a one-time effort, but as foundational infrastructure. The key insights are:
- Documentation must serve agents first, humans second: Agents require explicit, structured information that humans can often infer
- Historical accuracy trumps completeness: Better to have accurate partial documentation than comprehensive but incorrect specifications
- Layered specifications enable evolution: Separating "what is" from "what should be" prevents the loss of crucial historical context
The three-layer specification architecture—current state (frozen), target state (living), and difference documentation (bridge)—provides a robust foundation for multi-agent workflows while maintaining the historical context necessary for safe system evolution.
For teams facing similar challenges with legacy systems, the recommendation is clear: invest in structured current-state documentation first. It's not just documentation—it's the foundation that makes everything else possible.
The bottom line: Spec-driven development doesn't begin with writing specifications—it begins with understanding what you actually have.
