r/SpecDrivenDevelopment


OpenSpec template — spec-driven dev for fork-and-go

GitHub repo: https://github.com/arananet/openspec-template

Template I use for every new project. Core rule: every feature/bugfix needs a YAML spec (acceptance criteria + test plan) before code. Enforced by a pre-commit hook, a deterministic CI check, and an agentic spec-vs-code review.
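Cut-down illustration of the shape (simplified; the schema in the repo is the source of truth — acceptance criteria and test plan are the required parts, the other keys here are just for flavour):

```yaml
# Simplified illustration; see the repo's schema for the real field names.
id: FEAT-0042
title: Rate-limit the login endpoint
type: feature
acceptance_criteria:
  - Requests beyond 5/min per IP receive HTTP 429
  - Successful logins within the limit are unaffected
test_plan:
  - unit: limiter returns 429 on the 6th request in a window
  - integration: login flood test asserts 429 plus a Retry-After header
```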

Setup is one command (bash setup.sh).

When you open the fork in Claude Code, it reads CLAUDE.md, interviews you for project details, customizes the README, and scaffolds your first spec. Same instructions apply to Codex CLI and Copilot via AGENTS.md and .github/copilot-instructions.md.

What's in the box: CodeQL, gitleaks, dep-review, OSSF Scorecard, SBOM + cosign signing + SLSA provenance on releases, DCO, doc-drift check, lint stack, Dependabot auto-merge for patches, cost-capped AI workflows, optional CODEOWNER-gated issue auto-fix agent.

Local scripts/openspec CLI (pure bash) handles scaffold/check — no external dependency.
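For flavour, the kind of deterministic check this boils down to (illustrative sketch, not the actual script; see the repo for the real thing):

```bash
#!/usr/bin/env bash
# Sketch of a deterministic spec check: every spec file must contain the
# required top-level keys. Layout and key names assumed for illustration.
set -euo pipefail
shopt -s nullglob

fail=0
for spec in specs/*.yaml; do
  for key in id title acceptance_criteria test_plan; do
    if ! grep -q "^${key}:" "$spec"; then
      echo "FAIL: $spec missing required key '$key'" >&2
      fail=1
    fi
  done
done
exit "$fail"
```

Because it's just grep over files, a check like this gives the same answer locally in the pre-commit hook and in CI.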

MIT, feedback welcome.

u/arananet — 5 days ago

I'm a tech lead planning to roll out coding agents (Claude Code, Codex, Copilot CLI) with Spec-Driven Development as the methodology around them. The challenge is that almost every SDD example I've found (spec-kit, Kiro, the usual blog posts) assumes a single team working in a single repo. My reality is quite different:

  • A platform built from many components like UI, HTTP services, event-driven consumers, warehouse and cache, all of them spread across multiple repos.
  • Ownership split across several teams. My team owns some backend components but not all.
  • Most interesting features are vertical slices: they touch the UI, a couple of services we own, an event consumed by another team, and sometimes a data contract owned by the data platform team.

The naive SDD flow assumes the agent can see the whole slice and one team can ship it. Neither is true here. What I'm trying to figure out is how SDD actually works when a single feature is spread across repos and teams that can't be coordinated by an agent alone.

If you've done this for real, I'd love to hear how you handle:

  1. Where the spec lives. One spec for the whole slice, or one per component with a coordinating doc on top? Who owns it, who signs off on it?
  2. Cross-repo context. How does the agent reason about a slice when half the relevant code lives in a repo it can't see, owned by a team it can't talk to?
  3. Coordinating contract changes. When the slice requires a new event or API owned by another team, how does that negotiation flow? Does it block the work, run in parallel, get stubbed?
  4. Sequencing. Who builds first, who deploys first, how do you avoid "we shipped, they aren't ready"?
  5. What didn't work. Approaches you tried and abandoned, frameworks that demoed well and fell over, anti-patterns worth warning me away from.

Not looking for theory; looking for real examples and experiences.

u/Double_Appearance741 — 13 days ago

Hello!

I've been experimenting with SDD quite a lot in the past months, on tasks ranging from simple to complex, to try to use agentic coding in a way that results in a more efficient dev workflow. I see a lot of benefits apart from efficiency, such as better-documented features and higher-quality outputs, mainly due to spending more time planning and analysing the tasks.

At the same time, are there any studies showing empirical results of this workflow, or is it too early to ask? Would be interesting to see what the average net effect is, especially given the increased cost of AI usage.

I strongly believe in this new workflow from my own experience, but I can also see new bottlenecks popping up, especially during review. Still, it'd be interesting to see some hard facts.

u/cajmorgans — 11 days ago

I like OpenSpec, but I don't buy the feeling it gives that specs can be kept updated at scale.

For me, specs are disposable. What isn't disposable, and should be maintained, is the set of requirements. The difference is that requirements explain what needs to be done, while specs say how it will be implemented and tested.

I don't know of any SDD framework that maintains a set of requirements aligned with the code. Do you?
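To make the distinction concrete, a hypothetical sketch (file names and keys are made up):

```yaml
# requirements/login.yaml -- long-lived, maintained alongside the code
REQ-007: Lock the account after 5 failed login attempts within 15 minutes.

# specs/2024-06-lockout.yaml -- disposable, describes one implementation
implements: REQ-007
approach: Redis counter keyed by user id with a 15-minute TTL
test_plan:
  - unit: counter increments on failure and resets on success
  - integration: the 6th failed attempt returns HTTP 423
```

The requirement stays true for as long as the product behaves that way; the spec can be thrown away once the change ships.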

u/stibbons_ — 11 days ago

In this video, I walk through a custom OpenSpec schema that formally captures Architectural Decision Records (ADRs) and preserves them in a persistent folder. This ensures that every new change proposal "reads" your previous tech choices (like moving from Server Side Rendering to a split frontend/backend) before suggesting new designs. Would love to hear your thoughts and feedback.
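For those who'd rather skim than watch, the general shape of an ADR record in this style might be (illustrative only; the video shows the actual schema):

```yaml
# Hypothetical ADR record kept in a persistent folder (e.g. adrs/) so that
# new change proposals read prior decisions first. Field names invented;
# the schema in the video may differ.
id: ADR-0003
title: Move from server-side rendering to a split frontend/backend
status: accepted
context: SSR coupled UI releases to backend deploys and slowed both teams.
decision: Serve the UI as a static SPA; expose the backend as a versioned API.
consequences:
  - UI and backend deploy independently
  - Requires CORS setup and an API versioning policy
```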

u/harikrishnan_83 — 11 days ago

I wrote a piece about "agent harnesses": the "inner" harness (the coding agent / assistant), the "outer" harness (all the parts *you* bring to it), and the piece I feel is missing (and am building, free + open source). If you find this information helpful, please subscribe (link below).

>If the inner harness provides a set of core capabilities, the outer harness is everything you bring to it. Böckeler's framework breaks it into two categories: feedforward controls and feedback controls.

>Feedforward controls, or "guides", are everything that shapes behavior before the agent acts, with the goal of preventing mistakes before they happen. They come in several flavors:

>Guidance - CLAUDE.md files, architecture docs, coding conventions. Either auto-loaded by the agent or indexed so the agent reads them on demand when relevant.

>Skills - reusable procedures the agent activates based on their description matching the task.

>Specs - instructions the human explicitly tells the agent to read and follow. (I wrote about spec-driven development in Think Before You Prompt.)

>On the other side are feedback controls - post-action observers "optimised for LLM consumption." (She calls these "sensors.") Deterministic feedback comes from tools with fixed, repeatable outputs: linters, type checkers, test runners, build scripts. LLM-based feedback uses a second model to evaluate what the first model produced: code reviewers, spec-compliance checkers, evaluator agents - or the agent itself closing what Osmani calls the "self-verification loop" by observing its own output through a browser or screenshot tool.

>Deterministic feedback catches what rules can express; LLM-based feedback catches what only judgment can - architectural drift, spec misinterpretation, subtle regressions. Boris Cherny, creator of Claude Code, noted that giving the model a way to verify its work improves quality by 2–3×. The practitioner heuristic "hooks over prompts for reliability" is a statement about preferring feedback over feedforward - feedback doesn't depend on the agent's attention. My Agent Validator tool is a configurable feedback loop runner for both types - deterministic checks and LLM-based reviews.

>Two other pieces round out the outer harness: persistent memory and codebase preparation. Without cross-session recall, every conversation starts cold - the agent re-learns your codebase, your conventions, your past mistakes. And agents perform dramatically better on clean, well-structured code - the outer harness isn't only what you configure, it's also what you've already cleaned up.

>All four connect through the steering loop: "Whenever an issue happens multiple times, the [harness] should be improved to make the issue less probable to occur in the future, or even prevent it." When something goes wrong, you can improve a feedforward control (prevent it next time), add a feedback control (catch it next time), save it to memory (so the agent doesn't repeat it across sessions), or clean up the code that confused the agent in the first place - or some combination of the above.

>The human's job is the steering loop - channeling what goes wrong into better feedforward, feedback, memory, and code. I wrote about what the human actually does in issue #3.

>This is becoming one of the main functions of the human software engineering role - cultivating the harness.

Full link: https://codagent.beehiiv.com/p/harnesses-explained
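To make the deterministic side concrete, a minimal sketch of a feedback-loop runner (placeholder commands, not Agent Validator's actual interface):

```bash
#!/usr/bin/env bash
# Minimal deterministic feedback loop: run fixed-output checkers and emit a
# compact report the agent can consume. Commands are placeholders; swap in
# your project's real linter, type checker, and test runner.
set -uo pipefail

declare -A checks=(
  [lint]="npm run lint"
  [types]="npm run typecheck"
  [tests]="npm test"
)

status=0
for name in "${!checks[@]}"; do
  if out=$(${checks[$name]} 2>&1); then
    echo "PASS $name"
  else
    status=1
    echo "FAIL $name"
    # Keep output short so it fits in the agent's context window.
    echo "$out" | tail -n 20
  fi
done
exit "$status"
```

Hooks like this are reliable precisely because they don't depend on the agent's attention: the script runs the same way every time, and the agent only has to read the report.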

u/paulcaplan — 12 days ago