Agent on Morgoth

Spec-Driven Development

Wed, 17 Jun 2026 10:00:00 +0800

Spec-Driven Development (SDD) is a methodology for building software with AI coding agents where the specification, not the code, is the primary artifact. Instead of prompting an agent ad-hoc and hoping the output is correct, you first write down what you want — requirements, scenarios, constraints — and let the spec drive the plan, the tasks, and finally the implementation.

The core idea: AI agents are powerful but unpredictable. A natural-language prompt is lossy and easy to misread, and the larger the change, the more drift accumulates. A spec turns intent into a reviewable, version-controlled, executable contract. Humans and agents align before code is written, and the spec stays as living documentation afterwards.

Why SDD

The classic “vibe coding” loop (prompt → diff → fix → prompt again) breaks down as soon as the task is non-trivial:

Ambiguity. The agent fills gaps with assumptions you never see.
Drift. Multi-step changes lose the original intent halfway through.
No source of truth. Once the chat is gone, why the code looks this way is gone too.
Hard to review. Reviewing a 600-line diff is much harder than reviewing the one-page spec that produced it.

SDD addresses these by inserting an explicit, cheap-to-review specification step in front of code generation. The phases are roughly:

Specify — what to build and why (requirements, user stories, scenarios).
Plan / Design — the technical approach, architecture, trade-offs.
Tasks — a checklist of small, independently verifiable units of work.
Implement — the agent executes tasks against the spec.
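The gating idea behind these phases can be sketched in a few lines of Python. This is only an illustration: the `Feature` class, its `approve` method, and the `password-reset` name are hypothetical and not taken from any of the tools below.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Phase names taken from the list above, in the order they are reviewed.
PHASES = ["specify", "plan", "tasks", "implement"]


@dataclass
class Feature:
    """Hypothetical record of one feature moving through the SDD phases."""

    name: str
    approved: list[str] = field(default_factory=list)  # phases reviewed so far

    def next_phase(self) -> str | None:
        """Return the first phase that has not been approved yet."""
        for phase in PHASES:
            if phase not in self.approved:
                return phase
        return None

    def approve(self, phase: str) -> None:
        """Approve a phase, refusing to skip ahead of an unreviewed one."""
        expected = self.next_phase()
        if phase != expected:
            raise ValueError(f"cannot approve {phase!r}; next phase is {expected!r}")
        self.approved.append(phase)


feature = Feature("password-reset")
feature.approve("specify")
feature.approve("plan")
print(feature.next_phase())  # tasks
```

Trying `feature.approve("implement")` at this point raises `ValueError`, which is the whole point: implementation cannot start until the task breakdown has been reviewed.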

The tools below differ mainly in how heavy this process is and whether the spec is a living document or a throwaway scaffold.

GitHub Spec Kit

https://github.com/github/spec-kit

GitHub’s official, opinionated take on SDD. It ships a CLI (specify, installed via uv) that bootstraps a project and wires a full set of slash commands into your agent of choice.

The workflow is a sequence of gated phases:

/speckit.constitution — establish project principles and governance.
/speckit.specify — define requirements and user stories.
/speckit.plan — create the technical implementation strategy.
/speckit.tasks — generate an actionable task breakdown.
/speckit.implement — execute the tasks systematically.

Plus optional helpers: /speckit.clarify (resolve ambiguities before planning), /speckit.analyze (cross-artifact consistency checks), and /speckit.checklist (custom quality gates).

uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
specify init my-project
specify integration list # 30+ supported agents

Pros

Backed by GitHub; mature, well-documented, actively developed.
Genuinely thorough — the constitution + clarify + analyze steps catch ambiguity and inconsistency early.
Technology-agnostic; supports 30+ agents (Copilot, Claude Code, Gemini CLI, Cursor, Codex CLI, …).
TDD-compatible task generation and parallel-task markers.

Cons

Heavyweight. The full phase sequence is overkill for small changes.
Requires Python 3.11+, uv, and Git just to bootstrap.
Strongly greenfield-flavored; the constitution/specify flow assumes you are starting something new rather than evolving an existing codebase.

Best for: new projects, larger features, and teams that want a rigorous, repeatable process with explicit governance and review gates.

OpenSpec

https://github.com/Fission-AI/OpenSpec

A lightweight, change-oriented alternative. Rather than treating the spec as a one-shot scaffold, OpenSpec maintains living specs and models work as a series of change proposals layered on top of them.

The repository keeps two key areas:

specs/ — the current, agreed-upon truth (what the system does today).
changes/ — proposed deltas (a proposal, spec changes, design notes, and a task checklist) for work in flight.

The loop is intentionally small:

/opsx:propose — open a new change proposal.
/opsx:apply — implement the proposed tasks.
/opsx:archive — fold a completed change back into the specs and archive it.

npx @fission-ai/openspec@latest init # also: openspec update to refresh agent instructions
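To make the specs/changes split concrete, here is a toy in-memory model of archiving a change. It is a sketch under loud assumptions: `Change`, `archive`, and the added/modified/removed fields are hypothetical names for illustration, and real OpenSpec works on files under specs/ and changes/, not Python dicts.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Change:
    """Hypothetical stand-in for one proposal under changes/."""

    name: str
    added: dict[str, str] = field(default_factory=dict)     # new requirements
    modified: dict[str, str] = field(default_factory=dict)  # replaced requirements
    removed: list[str] = field(default_factory=list)        # retired requirements


def archive(specs: dict[str, str], change: Change) -> dict[str, str]:
    """Fold a completed change into the specs and return the new truth."""
    merged = dict(specs)  # leave the caller's specs untouched
    for key in change.removed:
        merged.pop(key, None)
    for key, text in change.modified.items():
        if key not in merged:
            raise KeyError(f"cannot modify unknown requirement {key!r}")
        merged[key] = text
    merged.update(change.added)
    return merged


specs = {"login": "Users sign in with email and password."}
change = Change(
    name="add-2fa",
    added={"two-factor": "Users confirm sign-in with a one-time code."},
    modified={"login": "Users sign in with email and password, then a second factor."},
)
specs = archive(specs, change)
print(sorted(specs))  # ['login', 'two-factor']
```

The design choice worth noticing is that the specs only change at archive time, so specs/ keeps describing what the system does today while a change is still in flight.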

Pros

Lightweight and fluid — iterates without rigid phase gates.
At home in brownfield codebases as well as greenfield ones; the specs/changes split is built for evolving existing systems.
Specs are durable living documentation, not throwaway scaffolding.
Node-based, minimal setup; integrates with 25+ tools.
MIT licensed.

Cons

Younger and smaller community than Spec Kit.
Fewer built-in quality gates — the discipline of clarify/analyze is on you.
The change/archive model takes a little practice to internalize.

Best for: ongoing work on existing codebases, incremental features, and developers who want alignment-before-code without ceremony.

Other notable tools

Two more worth knowing about, since they take meaningfully different shapes:

Amazon Kiro

https://kiro.dev/

An agentic IDE (rather than a CLI add-on) with SDD as a first-class feature. From a prompt it generates three artifacts — requirements.md (EARS-style acceptance criteria), design.md, and tasks.md — and keeps them in sync as you build. Pros: integrated end-to-end experience, structured requirements, strong onboarding. Cons: it’s a whole IDE to adopt and is more of a walled garden than the editor-agnostic CLIs above. Best for: developers willing to switch IDEs for a tightly integrated spec-to-code workflow.

BMAD-METHOD

https://github.com/bmad-code-org/BMAD-METHOD

“Breakthrough Method of Agile AI-Driven Development.” Less a spec format and more an agentic team: specialized agents (Analyst, PM, Architect, Scrum Master, Dev) collaborate to produce a PRD and architecture, then shard them into hyper-detailed story files that carry full context into implementation. Pros: rich, role-based planning; great for complex, multi-feature products. Cons: the most ceremony of the bunch; a real learning curve. Best for: ambitious greenfield products where up-front planning genuinely pays off.

How to choose

Tool	Weight	Greenfield	Brownfield	Setup	Living specs
Spec Kit	Heavy	Excellent	OK	Python + uv	Per-feature
OpenSpec	Light	Good	Excellent	Node	Yes (specs/)
Kiro	Medium	Excellent	Good	Dedicated IDE	Yes
BMAD-METHOD	Heavy	Excellent	OK	Node/agents	PRD + stories

A practical rule of thumb:

Starting a new project and want rigor → Spec Kit.
Evolving an existing codebase and want speed → OpenSpec.
Want an all-in-one IDE experience → Kiro.
Building an ambitious product with heavy planning needs → BMAD-METHOD.
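The same rule of thumb written as a function, purely as a memory aid. The `suggest_tool` name, its parameters, and the order in which the questions are asked are assumptions of this sketch, not part of any tool.

```python
def suggest_tool(greenfield: bool, wants_ide: bool = False, heavy_planning: bool = False) -> str:
    """Map the rule of thumb above onto a single suggestion."""
    if wants_ide:
        return "Kiro"          # all-in-one IDE experience
    if heavy_planning:
        return "BMAD-METHOD"   # ambitious product, heavy planning needs
    # Otherwise the deciding question is new project vs. existing codebase.
    return "Spec Kit" if greenfield else "OpenSpec"


print(suggest_tool(greenfield=False))  # OpenSpec
```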

Spec Kit, OpenSpec, and BMAD-METHOD are open source, while Kiro is a proprietary IDE; all four are quick enough to try in an afternoon. The common thread, and the real takeaway, is this: write the spec first, review the spec (not the diff), and let intent, not vibes, drive the agent.

Resources

https://github.com/github/spec-kit

https://github.com/Fission-AI/OpenSpec

https://kiro.dev/

https://github.com/bmad-code-org/BMAD-METHOD

Agent

Sat, 19 Jul 2025 10:50:10 +0800

What is an agent

An AI agent is an LLM wrapped in a loop that can act, not just answer. The model decides which tool to call, observes the result, and repeats until the goal is met. The essential ingredients are:

A model — the reasoning engine.
Tools — functions it can call (run a command, read a file, search the web).
A loop — plan → act → observe → re-plan, until done.
Context / memory — the task, prior steps, and relevant project knowledge.

A plain chatbot returns text. An agent edits files, runs tests, fixes the failures, and reports back — it closes the loop on real work.
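The four ingredients fit in a short Python sketch. Everything here is a stand-in: `scripted_model` replaces a real LLM with hard-coded decisions, and the two tools are toy functions chosen only so the example runs without any API.

```python
from __future__ import annotations

from typing import Callable

# Tools: plain functions the loop is allowed to call (hypothetical examples).
TOOLS: dict[str, Callable[[str], str]] = {
    "word_count": lambda text: str(len(text.split())),
    "upper": lambda text: text.upper(),
}


def scripted_model(goal: str, history: list[tuple[str, str]]) -> tuple[str, str]:
    """Stand-in for an LLM: choose the next action from the goal and past observations."""
    if not history:
        return ("upper", goal)        # first step
    if len(history) == 1:
        return ("word_count", goal)   # second step
    summary = ", ".join(f"{tool}={result}" for tool, result in history)
    return ("finish", summary)        # enough observations, stop


def run_agent(goal: str, max_steps: int = 5) -> str:
    """Plan, act, observe, and re-plan until the model finishes or steps run out."""
    history: list[tuple[str, str]] = []                    # context / memory
    for _ in range(max_steps):                             # the loop
        action, argument = scripted_model(goal, history)   # plan
        if action == "finish":
            return argument
        observation = TOOLS[action](argument)              # act
        history.append((action, observation))              # observe
    return "stopped: step limit reached"


print(run_agent("ship the fix"))  # upper=SHIP THE FIX, word_count=3
```

The `max_steps` cap is a common safeguard in loops like this: without it, a model that never decides to finish would run forever.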

What is an agent plugin

A plugin packages reusable capability so you don’t re-teach the agent every time. Across today’s tools a plugin typically bundles some of:

Slash commands — custom prompts/workflows you invoke by name.
Subagents — specialized helpers the main agent can delegate to.
Skills — domain knowledge loaded on demand.
Hooks — scripts that fire on events (e.g. before a commit).
MCP servers — connections to external systems (see MCP).

Plugins make an agent extensible and shareable: install one and the agent gains new commands and integrations instantly.

Three mainstream agents

Agent	Form	Strength	Trade-off
Claude Code	CLI / IDE	Deep autonomy, rich plugin + MCP + skills ecosystem	Terminal-first; needs a Claude subscription/API
Cursor	Full IDE (VS Code fork)	Best inline editing UX, multi-model, great for exploration	You switch editors; agent autonomy is shallower
GitHub Copilot	IDE extension + coding agent	Native GitHub/PR integration, huge install base	Less open/customizable, ecosystem-locked

The differences in short:

Claude Code is the most agentic and the most open to extend — plugins, skills, hooks and MCP make it programmable. Best when you want the agent to drive multi-step work autonomously.
Cursor wins on editing experience. It’s an IDE, so completions, diffs and context-picking feel native. Best for hands-on, in-editor coding.
Copilot wins on integration and ubiquity. It lives where your code already is (GitHub, VS Code, JetBrains) and its coding agent can take an issue to a PR. Best for teams already standardized on GitHub.

There is no single winner — pick by how much autonomy vs. inline control you want, and how much you care about extensibility vs. zero-setup integration.

Developing a plugin

Using Claude Code’s plugin system as the concrete example, a plugin is just a directory with a manifest plus the components it adds:

my-plugin/
├── .claude-plugin/plugin.json # name, version, description
├── commands/ # slash commands (markdown prompts)
├── agents/ # subagent definitions
├── skills/ # on-demand knowledge
├── hooks/ # event scripts
└── .mcp.json # MCP servers to wire in

The workflow is roughly:

Scaffold the directory and plugin.json manifest.
Add a command — a markdown file whose body is the prompt the agent runs.
Optionally add subagents, skills, hooks, or MCP servers.
Install it locally to test, then publish via a marketplace/repo so others can install it.

You’re not writing model code — you’re writing prompts, configuration, and small scripts that compose into reusable behavior.
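The first two workflow steps can be automated with a small script. This is a sketch only: the manifest uses just the three fields named in the tree above, while the `summarize.md` command, its prompt text, and the `scaffold_plugin` helper are hypothetical. Check the Claude Code plugin documentation for the full manifest schema.

```python
import json
import tempfile
from pathlib import Path


def scaffold_plugin(root: Path, name: str, description: str) -> Path:
    """Create a plugin directory with a manifest and one example command."""
    plugin_dir = root / name
    manifest_dir = plugin_dir / ".claude-plugin"
    commands_dir = plugin_dir / "commands"
    manifest_dir.mkdir(parents=True, exist_ok=True)
    commands_dir.mkdir(parents=True, exist_ok=True)

    # Manifest with the fields listed in the directory tree above.
    manifest = {"name": name, "version": "0.1.0", "description": description}
    (manifest_dir / "plugin.json").write_text(json.dumps(manifest, indent=2) + "\n")

    # Hypothetical command: the markdown body is the prompt the agent runs.
    (commands_dir / "summarize.md").write_text(
        "Summarize the changes on the current branch in five bullet points.\n"
    )
    return plugin_dir


# Scaffold into a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp:
    plugin = scaffold_plugin(Path(tmp), "my-plugin", "Example plugin scaffold")
    files = sorted(p.relative_to(plugin).as_posix() for p in plugin.rglob("*") if p.is_file())
    print(files)  # ['.claude-plugin/plugin.json', 'commands/summarize.md']
```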

Resources

https://docs.claude.com/en/docs/claude-code

https://cursor.com/

https://github.com/features/copilot

MCP

Sat, 19 Jul 2025 10:50:06 +0800

What is MCP

MCP (Model Context Protocol) is an open standard, introduced by Anthropic in late 2024, for connecting AI agents to external tools and data. Think of it as “USB-C for AI”: instead of every agent writing a bespoke integration for every service, a tool exposes itself once as an MCP server, and any MCP client (Claude Code, Cursor, Copilot, …) can use it.

It’s a client–server protocol that exchanges JSON-RPC 2.0 messages over stdio or HTTP. A server can expose three kinds of capability:

Tools — actions the model can invoke (query a DB, create an issue).
Resources — data the model can read (files, records, docs).
Prompts — reusable prompt templates the user can trigger.

The win is decoupling: build an integration once, use it from any agent.
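On the wire, a tool invocation is an ordinary JSON-RPC request and response. The sketch below builds both with the standard library; the field names reflect my reading of the MCP specification and should be checked against it, and the `add` tool and helper function names are only examples.

```python
import json


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 request asking a server to run one tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


def text_result(request_id: int, text: str) -> str:
    """Serialize the matching response carrying a single text content item."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,  # same id as the request, so the client can pair them
        "result": {"content": [{"type": "text", "text": text}]},
    }
    return json.dumps(message)


print(tool_call_request(1, "add", {"a": 2, "b": 3}))
print(text_result(1, "5"))
```

Because every server speaks this same envelope, a client needs no service-specific code to call a new tool, which is where the decoupling comes from.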

Popular open-source MCP servers

A few widely used, open-source servers to get started with:

Filesystem — read/write local files within allowed paths.
Git / GitHub / GitLab — inspect repos, manage issues and PRs.
Postgres / SQLite — run queries against a database.
Fetch — retrieve and convert web pages to text.
Puppeteer / Playwright — drive a real browser for testing and scraping.
Slack — read and post messages.
Memory — a persistent knowledge graph across sessions.

The reference servers live in the official repo, and the community has built hundreds more (cloud providers, search, monitoring, etc.).

Developing an MCP server

The fastest path is an official SDK, such as the Python SDK (which includes FastMCP) or the TypeScript SDK. A minimal Python server:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("demo")


@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


if __name__ == "__main__":
    mcp.run()  # speaks MCP over stdio

The steps are:

Create the server and declare tools (functions), resources, and prompts — docstrings and type hints become the schema the model sees.
Pick a transport: stdio for local tools, HTTP for remote/shared servers.
Register it with your client (e.g. an .mcp.json entry or claude mcp add).
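For the registration step, a project-level .mcp.json entry for a stdio server can be generated like this. The `mcpServers` / `command` / `args` layout is what I understand Claude Code to expect; `server.py` and the `mcp_config` helper are hypothetical names, so treat this as a sketch and confirm against your client's documentation.

```python
import json


def mcp_config(name: str, command: str, args: list) -> str:
    """Render a config that tells the client how to launch a stdio MCP server."""
    config = {"mcpServers": {name: {"command": command, "args": args}}}
    return json.dumps(config, indent=2)


# Assumes the minimal server above was saved as server.py (hypothetical filename).
print(mcp_config("demo", "python", ["server.py"]))
```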

Clear tool names, tight descriptions, and small focused tools matter more than quantity — that’s what lets the model pick the right action reliably.
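To see why names and descriptions carry so much weight, it helps to look at roughly what a model is shown for each tool. The sketch below derives a tool description from a function's signature and docstring using only the standard library. It is a simplified approximation, not the SDK's actual implementation, and `describe_tool` and `JSON_TYPES` are hypothetical names.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python types to JSON Schema type names.
JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def describe_tool(func) -> dict:
    """Build a name / description / input schema record from a plain function."""
    hints = get_type_hints(func)
    hints.pop("return", None)  # only parameters belong in the input schema
    params = inspect.signature(func).parameters
    properties = {
        name: {"type": JSON_TYPES.get(hints.get(name), "string")}  # fall back to string
        for name in params
    }
    required = [
        name for name, param in params.items()
        if param.default is inspect.Parameter.empty
    ]
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "inputSchema": {"type": "object", "properties": properties, "required": required},
    }


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


print(describe_tool(add))
```

A vague function name or an empty docstring ends up verbatim in this record, and the record is all the model has to go on when it chooses between tools.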

Resources

https://modelcontextprotocol.io/

https://github.com/modelcontextprotocol/servers

https://github.com/modelcontextprotocol/python-sdk