
AgentOps vs GSD (Get Shit Done)

GSD v1.34 is a full-featured spec-driven development framework for AI coding agents. It installs as slash commands into 7 runtimes (Claude Code, Gemini CLI, OpenCode, Codex, Copilot, Cursor, Antigravity) and solves "context rot" by spawning fresh-context subagents for each task. Trusted by engineers at Amazon, Google, Shopify, and Webflow.

Comparison updated April 2026. See the GSD repo for current features.


At a Glance

| Aspect | GSD | AgentOps |
|---|---|---|
| Philosophy | "Ship fast: fresh context per agent" | "Operational layer for coding agents; technically a context compiler" |
| Core strength | Multi-agent orchestration with context isolation, multi-runtime support | Cross-session memory, validation gates, knowledge flywheel |
| GitHub | glittercowboy/get-shit-done | boshu2/agentops |
| Latest | v1.34.2 (April 2026) | v2.37.2 (April 2026) |
| Scale | 53 commands, 46 workflows, 16 agents | 50+ skills, compiled CLI, hooks, schemas |
| Primary use | Spec-driven development with phased execution | Ongoing codebase work with persistent memory |

What GSD Does Well

1. Fresh Context Per Agent

This is GSD's core innovation. Every spawned agent gets a clean 200K-token context window; orchestrators stay thin and agents are disposable. This avoids context rot, the quality degradation that sets in as a model fills its context window during a long session.

2. Wave-Based Parallel Execution

Plans are grouped into dependency waves. Plans within a wave run in parallel (each with a fresh agent); waves run sequentially. Execution includes STATE.md file locking with atomic creation and spin-wait jitter.

```text
Wave 1: [Plan A, Plan B, Plan C]  ← parallel, fresh 200K each
           ↓ (all complete)
Wave 2: [Plan D, Plan E]          ← parallel, depends on Wave 1
           ↓ (all complete)
Wave 3: [Plan F]                  ← sequential, depends on Wave 2
```
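The wave grouping above can be sketched as a layered topological sort. This is a minimal illustration, not GSD's implementation: the function names `group_into_waves` and `run_waves` are hypothetical, and threads stand in for fresh-context subagents.

```python
from concurrent.futures import ThreadPoolExecutor


def group_into_waves(deps):
    """Layer plans so each wave depends only on earlier waves.

    deps maps a plan name to the set of plan names it depends on.
    """
    remaining = {plan: set(needs) for plan, needs in deps.items()}
    done = set()
    waves = []
    while remaining:
        # A plan is ready once all of its dependencies have completed.
        ready = sorted(p for p, needs in remaining.items() if needs <= done)
        if not ready:
            raise ValueError(f"cycle or unknown dependency among {sorted(remaining)}")
        waves.append(ready)
        done.update(ready)
        for plan in ready:
            del remaining[plan]
    return waves


def run_waves(waves, run_plan):
    """Run the plans of a wave concurrently; start the next wave when all finish."""
    results = {}
    for wave in waves:
        with ThreadPoolExecutor(max_workers=len(wave)) as pool:
            # pool.map preserves input order, so zip pairs each plan with its result.
            results.update(zip(wave, pool.map(run_plan, wave)))
    return results


# Same dependency shape as the diagram above.
plans = {
    "A": set(), "B": set(), "C": set(),
    "D": {"A", "B"}, "E": {"C"},
    "F": {"D", "E"},
}
waves = group_into_waves(plans)
print(waves)  # [['A', 'B', 'C'], ['D', 'E'], ['F']]
```

Leaving the `with` block waits for every plan in the wave, which is what gives the "all complete" barrier between waves.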

3. Model Cost Tiers

Four profiles (quality/balanced/budget/inherit) with per-agent model assignments. Each profile maps agents to opus/sonnet/haiku. This means routine plan-checking can run on budget models while critical execution stays on quality models.
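As a rough illustration of per-agent model assignment, the sketch below resolves a model from a profile and an agent role. The agent names, the specific assignments, and the reading of `inherit` as "use the session's current model" are all assumptions made for the example; GSD's actual configuration may differ.

```python
# Hypothetical profile table: agent role -> model tier. Not GSD's real mapping.
PROFILES = {
    "quality":  {"planner": "opus",   "executor": "opus",   "plan-checker": "sonnet"},
    "balanced": {"planner": "opus",   "executor": "sonnet", "plan-checker": "sonnet"},
    "budget":   {"planner": "sonnet", "executor": "sonnet", "plan-checker": "haiku"},
}


def resolve_model(profile, agent, session_model):
    """Pick the model for an agent under the given profile."""
    if profile == "inherit":
        # Assumption: "inherit" defers to whatever model the session already uses.
        return session_model
    try:
        return PROFILES[profile][agent]
    except KeyError as exc:
        raise ValueError(f"unknown profile or agent: {profile}/{agent}") from exc


print(resolve_model("budget", "plan-checker", session_model="opus"))  # haiku
print(resolve_model("inherit", "executor", session_model="sonnet"))   # sonnet
```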

4. Auto-Repair on Task Failure

When a task fails during execution, GSD auto-classifies the failure as RETRY (with adjustment), DECOMPOSE (break into sub-steps), or PRUNE (remove and escalate). Budget-controlled with a default of 2 attempts. This is structured recovery, not blind retries.
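A budgeted repair loop of this kind might look like the sketch below. The classification heuristics (`transient`, `too_broad`), the `Failure` type, and the choice to count the budget per top-level task are assumptions for illustration only; the source states the three outcomes and the default budget of 2, not how failures are classified.

```python
from dataclasses import dataclass
from enum import Enum


class Repair(Enum):
    RETRY = "retry"
    DECOMPOSE = "decompose"
    PRUNE = "prune"


@dataclass
class Failure:
    transient: bool  # hypothetical signal: the failure may pass on a second try
    too_broad: bool  # hypothetical signal: the task should be split up


def classify(failure):
    """Map a failure to a repair action (illustrative heuristics only)."""
    if failure.transient:
        return Repair.RETRY
    if failure.too_broad:
        return Repair.DECOMPOSE
    return Repair.PRUNE


def run_with_repair(task, execute, split, budget=2):
    """execute(task) returns None on success or a Failure; split(task) returns sub-steps."""
    queue = [task]
    repairs_used = 0
    log = []
    while queue:
        current = queue.pop(0)
        failure = execute(current)
        if failure is None:
            log.append((current, "ok"))
            continue
        if repairs_used >= budget:
            # Budget exhausted: stop repairing and hand the task to a human.
            log.append((current, "escalated"))
            continue
        repairs_used += 1
        action = classify(failure)
        if action is Repair.RETRY:
            queue.insert(0, current)
        elif action is Repair.DECOMPOSE:
            queue[0:0] = split(current)
        else:
            log.append((current, "pruned"))
    return log
```

The budget is what separates this from blind retries: a task that keeps failing is escalated after two repair attempts instead of looping.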

5. Comprehensive Validation Pipeline

Validation is no longer just "human verification". GSD v1.34 has:

  • 8-dimension plan checker (max 3 iterations)
  • Nyquist validation audit (test coverage mapping)
  • Post-execution verification agent
  • User acceptance testing with auto-diagnosis
  • Cross-phase verification debt audit
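The "max 3 iterations" bound on the plan checker suggests a check-and-revise loop. The sketch below shows that shape under stated assumptions: the two check dimensions and the `revise` callback are hypothetical stand-ins, not GSD's eight dimensions or its revision logic.

```python
def check_plan(plan, checks, revise, max_iterations=3):
    """Run every check; revise and re-check until clean or the iteration cap is hit.

    Returns (plan, iterations_used, names_of_failed_checks).
    """
    if max_iterations < 1:
        raise ValueError("max_iterations must be at least 1")
    for iteration in range(1, max_iterations + 1):
        failed = [name for name, check in checks.items() if not check(plan)]
        if not failed:
            return plan, iteration, []
        if iteration < max_iterations:
            plan = revise(plan, failed)
    # Cap reached with checks still failing: return them for escalation.
    return plan, max_iterations, failed


# Hypothetical dimensions, for illustration only.
checks = {
    "has_tasks": lambda p: bool(p.get("tasks")),
    "has_verification": lambda p: bool(p.get("verify")),
}


def revise(plan, failed):
    patched = dict(plan)
    if "has_verification" in failed:
        patched["verify"] = ["run unit tests"]
    return patched


plan, iterations, failed = check_plan({"tasks": ["add endpoint"]}, checks, revise)
print(iterations, failed)  # 2 []
```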

6. Seven Runtime Support

Installs to Claude Code, Gemini CLI, OpenCode, Codex, Copilot, Cursor, and Antigravity. The installer transforms file content per runtime at install time (tool name mapping, agent frontmatter, hook events, path conventions).
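An install-time transform of this kind can be pictured as a per-runtime rewrite table. Everything in the sketch below is hypothetical: the runtime names `runtime-a` and `runtime-b`, the directory layout, and the tool-name mappings are placeholders, not the mappings GSD's installer uses.

```python
import re

# Placeholder rules; real runtimes and their conventions are not modeled here.
RUNTIMES = {
    "runtime-a": {"command_dir": ".runtime-a/commands", "tools": {}},
    "runtime-b": {
        "command_dir": ".runtime-b/prompts",
        "tools": {"Read": "read_file", "Bash": "run_shell"},
    },
}


def transform_for_runtime(content, runtime):
    """Rewrite tool names in a command file for the target runtime."""
    tools = RUNTIMES[runtime]["tools"]
    if not tools:
        return content
    # Whole-word match so "Read" is rewritten but "Reading" is left alone.
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in tools) + r")\b")
    return pattern.sub(lambda m: tools[m.group(1)], content)


def install_path(runtime, command_name):
    """Apply the runtime's path convention to a command name."""
    return f"{RUNTIMES[runtime]['command_dir']}/{command_name}.md"


source = "Use Read, then Bash. Reading order matters."
print(transform_for_runtime(source, "runtime-b"))
# Use read_file, then run_shell. Reading order matters.
print(install_path("runtime-b", "plan-phase"))  # .runtime-b/prompts/plan-phase.md
```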

7. Advisor Mode and Fast Path (New in v1.34)

/gsd:advisor provides research-backed discussion without execution. /gsd:fast skips planning for trivial tasks. Multi-repo workspace support added for cross-project orchestration.

8. Deep State Management

Full .planning/ directory with 20+ artifact types:

  • PROJECT.md, REQUIREMENTS.md, ROADMAP.md, STATE.md
  • Per-phase directories with research, plans, summaries, verification, UAT
  • Session handoff (HANDOFF.json, continue-here.md)
  • Persistent threads, seeds, debug knowledge base, todos


Where GSD Falls Short

No Cross-Session Learning

GSD has persistence (state files, handoffs, threads) but no knowledge flywheel. There is no mechanism to extract what was learned in one session and inject it into the next. Every session starts with the same agent intelligence — the system does not get smarter over time.

```text
┌─────────────────────────────────────────────────────────────────┐
│                         GSD                                     │
│                                                                 │
│  Session 1: discuss → plan → execute → verify → Done            │
│                                                   ↓             │
│                                              (state saved)      │
│                                                   ↓             │
│  Session 2: resume-work → (same state, same intelligence)       │
│                                                                 │
│  Session 100: (agents are no smarter than session 1)            │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                      AGENTOPS                                   │
│                                                                 │
│  Session 1: research → plan → pre-mortem → crank → vibe → retro │
│                                              ↓                  │
│                                      (learnings extracted)      │
│                                              ↓                  │
│                                      (scored and stored)        │
│                                                                 │
│  Session 2: (inject prior knowledge) → better starting point    │
│                                                                 │
│  Session 100: (agent is a domain expert with scored knowledge)  │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```
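To make the extract, score, inject, and decay cycle concrete, here is a minimal sketch of a scored knowledge store. The exponential half-life decay, the `top_k` and `min_score` parameters, and the class and field names are assumptions chosen for illustration; they are not AgentOps' actual scoring model or storage format.

```python
from dataclasses import dataclass


@dataclass
class Learning:
    text: str
    score: float
    last_used_session: int


class KnowledgeStore:
    def __init__(self, half_life_sessions=20.0):
        self.items = []
        self.half_life = half_life_sessions

    def add(self, text, score, session):
        """Store a learning extracted at the end of a session."""
        self.items.append(Learning(text, score, session))

    def effective_score(self, item, session):
        """Decay the stored score by half for every half-life of disuse."""
        age = session - item.last_used_session
        return item.score * 0.5 ** (age / self.half_life)

    def inject(self, session, top_k=3, min_score=0.1):
        """Return the strongest still-relevant learnings for a new session."""
        scored = [(self.effective_score(item, session), item) for item in self.items]
        scored = [pair for pair in scored if pair[0] >= min_score]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        chosen = [item for _, item in scored[:top_k]]
        for item in chosen:
            # Using a learning refreshes it, so useful knowledge decays more slowly.
            item.last_used_session = session
        return [item.text for item in chosen]


store = KnowledgeStore(half_life_sessions=10)
store.add("Migrations must run before integration tests", score=1.0, session=1)
store.add("Linter config lives in the repo root", score=0.5, session=1)
store.add("The auth module caches tokens for 5 minutes", score=0.9, session=1)
print(store.inject(session=11, top_k=2))
```

With these numbers the first and third learnings are injected at session 11; the second is left to decay until it drops below `min_score`.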

No Strategic Goals or Direction

GSD executes phases within a project but has no mechanism for measuring progress toward higher-level objectives. No equivalent to GOALS.md, ao goals measure, or /evolve.

No Pre-Implementation Failure Prevention

GSD validates after execution (verify-work, UAT). The plan checker validates plan quality but does not simulate failure modes before implementation begins. AgentOps runs /pre-mortem to catch issues before code is written.

No Issue Graph or Dependency Tracking

GSD has wave-based parallelism for plans within a phase, but no cross-phase issue graph. There is no equivalent to beads (issue tracking with dependencies), which means no mechanical way to track blocked work across phases.
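The difference is easiest to see with a small dependency graph that spans phases. The sketch below is a hypothetical data model, not the beads schema or CLI: it only shows how explicit `blocked_by` edges let a tool compute ready and blocked work mechanically.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    id: str
    phase: str
    blocked_by: set = field(default_factory=set)
    closed: bool = False


def ready_issues(issues):
    """Open issues whose blockers are all closed, regardless of phase."""
    closed = {i.id for i in issues if i.closed}
    return [i.id for i in issues if not i.closed and i.blocked_by <= closed]


def open_blockers(issues):
    """For each blocked issue, list the blockers that are still open."""
    closed = {i.id for i in issues if i.closed}
    return {
        i.id: sorted(i.blocked_by - closed)
        for i in issues
        if not i.closed and i.blocked_by - closed
    }


issues = [
    Issue("schema", "phase-1", closed=True),
    Issue("api", "phase-1", {"schema"}),
    Issue("ui", "phase-2", {"api"}),
    Issue("docs", "phase-2", {"api", "ui"}),
]
print(ready_issues(issues))   # ['api']
print(open_blockers(issues))  # {'ui': ['api'], 'docs': ['api', 'ui']}
```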

No Multi-Model Validation Council

GSD's review command supports cross-AI peer review (Gemini, Claude, Codex), but this is a review tool, not a multi-perspective validation council that runs adversarial analysis from different viewpoints simultaneously.
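The distinction can be sketched as several judges examining the same change concurrently, each from one viewpoint, with their verdicts combined. The judge functions, the verdict strings, and the "any FAIL fails the council" rule are assumptions for the example; real judges would be model calls, and AgentOps' aggregation may differ.

```python
from concurrent.futures import ThreadPoolExecutor


def run_council(change, judges):
    """Run every judge on the same change at once and combine the verdicts."""
    with ThreadPoolExecutor(max_workers=len(judges)) as pool:
        futures = {name: pool.submit(judge, change) for name, judge in judges.items()}
        findings = {name: future.result() for name, future in futures.items()}
    # Assumed rule: a single FAIL from any perspective fails the council.
    verdict = "FAIL" if any(v == "FAIL" for v, _ in findings.values()) else "PASS"
    return verdict, findings


# Stub judges standing in for model-backed reviewers.
def security_judge(change):
    if "eval(" in change:
        return ("FAIL", "uses dynamic eval")
    return ("PASS", "no dynamic eval")


def simplicity_judge(change):
    if len(change.splitlines()) > 50:
        return ("FAIL", "change is too large to review")
    return ("PASS", "small change")


judges = {"security": security_judge, "simplicity": simplicity_judge}
verdict, findings = run_council("x = eval(user_input)", judges)
print(verdict)  # FAIL
```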


Feature Comparison

| Feature | GSD | AgentOps | Winner |
|---|---|---|---|
| Multi-runtime support | ✅ 7 runtimes | ⚠️ Claude Code primary | GSD |
| Fresh context per agent | ✅ Core design | ⚠️ Swarm workers | GSD |
| Model cost tiers | ✅ 4 profiles | ❌ Not yet | GSD |
| Auto-repair on failure | ✅ RETRY/DECOMPOSE/PRUNE | ⚠️ Crank retries | GSD |
| Context rot detection | ✅ Hooks at 35%/25% | ❌ Not yet | GSD |
| Prompt injection defense | ✅ Advisory hook | ❌ Not yet | GSD |
| Wave-based parallelism | ✅ Built in | ✅ Crank waves | Tie |
| Plan validation | ✅ 8-dimension checker | ✅ Pre-mortem + council | Tie |
| Human-in-loop gates | ✅ Configurable gates | ✅ Multiple gates | Tie |
| State persistence | ✅ .planning/ directory | ✅ .agents/ directory | Tie |
| Cross-session learning | ❌ No flywheel | ✅ Extract → score → inject | AgentOps |
| Knowledge maturity | ❌ No scoring | ✅ Maturity tracking + decay | AgentOps |
| Pre-mortem simulation | ❌ Post-execution only | ✅ Before implementation | AgentOps |
| Multi-model council | ❌ Sequential review | ✅ Multi-perspective | AgentOps |
| Issue graph execution | ❌ Phase-scoped waves | ✅ Beads + dependencies | AgentOps |
| Strategic goals | ❌ No goal tracking | ✅ GOALS.md + evolve | AgentOps |
| Compiled CLI | ❌ Node.js tools | ✅ Go binary (ao) | AgentOps |

Workflow Comparison

GSD Workflow

```text
/gsd:new-project     →  PROJECT.md, REQUIREMENTS.md, ROADMAP.md, config.json
         ↓
/gsd:discuss-phase   →  Capture decisions (CONTEXT.md)
         ↓
/gsd:plan-phase      →  Research → plan → 8-dimension verify (max 3 iterations)
         ↓
/gsd:execute-phase   →  Wave-based parallel execution (fresh agent per plan)
         ↓                  └── Node repair on failure (RETRY/DECOMPOSE/PRUNE)
/gsd:verify-work     →  UAT with auto-diagnosis
         ↓
/gsd:ship            →  Create PR from phase work
         ↓
       [next phase or complete-milestone]
```

AgentOps Workflow

```text
/research     →  Explore codebase + inject prior knowledge
     ↓
/plan         →  Break into dependency-tracked issues (beads)
     ↓
/pre-mortem   →  Simulate failure modes before building
     ↓
/crank        →  Execute unblocked waves → validate → commit
     ↓
/vibe         →  Multi-aspect code validation (council optional)
     ↓
/post-mortem  →  Extract learnings → score → store for next session
```

Architecture Comparison

| Aspect | GSD | AgentOps |
|---|---|---|
| Commands | 53 prompt-based slash commands | 50+ skill definitions |
| Agents | 16 specialized (fresh context each) | Skill-driven (swarm for parallelism) |
| CLI tooling | Node.js (gsd-tools.cjs, 15 modules) | Go binary (ao, structured subcommands) |
| Hooks | 5 JS hooks (statusline, context monitor, prompt guard, workflow guard, update check) | Shell hooks (session lifecycle, tool gates, knowledge injection) |
| State | .planning/ (Markdown + JSON) | .agents/ (Markdown + JSON) |
| Config | .planning/config.json (40+ options) | .agentops.json + GOALS.md |
| Install | npm package, 3000-line installer | Shell script + Go binary |
| Parallelism | Wave-based with file locking | Wave-based via crank + swarm |

Overhead Comparison

```text
                    SETUP TIME              ONGOING OVERHEAD
                    ══════════              ════════════════

GSD:                ████████░░░░░░░░        ████████░░░░░░░░
                    (npm install + init)    (moderate: .planning/ management)

AgentOps:           ████████░░░░░░░░        ████████░░░░░░░░
                    (install + init)        (moderate: hooks + .agents/)


                    SESSION VALUE           LONG-TERM VALUE
                    ═════════════           ═══════════════

GSD:                ████████████████        ████████░░░░░░░░
                    (strong execution)      (state persists, no learning)

AgentOps:           ████████████████        ████████████████
                    (strong execution)      (knowledge compounds)
```

Trade-off: GSD optimizes for execution quality per session. AgentOps optimizes for cumulative intelligence across sessions.


Use Case Fit

GSD is Best For

| Use Case | Why |
|---|---|
| Greenfield projects | Strong project setup + phased execution |
| Multi-runtime teams | 7 runtimes with one install |
| Cost-sensitive work | Model cost tiers control spend |
| Complex single-phase work | Wave parallelism + auto-repair |
| Teams standardizing process | Clear phases with configurable gates |

AgentOps is Best For

| Use Case | Why |
|---|---|
| Long-running codebases | Knowledge flywheel compounds value |
| Repeated maintenance | Learns from past sessions |
| Complex multi-phase work | Issue graph + dependency execution |
| Risk-averse engineering | Pre-mortem + council + vibe gates |
| Strategic direction | GOALS.md + evolve loop |

When to Choose GSD

  • You work across multiple AI runtimes and need one workflow
  • You want model cost control at the per-agent level
  • Your work is project-scoped (clear start and end)
  • You value fresh context per agent for quality in long sessions
  • You want auto-repair when tasks fail during execution

When to Choose AgentOps

  • You work on the same codebase across many sessions
  • You want the system to get smarter over time
  • You want failure prevention before implementation, not just verification after
  • You want dependency-tracked issue execution across work phases
  • You value strategic goal tracking and measured progress

Can They Work Together?

Partially. GSD and AgentOps both manage state directories and workflow orchestration, so running both simultaneously would create friction. However:

  • GSD's fresh-context-per-agent pattern is a technique AgentOps' swarm could adopt
  • GSD's model cost tiers solve a problem AgentOps does not yet address
  • AgentOps' knowledge flywheel fills GSD's biggest gap (no cross-session learning)

The most practical combination: use GSD for greenfield projects where you need fast phased execution, then bring AgentOps in when the project enters maintenance and long-term development where accumulated knowledge matters.


The Bottom Line

| Dimension | GSD | AgentOps |
|---|---|---|
| Philosophy | Fresh context, fast execution | Knowledge compounds |
| Overhead | Moderate | Moderate |
| Persistence | State files (no learning) | Knowledge flywheel |
| Validation | 8-dimension plan check + UAT | Pre-mortem + council + vibe |
| Parallelism | Wave-based, fresh agents | Wave-based, swarm workers |
| Cost control | 4-tier model profiles | Not yet |
| Best for | Strong execution per session | Cumulative intelligence across sessions |

GSD is a serious framework for structured AI-assisted development. AgentOps differentiates on the knowledge flywheel — the system that makes every session smarter than the last.


The Honest Assessment

GSD is not a lightweight tool anymore. It is a comprehensive development framework with 53 commands, 16 agents, wave-based parallelism, auto-repair, model cost tiers, and deep state management. Dismissing it as "simple meta-prompting" is inaccurate.

Where GSD wins: Execution quality within a session. Fresh context per agent, cost control, auto-repair, and 7-runtime portability.

Where AgentOps wins: Intelligence across sessions. The knowledge flywheel (extract, score, inject, decay) has no equivalent in GSD. After 50 sessions on the same codebase, AgentOps is operating with accumulated domain knowledge while GSD agents start fresh every time.

```text
Session 1:   GSD and AgentOps roughly equal
Session 10:  AgentOps has a library of scored learnings
Session 50:  AgentOps agents get injected domain expertise; GSD agents do not
Session 100: AgentOps is a domain expert; GSD is still starting from its .planning/ state
```

[← vs. SDD](vs-sdd.md) · [Back to Comparisons](README.md)