Skip to content

Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Strict Delegation Contract for /rpi, /discovery, and /validation — top-level orchestrator skills now declare strict sub-skill delegation as the default. Each skill points to the new canonical reference skills/shared/references/strict-delegation-contract.md which documents the contract, anti-pattern rationalizations, and supported compression escapes (--quick, --fast-path, --no-retro, --no-forge, --skip-brainstorm, --no-scaffold, --no-behavioral, --allow-critical-deps). There is no --full flag — strict delegation is always on.
  • Orchestrator Compression Anti-Pattern learning at .agents/learnings/2026-04-19-orchestrator-compression-anti-pattern.md, surfaced via ao inject at session start. Includes detection phrases, corrective actions, and rationalizations to reject.
  • Orchestrator-owned step markers in skills/crank/SKILL.md (STEP 3a.3, STEP 6.5 slop-scan, STEP 8.7) plus an "Inline Work Policy" footer documenting which steps are intentionally inline vs delegated.

Removed

  • Archived AO↔Olympus bridge integration: removed docs/ol-bridge-contracts.md, docs/architecture/ao-olympus-ownership-matrix.md, MemRL policy contracts, skills/*/scripts/ol-*.sh, cli/cmd/ao/inject_ol_test.go, and associated CLI types (OLConstraint, gatherOLConstraints, .ol/ directory collector). Olympus predecessor's useful patterns live on inside ao.

Changed

  • --no-lifecycle in /discovery renamed to --no-scaffold for semantic clarity — the flag controls STEP 4.5 scaffold auto-invocation only, not broader lifecycle checks. --no-lifecycle is honored as a deprecated alias through v2.40.0; when both flags are passed, they are equivalent. Other skills (/crank, /validation, /implement, /evolve) retain --no-lifecycle with its existing lifecycle-skill-invocation semantics.
  • /discovery flags table expanded: --auto is now explicitly documented (was transitively honored but undocumented); --interactive scope clarified ("research + plan gates, not pre-mortem").
  • /validation flags table expanded: --complexity=<level> syntax formalized to match /rpi and /discovery; --interactive scope documented.
  • /rpi --interactive flag scope note added: applies to discovery (research + plan) and validation (Gate 1, Gate 2); does NOT override pre-mortem or vibe council autonomy.

Fixed

  • Orchestrator compression vulnerability — a live compression was observed 2026-04-19 where /rpi was invoked but phases were inlined instead of delegated. This release documents the anti-pattern (forged learning + loud skill text), scaffolds future enforcement (shared contract reference used by all 6 orchestrator skills), and explicitly defers runtime hook enforcement to a follow-up initiative. It does not mechanically prevent compression yet — the durable fix depends on ao inject surfacing the forged learning on future session starts. See .agents/research/2026-04-19-rpi-skill-dag-audit.md for the audit and .agents/plans/2026-04-19-rpi-dag-hardening.md for the remediation plan.

[2.37.2] - 2026-04-15

Added

  • Swarm evidence validation — AgentOps now ships a swarm-evidence schema and validator, and wires that proof surface into validation and release gates.
  • Lead-only worker git guard — worker sessions now have an explicit lead-only git guard in the hook chain, reducing accidental write authority in multi-agent runs.
  • Compile and harvest operator controlsao compile adds runtime preference plus --reset and --repair controls, while harvest now reports excluded low-confidence candidates and top near-misses.

Changed

  • Release and pre-push validation — local release, pre-push, and command coverage gates now validate more of the hook, evidence, and Codex runtime surface before publish.
  • Codex/runtime artifacts and docs — compile, evolve, post-mortem, swarm, and related runtime docs and artifacts were decomposed and synchronized to better match shipped behavior.
  • Flywheel backlog bookkeeping — next-work aggregates, consumed markers, and enum normalization were cleaned up so carry-forward work is recorded consistently.

Fixed

  • Pre-mortem gate ambiguity — the crank pre-mortem gate now denies ambiguous state by default instead of failing open.
  • CLI and shell reliability edgesao rpi serve --run-id now accepts legacy 8-hex IDs, ao mine --dry-run emits a single clean JSON payload, and bash invocations are sanitized to bypass unsafe shell aliases.
  • Compile, harvest, and release drift — compile repair defaults, malformed frontmatter salvage, YAML parse error surfacing, CI fixture drift, shellcheck drift, and Codex artifact metadata drift were corrected.

[2.37.1] - 2026-04-15

Added

  • Dream morning packets — Dream can now emit ranked morning work packets with evidence, target files, exact follow-up commands, and queue/bead handoff metadata.
  • Dream yield telemetry and long-haul corroboration — overnight reports now record packet-confidence telemetry and can trigger a bounded long-haul corroboration pass when the first pass produces weak morning output.

Changed

  • Dream decision flow — overnight runs now prefer cheaper evidence corroboration before slower council fan-out, so strong runs stay short and extended runtime is reserved for genuinely weak output.

Fixed

  • Headless Claude Dream council — Dream now uses Claude's working JSON output contract for headless council runs and normalizes the returned envelope before validation.
  • Dream close-loop and report surfaces — overnight runs now write real close-loop callbacks and post-loop report artifacts instead of leaving placeholder pending steps.
  • Retrieval ratchet release gate fallback — the retrieval-quality release check now falls back to checked-in eval data when a local manifest is absent.

[2.37.0] - 2026-04-14

Added

  • Windows install and smoke coveragescripts/install-ao.ps1 adds a first-class Windows install path, and the blocking windows-smoke gate exercises PowerShell install, local ao doctor, and Windows-sensitive Go packages.
  • Compile commandao compile makes knowledge compilation a first-class CLI surface with docs and tests.
  • Local LLM forge pipelineao forge can now redact, summarize, structurally review, and queue transcript-derived wiki pages with Dream worker integration.
  • Dream curator and evolve sub-cycle — Dream gained a local curator adapter plus ao evolve --dream-first|--dream-only, allowing overnight knowledge passes to feed the daytime improvement loop.
  • .agents wiki surfaces — INDEX, LOG, wiki directories, and search integration formalize .agents/ as a Karpathy-style knowledge wiki with index-first navigation.
  • Operational quality surfaces — beads audit/cluster commands, swarm preflight advice, status quality signals, retrieval eval queries, and a retrieval-quality CI ratchet broaden release-time proof.

Changed

  • Knowledge scoring and search behavior — inject now deduplicates by content hash, boosts indexed pages, weights stability, and search can pull Dream vault and wiki sources with stronger local recall.
  • Overnight and RPI internals — overnight, lifecycle, search, inject, harvest, and RPI flows were decomposed into smaller helpers while tightening proof paths, mixed-mode provenance, and worktree cleanup.
  • Public framing and contributor docs — README, philosophy, planning/post-mortem docs, and reference surfaces now better match the context-compiler and operational-layer story.

Fixed

  • Windows overnight liveness — Windows process checks no longer rely on Unix signal(0) semantics.
  • Dream RunLoop status invariants — live-tree hash coverage now exercises every terminal RunLoop status, and degraded reflects the current rollback semantics.
  • Release retag safety — release tooling now preserves annotated tags, validates audit artifact manifests and refs, and cancels stale reruns before duplicate publish attempts.
  • Post-mortem and closure audits — metadata links, evidence-only closure packets, parser-path handling, and closure packet evidence modes were normalized.
  • Codex and runtime reliability — same-thread lifecycle restart, root-scoped fallback reads, JSON config writes, bridge contract validation, and next-work proof-path handling were hardened.

[2.36.0] - 2026-04-11

Added

  • Evolve operator commandao evolve now exposes the v2 autonomous improvement loop directly in the CLI, including --max-cycles, --queue, --beads-only, --quality, --compile, and strict-quality passthrough flags.
  • Autodev program contract — root PROGRAM.md gives evolve/autodev a repo-local operating contract with mutable and immutable scope, validation commands, escalation policy, and stop conditions.
  • Beads stale-scope toolingao beads verify|lint|harvest adds first-class stale-citation checks for bead-driven planning and RPI recovery.
  • RPI discovery artifacts — RPI can now persist and consume discovery artifacts, with tests and docs covering the --discovery-artifact path.
  • Dream RunLoop invariant coverageTestRunLoop_LiveTreeHashInvariant_AllStatuses locks the IsCorpusCompounded() and live-tree mutation invariant across deterministically reproducible terminal statuses.
  • Dream failed-summary contract coverage — regression tests now lock the finalizeOvernightSummary contract for MEASURE consecutive-failure halts and persisted iteration history.
  • Dream operator modeao overnight start|run|report|setup adds a private overnight lane with shared dream.* config, keep-awake defaults, scheduler/bootstrap guidance, council-ready runner packets, and DreamScape-style morning summaries
  • Nightly live retrieval proof — the dream-cycle now runs ao retrieval-bench --live --json, emits retrieval proof in nightly summaries, and keeps a visible artifact trail for flywheel health
  • Pattern-to-skill drafts — repeated patterns can now generate review-only skill drafts under .agents/skill-drafts/ during flywheel close-loop
  • Fresh-repo onboarding welcome — new session-start routing helps first-time repos enter discovery, implementation, or validation without needing the full RPI lane first
  • Docs-site and contribution proof surfaces — GitHub Pages navigation, comparison pages, behavioral-discipline guidance, strategic-doc validation patterns, and a first-skill guide expand the public proof surface

Changed

  • RPI wave recovery integrated — recovered RPI wave work landed across Dream, council, stale-scope planning, discovery artifacts, CI hardening, and Codex runtime surfaces.
  • Council --mixed strict contract documentedskills/council/references/cli-spawning.md documents that /council --mixed requires Codex CLI and emits a hard error instead of silently falling back to Claude-only.
  • Plan and pre-mortem skill bodies decomposed — focused reference files now carry the detailed pre-decomposition, scope-mode, mandatory-check, output, wave-matrix, and task-creation guidance while keeping the top-level skills within lint budgets.
  • Bead-input pre-flight wired into planning skills/plan and /pre-mortem invoke ao beads verify <bead-id> for full-complexity, aged, or prior-session bead inputs before decomposition or validation.
  • Operational-layer framing — README, onboarding, docs, comparisons, and linked surfaces now consistently explain AgentOps as bookkeeping, validation, primitives, and flows for coding agents
  • Dream runtime positioning — the public GitHub nightly is now documented as a proof harness, while ao overnight is documented as the private local compounding engine
  • Codex default path — native hooks, install copy, runtime smoke coverage, and checked-in Codex artifacts are aligned around the native-plugin path on supported Codex versions
  • Validation guidance — behavioral-discipline and strategic-doc review are now first-class references alongside code review and runtime validation

Fixed

  • Windows Codex installer — Codex installation now has a Windows path instead of assuming Unix shell behavior.
  • golangci-lint v2 contract — the local lint wrapper and CI configuration now pin the v2 behavior expected by the repository.
  • security-toolchain-gate CI — deterministic fixture generation in cli/internal/overnight/fixture/gen_fixture.go is annotated as a non-cryptographic seeded-random use, avoiding a false-positive semgrep blocker.
  • Recovered RPI validation blockers — validation drift from the recovered RPI wave was cleared before retagging the release.
  • Stale-scope reference placement — shared stale-scope validation guidance now lives under skills/shared/references/ so heal.sh --strict can resolve it consistently.
  • Release and CI drift — resolved docs-site Liquid/frontmatter issues, headless runtime smoke portability problems, pre-push shim test drift, and compile-skill headless command drift caught during release prep
  • Codex install and artifact drift — fixed stale slash-command references, refreshed checked-in artifact metadata, added a Codex compile wrapper, and corrected plugin/marketplace mismatches exercised by smoke coverage
  • Runtime proof stability — promoted Codex runtime smoke into the blocking smoke path and fixed related shellcheck and install-surface rough edges

Removed

  • DevOps-rooted tagline — public framing no longer leads with the old DevOps-layer tagline; the Three Ways lineage remains supporting doctrine instead of the category label

[2.35.0] - 2026-04-07

Added

  • Codex native hooks — AgentOps hooks now install natively into Codex CLI v0.115.0+ via ~/.codex/hooks.json; 8 hooks wired (session-start, inject, flywheel-close, prompt-nudge, quality-signals, go-test-precommit, commit-review, ratchet-advance); installer enables codex_hooks feature and upgrades from hookless fallback to native hook runtime
  • Knowledge compiler skill — renamed athena → /compile with Karpathy-style incremental compilation, pluggable LLM backend (AGENTOPS_COMPILE_RUNTIME=ollama|claude), interlinked markdown wiki output at .agents/compiled/
  • App struct dependency injectionApp struct carries ExecCommand, LookPath, RandReader, Stdout, Stderr seams; gc bridge, events, executor, context relevance, tracker health, and stream modules accept injected dependencies instead of mutable package-level vars
  • Test shuffle in CI-shuffle=on added to validate.yml and Makefile test targets, exposing and fixing 6 ordering-dependent tests (cobra flag leaks, maturity var leaks, env var leaks)

Changed

  • CLI internal extraction (waves 5-13) — business logic extracted from cmd/ao monolith into 15 internal/ domain packages (rpi, search, context, quality, goals, lifecycle, bridge, forge, mine, plans, knowledge, storage, pool, taxonomy, worker) using Options struct pattern for dependency injection
  • Goals test migration — 7 goals test files moved from cmd/ao to internal/goals as external test package (goals_test) with t.Parallel() and direct goals.Run*() calls replacing cobra command wiring
  • Test isolationresetCommandState now saves/restores 10 maturity globals; resetFlagChangesRecursive resets flag values to defaults; RPILoop and toolchain tests clear AGENTOPS_RPI_RUNTIME* env vars via t.Setenv

Fixed

  • Defrag test flag leakTestDefragOutputDirFlag used cmd.Flags().Lookup("output") which matched the root persistent --output flag; changed to cmd.LocalFlags().Lookup("output")
  • Goroutine leak false positiveTestRunGoals_GoroutineLeak used goleak.VerifyNone which caught goroutines from parallel tests; switched to goleak.IgnoreCurrent() to only detect leaks within the test itself
  • Secret scan false positives — excluded .gc/ directory and Getenv/os.Environ patterns from secret pattern scan
  • Codex skill validation — added output_contract as valid schema key, cross-vendor/knowledge as valid tiers, fixed $/ prefix in codex forge/post-mortem/scenario skills
  • Scenario CLI snippets — replaced non-existent --source/--scope flags with valid --status variants

Removed

  • Coverage percentage CI gates — removed coverage-ratchet job, check-cmdao-coverage-floor.sh, .coverage-baseline.json, and associated BATS tests; percentage gates blocked CI during architectural refactors without catching bugs
  • fire.go — FIRE loop (find-ignite-reap-escalate) superseded by gc sling + bead dispatch; formatAge helper moved to inject_predecessor.go
  • rpi_workers.go — per-worker health display superseded by gc agent health patrol; ao rpi workers subcommand removed from CLI and docs

[2.34.0] - 2026-04-05

Added

  • Stage 4 Behavioral Validation — new validation tier between council/vibe and production:
  • Holdout scenarios stored in .agents/holdout/ with PreToolUse isolation hook preventing implementing agents from seeing evaluation criteria
  • Satisfaction scoring (0.0-1.0 probabilistic) in verdict schema v4, replacing boolean-only PASS/FAIL
  • Agent-built behavioral specs generated during /implement Step 5c
  • /scenario skill for authoring and managing holdout scenarios
  • ao scenario init|list|validate CLI commands (4 subcommands, 11 tests)
  • STEP 1.8 in /validation pipeline evaluating holdout scenarios + agent specs
  • schemas/scenario.v1.schema.json defining the holdout scenario format
  • Flywheel gate commandao flywheel gate checks readiness for retrieval-expansion work (research closure, rho threshold, holdout precision@K)
  • Citation confidence scoringcitationEventIsHighConfidence with bucketed confidence (0/0.5/0.7/0.9) gates MemRL rewards on match quality
  • Retrieval bench refactor — train/holdout splits, section-aware scoring (scoreBenchSections), manifest-based benchmark cases
  • Proof-backed next-work visibilityclassifyNextWorkCompletionProof unifies completed-run, execution-packet, and evidence-only-closure proof types; context explain and stigmergic packet now report proof-backed suppressions
  • Three-gap contract proof gates — lifecycle gap mapping gates added to GOALS.md
  • Cross-vendor execution--mixed flag for Claude + Codex council judges
  • Gas City bridge — gc as default executor for RPI phase execution with L1-L3 tests
  • 149 L2 integration tests — AI-native test shape ("L2 first, L1 always") validated at scale; coverage floor raised 78.8% → 81.0%
  • Test coverage hardening — GPG commit-signing fixes, root-skip guards for containerized CI, 350+ lines of vibecheck detector/metrics tests, maturity.go empty-content bugfix

Changed

  • Codex parity hookcodex-parity-warn.sh now supports opt-in blocking mode via AGENTOPS_CODEX_PARITY_BLOCK=1 (exit 2 instead of advisory)
  • 12-factor doctrine — compressed from 474 to 114 lines, reframed as supporting lens rather than product definition
  • Skill count — 65 → 66 (added /scenario)
  • Research skill — now persists reusable findings to .agents/findings/registry.jsonl with finding-compiler refresh
  • Closure integrity audit — accepts durable closure packets without scoped-file sections as valid evidence
  • Proof-backed legacy entriesshouldSkipLegacyFailedEntry uses CompletionEvidence field (proof-only, no heuristic fallback)
  • readQueueEntries — returns all non-consumed entries; proof filtering is downstream via shouldSkipLegacyFailedEntry

Fixed

  • 6 CI failure categories resolved in one commit (f1b83b25)
  • Cobra test registrationscenario and flywheel gate added to expectedCmds
  • Citation feedback test — assertion corrected for recorded confidence preference (0.5 not 0.7)
  • RPI hardening — UAT version pre-flight, goals history filter, proof-backed suppression, fail-closed gates, cross-epic handoff contamination, bare ag- prefix guard
  • Branch consolidation — 10 stale Codex branches analyzed, cherry-picked (9 commits, ~3,500 lines), and deleted; 25 orphaned worktrees pruned
  • git rerere enabled — conflict resolution memory for future merges

[2.33.0] - 2026-04-02

Added

  • Backlog hygiene gates — added bd-audit.sh, bd-cluster.sh, and Crank/Codex guidance for cleaning stale or mergeable beads before execution
  • Retrieval benchmarking and global scope — added ao retrieval-bench, benchmark corpora, --live, --global, and nightly IR regression coverage
  • /red-team adversarial validation — added a persona-based validation skill plus checked-in Codex runtime artifacts
  • Software factory operator lane — added a CLI/operator surface and Claude factory startup routing for software-factory workflows
  • Flywheel maintenance utilities — added global garbage purge tooling and nightly retrieval benchmarking for knowledge quality tracking

Changed

  • Release policy — removed the enforced release cadence gate so releases no longer block on a minimum wait between tags
  • Knowledge operator surfaces — plan and validation now wire knowledge operator surfaces directly into execution flow
  • Proof and runtime docs — goals, RPI docs, and contributor guidance now reflect the expanded proof surfaces and hookless runtime behavior

Fixed

  • Codex artifact parity — restored checked-in Codex parity for red-team and cleaned Codex runtime metadata/frontmatter drift across crank, forge, post-mortem, release, and swarm artifacts
  • Retrieval quality — replaced exact-substring filtering with token-level matching and tuned penalty, deduplication, and OR-fallback behavior
  • Harvest metadata preservation — promotion now preserves source metadata and fills missing maturity, utility, and type fields safely
  • Release tooling — release artifact directories are created safely and audit artifacts now resolve against release tag names
  • Documentation and link drift — repaired the post-mortem Codex link and aligned runtime docs around the newer startup and lifecycle flows

[2.32.0] - 2026-04-01

Added

  • Knowledge activation skill — new /knowledge-activation skill and CLI surfaces for activating cross-domain knowledge at runtime, with operator surface consumption and ranked intelligence context
  • Session intelligence engine — complete runtime engine with explainability, ranked context assembly, and trust policy enforcement
  • Runtime selection for ao rpi serve — serve now supports explicit runtime selection for Claude and Codex execution modes
  • Quality signals hook — new quality-signals.sh hook with test coverage for session quality telemetry
  • Pre-push gate expansion — 9 checks migrated from CI-only to the local pre-push gate for faster feedback
  • Inject stability warnings and status dashboard — closed 3 harvest items with signal tests and dashboard improvements

Changed

  • README refresh — product-minded rewrite with gain-framing and Strunk-style prose fixes
  • Philosophy doc — new docs/philosophy.md and observations section added to README
  • Documentation alignment — repo front doors and codex artifact guidance unified across entry points
  • Claude Code architecture lessons — retry budgets, stability flags, quality signals, and orchestration patterns applied to skills
  • Homebrew formula — updated to v2.31.0 with pre-built binaries

Fixed

  • Post-mortem closure integrity — normalized file parsing for closure integrity audits
  • CI reliability — resolved CI failures across codex refs, test pairing, hook coverage, worktree handling, docs parity, hook portability, and codex lifecycle
  • Lookup nested scanningao lookup now scans nested global knowledge directories correctly
  • Pre-push test stubs — added test stubs for new pre-push checks, skip non-shell in shellcheck

Dependencies

  • Bumped codecov/codecov-action from 5 to 6
  • Bumped DavidAnson/markdownlint-cli2-action from 22 to 23

[2.31.0] - 2026-03-30

Added

  • 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test skills wired into RPI with auto-invocation and mechanical gates
  • ao harvest — cross-rig knowledge consolidation extracts and catalogs learnings from sibling crew workspaces
  • ao context packet — inspect stigmergic context packets for debugging inter-session handoff state
  • Hook runtime contract — formal Claude/Codex/manual event mapping with runtime-aware hook tooling
  • Evidence-driven skill enrichment — production meta-knowledge, anti-patterns, flywheel metrics, and normalization defect detection baked into 9 skill reference files
  • Research provenance — pending learnings now carry full research provenance for discoverability and citation tracking
  • Context declarations — inject, provenance, and rpi skills declare their context requirements explicitly
  • Goals and product output templates/goals and /product produce evidence-backed structured output

Changed

  • Three-gap context lifecycle contract — README, PRODUCT.md, positioning docs, and operational guides reframed around the context lifecycle model
  • Dual-runtime hook documentation — runtime modes table and troubleshooting updated for Claude + Codex hook coexistence

Fixed

  • CI reliability — resolved 4 pre-existing CI failures, restored headless runtime preflight, repaired codex parity drift checks
  • ao lookup retrieval — fixed retrieval gaps that caused lookup to return no results
  • Embedded sync — using-agentops SKILL.md and .agents/.gitignore now written correctly on first session start
  • Closure integrity — 24h grace window for close-before-commit evidence, normalized file parsing
  • Skill lint compliance — vibe, post-mortem, crank, and plan skills trimmed or restructured to stay under 800-line limit
  • Codex tool naming — added CLAUDE_TOOL_NAMING rule and fixed 5 Claude-era tool references in codex skills
  • ASCII diagram consistency — aligned box-drawing characters across 23 documentation files
  • Fork exhaustion prevention — replaced jq with awk in validate-go-fast to prevent fork bombs on large repos

[2.30.0] - 2026-03-24

Added

  • Codex hookless lifecycle supportao codex runtime commands, lifecycle fallback, and Codex skill orchestration now cover hookless sessions end to end
  • PROGRAM.md autodev contract — Added a first-class PROGRAM.md contract for autodev flows and taught /evolve and related RPI paths to use it
  • Long-running RPI artifact visibility — Mission control now exposes run artifacts and evaluator output so long-running RPI sessions are replayable and easier to inspect

Changed

  • Codex runtime maintenance flow — Refreshed Codex bundle hashes, lifecycle guards, runtime docs, and release validation coverage around the expanded Codex execution path

Fixed

  • Codex RPI scoping and closeout — Tightened objective scope, epic scope, closeout ownership, and validation gaps in the Codex RPI lifecycle
  • Release gate reliability — Restored headless runtime coverage, runtime-aware Claude inventory checks, and release-gate coherence validation
  • Reverse-engineer repo hygiene — Repo-mode reverse engineer now ignores generated and temp trees when identifying CLI and module surfaces

[2.29.0] - 2026-03-22

Added

  • Model cost tiers and config writesao config can now assign per-agent models by cost tier and persist repo configuration changes directly
  • Search brokerage over session history and repo knowledgeao search now wraps upstream cass results with repo-local AgentOps artifacts by default
  • Reviewer and post-mortem reference packs — Added model-routing, iterative-retrieval, confidence-scoring, write-time-quality, and conflict-recovery guidance across council, research, swarm, vibe, compile, and related skills

Changed

  • Competitive comparison and CLI docs — Refreshed comparison docs, release smoke coverage, and command documentation around the expanded search/config surface

Fixed

  • Flywheel proof and citation loop — Added deterministic proof fixtures, preserved exact research provenance, and made citation feedback artifact-specific so flywheel health reflects real closure state
  • Search alignment with forged session history — Search now stays aligned with forged session artifacts and fallback behavior
  • Hook-launched validation — Pre-push and release gates now isolate inherited git env/stdin correctly and cover newer hook scripts in integration tests
  • Codex council profile parity — Source and checked-in Codex council docs are back in sync for the shared profile contract

[2.28.0] - 2026-03-21

Added

  • Node repair operator — Crank now classifies task failures as RETRY (transient), DECOMPOSE (too complex), or PRUNE (blocked) with budget-controlled recovery
  • Knowledge refresh auto-trigger — Lightweight compile defrag runs automatically at session end via new SessionEnd hook
  • Configurable review agents — Project-level .agents/reviewer-config.md controls which judge perspectives council and vibe spawn
  • Three-tier plan detail scaling — Plan auto-selects Minimal, Standard, or Deep templates based on issue count and complexity
  • Adversarial ideation — Brainstorm Phase 3b stress-tests each approach with four red-team questions before user selection

Fixed

  • Crank SKILL.md line limit — Consolidated duplicate References sections to stay under 800-line skill lint limit
  • Codex skill parity — Synced all five competitive features to skills-codex with reference file copies

[2.27.1] - 2026-03-20

Fixed

  • Flywheel golden signals always shown — Golden signals were gated behind --golden flag, causing ao flywheel status to report "COMPOUNDING" while the hidden golden signals analysis showed "accumulating". Golden signals now compute and display by default.

[2.27.0] - 2026-03-20

Added

  • Flywheel golden signals — Four derived health indicators (velocity trend, citation pipeline, research closure, reuse concentration) that distinguish knowledge compounding from noise accumulation; accessible via ao flywheel status --golden
  • Forge-to-pool bridge — Forge auto-writes pending learnings as markdown to .agents/knowledge/pending/ for close-loop pool ingestion
  • SessionStart citation primingao lookup wired into SessionStart hook to close the citation gap between inject and session context
  • Skill catalog quality — Improved descriptions, extraction patterns, and reference linking across skill catalog

Fixed

  • .agents/.gitignore scope — Replaced broad !*/ pattern with explicit subdirectory list to prevent accidental tracking
  • Codex runtime skill parity — Hardened Codex runtime skill discovery and validation
  • Codex install smoke tests — Fixed test assertions for install path edge cases

Changed

  • CLI reference docs — Regenerated with updated date stamps

[2.26.1] - 2026-03-16

Fixed

  • RPI stops after Phase 2 — Restructured rpi, discovery, and validation orchestrator skills as compact DAGs with execution sequence in a single code block; eliminates LLM stopping between phases due to ### section headings acting as natural breakpoints
  • Test grep patterns for DAG headings — Updated test-tuning-defaults.sh to match new complexity-scaled gate headings after DAG restructure

Changed

  • Goals reimagined — GOALS.md rebuilt from first principles with fitness gate fixes
  • README progressive disclosure — Lead with moats, collapse detail into expandable sections
  • CLI reference docs — Regenerated with updated date stamps
  • Doctor + findings helpers — Added CLI test coverage for extracted helpers

[2.26.0] - 2026-03-15

Added

  • BF6–BF9 test pyramid levels — Regression (bug-specific replay), Performance/Benchmark, Backward Compatibility, and Security (in-test) bug-finding levels with language-specific patterns for Go and Python
  • Test pyramid decision tree expansion — 4 new routing questions for BF6–BF9 in the "When to Use" guide
  • RPI phase mapping for BF6–BF9 — Bug fix → BF6 mandatory, hot-path → BF7 benchmark, format change → BF8 compat fixture, secrets → BF9 redaction tests
  • regen-codex-hashes.sh — Manifest hash regeneration script for Codex skill maintenance

Changed

  • Go standards — Added benchmark tests (BF7), backward compat with testdata/compat/ (BF8), regression test naming convention (BF6), security tests for path traversal (BF9)
  • Python standards — Added Hypothesis property-based testing (BF1), pytest-benchmark patterns (BF7), backward compat with parametrized fixtures (BF8), regression test naming (BF6), secrets redaction tests (BF9)
  • Coverage assessment template — Extended BF pyramid table from BF1–BF5 to BF1–BF9

Fixed

  • Codex skill audit — 60+ findings fixed across all 54 Codex skills; removed orphaned claude-code-latest-features.md and claude-cli-verified-commands.md references
  • Skill lint warnings — Resolved all warnings in crank, rpi, recover skills
  • README skill references — Corrected broken references and linked orphaned templates
  • Skill linter refs — Fixed directory reference and backtick formatting in reverse-engineer-rpi
  • CHANGELOG sync hook — Replaced broken awk extraction with sed; awk failed on em-dash UTF-8 content producing header-only syncs
  • Plugin version parity — Added pre-commit check that warns when .claude-plugin/ manifest versions don't match the release version

[2.25.1] - 2026-03-15

Fixed

  • Codex BF pyramid parity — Synced BF1/BF2/BF4 bug-finding level selection into skills-codex implement, post-mortem, and validation skills
  • Codex Claude backend cross-contamination — Removed orphaned backend-claude-teams.md files (Claude primitives: TeamCreate, SendMessage) from 4 Codex skills (council, research, shared, swarm)
  • Dead converter rule — Removed stale sed substitution for backend-claude-teams.md rename in converter script
  • Swarm reference integrity — Added Reference Documents section to swarm SKILL.md; updated validate.sh to check only Codex-native backend references

[2.25.0] - 2026-03-14

Added

  • L0–L7 test pyramid standard — Shared reference doc (standards/references/test-pyramid.md) defining 8 test levels, agent autonomy boundaries (L0–L3 autonomous, L4+ human-guided), and RPI phase mapping
  • Test pyramid integration across RPI lifecycle — Discovery identifies test levels, plan classifies tests by level, pre-mortem validates coverage, implement selects TDD level, crank carries test_levels metadata, validation audits coverage, post-mortem reports gaps
  • RPI autonomous execution enforcement — Three-Phase Rule mandates discovery → implementation → validation without human interruption; anti-patterns table documents 7 failure modes
  • Evolve autonomous execution enforcement — Each cycle runs a complete 3-phase /rpi --auto; anti-patterns table documents 6 failure modes; large work decomposed into sub-RPI cycles
  • Codex skill standard — New standards/references/codex-skill.md with tool mapping, prohibited primitives, two-phase validation, DAG-first traversal, and prompt constraint boundaries
  • Codex-native overrides — Durable overrides for crank, swarm, council that survive regeneration
  • DAG-based Codex smoke testscripts/smoke-test-codex-skills.sh validates 54 skills with dependency-ordered traversal
  • Codex skill API contractdocs/contracts/codex-skill-api.md with conformance validator
  • Output contract declarationsoutput_contract field on council, vibe, pre-mortem, research skills with canonical finding-item schema

Changed

  • Codex converter rewrite — Strips Claude primitives instead of mapping to unavailable tools; rewrites reference files through codex_rewrite_text
  • CI pipeline — Removed codex skill parity check (skills-codex/ now manually maintained); fixed shellcheck and embedded sync issues

Fixed

  • Converter primitive stripping — Task primitives (TaskCreate, TeamCreate, SendMessage) properly stripped instead of mapped to non-existent Codex equivalents
  • Embedded hook sync — Added missing test-pyramid.md and codex-skill.md to CLI embedded references
  • ShellCheck SC1125 — Fixed em-dash in shellcheck disable directive in smoke test script
  • Skill line limits — Moved verbose autonomy rules to reference files to stay under tier-specific line budgets

[2.24.0] - 2026-03-12

Added

  • Error & rescue map template — Pre-mortem Step 2.5 with 3 worked examples (HTTP, database, LLM)
  • Scope mode selection — Pre-mortem Step 1.6 with 3-mode framework (Expand/Hold/Reduce) and auto-detection
  • Temporal interrogation — Pre-mortem Step 2.4 walks implementation timeline (hour ½/4/6+) for time-dependent risks
  • Prediction tracking — Pre-mortem findings get unique IDs (pm-YYYYMMDD-NNN) correlated through vibe and post-mortem
  • Finding classification — Vibe separates CRITICAL (blocks ship) from INFORMATIONAL findings
  • Suppression framework — Vibe loads default + project-level suppression patterns for known false positives
  • Domain-specific checklists — Standards skill extended with SQL safety, LLM trust boundary, and race condition checklists, auto-loaded by vibe
  • RPI session streak tracking — Post-mortem Step 1.5 shows consecutive session days and verdict history
  • Persistent retro history — Post-mortem Step 4.8 writes structured JSON summaries to .agents/retro/ for cross-epic trend analysis
  • Prediction accuracy scoring — Post-mortem Step 3.5 scores HIT/MISS/SURPRISE against pre-mortem predictions
  • Commit split advisor — PR-prep Phase 4.5 suggests bisectable commit ordering (suggestion-only)
  • Council finding auto-extraction — Significant findings from WARN/FAIL verdicts staged for flywheel consumption

Changed

  • Post-mortem examples condensed — Verbose examples replaced with concise 4-mode summary to stay under skill line limit

[2.23.1] - 2026-03-12

Fixed

  • Resolved all golangci-lint quality findings
  • Synced embedded standards after skill audit fixes
  • Synced Codex bundle after skill audit fixes
  • Resolved audit findings across council, vibe, standards skills

[2.23.0] - 2026-03-11

Added

  • Discovery and validation phase orchestrators — New /discovery and /validation skills decompose the RPI lifecycle into independently invocable phases (research+plan+pre-mortem and vibe+post-mortem)
  • Stigmergic packet scorecard — Ranked scoring for flywheel knowledge packets so higher-utility learnings surface first
  • Pinned work queue/evolve gains a pinned work queue with blocker auto-resolution for directed improvement loops
  • Per-package coverage ratchet — Pre-push gate enforces per-package coverage baselines that only move upward
  • Fast pre-push mode--fast flag for diff-based conditional checks, skipping unchanged packages
  • Standards auto-loading — Go and Python coding standards injected automatically into /crank and /swarm workers
  • 271 test functions — Four internal packages (pool, ratchet, resolver, storage) brought to 100% coverage

Changed

  • README restructured — Extracted reference material into dedicated docs, reducing README from 679 to 472 lines
  • RPI skill refactored/rpi now delegates to /discovery and /validation phase orchestrators instead of inlining all phases
  • Go and Python test conventions — Canonical standards enriched with assertion quality rules, naming conventions, and table-driven test guidance
  • Documentation alignment — Lifecycle, flywheel, primitive chain, and positioning docs updated to reflect current architecture

Fixed

  • Goal runner deadlock — Fixed goroutine deadlock in goal runner and added job timeouts to prevent stalls
  • 17 CLI bugs from deep audit — Addressed goroutine leaks, race conditions, panics, buffer overflows, and nil-check inconsistencies
  • Session close reliability — Resolved pre-existing session_close issues surfaced by vibe council review
  • ~50 zero-assertion tests — Upgraded smoke tests from no-op to behavioral assertions across cmd/ao and internal packages
  • Test file hygiene — Merged _extra_test.go and cov*_test.go files into canonical <source>_test.go names
  • CI stability — FIFO test skip on Linux, embedded skill sync, coverage ceiling adjustments, crank SKILL.md trimmed below 800-line limit
  • Auto-extract quality gate — Added quality gate to prevent low-fidelity auto-extracted learnings from entering the knowledge store

[2.22.1] - 2026-03-10

Added

  • Repo-native redteam harness — Added a packaged redteam pack and prompt runner to security-suite for repeatable repository-local security exercises
  • Findings management commands — Added CLI commands for listing and managing saved findings from the terminal

Changed

  • Closed-loop prevention validation — Completed the end-to-end finding compiler and prevention-ratchet validation path so saved findings feed back into earlier planning and task validation more reliably
  • Runtime contract parity — Localized shared Claude runtime reference packs into the source skills and regenerated Codex artifacts so source and generated bundles stay aligned

Fixed

  • Finding metadata injection — Exposed finding metadata consistently in inject output and JSON integrations after the merged findings work landed
  • Release gate regressions — Restored goals/package coverage, learning coherence, and hook-fixture isolation so the local release gate matches the shipped tree again

[2.22.0] - 2026-03-09

Added

  • Finding registry — Council findings are saved to a persistent registry and automatically fed back into planning and validation, so the same class of bug is caught earlier next time
  • Repo execution profiles.repo-execution-profile.json lets skills and runtimes adapt to each repository's validation gates, startup reads, and done-criteria
  • Headless team backend — Multi-agent workflows can run non-interactively (e.g. in CI) with structured JSON output and automated validation

Changed

  • Codex and embedded artifacts — Synced generated Codex bundles, embedded standards references, and install artifacts after merging branch work
  • Validation feedback capture — Recorded validation-cycle feedback into .agents learnings so tracked patterns match the shipped tree

Fixed

  • Lookup findings — Fixed ao lookup and inject scoring so findings render, cite, and score correctly after the branch merge
  • 23 CLI bug fixes — Fixed goroutine leaks, race conditions, panics, buffer overflows, missing error handling, and nil-check inconsistencies
  • Post-mortem evidence hardening — Staged changes and worktree evidence are now captured durably so proof isn't lost during compaction or cleanup

[2.21.0] - 2026-03-09

Added

  • Codex-first skill rollout across the full catalog with override coverage, generated-artifact governance, and install/runtime parity validation
  • Claim-aware next-work lifecycle handling with contract parity checks for /rpi and follow-on flows
  • Headless runtime skill smoke coverage and Codex backbone prompt validation in the release gate stack

Changed

  • Codex maintenance guidance, override coverage docs, and CLI-to-skills mapping to match the generated runtime model
  • Release-prep validation flows for runtime smoke, Codex artifact sync, and release note generation

Fixed

  • Next-work queue mutation races by making claim/update handling concurrency-safe and per-item
  • Codex prompt parity drift by syncing generated prompts and tightening override coverage gates
  • Worktree Git resolution and vibe-check runtime environment handling
  • Push/pre-push validation regressions and nested pre-push wrappers
  • Streamed phase timeout cancellation so phased runtime tests and release gating terminate promptly

[2.20.1] - 2026-03-07

Fixed

  • Codex install workflow now uses ~/.agents/skills as the single raw skill home and stops recreating an AgentOps mirror in ~/.codex/skills
  • Native Codex plugin refresh now archives overlapping legacy ~/.codex/skills AgentOps folders instead of repopulating them
  • Codex install docs now consistently describe the ~/.agents/skills workflow and the need for a fresh Codex session after install
  • Codex skill conversion now preserves multiline YAML description fields correctly, fixing malformed generated metadata for skills such as Compile
  • ao doctor now treats plugin-cache plus ~/.agents/skills as the supported Codex layout and reports manifest drift with accurate wording

[2.20.0] - 2026-03-05

Added

  • Flywheel loop closure — ao session close --auto-extract produces lightweight learnings and auto-handoff at session boundary
  • Handoff-to-learnings bridge — ao handoff now extracts decisions into .agents/learnings/ automatically
  • Session-type scoring in ao inject --session-type — 30% boost for matching session context (career, debug, research, brainstorm)
  • Identity artifact support — ao inject --profile surfaces .agents/profile.md in session context
  • MEMORY.md auto-promotion in ao flywheel close-loop (Step 7) after maturity transitions
  • Session-type detection in ao forge output metadata
  • Production RPI orchestration engine — ao rpi serve <goal> with SSE streaming and auto mode
  • Knowledge mining — ao mine and ao defrag commands for automated codebase intelligence
  • Context declarations — ao inject --for <skill> reads skill frontmatter context: block for scoped retrieval
  • Sections include allowlist and context artifact directories for skill-scoped injection
  • ao handoff command for structured session boundary isolation
  • Behavioral guardrails — 3-layer hook defense-in-depth (intent-echo, research-loop-detector, task-validation-gate)
  • Context enforcement hook and run-id namespaced artifact paths
  • Headless invocation standards and RPI phase runner
  • Nightly CI compile job for automated knowledge warmup
  • Coverage ratchet gate with BATS integration tests for shell scripts
  • Fuzz targets, property tests, and golden file contracts for CLI
  • Git worker guard, embedded parity gate, and swarm evidence validation hooks
  • Release cadence gate warns on releases within 7 days of previous

Changed

  • Coverage floor raised to 84% for cmd/ao, average floor to 95%
  • Complexity ceiling tightened to 20 (from 25)
  • Default session-start hook mode switched from manual to lean
  • Hard quality gate on injection — maturity + utility filter
  • Post-mortem redesigned as knowledge lifecycle processor
  • RPI god-file split — 1,363 lines reduced to 203 with structured handoff schema
  • Legacy RPI orchestrator retired — serve now uses phased engine (-1,121 lines)
  • Council V2 findings synthesized into agent instructions and skill contracts
  • 10k LOC of coverage-padding tests deleted; 72 stale tests quarantined
  • Skill hardening — web security controls across 5 skills, CSRF protection, crank pre-flight
  • Session-end hook wires ao session close --auto-extract before existing forge pipeline

Fixed

  • Flywheel signal chain — confidence decay, close-loop ordering, glob errors
  • Path traversal in context enforcement hook and frontmatter parsing
  • Race condition in handoff consumption at session boundary
  • ao mine stabilized — dedup IDs, error propagation, --since window, empty output guard
  • Hook test assertions aligned with warn-then-fail ratchet pattern (strict env required)
  • Pre-mortem gate exit code corrected to 2 in strict mode (was 1)
  • RPI serve event pipeline and coherence gate hardened
  • jq injection via bare 8-hex run IDs in serve classifier
  • Goals parser edge cases — paired backtick strip and rune-aware truncation
  • UTF-8 truncation across six functions converted to rune-safe slicing
  • CORS headers and stale doc references cleaned up
  • Cross-wave worktree file collisions prevented
  • hookEventName added to hookSpecificOutput JSON schema

[2.19.3] - 2026-02-27

Changed

  • README highlights ao search (built on CASS) — indexes all chat sessions from every runtime unconditionally; adds Second Brain + Obsidian vault section with Smart Connections local/GPU embeddings and MCP semantic retrieval

[2.19.2] - 2026-02-27

Fixed

  • CHANGELOG retrospectively updated to document all v2.19.1 post-tag commits (skills namespace fixes were shipped but not recorded)

[2.19.1] - 2026-02-27

Fixed

  • Quickstart skill rewritten from 275 lines to 68 lines — removes 90-line ASCII diagram and 50-line intent router that caused 3+ minute runtime; now outputs ~8 lines and completes in under 30 seconds
  • truncateText edge case: maxLen 1–3 now returns "..."[:maxLen] instead of the original string unchanged
  • Dead anti-pattern promotion functions removed from ao maturity (promoteAntiPatternsCmd, filterTransitionsByNewMaturity, displayAntiPatternCandidates, ~99 LOC)
  • Windows file-lock and signal support — replace no-op filelock_windows.go with real LockFileEx/UnlockFileEx via kernel32.dll; extract syscall.Flock and syscall.Kill into platform-specific helpers so the binary compiles on Windows without POSIX-only syscalls
  • heal.sh Check 7 false positive — script reference integrity check now strips URLs before pattern matching, preventing remote https://…/scripts/foo.sh references from being validated as local files
  • Security gate BLOCKED_HIGH — three persistent findings resolved: gosec G118 false positive (context cancel func returned to caller), golangci-lint nolint syntax (space in // nolint: directive), radon double-counting reverse_engineer_rpi.py from skills-codex/ copy
  • 71 stale ao know * and ao quality * namespace references replaced across 17 skills-codex/ SKILL.md files — agents running rpi/evolve/crank were invoking non-existent commands from the pre-flatten CLI namespace
  • Three HIGH-severity stale command references fixed across skills/ and skills-codex/: ao flywheel statusao metrics flywheel status, ao settings notebook updateao notebook update, ao start seed/initao seed/ao init

Added

  • Spec-consistency gate (scripts/spec-consistency-gate.sh) validates contract files before crank spawns workers
  • Command-surface parity gate (scripts/check-cmdao-surface-parity.sh) ensures all CLI leaf commands are tested
  • scripts/post-merge-check.sh now validates go mod tidy sync and blocks on symlinks
  • scripts/merge-worktrees.sh now propagates file deletions and preserves permissions
  • Post-mortem preflight script checks reference file existence before council runs
  • Hooks.json preflight validates script existence
  • Windows binaries added to GoReleaser and SLSA attestation subject list

Changed

  • Coverage floor raised 78% → 80% with CI enforcement gate; Codecov threshold aligned to 75%
  • Six truncation functions converted to rune-safe Unicode slicing
  • truncateID in pool.go delegates to shared truncateText
  • Crank skill invokes spec-consistency gate before spawning workers
  • Vibe skill carries forward unconsumed high-severity next-work items as pre-flight context
  • Release skill warns on unconsumed high-severity next-work items
  • next-work JSONL schema formalized to v1.2
  • Skills installation switched from npx skills to native curl installer (bash <(curl -fsSL …/install.sh))
  • README updated with 5-command summary, compound effect section, and /vibe breakdown

[2.19.0] - 2026-02-27

Added

  • ao mind command for knowledge graph operations.
  • New RPI operator surfaces: normalized C2/event plumbing plus ao rpi stream, ao rpi workers, and tmux worker nudge visibility.
  • Codex install/bootstrap improvements, including native ~/.codex/skills install and one-line installer flow.
  • Windows binaries added to GoReleaser build outputs.

Changed

  • CLI namespace migration completed and aligned across hooks, docs, integration tests, and generated command references.
  • Codex skill system moved to regenerated modular layout with codex-specific overrides and runtime prompt tailoring.
  • CI/release gates hardened (codex runtime sections, release e2e validation, parity checks, stricter policy enforcement).
  • High-complexity CLI paths refactored (runRPIParallel, runDedup, parseGatesTable) to lower cyclomatic complexity.

Fixed

  • Multiple post-mortem remediation waves landed for CLI/RPI/swarm reliability and edge-case handling.
  • Hook delegation and integration behavior corrected for flat command namespace.
  • heal.sh false-positive behavior reduced and doctor stale-path detection improved.
  • Skill/doc parity and cross-reference drift issues corrected across codex and core skill catalogs.

Removed

  • Legacy inbox/mail command surface and stale/dead skill references from active catalogs.

[2.18.2] - 2026-02-25

Fixed

  • ao seed now creates .gitignore and storage directories — reuses setupGitProtection, ensureNestedAgentsGitignore, and initStorage from ao init
  • ao seed text updated from stale ao inject/ao forge to current MEMORY.md + session hooks paradigm
  • MemRL feedback loop closed — ao feedback-loop command wired, ao maturity --recalibrate dry-run guard added
  • Quickstart skill updated to reference ao seed and current flywheel docs
  • CLI reference regenerated after ao feedback-loop and seed help text changes

Changed

  • .agents/ session artifacts removed from git tracking
  • PRODUCT.md updated — Olympus section removed, value props and skill tier counts corrected
  • GOALS.md coverage directive updated to measured 78.8% (target 85%)

[2.18.1] - 2026-02-25

Changed

  • SessionStart hook default mode changed from manual to lean — flywheel injection now fires every session
  • Auto-prune enabled by default (AGENTOPS_AUTO_PRUNE defaults to 1, opt-out via =0)
  • Anti-pattern detection threshold lowered from harmful_count >= 5 to >= 3
  • Eviction confidence threshold relaxed from < 0.2 to < 0.3
  • Maturity promotion threshold in --help text synced with code (0.70.55)

Fixed

  • Empty learnings no longer inflate flywheel metrics — extract prompt skips empty files, pool ingest rejects "no significant learnings" stubs
  • ao pool ingest now runs automatically in session-end hook after forge (was manual-only)
  • 8 stale doc/comment references to old thresholds updated across hooks, ENV-VARS.md, HOOKS.md, using-agentops skill
  • 13 empty stub learnings removed from .agents/learnings/

[2.18.0] - 2026-02-25

Added

  • ao notebook update command — compound MEMORY.md loop that merges latest session insights into structured sections
  • ao memory sync command — sync session history to repo-root MEMORY.md with managed block markers for cross-runtime access (Codex, OpenCode)
  • ao seed command — plant AgentOps in any repository with auto-detected templates (go-cli, python-lib, web-app, rust-cli, generic)
  • ao lookup command — retrieve specific knowledge artifacts by ID or relevance query (two-phase complement to ao inject --index-only)
  • ao constraint command family — manage compiled constraints (list, activate, retire, review)
  • ao curate command family — curation pipeline operations (catalog, verify, status)
  • ao dedup command — detect near-duplicate learnings with optional --merge auto-resolution
  • ao contradict command — detect potentially contradictory learnings
  • ao metrics health subcommand — flywheel health metrics (sigma, rho, delta, escape velocity)
  • ao context assemble command — build 5-section context packet briefings for tasks
  • Work-scoped knowledge injection: ao inject --bead <id> boosts learnings tagged with the active bead
  • Predecessor context injection: ao inject --predecessor <handoff-path> surfaces structured handoff context
  • Compact knowledge index: ao inject --index-only outputs ~200 token index table for JIT retrieval
  • Learning schema extended with source_bead and source_phase fields for work-context tracking
  • ao extract --bead <id> tags extracted learnings with the active bead ID
  • Citation-to-utility feedback pipeline in flywheel close-loop (stage 5)
  • Global ~/.agents/ knowledge tier for cross-repo learning sharing (0.8 weight penalty, deduped)
  • Bead metadata resolver reads from env vars (HOOK_BEAD_TITLE, HOOK_BEAD_LABELS) or cache file
  • Goal templates embedded in binary (go-cli, python-lib, web-app, rust-cli, generic) for ao goals init --template and ao seed
  • Platform-specific process-group isolation for goal check timeouts (Unix: SIGKILL pgid, Windows: taskkill /T)
  • SessionStart hook rewritten with 3 startup modes: lean (default), manual, legacy — via AGENTOPS_STARTUP_CONTEXT_MODE
  • SessionEnd hook now gates notebook update and memory sync on successful forge
  • Type 3 setup hook template: hooks/examples/50-agentops-bootstrap.sh
  • Constraint compiler hook: hooks/constraint-compiler.sh
  • Codex-native skill format (skills-codex/) with install and sync scripts for cross-runtime skill delivery
  • Comprehensive cmd/ao test coverage push — 500+ tests across 5 waves reaching 79.2% statement coverage (13 untestable functions excluded)

Changed

  • SessionStart hook default mode changed from full inject to lean (extract + lean inject, shrinks when MEMORY.md is fresh)
  • ao flywheel close-loop now applies ALL maturity transitions (not just anti-pattern)
  • ao hooks generated config uses script-based commands instead of inline ao invocations
  • ao rpi prefers epic-type issues before falling back to any open issue

Fixed

  • truncateText now uses rune-safe []rune slicing to avoid breaking multi-byte UTF-8 characters
  • syncMemory extracted from Cobra handler for testability
  • parseManagedBlock detects duplicate markers and refuses to parse (prevents data loss)
  • readNLatestSessionEntries warns on skipped unreadable session files
  • readSessionByID detects ambiguous matches and returns error instead of first substring match
  • findMemoryFile broad contains-fallback removed (was matching wrong projects)
  • pruneNotebook iteration capped at 100 to prevent runaway loops
  • MEMORY_AGE_DAYS sentinel initialized to -1 (was 0, causing false lean-mode activation when file missing)
  • Lean-mode guard now requires MEMORY_AGE_DAYS >= 0 before comparing freshness
  • Memory sync moved inside forge success gate in session-end hook
  • ao search --json returns [] (empty JSON array) when no results, instead of human-readable text
  • ao doctor returns DEGRADED status for warnings without failures (previously only HEALTHY/UNHEALTHY)
  • ao rpi status goroutine leak fix — signal channel properly cleaned up
  • Inline rune truncation in formatMemoryEntry replaced with shared truncateText
  • 6 new tests for dedup, ambiguity detection, iteration cap, duplicate markers
  • Cobra pflag state pollution between test invocations — explicit flag reset in executeCommand() helper
  • Goals validate.sh outdated checks and missing validate.sh for 7 skills
  • 10 tech debt findings from ag-8km+ag-chm post-mortem (stale nudge, scanner, docs)
  • ao binary codesigned with stable Mach-O identifier
  • Hook integration tests updated — removed 8 stale standalone ao-* hook tests consolidated into session-end-maintenance.sh

[2.17.0] - 2026-02-24

Added

  • GOALS.md (v4) OODA-driven intent layer — markdown-based goals format with mission, north/anti stars, and steerable directives
  • ao goals init interactive GOALS.md bootstrap with --non-interactive mode
  • ao goals steer command to add, remove, and prioritize directives
  • ao goals prune command to remove stale gates referencing missing paths
  • ao goals migrate --to-md converter from GOALS.yaml to GOALS.md format
  • ao goals measure --directives JSON output of active directives
  • ao goals validate reports format and directive count
  • Format-aware ao goals add writeback (auto-detects md or yaml)
  • Go markdown parser library with case-insensitive heading matching and round-trip rendering (26 tests)
  • /goals skill rewritten with 5 OODA verbs (init/measure/steer/validate/prune)
  • /evolve Step 3 rewritten with directive-based cascade for idle reduction

Fixed

  • ao rpi falls back to any open issue when no epic exists (#50)
  • RPI phased processing tests added (~230 lines) for writePhaseResult, validatePriorPhaseResult, heartbeat, and registry directory

[2.16.0] - 2026-02-23

Added

  • Evolve idle hardening — disk-derived stagnation detection, 60-minute circuit breaker, rolling fitness files, no idle commits
  • Evolve --quality mode — findings-first priority cascade that prioritizes post-mortem findings over goals
  • Evolve cycle-history.jsonl canonical schema standardization and artifact-only commit gating
  • heal-skill checks 7-10 with --strict CI gate for automated skill maintenance
  • 6-phase E2E validation test suite for RPI lifecycle (gate retries, complexity scaling, phase summaries, promise tags)
  • Fixture-based CLI regression and parity tests
  • ao goals migrate command for v1→v2 GOALS.yaml migration with deprecation warning (#48)
  • Goal failure taxonomy script and tests

Changed

  • CLI taxonomy, shared resolver, skill versioning, and doctor dogfooding improvements (6 architecture concerns)
  • GoReleaser action bumped from v6 to v7
  • Evolve build detection generalized from hardcoded Go gate to multi-language detection

Fixed

  • ao pool list --wide flag and pool show prefix matching (#47)
  • Consistent artifact counts across doctor, badge, and metrics (#46)
  • Double multiplication in vibe-check score display (#45)
  • Skills installed as symlinks now detected and checked in both directories (#44)
  • Learnings resolved by frontmatter ID; .md file count in maturity scan (#43)
  • JSON output truncated at clean object boundaries (#42)
  • Misleading hook event count removed from display (#41)
  • Post-mortem schema model field and resolver DiscoverAll migration
  • 15+ missing skills added to catalog tables in using-agentops
  • Handoff example filename format corrected to YYYYMMDDTHHMMSSZ spec
  • Quickstart step numbering corrected (7 before 8)
  • OpenAI docs skill: added Claude Code MCP alternative to Codex-only fallback
  • Dead link to conflict-resolution-algorithm.md removed from post-mortem
  • ao forge searchao search in provenance and knowledge skills
  • OSS docs: root-level doc path checks, removed golden-init reference
  • Reverse-engineer-rpi fixture paths and contract refs corrected
  • Crank: removed missing script refs, moved orphans to references
  • Codex-team: removed vaporware Team Runner Backend section
  • Security skill: bundled security-gate.sh, fixed security-suite path
  • Evolve oscillation detection and TodoWrite→Task tools migration
  • Wired check-contract-compatibility.sh into GOALS.yaml
  • Synced embedded skills and regenerated CLI docs