Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
[Unreleased]
Added
- Strict Delegation Contract for `/rpi`, `/discovery`, and `/validation` — top-level orchestrator skills now declare strict sub-skill delegation as the default. Each skill points to the new canonical reference `skills/shared/references/strict-delegation-contract.md`, which documents the contract, anti-pattern rationalizations, and supported compression escapes (`--quick`, `--fast-path`, `--no-retro`, `--no-forge`, `--skip-brainstorm`, `--no-scaffold`, `--no-behavioral`, `--allow-critical-deps`). There is no `--full` flag — strict delegation is always on.
- Orchestrator Compression Anti-Pattern learning at `.agents/learnings/2026-04-19-orchestrator-compression-anti-pattern.md`, surfaced via `ao inject` at session start. Includes detection phrases, corrective actions, and rationalizations to reject.
- Orchestrator-owned step markers in `skills/crank/SKILL.md` (STEP 3a.3, STEP 6.5 slop-scan, STEP 8.7) plus an "Inline Work Policy" footer documenting which steps are intentionally inline vs delegated.
Removed
- Archived AO↔Olympus bridge integration: removed `docs/ol-bridge-contracts.md`, `docs/architecture/ao-olympus-ownership-matrix.md`, MemRL policy contracts, `skills/*/scripts/ol-*.sh`, `cli/cmd/ao/inject_ol_test.go`, and associated CLI types (`OLConstraint`, `gatherOLConstraints`, `.ol/` directory collector). Olympus predecessor's useful patterns live on inside `ao`.
Changed
- `--no-lifecycle` in `/discovery` renamed to `--no-scaffold` for semantic clarity — the flag controls STEP 4.5 scaffold auto-invocation only, not broader lifecycle checks. `--no-lifecycle` is honored as a deprecated alias through v2.40.0; when both flags are passed, they are equivalent. Other skills (`/crank`, `/validation`, `/implement`, `/evolve`) retain `--no-lifecycle` with its existing lifecycle-skill-invocation semantics.
- `/discovery` flags table expanded: `--auto` is now explicitly documented (was transitively honored but undocumented); `--interactive` scope clarified ("research + plan gates, not pre-mortem").
- `/validation` flags table expanded: `--complexity=<level>` syntax formalized to match `/rpi` and `/discovery`; `--interactive` scope documented.
- `/rpi` `--interactive` flag scope note added: applies to discovery (research + plan) and validation (Gate 1, Gate 2); does NOT override pre-mortem or vibe council autonomy.
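The alias rule for `--no-scaffold` can be illustrated with a small sketch. `/discovery` is a skill rather than a Go program, so this only models the rule with the standard `flag` package; `parseDiscoveryFlags` and its return values are hypothetical names introduced for illustration.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseDiscoveryFlags is a hypothetical helper. It treats --no-lifecycle as a
// deprecated alias of --no-scaffold, so either flag (or both) skips scaffolding.
func parseDiscoveryFlags(args []string) (skipScaffold, usedDeprecated bool, err error) {
	fs := flag.NewFlagSet("discovery", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep the sketch quiet on parse errors
	noScaffold := fs.Bool("no-scaffold", false, "skip STEP 4.5 scaffold auto-invocation")
	noLifecycle := fs.Bool("no-lifecycle", false, "deprecated alias for --no-scaffold")
	if err := fs.Parse(args); err != nil {
		return false, false, err
	}
	return *noScaffold || *noLifecycle, *noLifecycle, nil
}

func main() {
	skip, deprecated, _ := parseDiscoveryFlags([]string{"--no-lifecycle"})
	fmt.Println(skip, deprecated) // true true
}
```

Returning `usedDeprecated` separately would let a caller print a deprecation notice without changing behavior.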
Fixed
- Orchestrator compression vulnerability — a live compression was observed on 2026-04-19 where `/rpi` was invoked but phases were inlined instead of delegated. This release documents the anti-pattern (forged learning + loud skill text), scaffolds future enforcement (shared contract reference used by all 6 orchestrator skills), and explicitly defers runtime hook enforcement to a follow-up initiative. It does not mechanically prevent compression yet — the durable fix depends on `ao inject` surfacing the forged learning on future session starts. See `.agents/research/2026-04-19-rpi-skill-dag-audit.md` for the audit and `.agents/plans/2026-04-19-rpi-dag-hardening.md` for the remediation plan.
[2.37.2] - 2026-04-15
Added
- Swarm evidence validation — AgentOps now ships a swarm-evidence schema and validator, and wires that proof surface into validation and release gates.
- Lead-only worker git guard — worker sessions now have an explicit lead-only git guard in the hook chain, reducing accidental write authority in multi-agent runs.
- Compile and harvest operator controls — `ao compile` adds runtime preference plus `--reset` and `--repair` controls, while harvest now reports excluded low-confidence candidates and top near-misses.
Changed
- Release and pre-push validation — local release, pre-push, and command coverage gates now validate more of the hook, evidence, and Codex runtime surface before publish.
- Codex/runtime artifacts and docs — compile, evolve, post-mortem, swarm, and related runtime docs and artifacts were decomposed and synchronized to better match shipped behavior.
- Flywheel backlog bookkeeping — next-work aggregates, consumed markers, and enum normalization were cleaned up so carry-forward work is recorded consistently.
Fixed
- Pre-mortem gate ambiguity — the crank pre-mortem gate now denies ambiguous state by default instead of failing open.
- CLI and shell reliability edges — `ao rpi serve --run-id` now accepts legacy 8-hex IDs, `ao mine --dry-run` emits a single clean JSON payload, and bash invocations are sanitized to bypass unsafe shell aliases.
- Compile, harvest, and release drift — compile repair defaults, malformed frontmatter salvage, YAML parse error surfacing, CI fixture drift, shellcheck drift, and Codex artifact metadata drift were corrected.
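The "deny ambiguous state by default" behavior in the pre-mortem gate fix is the fail-closed pattern. A minimal sketch, assuming a tri-state verdict; `GateState` and `allowCrank` are hypothetical names, not the project's actual types:

```go
package main

import "fmt"

// GateState is a hypothetical tri-state for a pre-mortem verdict.
type GateState int

const (
	GateUnknown GateState = iota // zero value: missing or unparseable verdict
	GatePass
	GateFail
)

// allowCrank fails closed: only an explicit pass lets execution proceed.
// Unknown is treated the same as a failure.
func allowCrank(s GateState) bool {
	return s == GatePass
}

func main() {
	for _, s := range []GateState{GateUnknown, GatePass, GateFail} {
		fmt.Println(s, allowCrank(s))
	}
}
```

Making the ambiguous state the zero value means an uninitialized or partially decoded verdict is denied without any extra checks.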
[2.37.1] - 2026-04-15
Added
- Dream morning packets — Dream can now emit ranked morning work packets with evidence, target files, exact follow-up commands, and queue/bead handoff metadata.
- Dream yield telemetry and long-haul corroboration — overnight reports now record packet-confidence telemetry and can trigger a bounded long-haul corroboration pass when the first pass produces weak morning output.
Changed
- Dream decision flow — overnight runs now prefer cheaper evidence corroboration before slower council fan-out, so strong runs stay short and extended runtime is reserved for genuinely weak output.
Fixed
- Headless Claude Dream council — Dream now uses Claude's working JSON output contract for headless council runs and normalizes the returned envelope before validation.
- Dream close-loop and report surfaces — overnight runs now write real close-loop callbacks and post-loop report artifacts instead of leaving placeholder `pending` steps.
- Retrieval ratchet release gate fallback — the retrieval-quality release check now falls back to checked-in eval data when a local manifest is absent.
[2.37.0] - 2026-04-14
Added
- Windows install and smoke coverage — `scripts/install-ao.ps1` adds a first-class Windows install path, and the blocking `windows-smoke` gate exercises PowerShell install, local `ao doctor`, and Windows-sensitive Go packages.
- Compile command — `ao compile` makes knowledge compilation a first-class CLI surface with docs and tests.
- Local LLM forge pipeline — `ao forge` can now redact, summarize, structurally review, and queue transcript-derived wiki pages with Dream worker integration.
- Dream curator and evolve sub-cycle — Dream gained a local curator adapter plus `ao evolve --dream-first|--dream-only`, allowing overnight knowledge passes to feed the daytime improvement loop.
- `.agents` wiki surfaces — INDEX, LOG, wiki directories, and search integration formalize `.agents/` as a Karpathy-style knowledge wiki with index-first navigation.
- Operational quality surfaces — beads audit/cluster commands, swarm preflight advice, status quality signals, retrieval eval queries, and a retrieval-quality CI ratchet broaden release-time proof.
Changed
- Knowledge scoring and search behavior — inject now deduplicates by content hash, boosts indexed pages, weights stability, and search can pull Dream vault and wiki sources with stronger local recall.
- Overnight and RPI internals — overnight, lifecycle, search, inject, harvest, and RPI flows were decomposed into smaller helpers while tightening proof paths, mixed-mode provenance, and worktree cleanup.
- Public framing and contributor docs — README, philosophy, planning/post-mortem docs, and reference surfaces now better match the context-compiler and operational-layer story.
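Deduplicating by content hash, as the knowledge scoring entry describes for inject, can be sketched as follows. The changelog does not say which hash or which fields are hashed; SHA-256 over the whole item text and the name `dedupeByContent` are assumptions.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// dedupeByContent keeps the first item for each distinct content hash,
// preserving input order. SHA-256 is an assumed choice of hash.
func dedupeByContent(items []string) []string {
	seen := make(map[[sha256.Size]byte]struct{}, len(items))
	out := make([]string, 0, len(items))
	for _, it := range items {
		sum := sha256.Sum256([]byte(it))
		if _, dup := seen[sum]; dup {
			continue // identical content already emitted
		}
		seen[sum] = struct{}{}
		out = append(out, it)
	}
	return out
}

func main() {
	fmt.Println(dedupeByContent([]string{"a", "b", "a"})) // [a b]
}
```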
Fixed
- Windows overnight liveness — Windows process checks no longer rely on Unix `signal(0)` semantics.
- Dream RunLoop status invariants — live-tree hash coverage now exercises every terminal RunLoop status, and `degraded` reflects the current rollback semantics.
- Release retag safety — release tooling now preserves annotated tags, validates audit artifact manifests and refs, and cancels stale reruns before duplicate publish attempts.
- Post-mortem and closure audits — metadata links, evidence-only closure packets, parser-path handling, and closure packet evidence modes were normalized.
- Codex and runtime reliability — same-thread lifecycle restart, root-scoped fallback reads, JSON config writes, bridge contract validation, and next-work proof-path handling were hardened.
[2.36.0] - 2026-04-11
Added
- Evolve operator command — `ao evolve` now exposes the v2 autonomous improvement loop directly in the CLI, including `--max-cycles`, `--queue`, `--beads-only`, `--quality`, `--compile`, and strict-quality passthrough flags.
- Autodev program contract — root `PROGRAM.md` gives evolve/autodev a repo-local operating contract with mutable and immutable scope, validation commands, escalation policy, and stop conditions.
- Beads stale-scope tooling — `ao beads verify|lint|harvest` adds first-class stale-citation checks for bead-driven planning and RPI recovery.
- RPI discovery artifacts — RPI can now persist and consume discovery artifacts, with tests and docs covering the `--discovery-artifact` path.
- Dream RunLoop invariant coverage — `TestRunLoop_LiveTreeHashInvariant_AllStatuses` locks the `IsCorpusCompounded()` and live-tree mutation invariant across deterministically reproducible terminal statuses.
- Dream failed-summary contract coverage — regression tests now lock the `finalizeOvernightSummary` contract for MEASURE consecutive-failure halts and persisted iteration history.
- Dream operator mode — `ao overnight start|run|report|setup` adds a private overnight lane with shared `dream.*` config, keep-awake defaults, scheduler/bootstrap guidance, council-ready runner packets, and DreamScape-style morning summaries
- Nightly live retrieval proof — the dream-cycle now runs `ao retrieval-bench --live --json`, emits retrieval proof in nightly summaries, and keeps a visible artifact trail for flywheel health
- Pattern-to-skill drafts — repeated patterns can now generate review-only skill drafts under `.agents/skill-drafts/` during flywheel close-loop
- Fresh-repo onboarding welcome — new session-start routing helps first-time repos enter discovery, implementation, or validation without needing the full RPI lane first
- Docs-site and contribution proof surfaces — GitHub Pages navigation, comparison pages, behavioral-discipline guidance, strategic-doc validation patterns, and a first-skill guide expand the public proof surface
Changed
- RPI wave recovery integrated — recovered RPI wave work landed across Dream, council, stale-scope planning, discovery artifacts, CI hardening, and Codex runtime surfaces.
- Council `--mixed` strict contract documented — `skills/council/references/cli-spawning.md` documents that `/council --mixed` requires Codex CLI and emits a hard error instead of silently falling back to Claude-only.
- Plan and pre-mortem skill bodies decomposed — focused reference files now carry the detailed pre-decomposition, scope-mode, mandatory-check, output, wave-matrix, and task-creation guidance while keeping the top-level skills within lint budgets.
- Bead-input pre-flight wired into planning skills — `/plan` and `/pre-mortem` invoke `ao beads verify <bead-id>` for full-complexity, aged, or prior-session bead inputs before decomposition or validation.
- Operational-layer framing — README, onboarding, docs, comparisons, and linked surfaces now consistently explain AgentOps as bookkeeping, validation, primitives, and flows for coding agents
- Dream runtime positioning — the public GitHub nightly is now documented as a proof harness, while `ao overnight` is documented as the private local compounding engine
- Codex default path — native hooks, install copy, runtime smoke coverage, and checked-in Codex artifacts are aligned around the native-plugin path on supported Codex versions
- Validation guidance — behavioral-discipline and strategic-doc review are now first-class references alongside code review and runtime validation
Fixed
- Windows Codex installer — Codex installation now has a Windows path instead of assuming Unix shell behavior.
- golangci-lint v2 contract — the local lint wrapper and CI configuration now pin the v2 behavior expected by the repository.
- security-toolchain-gate CI — deterministic fixture generation in `cli/internal/overnight/fixture/gen_fixture.go` is annotated as a non-cryptographic seeded-random use, avoiding a false-positive semgrep blocker.
- Recovered RPI validation blockers — validation drift from the recovered RPI wave was cleared before retagging the release.
- Stale-scope reference placement — shared stale-scope validation guidance now lives under `skills/shared/references/` so `heal.sh --strict` can resolve it consistently.
- Release and CI drift — resolved docs-site Liquid/frontmatter issues, headless runtime smoke portability problems, pre-push shim test drift, and compile-skill headless command drift caught during release prep
- Codex install and artifact drift — fixed stale slash-command references, refreshed checked-in artifact metadata, added a Codex compile wrapper, and corrected plugin/marketplace mismatches exercised by smoke coverage
- Runtime proof stability — promoted Codex runtime smoke into the blocking smoke path and fixed related shellcheck and install-surface rough edges
Removed
- DevOps-rooted tagline — public framing no longer leads with the old DevOps-layer tagline; the Three Ways lineage remains supporting doctrine instead of the category label
[2.35.0] - 2026-04-07
Added
- Codex native hooks — AgentOps hooks now install natively into Codex CLI v0.115.0+ via `~/.codex/hooks.json`; 8 hooks wired (session-start, inject, flywheel-close, prompt-nudge, quality-signals, go-test-precommit, commit-review, ratchet-advance); installer enables `codex_hooks` feature and upgrades from hookless fallback to native hook runtime
- Knowledge compiler skill — renamed athena → `/compile` with Karpathy-style incremental compilation, pluggable LLM backend (`AGENTOPS_COMPILE_RUNTIME=ollama|claude`), interlinked markdown wiki output at `.agents/compiled/`
- App struct dependency injection — `App` struct carries `ExecCommand`, `LookPath`, `RandReader`, `Stdout`, `Stderr` seams; gc bridge, events, executor, context relevance, tracker health, and stream modules accept injected dependencies instead of mutable package-level vars
- Test shuffle in CI — `-shuffle=on` added to `validate.yml` and `Makefile` test targets, exposing and fixing 6 ordering-dependent tests (cobra flag leaks, maturity var leaks, env var leaks)
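The App struct entry names five seams but not their types. The sketch below shows one plausible shape: the field names come from the changelog, while the field types, `NewApp`, and `reportTool` are assumptions made for illustration.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// App bundles the seams named in the changelog entry.
// The field types are assumptions, chosen to match the standard library.
type App struct {
	ExecCommand func(name string, args ...string) *exec.Cmd
	LookPath    func(file string) (string, error)
	RandReader  io.Reader
	Stdout      io.Writer
	Stderr      io.Writer
}

// NewApp wires the real implementations (hypothetical constructor).
func NewApp() *App {
	return &App{
		ExecCommand: exec.Command,
		LookPath:    exec.LookPath,
		RandReader:  rand.Reader,
		Stdout:      os.Stdout,
		Stderr:      os.Stderr,
	}
}

// reportTool is a hypothetical command body that only touches injected seams.
func (a *App) reportTool(name string) bool {
	path, err := a.LookPath(name)
	if err != nil {
		fmt.Fprintf(a.Stderr, "%s: not found\n", name)
		return false
	}
	fmt.Fprintf(a.Stdout, "%s: %s\n", name, path)
	return true
}

func main() {
	// A test can swap seams per instance instead of mutating package-level vars.
	var out bytes.Buffer
	app := NewApp()
	app.Stdout = &out
	app.LookPath = func(string) (string, error) { return "/fake/bin/git", nil }
	app.reportTool("git")
	fmt.Print(out.String()) // git: /fake/bin/git
}
```

Because each test builds its own `App`, tests stay independent of execution order, which fits the `-shuffle=on` entry in the same release.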
Changed
- CLI internal extraction (waves 5-13) — business logic extracted from `cmd/ao` monolith into 15 `internal/` domain packages (`rpi`, `search`, `context`, `quality`, `goals`, `lifecycle`, `bridge`, `forge`, `mine`, `plans`, `knowledge`, `storage`, `pool`, `taxonomy`, `worker`) using Options struct pattern for dependency injection
- Goals test migration — 7 goals test files moved from `cmd/ao` to `internal/goals` as external test package (`goals_test`) with `t.Parallel()` and direct `goals.Run*()` calls replacing cobra command wiring
- Test isolation — `resetCommandState` now saves/restores 10 maturity globals; `resetFlagChangesRecursive` resets flag values to defaults; RPILoop and toolchain tests clear `AGENTOPS_RPI_RUNTIME*` env vars via `t.Setenv`
Fixed
- Defrag test flag leak — `TestDefragOutputDirFlag` used `cmd.Flags().Lookup("output")` which matched the root persistent `--output` flag; changed to `cmd.LocalFlags().Lookup("output")`
- Goroutine leak false positive — `TestRunGoals_GoroutineLeak` used `goleak.VerifyNone` which caught goroutines from parallel tests; switched to `goleak.IgnoreCurrent()` to only detect leaks within the test itself
- Secret scan false positives — excluded `.gc/` directory and `Getenv`/`os.Environ` patterns from secret pattern scan
- Codex skill validation — added `output_contract` as valid schema key, `cross-vendor`/`knowledge` as valid tiers, fixed `$/` prefix in codex forge/post-mortem/scenario skills
- Scenario CLI snippets — replaced non-existent `--source`/`--scope` flags with valid `--status` variants
Removed
- Coverage percentage CI gates — removed `coverage-ratchet` job, `check-cmdao-coverage-floor.sh`, `.coverage-baseline.json`, and associated BATS tests; percentage gates blocked CI during architectural refactors without catching bugs
- `fire.go` — FIRE loop (find-ignite-reap-escalate) superseded by gc sling + bead dispatch; `formatAge` helper moved to `inject_predecessor.go`
- `rpi_workers.go` — per-worker health display superseded by gc agent health patrol; `ao rpi workers` subcommand removed from CLI and docs
[2.34.0] - 2026-04-05
Added
- Stage 4 Behavioral Validation — new validation tier between council/vibe and production:
  - Holdout scenarios stored in `.agents/holdout/` with PreToolUse isolation hook preventing implementing agents from seeing evaluation criteria
  - Satisfaction scoring (0.0-1.0 probabilistic) in verdict schema v4, replacing boolean-only PASS/FAIL
  - Agent-built behavioral specs generated during `/implement` Step 5c
  - `/scenario` skill for authoring and managing holdout scenarios
  - `ao scenario init|list|validate` CLI commands (4 subcommands, 11 tests)
  - STEP 1.8 in `/validation` pipeline evaluating holdout scenarios + agent specs
  - `schemas/scenario.v1.schema.json` defining the holdout scenario format
- Flywheel gate command — `ao flywheel gate` checks readiness for retrieval-expansion work (research closure, rho threshold, holdout precision@K)
- Citation confidence scoring — `citationEventIsHighConfidence` with bucketed confidence (0/0.5/0.7/0.9) gates MemRL rewards on match quality
- Retrieval bench refactor — train/holdout splits, section-aware scoring (`scoreBenchSections`), manifest-based benchmark cases
- Proof-backed next-work visibility — `classifyNextWorkCompletionProof` unifies completed-run, execution-packet, and evidence-only-closure proof types; context explain and stigmergic packet now report proof-backed suppressions
- Three-gap contract proof gates — lifecycle gap mapping gates added to GOALS.md
- Cross-vendor execution — `--mixed` flag for Claude + Codex council judges
- Gas City bridge — gc as default executor for RPI phase execution with L1-L3 tests
- 149 L2 integration tests — AI-native test shape ("L2 first, L1 always") validated at scale; coverage floor raised 78.8% → 81.0%
- Test coverage hardening — GPG commit-signing fixes, root-skip guards for containerized CI, 350+ lines of vibecheck detector/metrics tests, maturity.go empty-content bugfix
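The citation confidence entry gives four bucket values (0, 0.5, 0.7, 0.9) but not how a match is assigned to one. The sketch below is one possible reading: the `MatchKind` names and the 0.7 cutoff for "high confidence" are assumptions, and `isHighConfidence` is a stand-in rather than the real `citationEventIsHighConfidence`.

```go
package main

import "fmt"

// MatchKind names are hypothetical; the changelog only lists the bucket values.
type MatchKind int

const (
	MatchNone MatchKind = iota
	MatchWeak
	MatchPartial
	MatchExact
)

// confidenceBucket maps a match kind onto the four buckets 0, 0.5, 0.7, 0.9.
func confidenceBucket(k MatchKind) float64 {
	switch k {
	case MatchExact:
		return 0.9
	case MatchPartial:
		return 0.7
	case MatchWeak:
		return 0.5
	default:
		return 0
	}
}

// isHighConfidence gates rewards on the bucket; the 0.7 cutoff is an assumption.
func isHighConfidence(k MatchKind) bool {
	return confidenceBucket(k) >= 0.7
}

func main() {
	for _, k := range []MatchKind{MatchNone, MatchWeak, MatchPartial, MatchExact} {
		fmt.Println(confidenceBucket(k), isHighConfidence(k))
	}
}
```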
Changed
- Codex parity hook — `codex-parity-warn.sh` now supports opt-in blocking mode via `AGENTOPS_CODEX_PARITY_BLOCK=1` (exit 2 instead of advisory)
- 12-factor doctrine — compressed from 474 to 114 lines, reframed as supporting lens rather than product definition
- Skill count — 65 → 66 (added `/scenario`)
- Research skill — now persists reusable findings to `.agents/findings/registry.jsonl` with finding-compiler refresh
- Closure integrity audit — accepts durable closure packets without scoped-file sections as valid evidence
- Proof-backed legacy entries — `shouldSkipLegacyFailedEntry` uses `CompletionEvidence` field (proof-only, no heuristic fallback)
- `readQueueEntries` — returns all non-consumed entries; proof filtering is downstream via `shouldSkipLegacyFailedEntry`
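The opt-in blocking mode of the parity hook can be modeled as a small decision function. The real hook is a shell script; this Go sketch only illustrates the rule, and two details are assumptions: advisory mode exits 0, and only the literal value `1` enables blocking.

```go
package main

import (
	"fmt"
	"os"
)

// parityExitCode returns the exit status for a parity check:
// 2 when drift was found and blocking mode is enabled, otherwise 0 (advisory).
// Treating only the literal "1" as enabled is an assumption.
func parityExitCode(driftFound bool, blockEnv string) int {
	if driftFound && blockEnv == "1" {
		return 2
	}
	return 0
}

func main() {
	// Pretend the check found drift and read the opt-in switch from the environment.
	code := parityExitCode(true, os.Getenv("AGENTOPS_CODEX_PARITY_BLOCK"))
	fmt.Println("exit code:", code)
}
```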
Fixed
- 6 CI failure categories resolved in one commit (f1b83b25)
- Cobra test registration — `scenario` and `flywheel gate` added to expectedCmds
- Citation feedback test — assertion corrected for recorded confidence preference (0.5 not 0.7)
- RPI hardening — UAT version pre-flight, goals history filter, proof-backed suppression, fail-closed gates, cross-epic handoff contamination, bare ag- prefix guard
- Branch consolidation — 10 stale Codex branches analyzed, cherry-picked (9 commits, ~3,500 lines), and deleted; 25 orphaned worktrees pruned
- git rerere enabled — conflict resolution memory for future merges
[2.33.0] - 2026-04-02
Added
- Backlog hygiene gates — added `bd-audit.sh`, `bd-cluster.sh`, and Crank/Codex guidance for cleaning stale or mergeable beads before execution
- Retrieval benchmarking and global scope — added `ao retrieval-bench`, benchmark corpora, `--live`, `--global`, and nightly IR regression coverage
- `/red-team` adversarial validation — added a persona-based validation skill plus checked-in Codex runtime artifacts
- Software factory operator lane — added a CLI/operator surface and Claude factory startup routing for software-factory workflows
- Flywheel maintenance utilities — added global garbage purge tooling and nightly retrieval benchmarking for knowledge quality tracking
Changed
- Release policy — removed the enforced release cadence gate so releases no longer block on a minimum wait between tags
- Knowledge operator surfaces — plan and validation now wire knowledge operator surfaces directly into execution flow
- Proof and runtime docs — goals, RPI docs, and contributor guidance now reflect the expanded proof surfaces and hookless runtime behavior
Fixed
- Codex artifact parity — restored checked-in Codex parity for red-team and cleaned Codex runtime metadata/frontmatter drift across crank, forge, post-mortem, release, and swarm artifacts
- Retrieval quality — replaced exact-substring filtering with token-level matching and tuned penalty, deduplication, and OR-fallback behavior
- Harvest metadata preservation — promotion now preserves source metadata and fills missing maturity, utility, and type fields safely
- Release tooling — release artifact directories are created safely and audit artifacts now resolve against release tag names
- Documentation and link drift — repaired the post-mortem Codex link and aligned runtime docs around the newer startup and lifecycle flows
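The retrieval quality fix replaced exact-substring filtering with token-level matching. The difference can be sketched as follows; the tokenizer and the all-tokens-must-match rule are assumptions (the entry also mentions an OR-fallback, which is not modeled here), and `matchesAllTokens` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// tokenize lowercases text and splits it on anything that is not a letter or digit.
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// matchesAllTokens reports whether every query token occurs in the document,
// regardless of order or adjacency.
func matchesAllTokens(query, doc string) bool {
	have := make(map[string]struct{})
	for _, t := range tokenize(doc) {
		have[t] = struct{}{}
	}
	for _, t := range tokenize(query) {
		if _, ok := have[t]; !ok {
			return false
		}
	}
	return true
}

func main() {
	doc := "Pre-push hook: validate the release gate"
	query := "release hook"
	fmt.Println(strings.Contains(strings.ToLower(doc), query)) // false
	fmt.Println(matchesAllTokens(query, doc))                  // true
}
```

The substring check misses the document because the two words are neither adjacent nor in query order, while the token check finds both.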
[2.32.0] - 2026-04-01
Added
- Knowledge activation skill — new `/knowledge-activation` skill and CLI surfaces for activating cross-domain knowledge at runtime, with operator surface consumption and ranked intelligence context
- Session intelligence engine — complete runtime engine with explainability, ranked context assembly, and trust policy enforcement
- Runtime selection for `ao rpi serve` — serve now supports explicit runtime selection for Claude and Codex execution modes
- Quality signals hook — new `quality-signals.sh` hook with test coverage for session quality telemetry
- Pre-push gate expansion — 9 checks migrated from CI-only to the local pre-push gate for faster feedback
- Inject stability warnings and status dashboard — closed 3 harvest items with signal tests and dashboard improvements
Changed
- README refresh — product-minded rewrite with gain-framing and Strunk-style prose fixes
- Philosophy doc — new `docs/philosophy.md` and observations section added to README
- Documentation alignment — repo front doors and codex artifact guidance unified across entry points
- Claude Code architecture lessons — retry budgets, stability flags, quality signals, and orchestration patterns applied to skills
- Homebrew formula — updated to v2.31.0 with pre-built binaries
Fixed
- Post-mortem closure integrity — normalized file parsing for closure integrity audits
- CI reliability — resolved CI failures across codex refs, test pairing, hook coverage, worktree handling, docs parity, hook portability, and codex lifecycle
- Lookup nested scanning — `ao lookup` now scans nested global knowledge directories correctly
- Pre-push test stubs — added test stubs for new pre-push checks, skip non-shell in shellcheck
Dependencies
- Bumped `codecov/codecov-action` from 5 to 6
- Bumped `DavidAnson/markdownlint-cli2-action` from 22 to 23
[2.31.0] - 2026-03-30
Added
- 9 lifecycle skills — bootstrap, deps, design, harvest, perf, refactor, review, scaffold, and test skills wired into RPI with auto-invocation and mechanical gates
- `ao harvest` — cross-rig knowledge consolidation extracts and catalogs learnings from sibling crew workspaces
- `ao context packet` — inspect stigmergic context packets for debugging inter-session handoff state
- Hook runtime contract — formal Claude/Codex/manual event mapping with runtime-aware hook tooling
- Evidence-driven skill enrichment — production meta-knowledge, anti-patterns, flywheel metrics, and normalization defect detection baked into 9 skill reference files
- Research provenance — pending learnings now carry full research provenance for discoverability and citation tracking
- Context declarations — inject, provenance, and rpi skills declare their context requirements explicitly
- Goals and product output templates — `/goals` and `/product` produce evidence-backed structured output
Changed
- Three-gap context lifecycle contract — README, PRODUCT.md, positioning docs, and operational guides reframed around the context lifecycle model
- Dual-runtime hook documentation — runtime modes table and troubleshooting updated for Claude + Codex hook coexistence
Fixed
- CI reliability — resolved 4 pre-existing CI failures, restored headless runtime preflight, repaired codex parity drift checks
- `ao lookup` retrieval — fixed retrieval gaps that caused lookup to return no results
- Embedded sync — using-agentops SKILL.md and `.agents/.gitignore` now written correctly on first session start
- Closure integrity — 24h grace window for close-before-commit evidence, normalized file parsing
- Skill lint compliance — vibe, post-mortem, crank, and plan skills trimmed or restructured to stay under 800-line limit
- Codex tool naming — added CLAUDE_TOOL_NAMING rule and fixed 5 Claude-era tool references in codex skills
- ASCII diagram consistency — aligned box-drawing characters across 23 documentation files
- Fork exhaustion prevention — replaced jq with awk in validate-go-fast to prevent fork bombs on large repos
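The 24h grace window in the closure integrity entry can be read as a simple time comparison. The sketch assumes the window runs forward from the close time to the commit time; that direction, and the names `withinGrace` and `exampleClose`, are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"time"
)

// graceWindow is the 24h allowance named in the changelog entry.
const graceWindow = 24 * time.Hour

// exampleClose is a fixed close timestamp used only for illustration.
var exampleClose = time.Date(2026, 3, 30, 12, 0, 0, 0, time.UTC)

// withinGrace reports whether a commit still counts as evidence for a close.
// Commits at or before the close always count; later commits count only if
// they land within the grace window.
func withinGrace(closedAt, committedAt time.Time) bool {
	if !committedAt.After(closedAt) {
		return true
	}
	return committedAt.Sub(closedAt) <= graceWindow
}

func main() {
	fmt.Println(withinGrace(exampleClose, exampleClose.Add(23*time.Hour))) // true
	fmt.Println(withinGrace(exampleClose, exampleClose.Add(25*time.Hour))) // false
}
```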
[2.30.0] - 2026-03-24
Added
- Codex hookless lifecycle support — `ao codex` runtime commands, lifecycle fallback, and Codex skill orchestration now cover hookless sessions end to end
- PROGRAM.md autodev contract — Added a first-class `PROGRAM.md` contract for autodev flows and taught `/evolve` and related RPI paths to use it
- Long-running RPI artifact visibility — Mission control now exposes run artifacts and evaluator output so long-running RPI sessions are replayable and easier to inspect
Changed
- Codex runtime maintenance flow — Refreshed Codex bundle hashes, lifecycle guards, runtime docs, and release validation coverage around the expanded Codex execution path
Fixed
- Codex RPI scoping and closeout — Tightened objective scope, epic scope, closeout ownership, and validation gaps in the Codex RPI lifecycle
- Release gate reliability — Restored headless runtime coverage, runtime-aware Claude inventory checks, and release-gate coherence validation
- Reverse-engineer repo hygiene — Repo-mode reverse engineer now ignores generated and temp trees when identifying CLI and module surfaces
[2.29.0] - 2026-03-22
Added
- Model cost tiers and config writes — `ao config` can now assign per-agent models by cost tier and persist repo configuration changes directly
- Search brokerage over session history and repo knowledge — `ao search` now wraps upstream `cass` results with repo-local AgentOps artifacts by default
- Reviewer and post-mortem reference packs — Added model-routing, iterative-retrieval, confidence-scoring, write-time-quality, and conflict-recovery guidance across council, research, swarm, vibe, compile, and related skills
Changed
- Competitive comparison and CLI docs — Refreshed comparison docs, release smoke coverage, and command documentation around the expanded search/config surface
Fixed
- Flywheel proof and citation loop — Added deterministic proof fixtures, preserved exact research provenance, and made citation feedback artifact-specific so flywheel health reflects real closure state
- Search alignment with forged session history — Search now stays aligned with forged session artifacts and fallback behavior
- Hook-launched validation — Pre-push and release gates now isolate inherited git env/stdin correctly and cover newer hook scripts in integration tests
- Codex council profile parity — Source and checked-in Codex council docs are back in sync for the shared profile contract
[2.28.0] - 2026-03-21
Added
- Node repair operator — Crank now classifies task failures as RETRY (transient), DECOMPOSE (too complex), or PRUNE (blocked) with budget-controlled recovery
- Knowledge refresh auto-trigger — Lightweight compile defrag runs automatically at session end via new SessionEnd hook
- Configurable review agents — Project-level `.agents/reviewer-config.md` controls which judge perspectives council and vibe spawn
- Three-tier plan detail scaling — Plan auto-selects Minimal, Standard, or Deep templates based on issue count and complexity
- Adversarial ideation — Brainstorm Phase 3b stress-tests each approach with four red-team questions before user selection
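The node repair operator's three outcomes and its budget can be sketched as a small classifier. The action names come from the changelog; the `Failure` fields, the shared counter, and the rule that an exhausted budget prunes are assumptions.

```go
package main

import "fmt"

// RepairAction values mirror the three outcomes named in the changelog.
type RepairAction string

const (
	Retry     RepairAction = "RETRY"
	Decompose RepairAction = "DECOMPOSE"
	Prune     RepairAction = "PRUNE"
)

// Failure fields are hypothetical signals a classifier might inspect.
type Failure struct {
	Transient  bool
	TooComplex bool
	Blocked    bool
}

// classify picks a repair action and charges recoveries against a shared budget.
// A blocked task, an unclassified failure, or an exhausted budget is pruned.
func classify(f Failure, budget *int) RepairAction {
	if f.Blocked || *budget <= 0 {
		return Prune
	}
	switch {
	case f.Transient:
		*budget--
		return Retry
	case f.TooComplex:
		*budget--
		return Decompose
	default:
		return Prune
	}
}

func main() {
	budget := 1
	fmt.Println(classify(Failure{Transient: true}, &budget)) // RETRY
	fmt.Println(classify(Failure{Transient: true}, &budget)) // PRUNE
}
```

The second call prunes because the first retry spent the only unit of budget, which keeps recovery bounded.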
Fixed
- Crank SKILL.md line limit — Consolidated duplicate References sections to stay under 800-line skill lint limit
- Codex skill parity — Synced all five competitive features to skills-codex with reference file copies
[2.27.1] - 2026-03-20
Fixed
- Flywheel golden signals always shown — Golden signals were gated behind the `--golden` flag, causing `ao flywheel status` to report "COMPOUNDING" while the hidden golden signals analysis showed "accumulating". Golden signals now compute and display by default.
[2.27.0] - 2026-03-20
Added
- Flywheel golden signals — Four derived health indicators (velocity trend, citation pipeline, research closure, reuse concentration) that distinguish knowledge compounding from noise accumulation; accessible via `ao flywheel status --golden`
- Forge-to-pool bridge — Forge auto-writes pending learnings as markdown to `.agents/knowledge/pending/` for close-loop pool ingestion
- SessionStart citation priming — `ao lookup` wired into SessionStart hook to close the citation gap between inject and session context
- Skill catalog quality — Improved descriptions, extraction patterns, and reference linking across skill catalog
Fixed
- `.agents/.gitignore` scope — Replaced broad `!*/` pattern with explicit subdirectory list to prevent accidental tracking
- Codex runtime skill parity — Hardened Codex runtime skill discovery and validation
- Codex install smoke tests — Fixed test assertions for install path edge cases
Changed
- CLI reference docs — Regenerated with updated date stamps
[2.26.1] - 2026-03-16
Fixed
- RPI stops after Phase 2 — Restructured rpi, discovery, and validation orchestrator skills as compact DAGs with execution sequence in a single code block; eliminates LLM stopping between phases due to `###` section headings acting as natural breakpoints
- Test grep patterns for DAG headings — Updated `test-tuning-defaults.sh` to match new complexity-scaled gate headings after DAG restructure
Changed
- Goals reimagined — GOALS.md rebuilt from first principles with fitness gate fixes
- README progressive disclosure — Lead with moats, collapse detail into expandable sections
- CLI reference docs — Regenerated with updated date stamps
- Doctor + findings helpers — Added CLI test coverage for extracted helpers
[2.26.0] - 2026-03-15
Added
- BF6–BF9 test pyramid levels — Regression (bug-specific replay), Performance/Benchmark, Backward Compatibility, and Security (in-test) bug-finding levels with language-specific patterns for Go and Python
- Test pyramid decision tree expansion — 4 new routing questions for BF6–BF9 in the "When to Use" guide
- RPI phase mapping for BF6–BF9 — Bug fix → BF6 mandatory, hot-path → BF7 benchmark, format change → BF8 compat fixture, secrets → BF9 redaction tests
- `regen-codex-hashes.sh` — Manifest hash regeneration script for Codex skill maintenance
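The RPI phase mapping for BF6–BF9 listed above is a lookup from kind of change to required test level. A minimal sketch, where the four pairings come from the changelog and the `ChangeKind` labels and `requiredLevels` helper are hypothetical:

```go
package main

import "fmt"

// ChangeKind labels are hypothetical keys for the four cases in the changelog.
type ChangeKind string

const (
	BugFix       ChangeKind = "bug-fix"
	HotPath      ChangeKind = "hot-path"
	FormatChange ChangeKind = "format-change"
	Secrets      ChangeKind = "secrets"
)

// requiredLevel records the pairing stated in the RPI phase mapping entry.
var requiredLevel = map[ChangeKind]string{
	BugFix:       "BF6", // regression replay
	HotPath:      "BF7", // benchmark
	FormatChange: "BF8", // backward-compat fixture
	Secrets:      "BF9", // redaction tests
}

// requiredLevels returns the BF levels implied by a set of change kinds,
// in the order given, skipping unknown kinds and duplicates.
func requiredLevels(kinds ...ChangeKind) []string {
	seen := make(map[string]bool)
	var levels []string
	for _, k := range kinds {
		lvl, ok := requiredLevel[k]
		if !ok || seen[lvl] {
			continue
		}
		seen[lvl] = true
		levels = append(levels, lvl)
	}
	return levels
}

func main() {
	fmt.Println(requiredLevels(BugFix, Secrets, BugFix)) // [BF6 BF9]
}
```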
Changed
- Go standards — Added benchmark tests (BF7), backward compat with `testdata/compat/` (BF8), regression test naming convention (BF6), security tests for path traversal (BF9)
- Python standards — Added Hypothesis property-based testing (BF1), `pytest-benchmark` patterns (BF7), backward compat with parametrized fixtures (BF8), regression test naming (BF6), secrets redaction tests (BF9)
- Coverage assessment template — Extended BF pyramid table from BF1–BF5 to BF1–BF9
Fixed
- Codex skill audit — 60+ findings fixed across all 54 Codex skills; removed orphaned `claude-code-latest-features.md` and `claude-cli-verified-commands.md` references
- Skill lint warnings — Resolved all warnings in crank, rpi, recover skills
- README skill references — Corrected broken references and linked orphaned templates
- Skill linter refs — Fixed directory reference and backtick formatting in reverse-engineer-rpi
- CHANGELOG sync hook — Replaced broken awk extraction with sed; awk failed on em-dash UTF-8 content producing header-only syncs
- Plugin version parity — Added pre-commit check that warns when .claude-plugin/ manifest versions don't match the release version
[2.25.1] - 2026-03-15¶
Fixed¶
- Codex BF pyramid parity — Synced BF1/BF2/BF4 bug-finding level selection into skills-codex implement, post-mortem, and validation skills
- Codex Claude backend cross-contamination — Removed orphaned backend-claude-teams.md files (Claude primitives: TeamCreate, SendMessage) from 4 Codex skills (council, research, shared, swarm)
- Dead converter rule — Removed stale sed substitution for backend-claude-teams.md rename in converter script
- Swarm reference integrity — Added Reference Documents section to swarm SKILL.md; updated validate.sh to check only Codex-native backend references
[2.25.0] - 2026-03-14¶
Added¶
- L0–L7 test pyramid standard — Shared reference doc (standards/references/test-pyramid.md) defining 8 test levels, agent autonomy boundaries (L0–L3 autonomous, L4+ human-guided), and RPI phase mapping
- Test pyramid integration across RPI lifecycle — Discovery identifies test levels, plan classifies tests by level, pre-mortem validates coverage, implement selects TDD level, crank carries test_levels metadata, validation audits coverage, post-mortem reports gaps
- RPI autonomous execution enforcement — Three-Phase Rule mandates discovery → implementation → validation without human interruption; anti-patterns table documents 7 failure modes
- Evolve autonomous execution enforcement — Each cycle runs a complete 3-phase /rpi --auto; anti-patterns table documents 6 failure modes; large work decomposed into sub-RPI cycles
- Codex skill standard — New standards/references/codex-skill.md with tool mapping, prohibited primitives, two-phase validation, DAG-first traversal, and prompt constraint boundaries
- Codex-native overrides — Durable overrides for crank, swarm, council that survive regeneration
- DAG-based Codex smoke test — scripts/smoke-test-codex-skills.sh validates 54 skills with dependency-ordered traversal
- Codex skill API contract — docs/contracts/codex-skill-api.md with conformance validator
- Output contract declarations — output_contract field on council, vibe, pre-mortem, research skills with canonical finding-item schema
Changed¶
- Codex converter rewrite — Strips Claude primitives instead of mapping to unavailable tools; rewrites reference files through codex_rewrite_text
- CI pipeline — Removed codex skill parity check (skills-codex/ now manually maintained); fixed shellcheck and embedded sync issues
Fixed¶
- Converter primitive stripping — Task primitives (TaskCreate, TeamCreate, SendMessage) properly stripped instead of mapped to non-existent Codex equivalents
- Embedded hook sync — Added missing test-pyramid.md and codex-skill.md to CLI embedded references
- ShellCheck SC1125 — Fixed em-dash in shellcheck disable directive in smoke test script
- Skill line limits — Moved verbose autonomy rules to reference files to stay under tier-specific line budgets
[2.24.0] - 2026-03-12¶
Added¶
- Error & rescue map template — Pre-mortem Step 2.5 with 3 worked examples (HTTP, database, LLM)
- Scope mode selection — Pre-mortem Step 1.6 with 3-mode framework (Expand/Hold/Reduce) and auto-detection
- Temporal interrogation — Pre-mortem Step 2.4 walks implementation timeline (hour 1/2/4/6+) for time-dependent risks
- Prediction tracking — Pre-mortem findings get unique IDs (pm-YYYYMMDD-NNN) correlated through vibe and post-mortem
- Finding classification — Vibe separates CRITICAL (blocks ship) from INFORMATIONAL findings
- Suppression framework — Vibe loads default + project-level suppression patterns for known false positives
- Domain-specific checklists — Standards skill extended with SQL safety, LLM trust boundary, and race condition checklists, auto-loaded by vibe
- RPI session streak tracking — Post-mortem Step 1.5 shows consecutive session days and verdict history
- Persistent retro history — Post-mortem Step 4.8 writes structured JSON summaries to .agents/retro/ for cross-epic trend analysis
- Prediction accuracy scoring — Post-mortem Step 3.5 scores HIT/MISS/SURPRISE against pre-mortem predictions
- Commit split advisor — PR-prep Phase 4.5 suggests bisectable commit ordering (suggestion-only)
- Council finding auto-extraction — Significant findings from WARN/FAIL verdicts staged for flywheel consumption
Changed¶
- Post-mortem examples condensed — Verbose examples replaced with concise 4-mode summary to stay under skill line limit
[2.23.1] - 2026-03-12¶
Fixed¶
- Resolved all golangci-lint quality findings
- Synced embedded standards after skill audit fixes
- Synced Codex bundle after skill audit fixes
- Resolved audit findings across council, vibe, standards skills
[2.23.0] - 2026-03-11¶
Added¶
- Discovery and validation phase orchestrators — New /discovery and /validation skills decompose the RPI lifecycle into independently invocable phases (research+plan+pre-mortem and vibe+post-mortem)
- Stigmergic packet scorecard — Ranked scoring for flywheel knowledge packets so higher-utility learnings surface first
- Pinned work queue — /evolve gains a pinned work queue with blocker auto-resolution for directed improvement loops
- Per-package coverage ratchet — Pre-push gate enforces per-package coverage baselines that only move upward
- Fast pre-push mode — --fast flag for diff-based conditional checks, skipping unchanged packages
- Standards auto-loading — Go and Python coding standards injected automatically into /crank and /swarm workers
- 271 test functions — Four internal packages (pool, ratchet, resolver, storage) brought to 100% coverage
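
To illustrate the "baselines that only move upward" rule of the per-package coverage ratchet, here is a minimal sketch. It assumes coverage is tracked as a percentage per package name. The `ratchet` function, its signature, and the message format are hypothetical and do not reflect the real pre-push gate.

```go
package main

import (
	"fmt"
	"sort"
)

// ratchet compares current per-package coverage against stored baselines.
// It returns the updated baselines (raised where coverage improved, never
// lowered) and a sorted list of packages that fell below their baseline.
// Hypothetical helper: names and message format are assumptions.
func ratchet(baseline, current map[string]float64) (map[string]float64, []string) {
	updated := make(map[string]float64, len(baseline))
	for pkg, floor := range baseline {
		updated[pkg] = floor
	}
	var regressions []string
	for pkg, cov := range current {
		floor, known := updated[pkg]
		switch {
		case !known || cov > floor:
			updated[pkg] = cov // new package or improvement: raise the floor
		case cov < floor:
			regressions = append(regressions,
				fmt.Sprintf("%s: %.1f%% < baseline %.1f%%", pkg, cov, floor))
		}
	}
	sort.Strings(regressions)
	return updated, regressions
}

func main() {
	baseline := map[string]float64{"pool": 90, "storage": 80}
	current := map[string]float64{"pool": 92, "storage": 75, "resolver": 60}

	updated, regressions := ratchet(baseline, current)
	fmt.Println(updated["pool"], updated["storage"], updated["resolver"])
	for _, r := range regressions {
		fmt.Println("FAIL", r)
	}
}
```

A gate built this way would fail the push when the regression list is non-empty and persist the updated baselines only on success.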
Changed¶
- README restructured — Extracted reference material into dedicated docs, reducing README from 679 to 472 lines
- RPI skill refactored — /rpi now delegates to /discovery and /validation phase orchestrators instead of inlining all phases
- Go and Python test conventions — Canonical standards enriched with assertion quality rules, naming conventions, and table-driven test guidance
- Documentation alignment — Lifecycle, flywheel, primitive chain, and positioning docs updated to reflect current architecture
Fixed¶
- Goal runner deadlock — Fixed goroutine deadlock in goal runner and added job timeouts to prevent stalls
- 17 CLI bugs from deep audit — Addressed goroutine leaks, race conditions, panics, buffer overflows, and nil-check inconsistencies
- Session close reliability — Resolved pre-existing session_close issues surfaced by vibe council review
- ~50 zero-assertion tests — Upgraded smoke tests from no-op to behavioral assertions across cmd/ao and internal packages
- Test file hygiene — Merged _extra_test.go and cov*_test.go files into canonical <source>_test.go names
- CI stability — FIFO test skip on Linux, embedded skill sync, coverage ceiling adjustments, crank SKILL.md trimmed below 800-line limit
- Auto-extract quality gate — Added quality gate to prevent low-fidelity auto-extracted learnings from entering the knowledge store
[2.22.1] - 2026-03-10¶
Added¶
- Repo-native redteam harness — Added a packaged redteam pack and prompt runner to security-suite for repeatable repository-local security exercises
- Findings management commands — Added CLI commands for listing and managing saved findings from the terminal
Changed¶
- Closed-loop prevention validation — Completed the end-to-end finding compiler and prevention-ratchet validation path so saved findings feed back into earlier planning and task validation more reliably
- Runtime contract parity — Localized shared Claude runtime reference packs into the source skills and regenerated Codex artifacts so source and generated bundles stay aligned
Fixed¶
- Finding metadata injection — Exposed finding metadata consistently in inject output and JSON integrations after the merged findings work landed
- Release gate regressions — Restored goals/package coverage, learning coherence, and hook-fixture isolation so the local release gate matches the shipped tree again
[2.22.0] - 2026-03-09¶
Added¶
- Finding registry — Council findings are saved to a persistent registry and automatically fed back into planning and validation, so the same class of bug is caught earlier next time
- Repo execution profiles — .repo-execution-profile.json lets skills and runtimes adapt to each repository's validation gates, startup reads, and done-criteria
- Headless team backend — Multi-agent workflows can run non-interactively (e.g. in CI) with structured JSON output and automated validation
Changed¶
- Codex and embedded artifacts — Synced generated Codex bundles, embedded standards references, and install artifacts after merging branch work
- Validation feedback capture — Recorded validation-cycle feedback into .agents learnings so tracked patterns match the shipped tree
Fixed¶
- Lookup findings — Fixed ao lookup and inject scoring so findings render, cite, and score correctly after the branch merge
- 23 CLI bug fixes — Fixed goroutine leaks, race conditions, panics, buffer overflows, missing error handling, and nil-check inconsistencies
- Post-mortem evidence hardening — Staged changes and worktree evidence are now captured durably so proof isn't lost during compaction or cleanup
[2.21.0] - 2026-03-09¶
Added¶
- Codex-first skill rollout across the full catalog with override coverage, generated-artifact governance, and install/runtime parity validation
- Claim-aware next-work lifecycle handling with contract parity checks for /rpi and follow-on flows
- Headless runtime skill smoke coverage and Codex backbone prompt validation in the release gate stack
Changed¶
- Codex maintenance guidance, override coverage docs, and CLI-to-skills mapping to match the generated runtime model
- Release-prep validation flows for runtime smoke, Codex artifact sync, and release note generation
Fixed¶
- Next-work queue mutation races by making claim/update handling concurrency-safe and per-item
- Codex prompt parity drift by syncing generated prompts and tightening override coverage gates
- Worktree Git resolution and vibe-check runtime environment handling
- Push/pre-push validation regressions and nested pre-push wrappers
- Streamed phase timeout cancellation so phased runtime tests and release gating terminate promptly
[2.20.1] - 2026-03-07¶
Fixed¶
- Codex install workflow now uses ~/.agents/skills as the single raw skill home and stops recreating an AgentOps mirror in ~/.codex/skills
- Native Codex plugin refresh now archives overlapping legacy ~/.codex/skills AgentOps folders instead of repopulating them
- Codex install docs now consistently describe the ~/.agents/skills workflow and the need for a fresh Codex session after install
- Codex skill conversion now preserves multiline YAML description fields correctly, fixing malformed generated metadata for skills such as Compile
- ao doctor now treats plugin-cache plus ~/.agents/skills as the supported Codex layout and reports manifest drift with accurate wording
[2.20.0] - 2026-03-05¶
Added¶
- Flywheel loop closure — ao session close --auto-extract produces lightweight learnings and auto-handoff at session boundary
- Handoff-to-learnings bridge — ao handoff now extracts decisions into .agents/learnings/ automatically
- Session-type scoring in ao inject --session-type — 30% boost for matching session context (career, debug, research, brainstorm)
- Identity artifact support — ao inject --profile surfaces .agents/profile.md in session context
- MEMORY.md auto-promotion in ao flywheel close-loop (Step 7) after maturity transitions
- Session-type detection in ao forge output metadata
- Production RPI orchestration engine — ao rpi serve <goal> with SSE streaming and auto mode
- Knowledge mining — ao mine and ao defrag commands for automated codebase intelligence
- Context declarations — ao inject --for <skill> reads skill frontmatter context: block for scoped retrieval
- Sections include allowlist and context artifact directories for skill-scoped injection
- ao handoff command for structured session boundary isolation
- Behavioral guardrails — 3-layer hook defense-in-depth (intent-echo, research-loop-detector, task-validation-gate)
- Context enforcement hook and run-id namespaced artifact paths
- Headless invocation standards and RPI phase runner
- Nightly CI compile job for automated knowledge warmup
- Coverage ratchet gate with BATS integration tests for shell scripts
- Fuzz targets, property tests, and golden file contracts for CLI
- Git worker guard, embedded parity gate, and swarm evidence validation hooks
- Release cadence gate warns on releases within 7 days of previous
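
The session-type scoring entry above describes a 30% boost for learnings whose session context matches. One way to read that is as a multiplicative factor applied at ranking time, sketched below. The `learning` struct, the `boostForSession` function, and the multiplicative interpretation are all assumptions; the changelog does not say how the boost is applied internally.

```go
package main

import "fmt"

// learning is a hypothetical, simplified knowledge record.
type learning struct {
	ID          string
	SessionType string
	Score       float64
}

// sessionTypeBoost models the "30% boost" as a 1.3x multiplier (assumption).
const sessionTypeBoost = 1.30

// boostForSession returns a copy of items in which every learning whose
// SessionType equals sessionType has its score multiplied by the boost.
// An empty sessionType leaves all scores unchanged.
func boostForSession(items []learning, sessionType string) []learning {
	out := make([]learning, len(items))
	copy(out, items)
	if sessionType == "" {
		return out
	}
	for i := range out {
		if out[i].SessionType == sessionType {
			out[i].Score *= sessionTypeBoost
		}
	}
	return out
}

func main() {
	items := []learning{
		{ID: "a", SessionType: "debug", Score: 0.50},
		{ID: "b", SessionType: "research", Score: 0.60},
	}
	for _, l := range boostForSession(items, "debug") {
		fmt.Printf("%s %.2f\n", l.ID, l.Score)
	}
}
```

With these inputs the "debug" learning moves from 0.50 to 0.65 and overtakes the unboosted 0.60 entry, which is the kind of reordering a matching-context boost is meant to produce.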
Changed¶
- Coverage floor raised to 84% for cmd/ao, average floor to 95%
- Complexity ceiling tightened to 20 (from 25)
- Default session-start hook mode switched from manual to lean
- Hard quality gate on injection — maturity + utility filter
- Post-mortem redesigned as knowledge lifecycle processor
- RPI god-file split — 1,363 lines reduced to 203 with structured handoff schema
- Legacy RPI orchestrator retired — serve now uses phased engine (-1,121 lines)
- Council V2 findings synthesized into agent instructions and skill contracts
- 10k LOC of coverage-padding tests deleted; 72 stale tests quarantined
- Skill hardening — web security controls across 5 skills, CSRF protection, crank pre-flight
- Session-end hook wires ao session close --auto-extract before existing forge pipeline
Fixed¶
- Flywheel signal chain — confidence decay, close-loop ordering, glob errors
- Path traversal in context enforcement hook and frontmatter parsing
- Race condition in handoff consumption at session boundary
- ao mine stabilized — dedup IDs, error propagation, --since window, empty output guard
- Hook test assertions aligned with warn-then-fail ratchet pattern (strict env required)
- Pre-mortem gate exit code corrected to 2 in strict mode (was 1)
- RPI serve event pipeline and coherence gate hardened
- jq injection via bare 8-hex run IDs in serve classifier
- Goals parser edge cases — paired backtick strip and rune-aware truncation
- UTF-8 truncation across six functions converted to rune-safe slicing
- CORS headers and stale doc references cleaned up
- Cross-wave worktree file collisions prevented
- hookEventName added to hookSpecificOutput JSON schema
[2.19.3] - 2026-02-27¶
Changed¶
- README highlights ao search (built on CASS) — indexes all chat sessions from every runtime unconditionally; adds Second Brain + Obsidian vault section with Smart Connections local/GPU embeddings and MCP semantic retrieval
[2.19.2] - 2026-02-27¶
Fixed¶
- CHANGELOG retrospectively updated to document all v2.19.1 post-tag commits (skills namespace fixes were shipped but not recorded)
[2.19.1] - 2026-02-27¶
Fixed¶
- Quickstart skill rewritten from 275 lines to 68 lines — removes 90-line ASCII diagram and 50-line intent router that caused 3+ minute runtime; now outputs ~8 lines and completes in under 30 seconds
- truncateText edge case: maxLen 1–3 now returns "..."[:maxLen] instead of the original string unchanged
- Dead anti-pattern promotion functions removed from ao maturity (promoteAntiPatternsCmd, filterTransitionsByNewMaturity, displayAntiPatternCandidates, ~99 LOC)
- Windows file-lock and signal support — replace no-op filelock_windows.go with real LockFileEx/UnlockFileEx via kernel32.dll; extract syscall.Flock and syscall.Kill into platform-specific helpers so the binary compiles on Windows without POSIX-only syscalls
- heal.sh Check 7 false positive — script reference integrity check now strips URLs before pattern matching, preventing remote https://…/scripts/foo.sh references from being validated as local files
- Security gate BLOCKED_HIGH — three persistent findings resolved: gosec G118 false positive (context cancel func returned to caller), golangci-lint nolint syntax (space in // nolint: directive), radon double-counting reverse_engineer_rpi.py from skills-codex/ copy
- 71 stale ao know * and ao quality * namespace references replaced across 17 skills-codex/ SKILL.md files — agents running rpi/evolve/crank were invoking non-existent commands from the pre-flatten CLI namespace
- Three HIGH-severity stale command references fixed across skills/ and skills-codex/: ao flywheel status → ao metrics flywheel status, ao settings notebook update → ao notebook update, ao start seed/init → ao seed/ao init
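
The truncateText edge case above (maxLen 1–3 returning "..."[:maxLen]) combines naturally with rune-safe slicing. The following is a hedged reconstruction, not the project's source. The handling of maxLen <= 0, and the choice to return short strings unchanged before applying the 1–3 rule, are assumptions.

```go
package main

import "fmt"

// truncateText shortens s to at most maxLen runes, ending in "..." when
// text was cut. Illustrative reconstruction; the real function may differ.
func truncateText(s string, maxLen int) string {
	if maxLen <= 0 {
		return "" // assumption: non-positive limits yield an empty string
	}
	runes := []rune(s) // slice by runes so multi-byte characters stay intact
	if len(runes) <= maxLen {
		return s
	}
	if maxLen <= 3 {
		// No room for any text plus an ellipsis: emit a clipped ellipsis.
		return "..."[:maxLen]
	}
	return string(runes[:maxLen-3]) + "..."
}

func main() {
	fmt.Println(truncateText("héllo wörld", 8)) // prints héllo...
	fmt.Println(truncateText("héllo", 2))       // prints ..
}
```

Slicing the string directly by byte index could cut "é" or "ö" in half and yield invalid UTF-8, which is why the conversion to []rune comes first.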
Added¶
- Spec-consistency gate (scripts/spec-consistency-gate.sh) validates contract files before crank spawns workers
- Command-surface parity gate (scripts/check-cmdao-surface-parity.sh) ensures all CLI leaf commands are tested
- scripts/post-merge-check.sh now validates go mod tidy sync and blocks on symlinks
- scripts/merge-worktrees.sh now propagates file deletions and preserves permissions
- Post-mortem preflight script checks reference file existence before council runs
- Hooks.json preflight validates script existence
- Windows binaries added to GoReleaser and SLSA attestation subject list
Changed¶
- Coverage floor raised 78% → 80% with CI enforcement gate; Codecov threshold aligned to 75%
- Six truncation functions converted to rune-safe Unicode slicing
- truncateID in pool.go delegates to shared truncateText
- Crank skill invokes spec-consistency gate before spawning workers
- Vibe skill carries forward unconsumed high-severity next-work items as pre-flight context
- Release skill warns on unconsumed high-severity next-work items
- next-work JSONL schema formalized to v1.2
- Skills installation switched from npx skills to native curl installer (bash <(curl -fsSL …/install.sh))
- README updated with 5-command summary, compound effect section, and /vibe breakdown
[2.19.0] - 2026-02-27¶
Added¶
- ao mind command for knowledge graph operations.
- New RPI operator surfaces: normalized C2/event plumbing plus ao rpi stream, ao rpi workers, and tmux worker nudge visibility.
- Codex install/bootstrap improvements, including native ~/.codex/skills install and one-line installer flow.
- Windows binaries added to GoReleaser build outputs.
Changed¶
- CLI namespace migration completed and aligned across hooks, docs, integration tests, and generated command references.
- Codex skill system moved to regenerated modular layout with codex-specific overrides and runtime prompt tailoring.
- CI/release gates hardened (codex runtime sections, release e2e validation, parity checks, stricter policy enforcement).
- High-complexity CLI paths refactored (runRPIParallel, runDedup, parseGatesTable) to lower cyclomatic complexity.
Fixed¶
- Multiple post-mortem remediation waves landed for CLI/RPI/swarm reliability and edge-case handling.
- Hook delegation and integration behavior corrected for flat command namespace.
- heal.sh false-positive behavior reduced and doctor stale-path detection improved.
- Skill/doc parity and cross-reference drift issues corrected across codex and core skill catalogs.
Removed¶
- Legacy inbox/mail command surface and stale/dead skill references from active catalogs.
[2.18.2] - 2026-02-25¶
Fixed¶
- ao seed now creates .gitignore and storage directories — reuses setupGitProtection, ensureNestedAgentsGitignore, and initStorage from ao init
- ao seed text updated from stale ao inject/ao forge to current MEMORY.md + session hooks paradigm
- MemRL feedback loop closed — ao feedback-loop command wired, ao maturity --recalibrate dry-run guard added
- Quickstart skill updated to reference ao seed and current flywheel docs
- CLI reference regenerated after ao feedback-loop and seed help text changes
Changed¶
- .agents/ session artifacts removed from git tracking
- PRODUCT.md updated — Olympus section removed, value props and skill tier counts corrected
- GOALS.md coverage directive updated to measured 78.8% (target 85%)
[2.18.1] - 2026-02-25¶
Changed¶
- SessionStart hook default mode changed from manual to lean — flywheel injection now fires every session
- Auto-prune enabled by default (AGENTOPS_AUTO_PRUNE defaults to 1, opt-out via =0)
- Anti-pattern detection threshold lowered from harmful_count >= 5 to >= 3
- Eviction confidence threshold relaxed from < 0.2 to < 0.3
- Maturity promotion threshold in --help text synced with code (0.7 → 0.55)
Fixed¶
- Empty learnings no longer inflate flywheel metrics — extract prompt skips empty files, pool ingest rejects "no significant learnings" stubs
- ao pool ingest now runs automatically in session-end hook after forge (was manual-only)
- 8 stale doc/comment references to old thresholds updated across hooks, ENV-VARS.md, HOOKS.md, using-agentops skill
- 13 empty stub learnings removed from .agents/learnings/
[2.18.0] - 2026-02-25¶
Added¶
- ao notebook update command — compound MEMORY.md loop that merges latest session insights into structured sections
- ao memory sync command — sync session history to repo-root MEMORY.md with managed block markers for cross-runtime access (Codex, OpenCode)
- ao seed command — plant AgentOps in any repository with auto-detected templates (go-cli, python-lib, web-app, rust-cli, generic)
- ao lookup command — retrieve specific knowledge artifacts by ID or relevance query (two-phase complement to ao inject --index-only)
- ao constraint command family — manage compiled constraints (list, activate, retire, review)
- ao curate command family — curation pipeline operations (catalog, verify, status)
- ao dedup command — detect near-duplicate learnings with optional --merge auto-resolution
- ao contradict command — detect potentially contradictory learnings
- ao metrics health subcommand — flywheel health metrics (sigma, rho, delta, escape velocity)
- ao context assemble command — build 5-section context packet briefings for tasks
- Work-scoped knowledge injection: ao inject --bead <id> boosts learnings tagged with the active bead
- Predecessor context injection: ao inject --predecessor <handoff-path> surfaces structured handoff context
- Compact knowledge index: ao inject --index-only outputs ~200 token index table for JIT retrieval
- Learning schema extended with source_bead and source_phase fields for work-context tracking
- ao extract --bead <id> tags extracted learnings with the active bead ID
- Citation-to-utility feedback pipeline in flywheel close-loop (stage 5)
- Global ~/.agents/ knowledge tier for cross-repo learning sharing (0.8 weight penalty, deduped)
- Bead metadata resolver reads from env vars (HOOK_BEAD_TITLE, HOOK_BEAD_LABELS) or cache file
- Goal templates embedded in binary (go-cli, python-lib, web-app, rust-cli, generic) for ao goals init --template and ao seed
- Platform-specific process-group isolation for goal check timeouts (Unix: SIGKILL pgid, Windows: taskkill /T)
- SessionStart hook rewritten with 3 startup modes: lean (default), manual, legacy — via AGENTOPS_STARTUP_CONTEXT_MODE
- SessionEnd hook now gates notebook update and memory sync on successful forge
- Type 3 setup hook template: hooks/examples/50-agentops-bootstrap.sh
- Constraint compiler hook: hooks/constraint-compiler.sh
- Codex-native skill format (skills-codex/) with install and sync scripts for cross-runtime skill delivery
- Comprehensive cmd/ao test coverage push — 500+ tests across 5 waves reaching 79.2% statement coverage (13 untestable functions excluded)
Changed¶
- SessionStart hook default mode changed from full inject to lean (extract + lean inject, shrinks when MEMORY.md is fresh)
- ao flywheel close-loop now applies ALL maturity transitions (not just anti-pattern)
- ao hooks generated config uses script-based commands instead of inline ao invocations
- ao rpi prefers epic-type issues before falling back to any open issue
Fixed¶
- truncateText now uses rune-safe []rune slicing to avoid breaking multi-byte UTF-8 characters
- syncMemory extracted from Cobra handler for testability
- parseManagedBlock detects duplicate markers and refuses to parse (prevents data loss)
- readNLatestSessionEntries warns on skipped unreadable session files
- readSessionByID detects ambiguous matches and returns error instead of first substring match
- findMemoryFile broad contains-fallback removed (was matching wrong projects)
- pruneNotebook iteration capped at 100 to prevent runaway loops
- MEMORY_AGE_DAYS sentinel initialized to -1 (was 0, causing false lean-mode activation when file missing)
- Lean-mode guard now requires MEMORY_AGE_DAYS >= 0 before comparing freshness
- Memory sync moved inside forge success gate in session-end hook
- ao search --json returns [] (empty JSON array) when no results, instead of human-readable text
- ao doctor returns DEGRADED status for warnings without failures (previously only HEALTHY/UNHEALTHY)
- ao rpi status goroutine leak fix — signal channel properly cleaned up
- Inline rune truncation in formatMemoryEntry replaced with shared truncateText
- 6 new tests for dedup, ambiguity detection, iteration cap, duplicate markers
- Cobra pflag state pollution between test invocations — explicit flag reset in executeCommand() helper
- Goals validate.sh outdated checks and missing validate.sh for 7 skills
- 10 tech debt findings from ag-8km+ag-chm post-mortem (stale nudge, scanner, docs)
- ao binary codesigned with stable Mach-O identifier
- Hook integration tests updated — removed 8 stale standalone ao-* hook tests consolidated into session-end-maintenance.sh
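
The parseManagedBlock fix above (refusing to parse when markers are duplicated) can be sketched as follows. The marker strings, the return shape, and the error value are hypothetical; only the refuse-on-duplicates behavior comes from the changelog entry.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Hypothetical marker strings; the real markers are not given in the changelog.
const (
	beginMarker = "<!-- BEGIN MANAGED -->"
	endMarker   = "<!-- END MANAGED -->"
)

var errDuplicateMarkers = errors.New("duplicate managed block markers")

// parseManagedBlock splits content into the text before the managed block,
// the block body, and the text after it. It refuses to parse when either
// marker appears more than once, because rewriting the "wrong" block could
// silently discard user content.
func parseManagedBlock(content string) (before, managed, after string, err error) {
	if strings.Count(content, beginMarker) > 1 || strings.Count(content, endMarker) > 1 {
		return "", "", "", errDuplicateMarkers
	}
	start := strings.Index(content, beginMarker)
	end := strings.Index(content, endMarker)
	if start < 0 || end < 0 || end < start {
		// No well-formed block: treat everything as unmanaged text.
		return content, "", "", nil
	}
	before = content[:start]
	managed = content[start+len(beginMarker) : end]
	after = content[end+len(endMarker):]
	return before, managed, after, nil
}

func main() {
	doc := "intro\n" + beginMarker + "\nsynced notes\n" + endMarker + "\noutro\n"
	before, managed, after, err := parseManagedBlock(doc)
	fmt.Printf("%q %q %q %v\n", before, managed, after, err)

	_, _, _, err = parseManagedBlock(doc + doc)
	fmt.Println(err)
}
```

Returning an error rather than picking the first or last block leaves the decision to the caller, which matches the "prevents data loss" intent stated in the entry.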
[2.17.0] - 2026-02-24¶
Added¶
- GOALS.md (v4) OODA-driven intent layer — markdown-based goals format with mission, north/anti stars, and steerable directives
- ao goals init interactive GOALS.md bootstrap with --non-interactive mode
- ao goals steer command to add, remove, and prioritize directives
- ao goals prune command to remove stale gates referencing missing paths
- ao goals migrate --to-md converter from GOALS.yaml to GOALS.md format
- ao goals measure --directives JSON output of active directives
- ao goals validate reports format and directive count
- Format-aware ao goals add writeback (auto-detects md or yaml)
- Go markdown parser library with case-insensitive heading matching and round-trip rendering (26 tests)
- /goals skill rewritten with 5 OODA verbs (init/measure/steer/validate/prune)
- /evolve Step 3 rewritten with directive-based cascade for idle reduction
Fixed¶
- ao rpi falls back to any open issue when no epic exists (#50)
- RPI phased processing tests added (~230 lines) for writePhaseResult, validatePriorPhaseResult, heartbeat, and registry directory
[2.16.0] - 2026-02-23¶
Added¶
- Evolve idle hardening — disk-derived stagnation detection, 60-minute circuit breaker, rolling fitness files, no idle commits
- Evolve --quality mode — findings-first priority cascade that prioritizes post-mortem findings over goals
- Evolve cycle-history.jsonl canonical schema standardization and artifact-only commit gating
- heal-skill checks 7-10 with --strict CI gate for automated skill maintenance
- 6-phase E2E validation test suite for RPI lifecycle (gate retries, complexity scaling, phase summaries, promise tags)
- Fixture-based CLI regression and parity tests
- ao goals migrate command for v1→v2 GOALS.yaml migration with deprecation warning (#48)
- Goal failure taxonomy script and tests
Changed¶
- CLI taxonomy, shared resolver, skill versioning, and doctor dogfooding improvements (6 architecture concerns)
- GoReleaser action bumped from v6 to v7
- Evolve build detection generalized from hardcoded Go gate to multi-language detection
Fixed¶
- ao pool list --wide flag and pool show prefix matching (#47)
- Consistent artifact counts across doctor, badge, and metrics (#46)
- Double multiplication in vibe-check score display (#45)
- Skills installed as symlinks now detected and checked in both directories (#44)
- Learnings resolved by frontmatter ID; .md file count in maturity scan (#43)
- JSON output truncated at clean object boundaries (#42)
- Misleading hook event count removed from display (#41)
- Post-mortem schema model field and resolver DiscoverAll migration
- 15+ missing skills added to catalog tables in using-agentops
- Handoff example filename format corrected to YYYYMMDDTHHMMSSZ spec
- Quickstart step numbering corrected (7 before 8)
- OpenAI docs skill: added Claude Code MCP alternative to Codex-only fallback
- Dead link to conflict-resolution-algorithm.md removed from post-mortem
- ao forge search → ao search in provenance and knowledge skills
- OSS docs: root-level doc path checks, removed golden-init reference
- Reverse-engineer-rpi fixture paths and contract refs corrected
- Crank: removed missing script refs, moved orphans to references
- Codex-team: removed vaporware Team Runner Backend section
- Security skill: bundled security-gate.sh, fixed security-suite path
- Evolve oscillation detection and TodoWrite→Task tools migration
- Wired check-contract-compatibility.sh into GOALS.yaml
- Synced embedded skills and regenerated CLI docs