Sync bundled mission-control to upstream 15fbc94

Pulls in 15 upstream commits since the April 3 bundling snapshot
(msieurthenardier/mission-control). Notable changes:

- agentic-workflow rewritten as the "fast" variant: per-leg design and
  implement, single review and commit across the whole flight
- New Skill-Project Boundary section: skills no longer read or write
  project-owned artifacts by literal heading
- routine-maintenance scoped to post-mission only; adds state-machine
  reachability and cache freshness audits
- Test metrics capture threaded through debrief, maintenance, and flight
- Crew prompts no longer carry skill-required instructions; SKILL.md is
  the protocol
- Worktree git strategy removed; standardized on {target-project}
- Jira artifact template removed upstream

Local URL correction in init-project/README.md preserved
(anthropics/flight-control -> msieurthenardier/mission-control).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 08:10:32 -07:00
parent 4588bdf40c
commit 7840bddbb4
18 changed files with 318 additions and 565 deletions
@@ -6,7 +6,7 @@ human and project-side agents to capture both execution and design perspectives.
 ## Crew

 ### Developer
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Provides developer perspective on flight execution. Reviews what was
  built, identifies technical debt introduced, evaluates implementation quality,
@@ -14,7 +14,7 @@ human and project-side agents to capture both execution and design perspectives.
 - **Actions**: debrief-interview

 ### Architect
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Closes the design feedback loop. Evaluates whether the design decisions
  made during flight planning held up in practice. Reviews architectural impact of
@@ -6,7 +6,7 @@ technical spec and uses project-side agents to validate against the real codebas
 ## Crew

 ### Architect
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Reviews flight specs for technical soundness. Validates design
  decisions, prerequisites, technical approach, and leg breakdown against
@@ -56,6 +56,20 @@ Evaluate:
 5. Codebase state — does the spec account for current working tree, existing tooling,
   and conventions that might affect implementation?
 6. Architecture — does the approach maintain or improve system structure?
+7. State-machine reachability — for every state, status, or lifecycle value the flight
+   introduces or relies on (e.g. "agent_deleted", "draft", "queued"), audit which
+   infrastructure layers could foreclose it: DB constraints (FK ON DELETE behaviors,
+   NOT NULL, CHECK), application caches and their invalidation rules, API/protocol
+   versions, fallback handlers that mask the state, and existing tests that pin
+   contradictory behavior. A state that the schema or a cache can silently prevent
+   is a design hole, not an implementation detail.
+8. Cache freshness contracts — for every cache (in-memory dict, query result cache,
+   derived state, frontend session storage) the flight reads from or populates,
+   the design must declare source of truth, rebuild trigger (per-call / TTL /
+   invalidation event / accepted permanent staleness), maximum staleness, and which
+   user actions should invalidate it. Vague answers ("eventually", "on next cycle")
+   without a concrete trigger are a flag. Conflating "cached object works" with
+   "cached object reflects current source" is a common category error worth catching.

 Provide structured output:

@@ -7,21 +7,21 @@ The Flight Director (Mission Control) orchestrates this phase using the
 ## Crew

 ### Developer
-- **Context**: {working-directory}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Implements code changes. Also performs design reviews against real
  codebase to validate leg specs before implementation.
 - **Actions**: implement, fix-review-issues, commit, review-leg-design

 ### Reviewer
-- **Context**: {working-directory}/
+- **Context**: {target-project}/
 - **Model**: Sonnet (NEVER Opus)
 - **Role**: Reviews code changes for quality, correctness, and criteria compliance.
  Has NO knowledge of Developer's reasoning — only sees resulting changes.
 - **Actions**: review

 ### Accessibility Reviewer (optional)
-- **Context**: {working-directory}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Enabled**: false
 - **Role**: Reviews UI changes for accessibility compliance. Evaluates against
@@ -72,7 +72,6 @@ The Flight Director substitutes these variables in prompts at runtime:
 | `{flight-number}` | Current flight number | All prompts |
 | `{leg-number}` | Current leg number | Leg-scoped prompts |
 | `{leg-artifact-path}` | Path to the leg artifact file | review-leg-design |
-| `{working-directory}` | Resolved working directory for the agent (project root for branch strategy, worktree path for worktree strategy) | All prompts |
 | `{reviewer-issues}` | Full text of reviewer feedback (dynamic) | fix-review-issues |

 ## Prompts
@@ -6,7 +6,7 @@ both the human and a project-side Architect to capture strategic technical persp
 ## Crew

 ### Architect
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Provides architectural perspective on mission outcomes. Evaluates
  whether the system evolved well across flights, identifies structural issues,
@@ -6,7 +6,7 @@ and uses project-side agents to validate technical viability.
 ## Crew

 ### Architect
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Validates technical viability of proposed outcomes. Ensures business
  goals align with what's actually possible given the codebase, stack, and
@@ -7,7 +7,7 @@ for severity assessment and roundtable moderation.
 ## Crew

 ### Inspector
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Performs broad read-only codebase inspection across all applicable
  categories. Runs test suites, linters, type checkers, audit commands, and
@@ -15,7 +15,7 @@ for severity assessment and roundtable moderation.
 - **Actions**: inspect-codebase

 ### Security Reviewer
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Role**: Performs focused manual security review of authentication flows,
  injection surfaces, secrets handling, CORS/CSP configuration, and data
@@ -24,7 +24,7 @@ for severity assessment and roundtable moderation.
 - **Actions**: review-security

 ### CI/CD Reviewer (optional)
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Enabled**: false (enable when project has CI/CD pipelines)
 - **Role**: Reviews CI/CD pipeline configuration, build security, deployment
@@ -33,7 +33,7 @@ for severity assessment and roundtable moderation.
 - **Actions**: review-cicd

 ### Accessibility Reviewer (optional)
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Sonnet
 - **Enabled**: false (enable when project has user-facing UI)
 - **Role**: Reviews codebase for accessibility compliance against WCAG 2.1 AA
@@ -42,7 +42,7 @@ for severity assessment and roundtable moderation.
 - **Actions**: review-accessibility

 ### Architect
-- **Context**: {project}/
+- **Context**: {target-project}/
 - **Model**: Opus
 - **Role**: Reviews all reviewer findings alongside debrief context. Assigns
  severity per finding, challenges questionable assessments, moderates
@@ -463,6 +463,7 @@ For each finding, assign one of:
 - Does this finding represent a real risk, or is it noise?
 - Is the severity proportional to the actual impact?
 - Would this compound if left for another cycle?
+- Is the infrastructure or framing this finding pertains to still serving its original purpose, or has it drifted into "maybe-someday" territory?
 - Is this a new discovery or previously acknowledged debt?
 - Do multiple reviewers corroborate the same issue?
 - Are any reviewer assessments questionable — too alarmist or too dismissive?