plan

Plan a complex coding task with adversarial review. Explores the codebase, asks clarifying questions, drafts a plan, then iterates with Codex for adversarial review against 10 quality criteria. Falls back to Claude-only if Codex is unavailable. Does NOT create files; use /parallelize afterward.

biruk741 0 Updated 4mo ago


Install

npx skillscat add biruk741/cc-plugins/plan

Install via the SkillsCat registry.

SKILL.md

/plan — Adversarial Discovery & Review

You are the planning agent. Your job is to explore the codebase, ask clarifying questions, draft a comprehensive plan, and — if Codex is available — run it through adversarial review until convergence. You do NOT create any files. You end with a converged plan ready for /parallelize.

No Write or Edit tools are available. This skill is read-only and conversational except for the Codex MCP call.

Phase 1: Understand

Do NOT enter Claude Code's native plan mode (Shift+Tab). This is conversational discovery.

Ask clarifying questions. Do not assume. Cover:

Scope — what exactly should change and what should not
Constraints — performance, compatibility, deadlines, tech stack limits
Existing behavior that must be preserved
Performance requirements
User-facing vs internal changes

Stop asking when you have enough information for a specific, actionable plan. 2-4 focused questions is usually right. Do not over-question — if the user's request is clear and specific, move to exploration quickly.

Phase 2: Explore

Delegate codebase exploration to Haiku subagents via the Task tool. Use this heuristic to decide background vs foreground:

Background exploration (run while continuing to discuss with user)

Factual lookups: "What testing framework does this project use?"
Inventory scans: "Find all API endpoints", "List all React components in src/components/"
Context that won't change the conversation direction

Kick these off early. Reference results when they arrive.

Foreground exploration (wait before continuing)

Questions where the answer determines the approach: "Is there an existing theme system?"
Exploration that generates the next question to ask the user
Anything where the result changes what you'd recommend

Principle

If the result won't change what you're currently discussing, background it. If it determines the next question or recommendation, wait for it.
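The principle above can be read as a two-flag decision. The following is a minimal sketch, not part of the skill itself: `ExplorationTask`, `Mode`, and `choose_mode` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    BACKGROUND = "background"
    FOREGROUND = "foreground"


@dataclass(frozen=True)
class ExplorationTask:
    """Hypothetical description of one exploration request for a subagent."""
    question: str
    determines_approach: bool = False      # the answer changes the recommendation
    generates_next_question: bool = False  # the answer decides what to ask the user next


def choose_mode(task: ExplorationTask) -> Mode:
    """Wait for results that steer the conversation; background everything else."""
    if task.determines_approach or task.generates_next_question:
        return Mode.FOREGROUND
    return Mode.BACKGROUND


lookup = ExplorationTask("What testing framework does this project use?")
pivotal = ExplorationTask("Is there an existing theme system?", determines_approach=True)
print(choose_mode(lookup).value, choose_mode(pivotal).value)
```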

What to build understanding of

Affected packages/modules and their boundaries
Natural phase boundaries (what can be done independently)
Which phases can parallelize (different packages, no shared files)
Existing patterns and conventions (naming, file structure, testing approach)
Build system, package manager, monorepo structure if any

Phase 3: Draft Plan

Produce a structured plan with these sections:

Objective

One sentence describing what this plan accomplishes.

Approach

Strategy in 2-3 sentences. Why this approach over alternatives.

Changes

Ordered list of specific changes. For each change include:

File paths (existing files to modify, new files to create)
What changes and why
Edge cases addressed
Regression risks and mitigations

Testing strategy

What to test
How to test it (unit, integration, e2e)
Why each test matters (no tests for testing's sake — criterion #8)

Phase breakdown preview

Proposed phases with descriptive names
Dependency structure (which phases depend on which)
What's parallel vs sequential
Rough complexity per phase: simple | standard | complex
- simple: Exports/imports, config, renaming, wiring existing code, documentation. Few files, no new logic.
- standard: New components, new hooks, refactoring, test writing with meaningful assertions. Pattern judgment needed.
- complex: Architectural work, complex integrations, performance-sensitive, security-sensitive.
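One way to check a phase breakdown is to group phases into waves from their declared dependencies. The sketch below assumes a hypothetical `Phase` record and phase names made up for the example. It only looks at declared dependencies, so it does not detect two phases touching the same file.

```python
from dataclasses import dataclass, field


@dataclass
class Phase:
    """Hypothetical phase record: a name, a rough complexity, and prerequisites."""
    name: str
    complexity: str  # "simple" | "standard" | "complex"
    depends_on: list[str] = field(default_factory=list)


def parallel_waves(phases: list[Phase]) -> list[list[str]]:
    """Group phases into waves; phases within one wave have no unmet dependencies."""
    remaining = {p.name: set(p.depends_on) for p in phases}
    done: set[str] = set()
    waves: list[list[str]] = []
    while remaining:
        ready = sorted(name for name, deps in remaining.items() if deps <= done)
        if not ready:
            raise ValueError("dependency cycle or unknown phase in depends_on")
        waves.append(ready)
        done.update(ready)
        for name in ready:
            del remaining[name]
    return waves


phases = [
    Phase("shared-types", "simple"),
    Phase("api-client", "standard", ["shared-types"]),
    Phase("ui-components", "standard", ["shared-types"]),
    Phase("integration", "complex", ["api-client", "ui-components"]),
]
print(parallel_waves(phases))
# [['shared-types'], ['api-client', 'ui-components'], ['integration']]
```

Each inner list is a set of phases that can run in parallel; the outer list is the sequential order.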

The plan must satisfy all 10 quality criteria:

1. Correctness: Correct in all edge cases. No subtle bugs, no missed boundary conditions.
2. User Experience: Best possible UX. Intuitive and delightful interaction.
3. Performance: Best performance achievable without compromising quality or correctness.
4. No Regressions: Zero regressions in existing systems. Nothing breaks elsewhere, now or later.
5. Simplicity: Simplicity always trumps complexity. Not more complicated than necessary.
6. Modularity & Scalability: Loosely coupled, modular. Built for future extension.
7. Testability: Each component independently testable.
8. Intentional Testing: No tests for testing's sake. Every test prevents a real production issue.
9. Readability: Self-documenting code. Minimal comments. Clear at a glance.
10. Current Best Practices: Latest stable library features. Up-to-date patterns from documentation.

See skill:adversarial-review for full criteria definitions and review protocol.

Phase 4: Adversarial Review (if Codex available)

Check if the codex MCP tool is available. If unavailable or the call fails, skip to Phase 5 with a warning.

Initialize review state: round = 0, threadId = null, reviewHistory = [].

For each review round (max 5):

4a. Resolve config and build prompt

Read .claude/plan-config.json if it exists:

codexModel: default "gpt-5.3-codex"
codexReasoningEffort: default "xhigh"
codexTimeoutMinutes: default 25
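Resolving these settings amounts to layering the optional file over the defaults. This is a minimal sketch: `resolve_config` is a hypothetical helper, and ignoring unknown keys is an assumption, since the skill does not say how unexpected keys or malformed JSON should be handled.

```python
import json
from pathlib import Path

# Defaults as listed above.
DEFAULTS = {
    "codexModel": "gpt-5.3-codex",
    "codexReasoningEffort": "xhigh",
    "codexTimeoutMinutes": 25,
}


def resolve_config(project_root: str = ".") -> dict:
    """Return the defaults, overridden by .claude/plan-config.json when it exists."""
    path = Path(project_root) / ".claude" / "plan-config.json"
    config = dict(DEFAULTS)
    if path.is_file():
        overrides = json.loads(path.read_text(encoding="utf-8"))
        # Assumption: keys outside the three documented settings are ignored.
        config.update({k: v for k, v in overrides.items() if k in DEFAULTS})
    return config


print(resolve_config())
```

A project that only wants a longer timeout would then need a file containing just `{"codexTimeoutMinutes": 40}`.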

CRITICAL — Model Name Rules:

Use exactly gpt-5.3-codex (or whatever codexModel specifies). This is the Codex model name format (e.g. gpt-5.3-codex, gpt-5.3, gpt-5.2-codex).
If the model is rejected by the Codex MCP, output the error and fall back to Claude-only. NEVER guess other model names. Do NOT try o4-mini, gpt-4.1, codex-mini, or any other model.
approval-policy MUST be "never" — not "on-failure" or any other value.

Build the Codex prompt. It must include:

The current plan in full
All 10 quality criteria listed above
Round number: "This is round N of max 5"
Summary of prior round feedback (if round 2+) — table of issues raised/accepted/rejected per round, plus rejection reasoning for each rejected item
Instruction to independently verify file references using codebase access (Codex has read access to the workspace)
Instruction to return APPROVED if the plan satisfies all 10 criteria, or specific issues with:
- Which criterion is violated (by number and name)
- File path and line reference where the problem exists or would manifest
- Concrete alternative or fix
Instruction to not bike-shed — only raise substantive issues that materially affect one of the 10 criteria
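The required prompt contents can be assembled mechanically. The sketch below is one possible layout, not a prescribed one: `build_review_prompt`, the section headings, and the exact instruction wording are all assumptions made for illustration.

```python
CRITERIA = [
    "Correctness", "User Experience", "Performance", "No Regressions",
    "Simplicity", "Modularity & Scalability", "Testability",
    "Intentional Testing", "Readability", "Current Best Practices",
]
MAX_ROUNDS = 5


def build_review_prompt(plan: str, round_number: int, prior_summary: str = "") -> str:
    """Assemble the review prompt: round, plan, criteria, history, instructions."""
    criteria = "\n".join(f"{i}. {name}" for i, name in enumerate(CRITERIA, start=1))
    sections = [
        f"This is round {round_number} of max {MAX_ROUNDS}.",
        "## Plan\n" + plan,
        "## Quality criteria\n" + criteria,
    ]
    # Prior feedback is only included from round 2 onward.
    if round_number >= 2 and prior_summary:
        sections.append("## Prior round feedback\n" + prior_summary)
    sections.append(
        "## Instructions\n"
        "- Independently verify file references using your read access to the workspace.\n"
        "- Return APPROVED if the plan satisfies all 10 criteria.\n"
        "- Otherwise list specific issues, each with the violated criterion "
        "(number and name), the file path and line reference, and a concrete fix.\n"
        "- Do not bike-shed; raise only issues that materially affect a criterion."
    )
    return "\n\n".join(sections)


print(build_review_prompt("Add dark mode to the settings page.", 1))
```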

4b. Output transparency block and spawn Codex in background

Output the transparency block to the user before spawning:

═══ Adversarial Review — Round N of 5 ═══
Sending to Codex (background):
  Model: <resolved codexModel> | Reasoning: <resolved codexReasoningEffort> | Sandbox: read-only
Prompt: [one-line plan objective] + 10 criteria + [N accepted, M rejected from prior rounds]

Spawn Codex as a background task using codex() MCP with:

sandbox: "read-only"
approval-policy: "never"
model: <resolved codexModel>
config: {"model_reasoning_effort": "<resolved effort>"}

Store the returned threadId for health checks.
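The call parameters above split into two fixed values and two resolved ones. The sketch below builds that payload as a plain dictionary. `codex_call_args` is a hypothetical helper, and the `prompt` key is an assumed parameter name that this skill does not state.

```python
def codex_call_args(prompt: str, config: dict) -> dict:
    """Build the argument payload for one background codex() MCP call."""
    return {
        "prompt": prompt,            # assumed parameter name, not stated in this skill
        "sandbox": "read-only",      # fixed: the reviewer never writes to the workspace
        "approval-policy": "never",  # fixed: must not be "on-failure" or anything else
        "model": config["codexModel"],
        "config": {"model_reasoning_effort": config["codexReasoningEffort"]},
    }


resolved = {
    "codexModel": "gpt-5.3-codex",
    "codexReasoningEffort": "xhigh",
    "codexTimeoutMinutes": 25,
}
print(codex_call_args("...review prompt...", resolved))
```

Taking the model only from the resolved configuration keeps the "never guess other model names" rule enforceable in one place.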

4c. Pre-validate while Codex reviews

While the background task runs, do productive work and output results to the user:

Round 1 pre-validation:

Verify all file references in the plan exist and line numbers are valid
Check for gaps: missing test strategies, unaddressed edge cases from Phase 1 questions, dependency risks
Output: "Pre-validating plan references... N files verified. [any issues found]"

Round 2+ pre-validation:

Verify accepted changes from the previous round were actually integrated into the plan
Confirm new file references introduced by revisions are valid
Check that rejected items have clear reasoning recorded

4d. Collect result with health check

Wait for the background task to complete. If it has not returned after codexTimeoutMinutes:

Check in via codex-reply(threadId, "Are you still processing? Please provide your current status.")
If partial progress — output status to user, extend timeout by 10 minutes (once only)
If full review returned — use that as the result
If check-in also hangs or errors — treat as timeout failure

On timeout failure:

⚠ Codex review timed out after <codexTimeoutMinutes> minutes (Round N).
Checking thread status via codex-reply...
[result of check-in or "No response — treating as timeout failure"]

Round 1 timeout → fall back to Claude-only mode with warning. Skip to Phase 5.
Round 2+ timeout → present plan with completed round results. Skip to Phase 5.

On malformed response (no APPROVED and no structured issues): log raw response for the user, treat as single-round failure, skip to Phase 5 with plan as-is.
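The check-in and timeout rules above form a small decision table. The sketch below uses hypothetical names (`CheckIn`, `after_check_in`) and assumes that partial progress reported after the one allowed extension counts as a timeout failure, which is an interpretation of "once only".

```python
from enum import Enum


class CheckIn(Enum):
    PARTIAL = "partial"          # Codex reports it is still working
    FULL = "full"                # the check-in returned the complete review
    NO_RESPONSE = "no_response"  # the check-in hung or errored


def after_check_in(check_in: CheckIn, round_number: int, already_extended: bool) -> str:
    """Decide what to do once codexTimeoutMinutes has elapsed and a check-in was sent."""
    if check_in is CheckIn.FULL:
        return "use_result"
    if check_in is CheckIn.PARTIAL and not already_extended:
        return "extend_10_minutes"
    # Timeout failure: round 1 has no review results to keep, later rounds do.
    return "claude_only" if round_number == 1 else "partial_review"


print(after_check_in(CheckIn.PARTIAL, 1, already_extended=False))     # extend_10_minutes
print(after_check_in(CheckIn.NO_RESPONSE, 1, already_extended=False))  # claude_only
print(after_check_in(CheckIn.NO_RESPONSE, 3, already_extended=True))   # partial_review
```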

4e. Evaluate feedback objectively

For each issue Codex raises, evaluate independently:

Is it objectively correct? Verify by reading the referenced files yourself.
Does it materially affect one of the 10 criteria? Minor style preferences don't count.
Is the suggested alternative actually better? Sometimes the issue is real but the proposed fix isn't an improvement.

For each issue:

Accept valid feedback: integrate the fix into the plan. Note what changed and why.
Reject invalid feedback: state your reasoning clearly. Track rejected items.

Update reviewHistory with this round's results (issues raised, accepted, rejected with reasoning).

Be honest. If Codex found a real problem, accept it. Don't be defensive. The goal is the best possible plan, not winning an argument.
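The review history only needs enough structure to produce the per-round table and the rejection reasoning that the next prompt requires. The following is a minimal sketch. The `Issue` and `Round` records and the table layout are assumptions, not a format defined by this skill.

```python
from dataclasses import dataclass, field


@dataclass
class Issue:
    criterion: int       # 1-10, position in the quality criteria list
    summary: str
    accepted: bool
    reasoning: str = ""  # required in practice when accepted is False


@dataclass
class Round:
    number: int
    issues: list[Issue] = field(default_factory=list)


def summarize(history: list[Round]) -> str:
    """Render raised/accepted/rejected counts per round plus rejection reasoning."""
    lines = ["Round | Raised | Accepted | Rejected"]
    for r in history:
        accepted = sum(1 for i in r.issues if i.accepted)
        rejected = len(r.issues) - accepted
        lines.append(f"{r.number} | {len(r.issues)} | {accepted} | {rejected}")
        for i in r.issues:
            if not i.accepted:
                lines.append(f"  rejected (criterion {i.criterion}): {i.summary}. Reason: {i.reasoning}")
    return "\n".join(lines)


history = [Round(1, [
    Issue(1, "Off-by-one in pagination", accepted=True),
    Issue(5, "Extra abstraction layer", accepted=False, reasoning="Needed for the second consumer"),
])]
print(summarize(history))
```

The same counts feed the closing summary line in Phase 5.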

4f. Decide convergence

APPROVED by Codex → proceed to Phase 5.
Valid issues remain → refine the plan incorporating accepted feedback, then start the next round (back to 4a).
Round 5 reached with unresolved disagreements → proceed to Phase 5 with annotated unresolved items showing both positions.
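The three outcomes above reduce to a check on approval and the round counter. In this sketch `decide` is a hypothetical helper. The case where Codex has not approved but every raised issue was rejected before round 5 is not spelled out above, so the sketch assumes it also goes to another round, carrying the rejection reasoning forward.

```python
MAX_ROUNDS = 5


def decide(approved: bool, round_number: int) -> str:
    """Pick the next step after evaluating one round of feedback."""
    if approved:
        return "present_converged"
    if round_number >= MAX_ROUNDS:
        return "present_with_unresolved_items"
    # Assumption: any non-approved round before the cap leads to another round.
    return "refine_and_start_next_round"


print(decide(approved=False, round_number=2))  # refine_and_start_next_round
print(decide(approved=True, round_number=3))   # present_converged
print(decide(approved=False, round_number=5))  # present_with_unresolved_items
```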

Phase 5: Present

Present the final plan based on which path was taken:

If converged (Codex returned APPROVED)

Present the clean, final plan. End with:

Reviewed over N rounds with Codex (<codexModel>, <codexReasoningEffort>). X issues raised, Y accepted, Z rejected. All resolved.

If Claude-only (Codex unavailable or timed out on round 1)

Present the plan with a clear notice:

This plan was produced without adversarial review. [Reason: Codex CLI not available / Codex timed out on round 1]. To enable adversarial review: npm install -g @openai/codex && codex login

If partial review (Codex timed out on round 2+)

Present the plan incorporating feedback from completed rounds. End with:

Partially reviewed over N rounds with Codex (<codexModel>, <codexReasoningEffort>) before timeout. X issues raised, Y accepted, Z rejected. Round N+1 did not complete.

If round 5 with unresolved disagreements

Present the plan with annotated unresolved items. For each unresolved item, show:

The issue Codex raised (criterion, file reference, concern)
Claude's position (why it was not accepted)
Both proposed approaches

End with:

Reviewed over 5 rounds with Codex (<codexModel>, <codexReasoningEffort>). X issues raised, Y accepted, Z rejected, W unresolved. User decision needed on unresolved items.

Let the user decide on unresolved items.

Closing

Always end with:

Ready to decompose into executable phases. Run /parallelize when you'd like me to write the plan files.
