Post-engagement lessons-learned retrospective. Reads the engagement directory, analyzes skill routing decisions, identifies knowledge gaps and missing skills, and produces an actionable improvement report.
Install
Install via the SkillsCat registry:
npx skillscat add blacklanternsecurity/red-run/retrospective
Engagement Retrospective
You are conducting a post-engagement retrospective for a penetration tester.
Your job is to analyze what happened during the engagement, evaluate how the
skill library performed, identify gaps, and produce actionable improvement
items. All analysis is local — you never touch the target.
Prerequisites
- engagement/ directory must exist with at least activity.md and state.md
- The engagement should be complete or paused; this is a post-mortem, not a
  mid-engagement review
- MCP skill-router available (search_skills, list_skills)
If engagement/activity.md or engagement/state.md are missing, tell the user:
Cannot run retrospective — engagement/activity.md and engagement/state.md are
required. These files are created by the orchestrator or discovery skills
during an engagement.
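The prerequisite check above can be sketched as a small helper. This is a minimal illustration, not part of the skill itself: the function name missing_required and the constant REQUIRED_FILES are hypothetical, and the only facts taken from the text are the directory name and the two required filenames.

```python
from pathlib import Path

# The two files the prerequisites name as mandatory.
REQUIRED_FILES = ("activity.md", "state.md")


def missing_required(engagement_dir: str = "engagement") -> list[str]:
    """Return the required files that are absent from the engagement directory.

    A missing directory is treated the same as a directory with no files.
    """
    root = Path(engagement_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty return value means the retrospective can proceed; anything else is the list of files to name in the message to the user.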
Engagement Logging
Check for ./engagement/ directory. If absent, proceed without logging.
When an engagement directory exists:
- Print [retrospective] Activated → <target> to the screen on activation.
- Evidence → save significant output to engagement/evidence/ with
  descriptive filenames (e.g., sqli-users-dump.txt, ssrf-aws-creds.json).
Do NOT write to engagement/activity.md, engagement/findings.md, or
engagement state. The orchestrator maintains these files. Report all findings
in your return summary.
Step 1: Gather Context
Read all engagement files:
- engagement/scope.md: targets, objectives, rules of engagement
- engagement/state.db: final engagement state (call get_state_summary())
- engagement/activity.md: chronological activity log
- engagement/findings.md: confirmed vulnerabilities
If any file is missing (other than the required two), note it but continue with
what's available.
Subagent Execution Logs
Check for engagement/evidence/logs/*.jsonl. These are raw JSONL transcripts
captured from subagent executions by the SubagentStop hook. They contain every
tool call, tool result, assistant reasoning step, MCP call, and error — ground
truth that activity.md summaries cannot provide.
If JSONL logs are found, spawn a general-purpose Task subagent to parse them:
Task(
subagent_type="general-purpose",
prompt="Parse the subagent JSONL logs in engagement/evidence/logs/. For each
.jsonl file, read it and extract a structured timeline:
- Tool calls: tool name + input (truncate large inputs to 200 chars)
- Tool results: status + truncated output (first 200 chars)
- Assistant reasoning: key decision points and rationale
- MCP calls: server + method + key params
- Errors: any failures, retries, or exceptions
- Target artifacts: flag commands that may have created artifacts on the
target (file writes, user creation, registry changes, scheduled tasks,
services installed, firewall rules modified)
Return a markdown summary with one section per log file. Section header
format: '## {filename} ({agent-type})'. Include a 'Target Artifacts' subsection
listing any commands that may need cleanup.",
description="Parse subagent JSONL logs"
)
Incorporate the parsed log data into subsequent analysis steps. The logs provide:
- Routing analysis (Step 2): exact skills loaded, whether get_skill() was
  called, inline vs routed execution
- Knowledge gap analysis (Step 3): failed payloads, retries, manual
  workarounds visible in the command sequence
- Operational review (Step 5): exact commands run, timing, error recovery
  decisions, artifact creation on target
If no JSONL logs are found, continue with engagement files only and note that
subagent execution traces are unavailable.
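For a quick first look at a transcript before delegating the full parse, something like the following may help. It is a sketch under assumptions: the only thing taken from the text is that the logs are JSONL (one JSON object per line) and that large values are truncated to 200 characters. The "type" field name is a guess at the schema, and tally_records and truncate are hypothetical helper names.

```python
import json
from collections import Counter


def truncate(text: str, limit: int = 200) -> str:
    """Cut a string to at most `limit` characters, as the parsing prompt asks."""
    return text[:limit]


def tally_records(jsonl_text: str, key: str = "type") -> Counter:
    """Count records in a JSONL transcript by the value of `key`.

    The default key "type" is an assumption; adjust it to the real log schema.
    """
    counts: Counter = Counter()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no record
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["<malformed>"] += 1  # one bad line should not abort the parse
            continue
        if isinstance(record, dict):
            counts[str(record.get(key, "<unknown>"))] += 1
        else:
            counts["<unknown>"] += 1  # valid JSON, but not an object
    return counts
```

A high count of malformed or unknown records is a hint that the assumed field name does not match the actual transcripts.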
Engagement Summary
Summarize the engagement for the user:
- Target(s) and objective(s)
- Outcome: Were objectives met? Partially? Not at all?
- Timeline: How many skill invocations, roughly how long
- Final state: What access/credentials/vulns existed at completion
Ask the user if this summary is accurate and whether there's context not
captured in the engagement files (e.g., decisions made verbally, time pressure,
scope changes mid-engagement).
Step 2: Skill Routing Analysis
First, load the full skill inventory: call list_skills() to get every
available skill with its category and description. This is your reference for
what the library covers.
Read engagement/activity.md and compare each activity against the inventory.
For each activity entry, determine:
- Was a skill loaded? Check for skill name references in activity entries.
- Was it the right skill? Read the skill's SKILL.md at
  skills/<category>/<skill-name>/SKILL.md to check its actual scope and
  compare against what was done.
- Were any skills skipped? Look for technique execution that should have
  been routed through a skill (e.g., running sqlmap directly instead of
  loading sql-injection-union).
- Was anything done inline that a skill covers? Identify commands or
  techniques executed without loading the corresponding skill.
Build a routing ledger:
| Activity | Skill Used | Correct? | Notes |
|---|---|---|---|
| Web recon | web-discovery | Yes | — |
| SQL injection | (inline) | No | Should have routed to sql-injection-union |
Present this ledger to the user and discuss any routing decisions that seem
wrong or suboptimal.
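One way to keep the ledger consistent while working through the activity log is to collect rows as structured records and render the table at the end. The sketch below assumes nothing beyond the four columns shown above; LedgerRow and render_ledger are hypothetical names, and a plain hyphen stands in for empty notes.

```python
from dataclasses import dataclass


@dataclass
class LedgerRow:
    activity: str
    skill_used: str  # skill name, or "(inline)" when no skill was loaded
    correct: bool
    notes: str = ""


def render_ledger(rows: list[LedgerRow]) -> str:
    """Render ledger rows as a Markdown table with the ledger's four columns."""
    lines = [
        "| Activity | Skill Used | Correct? | Notes |",
        "|---|---|---|---|",
    ]
    for row in rows:
        verdict = "Yes" if row.correct else "No"
        notes = row.notes or "-"  # placeholder for rows with nothing to add
        lines.append(f"| {row.activity} | {row.skill_used} | {verdict} | {notes} |")
    return "\n".join(lines)
```

Rows for skills that existed but were never loaded (found later in Step 4) can be appended to the same list before rendering.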
Step 3: Knowledge Gap Analysis
For each skill that was invoked during the engagement, read its SKILL.md at
skills/<category>/<skill-name>/SKILL.md, then evaluate:
- Did the skill have adequate payloads? Were hand-crafted payloads needed
  that should be embedded in the skill?
- Were edge cases hit? Did the target present conditions the skill didn't
  cover (e.g., unusual encodings, non-standard ports, WAF bypass needed)?
- Was troubleshooting adequate? Did the skill's troubleshooting section
  cover the problems encountered?
- Was the methodology complete? Were steps missing or out of order?
- Were tool commands correct? Did embedded commands work or need
  modification?
For each gap found, note the specific skill and what's missing.
Step 4: Missing Skill Identification
Identify techniques used during the engagement that don't have a corresponding
skill. Consider:
- Techniques used manually: anything done by hand that was non-trivial
  and repeatable
- Tool workflows: complex tool chains that could be standardized
- Edge-case techniques: bypass methods, unusual attack paths, or niche
  protocols encountered
Before proposing a new skill, verify it doesn't already exist: call
search_skills("description of the technique") and check results. A skill may
exist but was missed during the engagement (routing gap, not a coverage gap).
For each confirmed missing skill, propose:
- Skill name (kebab-case)
- Category (web, ad, privesc, network, etc.)
- What it would cover
- Why it's needed (one-off or likely to recur?)
For techniques where a skill exists but wasn't used, add these to the routing
ledger in Step 2 instead.
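The four proposal fields map naturally onto a small record, which also makes it easy to enforce the kebab-case naming rule and to emit the line used in the report's Missing Skills section. This is an illustrative sketch: SkillProposal, its method names, and the regular expression are assumptions, not part of the skill library.

```python
import re
from dataclasses import dataclass

# Lowercase alphanumeric words joined by single hyphens, e.g. "sql-injection-union".
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


@dataclass
class SkillProposal:
    name: str       # proposed skill name, expected to be kebab-case
    category: str   # web, ad, privesc, network, etc.
    covers: str     # what the skill would cover
    rationale: str  # why it is needed (one-off or likely to recur)

    def is_kebab_case(self) -> bool:
        return KEBAB_CASE.fullmatch(self.name) is not None

    def as_report_line(self) -> str:
        # \u2014 is the dash the report template uses between name and description.
        return f"- **{self.name}** ({self.category}) \u2014 {self.covers}, {self.rationale}"
```

Checking is_kebab_case() before writing the report catches names that would not match the skills/<category>/<skill-name>/ directory convention.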
Step 5: Operational Review
Evaluate four operational dimensions:
Manual Interventions
- What was done by hand that a skill should automate?
- Were payloads crafted manually that should be embedded?
- Was tool setup or configuration needed that should be in prerequisites?
OPSEC
- Were OPSEC ratings respected? Did noisy skills get used when quiet
  alternatives existed?
- Were detection-prone techniques used unnecessarily?
- Was Kerberos-first authentication followed in AD environments?
- Were any OPSEC incidents noted (alerts triggered, blocks encountered)?
Routing Efficiency
- Were there unnecessary detours? (e.g., broad scanning when targeted testing
  would have found the same issue faster)
- Were redundant scans run? (e.g., re-scanning ports already in the engagement state)
- Were there missed shortcuts? (e.g., credentials found early but not tested
  against other services until late)
- Did the orchestrator chain vulnerabilities effectively?
State Management
Call get_state_summary() from the state-reader MCP server to read current
engagement state. Use it to:
- Skip re-testing targets, parameters, or vulns already confirmed
- Leverage existing credentials or access for this technique
- Understand what's been tried and failed (check Blocked section)
Do NOT write engagement state. When your work is complete, report all
findings clearly in your return summary. The orchestrator parses your summary
and records state changes. Your return summary must include:
- New targets/hosts discovered (with ports and services)
- New credentials or tokens found
- Access gained or changed (user, privilege level, method)
- Vulnerabilities confirmed (with status and severity)
- Pivot paths identified (what leads where)
- Blocked items (what failed and why, whether retryable)
Step 6: Critical Path Review
Map the actual kill chain from recon to objective (or as far as the engagement
got):
[recon] → [discovery] → [initial access] → [pivot/escalation] → [objective]
For each step, note:
- What skill handled it
- Whether it was the fastest path
- What blocked progress and how it was resolved
- Whether steps could have been parallelized or reordered
Identify bottlenecks — where did the engagement stall, and why?
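A rough way to spot candidate bottlenecks is to look for the longest pauses between consecutive activity entries. The sketch below assumes every entry in activity.md starts with a header of the form "### [YYYY-MM-DD HH:MM:SS] title", which is the format this skill uses for its own entry; other entries may differ, so treat the result as a starting point. parse_entries and largest_gaps are hypothetical names.

```python
import re
from datetime import datetime

# Header format assumed for every activity entry: "### [YYYY-MM-DD HH:MM:SS] title"
HEADER = re.compile(r"^### \[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] (.+)$")


def parse_entries(activity_text: str) -> list[tuple[datetime, str]]:
    """Extract (timestamp, title) pairs from activity log headers."""
    entries = []
    for line in activity_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            entries.append((stamp, match.group(2)))
    return entries


def largest_gaps(entries: list[tuple[datetime, str]], top: int = 3):
    """Return (seconds, earlier_title, later_title) for the longest pauses."""
    ordered = sorted(entries, key=lambda entry: entry[0])
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

A long gap does not prove a stall (the tester may simply have stepped away), so each one still needs to be checked against the log content and the user's recollection.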
Step 7: Write Report
Produce engagement/retrospective.md with all findings:
# Engagement Retrospective
## Summary
<One paragraph: target, objective, outcome>
## Kill Chain
<Ordered attack path from recon to objective>
## Skill Routing Review
### Skills Invoked
- <skill-name> — <what it did, whether it performed well>
### Skills Skipped (Should Have Been Invoked)
- <skill-name> — <why it should have been invoked, what was done instead>
### Inline Execution (Should Have Been Routed)
- <description of what was done inline instead of via a skill>
## Knowledge Gaps
### <skill-name>
- <missing payload, edge case, or methodology>
## Missing Skills
- **<proposed-skill-name>** (<category>) — <what it would cover, why needed>
## Operational Review
### Manual Interventions
- <what was done manually that should be automated>
### OPSEC
- <assessment of noise level, detection surface>
### Routing Efficiency
- <unnecessary detours, missed shortcuts>
### State Management
- <quality of state.md flow, stale reads, missing updates>
## Actionable Items
Priority-ordered list:
1. [skill-update] <skill-name>: <specific change needed>
2. [new-skill] <proposed-name>: <brief description>
3. [routing-fix] <skill-name>: <routing table update needed>
4. [template-fix] <change to _template or conventions>
After writing the report, append a summary to engagement/activity.md:
### [YYYY-MM-DD HH:MM:SS] retrospective → complete
- Report written to engagement/retrospective.md
- Actionable items: N skill-update, N new-skill, N routing-fix, N template-fix
Present the actionable items to the user and ask which ones to prioritize.
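The N values in the activity summary can be counted from the report rather than tallied by hand. This sketch assumes the actionable items are written exactly as in the template (a numbered line starting with one of the four bracketed tags); summarize_items is a hypothetical helper name.

```python
import re
from collections import Counter

# The four item tags used in the report template, in summary order.
TAGS = ("skill-update", "new-skill", "routing-fix", "template-fix")
ITEM = re.compile(r"^\s*\d+\.\s*\[([a-z-]+)\]")


def summarize_items(report_text: str) -> str:
    """Build the 'Actionable items' summary line from numbered report items."""
    counts: Counter = Counter()
    for line in report_text.splitlines():
        match = ITEM.match(line)
        if match and match.group(1) in TAGS:
            counts[match.group(1)] += 1
    return "- Actionable items: " + ", ".join(f"{counts[tag]} {tag}" for tag in TAGS)
```

Items with an unrecognized tag are ignored, so a total lower than the number of list entries suggests a mistyped tag in the report.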
Step 8: Implement Improvements
After the user selects which items to prioritize, make the edits. Skills are
plain Markdown files at skills/<category>/<skill-name>/SKILL.md — edit them
directly.
For each prioritized item:
[skill-update] — Edit an existing skill
- Read the SKILL.md file at skills/<category>/<skill-name>/SKILL.md
- Make the change: add payloads, fix methodology, update troubleshooting,
  etc.
- Preserve the existing structure and conventions (frontmatter, sections,
  embedded payloads format)
[new-skill] — Create a new skill
- Read skills/_template/SKILL.md for the canonical structure
- Write the new skill to skills/<category>/<skill-name>/SKILL.md
- Update the corresponding discovery skill's routing table to include it
[routing-fix] — Fix skill routing
- Read the skill that needs the routing update
- Add or fix the routing reference: "STOP. Return to orchestrator
recommending skill-name. Pass: ."
[template-fix] — Update conventions
- Read skills/_template/SKILL.md
- Make the change and note which existing skills may need the same update
After all edits are complete, re-index so the MCP skill-router picks up
changes:
uv run --directory tools/skill-router python indexer.py
Show the user what was changed and suggest committing.
Troubleshooting
Engagement directory exists but files are empty
The engagement may have been run without logging enabled. Do the retrospective
from conversation context instead — ask the user to describe what happened, then
analyze the current session transcript.
Activity log has no skill references
Techniques may have been executed inline (without loading the corresponding
skill) or the engagement predates the current skill library. Flag this
as a routing gap and reconstruct the timeline from the engagement state and findings.md
instead.
Multiple engagement directories
If the user has run multiple engagements, ask which one to review. Look for
date-stamped directories or scope.md contents to differentiate them.